This is a discussion on "connection timed out/server dropped connection, but I can telnet to 25 just fine" within the mailing.postfix.users forums, part of the Mail Servers and Related category.
This is a really long message, but I wanted to be sure I was complete and didn't leave out any information. But I'm sure I've forgotten something.

I've got a couple of machines to which I relay email that exhibit a periodic problem. All of a sudden, postfix will start deferring email to these servers with

Apr 14 22:02:04 A1 postfix-test/smtp[14101]: B514C19E53B: to=<xxx@xxx.com>, relay=none, delay=1, status=deferred (connect to mail.xxx.com[xxx.xxx.xxx.xxx]: server dropped connection without sending the initial SMTP greeting)
Apr 14 22:02:07 A1 postfix-test/smtp[14175]: connect to mail.xxx.com[xxx.xxx.xxx.xxx]: server dropped connection without sending the initial SMTP greeting (port 25)

OR

Apr 15 17:09:27 A1 postfix-test/smtp[4944]: 8BCEC19D603: to=<xxx@xxx.com>, relay=none, delay=-1486, status=deferred (connect to mail.xxx.com[xxx.xxx.xxx.xxx]: Connection timed out)

But at the very moment that postfix is deferring email, I'm able, from the very server that's deferring email, to telnet directly to port 25 of the remote server:

[root@A1 postfix]# telnet mail.xxx.com 25
Trying xxx.xxx.xxx.xxx...
Connected to mail.xxx.com.
Escape character is '^]'.
220 mail.xxx.com; ESMTP Wed, 14 Apr 2004 15:24:17 -0700

What could possibly be the cause? The 220 prompt comes up from the server within a few seconds when I'm telnet'ing.

It doesn't happen with every domain we relay to - just a select few seem to exhibit this problem. But it's pretty consistent with them. I can't find anything common to them, except that we relay a pretty fair volume of email for each. One is running an old version of sendmail. Another is running Groupwise, and another is running Exchange. No common firewalls.

I'm running postfix 2.1RC1 on RedHat 8 (kernel 2.4.18-14smp). The server sits behind a Foundry ServerIron XL load balancer, which does NAT.
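A manual telnet session checks one connection at a time, while a mail client under load may open several at once. As a rough way to repeat the telnet check programmatically, here is a minimal Python sketch that opens a few connections in parallel and reports whether each one received a greeting. The names `probe` and `probe_many` and the result labels are made up for illustration, and whether parallel connections matter for this problem is only an assumption.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def probe(host, port=25, timeout=10.0):
    """Open one TCP connection and wait for the SMTP greeting.

    Returns one of the illustrative labels:
    "greeting", "dropped", "reset", "timeout", "error".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            data = sock.recv(512)  # the server speaks first in SMTP
    except socket.timeout:
        return "timeout"           # connect or greeting took too long
    except ConnectionResetError:
        return "reset"             # peer sent RST
    except OSError:
        return "error"             # refused, unreachable, DNS failure, ...
    if not data:
        return "dropped"           # connection closed without a greeting
    return "greeting" if data.startswith(b"220") else "error"


def probe_many(host, port=25, count=4, timeout=10.0):
    """Run `count` probes at the same time and return their labels."""
    with ThreadPoolExecutor(max_workers=count) as pool:
        futures = [pool.submit(probe, host, port, timeout) for _ in range(count)]
        return [f.result() for f in futures]
```

Calling `probe_many("mail.example.com", count=4)` from the relaying host (the hostname is a placeholder) returns one label per connection, which can be compared with a single `probe` call made at the same moment.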
When the connections are deferring, netstat shows the connections in a "SYN_SENT" state:

tcp 0 1 192.168.1.151:52682 xxx.xxx.xxx.xxx:25 SYN_SENT
tcp 0 1 192.168.1.151:52680 xxx.xxx.xxx.xxx:25 SYN_SENT
tcp 0 1 192.168.1.151:52662 xxx.xxx.xxx.xxx:25 SYN_SENT
tcp 0 1 192.168.1.151:52661 xxx.xxx.xxx.xxx:25 SYN_SENT

And I've captured a tcpdump of a working telnet attempt (done right when postfix was deferring mail), and one of the postfix attempts:

tcpdump of telnet sessions - it works fine here:

22:31:07.072886 A1.53138 > mail.xxx.com.smtp: S 3602467833:3602467833(0) win 5840 <mss 1460,sackOK,timestamp 11517015 0,nop,wscale 0> (DF) [tos 0x10]
22:31:07.088740 mail.xxx.com.smtp > A1.53138: S 429229711:429229711(0) ack 3602467834 win 32120 <mss 1460,sackOK,timestamp 9898319 11517015,nop,wscale 0> (DF)
22:31:07.088780 A1.53138 > mail.xxx.com.smtp: . ack 1 win 5840 <nop,nop,timestamp 11517023 9898319> (DF) [tos 0x10]
22:31:07.114391 mail.xxx.com.smtp > A1.53138: P 1:66(65) ack 1 win 32120 <nop,nop,timestamp 9898322 11517023> (DF)
22:31:07.114413 A1.53138 > mail.xxx.com.smtp: . ack 66 win 5840 <nop,nop,timestamp 11517036 9898322> (DF) [tos 0x10]
22:31:10.595481 A1.53138 > mail.xxx.com.smtp: P 1:7(6) ack 66 win 5840 <nop,nop,timestamp 11518819 9898322> (DF) [tos 0x10]
22:31:10.611164 mail.xxx.com.smtp > A1.53138: . ack 7 win 32120 <nop,nop,timestamp 9898672 11518819> (DF)
22:31:10.612907 mail.xxx.com.smtp > A1.53138: P 66:111(45) ack 7 win 32120 <nop,nop,timestamp 9898672 11518819> (DF)
22:31:10.612927 A1.53138 > mail.xxx.com.smtp: . ack 111 win 5840 <nop,nop,timestamp 11518828 9898672> (DF) [tos 0x10]
22:31:10.614188 mail.xxx.com.smtp > A1.53138: F 111:111(0) ack 7 win 32120 <nop,nop,timestamp 9898672 11518819> (DF)
22:31:10.614272 A1.53138 > mail.xxx.com.smtp: F 7:7(0) ack 112 win 5840 <nop,nop,timestamp 11518828 9898672> (DF) [tos 0x10]
22:31:10.629762 mail.xxx.com.smtp > A1.53138: . ack 8 win 32120 <nop,nop,timestamp 9898673 11518828> (DF)

tcpdump of postfix delivery attempt:

22:31:49.412819 A1.53287 > mail.xxx.com.smtp: S 3630904508:3630904508(0) win 5840 <mss 1460,sackOK,timestamp 11538695 0,nop,wscale 0> (DF)
22:31:49.414756 A1.53288 > mail.xxx.com.smtp: S 3624199470:3624199470(0) win 5840 <mss 1460,sackOK,timestamp 11538696 0,nop,wscale 0> (DF)
22:31:49.430190 mail.xxx.com.smtp > A1.53287: S 473491106:473491106(0) ack 3630904509 win 32120 <mss 1460,sackOK,timestamp 9902554 11538695,nop,wscale 0> (DF)
22:31:49.430244 A1.53287 > mail.xxx.com.smtp: . ack 1 win 5840 <nop,nop,timestamp 11538703 9902554> (DF)
22:31:49.432132 mail.xxx.com.smtp > A1.53288: S 465690567:465690567(0) ack 3624199471 win 32120 <mss 1460,sackOK,timestamp 9902554 11538696,nop,wscale 0> (DF)
22:31:49.432151 A1.53288 > mail.xxx.com.smtp: . ack 1 win 5840 <nop,nop,timestamp 11538704 9902554> (DF)
22:31:49.446946 mail.xxx.com.smtp > A1.53287: R 473491107:473491107(0) win 0
22:31:49.457721 mail.xxx.com.smtp > A1.53288: R 465690568:465690568(0) win 0

The domains are set up in a transport map, in an attempt to see if the delivery concurrency limits were causing the trouble:

domain.com fastrelay:[mail.domain.com]
domain2.com slowrelay:[mail.domain.com]

Here's my postconf -n output:

alias_maps = hash:/usr/local/etc/postfix-test/aliases
alternate_config_directories = /usr/local/etc/postfix-gw
biff = no
command_directory = /usr/local/sbin
config_directory = /usr/local/etc/postfix-test
daemon_directory = /usr/local/libexec/postfix
default_destination_concurrency_limit = 100
default_process_limit = 550
disable_vrfy_command = yes
header_checks = regexp:/usr/local/etc/postfix-test/header_checks.regexp
inet_interfaces = $myhostname
mail_owner = postfix
mailq_path = /usr/local/bin/mailq
manpage_directory = /usr/local/man
max_use = 10
maximal_backoff_time = 1800s
message_size_limit = 30000000
minimal_backoff_time = 180s
mydomain = <mydomain>.net
myhostname = A1
mynetworks = 192.168.1.0/24,127.0.0/8
myorigin = myhostname
newaliases_path = /usr/local/bin/newaliases
queue_directory = /var/spool/postfix-test
queue_run_delay = 350s
readme_directory = no
relay_domains = hash:/usr/local/etc/postfix-test/relay_domains
sample_directory = /etc/postfix
sendmail_path = /usr/local/bin/sendmail
setgid_group = postdrop
smtp_helo_timeout = 10s
smtpd_client_restrictions =
smtpd_helo_required = yes
smtpd_helo_restrictions =
smtpd_recipient_restrictions = reject_non_fqdn_recipient, reject_unknown_recipient_domain, permit_mynetworks, reject_unauth_destination, check_recipient_access hash:/usr/local/etc/postfix-test/recipient_checks regexp:/usr/local/etc/postfix-test/recipient_checks.regexp, check_sender_access hash:/usr/local/etc/postfix-test/sender_checks, check_client_access hash:/usr/local/etc/postfix-test/client_checks, reject_unauth_pipelining, reject_invalid_hostname, permit
smtpd_sender_restrictions =
syslog_facility = local3
syslog_name = postfix-test
transport_maps = hash:/usr/local/etc/postfix-test/transport
unknown_address_reject_code = 554
unknown_client_reject_code = 554
unknown_hostname_reject_code = 554

In addition, these don't show up in postconf -n:

initial_destination_concurrency_limit = 200
fastrelay_destination_concurrency_limit = 200
slowrelay_destination_concurrency_limit = 20

(taking these out, so that the default values are used, doesn't make a difference)

And here's my master.cf:

smtp inet n - n - - smtpd
#628 inet n - n - - qmqpd
pickup fifo n - n 60 1 pickup
cleanup unix n - n - 0 cleanup
qmgr fifo n - n 300 1 qmgr
#qmgr fifo n - n 300 1 nqmgr
rewrite unix - - n - - trivial-rewrite
bounce unix - - n - 0 bounce
defer unix - - n - 0 bounce
trace unix - - n - 0 bounce
verify unix - - n - 1 verify
flush unix n - n 1000? 0 flush
proxymap unix - - n - - proxymap
smtp unix - - n - - smtp
relay unix - - n - - smtp
# -o smtp_helo_timeout=9 -o smtp_connect_timeout=9
showq unix n - n - - showq
error unix - - n - - error
local unix - n n - - local
virtual unix - n n - - virtual
lmtp unix - - n - - lmtp
fastrelay unix - - n - - smtp
slowrelay unix - - n - - smtp
#
# Interfaces to non-Postfix software. Be sure to examine the manual
# pages of the non-Postfix software to find out what options it wants.
#
# maildrop. See the Postfix MAILDROP_README file for details.
#
#maildrop unix - n n - - pipe
# flags=DRhu user=mbox argv=/usr/local/bin/maildrop -d ${user}@${nexthop} ${extension} ${recipient} ${user} ${nexthop}
#
# The Cyrus deliver program has changed incompatibly, multiple times.
#
old-cyrus unix - n n - - pipe
  flags=R user=cyrus argv=/cyrus/bin/deliver -e -m ${extension} ${user}
# Cyrus 2.1.5 (Amos Gouaux)
cyrus unix - n n - - pipe
  user=cyrus argv=/cyrus/bin/deliver -e -r ${sender} -m ${extension} ${user}
uucp unix - n n - - pipe
  flags=Fqhu user=uucp argv=uux -r -n -z -a$sender - $nexthop!rmail ($recipient)
ifmail unix - n n - - pipe
  flags=F user=ftn argv=/usr/lib/ifmail/ifmail -r $nexthop ($recipient)
bsmtp unix - n n - - pipe
  flags=Fq. user=foo argv=/usr/local/sbin/bsmtp -f $sender $nexthop $recipient
127.0.0.1:20025 inet n - n - - smtpd
  -o content_filter=
  -o local_recipient_maps=
  -o relay_recipient_maps=
  -o smtpd_restriction_classes=
  -o smtpd_client_restrictions=
  -o smtpd_helo_restrictions=
  -o smtpd_sender_restrictions=
  -o smtpd_recipient_restrictions=permit_mynetworks,reject
  -o mynetworks=127.0.0.0/8
  -o strict_rfc821_envelopes=yes
  -o smtpd_error_sleep_time=0
  -o smtpd_soft_error_limit=1001
  -o smtpd_hard_error_limit=1000
anvil unix - - n - 1 anvil

Thanks. I really appreciate any help I can get here.

Ed

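One rough way to compare the two tcpdump captures in this post is to group packets by local port and label how each connection ended. The Python sketch below does that under several assumptions: `classify`, `LINE_RE`, and the labels are made-up names, the regular expression only handles the old one-line tcpdump format shown in this thread, the local host is assumed to print as `A1.<port>`, and the sample lines are the Postfix attempt with the TCP options trimmed for width.

```python
import re
from collections import defaultdict

# Matches "HH:MM:SS.frac SRC > DST: FLAGS ..." in the old tcpdump text format.
LINE_RE = re.compile(
    r"^(?P<time>\d\d:\d\d:\d\d\.\d+)\s+(?P<src>\S+)\s+>\s+(?P<dst>\S+):\s+"
    r"(?P<flags>[SFPR.]+)\s"
)

# Postfix delivery attempt from the post, TCP options trimmed.
SAMPLE = [
    "22:31:49.412819 A1.53287 > mail.xxx.com.smtp: S 3630904508:3630904508(0) win 5840",
    "22:31:49.414756 A1.53288 > mail.xxx.com.smtp: S 3624199470:3624199470(0) win 5840",
    "22:31:49.430190 mail.xxx.com.smtp > A1.53287: S 473491106:473491106(0) ack 3630904509 win 32120",
    "22:31:49.430244 A1.53287 > mail.xxx.com.smtp: . ack 1 win 5840",
    "22:31:49.432132 mail.xxx.com.smtp > A1.53288: S 465690567:465690567(0) ack 3624199471 win 32120",
    "22:31:49.432151 A1.53288 > mail.xxx.com.smtp: . ack 1 win 5840",
    "22:31:49.446946 mail.xxx.com.smtp > A1.53287: R 473491107:473491107(0) win 0",
    "22:31:49.457721 mail.xxx.com.smtp > A1.53288: R 465690568:465690568(0) win 0",
]


def classify(lines, client_prefix="A1."):
    """Label each local endpoint by what the remote side sent back."""
    events = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a packet line
        src, dst, flags = m["src"], m["dst"], m["flags"]
        if src.startswith(client_prefix):
            events[src].append(("out", flags))
        else:
            events[dst].append(("in", flags))

    result = {}
    for conn, evs in events.items():
        inbound = [flags for direction, flags in evs if direction == "in"]
        if not inbound:
            result[conn] = "no reply"               # would sit in SYN_SENT
        elif any("P" in flags for flags in inbound):
            result[conn] = "data received"          # e.g. the 220 greeting
        elif any("R" in flags for flags in inbound):
            result[conn] = "reset before any data"
        else:
            result[conn] = "handshake only"
    return result


print(classify(SAMPLE))
```

On the sample, both `A1.53287` and `A1.53288` come out as "reset before any data", while the telnet capture fed through the same function is labelled "data received". The sketch only summarizes what the traces already show; it does not say which device sent the RST.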
Hi Ed,

I was wondering if you resolved this issue. I also manage many postfix servers (2.1.5) that are load balanced by a Foundry ServerIron XL. Every so often mail will get deferred for no apparent reason, during both high and low mail flow. I have just discovered this problem, so I was hoping you had some insight.

In my maillogs, I see a lot of these errors:

server dropped connection without sending the initial SMTP greeting

Eventually, the deferred mail gets requeued and delivered, but this is very annoying. Any help is appreciated.

Thanks.
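To see whether the deferrals cluster on particular destinations, it can help to count them per host and reason. The following is a minimal Python sketch, assuming log lines in the format quoted earlier in this thread; `tally_deferrals` and `DEFER_RE` are hypothetical names, and the sample lines are copied from the first post.

```python
import re
from collections import Counter

# Matches the "status=deferred (connect to HOST[ADDR]: REASON)" tail of a log line.
DEFER_RE = re.compile(
    r"status=deferred \(connect to (?P<host>[^\[\s]+)\[[^\]]*\]: (?P<reason>.*)\)\s*$"
)

SAMPLE_LOG = [
    "Apr 14 22:02:04 A1 postfix-test/smtp[14101]: B514C19E53B: to=<xxx@xxx.com>, "
    "relay=none, delay=1, status=deferred (connect to mail.xxx.com[xxx.xxx.xxx.xxx]: "
    "server dropped connection without sending the initial SMTP greeting)",
    "Apr 14 22:02:07 A1 postfix-test/smtp[14175]: connect to mail.xxx.com[xxx.xxx.xxx.xxx]: "
    "server dropped connection without sending the initial SMTP greeting (port 25)",
    "Apr 15 17:09:27 A1 postfix-test/smtp[4944]: 8BCEC19D603: to=<xxx@xxx.com>, "
    "relay=none, delay=-1486, status=deferred (connect to mail.xxx.com[xxx.xxx.xxx.xxx]: "
    "Connection timed out)",
]


def tally_deferrals(lines):
    """Count deferred deliveries per (destination host, reason)."""
    counts = Counter()
    for line in lines:
        m = DEFER_RE.search(line)
        if m:  # lines without status=deferred are ignored
            counts[(m["host"], m["reason"])] += 1
    return counts


for (host, reason), n in tally_deferrals(SAMPLE_LOG).most_common():
    print(n, host, reason)
```

Run against a full maillog (for example by passing an open file object instead of `SAMPLE_LOG`), the counts show at a glance whether a handful of destinations account for most of the deferrals.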