Discussion:
something is randomly closing ssh-tunnels (was: ipfilter randomly dropping..)
Petar Bogdanovic
2014-06-23 10:24:08 UTC
Permalink
During the past few weeks the ssh-tunnels to a remote machine started
failing randomly. In a previous mail to tech-net I prematurely blamed
ipfilter because disabling it yielded some immediate success.

Unfortunately, subsequent testing showed that having npf enabled instead
eventually lead to the same issues.

What I know:

* the server suddenly FINs the connection
* the server ignores everything after that and sends about 20-30
RSTs for lots of late ACKs sent by the client
* ipmon is able to track the connection but misses the FIN
* yet ipfilter manages to update its state table and reduces the
TTL of the connection from 24h to 30s
* a server-tcpdump captures the FIN
* a client-tcpdump captures the same FIN
* according to wireshark, the FINs in both pcaps have sequence
numbers that indicate lost segments (which at least in one
case makes little sense since it was captured directly at the
source)
* ssh and sshd both never try to tear down the connection
* ssh reports that the remote end has closed the connection
* sshd bails on a failed write() with ENETUNREACH
* the success rate of the tunnel changes a lot, first it was 50%
then it was 100% and during the past few days it was almost 0%
* server and client are managed very carefully by me and there
were no significant changes during the past 6+ months.

Now I'm thinking about compiling sshd with SO_DEBUG and a new kernel
with TCP_DEBUG but I'm not sure, what I'll be able to read from this.

Any suggestions are very much appreciated, since I'm running out of
ideas and would like to avoid messing with the (production-)server
more than is really necessary.

Thanks,

Petar Bogdanovic


P.S. pcaps of one "failed" tunnel are here:
http://smokva.net/pcap/crane.tgz
Petar Bogdanovic
2014-06-23 10:29:25 UTC
Permalink
Post by Petar Bogdanovic
Unfortunately, subsequent testing showed that having npf enabled instead
eventually lead to the same issues.
Clarification: both ipf and npf were/are configured and enabled at the
server and not somewhere in between (or at the client).
Brandon Vincent
2014-06-23 16:03:13 UTC
Permalink
Post by Petar Bogdanovic
* server and client are managed very carefully by me and there
were no significant changes during the past 6+ months.
Do you know if any changes were made to the network infrastructure?

Improperly configured NGFWs can make a silent mess to legitimate network
traffic.

Brandon Vincent
Petar Bogdanovic
2014-06-23 16:25:55 UTC
Permalink
Post by Brandon Vincent
Post by Petar Bogdanovic
* server and client are managed very carefully by me and there
were no significant changes during the past 6+ months.
Do you know if any changes were made to the network infrastructure?
That's difficult to say. The folks at one
end (where the server is) say no and
that's pretty much all i know about
infrastructure changes.

But the thing is.. the server apparently
sends a FIN.. hence the server closes the
connection, nothing in between. And it
does so out of the blue without any
messages or errors.
Post by Brandon Vincent
Improperly configured NGFWs can make a silent mess to legitimate network
traffic.
Brandon Vincent
--
Sent from my phone.
Brandon Vincent
2014-06-23 16:50:59 UTC
Permalink
Post by Petar Bogdanovic
But the thing is.. the server apparently
sends a FIN.. hence the server closes the
connection, nothing in between. And it
does so out of the blue without any
messages or errors.
It's still hard to say that there is nothing interfering with the
network traffic. I've seen newer firewalls gracefully terminate (not
just drop) explicit FTP connections when the client negotiates without
SSL/TLS.

If not a firewall, I'm still betting on some sort of network change.

Brandon Vincent
Petar Bogdanovic
2014-06-23 17:28:53 UTC
Permalink
Post by Brandon Vincent
Post by Petar Bogdanovic
But the thing is.. the server apparently
sends a FIN.. hence the server closes the
connection, nothing in between. And it
does so out of the blue without any
messages or errors.
It's still hard to say that there is nothing interfering with the
network traffic. I've seen newer firewalls gracefully terminate (not
just drop) explicit FTP connections when the client negotiates without
SSL/TLS.
If not a firewall, I'm still betting on some sort of network change.
Due to the "alternating" nature of the issue, I would bet the same. I
just can't wrap my head around the fact that the server actually closes
the connection. If some other participant triggers the FIN, I'd expect
some evidence inside the pcaps.. but apart from the client being late
with ACKs, the connection shows nothing abnormal to the naive eye.
Petar Bogdanovic
2014-06-23 17:19:41 UTC
Permalink
Is it only ssh or any connection? I suppose you could test it with netcat.
Good question, I forgot to add that to my original message. Until now,
only scheduled ssh-tunnels were affected. interactive ssh-sessions seem
fine, so does smtp, imap and http.

I'll try transfering some noise through netcat later this week. It's
also worth mentioning that this "tunnel-setup" is running untouched for
years..
Joerg Sonnenberger
2014-06-23 18:48:04 UTC
Permalink
Post by Petar Bogdanovic
Good question, I forgot to add that to my original message. Until now,
only scheduled ssh-tunnels were affected. interactive ssh-sessions seem
fine, so does smtp, imap and http.
Can you try to manually force the IPQoS option to lowdelay/throughput
for both use cases?

Joerg
Petar Bogdanovic
2014-06-24 07:55:27 UTC
Permalink
Post by Joerg Sonnenberger
Post by Petar Bogdanovic
Good question, I forgot to add that to my original message. Until now,
only scheduled ssh-tunnels were affected. interactive ssh-sessions seem
fine, so does smtp, imap and http.
Can you try to manually force the IPQoS option to lowdelay/throughput
for both use cases?
Instead of:

/usr/bin/ssh -24fNn -L 22007:localhost:9102 -R 9101:localhost:9101 -R 9103:localhost:9103 remote.host.de

I tried:

/usr/bin/ssh -24fNn -o IPQoS=lowdelay -L 22007:localhost:9102 (...)

and:

/usr/bin/ssh -24fNn -o IPQoS=throughput -L 22007:localhost:9102 (...)

but it failed again in both instances.
Mindaugas Rasiukevicius
2014-06-23 16:53:10 UTC
Permalink
Post by Petar Bogdanovic
During the past few weeks the ssh-tunnels to a remote machine started
failing randomly. In a previous mail to tech-net I prematurely blamed
ipfilter because disabling it yielded some immediate success.
Unfortunately, subsequent testing showed that having npf enabled instead
eventually lead to the same issues.
<...>
http://smokva.net/pcap/crane.tgz
Are you using 6.x or -current? If latter, is it the latest -current?
You can feed the pcap file to npftest [1] using your config and see what
is going on with NPF state tracking.

[1] http://nxr.netbsd.org/xref/src/usr.sbin/npf/npftest/README
--
Mindaugas
Petar Bogdanovic
2014-06-23 17:31:07 UTC
Permalink
Post by Mindaugas Rasiukevicius
Are you using 6.x or -current? If latter, is it the latest -current?
The setup is conservative, follows netbsd-6-0.
Manuel Bouyer
2014-06-23 20:16:21 UTC
Permalink
Post by Petar Bogdanovic
During the past few weeks the ssh-tunnels to a remote machine started
failing randomly. In a previous mail to tech-net I prematurely blamed
ipfilter because disabling it yielded some immediate success.
Unfortunately, subsequent testing showed that having npf enabled instead
eventually lead to the same issues.
* the server suddenly FINs the connection
* the server ignores everything after that and sends about 20-30
RSTs for lots of late ACKs sent by the client
* ipmon is able to track the connection but misses the FIN
* yet ipfilter manages to update its state table and reduces the
TTL of the connection from 24h to 30s
* a server-tcpdump captures the FIN
* a client-tcpdump captures the same FIN
* according to wireshark, the FINs in both pcaps have sequence
numbers that indicate lost segments (which at least in one
case makes little sense since it was captured directly at the
source)
* ssh and sshd both never try to tear down the connection
* ssh reports that the remote end has closed the connection
* sshd bails on a failed write() with ENETUNREACH
So it could actually have closed the connection after that.
Did your tcpdump sequences also capture ICMP traffics ?
Did an ICMP network unreacheable packet show up ?
--
Manuel Bouyer <***@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
Petar Bogdanovic
2014-06-24 07:42:58 UTC
Permalink
Post by Manuel Bouyer
So it could actually have closed the connection after that.
Did your tcpdump sequences also capture ICMP traffics ?
Did an ICMP network unreacheable packet show up ?
I just created a new capture, this time with icmp:

tcpdump -nvvv icmp or \( tcp port ssh
and not tcp src port 60227
and not tcp dst port 60227
and not tcp src port 60178
and not tcp dst port 60178 \)

60227 and 60178 were used by two interactive sessions.

Unfortunately, there is no ICMP in the capture except the first two
packets which were sent by me in order to test the filter expression:

http://smokva.net/pcap/crane-with-icmp.txt
Darren Reed
2014-06-24 12:39:46 UTC
Permalink
Post by Petar Bogdanovic
During the past few weeks the ssh-tunnels to a remote machine started
failing randomly. In a previous mail to tech-net I prematurely blamed
ipfilter because disabling it yielded some immediate success.
Unfortunately, subsequent testing showed that having npf enabled instead
eventually lead to the same issues.
* the server suddenly FINs the connection
* the server ignores everything after that and sends about 20-30
RSTs for lots of late ACKs sent by the client
* ipmon is able to track the connection but misses the FIN
* yet ipfilter manages to update its state table and reduces the
TTL of the connection from 24h to 30s
* a server-tcpdump captures the FIN
* a client-tcpdump captures the same FIN
* according to wireshark, the FINs in both pcaps have sequence
numbers that indicate lost segments (which at least in one
case makes little sense since it was captured directly at the
source)
* ssh and sshd both never try to tear down the connection
* ssh reports that the remote end has closed the connection
* sshd bails on a failed write() with ENETUNREACH
So the problem is this:
* sshd tries to write to the socket, gets ENETUNREACH

and then exits leading to the FIN packets being transmitted as the socket
is closed down in the normal course of things but by the time it is doing
the exit the network path has restored.

For ICMP packets to cause this, you would need to see many of them.

You've got public IP addresses in your capture file and you've made no
mention of using NAT, so I'm going to assume that the box with sshd/ssh
on it are connected to the Internet directly with some kind of cable modem
or similar.

Are you able to cross check the events from sshd with log data from those
devices?

For example, if the NIC facing outwards drops then you will get ENETUNREACH
because the destination with the default route has disappeared. Or if your
DHCP assigned IP address disappears briefly then again the route will
disappear
and ENETUNREACH.

How about these two for me:
netstat -s | grep -i unreach
netstat -s | grep -i route

And of course the other important thing to do in an experiment is to save
the output of "netstat -s" at the start of a run and compare that with its
output when the problem has been seen again.

Kind Regards,
Darren
Darren Reed
2014-06-24 13:39:47 UTC
Permalink
Post by Darren Reed
... * sshd bails on a failed write() with ENETUNREACH
* sshd tries to write to the socket, gets ENETUNREACH
and then exits leading to the FIN packets being transmitted as the socket
is closed down in the normal course of things but by the time it is doing
the exit the network path has restored.
For ICMP packets to cause this, you would need to see many of them.
Oh, I forgot, there are internal code paths in ipfilter/npf that can
return ENETUNREACH.

If you are using NetBSD 6 with ipfilter, comparing the output of this:

ipfstat | grep 'block reason'

from before and after might be illuminating.

Or maybe just compare the entire output of "ipfstat" and "ipfstat -s"
from before and after.

Kind Regards,
Darren
Petar Bogdanovic
2014-06-25 08:13:34 UTC
Permalink
Post by Darren Reed
Oh, I forgot, there are internal code paths in ipfilter/npf that can
return ENETUNREACH.
ipfstat | grep 'block reason'
from before and after might be illuminating.
Or maybe just compare the entire output of "ipfstat" and "ipfstat -s"
from before and after.
No problem, ipfstat before and after failed tunnel (reproducing it now
is very easy):

# ls -la
total 12
drwxrwxrwt 2 root wheel 512 Jun 25 10:10 .
drwxr-xr-x 19 root wheel 512 Jun 20 20:51 ..
-rw-r--r-- 1 root wheel 535 Jun 25 10:09 ipfstat-s.1403683750
-rw-r--r-- 1 root wheel 535 Jun 25 10:10 ipfstat-s.1403683819
-rw-r--r-- 1 root wheel 805 Jun 25 10:09 ipfstat.1403683750
-rw-r--r-- 1 root wheel 806 Jun 25 10:10 ipfstat.1403683819


# diff -u ipfstat-s.1403683750 ipfstat-s.1403683819
--- ipfstat-s.1403683750 2014-06-25 10:09:10.000000000 +0200
+++ ipfstat-s.1403683819 2014-06-25 10:10:19.000000000 +0200
@@ -1,27 +1,27 @@
IP states added:
- 17761 TCP
- 58310 UDP
+ 17772 TCP
+ 58329 UDP
92 ICMP
- 76918765 hits
- 439924 misses
+ 77026414 hits
+ 450385 misses
0 bucket full
0 maximum rule references
0 maximum
0 no memory
- 14 bkts in use
- 14 active
- 58402 expired
- 17747 closed
+ 18 bkts in use
+ 18 active
+ 58418 expired
+ 17757 closed
State logging enabled

State table bucket statistics:
- 14 in use
+ 18 in use
100% hash efficiency
- 0.24% bucket usage
+ 0.31% bucket usage
0 minimal length
1 maximal length
1.000 average length

TCP Entries per state
0 1 2 3 4 5 6 7 8 9 10 11
- 0 0 0 0 3 0 0 0 0 0 8 3
+ 0 0 0 0 3 0 0 0 0 0 8 4


# diff -u ipfstat.1403683750 ipfstat.1403683819
--- ipfstat.1403683750 2014-06-25 10:09:10.000000000 +0200
+++ ipfstat.1403683819 2014-06-25 10:10:19.000000000 +0200
@@ -1,22 +1,22 @@
bad packets: in 0 out 0
- IPv6 packets: in 0 out 5153
- input packets: blocked 53 passed 44336 nomatch 0 counted 0 short 0
-output packets: blocked 5218 passed 60118 nomatch 0 counted 0 short 0
+ IPv6 packets: in 0 out 5155
+ input packets: blocked 53 passed 92750 nomatch 0 counted 0 short 0
+output packets: blocked 5239 passed 129793 nomatch 0 counted 0 short 0
input packets logged: blocked 0 passed 0
-output packets logged: blocked 65 passed 0
+output packets logged: blocked 84 passed 0
packets logged: input 0 output 0
- log failures: input 0 output 13
+ log failures: input 0 output 32
fragment state(in): kept 0 lost 0 not fragmented 0
fragment state(out): kept 0 lost 0 not fragmented 0
-packet state(in): kept 1601 lost 0
+packet state(in): kept 1631 lost 0
packet state(out): kept 27 lost 0
ICMP replies: 2 TCP RSTs sent: 50
Invalid source(in): 0
-Result cache hits(in): 1935 (out): 6778
+Result cache hits(in): 4644 (out): 9491
IN Pullups succeeded: 0 failed: 0
OUT Pullups succeeded: 0 failed: 0
Fastroute successes: 3538 failures: 0
TCP cksum fails(in): 0 (out): 0
-IPF Ticks: 781113
+IPF Ticks: 781251
Packet log flags set: (0)
none
Darren Reed
2014-06-26 11:59:09 UTC
Permalink
Peter,

The workaround for this is to add "pass out log body quick proto tcp
from 85.X.X.X port = 22 to 77.X.X.X.X" at the end of all of your "keep
state" ipf rules.

I've added the "log body" bit to provide more information about the
ssh packets that aren't picked up by the ssh rules and session state.

Cheers,
Darren

Petar Bogdanovic
2014-06-25 08:03:56 UTC
Permalink
Post by Darren Reed
* sshd tries to write to the socket, gets ENETUNREACH
and then exits leading to the FIN packets being transmitted as the socket
is closed down in the normal course of things but by the time it is doing
the exit the network path has restored.
Right. Here is the (slightly redacted) output of ktruss:

http://smokva.net/pcap/crane-trace-sshd.txt
Post by Darren Reed
You've got public IP addresses in your capture file and you've made no
mention of using NAT, so I'm going to assume that the box with sshd/ssh
on it are connected to the Internet directly with some kind of cable modem
or similar.
The client connects through a FreeBSD router where an older
pf-version is doing filtering, nat and altq (priq).

The server has a public IP.
Post by Darren Reed
Are you able to cross check the events from sshd with log data from those
devices?
The most cross checking I've done is comparing the two pcaps. There is
not much to cross check otherwise, since nothing really complains about
anything..

Also the ssh tunnels seem to work fine with ipf disabled. I'm running a
third >3GB job today with ipf disabled. Without disabling ipf the job
wouldn't survive longer than 10-30 seconds.
Post by Darren Reed
netstat -s | grep -i unreach
netstat -s | grep -i route
Sure:

# netstat -s | head -n20
icmp:
413 calls to icmp_error
0 errors not generated because old message was icmp
Output histogram:
echoreply: 100
unreach: 413
photuris: 6
4 messages with bad code fields
0 messages < minimum length
0 bad checksums
0 messages with bad length
0 multicast echo requests ignored
0 multicast timestamp requests ignored
Input histogram:
echoreply: 6
unreach: 731
echo: 100
timxceed: 102
100 message responses generated
0 path MTU changes

# netstat -s | grep -i unreach
unreach: 413
unreach: 731
0 dropped due to ICMP unreachable
0 address unreachable
0 port unreachable
0 dropped due to ICMP unreachable
0 packets rcvd for unreachable dest

# netstat -s | grep -i route
0 SYNs dropped (no route or no space)
0 output packets discarded due to no route
0 no route available (output)
17313 output packets discarded due to no route
0 no route
0 bad router solicitation messages
0 bad router advertisement messages
0 router advertisement routes dropped
0 SYNs dropped (no route or no space)
0 no route available (output)
Post by Darren Reed
And of course the other important thing to do in an experiment is to save
the output of "netstat -s" at the start of a run and compare that with its
output when the problem has been seen again.
Here is the diff of two files I created before and after a failed job:

# ls -la netstat.*
-rw-r--r-- 1 root wheel 16332 Jun 25 09:58 netstat.a
-rw-r--r-- 1 root wheel 16332 Jun 25 09:59 netstat.b

# diff -u netstat.a netstat.b
--- netstat.a 2014-06-25 09:58:29.000000000 +0200
+++ netstat.b 2014-06-25 09:59:16.000000000 +0200
@@ -29,61 +29,61 @@
0 membership reports received for groups to which we belong
0 membership reports sent
tcp:
- 36143366 packets sent
- 6650618 data packets (13929288006 bytes)
+ 36178462 packets sent
+ 6684489 data packets (14022704288 bytes)
4682 data packets (4775110 bytes) retransmitted
- 17455512 ack-only packets (27272960 delayed)
+ 17456129 ack-only packets (27274542 delayed)
0 URG only packets
0 window probe packets
- 12029336 window update packets
- 8939 control packets
+ 12029943 window update packets
+ 8947 control packets
0 send attempts resulted in self-quench
- 48058930 packets received
- 3572420 acks (for 13901480535 bytes)
- 50912 duplicate acks
+ 48083472 packets received
+ 3589267 acks (for 13994738366 bytes)
+ 51243 duplicate acks
5 acks for unsent data
- 42343900 packets (64839052960 bytes) received in-sequence
- 13840 completely duplicate packets (852212 bytes)
+ 42346063 packets (64885718854 bytes) received in-sequence
+ 13866 completely duplicate packets (852212 bytes)
1651 old duplicate packets
781 packets with some dup. data (277997 bytes duped)
- 955822 out-of-order packets (1113010976 bytes)
+ 955823 out-of-order packets (1113010976 bytes)
0 packets (0 bytes) of data after window
0 window probes
- 496160 window update packets
- 9207 packets received after close
+ 499885 window update packets
+ 9215 packets received after close
0 discarded for bad checksums
0 discarded for bad header offset fields
0 discarded because packet too short
- 776 connection requests
- 17510 connection accepts
- 18155 connections established (including accepts)
- 18553 connections closed (including 1589 drops)
+ 780 connection requests
+ 17520 connection accepts
+ 18167 connections established (including accepts)
+ 18570 connections closed (including 1591 drops)
7 embryonic connections dropped
0 delayed frees of tcpcb
- 3522932 segments updated rtt (of 1688436 attempts)
+ 3539781 segments updated rtt (of 1692544 attempts)
2935 retransmit timeouts
146 connections dropped by rexmit timeout
1 persist timeout (resulting in 0 dropped connections)
176 keepalive timeouts
0 keepalive probes sent
0 connections dropped by keepalive
- 86368 correct ACK header predictions
- 41338655 correct data packet header predictions
- 48297 PCB hash misses
- 249 dropped due to no socket
+ 86473 correct ACK header predictions
+ 41340354 correct data packet header predictions
+ 48346 PCB hash misses
+ 255 dropped due to no socket
3 connections drained due to memory shortage
204 PMTUD blackholes detected
- 859 bad connection attempts
- 17900 SYN cache entries added
+ 871 bad connection attempts
+ 17910 SYN cache entries added
0 hash collisions
- 17510 completed
+ 17520 completed
0 aborted (no space to build PCB)
45 timed out
0 dropped due to overflow
0 dropped due to bucket overflow
345 dropped due to RST
0 dropped due to ICMP unreachable
- 17855 delayed free of SYN cache entries
+ 17865 delayed free of SYN cache entries
307 SYN,ACKs retransmitted
129 duplicate SYNs received for entries already in the cache
0 SYNs dropped (no route or no space)
@@ -93,18 +93,18 @@
0 packets with ECN CE bit
0 packets ECN ECT(0) bit
udp:
- 111819 datagrams received
+ 111837 datagrams received
0 with incomplete header
0 with bad data length field
0 with bad checksum
417 dropped due to no socket
0 broadcast/multicast datagrams dropped due to no socket
0 dropped due to full socket buffers
- 111402 delivered
- 70739 PCB hash misses
- 114114 datagrams output
+ 111420 delivered
+ 70752 PCB hash misses
+ 114132 datagrams output
ip:
- 47974139 total packets received
+ 47998699 total packets received
0 bad header checksums
0 with size smaller than minimum
0 with data size < data length
@@ -119,13 +119,13 @@
0 malformed fragments dropped
0 fragments dropped after timeout
0 packets reassembled ok
- 47970480 packets for this host
+ 47995040 packets for this host
0 packets for unknown/unsupported protocol
0 packets forwarded (0 packets fast forwarded)
0 packets not forwardable
0 redirects sent
0 packets no matching gif found
- 36067735 packets sent from this host
+ 36102877 packets sent from this host
108 packets sent with fabricated ip header
0 output packets dropped due to no bufs, etc.
0 output packets discarded due to no route
@@ -139,8 +139,8 @@
0 no route available (output)
0 generic errors (output)
0 bundled SA processed (output)
- 48291258 SPD cache lookups
- 48291258 SPD cache misses
+ 48315827 SPD cache lookups
+ 48315827 SPD cache misses

IPsec ah:
0 ah input packets processed
@@ -233,7 +233,7 @@
0 fast forward flows
0 packets not forwardable
0 redirects sent
- 206472 packets sent from this host
+ 206474 packets sent from this host
0 packets sent with fabricated ip header
0 output packets dropped due to no bufs, etc.
17379 output packets discarded due to no route
@@ -292,61 +292,61 @@
0 bad redirect messages
0 path MTU changes
tcp6:
- 36143366 packets sent
- 6650618 data packets (13929288006 bytes)
+ 36178462 packets sent
+ 6684489 data packets (14022704288 bytes)
4682 data packets (4775110 bytes) retransmitted
- 17455512 ack-only packets (27272960 delayed)
+ 17456129 ack-only packets (27274542 delayed)
0 URG only packets
0 window probe packets
- 12029336 window update packets
- 8939 control packets
+ 12029943 window update packets
+ 8947 control packets
0 send attempts resulted in self-quench
- 48058930 packets received
- 3572420 acks (for 13901480535 bytes)
- 50912 duplicate acks
+ 48083472 packets received
+ 3589267 acks (for 13994738366 bytes)
+ 51243 duplicate acks
5 acks for unsent data
- 42343900 packets (64839052960 bytes) received in-sequence
- 13840 completely duplicate packets (852212 bytes)
+ 42346063 packets (64885718854 bytes) received in-sequence
+ 13866 completely duplicate packets (852212 bytes)
1651 old duplicate packets
781 packets with some dup. data (277997 bytes duped)
- 955822 out-of-order packets (1113010976 bytes)
+ 955823 out-of-order packets (1113010976 bytes)
0 packets (0 bytes) of data after window
0 window probes
- 496160 window update packets
- 9207 packets received after close
+ 499885 window update packets
+ 9215 packets received after close
0 discarded for bad checksums
0 discarded for bad header offset fields
0 discarded because packet too short
- 776 connection requests
- 17510 connection accepts
- 18155 connections established (including accepts)
- 18553 connections closed (including 1589 drops)
+ 780 connection requests
+ 17520 connection accepts
+ 18167 connections established (including accepts)
+ 18570 connections closed (including 1591 drops)
7 embryonic connections dropped
0 delayed frees of tcpcb
- 3522932 segments updated rtt (of 1688436 attempts)
+ 3539781 segments updated rtt (of 1692544 attempts)
2935 retransmit timeouts
146 connections dropped by rexmit timeout
1 persist timeout (resulting in 0 dropped connections)
176 keepalive timeouts
0 keepalive probes sent
0 connections dropped by keepalive
- 86368 correct ACK header predictions
- 41338655 correct data packet header predictions
- 48297 PCB hash misses
- 249 dropped due to no socket
+ 86473 correct ACK header predictions
+ 41340354 correct data packet header predictions
+ 48346 PCB hash misses
+ 255 dropped due to no socket
3 connections drained due to memory shortage
204 PMTUD blackholes detected
- 859 bad connection attempts
- 17900 SYN cache entries added
+ 871 bad connection attempts
+ 17910 SYN cache entries added
0 hash collisions
- 17510 completed
+ 17520 completed
0 aborted (no space to build PCB)
45 timed out
0 dropped due to overflow
0 dropped due to bucket overflow
345 dropped due to RST
0 dropped due to ICMP unreachable
- 17855 delayed free of SYN cache entries
+ 17865 delayed free of SYN cache entries
307 SYN,ACKs retransmitted
129 duplicate SYNs received for entries already in the cache
0 SYNs dropped (no route or no space)
@@ -372,8 +372,8 @@
0 no route available (output)
0 generic errors (output)
0 bundled SA processed (output)
- 48291258 SPD cache lookups
- 48291258 SPD cache misses
+ 48315827 SPD cache lookups
+ 48315827 SPD cache misses

IPsec ah:
0 ah input packets processed
@@ -467,11 +467,11 @@
0 delivered
0 datagrams output
arp:
- 231 packets sent
+ 232 packets sent
0 reply packets
- 231 request packets
- 848 packets received
- 230 reply packets
+ 232 request packets
+ 849 packets received
+ 231 reply packets
618 valid request packets
618 broadcast/multicast packets
0 packets with unknown protocol type
Darren Reed
2014-06-24 13:01:01 UTC
Permalink
Post by Petar Bogdanovic
...
Now I'm thinking about compiling sshd with SO_DEBUG and a new kernel
with TCP_DEBUG but I'm not sure, what I'll be able to read from this.
Any suggestions are very much appreciated, since I'm running out of
ideas and would like to avoid messing with the (production-)server
more than is really necessary.
From that capture file, there is only *one* packet with the FIN bit set
(line 20076):

09:24:10.646480 IP (tos 0x0, ttl 64, id 6680, offset 0, flags [DF],
proto TCP (6), length 1084)
85.X.X.X.22 > 77.X.X.X.59412: Flags [FP.], cksum 0x744b (correct),
seq 8611816:8612848, ack 9744, win 4197, options [nop,nop,TS val 26 ecr
26], length 1032

which suggests that it isn't aligned with the normal pattern that you
are seeing?

(There are repeats later (RST packets) along with some sel-ack stuff.)

So far as I can tell, this is not a retransmission of previous data and
it is but maybe
2/1000 of a second after the last packet in that direction. Nothing
should be idling.

There is also no sign of TCP window stress so it seems unlikely to be due to
buffering problems.

Unfortunately your syslog message from sshd will not have enough
resolution to
determine which packets it is between in terms of time.

You might want to look more closely at the TCP session shutdown with
wireshark
to see what it thinks. Or a better idea would be to capture everything
in/out of that
NIC and concentrate on the part around where the session shuts down.

Darren
Petar Bogdanovic
2014-06-25 08:33:18 UTC
Permalink
Post by Petar Bogdanovic
From that capture file, there is only *one* packet with the FIN bit set
Yes.
Post by Petar Bogdanovic
09:24:10.646480 IP (tos 0x0, ttl 64, id 6680, offset 0, flags [DF],
proto TCP (6), length 1084)
85.X.X.X.22 > 77.X.X.X.59412: Flags [FP.], cksum 0x744b (correct),
seq 8611816:8612848, ack 9744, win 4197, options [nop,nop,TS val 26 ecr
26], length 1032
which suggests that it isn't aligned with the normal pattern that you
are seeing?
Well, when everything works fine, I see two FINs when the connection
closes, one from each end. Or did you mean something else?
Post by Petar Bogdanovic
You might want to look more closely at the TCP session shutdown with
wireshark to see what it thinks. Or a better idea would be to capture
everything in/out of that NIC and concentrate on the part around where
the session shuts down.
I just did a full tcpdump. Then I filtered out the particular ssh
session, all imap and submission traffic and there was almost nothing
left except a few dns bits and two LLDP_Multicasts (?).
Petar Bogdanovic
2014-06-24 20:14:13 UTC
Permalink
Have you tried with all packet filters disabled, just to make sure?
When this started, I actually did disable ipfilter. But back then the
failures were declining so when I finally decided to go filterless, the
problem was no longer reproducible---with or without ipf (or npf).

This week this changed and the tunnels were reliably failing again.. so
I did it again:

| 2014-06-24 08:00:06 | 0 | 0 | f |
| 2014-06-24 09:12:27 | 0 | 0 | f |
| 2014-06-24 09:14:18 | 0 | 0 | f |
| 2014-06-24 09:22:51 | 0 | 0 | f |
| 2014-06-24 09:24:00 | 0 | 0 | f |
| 2014-06-24 09:46:48 | 0 | 0 | f |
| 2014-06-24 09:48:43 | 0 | 0 | f |
| 2014-06-24 09:50:16 | 0 | 0 | f |
| 2014-06-24 09:59:52 | 0 | 0 | f |
| 2014-06-24 10:05:08 | 0 | 0 | f |
| 2014-06-24 10:06:13 | 0 | 0 | f |
| 2014-06-24 10:14:20 | 0 | 0 | f |
| 2014-06-24 10:17:15 | 0 | 0 | f |
| 2014-06-24 10:22:56 | 0 | 0 | f | <--- disabled ipf here
| 2014-06-24 10:49:24 | 7,830 | 3,005,086,948 | T | <--- successfuly transfered 3GB
| 2014-06-24 10:22:56 | 0 | 0 | f | <--- enabled ipf here
| 2014-06-24 12:12:45 | 0 | 0 | f |

So the local filter actually does play some role here.. :)
Post by Petar Bogdanovic
* sshd bails on a failed write() with ENETUNREACH
This can be because IPF (and NPF) return this errno to local sockets
when blocking a packet.
Probably the blocked packet is part of the port forwarded connection,
and sshd is reacting to the write error by closing the control
connection.
How do you filter your loopback interface?
Do you have specific stateless pass rules for lo0?
I've seen cases where, without explicit lo0 rules, the packet filter
tried to track state on loopback, and saw packets only in one direction,
leading to stalling sessions iff when window scaling was enabled and
traffic was just flowing fast enough to fill windows. This could work
fine for weeks, just to pop up after something unrelated was changed
(adding a database index to increase query response speed).
I've been using stateless pass rules for lo0 in both cases, ipf and npf.

My ipf configuration is older and might contain some ambiguities,
especially with regard to the lack of lo0 rules in ipf6.conf:

# ipf.conf
pass in quick on lo0 all
pass out quick on lo0 all

block in quick all with mbcast

block in all
block out all

pass in proto icmp icmp-type 3
pass out proto icmp icmp-type 3

pass in proto icmp icmp-type 8 keep state
pass out proto icmp icmp-type 8 keep state

block return-rst in proto tcp
block return-icmp in proto udp

(...)


# ipf6.conf
block in quick all
block out quick all


The npf configuration is simpler and makes no distinction between ip4
and ip6:

# npf.conf
$tcp_in = { domain, ... }
$udp_in = { domain }

$tcp_out = { domain, ... }
$udp_out = { domain, ... }

procedure "log" {
# ifconfig npflog0 create
log: npflog0
}

group (default) {
pass final on lo0 all

pass stateful in final proto icmp icmp-type 8 all
pass stateful out final proto icmp icmp-type 8 all

pass stateful in final proto tcp to any port $tcp_in
pass stateful in final proto udp to any port $udp_in
pass stateful out final proto tcp to any port $tcp_out
pass stateful out final proto udp to any port $udp_out

block return all
pass out proto icmp icmp-type 3 all
pass out proto tcp flags R/R all
}


Otherwise your interpretation is very intriguing given the fact that the
only connections failing are the ssh connections that do port-forwarding.

Worth mentioning: If transfers are short, they sometimes succeed even if
ipf is on.
Loading...