Brett Johnson <maillist@xxxxxxx> said:

> This is essentially the same setup I had under CIPE a while back. I never had any corruption there. OpenVPN was the only change because CIPE had reconnect problems. I have noticed that the corruptions have lessened gradually with each new release of OpenVPN.

I really doubt that _any_ VPN package could corrupt an FTP session, given the combination of cryptographic authentication and TCP checksumming.

As an experiment, try this: OpenVPN actually has the ability to corrupt packets intentionally. It is a debugging mode used for testing, and it is activated with the --gremlin option. --gremlin causes OpenVPN to randomly corrupt bits in received packets, immediately after they are read from the network. Now try an FTP session over the tunnel and see whether --gremlin can corrupt the FTP transfer.

In fact I just tried this test myself, transferring a 360KB file over an OpenVPN link with --gremlin specified on both sides of the connection. I even purposely turned off encryption/authentication, so that rather than being dropped immediately due to a failed HMAC test, the packets are passed into the TCP/IP stack and must be rejected by the TCP/IP checksumming code.

The result is just as one would expect. The transfer goes very slowly, because corrupted packets get dropped and have to be retransmitted. But after the transfer completed, I did a cmp of the file against the original and they were identical.

I then ran the same test again with encryption and authentication enabled. I got a lot of messages like this:

    Authenticate/Decrypt packet error: packet HMAC authentication failed

but again, the transfer completed, slower than normal, and the transferred file was identical to the original.

> As far as using tcpdump, I can't sit around watching hundreds of megabytes of packets going through trying to catch the bad segment that happens once every few days.

Obviously, but this is a pattern recognition problem and tcpdump has quite a number of filters. Since we are looking for corrupted packets with bad checksums, that significantly cuts down the number of packets that would need to be looked at.

> I understand about the UDP layer having its checksum, the VPN layer having its checksum (sha), and the TCP connection within the VPN (for FTP in this case) also having its checksum. In reality I find it hard to believe that there could be a failure here, but this is the only place where the failures are showing up. The finger keeps pointing back to something within the OpenVPN line of events. That's why I'm calling this a bug. Believe me, this isn't the first time I've sat down to troubleshoot this. I've been running OpenVPN for months now and have just put up with sporadic failures.

Well, clearly there's a bug somewhere, but I'm sceptical it's an OpenVPN bug for the reasons listed above. Even when I intentionally turn on packet corruption in OpenVPN using --gremlin, I am unable to corrupt an FTP session running over the tunnel, even if I also disable encryption/authentication.

> Let's shift gears for a minute. How and where could it possibly fail? Can OpenVPN modify the data after it comes out of encryption?

It can, but it only does so when you are using special-purpose packet mangling options such as --gremlin or --mssfix.

> Does TCP checksumming really catch this?

TCP checksumming will (with a high degree of accuracy) catch corruption that is picked up during packet processing and network transmission. It will not, however, protect against malicious corruption. That is the job of HMAC-SHA1.

> Could there be a problem with the tun device?

TCP checksumming would catch it, because the tun device is inside the "window" that exists after the checksum is applied by the sender and before it is verified by the receiver.

> Could it flub up the packet? Would the tap device be better to use?

The tap device on Linux seems less solid than the tun device, based on some analysis I've done recently on tap device stalls. It doesn't corrupt packets, but it causes TCP connections to occasionally stall.

My best guess is that the corruption is occurring before the packets are checksummed by TCP on the sending side, or after the packets are verified by TCP on the receiving side. That means a problem in the FTP client or server, a problem in the TCP/IP stack, or even something like file system corruption.

I would try using other file transfer protocols such as rsync or scp, or other ftp client/server implementations, to see if you can replicate the corruption. At this point I would say we need more data points before we can narrow down the cause.

James

____________________________________________
Openvpn-users mailing list
Openvpn-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/openvpn-users
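
One way to gather the extra data points asked for above is to compare each transferred file against its original, in the spirit of the cmp check described in the gremlin test, and to record where the first difference occurs. The following is a minimal sketch in Python and is not part of the original thread. The function names `sha256_of` and `first_difference` and the 64 KiB chunk size are assumptions made for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 16):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_difference(path_a, path_b, chunk_size=1 << 16):
    """Return the byte offset of the first difference between two files.

    Returns None when the files are identical. If one file is a strict
    prefix of the other, the length of the shorter file is returned.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the first differing byte inside this chunk pair.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No differing byte in the overlap: the lengths differ.
                return offset + min(len(a), len(b))
            if not a:
                # Both files ended at the same point with equal content.
                return None
            offset += len(a)
```

Run after each transfer (over FTP, scp, or rsync), a call such as `first_difference("original.bin", "received.bin")`, with hypothetical file names, returns None for a clean copy and otherwise the offset where the copy first diverges. Logging that offset over several failures may show whether the damage follows a pattern, such as alignment to a block size, which would point toward the file system or the application and away from the network path.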