1522-byte frames from an access point are dropped by the gateway

Network Engineering ethernet linux packet-loss
2021-07-31 23:46:08

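Edit 2: This may be a bug in the kernel or in a module. It is reported here: https://bugzilla.kernel.org/show_bug.cgi?id=109471 Can anyone confirm that this behavior is definitely out of spec?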

Original:

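I am tracking down a network problem in my lab: a Wi-Fi device (possibly several) communicates without any trouble through one of my access points (APs), but its network flows are disrupted when it communicates through another AP. The bad AP sends "oversized" Ethernet frames of length 1522 (see the rx_length_errors counter below). The APs that work never send frames larger than 1514 bytes, and the gateway accepts and forwards all of those.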

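I believe the Wi-Fi client device uses an MTU of 1500 on both APs, because the IP packets carrying the TCP traffic are always at or below 1500 bytes in both cases. That rules out the client as a factor. When the Ethernet framing is added, one AP adds 14 bytes (yielding 1514-byte frames) and the other AP adds 22 bytes (yielding 1522-byte frames).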

Sample oversized packet (MAC and IP addresses modified for anonymity):

03:51:25.978066 1b:22:fe:cd:f3:77 > a8:27:db:fe:9e:51, ethertype IPv4 (0x0800), length 1522: (tos 0x0, ttl 64, id 55530, offset 0, flags [DF], proto TCP (6), length 1500)
    192.168.1.149.41744 > 54.239.17.6.443: Flags [.], cksum 0x5b39 (correct), seq 1724444568:1724446028, ack 3109213756, win 377, length 1460

If I connect to my other AP, the maximum frame size ends up being 1514 bytes, and those frames pass through the gateway without problems.

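Comparing a 1514-byte frame with a 1522-byte frame reveals an incorrect four-byte frame check sequence (normal checksum offload?) plus four trailer bytes. That is 8 bytes in total, which could explain the 8-byte difference between the working and non-working cases. However, I can't be 100% sure that a frame check sequence isn't also present in the working case, even though Wireshark does not show one in the packet dissector, so this may account for only 4 of the 8 extra bytes.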
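To judge whether the last four bytes of a captured frame really are a frame check sequence, the CRC can be recomputed by hand. The following is a minimal sketch, assuming the capture holds the whole frame from the destination MAC onward and that the FCS is the standard Ethernet CRC-32 stored least significant byte first. The helper names `fcs_is_valid` and `append_fcs` are made up for illustration. Note that any trailer bytes sitting before the FCS are covered by the CRC as well.

```python
import struct
import zlib


def append_fcs(frame: bytes) -> bytes:
    """Append a CRC-32 frame check sequence to a frame (header + payload + trailer)."""
    # zlib.crc32 uses the same CRC-32 as Ethernet; the FCS is stored
    # least significant byte first, hence the little-endian pack.
    return frame + struct.pack("<I", zlib.crc32(frame) & 0xFFFFFFFF)


def fcs_is_valid(frame_with_fcs: bytes) -> bool:
    """Return True if the last four bytes match the CRC-32 of everything before them."""
    if len(frame_with_fcs) < 4:
        return False
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return fcs == struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF)


# Synthetic example data, not taken from the capture above.
sample = append_fcs(bytes(range(64)))
print(fcs_is_valid(sample))                  # a frame with a freshly computed FCS
print(fcs_is_valid(sample[:-1] + b"\x00"))   # same frame with the last FCS byte overwritten
```

If the check fails on a frame the sender believes it sent correctly, the trailing bytes may not be an FCS at all, or the frame was altered after the CRC was computed.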

(Why would Linux pad the Ethernet frame with a trailer when the enclosed IP packet is already 1500 bytes long?)

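On the gateway, the frame and length error counters increase in lockstep with the 1522-byte frames arriving from the AP. That is how I concluded that the frame size is a significant factor.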

/sys/class/net/eth0/statistics/rx_frame_errors:36059
/sys/class/net/eth0/statistics/rx_length_errors:36059
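A small script makes it easier to watch these counters move while the bad AP is sending. This is only a sketch, assuming the Linux sysfs layout shown above; the function names `read_rx_counters` and `counter_deltas` are invented here, and `eth0` is just the interface name from my gateway.

```python
from pathlib import Path


def read_rx_counters(iface, names=("rx_frame_errors", "rx_length_errors"),
                     root="/sys/class/net"):
    """Read the named per-interface counters from sysfs and return them as ints."""
    stats_dir = Path(root) / iface / "statistics"
    return {name: int((stats_dir / name).read_text().strip()) for name in names}


def counter_deltas(before, after):
    """Return how much each counter grew between two snapshots."""
    return {name: after[name] - before[name] for name in before}


# Example use on the gateway (left as comments so nothing touches /sys on import):
#   before = read_rx_counters("eth0")
#   ... let the client send traffic through the bad AP ...
#   after = read_rx_counters("eth0")
#   print(counter_deltas(before, after))
```

If both deltas match the number of 1522-byte frames seen in a capture taken over the same interval, that supports the lockstep observation.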

The good AP is running kernel

Linux goodAP 3.18.11+ #781 PREEMPT Tue Apr 21 18:02:18 BST 2015 armv6l GNU/Linux

The bad AP is running kernel

Linux notsogoodAP 3.18.11-v7+ #781 SMP PREEMPT Tue Apr 21 18:07:59 BST 2015 armv7l GNU/Linux

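So the question is: why is a Linux server (the access point) generating oversized (1522-byte) Ethernet frames with a seemingly invalid frame check sequence?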

Edit: Added a full packet dump exported from Wireshark:

No.     Time            Source                Destination           Protocol Length Info
     35 21:55:21.644314 192.168.1.149        54.239.25.200         TCP      1522   [TCP Retransmission] 44053→443 [ACK] Seq=350 Ack=155 Win=88832 Len=1460 [ETHERNET FRAME CHECK SEQUENCE INCORRECT]

Frame 35: 1522 bytes on wire (12176 bits), 1522 bytes captured (12176 bits)
    Encapsulation type: Ethernet (1)
    Arrival Time: Dec 15, 2015 21:55:21.644314000 EST
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1450234521.644314000 seconds
    [Time delta from previous captured frame: 0.150820000 seconds]
    [Time delta from previous displayed frame: 3.683211000 seconds]
    [Time since reference or first frame: 9.568260000 seconds]
    Frame Number: 35
    Frame Length: 1522 bytes (12176 bits)
    Capture Length: 1522 bytes (12176 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: eth:ethertype:ip:tcp]
    [Coloring Rule Name: Bad TCP]
    [Coloring Rule String: tcp.analysis.flags && !tcp.analysis.window_update]
Ethernet II, Src: xx:xx:xx:xx:xx:xx (xx:xx:xx:xx:xx:xx), Dst: Raspberr_xx:xx:xx (xx:xx:xx:xx:xx:xx)
    Destination: Raspberr_xx:xx:xx (xx:xx:xx:xx:xx:xx)
        Address: Raspberr_xx:xx:xx (xx:xx:xx:xx:xx:xx)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Source: xx:xx:xx:xx:xx:xx (xx:xx:xx:xx:xx:xx)
        Address: xx:xx:xx:xx:xx:xx (xx:xx:xx:xx:xx:xx)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Type: IP (0x0800)
    Trailer: 2f8e4de5
    Frame check sequence: 0x29f37f68 [incorrect, should be 0x3b863a28]
        [FCS Good: False]
        [FCS Bad: True]
            [Expert Info (Error/Checksum): Bad checksum]
                [Bad checksum]
                [Severity level: Error]
                [Group: Checksum]
Internet Protocol Version 4, Src: 192.168.1.149 (192.168.1.149), Dst: 54.239.25.200 (54.239.25.200)
    Version: 4
    Header Length: 20 bytes
    Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
        0000 00.. = Differentiated Services Codepoint: Default (0x00)
        .... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
    Total Length: 1500
    Identification: 0xaaf3 (43763)
    Flags: 0x02 (Don't Fragment)
        0... .... = Reserved bit: Not set
        .1.. .... = Don't fragment: Set
        ..0. .... = More fragments: Not set
    Fragment offset: 0
    Time to live: 64
    Protocol: TCP (6)
    Header checksum: 0x3234 [validation disabled]
        [Good: False]
        [Bad: False]
    Source: 192.168.1.149 (192.168.1.149)
    Destination: 54.239.25.200 (54.239.25.200)
    [Source GeoIP: Unknown]
    [Destination GeoIP: Unknown]
Transmission Control Protocol, Src Port: 44053 (44053), Dst Port: 443 (443), Seq: 350, Ack: 155, Len: 1460
    Source Port: 44053 (44053)
    Destination Port: 443 (443)
    [Stream index: 1]
    [TCP Segment Len: 1460]
    Sequence number: 350    (relative sequence number)
    [Next sequence number: 1810    (relative sequence number)]
    Acknowledgment number: 155    (relative ack number)
    Header Length: 20 bytes
    .... 0000 0001 0000 = Flags: 0x010 (ACK)
        000. .... .... = Reserved: Not set
        ...0 .... .... = Nonce: Not set
        .... 0... .... = Congestion Window Reduced (CWR): Not set
        .... .0.. .... = ECN-Echo: Not set
        .... ..0. .... = Urgent: Not set
        .... ...1 .... = Acknowledgment: Set
        .... .... 0... = Push: Not set
        .... .... .0.. = Reset: Not set
        .... .... ..0. = Syn: Not set
        .... .... ...0 = Fin: Not set
    Window size value: 347
    [Calculated window size: 88832]
    [Window size scaling factor: 256]
    Checksum: 0x1c58 [validation disabled]
        [Good Checksum: False]
        [Bad Checksum: False]
    Urgent pointer: 0
    [SEQ/ACK analysis]
        [iRTT: 0.028704000 seconds]
        [Bytes in flight: 2077]
        [TCP Analysis Flags]
            [Expert Info (Note/Sequence): This frame is a (suspected) retransmission]
                [This frame is a (suspected) retransmission]
                [Severity level: Note]
                [Group: Sequence]
            [The RTO for this segment was: 7.247571000 seconds]
            [RTO based on delta from frame: 12]
    Retransmitted TCP segment data (1460 bytes)



0000  b8 27 eb fe 9e 71 1c 56 fe cd f3 75 08 00 45 00   .'...q.V...u..E.
0010  05 dc aa f4 40 00 40 06 32 33 c0 a8 46 95 36 ef   ....@.@.23..F.6.
0020  19 c8 ac 15 01 bb 71 be a7 a7 cb 7c 15 6f 50 10   ......q....|.oP.
0030  01 5b 1c 58 00 00 17 03 03 08 18 00 00 00 00 00   .[.X............
0040  00 00 02 cd 63 94 47 c0 31 81 4d ea b2 13 b6 46   ....c.G.1.M....F
0050  cb 5d 4d 64 71 21 b1 c1 3c 6d d6 1c 5e 0a 89 4b   .]Mdq!..<m..^..K
0060  a3 f8 58 f3 a9 2e 95 81 2c 08 be cc 64 a2 11 c3   ..X.....,...d...
0070  8c 7e 10 7f 51 e5 87 e7 c0 49 08 19 65 0a 7a 5b   .~..Q....I..e.z[
0080  ee 22 2e 93 31 cf 22 5d 7e 36 5f ee 3b 2e 43 f0   ."..1."]~6_.;.C.
0090  83 23 9f 69 7f ae 82 22 04 f4 02 42 fb 28 43 f3   .#.i..."...B.(C.
00a0  8f 05 80 6c fd 7f ef 47 7b 07 d2 b7 d9 e8 ab 78   ...l...G{......x
00b0  54 67 af 61 bf 55 89 33 f3 85 5d 7b a4 53 34 73   Tg.a.U.3..]{.S4s
00c0  17 75 b0 da 6a 31 d5 0a 86 7a 11 66 7f 9f 81 5a   .u..j1...z.f...Z
00d0  96 bc 64 72 0d da 32 01 8d 88 70 d3 f5 e7 70 29   ..dr..2...p...p)
00e0  94 be 35 37 10 10 41 79 fc d3 4f f1 1d a2 c3 ef   ..57..Ay..O.....
00f0  32 81 50 e2 2a ca 6e bf 95 b8 77 1b ee 76 ba fb   2.P.*.n...w..v..
0100  8d 85 25 a2 47 7a 96 fd af c3 39 98 8b bd ff 31   ..%.Gz....9....1
0110  ae ca 15 60 e1 2f 00 4f 43 c4 20 11 92 47 91 10   ...`./.OC. ..G..
0120  f6 0c a7 ea 5d 54 f1 01 eb d4 9e 7a bb ae 08 5f   ....]T.....z..._
0130  9b 31 85 af 4e b4 0b 03 ad 6b 51 11 51 e5 f8 d8   .1..N....kQ.Q...
0140  af ff 78 46 3e 24 6c 2e 40 3e aa 27 b7 87 10 c7   ..xF>$l.@>.'....
0150  5a aa d3 33 fa b6 bc 4d e9 2f 18 34 81 98 28 34   Z..3...M./.4..(4
0160  18 de e8 32 ae c0 21 6a 52 20 3b 5f 12 6c f4 df   ...2..!jR ;_.l..
0170  71 c6 e0 cc b2 d1 75 94 f2 e3 63 e5 4d a5 7c ba   q.....u...c.M.|.
0180  1c 46 e1 83 e1 ca e3 c7 dc c3 08 d7 0a 3e e2 3b   .F...........>.;

2 Answers

An Ethernet frame with a 1500-byte payload is:

  • 1518 bytes without an 802.1Q tag
  • 1522 bytes with an 802.1Q tag
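The arithmetic behind these sizes, and behind the 1514 bytes seen in a capture without the FCS, can be written out as a short sketch. The constant and function names are only illustrative.

```python
ETH_HEADER = 14  # destination MAC (6) + source MAC (6) + EtherType (2)
VLAN_TAG = 4     # 802.1Q tag: TPID (2) + TCI (2)
FCS = 4          # frame check sequence (CRC-32)


def frame_size(payload, tagged=False, with_fcs=True):
    """Return the Ethernet frame size in bytes for a given payload length."""
    size = ETH_HEADER + payload
    if tagged:
        size += VLAN_TAG
    if with_fcs:
        size += FCS
    return size


print(frame_size(1500, with_fcs=False))  # 1514: untagged, FCS not captured
print(frame_size(1500))                  # 1518: untagged, FCS included
print(frame_size(1500, tagged=True))     # 1522: 802.1Q tagged, FCS included
```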

The WAP that shows 1514 bytes does so because the frame has no 802.1Q tag and the interface or driver does not supply the FCS to Wireshark. From the Wireshark wiki:

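Most Ethernet interfaces also either don't supply the FCS to Wireshark or other applications, or aren't configured by their driver to do so; therefore, Wireshark will typically only be given the green fields, although on some platforms, with some interfaces, the FCS will be supplied on incoming packets.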

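The WAP that shows you a 1522-byte frame size is apparently showing you the FCS (four bytes), so the interface or driver is supplying it to Wireshark. The other four bytes are due to the 802.1Q tag.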
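Whether a particular captured frame carries an 802.1Q tag can be checked from its first bytes: a tagged frame has the TPID 0x8100 at offset 12, followed by the tag control information and then the real EtherType. Below is a minimal sketch, assuming the bytes start at the destination MAC; `describe_ethertype` is a hypothetical helper name.

```python
import struct

TPID_8021Q = 0x8100  # value at offset 12 that marks an 802.1Q-tagged frame


def describe_ethertype(frame: bytes) -> dict:
    """Report whether a frame is 802.1Q tagged, its VLAN ID, and its EtherType."""
    if len(frame) < 14:
        raise ValueError("need at least the 14-byte Ethernet header")
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == TPID_8021Q:
        if len(frame) < 18:
            raise ValueError("tagged frame is shorter than 18 bytes")
        tci, inner = struct.unpack("!HH", frame[14:18])
        return {"tagged": True, "vlan_id": tci & 0x0FFF, "ethertype": inner}
    return {"tagged": False, "vlan_id": None, "ethertype": ethertype}


# First 14 bytes of the hex dump in the question.
header = bytes.fromhex("b8 27 eb fe 9e 71 1c 56 fe cd f3 75 08 00")
print(describe_ethertype(header))
# {'tagged': False, 'vlan_id': None, 'ethertype': 2048}
```

For the header bytes in the question's hex dump this reports EtherType 2048 (0x0800) directly at offset 12, which agrees with the `Type: IP (0x0800)` line in the dissection.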

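Your router is not expecting 802.1Q-tagged frames, so it gives you errors. You need to disable 802.1Q on the WAP. A WAP can use 802.1Q on the wired side to separate traffic among multiple SSIDs.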

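After further investigation and trying different USB Wi-Fi cards, I found that the problem lies with one particular USB Wi-Fi device and does not occur with other USB Wi-Fi devices.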

I am treating the problem as a bug in the Wi-Fi card driver.

The problematic device is a "TP-Link TL-WN-722N (ath9k_htc) USB".