Networking with network endian kernel

marcus · August 28, 2021, 10:24am

I’m having some trouble getting the network interfaces working when running a network (big) endian kernel. The exact same kernel source and kernel config works if I just change the endianness to little.

Now, receiving packets seems to work just fine, but transmit gets stuck on the very first packet.
This is what ifconfig looks like:

eth6: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.42.10  netmask 255.255.255.0  broadcast 192.168.42.255
        inet6 fe80::20a:faff:fe24:296d  prefixlen 64  scopeid 0x20<link>
        ether 00:0a:fa:24:29:6d  txqueuelen 1000  (Ethernet)
        RX packets 111  bytes 10070 (9.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1  bytes 90 (90.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

As you can see, 111 packets have been received, but only 1 transmitted. If I run tcpdump, I see perfectly valid packets coming in but nothing going out, even when a reply to an incoming packet would be expected. For example, here are two different machines trying to ping the Ten64, one with the Ten64 in its ARP cache and the other without:

12:08:40.459261 IP 192.168.42.40 > 192.168.42.10: ICMP echo request, id 45866, seq 1, length 64
12:08:40.854371 STP 802.1w, Rapid STP, Flags [Learn, Forward], bridge-id 8000.2c:c8:1b:44:85:88.8002, length 36
12:08:41.484295 IP 192.168.42.40 > 192.168.42.10: ICMP echo request, id 45866, seq 2, length 64
12:08:42.508192 IP 192.168.42.40 > 192.168.42.10: ICMP echo request, id 45866, seq 3, length 64
12:08:42.854096 STP 802.1w, Rapid STP, Flags [Learn, Forward], bridge-id 8000.2c:c8:1b:44:85:88.8002, length 36
12:08:43.531942 IP 192.168.42.40 > 192.168.42.10: ICMP echo request, id 45866, seq 4, length 64
12:08:44.555376 IP 192.168.42.40 > 192.168.42.10: ICMP echo request, id 45866, seq 5, length 64
12:09:10.850176 STP 802.1w, Rapid STP, Flags [Learn, Forward], bridge-id 8000.2c:c8:1b:44:85:88.8002, length 36
12:09:22.850911 STP 802.1w, Rapid STP, Flags [Learn, Forward], bridge-id 8000.2c:c8:1b:44:85:88.8002, length 36
12:09:22.886453 ARP, Request who-has 192.168.42.10 tell 192.168.42.37, length 46
12:09:23.908819 ARP, Request who-has 192.168.42.10 tell 192.168.42.37, length 46

No reply is seen to either the ICMP echo request or the ARP request, even at the local level. The TX counter in ifconfig remains at 1.

So it would appear that the TX queue gets “stuck” somehow. Any ideas on where to start looking for this bug?

Thanks

marcus · August 29, 2021, 11:00am

Some additional information. This is the output from restool:

temari ~ # restool dpmac info dpmac.5
dpmac version: 4.5
dpmac object id/portal id: 5
plugged state: plugged
endpoint state: 1
endpoint: dpni.3, link is up
DPMAC link type: DPMAC_LINK_TYPE_PHY
DPMAC ethernet interface: DPMAC_ETH_IF_QSGMII
MAC address: 00:0a:fa:24:29:6d
maximum supported rate 1000 Mbps
Counters:
rx all frames: 81
rx frames ok: 81
rx frame errors: 0
rx frame discards: 0
rx u-cast: 0
rx b-cast: 6
rx m-cast: 75
rx 64 bytes: 28
rx 65-127 bytes: 38
rx 128-255 bytes: 15
rx 256-511 bytes: 0
rx 512-1023 bytes: 0
rx 1024-1518 bytes: 0
rx 1519-max bytes: 0
rx frags: 0
rx jabber: 0
rx align errors: 0
rx oversized: 0
rx pause: 0
rx bytes: 8142
tx frames ok: 0
tx u-cast: 0
tx m-cast: 0
tx b-cast: 0
tx frame errors: 0
tx undersized: 0
tx b-pause: 0
tx bytes: 0
temari ~ # restool dpni info dpni.3
dpni version: 7.14
dpni id: 3
plugged state: plugged
endpoint state: 1
endpoint: dpmac.5, link is up
link status: 1 - up
mac address: 00:0a:fa:24:29:6d
temari ~ #

Note that all the tx counters are 0, even though ifconfig says TX packets 1…

marcus · August 29, 2021, 11:17am

Oh, and I should mention that this is on kernel 5.10.52, with all the traverse patches applied (and Gentoo patches as well).

mcbridematt · August 29, 2021, 11:52am

Driver-wise, I would check __dpaa2_eth_tx to ensure the packet doesn’t get thrown out before it’s queued. I’m guessing this would be due to an incorrect byte swap.
If it is getting queued correctly, it might be getting stuck in the “portal” (DPIO) which gets a bit harder. dpaa2_io_service_enqueue_multiple_fq is where it should end up.

I’ve only ever had to probe up to the dpaa2-eth.c so I can’t give much insight into what happens from the DPIO/portal driver upwards.

restool only displays the “dataplane” counters, which come from the hardware side. You can also get driver counters from ethtool. Some of the ones that might indicate an issue are below:

ethtool -S eth0
NIC statistics:
     [hw] rx frames: 3674224
..
     [hw] rx discarded frames: 0
     [hw] rx nobuffer discards: 0
     [hw] tx discarded frames: 0
     [hw] tx confirmed frames: 538312
     [hw] tx dequeued bytes: 60603308
     [hw] tx dequeued frames: 538312
     [hw] tx rejected bytes: 0
     [hw] tx rejected frames: 0
     [hw] tx pending frames: 0
...
     [drv] enqueue portal busy: 0
     [drv] dequeue portal busy: 0
     [drv] channel pull errors: 0
...
     [qbman] rx pending frames: 0
     [qbman] rx pending bytes: 0
     [qbman] tx conf pending frames: 0
     [qbman] tx conf pending bytes: 0
     [qbman] buffer count: 1280
     [mac] rx 64 bytes: 98625
     [mac] rx 65-127 bytes: 61285
..

marcus · August 29, 2021, 12:07pm

Thanks! I’ll start digging where you suggest and see where I end up.
The ethtool -S output is as follows; the lone tx packet appears in [drv], but in neither [hw] nor [mac].

NIC statistics:
     [hw] rx frames: 7
     [hw] rx bytes: 667
     [hw] rx mcast frames: 4
     [hw] rx mcast bytes: 344
     [hw] rx bcast frames: 3
     [hw] rx bcast bytes: 323
     [hw] tx frames: 0
     [hw] tx bytes: 0
     [hw] tx mcast frames: 0
     [hw] tx mcast bytes: 0
     [hw] tx bcast frames: 0
     [hw] tx bcast bytes: 0
     [hw] rx filtered frames: 41
     [hw] rx discarded frames: 0
     [hw] rx nobuffer discards: 0
     [hw] tx discarded frames: 0
     [hw] tx confirmed frames: 0
     [hw] tx dequeued bytes: 0
     [hw] tx dequeued frames: 0
     [hw] tx rejected bytes: 0
     [hw] tx rejected frames: 0
     [hw] tx pending frames: 0
     [drv] tx conf frames: 0
     [drv] tx conf bytes: 0
     [drv] tx sg frames: 1
     [drv] tx sg bytes: 90
     [drv] rx sg frames: 0
     [drv] rx sg bytes: 0
     [drv] tx converted sg frames: 1
     [drv] tx converted sg bytes: 90
     [drv] enqueue portal busy: 0
     [drv] dequeue portal busy: 0
     [drv] channel pull errors: 0
     [drv] cdan: 7
     [drv] xdp drop: 0
     [drv] xdp tx: 0
     [drv] xdp tx errors: 0
     [drv] xdp redirect: 0
     [qbman] rx pending frames: 0
     [qbman] rx pending bytes: 0
     [qbman] tx conf pending frames: 0
     [qbman] tx conf pending bytes: 0
     [qbman] buffer count: 1274
     [mac] rx 64 bytes: 26
     [mac] rx 65-127 bytes: 14
     [mac] rx 128-255 bytes: 8
     [mac] rx 256-511 bytes: 0
     [mac] rx 512-1023 bytes: 0
     [mac] rx 1024-1518 bytes: 0
     [mac] rx 1519-max bytes: 0
     [mac] rx frags: 0
     [mac] rx jabber: 0
     [mac] rx frame discards: 0
     [mac] rx align errors: 0
     [mac] tx undersized: 0
     [mac] rx oversized: 0
     [mac] rx pause: 0
     [mac] tx b-pause: 0
     [mac] rx bytes: 4620
     [mac] rx m-cast: 45
     [mac] rx b-cast: 3
     [mac] rx all frames: 48
     [mac] rx u-cast: 0
     [mac] rx frame errors: 0
     [mac] tx bytes: 0
     [mac] tx m-cast: 0
     [mac] tx b-cast: 0
     [mac] tx u-cast: 0
     [mac] tx frame errors: 0
     [mac] rx frames ok: 48
     [mac] tx frames ok: 0

If the packet gets lost before even reaching the hardware, that should make debugging much easier.

marcus · August 29, 2021, 2:38pm

Ok, I found the bug. It was indeed in the “portal”. When qbman_swp_enqueue() sets the valid bit, it accesses the start of the struct qbman_eq_desc as a uint32_t, and forgets to use cpu_to_le32 when doing so. Thus, the valid bit ends up in the wrong place, and the HW will not see the descriptor as valid.
The following patch fixes the issue:

--- drivers/soc/fsl/dpio/qbman-portal.c.orig    2021-08-29 16:00:23.905830470 +0200
+++ drivers/soc/fsl/dpio/qbman-portal.c 2021-08-29 16:12:17.812402843 +0200
@@ -686,7 +686,7 @@
        eqcr_pi = s->eqcr.pi;
        for (i = 0; i < num_enqueued; i++) {
                p = (s->addr_cena + QBMAN_CENA_SWP_EQCR(eqcr_pi & half_mask));
-               p[0] = cl[0] | s->eqcr.pi_vb;
+               p[0] = cl[0] | cpu_to_le32(s->eqcr.pi_vb);
                if (flags && (flags[i] & QBMAN_ENQUEUE_FLAG_DCA)) {
                        struct qbman_eq_desc *d = (struct qbman_eq_desc *)p;
 
@@ -768,7 +768,7 @@
        eqcr_pi = s->eqcr.pi;
        for (i = 0; i < num_enqueued; i++) {
                p = (s->addr_cena + QBMAN_CENA_SWP_EQCR(eqcr_pi & half_mask));
-               p[0] = cl[0] | s->eqcr.pi_vb;
+               p[0] = cl[0] | cpu_to_le32(s->eqcr.pi_vb);
                if (flags && (flags[i] & QBMAN_ENQUEUE_FLAG_DCA)) {
                        struct qbman_eq_desc *d = (struct qbman_eq_desc *)p;
 
@@ -845,7 +845,7 @@
        for (i = 0; i < num_enqueued; i++) {
                p = (s->addr_cena + QBMAN_CENA_SWP_EQCR(eqcr_pi & half_mask));
                cl = (uint32_t *)(&d[i]);
-               p[0] = cl[0] | s->eqcr.pi_vb;
+               p[0] = cl[0] | cpu_to_le32(s->eqcr.pi_vb);
                eqcr_pi++;
                if (!(eqcr_pi & half_mask))
                        s->eqcr.pi_vb ^= QB_VALID_BIT;
@@ -913,7 +913,7 @@
        for (i = 0; i < num_enqueued; i++) {
                p = (s->addr_cena + QBMAN_CENA_SWP_EQCR(eqcr_pi & half_mask));
                cl = (uint32_t *)(&d[i]);
-               p[0] = cl[0] | s->eqcr.pi_vb;
+               p[0] = cl[0] | cpu_to_le32(s->eqcr.pi_vb);
                eqcr_pi++;
                if (!(eqcr_pi & half_mask))
                        s->eqcr.pi_vb ^= QB_VALID_BIT;

The bug still seems to be present in mainline.
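
To illustrate why the valid bit lands in the wrong place, here is a minimal user-space sketch (not driver code). It models how a big-endian CPU stores the 32-bit word written by p[0] = cl[0] | vb. The names SKETCH_VALID_BIT, sketch_swab32, sketch_store_be, valid_bit_offset and self_check are made up for this example; the value 0x80 is assumed to mirror QB_VALID_BIT, and the byte swap stands in for what cpu_to_le32() does on a big-endian machine.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors QB_VALID_BIT, i.e. bit 7 of the descriptor's first byte. */
#define SKETCH_VALID_BIT ((uint32_t)0x80)

/* Stand-in for cpu_to_le32() on a big-endian CPU: a plain byte swap. */
static uint32_t sketch_swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Lay out a 32-bit value in memory the way a big-endian CPU stores it. */
static void sketch_store_be(uint8_t mem[4], uint32_t v)
{
	mem[0] = (uint8_t)(v >> 24);
	mem[1] = (uint8_t)(v >> 16);
	mem[2] = (uint8_t)(v >> 8);
	mem[3] = (uint8_t)v;
}

/*
 * Returns the byte offset at which the valid bit ends up when a
 * big-endian CPU executes "p[0] = cl[0] | vb".
 * use_fix == 0: vb is used in CPU byte order (the buggy code path).
 * use_fix != 0: vb is byte-swapped first (the patched code path).
 */
int valid_bit_offset(int use_fix)
{
	uint32_t cl0 = 0; /* rest of the first descriptor word, zeroed for clarity */
	uint32_t vb = use_fix ? sketch_swab32(SKETCH_VALID_BIT) : SKETCH_VALID_BIT;
	uint8_t mem[4];
	int i;

	sketch_store_be(mem, cl0 | vb);
	for (i = 0; i < 4; i++)
		if (mem[i] & 0x80)
			return i;
	return -1;
}

/* Checks both cases; aborts via assert() if the model disagrees. */
void self_check(void)
{
	assert(valid_bit_offset(0) == 3); /* unpatched: bit lands in the last byte */
	assert(valid_bit_offset(1) == 0); /* patched: bit lands in the first byte */
}
```

In this model the unpatched write puts the valid bit at byte offset 3 of the descriptor, while the patched write puts it at offset 0, which is where a little-endian kernel puts it. That matches the symptom above: the descriptor is written, but the hardware never sees it as valid.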