More Details on Interrupt Balancing/DPAA2 Config/DPIO Splitting

On my default-config Debian 5.10 Ten64, it seems that all the network interrupts for several of the GE ports land on dpio.8/core0, leading to a significant softirq imbalance. I assume this is roughly what the following note describes:

Ten64 ships with a configuration that makes all 10 ethernet ports available. However, this is a compromise configuration that stretches the resources in the LS1088 - if you don’t need all 10 interfaces it is a good idea to remove any you are not using.

Are there more details available on how to disable unused interfaces or otherwise split the DPIOs across ports differently?
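
For anyone who wants to check for the same symptom, here is a rough sketch that sums the per-CPU counts of every /proc/interrupts line mentioning dpio. It assumes the DPIO interrupts carry "dpio" in their name (as in dpio.8 above) and that the header row lists one column per CPU; dpio_irq_summary is just a name made up for this example.

```shell
#!/bin/sh
# Sum per-CPU interrupt counts for all lines containing "dpio".
# dpio_irq_summary is a hypothetical helper, not part of any Ten64 tooling.
# $1 = file to read (defaults to /proc/interrupts)
dpio_irq_summary() {
    awk '
        NR == 1 { ncpu = NF; next }   # header row: one field per CPU column
        /dpio/ {
            # field 1 is the IRQ number, fields 2..ncpu+1 are the per-CPU counts
            for (i = 1; i <= ncpu; i++) total[i] += $(i + 1)
        }
        END {
            for (i = 1; i <= ncpu; i++) printf "CPU%d %d\n", i - 1, total[i]
        }
    ' "${1:-/proc/interrupts}"
}
```

Call dpio_irq_summary with no argument to read the live counters, or pass a saved copy of /proc/interrupts. A result where nearly everything sits on CPU0 would match the imbalance described above.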

Good question! I’ve been meaning to write this up for the manual.

The best way to do this is to create your own data path layout (DPL) with only the Ethernet ports you need.
The procedure is:

  1. Flash the ‘eth0 only’ DPL in recovery mode and reboot
  2. Add the ports you want with ls-addni (note, eth0->GE0 is already in there, but you can edit this)
  3. Compile and flash the new DPL

As commands, the first step looks roughly like this:

# Download "eth-dpl-eth0-only.dtb" from the firmware package (also available under "components" on our server)
wget https://archive.traverse.com.au/pub/traverse/ls1088firmware/firmware-builds/latest/components/eth-dpl-eth0-only.dtb
mtd erase dpl && mtd write eth-dpl-eth0-only.dtb dpl
reboot

(Alternatively, if your kernel supports restool, you can stop here and just add the ports you want at runtime.)

On the next boot, go into recovery again and add the specific ports you want.
See DPAA2 configuration in the manual for the DPMAC numbers corresponding to each Ethernet port.
The default order is 7,8,9,10,3,4,5,6,1,2.

ls-addni dpmac.8  # GE1
ls-addni dpmac.9  # GE2
ls-addni dpmac.10 # GE3
ls-addni dpmac.2  # XG0
ls-addni dpmac.1  # XG1
# Dump the new DPL
restool dprc generate-dpl > new-dpl.dts
# Edit the DPL to fix the port ordering (see below)
vi new-dpl.dts
# Compile into a blob (ignore the warnings from dtc, since this is not an actual device tree)
dtc -I dts -O dtb -o new-dpl.dtb new-dpl.dts
# Flash
mtd erase dpl && mtd write new-dpl.dtb dpl
# Reboot
reboot

Port ordering
Linux enumerates the Ethernet ports in the reverse order of the network interfaces (DPNIs).
To get around this, connect the last DPNI to the first MAC, which will then become eth0.
You need to edit the “connection@X” nodes at the bottom of the DPL file.
For example:

                connection@1{
                        endpoint1 = "dpni@5"; /* was dpni@0 */
                        endpoint2 = "dpmac@7";
                };

                connection@2{
                        endpoint1 = "dpni@4"; /* was dpni@1 */
                        endpoint2 = "dpmac@8";
                };

See the source of eth-dpl-all.dts for how the default DPL handles this.
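
If you would rather not reverse the indices by hand, the mapping can be scripted. A rough sketch, under a few assumptions: the DPNIs are numbered 0 to N-1, the connection labels start at 1 as in the excerpt above, and gen_connections is a made-up helper name (it is not part of restool or the firmware tools). Pass the DPMAC numbers in the order you want them to show up as eth0, eth1, and so on.

```shell
#!/bin/sh
# Print connection nodes that pair the LAST DPNI with the FIRST DPMAC given,
# so that the first DPMAC ends up as eth0.
# gen_connections is a hypothetical helper written for this example.
gen_connections() {
    n=$#    # number of DPMACs, assumed equal to the number of DPNIs
    i=0
    for mac in "$@"; do
        i=$((i + 1))
        printf 'connection@%d{\n' "$i"
        printf '\tendpoint1 = "dpni@%d";\n' "$((n - i))"
        printf '\tendpoint2 = "dpmac@%d";\n' "$mac"
        printf '};\n'
    done
}
```

For the six ports used earlier, gen_connections 7 8 9 10 2 1 pairs dpni@5 with dpmac@7, dpni@4 with dpmac@8, and so on down to dpni@0 with dpmac@1. Compare the result with the connection nodes in your dumped DPL before pasting anything in.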

Future updates
NXP has added features (DPNI_OPT_SHARED_FS) in recent MC firmware releases that might allow a full 10 port configuration with better portal balancing. I’ve tried to use it but ran into issues getting the firmware to accept the configuration. I have an enquiry open with them on this, and hopefully they will fix the issue soon.

Brilliant writeup!

I’d just like to add that the mapping from DPNI number to Linux interface name can be tweaked using udev rules. I have the following in rules.d to get interface names that match the marking on the ports, despite the active ports being discontiguous:

SUBSYSTEM=="net", ACTION=="add", DEVPATH=="/devices/platform/soc/80c000000.fsl-mc/dprc.1/dpni.0/net/*", NAME="eno0d0"
SUBSYSTEM=="net", ACTION=="add", DEVPATH=="/devices/platform/soc/80c000000.fsl-mc/dprc.1/dpni.1/net/*", NAME="eno0d2"
SUBSYSTEM=="net", ACTION=="add", DEVPATH=="/devices/platform/soc/80c000000.fsl-mc/dprc.1/dpni.2/net/*", NAME="eno0d4"
SUBSYSTEM=="net", ACTION=="add", DEVPATH=="/devices/platform/soc/80c000000.fsl-mc/dprc.1/dpni.3/net/*", NAME="eno0d6"
SUBSYSTEM=="net", ACTION=="add", DEVPATH=="/devices/platform/soc/80c000000.fsl-mc/dprc.1/dpni.4/net/*", NAME="eno0d8"

(eno0 is persistent network naming speak for “first onboard ethernet controller”.) There are also patches to allow rules to match on the DPMAC number, but these rules work with stock udev.
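
With more ports, the rules can be generated instead of typed out. A small sketch that prints one rule per "DPNI index, interface name" pair, reusing the DEVPATH pattern from the rules above; gen_udev_rules is a made-up helper name, and the DEVPATH is assumed to be the same on your board.

```shell
#!/bin/sh
# Print one udev rule per <dpni index> <interface name> pair.
# gen_udev_rules is a hypothetical helper; the DEVPATH is copied from the rules above.
gen_udev_rules() {
    while [ $# -ge 2 ]; do
        printf 'SUBSYSTEM=="net", ACTION=="add", DEVPATH=="/devices/platform/soc/80c000000.fsl-mc/dprc.1/dpni.%s/net/*", NAME="%s"\n' "$1" "$2"
        shift 2
    done
}
```

For example, gen_udev_rules 0 eno0d0 1 eno0d2 reproduces the first two rules listed above; redirect the output into a file under rules.d once it looks right.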

Some good news on this!

NXP has provided advice on how to create a DPL with all 10 ports while allowing packet flows to be distributed across all 8 cores. It can be done without any special firmware features, provided that certain hardware features such as QoS classification and VLAN filtering aren’t used.

The trick is to reduce the number of entries put into the hardware flow steering table. By default, 64 entries are allocated by the firmware and userspace tools (ls-addni etc.). To use all 10 ports, we have to reduce this to 56 and provide 8 queues (enough for all CPU cores).

The ls-addni command is:

ls-addni -nq=8 -t=1 -f=56 dpmac.X

Which results in this configuration in the DPL:

dpni@0 {
    num_queues = <8>;
    num_tcs = <1>;
    num_cgs = <1>;
    mac_filter_entries = <16>;
    vlan_filter_entries = <0>;
    fs_entries = <56>;
    qos_entries = <0>;
};

This seems to do the job (tested with 2x10G bridged in OpenWrt and 5 simultaneous iperf3 flows between hosts).

An updated default DPL can be downloaded at https://archive.traverse.com.au/pub/traverse/ls1088firmware/firmware-builds/branches/dpiobalance/399340667/components/eth-dpl-all.dtb

The changes to the default DPL are in this commit.

If everything works ok I’ll release this as firmware v0.8.9 in a few days.

Nice! The dtb here seems to be working well. I switched to it last night and softirq load has been relatively balanced since, with no obvious issues.

@mcbridematt did you manage to release the new firmware already?

Not yet, sorry. There are a couple of other projects I’m working on at the moment.

There is also a device tree fix for kernel 5.15 and later (fixes duplicate interrupts for gpio-keys) that will come with it as well.
I think I will be able to release it next Monday.

Firmware v0.8.9 includes the new default DPL.

Hello Matt. Just an FYI that I’ve applied the 0.8.9 firmware update and found that the passthrough script (from here) seemed to stop working. I expect it’s because of the default num queues and flow steering table. I’ll look more closely at restool and how to configure those values.

For passthrough you should keep using eth-dpl-eth0-only.dtb as a base. The default DPL is intended for using all interfaces as ethX in Linux.

The flow steering issue doesn’t affect you if you create and use fewer than 10 DPNIs at runtime, as the default allocations are still within the LS1088’s resource limits.

@mcbridematt
Looking around on the URL provided for the DPL, there is a file called "eth-dpl-ge0_4_xg0_1.dtb".
Is that for enabling the first 4 gigabit ports and both SFP+ cages?
I am mainly asking because if so, that would be a great config for my use-case.

Yes, exactly. It’s intended for a variant of Ten64 without the second group of 1G ports, so GE0->3+XG0+1 will appear as eth0-eth6.

I just installed it and wondered why my network throughput was capped, with a single CPU core hammered at full blast while all the others sat idle.

I looked at the DTS (via restool) and noticed that all interfaces have “num_queues = <1>;” in the “eth-dpl-ge0_4_xg0_1.dtb” file. Changing all of them to 8 and re-flashing fixed my issues!
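
In case it helps anyone else, the edit can be done non-interactively. A sketch that assumes the dumped DTS spells the property as num_queues = <N>; like the line quoted above; set_num_queues is a name made up for this example, and it only touches num_queues, nothing else in the DPL.

```shell
#!/bin/sh
# Rewrite every num_queues value in a DPL source file.
# set_num_queues is a hypothetical helper written for this example.
# $1 = new queue count, $2 = input DTS; the edited DTS goes to stdout.
set_num_queues() {
    sed -E "s/(num_queues[[:space:]]*=[[:space:]]*<)[0-9]+(>;)/\1$1\2/" "$2"
}
```

Something like set_num_queues 8 new-dpl.dts > new-dpl-8q.dts (file names are placeholders) would then be compiled and flashed in the same way as any other edited DPL.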

Is that a mistake in the file or purposeful?