SFP ports stop working with Linux 6.14 in Arch Linux

When upgrading my box today to the latest kernel, I found that I lost all connectivity on the SFP ports. When I rolled back to kernel version 6.13, they worked again.

On the bad kernel version, I do see these log lines:

Apr 10 19:20:05 Heimdall kernel: fsl_dpaa2_eth dpni.1 (unnamed net_device) (uninitialized): autoneg setting not compatible with PCS
Apr 10 19:20:05 Heimdall kernel: sfp dpmac2-sfp: Unable to ascertain link mode
Apr 10 19:20:05 Heimdall kernel: fsl_dpaa2_eth dpni.1 eth8: selection of interface failed, advertisement 00,00000000,00000000,00000000

When I have time, I’ll likely do a bisect to locate the exact commit if I can.

This appears to be caused by a change introduced in the 6.14 cycle that deals with in-band negotiation (generally used with copper ports and SFPs running at 1 Gbps or below):

https://patchwork.kernel.org/project/netdevbpf/cover/Z08kCwxdkU4n2V6x@shell.armlinux.org.uk/

And specifically this patch to the PCS driver used on this hardware (lynx):
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.15-rc1&id=6561f0e547be221f411fda5eddfcc5bd8bb058a5

I have just tested reverting the above commit on the Linus tree and the SFPs work again.

I’ll do some more checking next week, before formally reporting it as a regression.


Russell King has posted this patch to try:

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 1bdd5d8bb5b0..2147e2d3003a 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -3624,6 +3624,15 @@ static int phylink_sfp_config_optical(struct phylink *pl)
 	phylink_dbg(pl, "optical SFP: chosen %s interface\n",
 		    phy_modes(interface));
 
+	/* GBASE-R interfaces with the exception of KR do not have autoneg at
+	 * the PCS. As the PCS is media facing, disable the Autoneg bit in the
+	 * advertisement.
+	 */
+	if (interface == PHY_INTERFACE_MODE_5GBASER ||
+	    interface == PHY_INTERFACE_MODE_10GBASER ||
+	    interface == PHY_INTERFACE_MODE_25GBASER)
+		__clear_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, config.advertising);
+
 	if (!phylink_validate_pcs_inband_autoneg(pl, interface,
 						 config.advertising)) {
 		phylink_err(pl, "autoneg setting not compatible with PCS");

The SFPs are now working again on my system. Can you check if the above works for you as well?
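
For anyone following along, here is a rough userspace sketch of what the added hunk does: for the GBASE-R interface modes it clears the Autoneg bit from the advertisement before the PCS compatibility check runs. Everything prefixed `demo_`/`DEMO_` is a hypothetical stand-in made up for illustration (the real code uses `phy_interface_t`, a link-mode bitmap and `__clear_bit()`), so treat it as a model of the logic, not as kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a few of the kernel's interface modes. */
enum demo_interface {
	DEMO_IF_SGMII,
	DEMO_IF_1000BASEX,
	DEMO_IF_5GBASER,
	DEMO_IF_10GBASER,
	DEMO_IF_25GBASER,
};

/* Bit position chosen for this sketch only. */
#define DEMO_AUTONEG_BIT 6u

/* True for the GBASE-R modes that the patch special-cases. */
static bool demo_is_gbase_r(enum demo_interface iface)
{
	return iface == DEMO_IF_5GBASER ||
	       iface == DEMO_IF_10GBASER ||
	       iface == DEMO_IF_25GBASER;
}

/*
 * Model of the added hunk: drop Autoneg from the advertisement when the
 * chosen interface is a GBASE-R mode, and leave it untouched otherwise.
 * The advertisement is modelled as a single 64-bit mask.
 */
static void demo_fixup_advertising(enum demo_interface iface,
				   uint64_t *advertising)
{
	assert(advertising != NULL);

	if (demo_is_gbase_r(iface))
		*advertising &= ~(UINT64_C(1) << DEMO_AUTONEG_BIT);
}
```

Calling `demo_fixup_advertising(DEMO_IF_10GBASER, &adv)` on a mask with the Autoneg bit set clears only that bit, while the same call with `DEMO_IF_1000BASEX` leaves the mask unchanged. That matches the patch comment saying only the GBASE-R modes lose Autoneg in the advertisement.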

I can confirm that the patch, applied on top of commit bc3372351d0c8b2726b7d4229b878342e3e6b0e8, fixes the SFP ports.
