SFP+ working only in recovery

Hello,
I tried two different SFP+ modules:

  • Mikrotik S+RJ10, listed as compatible
  • Cisco fet-10, not listed as incompatible ( :grinning:)

They work perfectly in recovery mode, but I can’t find a way to make them work in either Linux or FreeBSD.

I recompiled recent kernels, played around with legacy and managed mode, without success.
In the meantime, I tested a DAC with no problem.
Do you have any idea what is going on?
What should/could I do next?

Regards

Your issue is likely to do with the SFP GPIOs, specifically TXDISABLE
https://ten64doc.traverse.com.au/network/sfp/#example-sfp-setup-standard-linux

The recovery firmware and our OpenWrt variants will set the GPIOs up if managed mode isn’t being used.
I don’t think it’s possible to do this under FreeBSD yet due to broken (or missing) I2C bus and device drivers.

A workaround is to use U-Boot to set the SFP pins:

# For XG0 (lower)
=> gpio clear gpio@76_1
gpio: pin gpio@76_1 (gpio 1) value is 0
# For XG1 (upper)
=> gpio clear gpio@76_5
gpio: pin gpio@76_5 (gpio 5) value is 0

These settings persist until the box is power cycled.

Hello,

Isn’t this needed only for “older kernels or non-Linux managed”?
Anyway, I tried both legacy and managed modes, and at one point (I think in managed mode) I got:

echo 369 > /sys/class/gpio/export
sys/class/gpio/export: No such file or directory

On Linux, I even tried kernel 5.19-rc1. (By the way, version 5.18 should be avoided, as there is a bug that causes the appliance to crash under network activity, even an SSH connection: 215886 – dpaa2: TSO offload on lx2160a causes fatal exception in interrupt. But this is off topic.)

On both Linux and FreeBSD, a DAC works. Is there something specific about a DAC that lets it work while other SFP+ modules do not? I truly have no idea.

‘Legacy’ SFP mode requires “manual” control of the SFP pins.
‘Managed’ mode for SFP requires the Linux phylink + sfp driver. This driver manages the pins based on SFP state:

# Linux SFP managed mode
cat /sys/kernel/debug/gpio
gpiochip4: GPIOs 368-383, parent: i2c/0-0076, 0-0076, can sleep:
 gpio-368 (                    |tx-fault            ) in  lo
 gpio-369 (                    |tx-disable          ) out lo
 gpio-370 (                    |mod-def0            ) in  lo ACTIVE LOW
 gpio-371 (                    |los                 ) in  lo
 gpio-372 (                    |tx-fault            ) in  lo
 gpio-373 (                    |tx-disable          ) out lo
 gpio-374 (                    |mod-def0            ) in  lo ACTIVE LOW
 gpio-375 (                    |los                 ) in  lo
 gpio-380 (                    |ten64:admin         ) out hi
 gpio-381 (                    |admin_led_lower     ) out lo

# Legacy / unmanaged mode
cat /sys/kernel/debug/gpio
gpiochip4: GPIOs 368-383, parent: i2c/0-0076, 0-0076, can sleep:
 gpio-369 (                    |sysfs               ) out lo
 gpio-373 (                    |sysfs               ) out lo
 gpio-376 (                    |sysfs               ) out lo
 gpio-377 (                    |sysfs               ) out lo
 gpio-378 (                    |sysfs               ) out lo
 gpio-379 (                    |sysfs               ) out lo
 gpio-381 (                    |admin_led_lower     ) out lo
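
To check only the TXDISABLE pins instead of reading the whole dump, the listing can be filtered. This is a minimal sketch that assumes the debugfs line layout shown above; `sfp_tx_disable_lines` is a hypothetical helper name, and it only matches managed mode, where the pins carry the `tx-disable` label. A level of `lo` means TXDISABLE is not asserted, which is the same state the U-Boot `gpio clear` workaround produces.

```shell
# Hypothetical helper: print GPIO name, direction and level for every line
# labelled tx-disable in a /sys/kernel/debug/gpio style dump read from stdin.
sfp_tx_disable_lines() {
    awk '$3 == "|tx-disable" { print $1, $5, $6 }'
}

# On the device (needs root and a mounted debugfs):
#   sfp_tx_disable_lines < /sys/kernel/debug/gpio
```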

The sysfs GPIO interface has been deprecated in recent kernel versions; you can set `CONFIG_GPIO_SYSFS=y` to re-enable it.
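
With the sysfs interface available, legacy mode comes down to exporting each TXDISABLE pin and driving it low. The following is a sketch under assumptions, not the documented procedure: the numbers 369 and 373 are taken from the listing above (base 368 plus offsets 1 and 5, which appear to match the U-Boot pins `gpio@76_1` and `gpio@76_5`) and may differ on other kernels. `sfp_tx_enable` and the `GPIO_ROOT` override are hypothetical names introduced for illustration.

```shell
# Sketch only: drive a TXDISABLE pin low through the legacy sysfs GPIO interface.
# GPIO_ROOT is a hypothetical override so the function can be dry-run elsewhere.
GPIO_ROOT="${GPIO_ROOT:-/sys/class/gpio}"

sfp_tx_enable() {
    pin="$1"
    # Export the pin unless it is already exported.
    if [ ! -d "$GPIO_ROOT/gpio$pin" ]; then
        echo "$pin" > "$GPIO_ROOT/export" || return 1
    fi
    echo out > "$GPIO_ROOT/gpio$pin/direction" || return 1
    echo 0 > "$GPIO_ROOT/gpio$pin/value"
}

# On the device, as root, with the numbers from the listing above:
#   sfp_tx_enable 369   # offset 1 on the 0x76 expander
#   sfp_tx_enable 373   # offset 5 on the 0x76 expander
```

If `/sys/class/gpio/export` does not exist, the function returns non-zero at the export step, which points to a kernel built without `CONFIG_GPIO_SYSFS`.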

I’m a bit behind on testing new kernel releases at the moment, hopefully I can look at 5.19 in the next few days.

Passive DAC cables don’t have any electronics in them (apart from the EEPROM), so the state of the control pins (like TXDISABLE) has no effect.

Hello @mcbridematt,
You were right, because the workaround (using u-boot to set the SFP pins) worked.
So, a big thank you :grinning:.

I should say that it worked every time with the Cisco SFP,
but just once with the MikroTik one. Do you have an idea why?

But something is not clear to me; could you please explain?
From the documentation, I thought there were two choices:

  1. using legacy mode AND setting the GPIO direction and value.
  2. using managed mode and
    a) using kernel patches for kernel version < 5.16
    b) nothing special when kernel version > 5.16

As I am using a 5.19 kernel, I thought it would work flawlessly without further action (as long as the marvell_phy_** kernel options are built as modules).
But it seems that I still have to configure the GPIOs, so I am kind of lost.

Could you please tell me where I am wrong, or what I am assuming incorrectly?

I did another test.
eth9 (using the MikroTik S+RJ10) is configured to get an IP address from DHCP.
When another interface is also configured to use DHCP, eth9 gets an IP address.
Whenever eth9 is the only interface configured, it does not get an IP address.
I do not know what to make of this.

Any hints?

The Linux SFP system has some dependencies:

  1. I2C to be able to read the module EEPROM and sensors (does ethtool -m <iface> work?)
  2. In the Ten64 case, i2c_mux_pca954x (both SFP I2C ports are connected through this) and gpio-pca953x (SFP control GPIOs connected through this IC)

Is it possible you don’t have all of these in your kernel?
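
One way to answer that question is to check the kernel config for the relevant options. This is a rough sketch: the symbol names below (`CONFIG_I2C_MUX_PCA954x`, `CONFIG_GPIO_PCA953X`, `CONFIG_PHYLINK`, `CONFIG_SFP`) are my assumption of the Kconfig options behind the drivers listed above, and `check_sfp_kconfig` is a hypothetical helper name.

```shell
# Sketch: report which of the assumed Kconfig symbols are missing from a
# kernel config read on stdin (built-in "y" and module "m" both count).
check_sfp_kconfig() {
    cfg=$(cat)
    missing=0
    for sym in CONFIG_I2C_MUX_PCA954x CONFIG_GPIO_PCA953X CONFIG_PHYLINK CONFIG_SFP; do
        if printf '%s\n' "$cfg" | grep -Eq "^${sym}=(y|m)$"; then
            echo "ok      $sym"
        else
            echo "missing $sym"
            missing=1
        fi
    done
    return "$missing"
}

# On the device (only if the kernel was built with CONFIG_IKCONFIG_PROC):
#   zcat /proc/config.gz | check_sfp_kconfig
```

The function returns non-zero if any symbol is missing, so it can also be used as a gate in a build script.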
I have just updated our kernel configuration and patchset for 5.19-rc1, so these should work:

The S+RJ can be very ‘temperamental’ at times; I think asserting TXDISABLE causes the module to reset.

To add more info, my 10Gtek SFP+ 10GBASE-T module is working in the FreeBSD appstore image OOTB.

Nikos

I think you were right about the GPIOs, because it works now with the kernel provided in your repository.
Unfortunately, it is a little bit picky: it does not get an IP address every time, and when it fails I have to reboot the Ten64.
I can live with that for now, and wait to see if it improves with time.

Thank you for your help @mcbridematt

Hi,
Do you know what version your S+RJ10 is?

You can see it with ethtool -m eth8/eth9 (must be in default/managed mode):

ethtool -m eth8
        Identifier                                : 0x03 (SFP)
        Extended identifier                       : 0x04 (GBIC/SFP defined by 2-wire interface ID)
        Connector                                 : 0x22 (RJ45)
...
        Vendor name                               : MikroTik
        Vendor OUI                                : 00:40:20
        Vendor PN                                 : S+RJ10
        Vendor rev                                : 2.16
..
        Date code                                 : 200405

I have a couple of versions here, 2.16 seems to work OK but 2.07 seems to have issues. The older 1.0 is ok but doesn’t have temperature monitoring or TX_DISABLE support.
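
To pull out just the revision for comparison, the `ethtool -m` output can be filtered. This is a minimal sketch assuming the output layout shown above; `sfp_vendor_rev` is a hypothetical helper name.

```shell
# Hypothetical helper: print the "Vendor rev" value from `ethtool -m` output on stdin.
sfp_vendor_rev() {
    sed -n 's/^[[:space:]]*Vendor rev[[:space:]]*:[[:space:]]*//p'
}

# On the device, in default/managed mode:
#   ethtool -m eth8 | sfp_vendor_rev
```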

I generally find ‘active’ SFPs (10GBase-T, GPON, etc) work better in the bottom (XG0/eth8) slot as well because the PCB under it works as a heatsink.

Hello,
I hadn’t seen your message, sorry.
I just checked, mine is 2.16.
The problem occurs about one time in ten, I would say.

@nikos does this mean we do have working 10G under FreeBSD then? Reading the last comment in panic under heavy network load · Issue #19 · mcusim/freebsd-src · GitHub suggests otherwise. Is this down to a specific SFP?

@dch Have you tried the method at the bottom of this post: FreeBSD 14.0-RELEASE on Ten64 ?