Hi, I’m struggling with strange behavior from the emc2301 fan controller.
Enabling PWM with “fan enable_pwm emc2301@2f 0 1” at the U-Boot console immediately sets the fan speed to 960 rpm, no matter what speed I set with the “fan set_speed” command.
I was also able to confirm the same behavior with the emc2301 Linux kernel module. Loading the module immediately sets the fan speed to 960 rpm (the fan noise actually drops to near zero). “/sys/class/hwmon/hwmon3/fan1_input” stays at 960 no matter what “fan1_target” is (whether it is set automatically from the temperature sensor or manually).
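For anyone who wants to repeat this check, here is a minimal sketch of it in Python. It assumes only the standard hwmon attribute names mentioned above (fan1_input, fan1_target); the helper names, the target values and the settle time are made up for illustration, and the hwmon directory will differ from system to system.

```python
import time
from pathlib import Path


def sweep_targets(hwmon_dir, targets, settle_s=5.0):
    """Write each target RPM to fan1_target, wait, then read fan1_input back.

    Returns a list of (target_rpm, reported_rpm) pairs.
    hwmon_dir is e.g. /sys/class/hwmon/hwmon3 (the index varies per system).
    """
    base = Path(hwmon_dir)
    results = []
    for target in targets:
        (base / "fan1_target").write_text(f"{int(target)}\n")
        time.sleep(settle_s)  # give the controller time to react
        reported = int((base / "fan1_input").read_text().strip())
        results.append((target, reported))
    return results


def looks_stuck(results):
    """True if every reading is identical regardless of the target."""
    return len({reported for _, reported in results}) == 1


# Example (needs root and a loaded emc2301 driver):
#   results = sweep_targets("/sys/class/hwmon/hwmon3", [2000, 4000, 6000])
#   print(results, "stuck" if looks_stuck(results) else "responding")
```

If every pair comes back with the same reading (960 in my case), the controller is ignoring the target.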
I used v0.8.10 firmware and Linux 5.19.1 and 5.19.2.
Are there any known issues or past reports of a misbehaving fan controller?
I’ve damaged a fan controller myself by ‘hotplugging’ between different fans, and the symptoms were similar to what you describe. I haven’t heard of any failing in the field yet.
960 rpm basically means ‘0’: it’s the lowest value that the driver will read from the device.
(Due to limitations on using floating-point math in the kernel, the emc2301 driver uses a less exact conversion)
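To show where a floor of 960 can come from, here is a rough sketch of the tachometer conversion. It is not the driver's code. It assumes the simplified formula from the EMC2301 datasheet as I understand it (RPM = 3,932,160 × multiplier / count for a 2-pole fan with 5 edges sampled), a range multiplier of 2, and a 13-bit tach count; the integer division only stands in for the "less exact" kernel arithmetic.

```python
# Assumed constants, taken from my reading of the EMC2301 datasheet:
TACH_CONSTANT = 3_932_160   # 2-pole fan, 5 edges, 32.768 kHz tach clock
MAX_TACH_COUNT = 0x1FFF     # 13-bit counter saturates at 8191 (stalled fan)


def rpm_from_tach(count, multiplier=2):
    """Convert a raw tach count to RPM using integer math only."""
    if not 0 < count <= MAX_TACH_COUNT:
        raise ValueError("tach count out of range")
    return TACH_CONSTANT * multiplier // count


# A saturated counter (no tach pulses seen) gives the lowest possible reading.
print(rpm_from_tach(MAX_TACH_COUNT))  # 960
```

Under these assumptions, a reading of 960 means the counter has saturated, i.e. the controller is not seeing tach pulses at all.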
Has your Ten64 always had this issue, or did it fail at random?
I think it worked okay initially (on the first day I worked with my unit) and then suddenly began to fail. Currently the symptom is constant (it always fails).
More findings:
Even though fan1_input reports 960, enabling the emc2301 in FSC (fan speed control) mode doesn’t stop the fan completely. Although it’s very quiet, it cools well enough to bring the thermal zones down to 67 degrees Celsius. (I applied the 1.2 GHz cap and the 0.9 V fix.)
Setting the emc2301 to DS (direct setting) mode and writing 255 to the fan drive setting register makes the fan spin at full speed. Writing anything less than 255 to the register stops the fan completely.
So what I have is “moderate”, “full” and “none” modes of control.
I slightly modified the emc2301 kernel module to allow DS (direct setting) mode and wrote a script to control the cooling from userspace. It works reasonably well for me.
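The decision logic of such a script could look roughly like the sketch below. This is not my actual script: the temperature thresholds and the hysteresis value are hypothetical, and how a mode gets applied (FSC for “moderate”, DS with 255 for “full”, DS with anything lower for “none”) is left out because it depends on the modified module.

```python
def choose_mode(temp_c, current, on_c=70.0, full_c=85.0, hysteresis_c=5.0):
    """Pick one of the three available modes from the zone temperature.

    All thresholds are hypothetical. Hysteresis keeps the fan from
    flapping between modes when the temperature hovers near a threshold.
    """
    if temp_c >= full_c:
        return "full"
    if current == "full" and temp_c > full_c - hysteresis_c:
        return "full"      # stay at full until clearly below the threshold
    if temp_c >= on_c:
        return "moderate"
    if current != "none" and temp_c > on_c - hysteresis_c:
        return "moderate"  # keep spinning until clearly below on_c
    return "none"


# Walk through a warm-up and cool-down with made-up temperatures.
mode = "none"
for temp in (60, 72, 86, 82, 78, 67, 64):
    mode = choose_mode(temp, mode)
    print(temp, mode)
```

A real loop would read the temperature from the thermal zone once every few seconds and only touch the controller when the mode changes.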
The new driver doesn’t have FDT support (upstream refuses to add new fan-PWM DT bindings until someone (probably me) comes up with a ‘standard binding’) so we can’t adopt it as is.
(Lack of accepted device tree binding is also hindering my efforts to update and upstream U-Boot)
I would prefer to use RPM to define operating ranges rather than duty cycle % but could compromise if upstream doesn’t agree.
My unit is in “active duty” state so I cannot test the module from NVIDIA right now. Anyway, thank you for the pointer. If the situation allows me to test it I’ll report the result.
I’m seeing patches being mainlined one by one as I manually compile the kernel on each upstream release. I wish you luck for the rest of the journey.