For a while now, the PiMox project has maintained an ARM64 port of Proxmox. I know a few users have been using our OpenWrt builds as guest VMs under it.
With a bit of tweaking, I have managed to get PiMox to install reliably on a Ten64!
See my fork repo for more details.
This is an installer script for Proxmox/PiMox (a Proxmox port for ARM64 machines) that will work on the Ten64.
It’s based on the original pimox7 installer script, with tweaks to account for differences between the Raspberry Pi and the Ten64 (for example, configuring GRUB to supply the kernel command line instead of the Raspberry Pi’s cmdline.txt).
Just after I finished testing the pimox7/PVE 7.2 port, jiangcuo published a set of packages for PVE 7.3. Thankfully, this installer script works with their repository as well.
If you want to use the pimox7 package set, just edit the repository used in the install script.
Once your fresh Debian install is up and running, copy the IA-Install.sh script to your Ten64.
Run IA-Install.sh as the default (debian) user, from the serial console (you will lose network connectivity during the install process!):
debian@debian:~$ sudo -i # elevate to root; needs -i to get the root user’s profile
root@debian:/home/debian# bash IA-Install.sh
Wait for the install to finish. It will reboot automatically.
Note: About 1GB of packages will be downloaded during the install process.
Notes
eth0 will be set up as the uplink for the default bridge interface, vmbr0.
You can change this by editing /etc/network/interfaces, or using the Network configuration UI in Proxmox.
When creating a VM you will need to note the following:
You need to select UEFI as the BIOS for VMs. You will be prompted to assign a disk for the UEFI variables.
On Proxmox/PiMox 7.2, the VM creation wizard defaults to IDE for the CD-ROM. This will not work.
In the VM creation wizard, leave the CD-ROM unconnected.
When the VM has been created, delete the ide2 CD-ROM device and create a new one using SCSI.
In the boot order options, move the new SCSI CD-ROM device after the VM disk and PXE/network (see the sketch below).
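For reference, a rough CLI equivalent of those steps using the qm tool (the VM ID 100 and ISO path here are placeholders; adjust them to your setup):
qm set 100 --delete ide2                               # remove the default IDE CD-ROM
qm set 100 --scsi1 local:iso/openwrt.iso,media=cdrom   # re-add it as a SCSI CD-ROM
qm set 100 --boot order='scsi0;net0;scsi1'             # disk, then network, then CD-ROM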
Troubleshooting
Sometimes the bridge interface (vmbr0) does not start automatically on boot.
You might need to log in on the serial console and bring it up manually.
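For example (as root, assuming the ifupdown2 setup the installer leaves behind):
ifup vmbr0
# if ifup itself misbehaves, check the link and address state directly:
ip link set vmbr0 up
ip addr show vmbr0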
Many thanks to the Pimox community for bringing Proxmox to Arm64!
μVirt vs Proxmox/Pimox: which is better?
Proxmox is significantly more fully featured than μVirt (for example: pause/resume of VMs, LXC, more storage options, backup functionality), whereas μVirt was designed for low-overhead virtualization in ‘converged’ networking and compute environments. Its smaller footprint also allows us to experiment with features like DPAA2 passthrough, a self-contained k3s setup and more.
We are still developing μVirt, and there are a few features we intend to add in the coming months (such as our 4G/5G management stack, configuring guest workloads like AWS Greengrass, and potential hardware network acceleration).
I wanted to try this out, but the install script failed…
Reading package lists… Done
Building dependency tree… Done
Reading state information… Done
E: Unable to locate package proxmox-ve
Reading package lists… Done
Building dependency tree… Done
Reading state information… Done
E: Unable to locate package pve-manager
Reading package lists… Done
Building dependency tree… Done
Reading state information… Done
E: Unable to locate package proxmox-ve
Could be due to this error that came earlier:
Err:9 APQA网盘 - /foxi/Virtualization/proxmox/foxi/ bullseye Release
404 Not Found [IP: 162.14.99.182 443]
Reading package lists… Done
E: The repository ‘APQA网盘 - /foxi/Virtualization/proxmox/foxi/ bullseye Release’ does not have a Release file.
N: Updating from such a repository can’t be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
I saw on the proxmox site that Proxmox VE 7.4 is out. Is that maybe the reason?
I’ve been able to load the latest Proxmox 8.1 using the Debian bookworm image from the recovery store, together with jiangcuo’s bookworm repository and instructions.
Performance was great and I was having fun until:
An update has caused untagged traffic from VMs to be discarded on the physical Ethernet ports when using VirtIO networking on a guest VM.
Traffic on tagged VLANs has no issues.
Changing the network card emulation (I checked the other three) resolves the issue, but it comes with a severe performance hit.
Guest VMs can talk to the host and to other guest VMs, but traffic fails as soon as it leaves the box.
Unfortunately it’s on the 1G and 10G ports.
I was able to use a USB 3.0 2.5G Realtek dongle, and that device has no issues with untagged traffic leaving the box.
Hence I suspect the switch; are there any updates or known bugs? Unless it’s in the kernel, or a mixture.
Minor concerns:
What’s the best way to control the fan? It feels like it’s at max; I’m not sure if that’s just summer here in Australia or the lack of a feedback controller.
Any patch to enable the SFP+ activity lights for Debian Bookworm?
Interesting, I’ll take a look at that sometime. My hunch is that it’s something to do with IP checksum offloads.
Possibly, there was another user that had issues with checksums.
Another device on their network was appending garbage to the end of the Ethernet frames, and the DPAA2 hardware was including the garbage data in the checksum calculation (when the frame left the Ten64). It could be a similar problem.
I haven’t followed up that issue yet, but there are other ways it could be resolved without fixes upstream.
You will need to compile the emc2301 module from here: traversetech / ls1088firmware / traverse-sensors · GitLab
Make sure to add it to /etc/modules so it loads at boot.
The fan speed should go down immediately when it’s loaded, and it will automatically increase with CPU temperature.
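A sketch of that setup once the module is built and installed (module name taken from the repository above):
modprobe emc2301               # load now; the fan should slow down immediately
echo emc2301 >> /etc/modules   # load automatically at every boot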
You can do the same with ten64:green:sfp2:up and eth9; see the sketch below.
Unfortunately this has to be done on each boot: because the DPAA2 architecture has split MACs and PHYs, it isn’t possible to use the device tree bindings (I hope to fix that someday).
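A per-boot sketch for one LED, assuming the standard ledtrig-netdev sysfs interface (the LED and interface names come from this thread):
LED=/sys/class/leds/ten64:green:sfp2:up
echo netdev > $LED/trigger      # use the network device LED trigger
echo eth9 > $LED/device_name    # tie this LED to eth9
echo 1 > $LED/link              # LED on while the link is up
echo 1 > $LED/rx                # blink on receive
echo 1 > $LED/tx                # blink on transmit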
I can mostly work around the issues with my managed switch, but the WAN / ISP connection is not so simple; I could put a managed switch in to change the VLAN. I’m also finding that USB Ethernet adaptors have their own issues.
I set the WAN port PVID to a different VLAN and attached it to the OpenWrt WAN port as a tagged VLAN.
Unfortunately I’m unable to receive the STP packets, but it is workable.
I am having difficulties passing the mt7916 WiFi card through to the OpenWrt VM.
I was able to do this when using μVirt.
I am receiving an error when starting the VM:
kernel: vfio-pci: probe of 0001:03:00.0 failed with error -22.
Searching the web, that error means either the IOMMU is not enabled or the device is linked to others in an IOMMU group.
Judging by my dmesg, it appears the device is sharing an IOMMU group:
[ 6.528487] pcieport 0001:00:00.0: Adding to iommu group 2 - Freescale Semiconductor Inc Device 80c0
[ 6.545944] pcieport 0001:01:00.0: Adding to iommu group 2 - Pericom Semiconductor Device b304 (rev 01)
[ 6.552493] pcieport 0001:02:01.0: Adding to iommu group 2 - Pericom Semiconductor Device b304 (rev 01)
[ 6.577776] pcieport 0001:02:02.0: Adding to iommu group 2 - Pericom Semiconductor Device b304 (rev 01)
[ 18.411845] mt7915e 0001:03:00.0: Adding to iommu group 2
[29246.757966] pci 0001:03:00.0: Removing from iommu group 2
[29257.901759] vfio-pci 0001:03:00.0: Adding to iommu group 2
I’ve tried following this guide, but it’s written for Intel and AMD and doesn’t appear to have any impact. I suspect the CPU and/or kernel doesn’t have ACS.
Are there any kernel parameters I need to enable to help separate the devices, or should I try using the other PCIe port?
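For reference, the grouping can be confirmed with a generic sysfs walk (nothing Ten64-specific assumed):
for g in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${g##*/}:"
    ls "$g/devices"
done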
I can reproduce the issue, so that’s a start.
It seems Proxmox doesn’t bridge the VM directly to the Linux bridge, but through a veth pair. So my guess is the problem is in there somewhere.
Yes, both miniPCIe slots live behind a PCIe switch, so if one is passed through, the whole group will be.
Do you have a card in the other miniPCIe slot? If so, that also needs to be passed through.
The switch device itself will be handled automatically.
The only other tip I have is that disabling PCIe deep sleep states can help; they don’t seem to work well on the LS1088 SoC.
To do this, add vfio-pci.disable_idle_d3=1 to the kernel command line. Adding it to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and running update-grub will apply it on the next boot.
I’ve tried a few different things with varying success.
I used the Traverse Debian 6.2 kernel per the documentation and found network connectivity improved. I have yet to try untagged/native traffic, but STP is working again.
For device passthrough I haven’t had any success; I tried installing a different MediaTek device in the other slot and still get the VFIO bind error.
I found a kernel patch that says:
The current smmu-v3 driver does not enable PCI ATS for physical functions
of ATS capable End Points when booted in smmu bypass mode
(iommu.passthrough=1)
Could this be the blocking point for passthrough?
I haven’t gone looking for the actual “source” of the problem yet; I think something at the start of the chain (like tap, vhost-net or the bridge) is resizing the buffer that holds the packet and causing it to be misaligned.
Interesting, thanks for the link.
I haven’t done VFIO in a while so I need to setup a test with a recent kernel version and check from there.
The network issue is a recent kernel regression. Given how recent it was, I didn’t expect it to be something so simple breaking. Other VM hosts like muvirt are also affected.
A patch to the DPAA2 Ethernet driver, accepted in early December, added a requirement for outbound packet buffers to be aligned.
This was supposed to fix an issue with jumbo frames, but it broke vhost-net as well.
The updated build has a revert of the tx-align change, so virtual machine networking should work correctly. The same change will be rolled out to all Traverse builds (kernel, OpenWrt, muvirt) until a better solution is found.
For vfio-pci, I think there may be an issue on recent kernels that exposes a known problem with the Layerscape PCIe controller and the Pericom PCIe switch.
If you see a kernel panic in pci_generic_config_read, it’s that problem (see this openSUSE bug report for the rundown). I think I will have to file a kernel bug to get attention on it.
The following passthrough configuration seems to work with the two cards I’ve tested (MT7916 and Intel AX210):
6.1.x kernel (need to bisect to see when this problem got worse)
vfio-pci.disable_idle_d3=1 added to the kernel command line
$ grep GRUB_CMDLINE_LINUX_DEFAULT /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="earlycon arm-smmu.disable_bypass=0 net.ifnames=0 vfio-pci.disable_idle_d3=1"
(Make sure you run update-grub after modifying /etc/default/grub)
In the Proxmox PCI device options, ensure “All Functions” is ticked
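For reference, I believe the resulting entry in /etc/pve/qemu-server/<vmid>.conf looks something like the line below (a sketch; leaving off the .0 function suffix is what “All Functions” corresponds to):
hostpci0: 0001:03:00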
Unfortunately, I’ve just got a big cloud backup going via my Ten64, but as soon as that finishes I’ll restart into the new kernel. The one night I’m unable to restart, a potential fix is released.
Thank you for investigating this issue. I will let you know the results.
Feb 07 07:08:01 traverse kernel: vfio-pci: probe of 0001:03:00.0 failed with error -22
Feb 07 07:08:01 traverse pvedaemon[8611]: Cannot bind 0001:03:00.0 to vfio
Feb 07 07:08:01 traverse pvedaemon[2927]: <root@pam> end task UPID:traverse:000021A3:000241B8:65C32C51:qmstart:103:root@pam: Cannot bind 0001:03:00.0 to vfio
I’ve pushed a bunch of changes to the original installer script for the new repo and kernel, as well as to install ifupdown2:
The method jiangcuo provides for installation on stock Debian also works:
The main issue with installation is that, depending on how you installed Debian, you won’t have ifupdown2; you need to remove whatever network manager is configured (be it NetworkManager, netplan or something else) and ensure /etc/network/interfaces is set up properly.
Even after that, ifupdown might still have problems (e.g. “ifup vmbr0” failing). I haven’t had an opportunity to look into why it’s not functioning 100% of the time; a reference config is below.
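For reference, a minimal /etc/network/interfaces sketch for the default layout described earlier (eth0 as the uplink of vmbr0; the static address and gateway are placeholders):
auto lo
iface lo inet loopback

iface eth0 inet manual

auto vmbr0
iface vmbr0 inet static
    address 192.168.1.2/24
    gateway 192.168.1.1
    bridge-ports eth0
    bridge-stp off
    bridge-fd 0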
# If this symlink/folder (driver) exists, then the card is being controlled
# by another driver, and you need to unbind it from there first
readlink -f /sys/bus/pci/devices/0001\:03\:00.0/driver
# For example, if mt7915e is holding the card
echo 0001\:03\:00.0 > /sys/bus/pci/drivers/mt7915e/unbind
# or
echo 0001\:03\:00.0 > /sys/bus/pci/devices/0001\:03\:00.0/driver/unbind
# Bind the card to vfio-pci
echo "vfio-pci" > /sys/bus/pci/devices/0001\:03\:00.0/driver_override
echo 0001\:03\:00.0 > /sys/bus/pci/drivers/vfio-pci/bind
If the last step fails, the last couple of lines in dmesg should say why vfio-pci can’t bind to the card.
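To confirm the bind took effect, the driver symlink should now resolve to vfio-pci:
readlink -f /sys/bus/pci/devices/0001\:03\:00.0/driver
# expected to end in .../drivers/vfio-pci; if not, check dmesg | tail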