I'm wondering if anyone has data to support me. I'm setting up a batch of mini PCs featuring the RTL 8125 2.5 Gb/s card, and out of the box it appears to work well with the stock Ubuntu r8169 driver.
However, I was only able to test it with a 1000 Mb/s switch, and my research suggests there are stability concerns when the negotiated speed is actually 2.5 Gb/s.
Does anyone have experience using the RTL 8125 on a 2.5 Gb/s switch via r8169 (as shipped with Ubuntu 22), and real-life examples of issues that may happen?
For about 18 months I have been running a cluster of Rockchip RK3588 based systems with dual Realtek 8125 ports. The systems have been running Ubuntu 22.04, 24.04 and 24.10 and kernels 5.10, 6.1, 6.8, 6.9, 6.11 and now 6.12, some with the mainline r8169 driver, some with the Realtek r8125 driver and some with my rewrite of the Realtek r8125 driver. Some systems are connected to a business grade 10Gb switch (HPE), some to a consumer grade 2.5Gb switch (TPlink TL-SG105-M2) and some are directly connected using short cables. All cables are quality CAT8 (they achieve the full 10Gb on 10Gb NICs).
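With several drivers and switches in play, it helps to confirm which driver a port is actually bound to and what speed it negotiated. A minimal sketch that reads this from sysfs; the function name `link_info` and the `sysfs_root` parameter are my own, purely for illustration:

```python
import os

def link_info(iface, sysfs_root="/sys/class/net"):
    """Return (driver_name, speed_mbps) for a network interface, read from sysfs.

    Either value is None when it cannot be determined.
    """
    base = os.path.join(sysfs_root, iface)

    # device/driver is a symlink to the bound kernel driver directory,
    # e.g. .../drivers/r8169 or .../drivers/r8125.
    driver_link = os.path.join(base, "device", "driver")
    driver = None
    if os.path.islink(driver_link):
        driver = os.path.basename(os.path.realpath(driver_link))

    # speed holds the negotiated link speed in Mb/s; reading it can fail,
    # or return a non-positive value, while the link is down.
    speed = None
    try:
        with open(os.path.join(base, "speed")) as f:
            value = int(f.read().strip())
        if value > 0:
            speed = value
    except (OSError, ValueError):
        pass

    return driver, speed
```

On a port that negotiated 2.5 Gb/s with the mainline driver, `link_info("eth0")` would be expected to return something like `("r8169", 2500)`; the interface name is an assumption and will differ per system.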
In general, both drivers have been performing very well:
With the TPlink switch, receive performance (measured with the iperf3 -R option) degrades over time due to transmit collisions, to about 1.50 Gb/s after ~3 months. A power cycle is required to restore 2+ Gb/s performance, but that requires pulling the plug (there is no reset button), which for quite a few owners has caused the switch to die!
Functionality-wise, the Realtek r8125 driver offers MSI/MSI-X messaging support, 4 RSS queues and 2 transmit queues, as well as PTP support (although I was never able to get that working correctly), over and above the mainline r8169 driver.
Stability-wise, there have been a few bugs, but under "regular" heavy load (e.g. TCP, docker, kubernetes, NFS, Samba, cluster FS etc.) it has been rock solid:
Some systems have been running 3 months 24/7 without a single failure.
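The slow receive degradation on the TPlink switch mentioned above is easy to miss unless it is measured regularly. A rough sketch of how that could be tracked with iperf3's JSON output; the function names and the 2.0 Gb/s threshold are my own choices, and I am assuming the TCP report layout (`end` / `sum_received` / `bits_per_second`) of current iperf3 releases:

```python
import json
import subprocess

def run_reverse_test(server, seconds=10):
    """Run 'iperf3 -c <server> -R' (server sends, we receive) and return the parsed JSON report."""
    result = subprocess.run(
        ["iperf3", "-c", server, "-R", "-t", str(seconds), "--json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

def receive_gbps(report):
    """Extract the receiver-side TCP throughput in Gb/s from an iperf3 JSON report."""
    return report["end"]["sum_received"]["bits_per_second"] / 1e9

def is_degraded(gbps, threshold_gbps=2.0):
    """True when throughput has dropped below the (assumed) healthy threshold."""
    return gbps < threshold_gbps
```

Run from cron against a fixed iperf3 server, `is_degraded(receive_gbps(run_reverse_test("<ip>")))` would flag the point where the switch needs its power cycle.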
I ended up rewriting the Realtek r8125 driver. This fixed some issues (the 8 core RK3588 stresses the 6 TX/RX queues and ARM64 cache lines, and some memory barriers were missing or not placed correctly), but most importantly it reduced system CPU consumption, improved throughput by 20% when all 4 RSS and 2 TX queues are 100% loaded @2.5Gb, and cut code size by 50%. I am currently trying to fix PTP; I am making progress, but it is not fully working yet (trial and error, as Realtek does not provide HW documentation).
Edit: the TPlink switch itself is not the only thing to take into account. The parameters of iperf3 can also significantly change the results. For example, iperf3 -u -c <ip> yields an abysmal 1Mb/s UDP result (iperf3 limits UDP to 1 Mbit/s unless told otherwise). iperf3 -u -c <ip> -b 0, which allows for unlimited bandwidth, yields a more respectable 1.62Gb/s, but iperf3 -u -c <ip> -b 10GB strangely enough performs at the full 2.5Gb/s (also on the TPlink switch).
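To compare the three UDP variants above side by side, something along these lines could be used. It is only a sketch: the helper names are mine, and I am assuming the UDP JSON report exposes `bits_per_second` and `lost_percent` under `end` / `sum`, as recent iperf3 versions do.

```python
import json
import subprocess

def udp_command(server, bitrate=None, seconds=10):
    """Build the iperf3 UDP client command; bitrate=None leaves iperf3's default in place."""
    cmd = ["iperf3", "-u", "-c", server, "-t", str(seconds), "--json"]
    if bitrate is not None:
        cmd += ["-b", bitrate]
    return cmd

def udp_summary(report):
    """Pull throughput (Gb/s) and packet loss (%) out of an iperf3 UDP JSON report."""
    total = report["end"]["sum"]
    return {
        "gbps": total["bits_per_second"] / 1e9,
        "lost_percent": total["lost_percent"],
    }

def sweep(server, bitrates=(None, "0", "10GB")):
    """Run each variant in turn and return {bitrate: summary}."""
    results = {}
    for bitrate in bitrates:
        out = subprocess.run(udp_command(server, bitrate),
                             capture_output=True, text=True, check=True).stdout
        results[bitrate] = udp_summary(json.loads(out))
    return results
```

With UDP it is worth reading the loss percentage alongside the bitrate, since a high rate with heavy loss says little about what actually arrived at the other end.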