The fifth-generation Raspberry Pi is the first to include a PCI Express interface, handling not only the I/O for things like USB, Ethernet, camera, display, and GPIO, but also exposing a single lane on an external PCIe connector.
The Raspberry Pi 5 defaults to running all PCIe lanes at Gen 2 speeds (5 gigatransfers/sec), as this is the speed the board was certified for. The internal lanes are fixed at Gen 2, but the external connector can be configured to run at Gen 3 speeds (8 GT/sec).
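The jump in usable bandwidth is larger than the raw transfer rates alone suggest, because Gen 2 uses 8b/10b encoding while Gen 3 uses the more efficient 128b/130b encoding. A back-of-envelope sketch of the theoretical per-lane throughput (encoding figures from the PCIe spec; the awk one-liner is purely illustrative):

```shell
# theoretical x1-lane bandwidth: transfer rate x encoding efficiency / 8 bits per byte
awk 'BEGIN {
  gen2 = 5.0 * (8/10)    / 8;   # Gen 2: 5 GT/s with 8b/10b encoding   -> GB/s
  gen3 = 8.0 * (128/130) / 8;   # Gen 3: 8 GT/s with 128b/130b encoding -> GB/s
  printf "Gen 2 x1: %.2f GB/s\nGen 3 x1: %.2f GB/s\n", gen2, gen3
}'
```

So a single Gen 3 lane offers roughly double the usable bandwidth of a Gen 2 lane, before protocol overhead.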
Performance Considerations
So why would you want to run Gen 3 vs Gen 2? Speed, naturally. But what kind of real-world performance differences can we expect forcing the Pi to run Gen 3 instead of the default Gen 2?
To measure this, we’ll use this testbed:
- Raspberry Pi 5 - 8GB RAM
- Samsung 980 PRO 1TB NVMe SSD
Baseline - PCIe Gen 2
To start, we’ll get baseline performance for a stock Raspberry Pi 5 configuration.
List the PCI buses and the devices connected to them:
$ lspci
0000:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries BCM2712 PCIe Bridge (rev 21)
0000:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
0001:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries BCM2712 PCIe Bridge (rev 21)
0001:01:00.0 Ethernet controller: Raspberry Pi Ltd RP1 PCIe 2.0 South Bridge
We’ll focus on bus 0000, as we can see device 0000:01:00.0 is the NVMe SSD Controller. Inspect the PCI Express configuration for 0000:01:00.0:
$ sudo lspci -s 0000:01:00.0 -vvv | grep -Ew "LnkSta|LnkCap"
LnkCap: Port #0, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
LnkSta: Speed 5GT/s (downgraded), Width x1 (downgraded)
With this, we can see that our NVMe drive supports up to 16 GT/s across four PCIe lanes; however, the negotiated link was downgraded to 5 GT/s on a single lane, which matches the default PCIe Gen 2 configuration mentioned above.
To obtain a baseline performance mark for Gen 2 speeds, we time how long it takes to write a 5GB file to disk using:
$ dd if=/dev/zero of=./Testingfile bs=100M count=50 oflag=direct
50+0 records in
50+0 records out
5242880000 bytes (5.2 GB, 4.9 GiB) copied, 13.31 s, 394 MB/s
Which, in this example, takes 13.31 seconds at a rate of 394 MB/second. Not bad, but we can do better.
Enable PCIe Gen 3
As root, edit /boot/firmware/config.txt and append these two lines to the end:
dtparam=pciex1
dtparam=pciex1_gen=3
Save and reboot the system.
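If you script your Pi setup, the same edit can be made idempotently. A minimal sketch, assuming the stock /boot/firmware/config.txt location:

```shell
# append the Gen 3 parameters only if they are not already present
CONFIG=/boot/firmware/config.txt
if ! grep -q '^dtparam=pciex1_gen=3' "$CONFIG"; then
  printf 'dtparam=pciex1\ndtparam=pciex1_gen=3\n' | sudo tee -a "$CONFIG" > /dev/null
fi
sudo reboot
```

The grep guard prevents duplicate entries from piling up in config.txt if the script is run more than once.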
Again we look at the PCIe link using:
$ sudo lspci -s 0000:01:00.0 -vvv | grep -Ew "LnkSta|LnkCap"
LnkCap: Port #0, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
LnkSta: Speed 8GT/s (downgraded), Width x1 (downgraded)
Now we see 8GT/s were negotiated which aligns with PCIe Gen 3 speeds. But does it amount to a noticeable difference in transfer rates?
Run the dd command again:
$ dd if=/dev/zero of=./Testingfile bs=100M count=50 oflag=direct
50+0 records in
50+0 records out
5242880000 bytes (5.2 GB, 4.9 GiB) copied, 7.1666 s, 732 MB/s
Now we can write the same 5GB file in only 7.17 seconds (down from 13.31), at a transfer rate of 732 MB/sec. An increase of roughly 86%!
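The percentage quoted above falls straight out of the two measured rates; a quick sanity check of the arithmetic:

```shell
# percentage speedup from the two measured dd transfer rates
awk 'BEGIN { printf "speedup: %.1f%%\n", (732 - 394) / 394 * 100 }'
```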
Stability at High Speeds
When forcing devices to run at higher speeds, it’s important to monitor for errors that may occur. These errors will show up in the Uncorrectable Error Status Register, which we can inspect with lspci:
$ sudo lspci -s 0000:01:00.0 -vvv | grep UESta
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
See Uncorrectable Error Status Register for a description of these.
In the above example, every flag ends with a minus sign (-), meaning no errors have occurred. If an error had been flagged, the corresponding field would end with a plus sign (+).
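For ongoing monitoring, that check can be wrapped in a small helper. A minimal sketch (the helper name is ours, and the device address 0000:01:00.0 is assumed from the earlier lspci output):

```shell
# report whether any uncorrectable-error bit in an lspci UESta line is set ('+')
check_uesta() {
  case "$1" in
    *+*) echo "error" ;;   # at least one flag ends in '+'
    *)   echo "clean" ;;   # all flags end in '-'
  esac
}

# usage (requires root):
#   check_uesta "$(sudo lspci -s 0000:01:00.0 -vvv | grep UESta)"
```

Dropping this into a cron job or a periodic systemd timer gives an early warning if the overclocked link starts flagging errors.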