Forward Error Correction (FEC), Features in PCIe 6.0

The PCIe 6.0 specification is already available in its first drafts, and it brings quite a few new features, which we covered in an earlier article about this new version of the bus. One of them stands apart, because it had to be included and it will shape how motherboards move from version 5.0 to 6.0: although 6.0 is backward compatible with 5.0, version 5.0 hardware will not be compatible with 6.0 because of a feature called Forward Error Correction, or FEC.

Why has this new standard been so difficult to develop? Why did the bus take so long to evolve in general, yet so little time since version 5.0? The improvements are extensive: most systems are currently on PCIe 4.0, and 6.0 quadruples its speed over the same x16 link width. PCI-SIG therefore had to introduce a series of changes that guarantee data delivery, including PAM4 and FEC. But what is FEC, and how does it work on this specific bus?

Forward Error Correction or FEC, a necessary technology for PCIe 6.0

We have already talked about PAM4, but FEC cannot be understood without it. Network engineers have been using PAM4 for some years; in large data centers it has been one of the key technologies for extending the life of existing infrastructure or upgrading it.

It does not stop there. PAM4 has now been brought to the PCIe bus because of how it modulates the signal: it carries two bits per symbol, so it achieves more bandwidth for each available Hz. It also has drawbacks that must be mitigated, chiefly a more fragile signal. That fragility is the real reason PCI-SIG included Forward Error Correction, or FEC.
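
To make the PAM4 idea concrete, here is a minimal sketch of how two bits can be packed into one four-level symbol, compared with NRZ, which sends one bit per two-level symbol. The level values (-3, -1, 1, 3), the Gray-style ordering and the function names are illustrative assumptions, not values taken from the specification.

```python
# Illustrative mapping from bit pairs to four amplitude levels.
# Neighbouring levels differ by a single bit (Gray-style ordering),
# so a small amplitude error corrupts only one bit. Values are assumed.
GRAY_PAM4 = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def pam4_encode(bits):
    """Pack a flat list of bits (even length) into PAM4 symbols, two bits each."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [GRAY_PAM4[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def nrz_encode(bits):
    """NRZ for comparison: one bit per symbol, two levels."""
    return [1 if b else -1 for b in bits]


bits = [0, 0, 0, 1, 1, 1, 1, 0]
pam4_symbols = pam4_encode(bits)  # 4 symbols carry the 8 bits
nrz_symbols = nrz_encode(bits)    # 8 symbols carry the same 8 bits
print(pam4_symbols, len(nrz_symbols))
```

The same eight bits need half as many symbols in PAM4, but the four levels have to share the same voltage swing, so they sit closer together than the two NRZ levels. That is where the more fragile signal comes from.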

As its name suggests, FEC is simply a means of correcting errors in the transmission and reception of a signal between two link partners or hosts, so that the link provides a constant flow of data with error correction included.

The result is that a signal whose data integrity would otherwise be at risk becomes a stable signal without errors, which guarantees the correct operation of the equipment and its components.

The problem with this technology is its high latency

But all that glitters is not gold. Conventional FEC, which by its nature hunts down and corrects errors in the purest CRC style, is not suitable as such for a bus like PCIe, and even less so for version 6.0 at 128 GB/s.

The problem with FEC is that it adds latency to the bus, so packet delivery slows down and an unwanted delay can appear. To keep latency low, PCIe 6.0 combines a first bit error rate (FBER) target of 10^-6 with a lightweight, low-latency FEC that performs the initial correction.

FEC can indeed correct errors, but to do so it must determine the exact location and magnitude of each error. Why keep it lightweight, then? The answer is straightforward: the goal was to pay a near-zero latency penalty (zero is impossible) and then rely on a very robust CRC for detection, combined with fast link-level replay to handle any errors the FEC could not correct. FEC is not infallible, which is why the CRC is needed.
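
The division of labor described above (a light FEC corrects what it can, a CRC detects what is left, and a replay is requested on failure) can be sketched with a toy model. Everything here is an assumption for illustration: the triple-repetition code with majority voting stands in for the real PCIe 6.0 FEC, `zlib.crc32` stands in for the real CRC, and the function names are hypothetical.

```python
import zlib


def add_crc(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 so the receiver can detect leftover errors."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def crc_ok(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the stored one."""
    payload, tag = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == tag


def fec_encode(frame: bytes) -> bytes:
    """Toy FEC (NOT the PCIe code): send three copies of the frame."""
    return frame * 3


def fec_decode(coded: bytes) -> bytes:
    """Bitwise majority vote across the three copies."""
    n = len(coded) // 3
    a, b, c = coded[:n], coded[n:2 * n], coded[2 * n:]
    return bytes((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))


def receive(coded: bytes) -> str:
    """FEC first, then CRC. A CRC failure means NAK, i.e. a replay request."""
    return "ACK" if crc_ok(fec_decode(coded)) else "NAK"


def flip_bit(data: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of data with one bit inverted, to simulate a line error."""
    out = bytearray(data)
    out[byte_index] ^= 1 << bit
    return bytes(out)


payload = b"PCIe 6.0 flit"
coded = fec_encode(add_crc(payload))
n = len(coded) // 3

clean = receive(coded)                      # no errors on the line
one_error = receive(flip_bit(coded, 2, 0))  # the toy FEC repairs it
# The same bit is hit in two of the three copies: the FEC votes wrongly,
# the CRC catches the damage, and the receiver asks for a replay.
two_errors = receive(flip_bit(flip_bit(coded, 2, 0), n + 2, 0))
print(clean, one_error, two_errors)
```

The repetition code is far too wasteful for a real bus, but it shows the principle: the FEC handles the common case at low cost, and the CRC plus replay covers the rare cases where the FEC guesses wrong.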

On the other hand, if the link runs below the 128 GB/s of PCIe 6.0, it is possible that FEC can be bypassed, which would lower latency for the system.

What happens if FEC cannot correct the errors? It is then time for the CRC to step in: it detects the failure and a NAK is generated, which triggers a replay. The resulting round trip to resend and check the data adds up to 100 ns of latency.
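
Why is this trade-off attractive? A rough expected-value calculation helps. Only the 100 ns replay round trip comes from the text above. The fixed FEC delay and the replay probability below are hypothetical placeholder values chosen for illustration, and the function name is an assumption.

```python
def expected_added_latency_ns(fec_ns: float, replay_ns: float, p_replay: float) -> float:
    """Average extra latency: the FEC cost is always paid, the replay only sometimes."""
    return fec_ns + p_replay * replay_ns


REPLAY_NS = 100.0  # worst-case replay round trip mentioned in the text
FEC_NS = 2.0       # hypothetical fixed delay of a lightweight FEC
P_REPLAY = 5e-6    # hypothetical probability that a replay is needed

avg = expected_added_latency_ns(FEC_NS, REPLAY_NS, P_REPLAY)
print(f"average added latency: {avg:.4f} ns")
```

Under these assumed numbers the average is dominated by the small fixed FEC cost, because the expensive replay happens so rarely. A heavier FEC would reduce replays further, but it would raise the cost that every single transfer pays.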

The use of FEC is clearly justified. It is not perfect, but it is the best way to keep latency as low as possible while still providing error correction, which is essential for something as delicate as moving data between the CPU, memory and GPU.