APEI(4) Device Drivers Manual APEI(4)

apei - ACPI Platform Error Interfaces

apei* at apeibus?

apei reports hardware errors discovered through APEI, the ACPI Platform Error Interfaces.

apei also supports injecting errors.

When the hardware detects an error and reports it to apei, apei prints information about the error to the console.

Example of a correctable memory error, automatically corrected by the system, with no further intervention needed:

apei0: error source 1 reported hardware error: severity=corrected nentries=1 status=0x12<CE,GEDE_COUNT=0x1>
apei0: error source 1 entry 0: SectionType={0xa5bc1114,0x6f64,0x4ede,0xb8b8,{0x3e,0x83,0xed,0x7c,0x83,0xb1}} (memory error)
apei0: error source 1 entry 0: ErrorSeverity=2 (corrected)
apei0: error source 1 entry 0: Revision=0x201
apei0: error source 1 entry 0: Flags=0x1<PRIMARY>
apei0: error source 1 entry 0: FruText=CorrectedErr
apei0: error source 1 entry 0: MemoryErrorType=8 (PARITY_ERROR)
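The status value in the first line of a report can be decoded by hand. The following is a minimal sketch, not part of apei, and it assumes the Block Status layout of the ACPI Generic Error Status Block: bit 0 for an uncorrectable error, bit 1 for a correctable error, bits 2 and 3 for multiple errors of each kind, and bits 4 to 13 for the number of error data entries. The names MULTI_UE and MULTI_CE and the function decode_block_status are chosen here for illustration only.

```python
# Assumed bit layout of the Generic Error Status Block "Block Status"
# field (ACPI specification, chapter 18). Flag names are illustrative.
STATUS_FLAGS = (
    ("UE", 1 << 0),        # uncorrectable error valid
    ("CE", 1 << 1),        # correctable error valid
    ("MULTI_UE", 1 << 2),  # multiple uncorrectable errors
    ("MULTI_CE", 1 << 3),  # multiple correctable errors
)
GEDE_COUNT_SHIFT = 4       # entry count occupies bits 4-13
GEDE_COUNT_MASK = 0x3FF    # 10 bits wide


def decode_block_status(status):
    """Return (list of flag names set, error data entry count)."""
    flags = [name for name, bit in STATUS_FLAGS if status & bit]
    count = (status >> GEDE_COUNT_SHIFT) & GEDE_COUNT_MASK
    return flags, count


if __name__ == "__main__":
    for value in (0x12, 0x11):
        flags, count = decode_block_status(value)
        print(hex(value), ",".join(flags), "GEDE_COUNT=%#x" % count)
```

Under that assumed layout, 0x12 decodes to CE with one entry, matching status=0x12<CE,GEDE_COUNT=0x1> above.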

Example of a fatal uncorrectable memory error:

apei0: error source 0 reported hardware error: severity=fatal nentries=1 status=0x11<UE,GEDE_COUNT=0x1>
apei0: error source 0 entry 0: SectionType={0xa5bc1114,0x6f64,0x4ede,0xb8b8,{0x3e,0x83,0xed,0x7c,0x83,0xb1}} (memory error)
apei0: error source 0 entry 0: ErrorSeverity=1 (fatal)
apei0: error source 0 entry 0: Revision=0x201
apei0: error source 0 entry 0: Flags=0x1<PRIMARY>
apei0: error source 0 entry 0: FruText=UncorrectedErr
apei0: error source 0 entry 0: ErrorStatus=0x400<ErrorType=0x4=ERR_MEM>
apei0: error source 0 entry 0: Node=0x0
apei0: error source 0 entry 0: Module=0x0
apei0: error source 0 entry 0: Device=0x0
panic: fatal hardware error

Details of the hardware error sources can be dumped with acpidump(8).
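The SectionType value in the example messages is a GUID printed as a brace-delimited list of fields. To compare it with the section type GUIDs listed in the UEFI specification, it may help to convert it to the usual hyphenated form. The following is a minimal sketch that assumes the printed format always matches the examples above (four integer fields followed by six node bytes); the function name sectiontype_to_guid is hypothetical and not part of any NetBSD tool.

```python
import re


def sectiontype_to_guid(text):
    """Convert '{0xa5bc1114,0x6f64,...,{0x3e,...}}' to hyphenated form.

    Assumes exactly ten hexadecimal fields, as in the apei(4) examples:
    a 32-bit field, three 16-bit fields, and six node bytes.
    """
    fields = [int(h, 16) for h in re.findall(r"0x([0-9a-fA-F]+)", text)]
    if len(fields) != 10:
        raise ValueError("expected 10 hexadecimal fields, got %d" % len(fields))
    head = "%08x-%04x-%04x-%04x" % tuple(fields[:4])
    node = "".join("%02x" % b for b in fields[4:])
    return head + "-" + node


if __name__ == "__main__":
    printed = "{0xa5bc1114,0x6f64,0x4ede,0xb8b8,{0x3e,0x83,0xed,0x7c,0x83,0xb1}}"
    print(sectiontype_to_guid(printed))
```

For the memory error section shown above, this yields a5bc1114-6f64-4ede-b8b8-3e83ed7c83b1.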

acpi(4), acpihed(4), acpidump(8)

ACPI Specification 6.5, https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html, Chapter 18: ACPI Platform Error Interfaces (APEI).

The apei driver first appeared in NetBSD 10.1.

The apei driver was written by Taylor R Campbell <riastradh@NetBSD.org>.

No sysctl interface to read BERT after boot.

No simple sysctl interface to inject errors with EINJ, or any way to inject errors at physical addresses in pages allocated for testing. Perhaps there should be a separate kernel module for that.

Nothing reads, writes, or clears ERST. NetBSD could use it to store dmesg or other diagnostic information on panic.

Many hardware error source types in the HEST are missing, such as PCIe errors.

apei is not wired to any machine-dependent machine check exception notifications.

No formal log format or sysctl/device interface that programs can reliably act on.

NetBSD makes no attempt to recover from uncorrectable but recoverable errors, such as discarding a clean cached page where an uncorrectable memory error has occurred.

March 18, 2024 NetBSD 10.99