10 Replies Latest reply on Jan 9, 2018 12:19 AM by Intel Corporation

    R2308IP4LHPC - EDAC sbridge failed to register device (not an ECC memory error)

    jeichenhofer

      After installing Fedora on my new server, I'm seeing the following error messages in the log:

      EDAC sbridge: Failed to register device with error -22.

      EDAC sbridge: Couldn't find mci handler

      As far as I can tell, everything is working fine, but I want to avoid any errors or missing functionality in the future. My searching so far has only turned up threads about ECC memory errors, and those are usually accompanied by other messages about ECC being disabled. I'm not sure which other log files might have information, or whether this issue even needs attention.

       

      Does anyone know how to continue investigating this error?
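
One possible starting point, sketched under the assumption that the kernel exposes the usual EDAC sysfs tree at /sys/devices/system/edac/mc with an mc_name file per controller: list which memory controllers actually registered and which driver each one reports. The function name edac_controllers is made up for illustration.

```shell
# Hypothetical helper: print each registered EDAC memory controller and the
# driver name it reports. The sysfs root is a parameter so the function can
# be pointed at another directory; the default is the assumed standard path.
edac_controllers() {
    root="${1:-/sys/devices/system/edac/mc}"
    for mc in "$root"/mc[0-9]*; do
        # Skip anything without a readable mc_name (including an unmatched glob).
        [ -r "$mc/mc_name" ] || continue
        printf '%s: %s\n' "${mc##*/}" "$(cat "$mc/mc_name")"
    done
}
```

Running `edac_controllers` with no argument reads the live sysfs tree. Comparing its output with `lsmod | grep -i edac` shows whether a loaded EDAC module ended up owning any controller.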

       

      Here's the output of grepping /var/log/messages for edac, sbridge, and mci:
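
The `--` separators and the three non-matching lines on either side of each group of hits suggest the search was run case-insensitively with context, probably along these lines. The exact flags and the function name edac_grep are assumptions, not taken from the original post.

```shell
# Hypothetical helper: search a syslog file for EDAC-related lines,
# case-insensitively, keeping 3 lines of context around each match.
# Defaults to /var/log/messages when no file is given.
edac_grep() {
    grep -i -C 3 -E 'edac|sbridge|mci' "${1:-/var/log/messages}"
}
```

Called as `edac_grep` with no argument it reads /var/log/messages, which usually requires root. Pass a path to search a copy of the log instead.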

      Dec 22 13:29:03 hostname_removed kernel: ERST: Error Record Serialization Table (ERST) support is initialized.

      Dec 22 13:29:03 hostname_removed kernel: pstore: using zlib compression

      Dec 22 13:29:03 hostname_removed kernel: pstore: Registered erst as persistent store backend

      Dec 22 13:29:03 hostname_removed kernel: ghes_edac: This EDAC driver relies on BIOS to enumerate memory and get error reports.

      Dec 22 13:29:03 hostname_removed kernel: ghes_edac: Unfortunately, not all BIOSes reflect the memory layout correctly.

      Dec 22 13:29:03 hostname_removed kernel: ghes_edac: So, the end result of using this driver varies from vendor to vendor.

      Dec 22 13:29:03 hostname_removed kernel: ghes_edac: If you find incorrect reports, please contact your hardware vendor

      Dec 22 13:29:03 hostname_removed kernel: ghes_edac: to correct its BIOS.

      Dec 22 13:29:03 hostname_removed kernel: ghes_edac: This system has 16 DIMM sockets.

      Dec 22 13:29:03 hostname_removed kernel: EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)

      Dec 22 13:29:03 hostname_removed kernel: EDAC MC1: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)

      Dec 22 13:29:03 hostname_removed kernel: GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.

      Dec 22 13:29:03 hostname_removed kernel: Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled

      Dec 22 13:29:03 hostname_removed kernel: Non-volatile memory driver v1.3

      --

      Dec 22 13:29:08 hostname_removed kernel: RAPL PMU: hw unit of domain pp0-core 2^-16 Joules

      Dec 22 13:29:08 hostname_removed kernel: RAPL PMU: hw unit of domain package 2^-16 Joules

      Dec 22 13:29:08 hostname_removed kernel: RAPL PMU: hw unit of domain dram 2^-16 Joules

      Dec 22 13:29:08 hostname_removed kernel: EDAC sbridge: Couldn't find mci handler

      Dec 22 13:29:08 hostname_removed kernel: EDAC sbridge: Couldn't find mci handler

      Dec 22 13:29:08 hostname_removed kernel: EDAC sbridge: Failed to register device with error -22.

      Dec 22 13:29:08 hostname_removed kernel: intel_rapl: Found RAPL domain package

      Dec 22 13:29:08 hostname_removed kernel: intel_rapl: Found RAPL domain core

      Dec 22 13:29:08 hostname_removed kernel: intel_rapl: Found RAPL domain dram

      --

      Dec 23 08:24:58 hostname_removed kernel: ERST: Error Record Serialization Table (ERST) support is initialized.

      Dec 23 08:24:58 hostname_removed kernel: pstore: using zlib compression

      Dec 23 08:24:58 hostname_removed kernel: pstore: Registered erst as persistent store backend

      Dec 23 08:24:58 hostname_removed kernel: ghes_edac: This EDAC driver relies on BIOS to enumerate memory and get error reports.

      Dec 23 08:24:58 hostname_removed kernel: ghes_edac: Unfortunately, not all BIOSes reflect the memory layout correctly.

      Dec 23 08:24:58 hostname_removed kernel: ghes_edac: So, the end result of using this driver varies from vendor to vendor.

      Dec 23 08:24:58 hostname_removed kernel: ghes_edac: If you find incorrect reports, please contact your hardware vendor

      Dec 23 08:24:58 hostname_removed kernel: ghes_edac: to correct its BIOS.

      Dec 23 08:24:58 hostname_removed kernel: ghes_edac: This system has 16 DIMM sockets.

      Dec 23 08:24:58 hostname_removed kernel: EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)

      Dec 23 08:24:58 hostname_removed kernel: EDAC MC1: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)

      Dec 23 08:24:58 hostname_removed kernel: GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.

      Dec 23 08:24:58 hostname_removed kernel: Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled

      Dec 23 08:24:58 hostname_removed kernel: Non-volatile memory driver v1.3

      --

      Dec 23 08:25:02 hostname_removed kernel: ipmi_si dmi-ipmi-si.0: IPMI kcs interface initialized

      Dec 23 08:25:02 hostname_removed kernel: IPMI SSIF Interface driver

      Dec 23 08:25:02 hostname_removed systemd-udevd[946]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.

      Dec 23 08:25:02 hostname_removed kernel: EDAC sbridge: Couldn't find mci handler

      Dec 23 08:25:02 hostname_removed kernel: EDAC sbridge: Couldn't find mci handler

      Dec 23 08:25:02 hostname_removed kernel: EDAC sbridge: Failed to register device with error -22.

      Dec 23 08:25:02 hostname_removed systemd-udevd[947]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.

      Dec 23 08:25:02 hostname_removed kernel: intel_rapl: Found RAPL domain package

      Dec 23 08:25:02 hostname_removed kernel: intel_rapl: Found RAPL domain core