5. Events and logging

This section describes how to monitor and configure logging using the CLI, GUI, REST, IPMI and Redfish interfaces.

5.1. BMC command line

The standard Linux journal is available with the journalctl command on the BMC.
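
As an illustration, journal output exported with `journalctl -o json` (one JSON object per line) can be filtered by severity off the BMC. The sketch below is not part of the BMC software; the function name `parse_journal_lines` is hypothetical, and it assumes the export has been copied to a host with Python 3.

```python
import json


def parse_journal_lines(lines, max_priority=3):
    """Yield (timestamp_us, priority, message) for sufficiently severe entries.

    Syslog priorities run from 0 (emerg) to 7 (debug); lower is more severe,
    so max_priority=3 keeps "err" and anything worse.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        entry = json.loads(line)
        # journald exports field values as strings. Treating entries that
        # lack a PRIORITY field as "info" (6) is an assumption of this sketch.
        priority = int(entry.get("PRIORITY", 6))
        if priority <= max_priority:
            yield (
                int(entry["__REALTIME_TIMESTAMP"]),  # microseconds since the epoch
                priority,
                entry.get("MESSAGE", ""),
            )
```

For example, after running `journalctl -o json > journal.json` on the BMC and copying the file, passing the open file object to `parse_journal_lines` yields only the entries at priority "err" or worse.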

5.2. GUI

The logging operations available via the GUI are described in Table 5.1. The GUI page for viewing event logs is shown in Fig. 5.1.

Table 5.1 GUI logging operations

Command      Description
-----------  -----------
Event log    Display event logs.

Fig. 5.1 GUI event logging operations

5.3. REST API

You can perform logging operations through the REST API either by sending curl queries to the URI or by using the Graphcore openbmctool.py utility. Table 5.2 describes the commands available.

Table 5.2 Logging operations using the REST interface

Command        Description
-------------  -----------
list           List log entries (all types), log managers and log configurations available on the system.

               List all logs:

               $ curl -k https://<bmcip>/xyz/openbmc_project/logging/list -u <bmcuser>:<bmcpass>

enumerate      Show detailed information about log entries (all types), log managers and log configurations available on the system.

               Enumerate logs:

               $ curl -k https://<bmcip>/xyz/openbmc_project/logging/enumerate -u <bmcuser>:<bmcpass>

Configuration  Configure syslog logging on the BMC.

               Configure the syslog server address and port, using either curl (one request per attribute) or openbmctool.py:

               $ curl -k -H "Content-Type: application/json" -X PUT -d '{"data":<port>}' https://<bmcuser>:<bmcpass>@<bmcip>/xyz/openbmc_project/logging/config/remote/attr/Port
               $ curl -k -H "Content-Type: application/json" -X PUT -d '{"data":"<address>"}' https://<bmcuser>:<bmcpass>@<bmcip>/xyz/openbmc_project/logging/config/remote/attr/Address
               $ python3 openbmctool.py -H <bmcip> -U <bmcuser> -P <bmcpass> logging remote_logging_config -a <address> -p <port>

               View syslog configuration:

               $ python3 openbmctool.py -H <bmcip> -U <bmcuser> -P <bmcpass> logging remote_logging view

               Disable syslog:

               $ python3 openbmctool.py -H <bmcip> -U <bmcuser> -P <bmcpass> logging remote_logging disable
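
The two curl PUT requests used to set the syslog port and address can also be built from a script. The following is a minimal sketch using only the Python standard library. The function name `remote_logging_requests` is hypothetical, and sending the credentials as an HTTP Basic `Authorization` header rather than embedding them in the URL is an assumption of this sketch.

```python
import base64
import json
import urllib.request


def remote_logging_requests(bmc_ip, user, password, address, port):
    """Build the two PUT requests that set the remote syslog port and address."""
    base = f"https://{bmc_ip}/xyz/openbmc_project/logging/config/remote/attr"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Basic {token}",  # assumption: Basic auth header
    }
    requests = []
    # The port is sent as a JSON number and the address as a JSON string,
    # matching the {"data": ...} bodies of the curl commands.
    for attr, value in (("Port", int(port)), ("Address", str(address))):
        body = json.dumps({"data": value}).encode()
        requests.append(
            urllib.request.Request(
                f"{base}/{attr}", data=body, headers=headers, method="PUT"
            )
        )
    return requests
```

Each returned request can then be passed to `urllib.request.urlopen`. To mirror `curl -k`, supply an `ssl.SSLContext` with certificate verification disabled, which is only appropriate on a trusted management network.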

Example of listing log entries, log managers and logging configuration:

$ curl -k https://<bmcip>/xyz/openbmc_project/logging/list -u <bmcuser>:<bmcpass>
{
  "data": [
    "/xyz/openbmc_project/logging/config",
    "/xyz/openbmc_project/logging/config/remote",
    "/xyz/openbmc_project/logging/entry",
    "/xyz/openbmc_project/logging/entry/15",
    "/xyz/openbmc_project/logging/entry/16",
    "/xyz/openbmc_project/logging/entry/17",
    "/xyz/openbmc_project/logging/entry/18",
    "/xyz/openbmc_project/logging/entry/19",
    "/xyz/openbmc_project/logging/entry/20",
    "/xyz/openbmc_project/logging/entry/21",
    "/xyz/openbmc_project/logging/entry/21/callout",
    "/xyz/openbmc_project/logging/internal",
    "/xyz/openbmc_project/logging/internal/manager",
    "/xyz/openbmc_project/logging/rest_api_logs"
  ],
  "message": "200 OK",
  "status": "ok"
}
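
The `data` array in this response mixes configuration objects, log managers and individual entries. As a rough sketch, the numeric IDs of the log entries can be picked out as follows; the function name `log_entry_ids` is hypothetical, and the path pattern is taken from the example output above.

```python
import json
import re

# Matches only top-level entries such as ".../logging/entry/15", not
# sub-objects such as ".../logging/entry/21/callout".
ENTRY_PATH = re.compile(r"^/xyz/openbmc_project/logging/entry/(\d+)$")


def log_entry_ids(response_text):
    """Return the sorted numeric IDs of log entries in a 'list' response."""
    payload = json.loads(response_text)
    ids = []
    for path in payload.get("data", []):
        match = ENTRY_PATH.match(path)
        if match:
            ids.append(int(match.group(1)))
    return sorted(ids)
```

Applied to the response shown above, this returns the IDs 15 to 21 and skips the `config`, `internal`, `rest_api_logs` and `callout` paths.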

5.4. IPMI

You can read all the events for the IPU-M2000 using the IPMI commands shown in Table 5.3.

Table 5.3 SEL operations using IPMI interface

Command      Description
-----------  -----------
SEL clear    Clear all event logs.

             $ ipmitool -I lanplus -C 3 -p 623 -U <bmcuser> -P <bmcpass> -H <bmcip> sel clear

SEL list     Display a list of events.

             $ ipmitool -I lanplus -C 3 -p 623 -U <bmcuser> -P <bmcpass> -H <bmcip> sel list

SEL elist    Display an extended list of events.

             $ ipmitool -I lanplus -C 3 -p 623 -U <bmcuser> -P <bmcpass> -H <bmcip> sel elist
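
The `sel elist` output is a set of pipe-separated fields, so captured output is straightforward to post-process. The sketch below splits one line into named fields and extracts the sensor number for later comparison with the SDR. The names `SelEntry`, `parse_sel_line` and `sensor_number` are hypothetical, and the field layout is assumed from the SEL examples in this section rather than from a formal specification.

```python
import re
from typing import NamedTuple, Optional


class SelEntry(NamedTuple):
    record_id: Optional[str]  # None when the line carries no record ID
    date: str
    time: str
    sensor: str
    event: str
    state: str


SENSOR_NUMBER = re.compile(r"#0x([0-9a-fA-F]+)")


def parse_sel_line(line):
    """Split one 'sel elist' line into its pipe-separated fields."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) == 5:
        fields.insert(0, None)  # line without a leading record ID
    if len(fields) != 6:
        raise ValueError(f"unexpected SEL line format: {line!r}")
    return SelEntry(*fields)


def sensor_number(entry):
    """Return the sensor number from e.g. 'Power Supply #0x0a', or None."""
    match = SENSOR_NUMBER.search(entry.sensor)
    return int(match.group(1), 16) if match else None
```

The sensor number is the value to look up in the SDR: for example, `Power Supply #0x0a` corresponds to the SDR row with sensor number `0Ah`.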

The examples below show SEL entries that are logged when a sensor exceeds its threshold or when an inventory item reports an error (such as not present or not functional).

  • When the inlet sensor goes above 45°C or the exhaust temperature sensor goes above 60°C, two generic system hardware failure events are logged as shown below.

    $ ipmitool -I lanplus -C 3 -p 623 -U <bmcuser> -P <bmcpass> -H <bmcip> sel elist
    XX | 05/11/2020 | 08:06:21 AM CEST | System Event #0x90 | Undetermined system hardware failure | Asserted
    XX | 05/11/2020 | 08:06:24 AM CEST | System Event #0x90 | Undetermined system hardware failure | Asserted
    

    Note

    Event logging for sensors is not limited to inlet and exhaust sensors (it is available for all sensors). However, only these two sensors will cause a system shutdown.

    All sensor-related events are logged as generic hardware failures. You will need to cross-check the SDR to identify the specific sensor fault.

    In addition, if you have configured an SNMP manager (see SNMP trap) for receiving SNMP event traps, you can see the following traps associated with the SEL entries. You can use these to identify the faulty sensor (in this case, the inlet sensor going over 45°C).

    2020-05-11 08:06:24 <bmc_fqdomain> [UDP: [<bmcip>]:<sport>->[<snmpmgrip>]:<dport>]:
    iso.3.6.1.6.3.1.1.4.1.0 = OID: iso.3.6.1.4.1.49871.1.0.0.1      iso.3.6.1.4.1.49871.1.0.1.1 = Gauge32: 47   iso.3.6.1.4.1.49871.1.0.1.2 = Opaque: UInt64: 168713071370436978 iso.3.6.1.4.1.49871.1.0.1.3 = INTEGER: 3    iso.3.6.1.4.1.49871.1.0.1.4 = STRING: "xyz.openbmc_project.Sensor.Threshold.Error.CriticalHigh - SENSOR_DATA=|/xyz/openbmc_project/sensors/temperature/inlet:Value=45000|"
    
    2020-05-11 08:06:24 <bmc_fqdomain> [UDP: [<bmcip>]:<sport>->[<snmpmgrip>]:<dport>]:
    iso.3.6.1.6.3.1.1.4.1.0 = OID: iso.3.6.1.4.1.49871.1.0.0.1      iso.3.6.1.4.1.49871.1.0.1.1 = Gauge32: 48   iso.3.6.1.4.1.49871.1.0.1.2 = Opaque: UInt64: 168728099461005682 iso.3.6.1.4.1.49871.1.0.1.3 = INTEGER: 3    iso.3.6.1.4.1.49871.1.0.1.4 = STRING: "xyz.openbmc_project.State.Shutdown.ThermalEvent.Error.Ambient - _PID=265"
    
  • When a power supply failure or removal is detected, the following SEL entry can be observed:

    $ ipmitool -I lanplus -C 3 -p 623 -U <bmcuser> -P <bmcpass> -H <bmcip> sel elist
    XX | 05/11/2020 | 08:39:53 AM CEST | Power Supply #0x0a | Presence detected | Asserted
    

    Cross-check with the SDR to determine whether the power supply has failed or is absent.

    • In the case of a functional error, the following output will be seen:

      powersupply1     | 0Ah | ok  | 10.2 | Presence Detected, Failure detected
      
    • In the case of a presence error, the following output will be seen:

      powersupply1     | 0Ah | ok  | 10.2 |
      

    For this power supply SEL entry, the following trap is received on the SNMP manager.

    2020-05-12 13:38:14 <bmc_fqdomain> [UDP: [<bmcip>]:<sport>->[<snmpmgrip>]:<dport>]:
    iso.3.6.1.6.3.1.1.4.1.0 = OID: iso.3.6.1.4.1.49871.1.0.0.1      iso.3.6.1.4.1.49871.1.0.1.1 = Gauge32: 52       iso.3.6.1.4.1.49871.1.0.1.2 = Opaque: UInt64: 625323637452308850       iso.3.6.1.4.1.49871.1.0.1.3 = INTEGER: 3        iso.3.6.1.4.1.49871.1.0.1.4 = STRING: "xyz.openbmc_project.Inventory.Error.Nonfunctional - CALLOUT_INVENTORY_PATH=/xyz/openbmc_project/inventory/system/chassis/powersupply1"
    
  • When an IPU failure is detected, the following SEL entry will be seen:

    $ ipmitool -I lanplus -C 3 -p 623 -U <bmcuser> -P <bmcpass> -H <bmcip> sel elist
    XX | 05/12/2020 | 04:13:22 PM CEST | Processor #0x11 | Disabled | Asserted
    

    You need to cross check with the SDR to identify if it is a functional or presence error.

    • In the case of a functional error, the following output will be seen:

      ipu0             | 11h | ok  | 45.1 | Presence detected, Disabled
      
    • In the case of a presence error, the following output will be seen:

      ipu0             | 11h | ok  | 45.1 |
      

    Note

    RNICs are also defined as processor/IO modules and similar SEL/SDR entries will appear in the case of an RNIC failure.

    For the above IPU SEL entry, a trap similar to that shown below will be received by the configured SNMP manager.

    2020-05-12 22:53:45 ipum.example.com [UDP: [<bmcip>]:<sport>->[<snmpmgrip>]:<dport>]:
    iso.3.6.1.6.3.1.1.4.1.0 = OID: iso.3.6.1.4.1.49871.1.0.0.1      iso.3.6.1.4.1.49871.1.0.1.1 = Gauge32: 63    iso.3.6.1.4.1.49871.1.0.1.2 = Opaque: UInt64: 768481056411091314        iso.3.6.1.4.1.49871.1.0.1.3 = INTEGER: 3     iso.3.6.1.4.1.49871.1.0.1.4 = STRING: "xyz.openbmc_project.Inventory.Error.Nonfunctional - CALLOUT_INVENTORY_PATH=/xyz/openbmc_project/inventory/system/chassis/motherboard/ipu0"
    
  • When an NVMe failure is detected, an SEL entry like the following will be logged:

    XX | 05/13/2020 | 10:18:37 AM CEST | Drive Slot / Bay #0x42 | Drive Fault | Asserted
    

    Cross-check with the SDR to determine whether it is a functional or presence error.

    • In the case of a functional error:

      nvme0            | 42h | ok  |  4.1 | Drive Present, Drive Fault
      
    • In the case of a presence error:

      nvme0            | 42h | ok  |  4.1 |
      

    For the above NVMe SEL entry, a trap like the following is received on the SNMP manager:

    2020-05-13 10:18:37 ipum.example.com [UDP: [<bmcip>]:<sport>->[<snmpmgrip>]:<dport>]:
    iso.3.6.1.6.3.1.1.4.1.0 = OID: iso.3.6.1.4.1.49871.1.0.0.1      iso.3.6.1.4.1.49871.1.0.1.1 = Gauge32: 71       iso.3.6.1.4.1.49871.1.0.1.2 = Opaque: UInt64: 944969603430220146        iso.3.6.1.4.1.49871.1.0.1.3 = INTEGER: 3        iso.3.6.1.4.1.49871.1.0.1.4 = STRING: "xyz.openbmc_project.Inventory.Error.Nonfunctional - CALLOUT_INVENTORY_PATH=/xyz/openbmc_project/inventory/system/chassis/motherboard/nvme0"
    
  • When an event is de-asserted, a generic system entry with the Deasserted state is recorded in the SEL, as shown below:

    05/13/2020 | 08:24:21 AM CEST | System Event #0x90 | Undetermined system hardware failure | Deasserted
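
The SDR cross-checks and SNMP trap strings in the examples above follow a regular pattern, so they can be interpreted by a short script. The sketch below is illustrative only. The names `classify_sdr_line` and `trap_error` are hypothetical, the fault state names are taken from the examples in this section, and the "no fault reported" result for a row that shows only a presence state is an assumption.

```python
import re

# Reading states that indicate a functional error in the examples above.
FAULT_STATES = {"failure detected", "disabled", "drive fault"}

TRAP_STRING = re.compile(r'STRING: "([^"]*)"')


def classify_sdr_line(line):
    """Return (name, sensor_number, kind) for one pipe-separated SDR row."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 5:
        raise ValueError(f"unexpected SDR line format: {line!r}")
    name, number, _status, _entity, reading = fields
    states = [s.strip().lower() for s in reading.split(",") if s.strip()]
    if not states:
        kind = "presence error"  # empty reading: item is not present
    elif any(state in FAULT_STATES for state in states):
        kind = "functional error"  # present, but reporting a fault
    else:
        kind = "no fault reported"  # assumption: presence state only
    return name, int(number.rstrip("hH"), 16), kind


def trap_error(trap_text):
    """Return (error_name, detail_key, detail_value) from a trap, or None."""
    match = TRAP_STRING.search(trap_text)
    if match is None:
        return None
    error, _, detail = match.group(1).partition(" - ")
    key, _, value = detail.partition("=")
    return error, key, value
```

For an inventory error, `trap_error` returns the `CALLOUT_INVENTORY_PATH` value, which names the failed item directly. For a threshold error it returns the `SENSOR_DATA` value, which contains the sensor path and reading.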
    

5.5. Redfish

You can perform logging operations through the Redfish interface either by sending curl queries or by browsing to the URI.

SEL endpoints can be found in the Systems collection available at https://<bmcip>/redfish/v1/Systems/system/LogServices/.

Journal endpoints can be found in the Managers collection available at https://<bmcip>/redfish/v1/Managers/bmc/LogServices/.

Table 5.4 Logging operations using Redfish interface

Command                 Description
----------------------  -----------
List SEL                List SEL entries.

                        $ curl -k https://<bmcip>/redfish/v1/Systems/system/LogServices/EventLog/Entries -u <bmcuser>:<bmcpass>

Delete logging entries  Delete system event entries. Redfish actions are invoked with a POST request.

                        $ curl -k -X POST https://<bmcip>/redfish/v1/Systems/system/LogServices/EventLog/Actions/LogService.ClearLog -u <bmcuser>:<bmcpass>

List journal logs       List systemd journal logs through Redfish.

                        $ curl -k https://<bmcip>/redfish/v1/Managers/bmc/LogServices/Journal/Entries -u <bmcuser>:<bmcpass>

Note

Redfish SEL entries do not include the inventory item that generated the log entry.

Example output from the list SEL command is shown below:

$ curl -k https://<bmcip>/redfish/v1/Systems/system/LogServices/EventLog/Entries -u <bmcuser>:<bmcpass>
{
      "@odata.context": "/redfish/v1/$metadata#LogEntryCollection.LogEntryCollection",
      "@odata.id": "/redfish/v1/Systems/system/LogServices/EventLog/Entries",
      "@odata.type": "#LogEntryCollection.LogEntryCollection",
      "Description": "Collection of System Event Log Entries",
      "Members": [
        {
          "@odata.context": "/redfish/v1/$metadata#LogEntry.LogEntry",
          "@odata.id": "/redfish/v1/Systems/system/LogServices/EventLog/Entries/15",
          "@odata.type": "#LogEntry.v1_4_0.LogEntry",
          "Created": "2020-05-11T11:45:21+00:00",
          "EntryType": "Event",
          "Id": "15",
          "Message": "xyz.openbmc_project.Inventory.Error.Nonfunctional",
          "Name": "System Event Log Entry",
          "Severity": "Critical"
        },
        {
          "@odata.context": "/redfish/v1/$metadata#LogEntry.LogEntry",
          "@odata.id": "/redfish/v1/Systems/system/LogServices/EventLog/Entries/16",
          "@odata.type": "#LogEntry.v1_4_0.LogEntry",
          "Created": "2020-05-11T11:45:23+00:00",
          "EntryType": "Event",
          "Id": "16",
          "Message": "xyz.openbmc_project.Inventory.Error.Nonfunctional",
          "Name": "System Event Log Entry",
          "Severity": "Critical"
        }
      ],
      "Members@odata.count": 2,
      "Name": "System Event Log Entries"
}
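
A response of this shape can be reduced to a short summary for monitoring. The following is a minimal sketch that assumes the response body has been saved as valid JSON; the function name `summarize_event_log` is hypothetical, and only the `Members` fields shown in the example are used.

```python
import json


def summarize_event_log(collection_text, severity=None):
    """Return (Id, Created, Severity, Message) tuples from a LogEntryCollection.

    If severity is given (for example "Critical"), only matching entries
    are returned.
    """
    collection = json.loads(collection_text)
    rows = []
    for member in collection.get("Members", []):
        if severity is not None and member.get("Severity") != severity:
            continue
        rows.append(
            (member["Id"], member["Created"], member["Severity"], member["Message"])
        )
    return rows
```

Because Redfish SEL entries carry only the error name in `Message`, identifying the affected inventory item still requires the IPMI SDR or the SNMP trap.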

Example output from the command to list journal entries is shown below:

$ curl -k https://<bmcip>/redfish/v1/Managers/bmc/LogServices/Journal/Entries -u <bmcuser>:<bmcpass>
{
   "@odata.context": "/redfish/v1/$metadata#LogEntryCollection.LogEntryCollection",
   "@odata.id": "/redfish/v1/Managers/bmc/LogServices/BmcLog/Entries",
   "@odata.type": "#LogEntryCollection.LogEntryCollection",
   "Description": "Collection of BMC Journal Entries",
   "Members": [
     {
       "@odata.context": "/redfish/v1/$metadata#LogEntry.LogEntry",
       "@odata.id": "/redfish/v1/Managers/bmc/LogServices/Journal/Entries/1589216068566513",
       "@odata.type": "#LogEntry.v1_4_0.LogEntry",
       "Created": "2020-05-11T16:54:28+00:00",
       "EntryType": "Oem",
       "Id": "1589216068566513",
       "Message": "Booting Linux on physical CPU 0x0",
       "Name": "BMC Journal Entry",
       "OemRecordFormat": "BMC Journal Entry",
       "Severity": "OK"
     },
     ....
   ],
   "Members@odata.count": 2096,
   "Members@odata.nextLink": "/redfish/v1/Managers/bmc/LogServices/Journal/Entries?$skip=1000",
   "Name": "Open BMC Journal Entries"
}
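
The journal collection is returned in pages: when more entries are available, the response carries a `Members@odata.nextLink` annotation (defined by Redfish) pointing to the next page. The sketch below follows those links until the collection is exhausted. The names `iter_log_entries` and `https_fetch` are hypothetical, and the use of HTTP Basic authentication with certificate verification disabled (mirroring `curl -k -u`) is an assumption.

```python
import base64
import json
import ssl
import urllib.request


def iter_log_entries(fetch, first_path):
    """Yield every member of a paged Redfish collection.

    fetch is any callable that takes a URI path and returns the decoded
    JSON document (a dict) for that path.
    """
    path = first_path
    while path:
        page = fetch(path)
        yield from page.get("Members", [])
        # Absent on the last page, which ends the loop.
        path = page.get("Members@odata.nextLink")


def https_fetch(bmc_ip, user, password):
    """Return a fetch callable for a BMC, skipping TLS verification like curl -k."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    token = base64.b64encode(f"{user}:{password}".encode()).decode()

    def fetch(path):
        request = urllib.request.Request(
            f"https://{bmc_ip}{path}",
            headers={"Authorization": f"Basic {token}"},
        )
        with urllib.request.urlopen(request, context=context) as response:
            return json.load(response)

    return fetch
```

A typical call would be `iter_log_entries(https_fetch("<bmcip>", "<bmcuser>", "<bmcpass>"), "/redfish/v1/Managers/bmc/LogServices/Journal/Entries")`. Keeping the fetch function separate from the paging loop lets the loop be exercised against saved responses without a BMC.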