| IPMCTL-START-DIAGNOSTIC(1) | ipmctl | IPMCTL-START-DIAGNOSTIC(1) |
ipmctl-start-diagnostic - Starts a diagnostic test
ipmctl start [OPTIONS] -diagnostic [TARGETS]
Starts a diagnostic test.
-h, -help
-ddrt
-smbus
The -ddrt and -smbus options are mutually exclusive.
-lpmb
-spmb
The -lpmb and -spmb options are mutually exclusive.
-o (text|nvmxml), -output (text|nvmxml)
-diagnostic [Quick|Config|Security|FW]
-dimm [DimmIDS]
Starts all diagnostics.
ipmctl start -diagnostic
Starts the quick check diagnostic on PMem module 0x0001.
ipmctl start -diagnostic Quick -dimm 0x0001
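The command forms above can also be assembled programmatically. The following is a minimal sketch, not part of ipmctl: `build_start_diagnostic` is a hypothetical helper that only builds the argument list and does not run anything. It assumes that several PMem module IDs are passed to -dimm as one comma-separated value, and it enforces the two mutually exclusive option pairs listed on this page.

```python
from typing import List, Optional, Sequence

# Test names accepted by the -diagnostic target, as listed on this page.
VALID_TESTS = ("Quick", "Config", "Security", "FW")
# Option pairs this page documents as mutually exclusive.
EXCLUSIVE_PAIRS = (("-ddrt", "-smbus"), ("-lpmb", "-spmb"))


def build_start_diagnostic(
    test: Optional[str] = None,
    dimm_ids: Sequence[str] = (),
    options: Sequence[str] = (),
) -> List[str]:
    """Return an argv list for `ipmctl start [OPTIONS] -diagnostic [TARGETS]`.

    Hypothetical helper for illustration; it never executes ipmctl.
    """
    for first, second in EXCLUSIVE_PAIRS:
        if first in options and second in options:
            raise ValueError(f"{first} and {second} cannot be combined")
    if test is not None and test not in VALID_TESTS:
        raise ValueError(f"unknown diagnostic test: {test!r}")

    argv = ["ipmctl", "start", *options, "-diagnostic"]
    if test is not None:
        argv.append(test)
    if dimm_ids:
        # Assumption: several IDs are passed as one comma-separated value.
        argv += ["-dimm", ",".join(dimm_ids)]
    return argv


print(" ".join(build_start_diagnostic("Quick", ["0x0001"])))
```

The returned list can be handed to a process launcher such as `subprocess.run` on a system where ipmctl is installed.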
If a PMem module is unmanageable, the Quick test reports the reason, while the Config, Security, and FW tests skip that PMem module.
Each diagnostic generates one or more log messages. A successful test generates a single log message per PMem module indicating that no errors were found. A failed test might generate multiple log messages, each highlighting a specific error with all the relevant details. Each log contains the following information:
Test
State
Message
SubTestName
| Test Name | Valid SubTest Names |
| Quick | Manageability; Boot status; Health |
| Config | PMem module specs; Duplicate PMem module; System Capability; Namespace LSA; PCD |
| Security | Encryption status; Inconsistency |
| FW | FW Consistency; Viral Policy; Threshold check; System Time |
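When post-processing diagnostic logs, the table above can be kept as a lookup structure. This is a small sketch under that assumption; `VALID_SUBTESTS` and `is_valid_subtest` are illustrative names and not part of ipmctl.

```python
# Valid SubTestName values per test, copied from the table above.
VALID_SUBTESTS = {
    "Quick": ("Manageability", "Boot status", "Health"),
    "Config": (
        "PMem module specs",
        "Duplicate PMem module",
        "System Capability",
        "Namespace LSA",
        "PCD",
    ),
    "Security": ("Encryption status", "Inconsistency"),
    "FW": ("FW Consistency", "Viral Policy", "Threshold check", "System Time"),
}


def is_valid_subtest(test: str, subtest: str) -> bool:
    """Return True if `subtest` is listed for `test` in the table."""
    return subtest in VALID_SUBTESTS.get(test, ())


print(is_valid_subtest("Quick", "Health"))     # True
print(is_valid_subtest("Security", "Health"))  # False
```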
State
Invoking the Start Diagnostics command generates events that report potential issues found while analyzing the Intel® Optane™ PMem module.
Diagnostic events may fall into the following categories:
Each event includes the following pieces of information:
The following sections list each of the possible events grouped by category of the event.
The quick health check diagnostic verifies that the Intel® Optane™ PMem module’s host mailboxes are accessible and that basic health indicators can be read and are currently reporting acceptable values.
Table 1. Quick Health Check Events
| Code | Severity | Message | Arguments |
| 500 | Info | The quick health check succeeded. | |
| 501 | Warning | The quick health check detected that PMem module [1] is not manageable because subsystem vendor ID [2] is not supported. UID: [3] | 1. PMem module Handle; 2. Subsystem Vendor ID; 3. PMem module UID |
| 502 | Warning | The quick health check detected that PMem module [1] is not manageable because subsystem device ID [2] is not supported. UID: [3] | 1. PMem module Handle; 2. Subsystem Device ID; 3. PMem module UID |
| 503 | Warning | The quick health check detected that PMem module [1] is not manageable because firmware API version [2] is not supported. UID: [3] | 1. PMem module Handle; 2. FW API version; 3. PMem module UID |
| 504 | Warning | The quick health check detected that PMem module [1] is reporting a bad health state [2]. UID: [3] | 1. PMem module Handle; 2. Actual Health State; 3. PMem module UID |
| 505 | Warning | The quick health check detected that PMem module [1] is reporting a media temperature of [2] C which is above the alarm threshold [3] C. UID: [4] | 1. PMem module Handle; 2. Actual Media Temperature; 3. Media Temperature Threshold; 4. PMem module UID |
| 506 | Warning | The quick health check detected that PMem module [1] is reporting percentage remaining at [2]% which is less than the alarm threshold [3]%. UID: [4] | 1. PMem module Handle; 2. Actual Percentage Remaining; 3. Percentage Remaining Threshold; 4. PMem module UID |
| 507 | Warning | The quick health check detected that PMem module [1] is reporting reboot required. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 511 | Warning | The quick health check detected that PMem module [1] is reporting a controller temperature of [2] C which is above the alarm threshold [3] C. UID: [4] | 1. PMem module Handle; 2. Actual Controller Temperature; 3. Controller Temperature Threshold; 4. PMem module UID |
| 513 | Error | The quick health check detected that the boot status register of PMem module [1] is not readable. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 514 | Error | The quick health check detected that the firmware on PMem module [1] is reporting that the media is not ready. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 515 | Error | The quick health check detected that the firmware on PMem module [1] is reporting an error in the media. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 519 | Error | The quick health check detected that PMem module [1] failed to initialize BIOS POST testing. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 520 | Error | The quick health check detected that the firmware on PMem module [1] has not initialized successfully. The last known Major:Minor Checkpoint is [2]. UID: [3] | 1. PMem module Handle; 2. Major checkpoint : Minor checkpoint in Boot Status Register; 3. PMem module UID |
| 523 | Error | The quick health check detected that PMem module [1] is reporting a viral state. The PMem module is now read-only. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 529 | Warning | The quick health check detected that PMem module [1] is reporting that it has no package spares available. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 530 | Info | The quick health check detected that the firmware on PMem module [1] experienced an unsafe shutdown before its latest restart. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 533 | Error | The quick health check detected that the firmware on PMem module [1] is reporting that the AIT DRAM is not ready. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 534 | Error | The quick health check detected that the firmware on PMem module [1] is reporting that the media is disabled. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 535 | Error | The quick health check detected that the firmware on PMem module [1] is reporting that the AIT DRAM is disabled. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 536 | Error | The quick health check detected that the firmware on PMem module [1] failed to load successfully. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 538 | Error | PMem module [1] is reporting that the DDRT IO Init is not complete. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 539 | Error | PMem module [1] is reporting that the mailbox interface is not ready. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 540 | Error | An internal error caused the quick health check to abort on PMem module [1]. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 541 | Error | The quick health check detected that PMem module [1] is busy. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 542 | Error | The quick health check detected that the platform FW did not map a region to SPA on PMem module [1]. ACPI NFIT NVDIMM State Flags Error Bit 6 Set. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 543 | Error | The quick health check detected that PMem module [1] DDRT Training is not complete/failed. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 544 | Error | PMem module [1] is reporting that the DDRT IO Init is not started. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 545 | Error | The quick health check detected that the ROM on PMem module [1] has failed to complete initialization, last known Major:Minor Checkpoint is [2]. | 1. PMem module Handle; 2. Major checkpoint : Minor checkpoint in Boot Status Register; 3. PMem module UID |
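Each message in the table is a template whose bracketed numbers refer to the entries in the Arguments column. The sketch below illustrates that substitution. `render_event` is a hypothetical helper, not an ipmctl API, and the handle and UID values are made-up placeholders.

```python
import re
from typing import Sequence

# Matches positional placeholders such as [1], [2], ...
_PLACEHOLDER = re.compile(r"\[(\d+)\]")


def render_event(template: str, args: Sequence[str]) -> str:
    """Replace each [n] in `template` with the n-th argument (1-based).

    Placeholders without a matching argument are left unchanged.
    """
    def substitute(match: "re.Match[str]") -> str:
        index = int(match.group(1)) - 1
        if 0 <= index < len(args):
            return str(args[index])
        return match.group(0)

    return _PLACEHOLDER.sub(substitute, template)


# Template for event 507; the argument values are illustrative only.
template_507 = (
    "The quick health check detected that PMem module [1] "
    "is reporting reboot required. UID: [2]"
)
print(render_event(template_507, ["0x0001", "example-uid"]))
```

With these placeholder values, the rendered text reads "The quick health check detected that PMem module 0x0001 is reporting reboot required. UID: example-uid".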
This diagnostic test group verifies that the BIOS platform configuration matches the installed hardware and the platform configuration conforms to best known practices.
Table 2. Platform Configuration Check Events
| Code | Severity | Message | Arguments |
| 600 | Info | The platform configuration check succeeded. | |
| 601 | Info | The platform configuration check detected that there are no manageable PMem modules. | |
| 606 | Info | The platform configuration check detected that PMem module [1] is not configured. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 608 | Error | The platform configuration check detected [1] PMem modules installed on the platform with the same serial number [2]. | 1. Number of PMem modules with duplicate serial numbers; 2. The duplicate serial number |
| 609 | Info | The platform configuration check detected that PMem module [1] has a goal configuration that has not yet been applied. A system reboot is required for the new configuration to take effect. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 618 | Error | The platform configuration check detected that a PMem module with physical ID [1] is present in the system but failed to initialize. UID: [2] | 1. PMem module handle in the SMBIOS table; 2. PMem module UID |
| 621 | Error | The platform configuration check detected PCD contains invalid data on PMem module [1]. UID: [2] | 1. PMem module Handle; 2. PMem module UID |
| 622 | Error | The platform configuration check was unable to retrieve the namespace information. | |
| 623 | Warning | The platform configuration check detected that the BIOS settings do not currently allow memory provisioning from this software. | |
| 624 | Error | The platform configuration check detected that the BIOS could not apply the configuration goal on PMem module [1] because of errors in the goal data. The detailed status is COUT table status: [2] [3], Partition change table status: [4], Interleave change table 1 status: [5], Interleave change table 2 status: [6]. | 1. PMem module Handle; 2. Validation Status; 3. Text error code corresponding to the status code; 4. Partition Size Change Status; 5. Interleave Change Status; 6. Interleave Change Status |
| 625 | Error | The platform configuration check detected that the BIOS could not apply the configuration goal on PMem module [1] because the system has insufficient resources. The detailed status is COUT table status: [2] [3], Partition change table status: [4], Interleave change table 1 status: [5], Interleave change table 2 status: [6]. | 1. PMem module Handle; 2. Validation Status; 3. Text error code corresponding to the status code; 4. Partition Size Change Status; 5. Interleave Change Status; 6. Interleave Change Status |
| 626 | Error | The platform configuration check detected that the BIOS could not apply the configuration goal on PMem module [1] because of a firmware error. The detailed status is COUT table status: [2] [3], Partition change table status: [4], Interleave change table 1 status: [5], Interleave change table 2 status: [6]. | 1. PMem module Handle; 2. Validation Status; 3. Text error code corresponding to the status code; 4. Partition Size Change Status; 5. Interleave Change Status; 6. Interleave Change Status |
| 627 | Error | The platform configuration check detected that the BIOS could not apply the configuration goal on PMem module [1] for an unknown reason. The detailed status is COUT table status: [2] [3], Partition change table status: [4], Interleave change table 1 status: [5], Interleave change table 2 status: [6]. | 1. PMem module Handle; 2. Validation Status; 3. Text error code corresponding to the status code; 4. Partition Size Change Status; 5. Interleave Change Status; 6. Interleave Change Status |
| 628 | Error | The platform configuration check detected that interleave set [1] is broken because the PMem modules were moved [2]. | 1. Interleave set index ID; 2. List of moved PMem modules |
| 629 | Error | The platform configuration check detected that the platform does not support ADR and therefore data integrity is not guaranteed on the PMem modules. | |
| 630 | Error | An internal error caused the platform configuration check to abort. | |
| 631 | Error | The platform configuration check detected that interleave set [1] is broken because the PMem module with UID: [2] is missing from location (Socket-Die-iMC-Channel-Slot) [3]. | 1. Interleave set index ID; 2. PMem module UID; 3. Location ID |
| 632 | Error | The platform configuration check detected that interleave set [1] is broken because the PMem module with UID: [2] is misplaced. It is currently in location (Socket-Die-iMC-Channel-Slot) [3] and should be moved to (Socket-Die-iMC-Channel-Slot) [4]. | 1. Interleave set index ID; 2. PMem module UID; 3. Location ID; 4. Location ID |
| 633 | Error | The platform configuration check detected that the BIOS could not fully map memory on PMem module [1] because of an error in current configuration. The detailed status is CCUR table status: [2] [3]. | 1. PMem module Handle; 2. Current Configuration Status; 3. Text error code corresponding to the status code |
The security check diagnostic test group verifies that all Intel® Optane™ PMem modules have a consistent security state.
Table 3. Security Check Events
| Code | Severity | Message | Arguments |
| 800 | Info | The security check succeeded. | |
| 801 | Info | The security check detected that there are no manageable PMem modules. | |
| 802 | Warning | The security check detected that security settings are inconsistent [1]. | 1. A comma separated list of the number of PMem modules in each security state |
| 804 | Info | The security check detected that security is not supported on all PMem modules. | |
| 805 | Error | An internal error caused the security check to abort. | |
This test group verifies that all PMem modules of a given subsystem device ID have consistent FW installed and other FW modifiable attributes are set in accordance with best practices.
Table 4. Firmware Consistency and Settings Check Events
| Code | Severity | Message | Arguments |
| 900 | Info | The firmware consistency and settings check succeeded. | |
| 901 | Info | The firmware consistency and settings check detected that there are no manageable PMem modules. | |
| 902 | Warning | The firmware consistency and settings check detected that firmware version on PMem modules [1] with subsystem device ID [2] is non-optimal, preferred version is [3]. | 1. Comma separated list of PMem module UIDs; 2. Subsystem device ID; 3. Preferred firmware version |
| 903 | Warning | The firmware consistency and settings check detected that PMem module [1] is reporting a non-critical media temperature threshold of [2] C which is above the fatal threshold [3] C. UID: [4] | 1. PMem module Handle; 2. Current media temperature threshold; 3. Fatal media temperature threshold; 4. PMem module UID |
| 904 | Warning | The firmware consistency and settings check detected that PMem module [1] is reporting a non-critical controller temperature threshold of [2] C which is above the fatal threshold [3] C. UID: [4] | 1. PMem module Handle; 2. Current controller temperature threshold; 3. Fatal controller temperature threshold; 4. PMem module UID |
| 905 | Warning | The firmware consistency and settings check detected that PMem module [1] is reporting a percentage remaining of [2]% which is below the recommended threshold [3]%. UID: [4] | 1. PMem module Handle; 2. Current percentage remaining threshold; 3. Recommended percentage remaining threshold; 4. PMem module UID |
| 906 | Warning | The firmware consistency and settings check detected that PMem modules have inconsistent viral policy settings. | |
| 910 | Error | An internal error caused the firmware consistency and settings check to abort. | |
| 911 | Warning | The firmware consistency and settings check detected that PMem modules have inconsistent first fast refresh settings. | |
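In Tables 1 through 4, the event codes fall into distinct hundreds blocks per test group (5xx, 6xx, 8xx, 9xx). The sketch below uses that observation to map a code back to its test name. The ranges are inferred from the codes listed on this page, not from a documented guarantee, and `group_for_code` is a hypothetical helper.

```python
from typing import Optional

# Assumption: ranges inferred from the codes shown in Tables 1-4.
CODE_RANGES = {
    "Quick": range(500, 600),
    "Config": range(600, 700),
    "Security": range(800, 900),
    "FW": range(900, 1000),
}


def group_for_code(code: int) -> Optional[str]:
    """Return the diagnostic test name for an event code, or None."""
    for name, codes in CODE_RANGES.items():
        if code in codes:
            return name
    return None


print(group_for_code(507))  # Quick
print(group_for_code(805))  # Security
print(group_for_code(700))  # None
```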
| 2024-04-22 | ipmctl |