Dell EMC NVDIMM-N Persistent Memory User Guide February 2021 Rev.
Notes, cautions, and warnings

NOTE: A NOTE indicates important information that helps you make better use of your product.

CAUTION: A CAUTION indicates either potential damage to hardware or loss of data and tells you how to avoid the problem.

WARNING: A WARNING indicates a potential for property damage, personal injury, or death.

© 2017 - 2021 Dell Inc. or its subsidiaries. All rights reserved. Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries.
Contents

Chapter 1: Introduction
Chapter 2: Change list
Chapter 3: NVDIMM-N Overview
    Normal Operation
    Block Mode
    DAX Mode
    Storage Spaces Support
1 Introduction

Dell EMC's NVDIMM-N Persistent Memory is a disruptive Storage Class Memory technology that enables unprecedented performance improvement over legacy storage technologies. Each NVDIMM-N provides 16 GB of nonvolatile memory and has the same form factor as a standard 288-pin DDR4 DIMM. The NVDIMM-N resides in a standard CPU memory slot, placing data close to the processor.
2 Change list

Table 2. Change list

Version  Changes
A00      Original version.
A01      Added ESXi 6.7 support information. Removed Linux errata that is no longer applicable. Edits to the remainder of the document for clarity.
A02      Added Modular Server specific information, support for R840 and R940xa, and changes to the BBU LED behavior; edits to the remainder of the document for clarity. NVDIMM-N supported on RHEL 7.5.
A03      Added minimum supported platform firmware versions. Support for Windows Server 2019, RHEL 7.6, and ESXi 6.7 U1.
3 NVDIMM-N Overview

The figure below is an overview of the NVDIMM-N showing its main components and system interfaces. Core to the NVDIMM-N are the DDR4 DRAM devices that allow the NVDIMM-N to operate as an RDIMM. The components that allow the NVDIMM-N to persist data are the Controller, Flash, and Power Voltage Regulators that are also integrated on the DIMM.

Figure 1.
Figure 2. NVDIMM-N Normal Operation

Backup to Flash

In the event of a server shutdown, cold reboot, or power loss, a Save signal is sent to the NVDIMM-N Controller, which then backs up all of its DRAM contents to its onboard flash storage. The Save event is triggered anytime the server is about to power down and power loss to the NVDIMM-Ns is imminent. The backup process takes approximately one minute to complete.
Restore from Flash

On server power-up, BIOS re-initializes the DRAM on the NVDIMM-N. BIOS commands the NVDIMM-N Controller, using the SMBus Management Interface, to restore its DRAM contents from Flash. The restore process takes approximately one minute to complete. This duration is independent of the number of NVDIMM-Ns installed in the server because restores occur in parallel across all NVDIMM-Ns. BIOS then exposes the NVDIMM-N to the server OS as Persistent Memory.
4 Hardware

Topics:
● Server Hardware Configuration
● Modular Chassis Hardware Configuration
● NVDIMM-N Module Details
● Battery
● Minimum Platform Firmware Versions

Server Hardware Configuration

NVDIMM-Ns are currently supported in the T640, R640, R740/R740XD, R840, R940, R940xa, MX740c, and MX840c PowerEdge servers. Each server supports from 1 to a maximum of 12 16 GB NVDIMM-Ns, for a total maximum persistent memory capacity of 192 GB.
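As a quick sanity check, the capacity rule above (1 to 12 modules of 16 GB each, 192 GB maximum) can be sketched as a small shell helper; `nvdimm_capacity_gb` is a hypothetical name for illustration, not a Dell utility.

```shell
# Hypothetical helper: total persistent memory capacity in GB for a
# given number of NVDIMM-N modules (16 GB each, 1-12 supported).
nvdimm_capacity_gb() {
  local count=$1
  if [ "$count" -lt 1 ] || [ "$count" -gt 12 ]; then
    echo "unsupported module count: $count" >&2
    return 1
  fi
  echo $(( count * 16 ))
}

nvdimm_capacity_gb 12   # maximum configuration: prints 192
```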
Table 3.
Figure 6. MX740c Memory Layout

Table 4.
Table 4.
2. While other configurations may work, they have not been fully validated and are not currently supported by Dell EMC.

Modular Chassis Hardware Configuration

The MX7000 modular chassis currently offers two different servers that support NVDIMM-N: the MX740c (2-socket) and the MX840c (4-socket). For a power loss condition to be detected, the chassis must have at least one Management Module installed.
Battery

A battery is required to provide backup power to copy contents from DRAM to flash. Although JEDEC-based NVDIMM-Ns can utilize Super Caps as backup power, Dell EMC's battery is a centralized power solution that provides a more compact, reliable, and integrated power source. Power delivery is integrated into the system board and does not require individual cables to each NVDIMM-N, as is typical of Super Cap based solutions.
Figure 8. R740/R740XD System Board Connections

NOTE: Connector locations differ for each server. Refer to your particular server's Installation and Service Manual for more information.

Figure 9. R740 Battery Installation Instructions

NOTE: Battery installation locations differ for each server. Refer to your particular server's Installation and Service Manual for instructions.
Minimum Platform Firmware Versions

For NVDIMM-N modules to be functional on PowerEdge servers, the following minimum platform firmware versions are required:
● BIOS: 1.1.7
● iDRAC: 3.00.00.00

NOTE: Certain operating systems require specific minimum versions of BIOS, NVDIMM-N, and/or iDRAC firmware. Refer to the individual operating system sections for more details.
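Whether an installed firmware version meets these minimums can be checked with a dotted-version comparison. This is a generic sketch using `sort -V` (GNU coreutils), not a Dell tool, and `meets_minimum` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the installed dotted version is
# greater than or equal to the required minimum. sort -V orders
# version strings component by component, so the minimum sorting
# first means the installed version is at least that new.
meets_minimum() {
  local installed=$1 minimum=$2
  [ "$(printf '%s\n' "$minimum" "$installed" | sort -V | head -n1)" = "$minimum" ]
}

meets_minimum 1.6.13 1.1.7 && echo "BIOS OK"            # 1.6.13 >= 1.1.7
meets_minimum 2.70.70.70 3.00.00.00 || echo "iDRAC too old"
```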
5 BIOS

Topics:
● BIOS Configuration Settings for NVDIMM-N
● BIOS Error Messages

BIOS Configuration Settings for NVDIMM-N

This section focuses only on the BIOS setup options that affect NVDIMM-N operation. For a description of all setup options, refer to each server's Installation and Service Manual. Persistent Memory BIOS settings are configurable in BIOS System Setup. Press F2 at the BIOS screen below to enter BIOS System Setup.

Figure 10.
Figure 11. Memory Settings

Node Interleaving

Specifies whether Non-Uniform Memory Architecture (NUMA) is supported. If this field is set to Enabled, memory interleaving is supported if a symmetric memory configuration is installed. If the field is set to Disabled, the system supports NUMA (asymmetric) memory configurations. This option is set to Disabled by default. Node interleaving is not supported when NVDIMM-N is present in the system.
Figure 12. Persistent Memory screen

The following table describes each option that is available in the BIOS setup screen.

Table 7. BIOS setup screen

Option             Description
Node Interleaving  Specifies whether Non-Uniform Memory Architecture (NUMA) is supported. If this field is set to Enabled, memory interleaving is supported if a symmetric memory configuration is installed. If the field is set to Disabled, the system supports NUMA (asymmetric) memory configurations. This option is set to Disabled by default.
Table 7. BIOS setup screen (continued)

Option               Description
[...] Dimms          This option is set to Disabled by default.
NVDIMM-N Interleave  Enables or disables interleaving on NVDIMM-N. When Enabled, NVDIMM-N interleaving follows the same interleaving policy that applies to RDIMMs. Volatile RDIMM interleaving policy is not affected by this option. RDIMM system memory and NVDIMM-N persistent memory remain as two distinct memory regions. This option is set to Disabled by default.
Figure 13. System BIOS Settings screen

BIOS Error Messages

When BIOS detects an NVDIMM-N related error during POST, it displays an F1/F2 prompt and a corresponding error message. Multiple messages appear when multiple errors are detected. BIOS also logs an event for each error in the server System Event Log (SEL) and Lifecycle Log (LCL). Refer to the JEDEC JESD245B specification for more information on each NVDIMM-N related failure.
This NVDIMM-N module will be set to read-only mode. Remove input power to the system, reseat the NVDIMM-N module, and restart the server. If the issue persists, replace the faulty memory module identified in the message.

UEFI0302 Set Energy Source Policy Error on NVDIMM-N located at [Location]. This NVDIMM-N module will be set to read-only mode. Remove input power to the system, reseat the NVDIMM-N module, and restart the server.
6 iDRAC NVDIMM-N Management

Topics:
● iDRAC Graphical User Interface
● Remote Management
● NVDIMM-N Error Reporting

iDRAC Graphical User Interface

The image below shows the iDRAC Web GUI Dashboard when remotely managing the server.

Figure 14. iDRAC Graphical User Interface

NVDIMM-N Status

Select the Memory link on the Dashboard to get more information about memory health.
Figure 15. NVDIMM-N Status

NOTE:
1. All NVDIMM-N errors are reported to the OS and logged in the server System Event Log. NVDIMM-N Health Status currently reflects only Correctable Error Threshold Exceeded and Uncorrectable Error status on the NVDIMM-N. Other errors are reported to the OS and logged, but are not reflected in the iDRAC/OM NVDIMM-N Health Status.
2. NVDIMM-N DIMMs are currently reported as DDR4 16GB Single-Rank 2666 DIMMs in the Memory Details page.
Log Messaging Errata The following errata affects the messaging in the System Event Log: ● When UEFI0340 is logged in the Lifecycle controller log, the System Event Log and Lifecycle controller logs can have entries with the following message “An unsupported event occurred.” This message can be ignored and shall be fixed in a future iDRAC release.
Table 10. NVDIMM-N Error Reporting (continued) ID Event Message Recommended Action MEM9033 An unsupported Non-Volatile Dual In-line Memory Module (NVDIMM) device is of unsupported configuration and unable to operate as currently configured. Review the memory configuration and ensure the configuration is as per memory rules that are defined in the system Owner's Manual on the support site. MEM9034 The Non-Volatile Dual In-line Memory Module (NVDIMM) device in the slot [location] is not responding.
Table 10. NVDIMM-N Error Reporting (continued) ID Event Message Recommended Action BAT0017 The NVDIMM battery has failed. Remove and reinstall the NVDIMM-N Battery. If the issue persists, contact your service provider. For information about removing and reinstalling the NVDIMM, see the system Owner's Manual on the support site. BAT0019 The NVDIMM battery is absent. Remove and reinstall the NVDIMM-N Battery. If the issue persists, contact your service provider.
7 Server Behavior with NVDIMM-Ns

Server behavior changes slightly when NVDIMM-Ns are installed. This section covers differences that can be observed as the server shuts down and boots up. It also describes scenarios where the server automatically shuts down to ensure that NVDIMM-N DRAM data is securely stored to flash.
Boot

Server BIOS restores NVDIMM-N DRAM data from onboard Flash during boot. BIOS verifies that the NVDIMM-N Battery is installed and has sufficient charge for a Save event in case of a power loss. BIOS also verifies that the installed server power supplies are sufficiently sized for the server configuration. This ensures that after a power loss, the PSUs can provide enough power to hold up the server until Battery power takes over.
8 DIMM Configuration Changes

Dell EMC recommends that NVDIMM-N data contents be backed up to external storage before any changes are made to the server memory configuration. This applies to both NVDIMM-Ns and RDIMMs. Because memory Error Correction Code (ECC) algorithms are unique to each memory slot and memory configuration, NVDIMM-Ns may generate errors after a memory configuration change.
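One way to follow this recommendation is to copy the mounted NVDIMM-N filesystem to external storage before reconfiguring. The `backup_pmem` helper name and the paths below are illustrative assumptions, not Dell-provided tooling.

```shell
# Hypothetical sketch: copy everything from a mounted NVDIMM-N
# filesystem (e.g. /mnt/nvdimm0) to a directory on external storage
# before a memory configuration change.
backup_pmem() {
  local src=$1 dest=$2
  # Refuse to run if the source mount point does not exist.
  [ -d "$src" ] || { echo "source $src is not mounted"; return 1; }
  mkdir -p "$dest" && cp -a "$src/." "$dest/"
}

# Example invocation (paths are assumptions):
# backup_pmem /mnt/nvdimm0 /backup/nvdimm0
```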
9 Windows

Topics:
● BIOS Requirements
● Set Up Windows Drivers
● Storage Class Memory in Windows Server 2016
● Storage Class Memory in Windows Server 2019
● Windows Errata

BIOS Requirements

Both Windows Server 2016 and 2019 require a minimum BIOS version of 1.6.13 so that NVDIMM-N modules can be used without any issues.
Storage Class Memory in Windows Server 2016

Device manager

The picture below shows the Windows Device Manager view of the NVDIMM-N root device and NVDIMM-N disk instances in Windows Server 2016.

Figure 18. Windows device manager view of NVDIMM-N root device and NVDIMM-N disk instances

Identifying the right NVDIMM-N disks

Windows PowerShell and the NVDIMM-N disk properties GUI in Device Manager provide information that can be used to uniquely identify the physical NVDIMM-N module.
The Serial Number for every NVDIMM-N is unique, and physical location values in PowerShell can be mapped to the silk-screen label using the following table.

Table 11.
Figure 20. Using device manager GUI

Location information in the figure above can be translated to the physical silk-screen label using the following table.

Table 12.
Table 12. DIMM Slot Location (continued)

Location  DIMM Slot Location
337       B12

NVDIMM-N health status and properties

NVDIMM-N health status can be queried using the following PowerShell command.

Figure 21. NVDIMM-N health status and properties

The Windows native driver can handle different health events. For more details on the various health conditions, see the Windows documentation ( https://docs.microsoft.
Storage Spaces Support

Windows Server 2016 supports NVDIMM-N devices, which allow for extremely fast input/output (I/O) operations. One attractive way of using such devices is as a write-back cache to achieve low write latencies. A Microsoft blog discusses how to set up a mirrored storage space with a mirrored NVDIMM-N write-back cache as a virtual drive. To set up a Storage Spaces configuration on NVDIMM-N, see Configuring Storage Spaces with a NVDIMM-N write-back cache.
Figure 25. Device Manager

All NVDIMM-N devices are controlled by the nvdimm.sys driver, while the logical disks are controlled by the pmem.sys driver. Both types of device objects are created by scmbus.sys, the bus driver for persistent memory. This bus driver object can be found in Device Manager under "System Devices".

New features in Windows Server 2019

Label support and Namespace management

With Windows Server 2019, the OS provides support for Label and Namespace management.
Figure 27. List PMEM Unused regions, PMEM Physical Devices and PMEM Disks

Figure 28.
PowerShell Cmdlets

Get-PmemDisk
● Returns one or more logical persistent memory disks.
● The returned object has information about size, atomicity type, health status, and underlying physical devices.

Get-PmemPhysicalDevice
● Returns one or more physical persistent memory devices (NVDIMMs).
● The returned object has information about size(s), RFIC, device location, and health/operational status.

New-PmemDisk
● Creates a new disk out of a given unused region.
Figure 30. Visibility in PowerShell Configuring NVDIMM-N for Hyper-V Virtual Machines The article referenced here, Cmdlets for configuring persistent memory devices for Hyper-V VMs, provides details about configuring Hyper-V VMs with JEDEC compliant NVDIMM-N. NVDIMM-N RO Behavior Windows Server 2019 By design, Windows Server 2019 manages NVDIMM-N in a manner that differs from that of Windows Server 2016.
Workaround: None.
10 Linux

NVDIMM-N hardware is supported on RHEL versions 7.3, 7.4, 7.5, and 7.6.

Topics:
● Identify and Configure PMEM —Persistent Memory Device
● Installation
● Verify Existing Filesystem
● Read-Only Mode
● NVDIMM-N Interleave
● Management Utility
● RHEL 7.6 features
● Linux Errata

Identify and Configure PMEM —Persistent Memory Device

When the OS is up and running, verify that the NVDIMM-Ns are populated correctly. Switch to the root user:

$ su

Identify whether the NVDIMM-Ns appear as /dev/pmem0, /dev/pmem1, .
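A quick way to confirm what the kernel exposes is to count the pmem block devices; `count_pmem` is a hypothetical helper name used for illustration.

```shell
# Count the pmem block devices the kernel currently exposes;
# prints 0 on a machine without NVDIMM-Ns.
count_pmem() {
  ls /dev/pmem[0-9]* 2>/dev/null | wc -l
}

echo "pmem devices: $(count_pmem)"
```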
Installation

Write the RHEL ISO onto the USB stick using the dd command:

# dd if=/home/dell/RHEL7.3.iso of=/dev/sdb bs=4M conv=noerror,sync

BIOS boots the Linux kernel from the USB stick. Follow the on-screen steps to finish installing RHEL. After the installation is completed, reboot the server. For detailed installation instructions, refer to https://access.redhat.com/documentation/en-US/ Red_Hat_Enterprise_Linux/7/html/Installation_Guide/index.
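Because dd silently overwrites whatever `of=` points at, it is worth confirming that the target really is the removable USB stick first. This guard is a generic sketch (`is_removable` is a hypothetical name), and /dev/sdb is only an example device.

```shell
# Hypothetical guard: succeed only if the named device exists and the
# kernel flags it as removable (value 1 in /sys/block/<dev>/removable).
is_removable() {
  local dev=$1
  [ -r "/sys/block/$dev/removable" ] &&
    [ "$(cat "/sys/block/$dev/removable")" = "1" ]
}

if is_removable sdb; then
  echo "/dev/sdb looks removable; proceeding with dd"
else
  echo "refusing: /dev/sdb absent or not removable"
fi
```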
When the OS is up and running:

$ su

CPU0's 6 NVDIMM-Ns show up as /dev/pmem0, and CPU1's 6 NVDIMM-Ns appear as /dev/pmem1.

# ls /dev/pmem*

View the size of /dev/pmem0 and /dev/pmem1; each should be around 6 * 16 GB = 96 GB because each NVDIMM-N is 16 GB.

# lsblk

Create an xfs file system on /dev/pmem0 and /dev/pmem1:

# mkfs.
Mount /dev/pmem0 and /dev/pmem1:

# mount -t xfs -o dax /dev/pmem0 /mnt/nvdimm0

To verify that /dev/pmem0 and /dev/pmem1 are writable:

# touch /mnt/nvdimm0/write.txt
# shutdown

Management Utility

Management Utility 'ndctl' && mdadm

1. Press the PowerOn button on the server.
2. Follow the guidance in Section 4 to set up BIOS.
3. Enable Persistent Memory, disable "NVDIMM Interleave", and disable "NVDIMM Read-Only".
4. Install RHEL, or start the OS if it is already installed.
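The format-and-mount steps above can be folded into one helper per device. `setup_pmem` is a hypothetical name, and the sketch assumes root privileges and an existing /dev/pmemN block device.

```shell
# Hypothetical sketch: create an xfs filesystem on one pmem device and
# mount it with the dax option, mirroring the manual steps above.
setup_pmem() {
  local dev=$1 idx=${1#/dev/pmem}
  # Refuse to touch anything that is not an existing block device.
  [ -b "$dev" ] || { echo "skipping $dev: not a block device"; return 1; }
  mkfs.xfs -f "$dev" &&
    mkdir -p "/mnt/nvdimm$idx" &&
    mount -t xfs -o dax "$dev" "/mnt/nvdimm$idx"
}

# Example (run as root): setup_pmem /dev/pmem0
```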
mdadm

Create a software RAID on the NVDIMM-Ns. Say there are 6 devices, /dev/pmem0 .. /dev/pmem5. Create the directories /mnt/md0 /mnt/md1 /mnt/md2 /mnt/md5 /mnt/md6.

$ mkdir -p /mnt/md0

Create RAID 0:

$ mdadm --create --verbose /dev/md0 --level=0 --raid-devices=6 /dev/pmem0 /dev/pmem1 /dev/pmem2 /dev/pmem3 /dev/pmem4 /dev/pmem5
$ cat /proc/mdstat
$ mkfs.
Run the command below to create namespaces. Used as-is, this command creates /dev/pmem devices by default. To create namespaces in other modes, refer to https://www.mankier.com/1/ndctl-create-namespace. This command should be run as many times as the number of NVDIMM-N modules plugged into the system.

● $ ndctl create-namespace

For more information on how to use the ndctl utility, refer to the user's guide at https://docs.pmem.io/ndctl-users-guide

Linux Errata

The following errata affect RHEL 7.
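The repeated invocation can be scripted as a loop that stops once ndctl reports no more capacity. `create_all_namespaces` is a hypothetical wrapper name, and the sketch assumes ndctl is installed.

```shell
# Hypothetical wrapper: call ndctl create-namespace once per available
# region (each success consumes one), stopping when none remain.
create_all_namespaces() {
  command -v ndctl >/dev/null 2>&1 || { echo "ndctl not installed"; return 1; }
  while ndctl create-namespace >/dev/null 2>&1; do :; done
  echo "namespace creation finished"
}
```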
11 ESXi

Topics:
● Set up
● Storage
● Supported Guest OSes with NVDIMM support
● Overall Health Status
● Operational and Diagnostics Logging information
● NVDIMM-N Errors
● ESXi Errata

Set up

Beginning with ESXi version 6.7, NVDIMM-N hardware is supported. NVDIMM-N devices are detected on startup with auto-generated namespaces. All NVDIMM-N devices have their capacity pooled into a single logical memory array for access by ESXi virtual machines. NVDIMM-N hardware is now supported on ESXi 6.7U1.
troubleshooting. The next column indicates free space; it is expected to be "0 B" for all NVDIMM-N devices that are fully mapped and operating correctly. Health should be Normal. A detailed explanation of the Health section is provided in "Overall Health Status" below. The translation of ID to physical NVDIMM-N in the host system is shown below.

Table 14.
Figure 32. Interleave sets while Interleaving is Disabled If Interleaving is Enabled in BIOS F2 setup, then the total NVDIMM-N capacity will be split into pools based on CPU socket. A total of two interleave sets will display with the aggregate capacity for the CPU socket displaying as one Interleave Set. Figure 33.
Figure 34. Datastores

Supported Guest OSes with NVDIMM support

● Windows Server 2016 Build 14393 and above
● Windows 10 Anniversary Update Version 1607 and above
● RedHat Enterprise Linux 7.4 and above
● SUSE Linux Enterprise 12 SP2 and above
● Photon OS 1.0 Revision 2 and above
● CentOS 7.4 and above
● Ubuntu 17.04 and above

Overall Health Status

The health status of the NVDIMM-N modules is represented in a tabular column of the Modules and Namespace section of the ESXi interface.
Outdated firmware

ESXi requires NVDIMM-N modules to have a minimum 9324 firmware image; modules with older firmware will not behave correctly. In the event of outdated firmware on the memory (lower than 9324), the system boots into the ESXi hypervisor and the DIMMs are visible in the UI, but no namespaces are populated and the DIMMs cannot be mounted to a VM guest OS.
Figure 37. NVDIMM-N Errors Refer to table below for the Overall Health Status message shown on ESXi Web Client in the event of the following errors: Table 16.
12 General Errata

NVDIMM-N does not support PPR on 14G products, and the correctable error logging code does not differentiate between RDIMMs and NVDIMM-Ns. As a result, the error message "MEM0802 - The memory health monitor feature has detected a degradation in the DIMM installed in DIMM. Reboot system to initiate self-heal process" appears, and on the next boot, MRC PPR skips the NVDIMM-N.

Workaround: None.