Dell Fluid Cache for SAN Version 2.1.
Notes, cautions, and warnings NOTE: A NOTE indicates important information that helps you make better use of your computer. CAUTION: A CAUTION indicates either potential damage to hardware or loss of data and tells you how to avoid the problem. WARNING: A WARNING indicates a potential for property damage, personal injury, or death. © 2016 Dell Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws.
Contents
1 Preface
Audience
Related Documentation
Accessing Enterprise Manager and Storage Center Documentation
Mapping Volumes in Fluid Cache
Removing Volume Mappings
Removing Volumes
Deleting a Volume From a Fluid Cache Cluster
Adding Servers to a Fluid Cache Cluster
Fluid Cache Node Is Unavailable in Enterprise Manager
Cached LUNs Are Unavailable
Cannot Create a Fluid Cache Cluster
Fluid Cache License Is Expired
Fluid Cache License Is Invalid
1 Preface
Dell Fluid Cache for SAN is server-side caching accelerator software. Fluid Cache makes high-speed PCI Express (PCIe) SSDs a shared, distributed cache resource. Fluid Cache is deployed on clusters of Dell PowerEdge systems connected using RoCE-enabled Ethernet adapters and operates within a SAN environment employing a Dell Compellent backing store. This guide describes how to install, configure, and manage Fluid Cache for SAN 2.1.0 in Linux-based environments.
Component | Document | Content
Fluid Cache | Release Notes | Describes new features, known issues, and upgrade steps for Enterprise Manager.
Fluid Cache | Compatibility Matrix | Lists the compatibility of the different components included in the Dell Fluid Cache for SAN infrastructure.
PowerEdge | Owner’s Manual | Describes how to install, remove, configure, and troubleshoot server components.
PowerEdge | Rack Placement | Describes how to rack the server.
Accessing Enterprise Manager and Storage Center Documentation
Documentation for Dell Compellent products is not available at dell.com/support/manuals. To download Enterprise Manager and Storage Center documentation:
1 Go to portal.compellent.com.
2 Enter your user name and password and click Login. If you are not registered, send an email to customer.portal@compellent.com.
3 In the portal page, click Knowledge Center.
2 Dell Fluid Cache for SAN Product Overview
Dell Fluid Cache for SAN is server-side caching accelerator software. Fluid Cache makes high-speed PCI Express (PCIe) SSDs a shared, distributed cache resource. Fluid Cache is deployed on clusters of Dell PowerEdge systems connected using RoCE-enabled Ethernet adapters and operates within a SAN environment employing a Dell Compellent backing store.
Storage Area Network
The network that Dell Compellent Storage Center uses to handle data connectivity within the SAN.
Figure 1. Fluid Cache Connectivity
1 Fluid Cache nodes communicate with each other over a private network using RDMA.
2 Enterprise Manager creates, manages, and monitors the Fluid Cache clusters.
3 Fluid Cache nodes communicate with the Management IP (VIP) of the Storage Controllers.
4 Enterprise Manager manages the Dell Compellent array.
Example Cabling Diagram Figure 2.
• Fluid Cache is installed on four servers, which is within the required range of three to nine servers per cache cluster.
• Each server has a network adapter connected to a port on each of the two cache network switches. The switches are uplinked to each other.
• The servers access the Storage Center using SAN connectivity.
• Enterprise Manager is used to configure and monitor Fluid Cache.
Fluid Cache for SAN in Enterprise Manager GUI Use the Enterprise Manager Graphical User Interface (GUI) to centrally manage your Dell Fluid Cache for SAN clusters and monitor the status and performance of the Dell Fluid Cache for SAN cluster.
3 Preparing the Fluid Cache Components Before installing Fluid Cache, you must prepare the components of the Fluid Cache network: the servers, cache devices, network cards, and switches. The instructions that follow assume that you have an existing SAN configured and managed. All nodes in the cache cluster must be connected to the SAN and visible on the Dell Compellent array. Instructions for racking and cabling a SAN solution are beyond the scope of this document.
Checking Network Connections For Fluid Cache to function correctly, each Fluid Cache server must be able to communicate with other network components. Make sure that the following ports are available: Table 2.
Checking Application Settings All applications that use volumes mapped to Fluid Cache must be configured to start after Fluid Cache and exit before Fluid Cache. Preparing the Servers Make sure that each server in the Fluid Cache cluster has the latest supported BIOS version, Lifecycle Controller firmware, and iDRAC firmware. • For updating Dell Lifecycle Controller and BIOS firmware on 13th generation of PowerEdge servers, see the Dell Lifecycle Controller Graphical User Interface Version 2.05.05.
Non-NVMe Cache Devices
• Firmware: Make sure the firmware is up to date by running the following command:
dmesg | grep mtip32xx | grep Firmware
For each cache device, the output must be similar to the following:
mtip32xx 0000:46:00.0: Firmware Ver.: B1490508
NOTE: The firmware version must be B1490508 or later.
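The firmware check above can be scripted. The following is a minimal Python sketch, assuming the dmesg line format shown above. It also assumes that firmware versions share one fixed-width format, so that plain string comparison orders them; this guide does not define the ordering. The function names are illustrative and are not part of Fluid Cache.

```python
import re

MIN_FIRMWARE = "B1490508"  # minimum version stated in this guide

# Matches lines such as: "mtip32xx 0000:46:00.0: Firmware Ver.: B1490508"
FIRMWARE_LINE = re.compile(r"mtip32xx\s+(\S+):\s+Firmware Ver\.:\s+(\S+)")


def parse_firmware(dmesg_text):
    """Return {pci_address: firmware_version} for each mtip32xx device found."""
    versions = {}
    for line in dmesg_text.splitlines():
        match = FIRMWARE_LINE.search(line)
        if match:
            versions[match.group(1)] = match.group(2)
    return versions


def outdated_devices(versions, minimum=MIN_FIRMWARE):
    """List devices whose version sorts below the minimum.

    Assumption: versions are fixed-width strings, so string comparison
    orders them correctly. Verify this before relying on the result.
    """
    return sorted(addr for addr, ver in versions.items() if ver < minimum)


if __name__ == "__main__":
    sample = "mtip32xx 0000:46:00.0: Firmware Ver.: B1490508"
    print(outdated_devices(parse_firmware(sample)))
```

To use it on a server, pass the text printed by the dmesg command above to parse_firmware.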
5 If a blade enclosure is used, disable FlexAddress in the blade enclosure. 6 Repeat this process for each network adapter in the cache network. NOTE: Make sure that all RoCE network adapters used by the Fluid Cache network are dedicated to the cache network and are not configured for any other network traffic. Bonding Network Adapter Ports Fluid Cache supports port bonding in active/passive mode (also called active/backup or master/slave).
6 In the configuration file, edit the parameters as follows:
DEVICE=
BOOTPROTO="none"
ONBOOT="yes"
NM_CONTROLLED="no"
SLAVE="yes"
MASTER="bond0"
7 For the other interface in the bonded port, repeat steps 5–6.
8 Start the network connection to the bonded port by running the following command:
ifup bond0
9 Check the status of your bonded port and its interfaces by running the following command:
ifconfig
The output must contain entries similar to the following.
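Separately from the command output, the interface parameters listed in step 6 can be generated by a script when several interfaces need the same settings. The following Python sketch only builds the file text and does not write to the system. The function name and the interface name eth2 are hypothetical, because the guide leaves the DEVICE value for you to supply.

```python
def render_slave_ifcfg(device, master="bond0"):
    """Build the text of an ifcfg file for one interface of a bonded port.

    `device` is the interface name, which the guide leaves for you to fill in.
    """
    params = [
        ("DEVICE", device),
        ("BOOTPROTO", "none"),
        ("ONBOOT", "yes"),
        ("NM_CONTROLLED", "no"),
        ("SLAVE", "yes"),
        ("MASTER", master),
    ]
    return "".join(f'{key}="{value}"\n' for key, value in params)


if __name__ == "__main__":
    # "eth2" is a hypothetical interface name used only for illustration.
    print(render_slave_ifcfg("eth2"), end="")
```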
• Flow control (transmit and receive) is enabled and Data Center Bridging (DCB) is disabled. NOTE: Enabling flow control is a requirement for Fluid Cache. 3 Save the running configuration. 4 To implement the changes, restart the switch. Configuring a Dell Networking Switch The following procedure is for one of the supported Dell Networking switches listed in the table in Requirements For Fluid Cache. For all other switches, see the manufacturer’s documentation.
4 Installing and Setting up Fluid Cache
Before completing the tasks in this section, install the required Linux dependencies.
Topics:
• Installing the Fluid Cache Software
• Setting up the Fluid Cache Servers
Installing the Fluid Cache Software
1 Copy the Fluid Cache tar.gz package that you downloaded earlier to the server.
2 Expand the tar.gz package. A new Fluid Cache directory is created, which contains an RPM file.
• Netmask: ______________________ An example of the required information is a device with an IP address of 172.18.1.2, whose network address is 172.18.1.0, and netmask is 255.255.255.0. 1 Change to the following directory:/opt/dell/fluidcache/bin/ 2 Start the Host Cache Node (HCN) Setup tool by running the following command: ./hcn_setup.py HCN Setup sets up a server for use as a Fluid Cache cluster node, and starts an agent on the server that allows it to be discovered by Enterprise Manager.
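The relationship between the example IP address, netmask, and network address given earlier in this section can be checked with Python's standard ipaddress module. This is an illustrative sketch and is not part of HCN Setup; the function name is made up for the example.

```python
import ipaddress


def network_address(ip, netmask):
    """Return the network address for a device IP address and netmask."""
    interface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return str(interface.network.network_address)


if __name__ == "__main__":
    # Values taken from the example in this section.
    print(network_address("172.18.1.2", "255.255.255.0"))  # prints 172.18.1.0
```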
5 Fluid Cache for SAN Cluster Creation and Management Operations After you configure and validate the Fluid Cache for SAN components, use Enterprise Manager to create and manage the Fluid Cache Environment as described in the following sections. • Creating a Fluid Cache Cluster. See Creating a Fluid Cache Cluster • Managing a Fluid Cache Cluster Environment. See Managing a Fluid Cache Cluster Environment • Maintaining a Fluid Cache Cluster Environment.
a In the Host or IP Address box, type the host name or IP address associated with the management network of any available Fluid Cache server.
b The Port box is autopopulated. Change it only if necessary.
c In the User Name box, type the user name, which is fldc. You can also use the root user name and password in these boxes, if available.
d In the User Password box, type the password. The default value is calvin.
Managing a Fluid Cache Cluster Environment
Configuring Fluid Cache Volumes
A Fluid Cache volume extends a normal Storage Center volume: its data is held across the cache devices in a Fluid Cache cluster and is also permanently stored in the Storage Center volume.
3 Right-click the volume and select Map Volume to Server. 4 In the Map Volume to Server window, select the server. 5 Click Next. 6 Select Enable Fluid Cache. 7 From the Host Cache Policy drop-down menu, select a cache mode: • Write-back (default): In addition to caching reads, write-back mode allows the caching of written data without waiting for the Compellent Array to acknowledge the write operation. Write-back caching requires a cache device on two or more servers in the cluster.
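To illustrate the write-back behavior described above, the following toy Python model acknowledges a write as soon as it is held in cache and copies it to the backing store only on a later flush. It is a conceptual sketch only. The class and method names are invented, a plain dictionary stands in for the Compellent array, and the model ignores the requirement for cache devices on two or more servers. It does not describe how Fluid Cache is implemented.

```python
class WriteBackCache:
    """Toy write-back cache in front of a dict that stands in for the array."""

    def __init__(self, backing_store):
        self.backing_store = backing_store
        self.cache = {}
        self.dirty = set()  # blocks written to cache but not yet flushed

    def write(self, block, data):
        # The write is acknowledged once cached, without waiting for the array.
        self.cache[block] = data
        self.dirty.add(block)
        return "acknowledged"

    def read(self, block):
        # Serve from cache when possible; otherwise fill from the backing store.
        if block not in self.cache:
            self.cache[block] = self.backing_store[block]
        return self.cache[block]

    def flush(self):
        # Copy dirty blocks to the backing store and mark them clean.
        for block in sorted(self.dirty):
            self.backing_store[block] = self.cache[block]
        self.dirty.clear()
```

Until flush is called, the newest data exists only in the cache, which is why losing the cache before a flush matters in write-back mode.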
Removing Volume Mappings from a Server 1 Make sure the volume is no longer in use. 2 In Enterprise Manager’s Storage view, expand Storage Centers if necessary and select the appropriate Storage Center. (Do not select Fluid Cache Clusters or its contents.) 3 In the Storage tab, expand Servers if necessary and locate the server whose Fluid Cache mappings you want to remove. 4 Right-click the server and select Remove Mappings. The Remove Mappings window is displayed.
Deleting a Volume From a Fluid Cache Cluster Use Enterprise Manager to completely delete a volume from a Fluid Cache cluster while maintaining just the cluster. 1 Click the Storage view. 2 In the Storage pane, expand Fluid Cache Clusters if necessary and select the cluster with the mapped volume to delete. 3 In the Cache tab, expand Volumes, select the volume to be deleted and click Delete. The Delete dialog box appears.
Removing a Server from a Fluid Cache Cluster
1 If the server belongs to a server cluster (a “subcluster”) within a Fluid Cache cluster, remove the server from the subcluster:
a Before removing the server from a Fluid Cache cluster, shut down the host or stop the Fluid Cache service.
b In Enterprise Manager’s Storage view, select the appropriate Storage Center. (Do not select Fluid Cache Clusters or its contents.)
c In the Storage tab, expand Servers if necessary and locate the server.
CAUTION: Any existing data on a cache device is lost when the device is added to the Fluid Cache cluster. Back up this data before proceeding.
5 Click OK. The devices now appear in the list in the Devices section.
Setting Storage Capacity for each Cache Server
The maximum SSD storage capacity supported is 3.2 TB per node. However, the size of the cache devices on a Fluid Cache node affects the memory utilization of that server node. The number of nodes in the cluster also affects the memory utilization.
Reactivate a Volume on a Fluid Cache Cluster Use Enterprise Manager to reconnect to a volume in a failed state from a Fluid Cache cluster while maintaining the cluster and the volume. 1 Click the Storage view. 2 In the Storage pane, expand Fluid Cache Clusters if necessary and select the cluster with the questionable volume. 3 In the Volumes pane of the Summary tab, double-click the volume to be reactivated and click Reactivate Volume. The Reactivate Volume dialog box appears. 4 Click OK.
Reconnect a Fluid Cache Cluster to a Storage Center
Use Enterprise Manager to reconnect a Fluid Cache cluster to a Storage Center.
1 Click the Storage view.
2 In the Storage pane, expand Fluid Cache Clusters if necessary and select the Fluid Cache cluster.
3 In the Cache tab, select Storage Centers and click Reconnect Host Cache Cluster to Storage Center.
Change the License for a Fluid Cache Cluster
Use Enterprise Manager to change the license for a Fluid Cache cluster.
1 Click the Storage view.
CAUTION: All cached volumes and their data become inaccessible when the Fluid Cache cluster is shut down, unless they're remapped to another cluster first. If you do not remap the volumes before shutting down, call Dell Technical Support Services for help remapping a shut down cluster. NOTE: Volumes mapped to a Fluid Cache cluster that's been shut down can't be remapped anywhere other than a Fluid Cache cluster.
Shutting Down and Restarting a Cluster Shut down a cluster if, for example, you need to perform system maintenance but do not need to make any configuration changes to the cache network itself. To shut down a cluster: 1 Exit any applications that access cached volumes. 2 In Enterprise Manager, click Storage. 3 In the Storage pane, expand Fluid Cache Clusters if necessary, and then select the Fluid Cache cluster. 4 Click Shutdown. The Shutdown window is displayed. 5 Click Yes.
6 Fluid Cache Web Page Overview
The Fluid Cache web page collects information about the Fluid Cache clusters and provides a detailed status and activity report on the hosts and cache devices within the clusters.
Topics:
• Accessing Fluid Cache Web Page
• Understanding Fluid Cache Web Page
Accessing Fluid Cache Web Page
In the address bar of your browser, enter the URL of one of the hosts that is part of the Fluid Cache cluster, using port 8082.
• Peering state: Displays the role of the CFM. The possible values are Primary or Secondary. You can have only one primary and two secondary CFMs up at any time.
• Hostname: Displays the hostname of this node.
• Address: Displays the management IP address of this node.
• Listen Interfaces: Displays the cache network IP address of this node.
Peer CFMs
The Peer CFMs section displays the following information about other Fluid Cache hosts that serve as CFMs.
• Fill Read Ops: Displays the cumulative number of cold read operations by the cache server on this host.
• Failed Fill Read Ops: Displays the number of cold read operations that failed.
• Slow Fill Read Ops: Displays the number of cold read operations that took longer than two seconds.
• Failed Flush Write Ops: Displays the number of write operations on the Compellent that failed.
• Slow Flush Write Ops: Displays the number of write operations on the Compellent that took longer than two seconds.
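The failed and slow counters described above can be pictured with a small Python sketch. It assumes that an operation counts as slow when it takes longer than two seconds, as stated above, and that slowness is counted independently of failure, which this guide does not specify. The class and field names are hypothetical and do not come from Fluid Cache.

```python
from dataclasses import dataclass

SLOW_THRESHOLD_SECONDS = 2.0  # "longer than two seconds", per the descriptions above


@dataclass
class OpCounters:
    total: int = 0
    failed: int = 0
    slow: int = 0

    def record(self, duration_seconds, succeeded):
        """Update the counters for one completed operation."""
        self.total += 1
        if not succeeded:
            self.failed += 1
        # Assumption: slowness is counted independently of success or failure.
        if duration_seconds > SLOW_THRESHOLD_SECONDS:
            self.slow += 1
```

A steadily rising slow count with no failures would point to latency on the path to the array rather than to errors.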
• Flush write dirty (GB): Displays the total amount of dirty data that has been written to the Compellent. This value may differ from Flush write because whole cache blocks can be written to the Compellent even if not all of the sectors within the cache block are dirty.
• Reads/s: Displays the current rate of reads per second from the Compellent.
• Writes/s: Displays the current rate of writes per second to the Compellent.
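The difference between the total flushed and the dirty data flushed can be worked through with a short Python sketch. The sector size and cache block size below are hypothetical values chosen for the arithmetic; this guide does not state them. The sketch also assumes that every flush writes the whole cache block.

```python
SECTOR_BYTES = 512             # hypothetical sector size
CACHE_BLOCK_BYTES = 64 * 1024  # hypothetical cache block size


def flush_totals(dirty_sectors_per_block):
    """Return (bytes_written, dirty_bytes_written) when whole blocks are flushed.

    `dirty_sectors_per_block` lists the number of dirty sectors in each
    flushed cache block.
    """
    bytes_written = len(dirty_sectors_per_block) * CACHE_BLOCK_BYTES
    dirty_bytes = sum(dirty_sectors_per_block) * SECTOR_BYTES
    return bytes_written, dirty_bytes
```

With these sizes, flushing one fully dirty block (128 sectors) and one block with a single dirty sector writes 131072 bytes in total, of which 66048 bytes are dirty.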
A Troubleshooting Fluid Cache Installations If you have issues running Fluid Cache after a successful completion of the installation procedure, contact your Compellent Copilot. Troubleshooting the Compellent array and SAN architecture is beyond the scope of this document. For additional troubleshooting information, refer to the Enterprise Manager Administrator’s Guide and the documentation for other hardware and software components. See Related Documentation.
• Unable to Add a Volume to a Fluid Cache Cluster • Event Messages Are Not Being Delivered • Storage Center is Not Available • Fluid Cache Server is Not Available • Information Displays Differently Between Storage Centers and Fluid Cache Clusters • Verify That All Parts of the Fluid Cache Cluster are Communicating with Each Other • Verify the Data Path is Working • The Cluster in the Fluid Cache Clusters Display is Marked Red • Problems Configuring Server Clusters Defined on a Storage Cente
Possible Cause
HCN Setup could not set up Fluid Cache on the server because the MPIO service was configured for multipathing, but devices required by Fluid Cache were not blacklisted in the /etc/multipath.conf file.
Solution
Add blacklist entries for the devices required by Fluid Cache. On each node in the cluster, modify the Devices section of the /etc/multipath.
Server Does Not Appear in List of Servers Possible Cause A configuration issue is preventing the server from appearing in the list. Solution From the server, run the command ip addr. The cache network interface’s state should display as UP. If not, recheck the server configuration. See Preparing the Fluid Cache Servers and Setting up the Fluid Cache Servers. Possible Cause Firewall or iptables settings are preventing network communication. Solution Check your firewall and iptables settings.
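The interface check in the first solution above can be scripted by parsing the text that ip addr prints. The following Python sketch assumes the usual iproute2 layout, in which each interface entry starts with a numbered line containing the word state followed by the state value. The function name and the sample interface names are illustrative.

```python
import re

# Matches the first line of each interface entry printed by `ip addr`, e.g.
# "2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ... state UP ..."
INTERFACE_LINE = re.compile(r"^\d+:\s+([^:@\s]+)[^:]*:\s+<[^>]*>.*\bstate\s+(\S+)")


def interface_states(ip_addr_output):
    """Return {interface_name: state} parsed from `ip addr` output."""
    states = {}
    for line in ip_addr_output.splitlines():
        match = INTERFACE_LINE.match(line)
        if match:
            states[match.group(1)] = match.group(2)
    return states


if __name__ == "__main__":
    # Hypothetical sample; on a server, pass the real output of `ip addr`.
    sample = (
        "2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq "
        "state UP group default qlen 1000\n"
        "    inet 172.18.1.2/24 brd 172.18.1.255 scope global eth0\n"
    )
    print(interface_states(sample))
```

Look up the cache network interface in the returned dictionary and confirm that its state is UP.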
Solution To check device function, select the device in Enterprise Manager and in the Event tab, look for a device failure message. Replace the cache device if necessary, using instructions in the Dell Compellent Enterprise Manager User’s Guide. Cache Device Cannot Be Added to a Cluster Possible Cause The cache device is not functioning properly. Under some conditions, the process of adding a device completes normally even though the device being added is not functioning properly.
Fluid Cache Node Is Unavailable in Enterprise Manager Possible Cause The node failed during restart. The node was unable to perform a graceful shutdown because the chkconfig command was used to shut down the Fluid Cache agent service. Solution Do not use the chkconfig command to disable the Fluid Cache agent service. Perform normal node recovery procedures to return the node to normal operation.
Fluid Cache License Is Expired Possible Cause System settings such as changes to the system date cause the current Fluid Cache license to expire. You can still access data on cached volumes, but performance is degraded because the Fluid Cache cluster has been placed in maintenance mode and caching is no longer active. Solution Check the status of the license file by selecting the Fluid Cache cluster in Enterprise Manager and referring to the status shown on the Events or Cache tabs.
Also, check for iptables entries that may be blocking Fluid Cache network traffic. Note that some default installations for RHEL create an iptables entry for ib_send_bw that prevents connections to another server and thus blocks Fluid Cache network traffic. Possible Cause One of the ports required by Fluid Cache is in use by another process. Solution Refer to the required ports listed in Checking Network Connections and reassign ports as needed.
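One coarse way to see whether a required port is already taken is to try binding to it. The Python sketch below does this with the standard socket module. It is only a local test: it does not identify the process that holds the port, it can report a false positive for a port with connections that are still closing, and binding to ports below 1024 normally requires root privileges. The function name is illustrative, and the port numbers to test are the ones listed in Checking Network Connections.

```python
import socket


def tcp_port_in_use(port, host="0.0.0.0"):
    """Return True if the TCP port cannot be bound on this host.

    A failed bind usually means another socket already holds the port,
    but it can also mean the caller lacks permission to bind to it.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return True
    return False
```

Run the check while the Fluid Cache services are stopped; otherwise the ports that Fluid Cache itself holds are reported as in use.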
Cannot Assign or Remove a Storage Center
Possible Cause
The Storage Center is already assigned to another Fluid Cache cluster.
Solution
In Enterprise Manager, check whether a Storage Center is listed for the Fluid Cache cluster.
Possible Cause
Network connectivity issues are preventing Enterprise Manager from communicating with Storage Center.
Solution
Make sure the network is functioning properly. Refer to Checking Network Connections and Checking Security Settings.
Cluster or Application Has Performance Issues Possible Cause One or more cache devices are uninstalled, have failed, or do not have the correct firmware or drivers. Solution Use Enterprise Manager to check the functionality of the cache devices. Possible Cause The Compellent storage array is overloaded. Solution In Enterprise Manager, check the storage latencies and throughput on the cached volumes. Add more capacity to the Compellent array if necessary.
Fluid Cache License File is Invalid
Verify that the license has not expired and that a system change has not invalidated it.
• The Fluid Cache license status can be verified on either the Events tab or the Cache tab of the Fluid Cache cluster.
• An evaluation license is valid for only 90 days. Contact your Dell sales representative to purchase a Dell Fluid Cache for SAN license.
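The 90-day evaluation period lends itself to a quick date calculation. The following Python sketch assumes that the 90 days are counted from the date the evaluation license was installed, which this guide does not state. The function name is illustrative, and the authoritative license status remains the one shown in Enterprise Manager.

```python
from datetime import date, timedelta

EVALUATION_PERIOD = timedelta(days=90)  # "valid for only 90 days"


def evaluation_days_remaining(installed_on, today):
    """Return days left on an evaluation license (negative once expired).

    Assumption: the 90 days are counted from the installation date.
    """
    expires_on = installed_on + EVALUATION_PERIOD
    return (expires_on - today).days
```

Because the result depends on the system date, a changed system clock alters it, which matches the warning above that system changes can affect the license.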
Storage Center is Not Available If you receive an error that you don’t have a Storage Center or you don’t see any Storage Centers, make sure that the Storage Center(s) is running version 6.5.1 or later. Fluid Cache Server is Not Available If you do not see a Fluid Cache server that you expect to be listed, click the Rescan button as the server may not have been discovered by the other servers in the cluster.
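The Storage Center version requirement above is easiest to check by comparing version numbers field by field, because comparing them as text would rank 6.10 below 6.5. The Python sketch below assumes dotted numeric version strings; the function names are illustrative and are not an Enterprise Manager API.

```python
MIN_STORAGE_CENTER_VERSION = (6, 5, 1)  # minimum stated in this section


def parse_version(text):
    """Turn a dotted version string such as "6.5.1" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(version_text, minimum=MIN_STORAGE_CENTER_VERSION):
    """Return True if the version is at or above the minimum."""
    return parse_version(version_text) >= minimum
```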
If the EM client displays a red mark over the cluster in the Storage Centers view of the cluster, it means that the Storage Center is reporting that it cannot communicate with the cluster servers over the management network. • Verify that the network is operational between the cluster servers and the Storage Center by using a network tool such as ping. • Note that it may take several minutes for the Storage Center to report the cluster status (down or up).
Because server clusters defined on a Storage Center must have the same operating system, all servers in the Fluid Cache cluster must also have the same operating system. If a Storage Center server with matching HBAs was created on the Storage Center before the assignment and was defined with a different operating system, the Storage Center cannot be assigned to the Fluid Cache cluster.