Optimizing Dell EMC SC Series Storage for Oracle OLAP Processing Abstract This paper highlights the advanced features of Dell EMC™ SC Series storage and provides guidance on how they can be leveraged to deliver a cost-effective solution for Oracle® OLAP and DSS deployments.
Revisions Revisions Date Description September 2014 Initial release October 2016 Updated for agnosticism with Dell SC Series all-flash arrays; updated format July 2017 Consolidated best practice information and focus on OLAP environment; updated format November 2019 vVols branding update Acknowledgements Updated by: Mark Tomczik and Henry Wong The information in this publication is provided “as is.” Dell Inc.
Table of contents Table of contents 1 Introduction ...................................................................................................................................................................5 1.1 Audience .............................................................................................................................................................5 1.2 Prerequisites .....................................................................................................
Executive summary Executive summary Intelligent choices must be made when selecting the hardware for an environment that will service the needs of a business and expected data growth.
Introduction 1 Introduction The purpose of this document is to provide general best practices and sizing guidelines when deploying an Oracle data warehouse on any SC Series array. The information provided in this document is built upon the guiding principles and building blocks in the document, Dell EMC SC Series Arrays and Oracle. Therefore, it should be thoroughly reviewed before implementing the best practices mentioned in this document.
SC Series overview 2 SC Series overview The SC Series array is built on the core Fluid Data architecture that has made flash storage affordable to customers. Traditional storage arrays typically can accommodate one storage type, or when multiple storage types are used, the storage must be configured in such a way that it is inflexible and difficult to adjust to any future workload changes.
SC Series overview Replication technology protects a virtual volume by creating a replica on a remote SC Series array using either asynchronous or synchronous mode. After the initial setup, SCOS copies only the data changes made since the last snapshot. This uses less network bandwidth and less time. In the event of primary array/site failure, the replicas can be activated by the administrator on the remote site. Live Volume is a hyper-volume that stretches across two different SC Series arrays.
Optimize an SC Series array for an Oracle data warehouse 3 Optimize an SC Series array for an Oracle data warehouse Unlike an OLTP database, a data warehouse is characterized by relatively low volume of transactions. The majority of the activities in a data warehouse involves complex queries and aggregates large set of data. The volume of data tends to grow steadily over time and is kept available for a much longer period of time.
Optimize an SC Series array for an Oracle data warehouse Redundant ports and paths, and HA fault domains 3.2 Direct attach with front-end SAS For SC Series models that support using front-end SAS connectivity to the servers, configure multiple SAS connections between each server and the array to provide path redundancy. Multipath software should be configured on each server to provide automatic path failover and I/O load balancing.
Optimize an SC Series array for an Oracle data warehouse Oracle database storage layout model 10 Optimizing Dell EMC SC Series Storage for Oracle OLAP Processing | 2009-M-BP-O
Optimize an SC Series array for an Oracle data warehouse 3.3.1 Oracle ASM Dell EMC and Oracle recommend using Oracle Automatic Storage Management (ASM) to manage SC Series volumes for the database and clusterware. This section reviews the general guidelines and additional considerations for a data warehouse. ASM guidelines: • • • • • • • • • Configure a separate diskgroup for the Oracle clusterware. A data warehouse can span across multiple diskgroups.
Optimize an SC Series array for an Oracle data warehouse 3.3.2 Linux LVM Linux Logical Volume Manager (LVM) is a common general-purpose storage manager included in all popular Linux distributions. Since Oracle software is excluded from ASM, LVM can be used to store the software on a local Linux file system. Similar to ASM, it is not necessary to mirror logical volumes in LVM. LVM allows striping data files across multiple volumes in a volume group.
Optimize an SC Series array for an Oracle data warehouse Table 1 shows an example of a simple two-tier SC Series storage configuration for a data warehouse. The capacity tier is ideal for storing a large amount of historical data while the high performance tier is great for handling the occasional transactions and queries.
Optimize an SC Series array for an Oracle data warehouse The following list shows an example of how to apply different storage profiles to different data types: • • • • • Dell EMC and Oracle recommend using Oracle ASM to manage the data storage. The same storage profile must be applied to each SC Series volume that constitutes the ASM disk groups. In most cases, the predefined Recommended profile offers the most effective way to manage data across all storage tiers and RAID levels.
Optimize an SC Series array for an Oracle data warehouse +SYSTEMDG Metadata High Priority (Tier 1) T1/ all RAID levels • Pin data in T1 only if it is deemed highly critical and requires the best performance • New data writes to the fastest write layer T1 R10 • Aged data can still move to T1 R5 or R6 to free up space Recommended All tiers/ all RAID levels • Metadata changes infrequently; mostly read I/Os • Redo logs are circular logs; mostly write I/Os • New data/logs writes to the fastest layer T1 R1
Optimize an SC Series array for an Oracle data warehouse Volume attributes in DSM 3.4.3 Initial loading of historical data If a large amount of data will be transferred into the data warehouse initially, it might not be desirable to consume all tier 1 storage for the initial load. A common practice is to direct data during ingestion to the lowest tier where capacity is abundant. In DSM, select the Low Priority storage profile on each volume for the duration of the ingestion.
Optimize an SC Series array for an Oracle data warehouse space-saving benefits at the most convenient time. The following offers an example where data reduction features might be suitable in a data warehouse. Note: Do not enable SC Series data reduction on the same set of data where database/application-level data reduction is already enabled. There might not be significant reduction savings on already reduced data.
Optimize an SC Series array for an Oracle data warehouse Per-volume data-reduction statistics Array-wide data-reduction statistics 18 Optimizing Dell EMC SC Series Storage for Oracle OLAP Processing | 2009-M-BP-O
Optimize an SC Series array for an Oracle data warehouse Volume space allocation statistics 3.4.5 Thin provisioning compared to preallocating storage By default, all SC Series volumes are thin provisioned (the SC Series array allocates pages to the volume as data is written to it). Alternatively, the whole volume can be preallocated at the time of creation. However, regardless of the actual space consumption, the full volume size is reserved as used and cannot be reassigned to other volumes.
Optimize an SC Series array for an Oracle data warehouse For example, the following SQL statement creates a bigfile tablespace intended for storing the historical data. The statement preallocates 1TB of space initially. When it is full, additional storage will be added in 1GB amounts.
Optimize an SC Series array for an Oracle data warehouse • • Snapshots allow fast recovery and improve RTO and RPO. Data from snapshots are immediately available through view volumes. An instant backup should be created before major upgrades or changes in the data warehouse. Considerations for planning snapshot schedules: • • • • • Snapshots can be taken on a schedule or on-demand. Schedule snapshots based on backup policy and RPO. SC Series storage allows a schedule as frequent as every 5 minutes.
Optimize an SC Series array for an Oracle data warehouse 3.6.2 Configure I/O for bandwidth Storage configurations for a data warehouse should be chosen based on the I/O bandwidth and not just on the overall storage capacity. The following factors should be considered: • • • • • 3.6.3 The expected throughput of the data warehouse and all other applications that share the SC Series array.
Optimize an SC Series array for an Oracle data warehouse Oracle Automatic Workload Repository (AWR) can be used to help estimate how much storage is required for snapshots. It provides detailed usage information on redo logs, archived logs, temporary space, and data files. Administrators should be able to deduce the data-change rate from the AWR reports.
Conclusion 4 Conclusion SC Series arrays provide well-balanced storage solutions that benefit large data warehouses by reducing the total cost without sacrificing the performance. Thin provisioning, continuous self-optimization, and data reduction work seamlessly together to provide the most cost-efficient solution. Performance is delivered where it is needed through auto-tiering and multi-RAID levels. When the data warehouse grows in the future, the array can scale easily by adding drives at any time.
Storage profiles and RAID levels A Storage profiles and RAID levels Storage profiles available with write SSDs and HDDs, or with HDDs only Name Initial write tier Tier (T) and RAID levels Progression Recommended (All Tiers) 1 Writes: T1 RAID 10 To all tiers Snapshots: RAID 5/RAID 6 High Priority (Tier 1) 1 Writes: T1 RAID 10 Stay in tier 1 Snapshots: T1 RAID 5/RAID 6 Medium Priority (Tier 2) 2 Writes: T2 RAID 10 Stay in tier 2 Snapshots: T2 RAID 5/RAID 6 Low Priority (Tier 3) 3 Writes: T
Storage profiles and RAID levels RAID stripe width in SC Series arrays 26 RAID Level Stripe width description RAID 0 Stripes across all drives in the tier with no redundancy RAID 10 Stripes data along with one copy across all drives in the tier RAID 10-DM Stripes data along with two copies across all drives in the tier RAID 5-5 Distributes parity across five drives (4 data segments, 1 parity segment for each stripe); tier requires at least 6 drives (5 for RAID, and one for spare) RAID 5-9 Distr
Technical support and resources B Technical support and resources Dell.com/support is focused on meeting customer needs with proven services and support. Storage technical documents and videos provide expertise that helps to ensure customer success on Dell EMC storage platforms. B.1 Additional resources Referenced or recommended Dell EMC publications: • • • • • • • • • • • • • Dell Storage Manager Administrator’s Guide available on the Dell Support website. Dell Storage Center OS 7.