Terracotta and Intel: Breaking Down Barriers to In-memory Big Data Management

White Paper | Intel® Xeon® Processor | Big Data Management

This white paper provides an overview of Terracotta BigMemory Max* and the benefits of running BigMemory Max on servers powered by the Intel® Xeon® processor E7 v2 family. The paper describes how Terracotta BigMemory Max and Intel technologies can work together to enable real-time analysis of large data sets.
Big data solutions have changed the data analysis landscape, and enterprises are increasingly harnessing the power of big data. In-memory data management platforms, a natural progression of the big data revolution, are pushing data management and analytics into the realm of real time. Use cases that were difficult or impossible five years ago, such as real-time fraud detection and digital ad marketplaces, are now a reality.

Enterprises increasingly need to take immediate action on big data intelligence, which is driving changes to the traditional enterprise architecture. In particular, these enterprises are keeping data instantly available in high-speed machine memory, rather than locking it away in slow, disk-bound databases. Terracotta BigMemory Max*, combined with servers equipped with the Intel® Xeon® processor E7 v2 family, can give enterprises high-performance, highly available in-memory data access and management, and predictable speeds at any scale.

What's Holding Back Real-time Big Data Analysis?
Since the age of mainframes, enterprises have relied on analyzing structured data stored in databases to make business decisions. Analysis was often slow and complex, and infrastructure performance wasn't up to the task of providing real-time analysis of high-velocity data. While solutions such as Apache Hadoop* have opened up new avenues to capture, store, and analyze data, enterprises that have embraced big data solutions are now grappling with the challenge of managing massive amounts of hard-disk-based data across large server clusters, and with the delays caused by latency.
For example, a traditional Apache Hadoop cluster is a distributed network of servers, or nodes, running the Apache Hadoop software. A component of Apache Hadoop is the Hadoop Distributed File System* (HDFS), which distributes and mirrors data across the cluster nodes. Each node stores data on traditional storage devices, such as hard disks or solid-state drives (SSDs). Apache Hadoop optimizes analysis tasks so that the analysis takes place as close as possible to the source data, which can help reduce latency. But while hard disk speeds have improved over the years, hard disk performance still pales in comparison to DRAM or even SSD speeds. Even with an aggressively optimized Apache Hadoop cluster, real-time analysis can be difficult given the performance constraints of traditional storage.
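
As a contrast to disk-bound access, the following is a minimal sketch of how an application might keep a hot data set in machine memory using BigMemory Max's Ehcache-compatible API (net.sf.ehcache). The cache name, tier sizes, and sample key/value are illustrative assumptions, not taken from this paper, and the off-heap tier requires the BigMemory Max libraries on the classpath:

    import net.sf.ehcache.Cache;
    import net.sf.ehcache.CacheManager;
    import net.sf.ehcache.Element;
    import net.sf.ehcache.config.CacheConfiguration;
    import net.sf.ehcache.config.Configuration;
    import net.sf.ehcache.config.MemoryUnit;

    public class HotDataCache {
        public static void main(String[] args) {
            // Programmatic configuration: a small on-heap tier backed by a
            // large off-heap (BigMemory) tier. The off-heap tier lives in RAM
            // outside the Java heap, so it avoids garbage-collection pauses.
            Configuration managerConfig = new Configuration()
                .cache(new CacheConfiguration("hotData", 0)          // 0 = no entry-count limit
                    .maxBytesLocalHeap(512, MemoryUnit.MEGABYTES)    // fast on-heap tier
                    .maxBytesLocalOffHeap(8, MemoryUnit.GIGABYTES)); // in-memory off-heap tier

            CacheManager manager = CacheManager.create(managerConfig);
            Cache hotData = manager.getCache("hotData");

            // Reads and writes hit RAM, not disk.
            hotData.put(new Element("order:42", "{\"symbol\":\"INTC\",\"qty\":100}"));
            Element hit = hotData.get("order:42");
            System.out.println(hit == null ? "miss" : hit.getObjectValue());

            manager.shutdown();
        }
    }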
Real-time analysis can also suffer due to the methods many enterprises use to load data into an Apache Hadoop cluster. Enterprises typically rely on traditional relational databases to store information. This data is then batch-loaded into Apache Hadoop.
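
Where data must be available for analysis the moment it is produced, a clustered in-memory store can stand in for the batch-load step. Below is a minimal sketch using the same Ehcache-compatible API, assuming a Terracotta server array reachable at a placeholder address (terracotta-host:9510); the cache name and sample record are likewise illustrative:

    import net.sf.ehcache.Cache;
    import net.sf.ehcache.CacheManager;
    import net.sf.ehcache.Element;
    import net.sf.ehcache.config.CacheConfiguration;
    import net.sf.ehcache.config.Configuration;
    import net.sf.ehcache.config.MemoryUnit;
    import net.sf.ehcache.config.TerracottaClientConfiguration;
    import net.sf.ehcache.config.TerracottaConfiguration;

    public class StreamingIngest {
        public static void main(String[] args) {
            // Connect to a Terracotta server array (placeholder address) and
            // declare a cache whose contents are shared across client nodes.
            Configuration config = new Configuration()
                .terracotta(new TerracottaClientConfiguration().url("terracotta-host:9510"))
                .cache(new CacheConfiguration("liveEvents", 0)
                    .maxBytesLocalHeap(256, MemoryUnit.MEGABYTES)
                    .maxBytesLocalOffHeap(4, MemoryUnit.GIGABYTES)
                    .terracotta(new TerracottaConfiguration().clustered(true)));

            CacheManager manager = CacheManager.create(config);
            Cache liveEvents = manager.getCache("liveEvents");

            // Write each record as it arrives; it is immediately visible to
            // analysis code on any node, with no nightly batch-load step.
            liveEvents.put(new Element("txn:1001", "{\"amount\":250.0,\"flagged\":false}"));

            manager.shutdown();
        }
    }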
