Terracotta and Intel: Breaking Down Barriers to In-memory Big Data Management

White Paper | Intel® Xeon® Processor | Big Data Management

This white paper provides an overview of Terracotta BigMemory Max* and the benefits of running BigMemory Max on servers powered by the Intel® Xeon® processor E7 v2 family. The paper describes how Terracotta BigMemory Max and Intel technologies can work together to enable real-time analysis of large data sets.
Big data solutions have changed the data analysis landscape, and enterprises are increasingly harnessing the power of big data. In-memory data management platforms, a natural progression of the big data revolution, are pushing data management and analytics into the realm of real time. Use cases that were difficult or impossible five years ago, such as real-time fraud detection and digital ad marketplaces, are now a reality.

Enterprises increasingly need to take immediate action on big data intelligence, which is driving changes to the traditional enterprise architecture. In particular, these enterprises are keeping data instantly available in high-speed machine memory, rather than locking it away in slow, disk-bound databases. Terracotta BigMemory Max*, combined with servers equipped with the Intel® Xeon® processor E7 v2 family, can give enterprises high-performance, highly available in-memory data access and management, and predictable speeds at any scale.

What's Holding Back Real-time Big Data Analysis?
Since the age of mainframes, enterprises have relied on analyzing structured data stored in databases to make business decisions. Analysis was often slow and complex, and infrastructure performance wasn't up to the task of providing real-time analysis of high-velocity data. While solutions such as Apache Hadoop* have opened up new avenues to capture, store, and analyze data, enterprises that have embraced big data solutions are now grappling with the challenge of managing massive amounts of hard-disk-based data across large server clusters, and with the delays caused by latency.
For example, a traditional Apache Hadoop cluster is a distributed network of servers, or nodes, running the Apache Hadoop software. A component of Apache Hadoop is the Hadoop Distributed File System* (HDFS), which distributes and mirrors data across the cluster nodes. Each node stores data on traditional storage devices, such as hard disks or solid-state drives (SSDs). Apache Hadoop optimizes analysis tasks so that the analysis takes place as close as possible to the source data, which can help reduce latency. But while hard disk speeds have improved over the years, hard disk performance still pales in comparison to DRAM or even SSD speeds. Even with an aggressively optimized Apache Hadoop cluster, real-time analysis can be difficult given the performance constraints of traditional storage.
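
As a contrast to disk-bound access, the following is a minimal sketch of how an application might keep a hot data set in machine memory using BigMemory Max's Ehcache-compatible API (net.sf.ehcache). The cache name, tier sizes, and sample key/value are illustrative assumptions, not taken from this paper, and the off-heap tier requires the BigMemory Max libraries on the classpath:

    import net.sf.ehcache.Cache;
    import net.sf.ehcache.CacheManager;
    import net.sf.ehcache.Element;
    import net.sf.ehcache.config.CacheConfiguration;
    import net.sf.ehcache.config.Configuration;
    import net.sf.ehcache.config.MemoryUnit;

    public class HotDataCache {
        public static void main(String[] args) {
            // Programmatic configuration: a small on-heap tier backed by a
            // large off-heap (BigMemory) tier. The off-heap tier lives in RAM
            // outside the Java heap, so it avoids garbage-collection pauses.
            Configuration managerConfig = new Configuration()
                .cache(new CacheConfiguration("hotData", 0)          // 0 = no entry-count limit
                    .maxBytesLocalHeap(512, MemoryUnit.MEGABYTES)    // fast on-heap tier
                    .maxBytesLocalOffHeap(8, MemoryUnit.GIGABYTES)); // in-memory off-heap tier

            CacheManager manager = CacheManager.create(managerConfig);
            Cache hotData = manager.getCache("hotData");

            // Reads and writes hit RAM, not disk.
            hotData.put(new Element("order:42", "{\"symbol\":\"INTC\",\"qty\":100}"));
            Element hit = hotData.get("order:42");
            System.out.println(hit == null ? "miss" : hit.getObjectValue());

            manager.shutdown();
        }
    }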
Real-time analysis can also suffer due to the methods many enterprises use to load data into an Apache Hadoop cluster. Enterprises typically rely on traditional relational databases to store information. This data is then batch-loaded into Apache Hadoop.
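
Where data must be available for analysis the moment it is produced, a clustered in-memory store can stand in for the batch-load step. Below is a minimal sketch using the same Ehcache-compatible API, assuming a Terracotta server array reachable at a placeholder address (terracotta-host:9510); the cache name and sample record are likewise illustrative:

    import net.sf.ehcache.Cache;
    import net.sf.ehcache.CacheManager;
    import net.sf.ehcache.Element;
    import net.sf.ehcache.config.CacheConfiguration;
    import net.sf.ehcache.config.Configuration;
    import net.sf.ehcache.config.MemoryUnit;
    import net.sf.ehcache.config.TerracottaClientConfiguration;
    import net.sf.ehcache.config.TerracottaConfiguration;

    public class StreamingIngest {
        public static void main(String[] args) {
            // Connect to a Terracotta server array (placeholder address) and
            // declare a cache whose contents are shared across client nodes.
            Configuration config = new Configuration()
                .terracotta(new TerracottaClientConfiguration().url("terracotta-host:9510"))
                .cache(new CacheConfiguration("liveEvents", 0)
                    .maxBytesLocalHeap(256, MemoryUnit.MEGABYTES)
                    .maxBytesLocalOffHeap(4, MemoryUnit.GIGABYTES)
                    .terracotta(new TerracottaConfiguration().clustered(true)));

            CacheManager manager = CacheManager.create(config);
            Cache liveEvents = manager.getCache("liveEvents");

            // Write each record as it arrives; it is immediately visible to
            // analysis code on any node, with no nightly batch-load step.
            liveEvents.put(new Element("txn:1001", "{\"amount\":250.0,\"flagged\":false}"));

            manager.shutdown();
        }
    }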
