1.1

ManualsBrandsVMware ManualsApplicationsvFabric SQLFire

Table Of Contents

vFabric SQLFire User's Guide

VMware vFabric SQLFire 1.1

VMware vFabric Suite 5.3

This document supports the version of each product listed and

supports all subsequent versions until the document is replaced by

a new edition. To check for more recent editions of this document,

see http://www.vmware.com/support/pubs.

EN-001171-00

Summary of content (722 pages)

PAGE 1
vFabric SQLFire User's Guide VMware vFabric SQLFire 1.1 VMware vFabric Suite 5.3 This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition. To check for more recent editions of this document, see http://www.vmware.com/support/pubs.
PAGE 2
You can find the most up-to-date technical documentation on the VMware Web site at: http://www.vmware.com/support/ The VMware Web site also provides the latest product updates. If you have comments about this documentation, submit your feedback to: docfeedback@vmware.com Copyright © 2013 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws. VMware products are covered by one or more patents listed at http://www.vmware.
PAGE 3
Contents About the SQLFire User's Guide.................................................................................13 Supported Configurations and System Requirements.............................................15 Part I: Getting Started with vFabric SQLFire............................................1 vFabric SQLFire in 15 Minutes.....................................................................................................................2 Chapter 1: Overview of vFabric SQLFire...............
PAGE 4
Create a SQLFire Cluster...........................................................................................................................35 Connect to the Cluster Using SQLF...........................................................................................................37 Create Replicated Tables and Execute Queries.........................................................................................38 Implement a Partitioning Strategy..................................................
PAGE 5
Contents Persist Table Data to a Disk Store...............................................................................................................84 Optimizing Availability and Performance.....................................................................................................85 Starting System with Disk Stores................................................................................................................85 Disk Store Management...........................................
PAGE 6
Chapter 22: Using SQLFire.NET Designer...........................................143 Installing SQLFire.NET Designer..............................................................................................................143 Connecting to a SQLFire Distributed System...........................................................................................143 Editing Tables...................................................................................................................................
PAGE 7
Contents Chapter 30: SQLFire Cache Strategies.................................................181 Chapter 31: Using a RowLoader to Load Existing Data......................183 How SQLFire Invokes a RowLoader.........................................................................................................183 Implementing the RowLoader Interface....................................................................................................184 Using the JDBCRowLoader Example.........................
PAGE 8
Chapter 40: Starting and Configuring SQLFire Servers.....................223 Start and Stop SQLFire Servers Using sqlf..............................................................................................223 Specify the Server Working Directory.......................................................................................................224 Specify Client Connection Information......................................................................................................
PAGE 9
Contents Using VSD to Analyze Statistics...............................................................................................................280 Chapter 48: Using Java Management Extensions (JMX)....................291 Using a JMX Manager Node.....................................................................................................................291 Using a vFabric SQLFire JMX Agent.......................................................................................................
PAGE 10
sqlf remove-jar..........................................................................................................................................375 sqlf replace-jar..........................................................................................................................................377 sqlf revoke-missing-disk-store...................................................................................................................379 sqlf server.....................................
PAGE 11
Contents remove......................................................................................................................................................442 rollback......................................................................................................................................................443 run.............................................................................................................................................................444 set connection..
PAGE 12
SYSROUTINEPERMS..............................................................................................................................652 SYSSCHEMAS.........................................................................................................................................653 SYSSTATEMENTS...................................................................................................................................654 SYSSTATISTICS.......................................................
PAGE 13
About the SQLFire User's Guide Revised April 30, 2013. The SQLFire User's Guide provides step-by-step procedures for installing, configuring, managing, and developing applications with VMware® vFabric™ SQLFire. The guide also provides a complete reference for the SQLFire tools, supported SQL statements, and APIs.
PAGE 14
PAGE 15
Supported Configurations and System Requirements Before installing vFabric SQLFire, make sure your system meets the minimum system requirements for installing and running the product. Before installing vFabric SQLFire, make sure your system meets the minimum system requirements for installing and running the product.
PAGE 16
Host Machine Requirements Requirements for each host: • A supported Java SE installation. • File system that supports long file names. • Adequate per-user quota of file handles (ulimit for Linux) • TCP/IP. • System clock set to the correct time. • For each Linux host, the hostname and host files must be properly configured. See the system manpages for hostname and hosts. • Time synchronization service such as Network Time Protocol (NTP).
PAGE 17
SQLFire provides a managed ADO.NET driver that you can use for developing non-Java client applications. The ADO.NET driver uses IKVM technology to translate SQLFire JDBC core libraries to .NET MSIL. The ADO.NET driver is supported for applications built using the Microsoft .NET 2.0 SP1 or higher framework.
PAGE 18
PAGE 19
Part 1 Getting Started with vFabric SQLFire Getting Started with vFabric SQLFire provides step-by-step procedures for installing, configuring, and using VMware® vFabric™ SQLFire. The guide also explains main concepts and provides tutorials to help you quickly begin using SQLFire.
PAGE 20
vFabric SQLFire in 15 Minutes Need a quick introduction to vFabric SQLFire? Take this 15-minute tour to try out the basic features and functionality. The vFabric SQLFire tutorials expand on these concepts, and demonstrate additional product features. See Tutorials on page 35. 1. Download the latest SQLFire 1.1 ZIP file distribution from the download page: https://my.vmware.com/web/vmware/info/slug/application_platform/vmware_vfabric_sqlfire/1_0. Save the downloaded file in your home directory. 2.
PAGE 21
SQLFire Server pid: 8897 status: running Logs generated in /home/yozie/vFabric_SQLFire_11_bNNNNN/quickstart/server1/sqlfserver.log Starting SQLFire Server using locators for peer discovery: localhost[10334] Starting network server for SQLFire Server at address localhost/127.0.0.1[1529] SQLFire Server pid: 9003 status: running Logs generated in /home/yozie/vFabric_SQLFire_11_bNNNNN/quickstart/server2/sqlfserver.log Both servers also bind to the localhost address.
PAGE 22
12. By default, SQLFire replicates new tables that you create onto data store members. You can validate this using the query: sqlf> select tablename, datapolicy from sys.systables where tablename='QUICKTABLE'; TABLENAME |DATAPOLICY ----------------------------------------------------------------------------QUICKTABLE |REPLICATE 1 row selected 13.
PAGE 23
15. To view the details of the query plan, disconnect as a thin client from the distributed system, and then reconnect as a peer client. A peer client participates as a member of the SQLFire distributed system and can coordinate queries, but it does not host any actual data.
PAGE 24
returned_rows 542 no_opens 1 RESULT-HOLDER execute_time 0.490328 ms returned_rows 267 no_opens 1 member_node localhost(17535):52946 RESULT-HOLDER execute_time 1.65846 ms returned_rows 275 no_opens 1 member_node localhost(17438):1230 DISTRIBUTION-END (64.76%) execute_time 19.233818 ms returned_rows 542 Local plan: member localhost(17535):52946 begin_execution 2013-02-27 15:33:30.762 end_execution 2013-02-27 15:33:30.842 TABLESCAN (100.00%) execute_time 3.
PAGE 25
Chapter 1 Overview of vFabric SQLFire vFabric SQLFire is a memory-optimized, distributed database management system designed for applications that have demanding scalability and availability requirements. Applications can manage database tables entirely in memory, or they can persist tables to disk to reload the data after restarting the system. A SQLFire distributed system can be easily scaled out using commodity hardware.
PAGE 26
Getting Started with vFabric SQLFire Data Flow Diagram GemFire, Apache Derby, and SQL Components vFabric SQLFire incorporates core vFabric GemFire technology and Apache Derby RDBMS components to provide a high-performance, distributed database management system. SQLFire extends standard SQL statements where necessary for creating and managing tables and configuring the SQLFire system. The sections that follow document how SQLFire utilizes the GemFire and Derby component functionality.
PAGE 27
Overview of vFabric SQLFire The SQLFire community site provides a comparison of SQLFire to other data management systems, such as vFabric GemFire. Apache Derby RDBMS Components SQLFire integrates vFabric GemFire functionality with several components of the Apache Derby relational database management system (RDBMS): • JDBC driver. SQLFire supports a native, high performant JDBC driver (peer driver) and a thin JDBC driver. The peer driver is based on the Derby embedded driver and JDBC 4.
PAGE 28
PAGE 29
Chapter 2 Understanding the SQLFire Distributed System A SQLFire deployment consists of distributed member processes that connect to each other to form a peer-to-peer network, also known as a distributed system or SQLFire cluster. The sections that follow explain the interactions of main system components and processes. Tutorials on page 35 help you get started configuring and using a SQLFire distributed system.
PAGE 30
Getting Started with vFabric SQLFire Servers, Peer Clients, and Server Groups A SQLFire server is a process that hosts data and is a member of a peer-to-peer distributed system. SQLFire servers run in Java Virtual Machines (JVMs). You start a SQLFire server using the sqlf tool from a command prompt or terminal window. sqlf launches servers as standalone processes that are akin to database servers.
PAGE 31
Understanding the SQLFire Distributed System with certain consistency guarantees. Each member in the group participates in membership decisions, which ensures that either all members see a new member or no members see it. The membership coordinator, a key component of the GMS, handles "join" and "leave" requests, and also handles members that are suspected of having left the system.
PAGE 32
Getting Started with vFabric SQLFire // SQLFire data-aware procedure invocation CallableStatement callableStmt = connection.prepareCall("{CALL order_credit_check() " + "ON TABLE Orders WHERE customerID IN (?)}"); callableStmt.
PAGE 33
Chapter 3 Installing vFabric SQLFire You can install vFabric SQLFire from the VMware yum repository (RHEL only), from a downloaded RPM file (RHEL only), or from a downloaded ZIP file. The installation procedure varies according to whether you obtained vFabric SQLFire as a standalone product or as part of VMware® vFabric Suite™ Advanced. Installation Note for vFabric Suite Customers vFabric SQLFire is available as a standalone product installation, and as part of vFabric Suite Advanced.
PAGE 34
Getting Started with vFabric SQLFire Install SQLFire as a Component of vFabric Suite Advanced If you purchased vFabric Suite Advanced, VMware recommends that you install vFabric SQLFire by first installing the VMware RPM repository and then using yum to perform the actual installation. Follow this procedure: 1. On the RHEL computer, start a terminal either as the root user or as an unprivileged user who has sudo privileges.
PAGE 35
Installing vFabric SQLFire Note: The yum install command installs the most recent version of the vFabric SQLFire RPM that it finds in all installed repository. If you want to install a different version, you must explicitly specify that version with the yum install command. Use yum search vfabric-sqlfire --showduplicates to find all versions that are available in the installed repositories.
PAGE 36
Getting Started with vFabric SQLFire Install SQLFire as a Standalone Product If you purchased the standalone product (rather than as part of vFabric Suite Advanced), follow this procedure to download and install only the SQLFire RPM: 1. From the VMware downloads page, select VMware vFabric SQLFire. On the SQLFire 1.
PAGE 37
Installing vFabric SQLFire 7. (Optional) Specify the configuration of the vfabric-sqlfire process by editing the file /etc/sysconfig/sqlfire, which is the file sourced by the script that you will later use to start the SQLFire process (/etc/init.d/vfabric-sqlfire.) The /etc/sysconfig/sqlfire file includes many comments to help you decide whether you need to modify it.
PAGE 38
Getting Started with vFabric SQLFire • Download the ZIP file distribution of vFabric SQLFire. Procedure 1. Change to the directory where you downloaded the SQLFire software, and unzip the ZIP file: • UNIX and Linux (Bourne and Korn shells - sh, ksh, bash). If you are using the command line, type the following command: $ unzip vFabric_SQLFire_XX_bNNNNN.
PAGE 39
Chapter 4 Activating vFabric SQLFire Licensing vFabric SQLFire includes a default evaluation license that enables you to run the product tutorial and perform simple evaluation activities. You can obtain custom evaluation licensing and purchase production licensing from your VMware account manager or the VMware license portal. Understand vFabric SQLFire License Options vFabric SQLFire has a number of licensing options. Read the following sections to understand the different license types.
PAGE 40
Getting Started with vFabric SQLFire SQLFire licenses do not restrict the number of clients that can connect to a distributed system. Any number of thin clients and peer clients (SQLFire accessor members) are permitted. vFabric SQLFire Local Licensing and vFabric Suite Licensing If you obtained vFabric SQLFire as a standalone product, you install a license locally on each physical and virtual machine that runs vFabric SQLFire.
PAGE 41
Activating vFabric SQLFire Licensing Installing a vFabric SQLFire Professional License 1. Obtain a product vFabric SQLFire Professional license from VMware. 2. Install the license in SQLFire. Choose one of the following options: • On each SQLFire data store member, add the vFabric SQLFire license serial number directly to the sqlfire.properties file. See Option 1: Install Licenses Using sqlfire.
PAGE 42
Getting Started with vFabric SQLFire 2. Install the license in SQLFire. Choose one of the following options: • On each data store member in the distributed system, add the vFabric SQLFire Enterprise serial number directly to the sqlfire.properties file. See Option 1: Install Licenses Using sqlfire.properties on page 25; or • Create a file that contains both the vFabric SQLFire Enterprise serial number, and copy the file to the vFabric serial number directory on all SQLFire data store members.
PAGE 43
Activating vFabric SQLFire Licensing Local VMware vFabric Directories The location of the local VMware vFabric home directory, if it exists, varies by operating system: Windows %ALLUSERSPROFILE%\VMware\vFabric Linux (or other OS) /opt/vmware/vFabric The location of the local VMware vFabric serial number directory, if it exists, varies by operating system: Windows %ALLUSERSPROFILE%\VMware\vFabric Linux (or other OS) /etc/opt/vmware/vfabric Install and Configure vFabric SQLFire Licenses Installation
PAGE 44
Getting Started with vFabric SQLFire For example: # sqlfire.properties for data store or accessor member license-serial-number=#####-#####-#####-#####-##### If you need to specify multiple serial numbers, use a comma separated list: # sqlfire.
PAGE 45
Activating vFabric SQLFire Licensing If you install an invalid serial number or if SQLFire cannot obtain a dynamic license from the vFabric License Server, SQLFire fails to start and throws an exception. In Getting Started with vFabric Suite , see "Check the Validity of an Existing License Key." to learn how to verify your license. See Licensing Problems on page 697 for a list of common problems and their solutions.
PAGE 46
PAGE 47
Chapter 5 Upgrading vFabric SQLFire SQLFire 1.1 is not compatible with persistence files (disk stores) that were created in earlier versions of SQLFire (1.0.x). You must use the sqlf upgrade-disk-store command to upgrade disk stores to the SQLFire 1.1 format after upgrading the SQLFire software to version 1.1. Refer to the Release Notes to learn about changes that were introduced in the latest update.
PAGE 48
Getting Started with vFabric SQLFire RHEL: Upgrade vFabric SQLFire from RPM If your guest operating system is Red Hat Enterprise Linux (RHEL) and you have installed a previous version of SQLFire using yum and RPM, VMware recommends that you use the RPM distribution to upgrade vFabric SQLFire. You complete the upgrade procedure on every virtual and physical machine that runs SQLFire. Note: Thoroughly test your development systems with the new version before moving into production.
PAGE 49
Upgrading vFabric SQLFire 4. If you are upgrading from SQLFire 1.0.x, run the sqlf command upgrade-disk-store to upgrade each disk store to a format that is compatible with vFabric SQLFire 1.1. See sqlf upgrade-disk-store on page 397 for more information. Note: You must update the default disk stores as well as any additional disk stores that were defined through the CREATE DISKSTORE statement. Specify the full path to each disk store in multiple invocations of the sqlf upgrade-disk-store command.
PAGE 50
Getting Started with vFabric SQLFire 4. If you are upgrading from SQLFire 1.0.x, run the sqlf command upgrade-disk-store to upgrade each disk store to a format that is compatible with vFabric SQLFire 1.1. See sqlf upgrade-disk-store on page 397 for more information. Note: Update the default disk stores as well as any additional disk stores that were defined through the CREATE DISKSTORE statement. Specify the full path to each disk store in multiple invocations of the sqlf upgrade-disk-store command.
PAGE 51
Chapter 6 Connect to vFabric SQLFire with JDBC Tools Third-party JDBC tools can help you browse data in tables, issue SQL commands, design new tables, and so forth. You can configure these tools to use the SQLFire JDBC thin client driver to connect to a SQLFire distributed system. Although the instructions for setting up each tool vary, the general process for establishing a connection involves configuring the JDBC client driver and setting JDBC connection URL properties. Follow these basic steps: 1.
PAGE 52
PAGE 53
Chapter 7 Tutorials Learn to configure and use SQLFire features such as table replication and partitioning, persisting data to disk, and dynamically resizing the cluster. Main Steps The tutorial is divided into the following steps, which explain how to set up a cluster of SQLFire servers on multiple Java VMs and then distribute data across the cluster. Perform the steps in the sequence shown.
PAGE 54
Getting Started with vFabric SQLFire 4. To manage client connections to the available SQLFire server members, the cluster in this tutorial uses a SQLFire locator member. Create a new directory for the locator member: mkdir locator A locator maintains a list of available servers in the cluster, and updates that list as servers join and leave the cluster. Locators also load balance client connections across all available servers. 5.
PAGE 55
Tutorials sqlf server start -dir=server2 -locators=ip_address[10101] -client-bind-address=ip_address -client-port=1529 In each command, the -locators option defines the peer discovery address to use for joining the SQLFire distributed system. (Production deployments generally use multiple locator members, in which case you would specify a comma-separated list of locator host[port] connections when starting a server.
PAGE 56
Getting Started with vFabric SQLFire 4. To connect to the SQLFire cluster using the JDBC thin client driver, use the connect client command and specify the host and port number of the SQLFire locator: connect client 'ip_address:1527'; Notice that SQLFire does not have the concept of a "database". When you connect to a SQLFire cluster, the distributed system that you connect to is defined by the locator (or alternately, the mcast-port) specified in the JDBC or ADO.NET connection. 5.
PAGE 57
Tutorials 0 rows inserted/updated/deleted [...] 2. Run the loadTables.sql script to populate the tables with data: run 'loadTables.sql'; The script output completes with: sqlf> insert into FLIGHTAVAILABILITY values ('US1357',2,'2004-04-18',0,0,3); 1 row inserted/updated/deleted 3. Enter the following command to show the table names that you created (tables in the APP schema): show tables in APP; 4.
PAGE 58
Getting Started with vFabric SQLFire FLIGHTS, FLIGHTS_HISTORY, and FLIGHTAVAILABILITY are fact tables, and they will be partitioned. You will co-locate these tables to ensure that all rows that are associated with FLIGHT_ID are maintained in a single partition. This step ensures that frequent join queries based on a selected flight are pruned to a single member and executed efficiently. Procedure 1. In a separate terminal window or GUI editor, open the create_colocated_schema.
PAGE 59
Tutorials SEGMENT_NUMBER INTEGER NOT NULL , FLIGHT_DATE DATE NOT NULL , ECONOMY_SEATS_TAKEN INTEGER DEFAULT 0, BUSINESS_SEATS_TAKEN INTEGER DEFAULT 0, FIRSTCLASS_SEATS_TAKEN INTEGER DEFAULT 0, CONSTRAINT FLIGHTAVAIL_PK PRIMARY KEY ( FLIGHT_ID, SEGMENT_NUMBER, FLIGHT_DATE), CONSTRAINT FLIGHTS_FK2 Foreign Key ( FLIGHT_ID, SEGMENT_NUMBER) REFERENCES FLIGHTS ( FLIGHT_ID, SEGMENT_NUMBER) ) PARTITION BY COLUMN (FLIGHT_ID) COLOCATE WITH (FLIGHTS); 2. In the sqlf session, execute the create_colocated_schema.
PAGE 60
Getting Started with vFabric SQLFire 1. In a separate terminal window or GUI editor, examine the contents of the create_persistent_schema.sql script. Notice that this script uses the PERSISTENT keyword in each CREATE TABLE statement.
PAGE 61
Tutorials 1fc6a853-69d6-4ffe-8029-218acf165c34 is waiting for the data previously hosted at [/10.118.33.206:/Users/yozie/vFabric_SQLFire_11_bNNNNN /server2/./datadictionary created at timestamp 1343149035681 version 0 diskStoreId 49c1e827-4bb5-49cc-951b-221d47bbc92f] to be available These are not error messages. They indicate that the SQLFire member you are starting is waiting for another member to become available online. 6.
PAGE 62
Getting Started with vFabric SQLFire Connecting to distributed system: locators=ip_address[10101] Successfully shut down 3 members 9. After all data stores have stopped, shut down the locator as well: sqlf locator stop -dir=locator1 Perform Additional Tasks After you complete the SQLFire tutorial, you can perform related tutorial tasks to explore other areas of the product. Explore the toursDB Database The example toursDB database manages information about air transportation for scheduling flights.
PAGE 63
Chapter 8 vFabric SQLFire Features and Benefits The sections that follow summarize main features. The SQLFire community site provides additional information about the features and benefits of SQLFire and a comparison of SQLFire to other data management systems. See also the SQLFire product page.
PAGE 64
PAGE 65
Part 2 Managing Your Data in vFabric SQLFire Managing Your Data in vFabric SQLFire describes how to design your database schema to take advantage of replication and partitioning in a distributed SQLFire cluster. You can use this information to design new databases, or to adapt an existing database for use with SQLFire. This guide also describes how to persist your data to disk store files, and it provides a simple procedure for importing an existing database into SQLFire.
PAGE 66
PAGE 67
Chapter 9 Designing vFabric SQLFire Databases Unlike in traditional database design, SQLFire requires the designer to analyze the common access patterns and choose a partitioning strategy that results in queries that are pruned to a single partition. In addition, fact tables may need to be replicated in order to make data available for join operations against partitioned data. This chapter describes the basic table design principles that you can use to achieve linear scaling with SQLFire.
PAGE 68
Managing Your Data in vFabric SQLFire • In an online auction application, you may need to stream incoming auction bids to hundreds of clients with very low latency. To do so, you must manage selected "hot" auctions on a single partition so that they receive sufficient processing power. As the processing demand increases, add more partitions and route the application logic that matches bids to clients to the data store itself.
PAGE 69
Designing vFabric SQLFire Databases • Use parallel, data-aware procedures to run the logic for the query on the member that stores some or all of the data (to minimize data distribution hops). The procedure should execute multiple queries and then join the results using application code. Using Data-Aware Stored Procedures on page 153 provides more information. • Split the query into multiple queries, and perform the join in the application client code.
PAGE 70
Managing Your Data in vFabric SQLFire Step 1: Determine the entity groups Entity groups are generally course-grained entities that have children, grand children, and so forth, and they are commonly used in queries. This example chooses these entity groups: Entity group Description Customer This group uses the customer identity along with orders and order details as the children. Product This group uses product details along with the associated supplier information.
PAGE 71
Designing vFabric SQLFire Databases Entity group Tables Order Details Product Product Suppliers Category Step 3: Define the partitioning key for each group In this example, the partitioning keys are: Entity group Partitioning key Customer CustomerID Product ProductID This example uses customerID as the partitioning key for the Customer group. The customer row and all associated orders will be collocated into a single partition.
PAGE 72
Managing Your Data in vFabric SQLFire So for the product entity group, table Products is partitioned by ProductID, and the Suppliers and Categories tables are replicated to all of the members where Products is partitioned. Applications can now join Products, Suppliers and categories. For example: select * from Products p , Suppliers s, Categories c where c.categoryID = p.categoryID and p.supplierID = s.supplierID and p.
PAGE 73
Chapter 10 Using Server Groups to Manage Data Use server groups to control where table data is stored. Server Groups Overview A server group specifies the SQLFire members that will host data for a table. You use a server group to logically group SQLFire data stores for managing a table's data. Any number of SQLFire members that host data can participate in one or more server groups. You specify named server groups when you start a SQLFire data store.
PAGE 74
Managing Your Data in vFabric SQLFire A SQLFire member that is booted with host-data=false is an accessor and does not host table data, even if you specify one or more server groups. However, peer clients that host data can also participate in server groups.
PAGE 75
Using Server Groups to Manage Data By default, all servers that host data are added to the "default" server group. Different logical database schema are often managed in different server groups. For example, an order management system might manage all customers and their orders in an "Orders" schema deployed to one server group. The same system might manage shipping and logistics data in a different server group.
PAGE 76
Managing Your Data in vFabric SQLFire For example, if you start a SQLFire server from the command line with sqlf, use the server-groups property to specify the names of one or more server groups that the server should join: sqlf server start -server-groups=OrdersDB,OrdersReplicationGrp,DBProcessInstance1 In this example, the SQLFire server participates in three server groups: OrdersDB, OrdersReplicationGrp and DBProcessInstance1.
PAGE 77
Chapter 11 Partitioning Tables Horizontal partitioning involves spreading a large data set (many rows in a table) across members in a cluster. SQLFire uses a variant of the consistent hash algorithm to ensure that data is uniformly balanced across all members of the target server group. How Table Partitioning Works You specify the partitioning strategy of a table in the PARTITION BY clause of the CREATE TABLE statement.
PAGE 78
Managing Your Data in vFabric SQLFire Figure 2: Partitioned Table Data Although each bucket is assigned to one or more specific servers, you can use a procedure to relocate buckets in a running system, in order to improve the utilization of resources across the cluster. See Rebalancing Partitioned Data on SQLFire Members on page 66. Understanding Where Data Is Stored SQLFire uses a table's partitioning column values and the partitioning strategy to calculate routing values (typically integer values).
PAGE 79
Partitioning Tables Figure 3: Partitioned Data in Buckets If you set the redundant-copies for the table to be greater than zero, SQLFire designates one of the copies of each bucket as the primary copy. All writes to the bucket go through the primary copy. This ensures that all copies of the bucket are consistent.
PAGE 80
Managing Your Data in vFabric SQLFire Creating Partitioned Tables You create a partitioned table on a set of servers identified by named server groups (or on the default server group if you do not specify a named server group). Clauses in the CREATE TABLE statement determine how table data is partitioned, colocated, and replicated across the server group. This topic focuses on the partitioning_clause in CREATE TABLE. The CREATE TABLE reference page describes all of the options in more detail.
PAGE 81
Partitioning Tables PARTITION BY RANGE ( column-name ) ( VALUES BETWEEN value AND value [ , VALUES BETWEEN value AND value ]* ) | PARTITION BY LIST ( column-name ) ( VALUES ( value [ , value ]* ) [ , VALUES ( value [ , value ]* ) ]* ) | PARTITION BY ( expression ) } [ COLOCATE WITH ( table-name [ , table-name ] * ) ] } [ [ [ [ REDUNDANCY integer-constant ] BUCKETS integer-constant ] MAXPARTSIZE integer-constant ] RECOVERYDELAY integer-constant ] Note: If the table has no primary key, then SQLFire generat
PAGE 82
Managing Your Data in vFabric SQLFire Partition Based on Columns This statement creates a table that is partitioned by the "CustomerName" column. All rows with the same CustomerName are guaranteed to be colocated in the same process space. Here, the SERVER GROUPS clause determines the peers and servers that host data for the partitioned table. A server group is a subset of all the peers and servers that host data in the distributed system.
PAGE 83
Partitioning Tables ) PARTITION BY LIST ( Status ) ( VALUES ( 'pending', 'returned' ), VALUES ( 'shipped', 'received' ), VALUES ( 'hold' ) ); Partition Based on an Expression Expression partitioning partitions a table by evaluating a SQL expression that you supply.
PAGE 84
Managing Your Data in vFabric SQLFire You can colocate another table, "cities," using the command: CREATE TABLE CITIES ( CITY_ID INTEGER NOT NULL CONSTRAINT CITIES_PK Primary key, CITY_NAME VARCHAR(24) NOT NULL, COUNTRY VARCHAR(26) NOT NULL, AIRPORT VARCHAR(3), LANGUAGE VARCHAR(16), COUNTRY_ISO_CODE CHAR(2) CONSTRAINT COUNTRIES_FK REFERENCES COUNTRIES (COUNTRY_ISO_CODE) ) PARTITION BY COLUMN (COUNTRY_ISO_CODE) COLOCATE WITH (COUNTRIES) In this example, both "countries" and "cities" are partitioned using th
PAGE 85
Partitioning Tables Rebalancing is a SQLFire member operation that affects partitioned tables created in the cluster. Rebalancing performs two tasks: • If the a partitioned table's redundancy setting is not satisfied, rebalancing does what it can to recover redundancy. See Making a Partitioned Table Highly Available on page 66. • Rebalancing moves the partitioned table's data buckets between host members as needed to establish the best balance of data across the distributed system.
PAGE 86
Managing Your Data in vFabric SQLFire Configuring suspect-member Alerts When any member of the distributed system fails, it is important for other services to detect the loss quickly and transition application clients to other members. Any peer or server in the cluster can detect a problem with another member of the cluster, which initiates "SUSPECT" processing with the membership coordinator.
PAGE 87
Chapter 12 Replicating Tables SQLFire server groups control which SQLFire data store members replicate the table's data. SQLFire replicates table data both when a new table is initialized in a cluster and when replicated tables are updated. How SQLFire Replicates Tables SQLFire replicates every single row of a replicated table synchronously to each table replica in the target server group(s).
PAGE 88
Managing Your Data in vFabric SQLFire Also note that this version of SQLFire supports joins only on co-located data. Instead of using partitioning in all cases, you should consider having applications replicate smaller tables that are joined with other partitioned tables. Note: If multiple applications update the same row of a replicated table at the same time outside of a transaction, the table data can become out of sync when SQLFire replicates those updates.
PAGE 89
Chapter 13 Estimating Memory Requirements Designing a SQLFire database also involves estimating the memory requirements for your data based on the size of the actual table values and indexes, the overhead that SQLFire requires for your data, and the overall usage pattern for your data. You can estimate the memory requirements for tables using general guidelines for SQLFire overhead. Or, you can load tables with representative data and then query the SQLFire SYS.
PAGE 90
Managing Your Data in vFabric SQLFire Viewing Memory Usage in SYS.MEMORYANALYTICS SQLFire includes instrumentation to display the memory used by individual tables and indexes in a SQLFire member. You can view this memory usage information by starting a Java agent process when you boot SQLFire members, and then querying the SYS.MEMORYANALYTICS virtual table from a client connection. Enabling SYS.MEMORYANALYTICS Follow this procedure to enable the SYS.MEMORYANALYTICS virtual table.
PAGE 91
Estimating Memory Requirements APP.FLIGHTS.DESTINDEX (Index Entry Size, Value Size, Row Count) |dyozie-e4310(6880):3439/59731 |4888,3784,87 [...] Understanding Table and Index Values Querying the SYS.MEMORYANALYTICS table provides run-time information about the tables and indexes available in SQLFire. Table Values Each table has a single row in SYS.MEMORYANALYTICS identitfied with the SQLENTITY format: schema_name.table_name (Entry Size, Value Size, Row Count).
PAGE 92
Managing Your Data in vFabric SQLFire For example, the following row from SYS.MEMORYANALYTICS shows that APP.FLIGHTS.DESTINDEX uses 2803 bytes for skip lists: SQLENTITY |ID |MEMORY ---------------------------------------------------------------APP.FLIGHTS.DESTINDEX (Index Entry Overhead, SkipList Size, Max& |dyozie-e4310(6880):3439/59731 |2104,696,3 (2803 = 2.74 kb) The second row for an index uses the SQLENTITY format: schema_name.table_name.index_name (Index Entry Size, Value Size, Row Count).
PAGE 93
Estimating Memory Requirements APP.FLIGHTS (sqlfire,gemfire,others) |dyozie-e4310(6880):3439/59731 |12912,299736,680704 (993352 = 970.07 kb) APP.FLIGHTS.DESTINDEX (sqlfire,gemfire,others) |dyozie-e4310(6880):3439/59731 |5072,0,3288 (8360 = 8.16 kb) [...
PAGE 94
PAGE 95
Chapter 14 Using Disk Stores to Persist Data You can persist table data to disk as a backup to the in-memory copy, or overflow table data to disk when memory use gets too high. Overview of Disk Stores The two disk store options, overflow and persistence, can be used individually or together. Overflow uses disk stores as an extension of in-memory table management for both partitioned and replicated tables. Persistence stores a redundant copy of all table data managed in a peer.
PAGE 96
Managing Your Data in vFabric SQLFire for persisting data. If you use a peer client to execute DDL statements that require persistence and there are no data stores available in the distributed system, SQLFire throws a data store unavailable exception (SQLState: X0Z08). You must start locators and data stores before starting peer clients in your distributed system.
PAGE 97
Using Disk Stores to Persist Data Disk Store Directories When you create a disk store, optionally specify the location of directories where sqlf stores persistence-related files. sqlf generates disk store artifacts in a directory that it chooses in the following order: 1. If you provide absolute directory paths, sqlf uses the paths as-is. You must ensure that the proper directory structure exists. 2.
PAGE 98
Managing Your Data in vFabric SQLFire First Part of File Name: Usage Identifier Usage Identifier Values Usage Examples OVERFLOW Oplog data from overflow tables and queues OVERFLOWoverflowDS1_1.crf only. BACKUP Oplog data from persistent and persistent+overflow tables and queues. BACKUPoverflowDS1.if, BACKUPSQLF-DEFAULT-DISKSTORE.if DRLK_IF Access control - locking the disk store. DRLK_IFoverflowDS1.lk, DRLK_IFSQLF-DEFAULT-DISKSTORE.
PAGE 99
Using Disk Stores to Persist Data File Extension Values Usage Notes drf Oplog: delete operations Pre-allocated 10% of the total max-oplog-size at creation. krf Oplog: key and crf offset information Created after the oplog has reached the max-oplog-size. Used to improve performance at startup. Disk Store Operation Logs At creation, each operation log is initialized at the disk store's MAXLOGSIZE value, with the size divided between the crf and drf files.
PAGE 100
Managing Your Data in vFabric SQLFire Pooling Each SQLFire member manages its own disk store, and no disk contention exists between processes. Each partition can manage its data on local disks. If the application "write" load can be uniformly balanced across the cluster, the aggregate disk throughput is equal to the disk transfer rate time the number of partitions, assuming a single disk per partition. Disk transfer rates can be up to 100MB/sec on commodity machines today.
PAGE 101
Using Disk Stores to Persist Data agree with each other. Each member of the distributed system logs membership changes to its persistent files and uses this information during recovery to determine the replica that has the latest changes, and members automatically synchronize these changes at startup. Guidelines for Designing Disk Stores Work with your system designers and developers to plan for disk storage requirements in testing and production database systems.
PAGE 102
Managing Your Data in vFabric SQLFire Related Topics CREATE TABLE on page 476 Persist Table Data to a Disk Store on page 84 Evicting table data from memoryUse eviction to remove table data from memory, or to persist the overflow data in a disk store. Creating a Disk Store or Using the Default You can create a disk store for persistence and/or overflow or use the default disk store. Data from multiple tables can be stored in the same disk store.
PAGE 103
Using Disk Stores to Persist Data Note: When you specify asynchronous persistence, asynchronous writes to the disk store use certain disk store attributes. See Disk Store Persistence Attributes on page 79. Related Topics CREATE DISKSTORE on page 465 CREATE TABLE on page 476 Evicting table data from memoryUse eviction to remove table data from memory, or to persist the overflow data in a disk store.
PAGE 104
Managing Your Data in vFabric SQLFire Startup Process When you start a member with disk stores, the stores are loaded back into memory to initialize the member’s table data. Note: Peer clients rely on data stores for persistence. See Peer Client Considerations for Persistent Data on page 84. If the member does not hold all of the most recent data in the system: 1. The member does not immediately join the server group, but waits for the member with the most recent data.
PAGE 105
Using Disk Stores to Persist Data 1. MA is started first. SQLFire recognizes that MA does not have the most recent disk store data and waits for MB to start before creating TableP in MA. 2. MB is started. SQLFire recognizes MB as having the most recent disk data for TableP and initializes it from disk. 3. MA recovers its TableP data from disk and updates it as needed from the data in MB.
PAGE 106
Managing Your Data in vFabric SQLFire Note: Each of these commands operates either on the online disk stores or offline disk stores, but not both. sqlf Command Online or Offline Command See . . .
PAGE 107
Using Disk Stores to Persist Data Example sqlf validate-disk-store ds1 hostB/bupDirectory /partitioned_table entryCount=6 bucketCount=10 Disk store contains 1 compactable records. Total number of table entries in this disk store is: 6 Compacting Disk Store Log Files You can configure automatic compaction for an operation log based on percentage of garbage content. You can also request compaction manually for online and offline disk stores.
PAGE 108
Managing Your Data in vFabric SQLFire Offline compaction runs essentially in the same way, but without the incoming DML operations. Also, because there is no current open log, the compaction creates a new one to get started. Run Online Compaction Old log files become eligible for online compaction when their garbage content surpasses a configured percentage of the total file. A record is garbage when its operation is superseded by a more recent operation for the same record.
PAGE 109
Using Disk Stores to Persist Data You must provide all of the directories in the disk store. If no oplog max size is specified, SQLFire uses the system default. Offline compaction can take a lot of memory. If you get a java.lang.OutOfMemory error while running this, you made need to increase your heap size. See the sqlf command help for instructions on how to do this.
PAGE 110
Managing Your Data in vFabric SQLFire drwxrwxr-x bash-2.05$ 2 jpearson users 512 Apr 7 15:09 . Backing Up and Restoring Online Disk Stores When you invoke the sqlf backup command, SQLFire backs up disk stores for all members that are running in the distributed system at that time. Each member with persistent data creates a backup of its own configuration and disk stores.
PAGE 111
Using Disk Stores to Persist Data If the operation does not succeed at backing up all known members, you see a message like this: Connecting to distributed system: locators=warsaw.vmware.com[26357] The following disk stores were backed up: DiskStore at hosta.vmware.com /home/dsmith/dir1 DiskStore at hostb.vmware.com /home/dsmith/dir2 The backup may be incomplete. The following disk stores are not online: DiskStore at hostc.vmware.
PAGE 112
Managing Your Data in vFabric SQLFire To restore a backup of an offline system: 1. Make sure the system is either down or not using the directories you will use for the restored files. 2. Make sure your members are configured to use the directories where you put the files. 3. Reverse your backup file copy procedure, copying all the backed up files into the directories you want to use. 4. Start the system members.
PAGE 113
Using Disk Stores to Persist Data table are online, because their offline status may be causing PartitionOfflineExceptions in cache operations or preventing the system from satisfying redundancy. Example: sqlf list-missing-disk-stores Connecting to distributed system: mcast=/239.192.81.2:12348 DiskStore at straw.gemstone.com /export/straw3/users/jpearson/testGemFire/hostB/DS1 Note: Make sure this sqlf call can find a sqlfire.properties file to locate the distributed system.
PAGE 114
PAGE 115
Chapter 15 Exporting and Importing Data with vFabric SQLFire You can use either sqlf commands or Apache DdlUtils to import data into SQLFire. SQLFire installs and uses DdlUtils for sqlf import and export functionality. Using SQLF Commands to Export and Import Data The sqlf utility provides several commands that use the DdlUtils 1.1 API to export and import database schemas and table data. You can use these sqlf commands with SQLFire and other JDBC datasources.
PAGE 116
Managing Your Data in vFabric SQLFire Export, Alter, and Import a Database Schema Using SQLF Use sqlf commands to export the schema and data from a third-party database, and then import the schema and data to vFabric SQLFire. Procedure 1. To use the sqlf export commands with a third-party database, you require a JDBC driver and connection URL for the database. Use a Java client application such as SQuirreL SQL to verify that you can connect to the database.
PAGE 117
Exporting and Importing Data with vFabric SQLFire -url=jdbc:mysql://localhost:3306/sqlfdb?user=username&password=password 7. Use the sqlf write-data-to-db command and specify both the data XML file and the schema XML file to import the data to SQLFire: sqlf write-data-to-db -files=mysql-data.xml -schema-files=mysql-schema.xml -client-bind-address=localhost -client-port=1527 8. If necessary, manually add triggers, views, and Java stored procedures in the SQLFire database.
PAGE 118
Managing Your Data in vFabric SQLFire 4. Make sure that you can connect to the external database and to SQLFire servers. You will need to supply the connection URL and properties for both databases when using DdlUtils. 5. Edit the build.xml file in the ddlutils/example directory to specify the JDBC connection information for both SQLFire and the database from which you are importing data. The example build file contains sample entries for SQLFire and MySQL.
PAGE 119
Exporting and Importing Data with vFabric SQLFire --> PAGE 120
Managing Your Data in vFabric SQLFire