
Monthly Archives: February 2012

Four Easy Steps to Deploy Enterprise Java Apps with Tier 3’s Starter Kit Blueprint for vFabric™

Are you looking for a quick and easy way to deploy a small vFabric environment for your Java apps in just a few clicks – without the need to troubleshoot?

Tier 3, a VMware vCloud-powered service provider, has announced access to its Starter Kit Blueprint for VMware vFabric, which comes with four virtual machine instances and advanced vFabric application services. Each of the servers is configured with four gigabytes of memory and two virtual CPUs (vCPUs) by default, but users can change the configuration easily in the Tier 3 Control Portal after deployment.
Tier 3’s vFabric Starter Kit includes the four preconfigured servers and the vFabric application services described above.

Deployment is simple and can be done in four steps:

  1. Log on to the Tier 3 Control Portal, go to Blueprints, and search for “vFabric Starter Kit”.
  2. Click on the vFabric Starter Kit Blueprint and on the next screen click Deploy Blueprint.
  3. Customize the blueprint by specifying a server password, the network to place the servers on, and the primary and secondary DNS servers.
  4. Deploy the customized vFabric Starter Kit blueprint by clicking the Deploy Blueprint button at the bottom of the screen.

The platform will now do all the orchestration and deploy a fully working vFabric environment for you. To access the environment, all you have to do is log on via VPN (Tier 3 has instructions here).

For a full overview of an install with vFabric Starter Blueprint, watch a video demo taking you from deployment to configuration with Tier 3 product manager Shantu Roy.

Getting started

Getting started is simple. Go to Tier 3 and activate your account, which will give you access to this Blueprint.

How VMware vFabric GemFire reduced a data-intensive batch job’s processing time from 15 HOURS to 19 MINUTES

The problem:

    The great majority of data-intensive jobs run on databases powered by high-end servers (sometimes even mainframes). From a traditional data processing perspective, all the data is kept on a single database instance, or on multiple instances that all share the same data (share-everything, no data partitioning).

 

(Figure: the traditional share-everything architecture)

 

 This approach remained valid for years, and the legacy is there to prove it. However, with the data explosion of the last decade – and especially the last few years – many of those jobs are no longer meeting their goals. Because of the relational model, this huge growth in data usually brings exponential increases in the processing time of these jobs, especially those that have to iterate or search through all the available data. Jobs that handled around 1 million records a few years ago are now handling tens of millions of the same records, and processing times have jumped from minutes or a few hours to many hours or even days.

    This problem is especially common in industries such as:
    – Financial institutions – where risk management, trade positioning, cash flow, and other critical jobs run overnight.
    – Telcos and other service providers – running billing batch jobs.
    – CRM systems in any industry – requiring consolidation of customer bases with external systems.
 

    The cost of delaying these jobs varies, but it always has an impact most enterprises can't afford – anything from delaying the morning opening of your bank branches to fitting fewer days into your billing cycle (and so hurting cash flow).

     As traditional databases only scale well vertically, the database vendors' approach is to scale the hardware vertically, moving the database to a larger machine – which is of course extremely expensive. Because of that, I've seen customers buy machines with up to 2 terabytes of RAM and dozens of CPUs just to run their business-critical batches faster.

 

So, what's the alternative?

    To handle big data processing we need another approach. The share-everything model is clearly limited in scalability and needs distributed locks to keep the multiple servers synchronized on inserts and updates.

     Instead, GemFire suggests a data partitioning strategy, where the data is divided among the servers that compose a GemFire Distributed System. This strategy is of course combined with replication for maximum availability. The idea is that data records are divided between GemFire servers in such a way that each server can run its data functions independently, based only on the data it currently hosts.

     So, as an example, take a batch job that iterates over 100 million records: we could partition this data across 10 nodes so that each one handles only 10 million. Adding another 10 nodes would automatically re-partition the data so that each node handles 5 million records, and so on. In batch processing, the function can be distributed among all the members such that each server processes only the records it hosts, as in the sketch below. We scale not only by having multiple processors working in parallel, but also by dividing the big data into smaller chunks of records, so each server processes a much smaller data set.
     This becomes even more interesting when you consider that each of these servers can be a standard commodity x86 machine – usually many times cheaper, even combined, than a single instance of the big machines used to host enterprise databases. This way we scale horizontally, saving costs and improving performance at the same time, while scaling linearly.
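
As a concrete illustration of this pattern, here is a minimal sketch using GemFire's function execution API. It is only a sketch: the region name ("records"), the value types, and the elided business rules are invented for illustration, not taken from the PoC. The function runs on every member hosting the partitioned region, and each member iterates only over its locally hosted entries.

import com.gemstone.gemfire.cache.Region;
import com.gemstone.gemfire.cache.execute.FunctionAdapter;
import com.gemstone.gemfire.cache.execute.FunctionContext;
import com.gemstone.gemfire.cache.execute.RegionFunctionContext;
import com.gemstone.gemfire.cache.partition.PartitionRegionHelper;

public class BatchFunction extends FunctionAdapter {

  @Override
  public void execute(FunctionContext context) {
    RegionFunctionContext rfc = (RegionFunctionContext) context;
    // Only the partition buckets hosted on this member are visible here,
    // so each server processes just its own slice of the data.
    Region<Object, Object> localData = PartitionRegionHelper.getLocalDataForContext(rfc);
    long processed = 0;
    for (Object record : localData.values()) {
      // ... apply the batch business rules to this record ...
      processed++;
    }
    rfc.getResultSender().lastResult(processed);
  }

  @Override
  public String getId() { return "BatchFunction"; }

  @Override
  public boolean hasResult() { return true; }
}

A client would then kick the batch off with FunctionService.onRegion(records).execute(new BatchFunction()) and sum the per-member counts from the returned ResultCollector; adding members automatically rebalances the buckets and shrinks each member's share of the work.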

The proof of concept

    The proof of concept was conducted with a potential customer – a big healthcare company from Latin America – that handles millions of customers and sees its data grow about 30% per year. Due to its business model, legal affairs, and regulatory/compliance needs, the company must run a customer reconciliation process as frequently as possible in order to avoid fraud and unnecessary payments.

     This process, which a few years ago took a few hours to complete, now takes 10-15 hours due to the huge increase in data in recent years. All possible tuning was done on both the RDBMS and the data structures, but the results didn't change much, and the customer is currently only able to run the process twice a month because other batches use the same database.

     At the moment we came into this opportunity for the GemFire PoC, the customer was planning to grow its database machine vertically once more – to a multi-million-dollar server – and of course pay for many more licenses of its database management system, since it is licensed per CPU.

     The GemFire proof of concept goals, over 5 working days, were:
     – Migrate the current batch job from PL/SQL to Java, accessing the GemFire API.
     – Show impressive performance gains on commodity x86 servers, running production data.
     – Prove linear horizontal scalability, showing performance gains as more servers were added to the environment.
     – Prove high availability of the solution.

The solution environment

      The batch job, which had around 700 lines of PL/SQL at its core, was migrated to Java in a 16-hour work effort. Partitioning and replication strategies were defined to best leverage GemFire's benefits while keeping the solution highly available.

       The solution was deployed on 16 servers, each with 2 vCPUs (Xeon 3.07 GHz) and 8 GB of RAM, allocated in a stepped approach to show scalability. As we were using production data, the results were compared with those from the production system to make sure there were no mistakes in the implementation.

 

Results

       The GemFire distributed function (the migrated batch) was executed against 6, 8, 12, and 16 servers of the configuration above, with the following results:

    6 server nodes – 128 minutes

    8 server nodes – 61 minutes, or ~50% better than with 6 nodes (after adding 2 nodes, or 33% more capacity)
    12 server nodes – 29 minutes, or ~50% better than with 8 nodes (after adding 4 nodes, or 50% more capacity)
    16 server nodes – 19 minutes, or ~30% better than with 12 nodes (after adding 4 nodes, or ~30% more capacity)
   
    These impressive results demonstrated extreme performance with linear horizontal scalability, as well as high availability: each server kept two copies of its data on other members of the cluster, and the many tests we conducted taking members down didn't affect the results or force a rollback in any case.
     For the customer, this means those jobs can now run in a much less expensive environment and in a few minutes, enabling them to run more frequently and increase business income.
   
Conclusion

    GemFire can be used as a data fabric and data grid solution to migrate big data processing functions (such as batch jobs running on a database, or mainframe programs) while dramatically reducing processing time and saving resources on both hardware and database licenses.
    The cost of modifying source code is usually extremely small compared to the money most companies are losing to inefficient data handling processes, and it is typically paid back within the first weeks after the project's implementation.

   As you might know, this is only one of the various use cases for the GemFire data fabric. The extreme low-latency, high-throughput cases were covered in previous posts.

 

Insight Plugin Development Rock’n It

Interested in creating a plugin for vFabric Spring Insight, but don't know where to start?

Jeroen Reijn wrote a great blog on how to create a plugin for the Hippo Site Toolkit:
http://blog.jeroenreijn.com/2011/10/get-in-control-with-spring-insight.html

By chance this coincides with a webinar I'm (John Kew) doing this Thursday on plugin development, based on a similar talk I did with Gary Russell at SpringOne2GX. You can register for the webinar at: http://www.springsource.com/webinars

We are working furiously towards our 1.8 release for AppInsight 1.1 and vFabric 5.1. We also have a bunch of work going into the next major 2.0 release of Insight. We have a great number of new plugins and features coming. Get ready for some sweet Insighty action!

Using GemFire to Offload Data from Mainframes

Many different reasons exist for why customers want to migrate from the mainframe to a modern platform. Reducing cost is obviously one of the main ones, and it basically comes in different forms:

     - Reducing load (MIPS)

     - Reducing space, and sometimes power usage, in data centers

     - Increasing the consolidation ratio

     - Being able to run on commodity, much less expensive hardware

     - Using a more productive development environment

     - Employing fewer specialized developers, since mainframe developers grow rarer (and so more expensive) every day

Other reasons may be related to time-to-market (how can I change my mainframe application overnight to comply with new market regulations, or to support the new product launching next week?) or to eliminating vendor lock-in.

Traditional approach

Regardless, many customers are still very cautious about offloading from the mainframe, and this is understandable. These legacy applications still run the core business for many of them, and although they may cost a lot to maintain, they are usually very reliable. So the main strategy has been to write new applications (for example, to serve new devices or products) on a modern environment and have those applications access the mainframe to read and write the data related to the core business – which is still kept on the mainframe.

Although this strategy can sometimes speed up the development of new systems, the mainframe is still needed for the vast majority of data access, and MIPS usage is usually *not* reduced. In fact, it can increase as new users (and business transactions) arrive on new channels and devices, which in the end always access mainframe data.

This approach also creates longer-term problems. Data is segregated into two different models – the legacy mainframe model and the new modernized model – which coexist but drift further apart every day. Complex (and costly) hooks must be written in the applications to convert between the new data model and the legacy mainframe model. Synchronization of the data is also a challenge and frequently causes issues, leading customers to lose confidence in the new platforms and become even more wary of offloading from the trusted mainframe.

Offloading from the mainframe using GemFire

Based on this, a new strategy has been used by many customers worldwide to successfully migrate from the mainframe to a much more cost-effective – but still extremely reliable – modern platform, with close-to-real-time performance, in an incremental, step-by-step approach. This not only allows those companies to modernize their development environment but, more importantly, greatly reduces their MIPS usage, allowing them to even fully migrate off the mainframe once they are sufficiently comfortable with it.

Using the GemFire data fabric platform – based on an elastic, high-performance data grid model – customers can build a distributed, horizontally scalable data access layer on top of their legacy platforms (e.g. the mainframe). Data can be loaded from the mainframe (or any other legacy platform) and written back to it as needed, while transactions run at microsecond latencies using the distributed memory of the GemFire data grid cluster members – far faster and more scalable than traditional transactions based on disk persistence. Replication between server peers during transactions is transparent and scalable, and guarantees as much transaction consistency and durability as needed. Although data is written to disk asynchronously (and so does not depend on poor disk I/O throughput), it is replicated through the memory of the participating peers, so the chance of data loss is limited to a catastrophic failure (a complete data-center failure) – in other words, as small as with a traditional disk-persistent database or mainframe approach. GemFire also takes care of reliable data-center replication over the WAN – possibly to an alternate data center (either backup or active-active) – guaranteeing geographical redundancy.
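
As a minimal sketch of the configuration involved, here is a hypothetical server-side partitioned region with redundancy, using the GemFire Java API (the region name and types are invented; real deployments would typically declare this in cache.xml instead):

import com.gemstone.gemfire.cache.Cache;
import com.gemstone.gemfire.cache.CacheFactory;
import com.gemstone.gemfire.cache.PartitionAttributesFactory;
import com.gemstone.gemfire.cache.Region;
import com.gemstone.gemfire.cache.RegionShortcut;

public class GridServer {
  public static void main(String[] args) {
    Cache cache = new CacheFactory().create();
    // Partitioned region: entries are spread across all members, and each
    // entry keeps one synchronous in-memory backup on a different member.
    Region<String, Object> customers = cache
        .<String, Object>createRegionFactory(RegionShortcut.PARTITION)
        .setPartitionAttributes(new PartitionAttributesFactory<String, Object>()
            .setRedundantCopies(1)
            .create())
        .create("customers");
    System.out.println("Hosting " + customers.getFullPath());
  }
}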

This way, the mainframe can still be used to load the legacy data from, and as an archival data store, but it does not necessarily participate in any transactions (although it can in particular cases). This immediately speeds transactions up to memory and local-network rates and enables on-demand horizontal scaling, while reducing the load on the mainframe. GemFire guarantees the data will still be written to the mainframe asynchronously (usually in batches, on a sub-second basis) when needed, thus creating no challenge for the other legacy applications that still rely on the legacy data store.

Consistency between the legacy data store and the data fabric is kept using events that GemFire triggers on each data access. As an example, data can also be written to the mainframe to keep it synchronized each time it is inserted or updated in GemFire. Data can likewise be loaded from the mainframe on a schedule, or whenever a value is not found in the data fabric. In the other direction, a change to data kept in the legacy data store can be sent to a queue or trigger a function to let GemFire know a value has changed.
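
A minimal sketch of what these hooks can look like with the GemFire API – the Customer type and the MainframeGateway helper are hypothetical stand-ins for the real mainframe access code. A CacheLoader handles read-through on a cache miss, and a CacheListener pushes creates and updates back toward the legacy store:

import com.gemstone.gemfire.cache.CacheLoader;
import com.gemstone.gemfire.cache.CacheLoaderException;
import com.gemstone.gemfire.cache.EntryEvent;
import com.gemstone.gemfire.cache.LoaderHelper;
import com.gemstone.gemfire.cache.util.CacheListenerAdapter;

// Read-through: invoked by GemFire whenever a get() misses in the region.
public class MainframeLoader implements CacheLoader<String, Customer> {
  public Customer load(LoaderHelper<String, Customer> helper) throws CacheLoaderException {
    return MainframeGateway.fetchCustomer(helper.getKey()); // hypothetical gateway
  }
  public void close() { }
}

// Write-back: invoked after an entry is created or updated in the region,
// so the change can be queued for asynchronous writing to the mainframe.
class MainframeSyncListener extends CacheListenerAdapter<String, Customer> {
  @Override
  public void afterCreate(EntryEvent<String, Customer> event) {
    MainframeGateway.enqueueWrite(event.getKey(), event.getNewValue());
  }
  @Override
  public void afterUpdate(EntryEvent<String, Customer> event) {
    MainframeGateway.enqueueWrite(event.getKey(), event.getNewValue());
  }
}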

The same events can be used to notify client applications of simple changes to values stored in the data fabric, or even of changes matching complex criteria (that is, a continuously running server-side query).

A combination of these data events – based on either simple or complex criteria, and able to trigger further events – can be seen as an embedded, data-friendly Complex Event Processing (CEP) platform, on which an extremely valuable real-time, on-demand business data platform can be built.
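
As a sketch of such a continuously running query using GemFire's continuous-query API (the /payments region, the field name, the threshold, and the server port are all illustrative assumptions): a client registers an OQL query with a listener, and the servers push an event whenever a new or updated entry matches.

import com.gemstone.gemfire.cache.client.ClientCache;
import com.gemstone.gemfire.cache.client.ClientCacheFactory;
import com.gemstone.gemfire.cache.query.CqAttributesFactory;
import com.gemstone.gemfire.cache.query.CqEvent;
import com.gemstone.gemfire.cache.query.CqListener;
import com.gemstone.gemfire.cache.query.CqQuery;
import com.gemstone.gemfire.cache.query.QueryService;

public class LargePaymentWatcher {
  public static void main(String[] args) throws Exception {
    ClientCache cache = new ClientCacheFactory()
        .addPoolServer("localhost", 40404)
        .setPoolSubscriptionEnabled(true) // CQ events arrive on the subscription channel
        .create();

    CqAttributesFactory caf = new CqAttributesFactory();
    caf.addCqListener(new CqListener() {
      public void onEvent(CqEvent e) {
        System.out.println("Matched: " + e.getNewValue());
      }
      public void onError(CqEvent e) { }
      public void close() { }
    });

    QueryService qs = cache.getQueryService();
    // Server-side continuously running query: fires for every new or
    // updated entry in /payments whose amount exceeds the threshold.
    CqQuery cq = qs.newCq("largePayments",
        "select * from /payments p where p.amount > 10000", caf.create());
    cq.execute();
  }
}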

However, probably one of the most exciting characteristics of this approach is that the data fabric runs on commodity hardware and scales horizontally on a linear basis. This means customers can start with very small environments and add more servers when needed or desired, which immediately increases not only the memory capacity but also the processing power, since GemFire works as a grid computing platform, distributing the processing among peers (Read: Running jobs on mainframe < link to other article>). Most cases report transaction throughput improvements on the order of hundreds to thousands of times, and jobs that traditionally ran in hours speeding up to a few minutes or even seconds.

Based on this, GemFire is suitable for basically two different use cases when offloading from the mainframe:

     - Low-latency, high-volume data transactions

     - Long-running, data-intensive jobs – such as batches running overnight

 Wrap-up 

GemFire has been used with great success as the distributed data platform for the core business of large enterprises all over the world for the last decade, in important industries like financial trading and telco prepaid real-time charging. It has a very important fit in VMware's Cloud Application Platform offering, solving a number of challenges for data in the modern world, such as the classic horizontal scalability problem of relational databases, the disk I/O bottleneck, big data / data explosion, and scalable access to legacy systems. Return on investment for such projects has been as fast as a few months, based on large platform cost savings and the business advantage achieved.

Most customers start by using GemFire on top of their legacy platforms (e.g. traditional RDBMSs, mainframes, file-based persistence) to immediately gain a dramatic performance increase in their transactions. Over time, they gradually modernize their applications to access data directly from GemFire, and some of them eventually realize they don't need their legacy platforms anymore.

 

Java Stored Procedure Performance — Myths vs. Facts

“Java Stored Procedures will never be as fast as SQL Stored Procedures” is a common refrain from database administrators, based on their experience with stored procedures in “normal” relational databases. Most databases are written in a low-level language such as C. SQL is managed as a byte-code-compiled language, with the equivalent of a virtual machine optimized for SQL to retrieve data and execute the other instructions in the SQL language and its stored procedure extensions. To offer Java Stored Procedures, the database has to spin up a Java Virtual Machine (JVM) or use an existing one. Then the database must communicate with it across the JNI boundary or use some interprocess communication mechanism. Similarly, the code running in the JVM must send data back to the main database processes. There is likely a lot of bi-directional traffic as the database engine hands off data such as pointers to rows on disk, or requests to look up data. The alternative is to duplicate the indexes and other metadata used in the database in the Java memory space. All of this overhead guarantees that Java Stored Procedures will generally be much slower than SQL Stored Procedures in mainstream databases.

    SQLFire is different. It is written in Java. All of the memory is in the JVM's space. SQL statements are byte-compiled into Java structures. Java Stored Procedures run in the same memory and processing space in which database operations are performed. There is NO overhead for using them. In fact, building a non-Java stored procedure language for SQLFire would likely be slower than using Java as the stored procedure language, because those other instructions would have to be mapped into Java operations for the JVM.
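
To make that concrete, here is a minimal sketch of a SQLFire Java stored procedure (the orders table and its columns are invented for illustration). As in Derby, on which SQLFire's SQL engine is based, the procedure is just a public static method, and the nested connection URL jdbc:default:connection hands it an in-process connection to the data:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class OrderProcedures {
  // Runs inside the SQLFire JVM itself: no JNI boundary, no interprocess hop.
  public static void customerTotal(int customerId, double[] total) throws Exception {
    Connection conn = DriverManager.getConnection("jdbc:default:connection");
    PreparedStatement ps = conn.prepareStatement(
        "select sum(price * quantity) from orders where customer_id = ?");
    ps.setInt(1, customerId);
    ResultSet rs = ps.executeQuery();
    total[0] = rs.next() ? rs.getDouble(1) : 0.0;
  }
}

Registering it is standard DDL along the lines of CREATE PROCEDURE customer_total(IN customer_id INT, OUT total DOUBLE) LANGUAGE JAVA PARAMETER STYLE JAVA READS SQL DATA EXTERNAL NAME 'OrderProcedures.customerTotal'.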

    SQLFire provides other significant advantages for stored procedure performance. First, like other databases designed to be primarily in-memory, primary-key-based data access and index lookups result in direct pointers to rows in memory – no pulling data off pages of memory or disk. Row data is stored in a compact byte format, consisting of fixed-length fields followed by variable-length fields, to keep the footprint to a minimum. When operating on a ResultSet locally (or otherwise) you never get direct access to these bytes, and there is no deserialization; the overhead comes down to how you access the individual fields in the row. Note: JDBC provides a lot of flexibility with implicit type conversions.

    Another key advantage for stored procedures in SQLFire is that, unlike other memory-oriented databases, SQLFire is a distributed database, so stored procedures and query operations are parallelized across the grid. This means that if you have 10 SQLFire instances, stored procedures run in parallel across the subsets of the data on each member, resulting in ten times as many CPUs being involved in a single stored procedure invocation. There is some overhead in collating results (depending on what you are doing), so the system may only be 8 or 9 times faster than running the operation on a single SQLFire instance. Still, this is a substantial performance improvement.
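
A sketch of what fanning a procedure out across the grid can look like from a client, assuming SQLFire's data-aware procedure extension to CALL (ON TABLE ... WHERE routing). The recompute_totals procedure is hypothetical and the orders table is the invented one from the sketch above; consult the SQLFire documentation for the exact syntax:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;

public class ParallelBatch {
  public static void main(String[] args) throws Exception {
    // Thin client connection to any member of the distributed system.
    Connection conn = DriverManager.getConnection("jdbc:sqlfire://localhost:1527/");
    // ON TABLE ... WHERE routes the call to the members hosting the matching
    // rows, so the procedure body executes in parallel on each data subset.
    CallableStatement cs = conn.prepareCall(
        "CALL recompute_totals() ON TABLE orders WHERE customer_id BETWEEN 1 AND 1000000");
    cs.execute();
    conn.close();
  }
}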

    One of the concerns of traditional DBAs is that they are not Java programmers. SQLFire preserves the standard SQL language model for DBAs to perform all of their normal tasks, while enabling developers to move their logic to the database, where their code gains performance through parallelization and direct access to the data in memory. In this way the SQLFire stored procedure model is biased towards Java application developers who are used to running their code in application server JVMs (i.e., my code runs alongside the container in the same process). SQLFire gives them a mechanism to avoid moving data across the wire and lets their code run in situ with the data. Doing this yields impressive access latency, especially when iterating over sets. Using this sort of model, we have seen application performance gains of one or two orders of magnitude, i.e., 10x to 100x speed-ups.

vFabric now has a common language (Java) that can be used in both the app server and the database. This means that app developers can write stored procedures in a language they already understand, with a tool chain they already have.

How to offload data from an existing data store and load it into SQLFire

vFabric SQLFire is a memory-optimized, distributed SQL database delivering dynamic scalability and high performance for data-intensive modern applications. In this demo we load the Australian Football League 2012 fixture data into a SQLFire distributed in-memory database.

A common requirement when adopting SQLFire is offloading data from an existing data store and loading it into SQLFire. This example shows how you can load data into a SQLFire distributed system from a CSV file using Spring Batch, and then how to present the data from a simple Spring MVC application. Finally, the data is persisted on each SQLFire member to ensure we only need to load it once; it will be retained across restarts of the distributed system.
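
As a rough sketch of the loading step (this is not the demo's actual code – the column names, file name, and staging table layout are invented for illustration), Spring Batch's FlatFileItemReader tokenizes each CSV line into a FieldSet, which is then inserted into the staging table over a plain JDBC connection to SQLFire:

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.mapping.PassThroughFieldSetMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.core.io.FileSystemResource;
import org.springframework.jdbc.core.JdbcTemplate;

public class FixtureLoader {

  public static void load(JdbcTemplate sqlfire) throws Exception {
    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
    tokenizer.setNames(new String[] { "round", "homeTeam", "awayTeam", "venue", "gameDate" });

    DefaultLineMapper<FieldSet> lineMapper = new DefaultLineMapper<FieldSet>();
    lineMapper.setLineTokenizer(tokenizer);
    lineMapper.setFieldSetMapper(new PassThroughFieldSetMapper());

    FlatFileItemReader<FieldSet> reader = new FlatFileItemReader<FieldSet>();
    reader.setResource(new FileSystemResource("afl-2012-fixture.csv"));
    reader.setLineMapper(lineMapper);

    // Stream the CSV one line at a time into the temporary staging table.
    reader.open(new ExecutionContext());
    FieldSet row;
    while ((row = reader.read()) != null) {
      sqlfire.update(
          "insert into fixture_data (round, home_team, away_team, venue, game_date) values (?, ?, ?, ?, ?)",
          row.readInt("round"), row.readString("homeTeam"), row.readString("awayTeam"),
          row.readString("venue"), row.readString("gameDate"));
    }
    reader.close();
  }
}

The JdbcTemplate would be backed by a DataSource pointing at the SQLFire network server, e.g. on localhost:1527.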

Demo Setup

1. Download the zip file below and extract it onto your file system. The demo scripts are written for a Unix environment and were tested on a Mac. Please convert the SH files to BAT files if you're on Windows. The Ant commands will work whether you're on Windows or Unix.

afl2012.zip

Note: The downloaded zip provides the JAR files for Spring Core/Batch, so there is no need to download those.

2. Ensure you have JDK 6 in your path. The demo should work with JDK 1.5, but it was never tested with JDK 1.5.

[Mon Jan 30 09:55:39 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012 ] $ java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03-383-11A511)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02-383, mixed mode)

3. Ensure Apache Ant is installed.

[Mon Jan 30 09:55:33 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012 ] $ ant -version
Apache Ant(TM) version 1.8.2 compiled on June 3 2011

4. Verify you have SQLFire installed as shown below.

[Tue Feb 07 12:56:56 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101 ] $ d
total 8
drwxr-xr-x  10 papicella  staff  340 27 Jan 08:34 docs/
drwxr-xr-x   4 papicella  staff  136 27 Jan 08:34 bin/
drwxr-xr-x   6 papicella  staff  204 27 Jan 08:34 ../
drwxr-xr-x  15 papicella  staff  510 27 Jan 08:34 quickstart/
drwxr-xr-x  19 papicella  staff  646 27 Jan 08:34 lib/
drwxr-xr-x   7 papicella  staff  238 27 Jan 08:35 examples/
-rw-r--r--   1 papicella  staff  167 27 Jan 08:46 key.txt
drwxr-xr-x   9 papicella  staff  306 27 Jan 20:34 ./
drwxr-xr-x   7 papicella  staff  238  5 Feb 20:38 pasdemos/

Running the Demo

1. Ensure you have a clean environment by running "ant clean" as shown below.

[Mon Jan 30 09:59:12 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012 ] $ ant clean
Buildfile: /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/build.xml

clean:

BUILD SUCCESSFUL
Total time: 0 seconds

2. Edit setup.sh to ensure you have placed SQLFire into your path as shown below.

export PATH=/Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/bin:$PATH

export CUR_DIR=`pwd`

export CLASSPATH=

3. Set up your environment by sourcing setup.sh as shown below.

[Mon Jan 30 10:14:05 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012 ] $ . ./setup.sh

4. Run the "init" target to setup the required directories.

[Mon Jan 30 10:19:29 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012 ] $ ant init
Buildfile: /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/build.xml

init:
    [mkdir] Created dir: /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/server1
    [mkdir] Created dir: /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/server2
    [mkdir] Created dir: /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/classes
     [copy] Copying 1 file to /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/server1
     [copy] Copying 1 file to /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/server2

BUILD SUCCESSFUL
Total time: 0 seconds

5. Package the application using the "package" target as shown below.

[Mon Jan 30 10:24:30 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2011 ] $ ant package
Buildfile: /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/build.xml

init:

compile:
    [javac] Compiling 5 source files to /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/classes
     [copy] Copying 1 file to /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/classes
     [copy] Copying 1 file to /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/classes

package:
      [jar] Building jar: /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/lib/sqlfire-afl2012.jar

BUILD SUCCESSFUL
Total time: 0 seconds

6. Start 2 SQLFire members as follows. These members use a multicast port for peer discovery instead of a locator.

[Mon Jan 30 10:27:14 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2011 ] $ ./start-servers.sh
Starting SQLFire Server using multicast for peer discovery: 239.192.81.1[12333]
Starting network server for SQLFire Server at address localhost/127.0.0.1[1528]
Starting SQLFire Server using multicast for peer discovery: 239.192.81.1[12333]
Starting network server for SQLFire Server at address localhost/127.0.0.1[1527]
SQLFire Server pid: 657 status: running
Logs generated in /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/server1/sqlfserver.log
SQLFire Server pid: 658 status: running
  Distributed system now has 2 members.
  Other members: 10.117.85.24(657:datastore)<v0>:13038/49735
Logs generated in /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/server2/sqlfserver.log

7. At this point, let's ensure we can connect from a client using the script "test-connect.sh". If all went well, it should show today's date.

[Mon Jan 30 10:29:01 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2011 ] $ ./test-connect.sh
sqlf version 10.4
sqlf> sqlf> 1        
----------
2012-01-30

1 row selected
sqlf>
all done..

8. Let's view our 2 SQLFire members using the script "view-members.sh". This will show the PID of each member in the distributed system.

[Mon Jan 30 10:38:33 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2011 ] $ ./view-members.sh
sqlf version 10.4
sqlf> sqlf> > Id                                              |NetServers                    |PID       
-----------------------------------------------------------------------------------------------
10.117.85.24(657)<v0>:13038/49735               |localhost/127.0.0.1[1527]     |657       
10.117.85.24(658)<v1>:22550/49734               |localhost/127.0.0.1[1528]     |658       

2 rows selected
sqlf>
all done..

9. Create the schema and data by running "create-schema-data.sh" as shown below. This will create the necessary tables and populate some of them with known data.

[Mon Jan 30 10:33:09 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2011 ] $ ./create-schema-data.sh
……..
sqlf>
all done..

10. At this point we load a comma-separated CSV file into a temporary table using Spring Batch. This is the complete 2012 fixture data, which we will then massage into the real tables once it's loaded. Run the Ant target "run-loadfixture" as shown below.

[Mon Jan 30 10:41:00 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2011 ] $ ant run-loadfixture
Buildfile: /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/build.xml

init:

compile:

package:

run-loadfixture:
     [echo] Running load fixture with jvm args : -server -showversion -Xms512m -Xmx512m
     [java] java version "1.6.0_26"
     [java] Java(TM) SE Runtime Environment (build 1.6.0_26-b03-383-11A511)
     [java] Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02-383, mixed mode)
     [java]
     [java] Jan 30, 2012 10:41:04 AM org.springframework.context.support.AbstractApplicationContext prepareRefresh
     [java] INFO: Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@b32e13d: startup date [Mon Jan 30 10:41:04 EST 2012]; root of context hierarchy
     [java] Jan 30, 2012 10:41:04 AM org.springframework.beans.factory.xml.XmlBeanDefinitionReader loadBeanDefinitions
     [java] INFO: Loading XML bean definitions from class path resource [application-config.xml]
     [java] Jan 30, 2012 10:41:04 AM org.springframework.beans.factory.support.DefaultListableBeanFactory preInstantiateSingletons
     [java] INFO: Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@5e0feb48: defining beans [sqlfireUcpDataSource,fixtureDataReader]; root of factory hierarchy
     [java] Jan 30, 2012 10:41:05 AM vmware.au.spring.sqlfire.afl.LoadFixtureData run
     [java] INFO: Loaded 414 fixture_data records into sqlfire

BUILD SUCCESSFUL
Total time: 2 seconds

11. Now we massage the fixture data loaded from the CSV into the real application table called "games". We do this using the Ant target "run-loadgames" as shown below.

[Mon Jan 30 10:44:49 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2011 ] $ ant run-loadgames
Buildfile: /Users/papicella/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012/build.xml

init:

compile:

package:

run-loadgames:
     [echo] Running load into games with jvm args : -server -showversion -Xms512m -Xmx512m
     [java] java version "1.6.0_26"
     [java] Java(TM) SE Runtime Environment (build 1.6.0_26-b03-383-11A511)
     [java] Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02-383, mixed mode)
     [java]
     [java] Jan 30, 2012 10:46:05 AM org.springframework.context.support.AbstractApplicationContext prepareRefresh
     [java] INFO: Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@5afec107: startup date [Mon Jan 30 10:46:05 EST 2012]; root of context hierarchy
     [java] Jan 30, 2012 10:46:05 AM org.springframework.beans.factory.xml.XmlBeanDefinitionReader loadBeanDefinitions
     [java] INFO: Loading XML bean definitions from class path resource [application-config.xml]
     [java] Jan 30, 2012 10:46:06 AM org.springframework.beans.factory.support.DefaultListableBeanFactory preInstantiateSingletons
     [java] INFO: Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@5e0feb48: defining beans [sqlfireUcpDataSource,fixtureDataReader]; root of factory hierarchy
     [java] round=1, team=Richmond, id=1, opponent=Carlton, id=4, venue=MCG, id=1
     [java] round=1, team=Western Bulldogs, id=16, opponent=West Coast Eagles, id=15, venue=Etihad Stadium, id=3
…..
     [java] round=23, team=Hawthorn, id=9, opponent=West Coast Eagles, id=15, venue=MCG, id=1
     [java] Inserted 198 Games…

BUILD SUCCESSFUL
Total time: 5 seconds

Viewing the Data

Below are three different ways to view the data in the distributed system:

  1. Using the sqlf command line client
  2. Using DbVisualizer (RDBMS Graphical Tool)
  3. Spring MVC WebUI

1. Using the sqlf command line client

Here we simply connect using the command line client provided with SQLFire.

1.1. Type in the following: sqlf
1.2. Connect using: connect client 'localhost:1527';
1.3. Query the current list of teams using: select * from teams;

[Mon Jan 30 10:50:35 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012 ] $ sqlf
sqlf version 10.4
sqlf> connect client 'localhost:1527';
sqlf> select * from teams;
TEAM_ID    |NAME           |LONG_NAME                     |WEB_SITE                                         
-------------------------------------------------------------------------------------------------------------
18         |Giants         |GWS GIANTS                    |http://www.gwsgiants.com.au/                     
1          |Tigers         |Richmond                      |http://richmondfc.com.au/                        
16         |Bulldogs       |Western Bulldogs              |http://westernbulldogs.com.au/                   
12         |Power          |Port Adelaide                 |http://portadelaidefc.com.au/                    
5          |Pies           |Collingwood                   |http://collingwoodfc.com.au/                     
17         |Suns           |Gold Coast Suns               |http://goldcoastfc.com.au/                       
4          |Blues          |Carlton                       |http://carltonfc.com.au/                         
2          |Crows          |Adelaide Crows                |http://afc.com.au/                               
10         |Roos           |North Melbourne               |http://kangaroosfc.com.au/                       
13         |Saints         |St. Kilda                     |http://saints.com.au/                            
8          |Cats           |Geelong Cats                  |http://gfc.com.au/                               
14         |Swans          |Sydney Swans                  |http://sydneyswans.com.au/                       
15         |Eagles         |West Coast Eagles             |http://westcoasteagles.com.au/                   
11         |Demons         |Melbourne                     |http://demons.com.au/                            
7          |Dockers        |Fremantle                     |http://fremantlefc.com.au/                       
9          |Hawks          |Hawthorn                      |http://hawthornfc.com.au/                        
3          |Lions          |Brisbane Lions                |http://lions.com.au/                             
6          |Bombers        |Essendon                      |http://bombersfc.com.au/                         

18 rows selected

 

2. Using DbVisualizer (RDBMS Graphical Tool)

2.1. Set up DbVisualizer as shown in the blog entry below, which describes how to load the SQLFire client driver required to make a connection to our distributed system.

http://theblasfrompas.blogspot.com/2012/01/using-dbvisualizer-to-connect-to.html

2.2. Issue a query as follows to view games for "Richmond".

select g.round_ID, h.long_name "Home Team", a.long_name "Away Team", g.game_date, substr(v.name, 1, 20) "Venue", g.tv_network
from games g, teams h, teams a, venues v
where  (g.HOME_TEAM_ID = 1 or g.away_team_id = 1)
and g.home_team_id = h.team_id
and g.away_team_id = a.team_id
and g.venue_id = v.venue_id
order by g.round_id

Query Results

Img1

3. Spring MVC WebUI

In this example we create and start a tc Server instance named "sqlfire-demo" to host the web application (it is assumed you already have vFabric tc Server installed).

3.1. Download the zip file below onto your file system and unzip it to reveal the WAR file.

application.zip

3.2. Create an instance as shown below.

[Mon Jan 30 12:10:42 papicella@:~/vmware/software/tcserver/vfabric-tc-server-standard-2.6.3.RELEASE ] $ ./tcruntime-instance.sh create sqlfire-demo
Creating instance 'sqlfire-demo' …
  Using separate layout
  Creating bin/setenv.sh
  Applying template 'base'
    Copying template's contents
    Applying fragment 'context-fragment.xml' to 'conf/context.xml'
    Applying fragment 'server-fragment.xml' to 'conf/server.xml'
    Applying fragment 'web-fragment.xml' to 'conf/web.xml'
    Applying fragment 'tomcat-users-fragment.xml' to 'conf/tomcat-users.xml'
    Applying fragment 'catalina-fragment.properties' to 'conf/catalina.properties'
  Applying template 'base-tomcat-7'
    Copying template's contents
    Applying fragment 'server-fragment.xml' to 'conf/server.xml'
    Applying fragment 'web-fragment.xml' to 'conf/web.xml'
    Applying fragment 'catalina-fragment.properties' to 'conf/catalina.properties'
  Applying template 'bio'
    Copying template's contents
    Applying fragment 'server-fragment.xml' to 'conf/server.xml'
  Configuring instance 'sqlfire-demo' to use Tomcat version 7.0.23.A.RELEASE
  Setting permissions
Instance created
Connector summary
  Port: 8080   Type: Blocking IO   Secure: false

3.3.  Start the instance as shown below.

[Mon Jan 30 12:31:44 papicella@:~/vmware/software/tcserver/vfabric-tc-server-standard-2.6.3.RELEASE ] $ ./tcruntime-ctl.sh sqlfire-demo start
INFO Instance name:      sqlfire-demo
INFO Script directory:   /Users/papicella/vmware/software/tcserver/vfabric-tc-server-standard-2.6.3.RELEASE
INFO tc Runtime location:/Users/papicella/vmware/software/tcserver/vfabric-tc-server-standard-2.6.3.RELEASE
INFO Instance base:      /Users/papicella/vmware/software/tcserver/vfabric-tc-server-standard-2.6.3.RELEASE
INFO Binary dir:         /Users/papicella/vmware/software/tcserver/vfabric-tc-server-standard-2.6.3.RELEASE/tomcat-7.0.23.A.RELEASE
INFO Runtime version:    7.0.23.A.RELEASE
INFO Script version:     2.6.3.RELEASE
Using CATALINA_BASE:   /Users/papicella/vmware/software/tcserver/vfabric-tc-server-standard-2.6.3.RELEASE/sqlfire-demo
Using CATALINA_HOME:   /Users/papicella/vmware/software/tcserver/vfabric-tc-server-standard-2.6.3.RELEASE/tomcat-7.0.23.A.RELEASE
Using CATALINA_TMPDIR: /Users/papicella/vmware/software/tcserver/vfabric-tc-server-standard-2.6.3.RELEASE/sqlfire-demo/temp
Using JRE_HOME:        /System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home
Using CLASSPATH:       /Users/papicella/vmware/software/tcserver/vfabric-tc-server-standard-2.6.3.RELEASE/tomcat-7.0.23.A.RELEASE/bin/bootstrap.jar:/Users/papicella/vmware/software/tcserver/vfabric-tc-server-standard-2.6.3.RELEASE/tomcat-7.0.23.A.RELEASE/bin/tomcat-juli.jar
Using CATALINA_PID:    /Users/papicella/vmware/software/tcserver/vfabric-tc-server-standard-2.6.3.RELEASE/sqlfire-demo/logs/tcserver.pid

3.4. Drop the WAR file into the sqlfire-demo instance's webapps directory (the instance directory is shown as CATALINA_BASE in the startup output above). The contents should be exploded automatically and the application deployed.

[Mon Jan 30 12:32:57 papicella@:~/vmware/software/tcserver/vfabric-tc-server-standard-2.6.3.RELEASE/sqlfire-demo/webapps ] $ d
total 8592
drwx——   6 papicella  staff      204 30 Jan 12:12 ROOT/
drwx——  12 papicella  staff      408 30 Jan 12:12 ../
-rw-r--r--   1 papicella  staff  4396232 30 Jan 12:38 sqlfire-afl.war
drwxr-xr-x   7 papicella  staff      238 30 Jan 12:39 sqlfire-afl/
drwx——   5 papicella  staff      170 30 Jan 12:39 ./

3.5 Access the application as follows.

http://localhost:8080/sqlfire-afl/index.html

Img2

3.6. Click on the "Show all teams" link.
3.7. Click on the "Tigers" link to show Richmond's games for the AFL 2012 season.

Img3

Exploring The Setup

– If you wish to stop the SQLFire distributed cluster, use the script "stop-all.sh" as shown below.

[Mon Jan 30 12:52:54 papicella@:~/vmware/software/sqlfire/vFabric_SQLFire_101/pasdemos/afl2012 ] $ ./stop-all.sh
Connecting to distributed system: mcast=/239.192.81.1:12333
Successfully shut down 2 members

– In this howto, the data stored in the distributed system is also persisted locally on each SQLFire server, using the "server1" and "server2" directories referenced in the start-servers.sh script. This enables the data to be rehydrated across cluster restarts, avoiding having to load it back into the distributed system. It is achieved in SQL as follows when we create the tables.

create diskstore STORE1;

CREATE TABLE TEAMS
(team_id int generated always as identity NOT NULL CONSTRAINT teams_pk_team_id primary key,
name VARCHAR (15),
long_name VARCHAR (30),
web_site VARCHAR (50))
persistent 'STORE1' REPLICATE
;

Learn More

For more information on vFabric SQLFire use the following link.