Pivotal’s Analytics Workbench (AWB) offers partners the power of a 1000-node Hadoop cluster, providing a publicly available data processing powerhouse, free of charge, for 90-day engagements. There are a number of reasons why a business or research institution should take Pivotal up on the offer: they can use the infrastructure and resources to spin up a Hadoop cluster of comparable size, and treat Pivotal AWB as an opportunity to test Hadoop’s high-performance capabilities at significant scale. Moreover, the project provides Pivotal and its partners with opportunities to experiment with a large, reliable, and robust Hadoop deployment.
Pivotal AWB evangelizes the power and flexibility of Hadoop, while also offering an opportunity to test the platform’s capabilities and performance in real-world test cases. “We see the cluster as a lighthouse,” says Pivotal’s Principal Program Manager Tasneem Maistry. “It enables people to take advantage of, and demonstrate, the high-performance capabilities of Hadoop. We also see the cluster as a test bench for our technology.”
“Mellanox Technologies provides AWB with flexibility on the fabric identity and capabilities,” said Eyal Gutkind, Sr. Manager of Market Development at Mellanox Technologies. “We find that AWB is an excellent platform for our customers to learn and experience Hadoop and to explore the benefits High Performance Networks such as InfiniBand and 40 Gigabit Ethernet bring to Hadoop-based applications.”
To this end, the AWB team collaborated in October with Pivotal’s partner Mellanox Technologies to test the effect of flipping the cluster’s network connection between InfiniBand and Ethernet on the fly. In cloud computing contexts, it is sometimes necessary to switch nodes from a high-performance InfiniBand network to Ethernet to serve different types of applications. How would such a switch affect the performance of applications running on the Pivotal AWB? How quickly and seamlessly can this shift be performed? These are the questions that the AWB team and their partner Mellanox were asking when they performed the test.
“The work we did last October proved that it is easy on the networking side to flip between InfiniBand and Ethernet,” says Maistry. “Flipping the nodes required no touch on the hardware, so the work could be executed remotely.” Meanwhile, “on the switches side, we updated to the latest software and firmware from the command line, easily transitioning between InfiniBand and Ethernet.”
This all happened without the test application running on Pivotal AWB becoming aware of the fabric technology change. Nor did the change impact application performance. “On the application side,” says Maistry, “we manipulated the DNS so that the application was not even aware that in the morning it was running on the InfiniBand network. In the afternoon we shut down the application, and performed the flip to Ethernet in parallel. When we restarted the application it was unaware of the fabric replacement.”
Maistry’s AWB team was able to verify that the running application was not affected at all by the switch. Armed with this knowledge, the team proved the platform’s stability and flexibility, while taking away some key insights:
- If you define the DNS on the server side, you can make this switch seamlessly
- The benefit is the ability to offer different capabilities to different workloads
- This approach allows the system admin and the applications on top of it to switch from InfiniBand to Ethernet based on their needs, without the applications being aware that the network has changed
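The DNS indirection described above can be sketched in a few lines. This is only an illustration of the idea, not the AWB team's actual setup: the `FabricAwareResolver` class, the `connect` helper, the hostname, and the IP addresses are all hypothetical, and a real deployment would update records on a DNS server rather than in an in-memory dictionary.

```python
class FabricAwareResolver:
    """Toy stand-in for a server-side DNS zone (hypothetical, not a real DNS API)."""

    def __init__(self, records, active_fabric="infiniband"):
        # records: hostname -> {fabric name -> IP address}
        self.records = records
        self.active_fabric = active_fabric

    def flip(self, fabric):
        """Repoint every hostname at its address on the given fabric."""
        missing = [host for host, addrs in self.records.items() if fabric not in addrs]
        if missing:
            raise ValueError(f"no {fabric} address defined for: {missing}")
        self.active_fabric = fabric

    def resolve(self, hostname):
        """Return the address the hostname maps to on the active fabric."""
        return self.records[hostname][self.active_fabric]


def connect(resolver, hostname):
    """The application knows only the hostname, never the fabric."""
    return f"{hostname} -> {resolver.resolve(hostname)}"


if __name__ == "__main__":
    # Hostname and addresses are made up for this example.
    dns = FabricAwareResolver({
        "node001.awb.example": {"infiniband": "10.10.0.1", "ethernet": "10.20.0.1"},
    })
    print(connect(dns, "node001.awb.example"))  # morning: InfiniBand address
    dns.flip("ethernet")                        # application is stopped during the flip
    print(connect(dns, "node001.awb.example"))  # afternoon: same name, Ethernet address
```

Because the application code calls `connect` with the same hostname before and after the flip, nothing on the application side has to change when the fabric does.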
Maistry adds that “We were able to flip it from Ethernet to InfiniBand, but you can also run mixed nodes in a cluster, and if you need to bridge the two on the same subnet, you can create a gateway.” This opens up new opportunities for developers and administrators working on a deployment similar to the AWB. For example, developers who need to certify an application for both 40 Gigabit Ethernet and InfiniBand could do so in parallel. Traditionally, benchmarking across InfiniBand and Ethernet meant doubling the cluster and switches, often by building two clusters side by side.
This approach saves cost and labor. The AWB team also determined during testing that they could run the cluster in mixed mode. Mixed mode allows you to use one vendor and bridge separate network-connected platforms, for example by connecting a Hadoop cluster to both EMC storage arrays and VMware vSphere. Such an approach reduces cost, difficulty, and latency.
While the AWB 1000-node Hadoop cluster is currently running in InfiniBand mode, the team found through this testing that they can switch back and forth between InfiniBand and 40 Gigabit Ethernet as needed. Moreover, their tests demonstrated the capability to run both InfiniBand applications and their legacy application counterparts on an Ethernet network within a single cluster.
Learn more about how you can harness the power of Pivotal’s 1000-node Hadoop cluster:
• Request an invite to use Pivotal Analytics Workbench
• View projects in the Pivotal Analytics Workbench library
• Learn more about Pivotal Analytics Workbench