Why data products need to be part of your cloud platform.
About two months ago I was called out to spend some time with a team from a customer of ours to talk about “DevOps.” Some background: at Pivotal we’re rather opinionated about what DevOps means and how you get there. For us it’s about process and cultural change, coupled with the right platform. Together, these produce a software development and delivery capability that eliminates the friction between development and operations and allows for the continuous delivery of value to the customer.
Having spent the last three years focusing on an incredible application platform that is a tremendous enabler of DevOps, I’ve had this conversation quite frequently in a variety of venues, from customer meetings to conference halls.
This meeting proceeded like many before it. When I talked about the need for environment parity across the entire software lifecycle in order to avoid “it works on my machine” finger-pointing, nervous laughter filled the room. My hosts had a great appreciation for developers having self-service provisioning resources at their fingertips to do their work, and having those resources automatically reclaimed when the job was done.
When I described how immutable infrastructure — the stack you build once and run as many instances of it as needed — eliminated configuration drift and untouchable snowflakes, there were murmurs of “we need to do that.” The mood shifted to astonishment when I explained that at Pivotal we only did production deploys during normal business hours, something made possible because we derisk software upgrades with things like blue/green and parallel deployments.
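To make the blue/green idea concrete, here is a minimal sketch of the route-swap pattern, driven from Python through the Cloud Foundry CLI (v6-style flags). The application names, hostname and domain are hypothetical, and this illustrates the general technique rather than Pivotal’s own deployment tooling.

```python
import subprocess

APP_BLUE = "my-app-blue"    # hypothetical: the version currently serving traffic
APP_GREEN = "my-app-green"  # hypothetical: the new version to be promoted
DOMAIN = "example.com"      # hypothetical production domain
HOST = "my-app"             # hypothetical production hostname

def cf(*args):
    """Run a Cloud Foundry CLI command, failing loudly on error."""
    subprocess.run(["cf", *args], check=True)

# Push the new version alongside the old one, on a temporary route.
cf("push", APP_GREEN, "-n", f"{HOST}-staging", "-d", DOMAIN)

# Map the production route to the new version; both versions now receive traffic.
cf("map-route", APP_GREEN, DOMAIN, "-n", HOST)

# Once the new version looks healthy, take the old one out of rotation.
cf("unmap-route", APP_BLUE, DOMAIN, "-n", HOST)
cf("stop", APP_BLUE)
```

Because traffic is shifted by remapping routes rather than by upgrading an app in place, a bad release can be backed out by reversing the last two steps, which is what makes daytime deploys a low-drama affair.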
When I suggested that they needed to make a fundamental shift from treating change as the exception to treating it as the rule, there were nods of appreciation. As our two-hour meeting stretched into a fourth hour, we talked about how to get started, the practices that could be tried, and the tooling required to support them.
The team was open-minded and frank in assessing their current state. They were charged up from our meeting and were brainstorming possible solutions. It was a whole lot of fun! And not that unusual.
Except for one thing.
The team I was meeting with that day was not a development team or an IT operations team, and it was not tasked with bringing a new compute platform to developers. This was the data team: the members of IT who mind the data technologies across their enterprise, including databases, data analytics, storage and more. That afternoon I realized that this data team had goals very similar to those of the many other (non-data-centric) client developer teams I had been working with over the last three years: to provide a platform that enables data scientists, developers and operators to get the resources they need to build, deploy and manage their software products at start-up speed, without the friction that has historically plagued them.
Bridging to Data
For the last three years I have been part of the Cloud Foundry product team at Pivotal. My day job has been engaging with customers and partners to help them navigate their way onto the third platform. My focus, and the focus of the Cloud Platform organization as a whole, has been predominantly on compute for developers and operators: in particular, the Cloud Foundry Elastic Runtime and the Spring Framework, as well as BOSH, the “cloud native workhorse” and toolchain for managing complex, distributed systems like Cloud Foundry.
Yes, the stateless, resilient, scale-out applications that would be deployed to Pivotal Cloud Foundry would bind to data-oriented services drawn from a rich marketplace that we’ve built together with our partners. And, yes, Pivotal also has the Big Data Suite of world-class data products. But the fact is, the cloud platform and those data products have largely been treated as two independent initiatives, more like two dancers who acknowledge that they are partners, but have been dancing on their own.
The experience I had with this customer a few months ago planted a seed, and in the weeks following that meeting, I talked with more customers and colleagues, and researched what was happening with data products in the industry.
After several months on a different assignment within Pivotal, I was delighted to rejoin the product team to focus on this challenge with like-minded peers. The astoundingly successful Pivotal Cloud Foundry platform forms the base into which we will deeply integrate data-centric capabilities, both from within Pivotal and through engagement with strategic partners. I see a Pivotal data team that partners with industry leaders who are pushing the envelope with the most innovative data products available in the market. And we look forward to taking the very logical next step: continuing to shape the future compute platform with more data muscle.
And it’s fantastic! It feels a great deal like the way “Platform-as-a-Service” felt three or four years ago: new and innovative technologies are increasingly available, and IT leaders see their potential to disrupt and to enable innovation. Enterprises will be looking for a path to evaluate these data technologies (and others), bring them in-house, and learn how to leverage them to their advantage.
Businesses that have embraced cloud native application platforms continue to say that doing so is helping them unleash great value. Bringing data products into the platform in a deeply meaningful way will help businesses unlock even more value. Pivotal is not the only one to see that vision. Data is pretty much the new black for cloud native. But integrating the two is an implementation challenge that we can tackle, and are tackling.
I look forward to engaging with a great many of you in the broader tech community. Here are some conversations I’m looking to take part in:
- What types of data and data sources do you have that are woefully underutilized?
- What do your existing data pipelines look like?
- How do your data scientists build models, and are they using all available data?
- How are these models put into production? And how quickly can you get them there?
- Can you have multiple models running in parallel so that you can assess the results relative to one another?
I’m keen to hear from you.