
Pivotal’s Data Science Predictions for 2014

The promise of data science was more celebrated and scrutinized in 2013 than ever. While demonstrations of data science’s ever-growing value and importance are well-documented at this point, 2014 is set to be the year that the hype subsides and data science becomes an essential part of business operations. Moreover, it will drive the development of not only apps but also network-connected objects and devices in the next year.

Performing data science is a constant process, requiring that data be ingested from an increasing number of sources, analyzed, and acted upon. In her talk at October’s Strata conference, Pivotal’s Annika Jimenez stated that the end goal of data science lies in “driving automated, low latency actions in response to events of interest.” This cycle requires increased automation, as well as closer integration between data scientists and app developers.

This is central to Jimenez’s predictions for 2014: close collaboration between the data science and app development sides of the house, early in the process, to build apps that not only respond to past user behavior but also integrate predictive models into their processing. “We’re seeing the world of data science and app dev converge,” she says. “The materialization of value from data science has been elusive, and that speaks to how organizations are operationalizing those insights. 2014 is going to be the year that enterprises realize the value of data science via apps.”

Even among the most forward-thinking companies, the creation of statistical models and the development of apps have remained siloed. That will cease to be the case in 2014, Jimenez expects. “App developers are going to have to think much earlier in their process about the role of data science,” she says, “and data scientists are going to need to develop a model with a concrete plan how it will be implemented. That will change how the model will be developed from the outset: what’s the complexity of the feature, and is that realistic with the latency of the app being developed.”

With the increased importance of integrated app development and operationalization of data science in 2014, the data science field will mature. While much has been made in recent years about the dearth of qualified data scientists, that may change in the next year as academic programs begin producing graduates. “We should theoretically start seeing the output of the new academic curriculum around data science,” says Jimenez. “We should see a greater population of data scientists coming out of those programs.”

These new graduates will become more necessary with the proliferation of websites, devices, and network-connected objects in 2014, particularly wearable technologies. “We’ll see the Internet of Things explode in 2014,” says Jimenez. “Not just the industrial internet, which will be a big driver, but as we saw at Databeat, a lot of startups will arise from devices with sensors.” While this will drive adoption and app development at the consumer level, it will also affect expectations for enterprise devices and apps. “The apps will be conceptualized by what the data enables,” she says, “and enterprises will need to take the spirit and run with it.”

“I’m betting there are still way more things than there are webpages,” says Jimenez, “and the data coming off of sensors that will be capable of producing data every ten minutes will clearly be the next big wave of Big Data. This will inspire all sorts of entrepreneurial activity, impact industrial companies like GE, as well as our daily lives through wearables.”

Recent News For Data Science

  • MADlib, the open source analytics library maintained by Pivotal as well as researchers at UC Berkeley, Stanford University, and University of Florida, gets a major update including new mathematical, statistical and machine-learning methods that are designed to help a broad spectrum of industries.
  • A major update to Spring, now called Spring.io, is released with several new big data enablers, including Spring XD. See what attendees of SpringOne 2GX had to say about the new release here.
  • Roman Shaposhnik, who is best known for his work on the Apache Hadoop project, first at Hadoop founder Yahoo! and later at Hadoop start-up Cloudera, where he notably started the Bigtop project, joins Pivotal, saying, “The Pivotal One vision is way bigger than any of its parts (even if that part is as big as an entire Hadoop ecosystem).”

Recommended Reading