Oil spills from drilling accidents can cost tens of billions of dollars per incident. The famed BP oil spill in the Gulf of Mexico cost the company $40 billion alone, never mind the uncalculated cost to the region it impacted.
How could this type of economic and environmental disaster be avoided? Our answer is smart systems. And they are not just a pipe dream: the Pivotal Data Science team has been working hard over the past few years to deliver real, practical solutions to this type of problem for the oil and gas industry, as well as others.
The idea is to instrument the drilling rig as a smart system. Drawing on concepts from the Internet of Things (IoT), any machine, from personal wearables like smartphones and Fitbits to industrial equipment like jet engines, power turbines, and drilling rigs, can be constructed as a system of sensors and actuators: essentially, the sense organs and limbs of the smart system. Combine these with the remarkable advances in big data technologies, which let us pull petabytes of data into a Data Lake and run complex machine learning models on it efficiently, and we have the two basic ingredients of a smart system that can monitor operations, prevent accidents and downtime, and even improve energy efficiency.
Digital Brain = Data Lake + Data Science
How will a smart offshore oil platform work? Let’s look at the three elements—the sensors, the brain, and the actuators.
The sensors on a drilling rig measure temperature, pressure, and Measurement While Drilling (MWD) variables, which can include seismic, gamma ray, and high-frequency electromagnetic data, as well as hydraulic and mechanical variables. To create the digital brain, we load all of this data, along with measurements made off the drill, such as drilling fluid properties and previous seismic surveys, into a Data Lake. There, all the data is stored together and we can extract patterns from it. Since the data for one oilfield with multiple boreholes can run into hundreds of variables and billions of rows, we use a parallel modeling package like MADlib to extract patterns efficiently. This involves clustering the data and then regressing over the drill's rate of penetration.
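To make that modeling step concrete, here is a minimal single-node sketch in Python. It uses scikit-learn as a stand-in for the in-database MADlib routines, and the column names (weight_on_bit, rpm, mud_flow, gamma_ray, rop) are hypothetical examples of MWD variables, not fields from a real rig. The pattern is the one described above: cluster the operating conditions, then fit a rate-of-penetration regression within each cluster.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Hypothetical MWD feature columns; a real rig would have hundreds.
FEATURES = ["weight_on_bit", "rpm", "mud_flow", "gamma_ray"]
TARGET = "rop"  # rate of penetration

def train_digital_brain(df: pd.DataFrame, n_clusters: int = 8):
    """Cluster drilling conditions, then fit one ROP regression per cluster."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    df = df.copy()
    df["cluster"] = kmeans.fit_predict(df[FEATURES])

    models = {}
    for c, group in df.groupby("cluster"):
        models[c] = LinearRegression().fit(group[FEATURES], group[TARGET])
    return kmeans, models

def predict_rop(kmeans, models, df: pd.DataFrame) -> np.ndarray:
    """Route each new observation to its cluster's regression model."""
    clusters = kmeans.predict(df[FEATURES])
    preds = np.empty(len(df))
    for c in np.unique(clusters):
        mask = clusters == c
        preds[mask] = models[c].predict(df.loc[mask, FEATURES])
    return preds
```

At oilfield scale, this same cluster-then-regress pattern runs in parallel inside the database rather than on a single machine.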
Once we have a good model, we can operationalize it by checking the actual rate of penetration against the predicted rate. If the two differ significantly, we flag the reading as an anomaly. We also build a library of anomalies and label them. Armed with that dataset, we are in a position to monitor the drilling and take appropriate action when we detect an anomaly. For instance, if the anomaly is associated with a blowout, the brain can stop drilling by activating the actuators (the control system) and send a red alert to the control room so a response team can investigate. They may find that the anomaly is just an indication that the drill bit is wearing out earlier than anticipated and needs to be changed. Or, importantly, they may discover a serious threat and be able to stop a catastrophe like the BP oil spill.
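Continuing the sketch, the monitoring step might look like the following. The 15% threshold and the anomaly signatures are illustrative assumptions, not calibrated values from a real operation.

```python
from typing import Optional

# Hypothetical labeled anomaly library, built up from past incidents.
ANOMALY_LIBRARY = {
    "bit_wear":      {"direction": "below", "min_deviation": 0.25},
    "possible_kick": {"direction": "above", "min_deviation": 0.40},
}

def check_reading(observed_rop: float, predicted_rop: float,
                  threshold: float = 0.15) -> Optional[str]:
    """Return a known anomaly label, 'unknown' for unrecognized deviations,
    or None when the reading is within the normal operating range."""
    deviation = (observed_rop - predicted_rop) / predicted_rop
    if abs(deviation) < threshold:
        return None
    for label, sig in ANOMALY_LIBRARY.items():
        same_direction = (deviation > 0) == (sig["direction"] == "above")
        if same_direction and abs(deviation) >= sig["min_deviation"]:
            return label  # known pattern: alert, and halt drilling if severe
    return "unknown"  # new pattern: red alert to the control room
```

Every "unknown" that the response team investigates and labels goes back into the library, so the brain recognizes more patterns over time.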
In the former case, there is an added bonus: the same anomaly library helps us do predictive maintenance and improve the productivity of our operations. It is an example of applying data science methodology on the appropriate technology.
The Technology Behind the Science
Here at Pivotal we have been making great strides in building the platform that houses the Data Lake. This includes a parallel storage layer based on HDFS, parallel database modules (namely HAWQ and Greenplum), an in-memory data grid (GemFire), and a variety of other components that make it easy to ingest and store data, compute on it, and take action. The whole platform is based on open standards, which makes it highly compatible with third-party software and effectively future-proofs it.
Other Applications for the Digital Brain
Another example of making a system smart would be putting a digital brain into a smart grid. Right now we have smart meters collecting an enormous amount of data regarding power usage from every business and household. This maps the entire range of activity in any city! But how do we get value from that?
The answer, once again, is to load this data into a Data Lake and look at the frequency content of the time series signals we get from every smart meter. This lets us cluster the meters and find outliers or anomalies. Then, as the system tracks changes in the behavior of a cluster, or even of individual smart meters, we can identify anomalies as they happen. Over time, we can train our model and label these anomalies as meter malfunctions, meter tampering, or vegetation management issues (a tree or branch falling on a power line). Now we have created a smart grid that can be our eyes and ears everywhere across the power grid and can take appropriate action to prevent downtime while achieving optimal performance.
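As a rough illustration of that frequency-domain step, here is a minimal single-node sketch, again in Python with scikit-learn standing in for the parallel in-database tooling. The input file, the 15-minute sampling interval, and the cluster count are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans

def frequency_features(readings: np.ndarray, n_bins: int = 16) -> np.ndarray:
    """Summarize each meter's load time series by its spectral energy.

    readings: array of shape (n_meters, n_samples), e.g. 15-minute
    interval data (an assumed sampling rate).
    """
    spectrum = np.abs(np.fft.rfft(readings, axis=1)) ** 2
    # Aggregate the spectrum into coarse bins so meters are comparable.
    bins = np.array_split(spectrum, n_bins, axis=1)
    features = np.column_stack([b.sum(axis=1) for b in bins])
    # Normalize so clusters reflect usage *shape*, not total consumption.
    return features / features.sum(axis=1, keepdims=True)

# Cluster meters by usage shape; points far from any centroid are outliers.
readings = np.load("meter_readings.npy")        # hypothetical input file
feats = frequency_features(readings)
km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(feats)
dist_to_centroid = np.min(km.transform(feats), axis=1)
outliers = np.argsort(dist_to_centroid)[-20:]   # the 20 most anomalous meters
```

The meters flagged here are exactly the candidates for the labeling step described above: malfunction, tampering, or vegetation management.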
The possibilities of the digital brain to transform industry are manifold. We are working steadily to apply this methodology widely, including in another area we are talking about this week at Strata: smart cities and the connected car.
The true potential of the IoT will be realized when we can create digital brains and transform the IoT from a collection of mere things into self-aware systems. This will never eliminate the human element; rather, it will make human intervention more effective, reducing delays and the scope for error.
Jeff Immelt of GE has said that “zero unplanned downtime” is a key goal for GE’s use of the Industrial Internet. But we can take this even further—what about zero unplanned outages, zero industrial accidents and zero environmental disasters?
The opportunity is right here. Let's all make it happen!