
Question . . .” “Yes . . . !” “Of Life, the Universe and Everything . . .” said Deep Thought. “Yes . . . !” “Is . . .” said Deep Thought, and paused. “Yes . . . !” “Is . . .” “Yes . . . !!! . . . ?” “Forty-two,” said Deep Thought, with infinite majesty and calm [1].

The points I’ll make here, illustrated by the famous quote above, fall into the category of “Greg’s Special Pet Peeves”. I’ll state the problem very simply: IoT solutions often forget to include units, sensor characteristics, and data provenance for streams of measured values from sensors destined for analytics engines. Just as with Deep Thought’s answer to ‘Life, the Universe and Everything’, forty-two alone tells us only that we have the result of recursively applying the successor function to zero forty-two times :-). It is imperative that we, the IoT community, start working together to bind units and data provenance tightly and seamlessly to streams of measured values.

For physical units, see SI Units; for units in information science, see IEC 80000-13:2008. SI Units cover pretty much every conceivable physical quantity we’d ever want to measure, e.g., acceleration: m/s^2, energy: kg·m^2·s^(-2), etc. Go see the full list at the link above to be really impressed. Also important is the standardization of the information units with which we are probably all familiar, e.g., byte, octet, etc.
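To make the notation above concrete, here is a minimal sketch of how derived units such as m/s^2 and kg·m^2·s^(-2) fall out of base-unit exponents. The `Dimension` class and the constant names are hypothetical, invented for illustration, and cover only three of the SI base units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    """Exponents of the SI base units kilogram, metre and second (illustrative subset)."""
    kg: int = 0
    m: int = 0
    s: int = 0

    def __mul__(self, other: "Dimension") -> "Dimension":
        # Multiplying two quantities adds the exponents of their base units.
        return Dimension(self.kg + other.kg, self.m + other.m, self.s + other.s)

    def __truediv__(self, other: "Dimension") -> "Dimension":
        # Dividing two quantities subtracts the exponents.
        return Dimension(self.kg - other.kg, self.m - other.m, self.s - other.s)


MASS = Dimension(kg=1)
LENGTH = Dimension(m=1)
TIME = Dimension(s=1)

ACCELERATION = LENGTH / (TIME * TIME)  # m/s^2
FORCE = MASS * ACCELERATION            # kg·m·s^(-2), the newton
ENERGY = FORCE * LENGTH                # kg·m^2·s^(-2), the joule
```

Comparing two `Dimension` values for equality is then enough to tell whether two quantities are even candidates for being added together.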

To date, most of the IoT solutions I’ve seen encode the knowledge of the units of a received datum in code logic. Something like:

// use grams in a computation assuming the value it holds
// actually represents mass in grams

How often have you written or seen code like the above?

There are a couple of issues with encoding the knowledge of units in logic: (1) no local check is possible within the computation, and (2) sharing becomes more difficult. Making a bunch of obvious simplifying assumptions, the code below performs a local check to ensure that the value received from the sensor does indeed reflect mass in grams.

if grams.units != SIUnits.grams
    throw units exception

// Now do some computation knowing the value in grams actually
// represents mass in grams
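A runnable version of that check might look like the following Python sketch. The names `Unit`, `UnitsError`, `Measurement`, and `total_mass_in_grams` are assumptions made up for this example; they do not come from any particular IoT library.

```python
from dataclasses import dataclass
from enum import Enum


class Unit(Enum):
    # Hypothetical, deliberately tiny unit vocabulary.
    GRAM = "g"
    KILOGRAM = "kg"


class UnitsError(Exception):
    """Raised when a value does not carry the unit a computation expects."""


@dataclass(frozen=True)
class Measurement:
    value: float
    unit: Unit


def total_mass_in_grams(readings):
    """Sum the readings, refusing any that are not expressed in grams."""
    total = 0.0
    for reading in readings:
        if reading.unit is not Unit.GRAM:
            raise UnitsError(
                f"expected {Unit.GRAM.value}, got {reading.unit.value}"
            )
        # Safe: the value is known to represent mass in grams.
        total += reading.value
    return total
```

The point of the design is that the unit travels with the value, so the check happens where the computation happens instead of living in a comment.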

Thus, I strongly believe that a stream of (timestamp, value) tuples sourced from a sensor and destined for analysis should include a method to programmatically access the units (and more, see below), and thus that the stream should be able to hold metadata about the values it carries. Once we make such metadata easily and programmatically settable and retrievable, with publicly available libraries and appropriately constructed streams, we can encourage developers to use it to help logic perform correct computations.

For one of the more infamous examples of the consequences of a units mismatch in logic, check out this page on the Mars Climate Orbiter. In the early 2000s I had a position at NASA JPL as a Distinguished Visiting Scientist, helping a team design a software architecture, based on Real-Time Java, to help developers avoid mistakes like the one above that might have such catastrophic consequences. A pretty simple idea, really: encode SIUnits as types. Thus, even simple assignment is automatically checked for units.

As streams of measured values from IoT sensors start to pervade everything humans do, the chance for errors with serious consequences will grow dramatically. Awareness of this issue is the first step!
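The units-as-types idea can be sketched in a few lines of Python. This is only an illustration, not the JPL architecture, which is not described here. The class names `NewtonSeconds` and `PoundForceSeconds` are hypothetical, chosen because impulse in those two units is the mix-up usually associated with the Mars Climate Orbiter. The conversion factor assumes the standard definition 1 lbf = 4.4482216152605 N.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NewtonSeconds:
    value: float

    def __add__(self, other: "NewtonSeconds") -> "NewtonSeconds":
        # Only quantities of the same unit type may be added.
        if not isinstance(other, NewtonSeconds):
            return NotImplemented
        return NewtonSeconds(self.value + other.value)


@dataclass(frozen=True)
class PoundForceSeconds:
    value: float

    def __add__(self, other: "PoundForceSeconds") -> "PoundForceSeconds":
        if not isinstance(other, PoundForceSeconds):
            return NotImplemented
        return PoundForceSeconds(self.value + other.value)

    def to_newton_seconds(self) -> NewtonSeconds:
        # Explicit conversion is the only way across unit types.
        return NewtonSeconds(self.value * 4.4482216152605)
```

Adding a `NewtonSeconds` to a `PoundForceSeconds` raises `TypeError` at runtime, and with the annotations above a static checker such as mypy should be able to flag the same mistake before the code ever runs. In a statically typed language the compiler plays that role.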

One of the great promises of IoT is that huge numbers of measurements of our physical world are available, more or less publicly, allowing innovative applications to use these streams to create specific value for humans in general, society, organizations, etc. In my humble opinion, such innovation would be drastically hampered without a standardized way to access metadata about each individual datum.
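As a rough illustration of a stream that carries its own metadata, consider the Python sketch below. `MeasurementStream`, its fields, and `mean_in` are hypothetical names invented for this example, not a proposed standard. The string-valued `units` field is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MeasurementStream:
    units: str    # e.g. "g"; a real design would use a shared vocabulary
    source: str   # provenance: which sensor produced the values
    metadata: Dict[str, str] = field(default_factory=dict)
    samples: List[Tuple[float, float]] = field(default_factory=list)

    def append(self, timestamp: float, value: float) -> None:
        """Add one (timestamp, value) tuple to the stream."""
        self.samples.append((timestamp, value))


def mean_in(stream: MeasurementStream, expected_units: str) -> float:
    """Average the stream's values, but only if the units match."""
    if stream.units != expected_units:
        raise ValueError(
            f"stream {stream.source!r} is in {stream.units!r}, "
            f"expected {expected_units!r}"
        )
    if not stream.samples:
        raise ValueError("stream has no samples")
    return sum(value for _, value in stream.samples) / len(stream.samples)
```

A consumer who has never seen the sensor can still ask the stream what it contains before computing anything, which is exactly what makes sharing practical.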

Additionally, take a look at IEEE 1451. Here’s a presentation by the spec editor, Kang Lee. Essentially, 1451 attempts to standardize the common metadata one needs to know about a measured value before performing complex calculations or analyses. Examples include manufacturer, accuracy, units, precision, response function, time-in-service, cycles, and many more. It is clear that many in the sensing, industrial, space, and other areas have been concerned about such metadata for some time. The IoT community must embrace these concepts and work on how to integrate them into everything we do with numbers!
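To show how such metadata could feed directly into a calculation, here is a small Python sketch. It is not the IEEE 1451 data format: the `SensorInfo` record simply borrows a few of the field names listed above, and treating `accuracy` as an absolute plus-or-minus figure in the sensor's own units is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SensorInfo:
    # Hypothetical record; field names are illustrative only.
    manufacturer: str
    units: str
    accuracy: float               # assumed: absolute +/- error, in `units`
    time_in_service_hours: float
    cycles: int


def bounds(reading: float, info: SensorInfo) -> Tuple[float, float]:
    """Return the interval the true value should lie in, given the accuracy."""
    return (reading - info.accuracy, reading + info.accuracy)
```

With the sensor's characteristics attached, an analysis can report an interval instead of a bare number, and can decide to discount readings from a sensor that has been in service too long.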

Why a special pet peeve? It’s that we in the computer industry so often get so enamored with the promise of technology that we rush ahead without, IMHO, careful consideration of fundamental issues. It feels to me that we are, in this particular way, doing the same with IoT. As a result, the value of our industry’s contribution to humankind may be significantly diminished.

[1] Adams, Douglas. The Hitchhiker’s Guide to the Galaxy (p. 108). Random House Publishing Group. Kindle Edition.