Internet of Things for Data Strategists – A Quick Guide
I recently read an excellent article, “A Strategist’s Guide to the Internet of Things,” by Strategy& (formerly Booz & Company), which prompted me to think about what the Internet of Things means for data strategists and data professionals. As we understand it today, the IoT is several billion devices in the physical world, connected to the internet and to each other, collecting, sharing, analyzing, and responding to data to make our lives and businesses smarter, safer, and better. What does this mean from a data strategist’s point of view?
Perhaps the oversimplified view of IoT that I am about to provide will help data strategists see that most of the basic principles of data management still apply, even as technologies must be upgraded to deal with increased data volume, variety, and velocity if IoT projects are to succeed.
From an overall perspective, IoT will generate far more data, far faster, but the basic goals are the same. Value to the consumer or business comes from distilling useful information from this data to inform decisions, whether taken by humans or automatically by devices. The outcomes we seek are also familiar – reduce risk and cost, capture opportunities, improve service, and so on. Data management in the IoT context also moves through familiar phases – Ingest, Store, Integrate, Analyze, Contextualize, and Disseminate. A key difference is that, because of the data volumes, these functions are less centralized and often distributed across the network.
• Data Ingestion – Given the immaturity of the IoT market and the variety of endpoints and domains, the first challenge is the disparity of source data, which must be captured and normalized.
• Data Storage – The next decision is whether to store the data at all or let it stream. If it is to be stored, the data is typically ‘dumped’ into a data lake in fairly unstructured form for later processing, which increases efficiency.
• Analytics – The fast-developing field of data science applies time-tested statistical techniques, or develops new algorithms, to extract insights either from data at rest (or newly arrived) in data lakes, or in real time through streaming analytics. The availability of cheap computing power has encouraged unbounded analytics, where the “data speaks” and reveals patterns without a constraining hypothesis.
• Integration / Contextualization – Typically, multiple sets of data are integrated before they are analyzed. Contextualization means having background information, either to properly analyze streaming data or to translate analytic results into the right decisions in the context of other business or customer knowledge.
• Dissemination – The last mile, in which the intelligence in IoT is redistributed to achieve business outcomes, either by sending it to the right people at the right time to make decisions or by automatically triggering machines to do the work.
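The phases above can be sketched end to end in a few lines. This is a minimal illustration, not a production pipeline: the device names, vendor payload formats, temperature limits, and the in-memory “data lake” are all invented for the example.

```python
# Minimal sketch of the phases described above: Ingest, Store,
# Integrate/Contextualize, Analyze, Disseminate.
# All device names, payload formats, and thresholds are hypothetical.

def ingest(raw_readings):
    """Ingest: normalize disparate vendor payloads into one schema."""
    normalized = []
    for r in raw_readings:
        if "temp_f" in r:      # hypothetical vendor A reports Fahrenheit
            normalized.append({"device": r["id"],
                               "temp_c": (r["temp_f"] - 32) * 5 / 9})
        elif "celsius" in r:   # hypothetical vendor B reports Celsius
            normalized.append({"device": r["id"], "temp_c": r["celsius"]})
    return normalized

def contextualize(readings, registry):
    """Integrate/Contextualize: join readings with background device metadata."""
    return [dict(r,
                 site=registry[r["device"]]["site"],
                 limit_c=registry[r["device"]]["limit_c"])
            for r in readings]

def analyze(readings):
    """Analyze: flag readings that exceed each device's limit."""
    return [r for r in readings if r["temp_c"] > r["limit_c"]]

def disseminate(alerts):
    """Disseminate: turn analytic results into messages for people or machines."""
    return [f"ALERT {a['device']}@{a['site']}: {a['temp_c']:.1f}C" for a in alerts]

# Store: here the 'data lake' is just an in-memory list of raw-but-normalized rows.
lake = ingest([
    {"id": "pump-1", "temp_f": 212.0},   # hypothetical vendor-A payload
    {"id": "pump-2", "celsius": 35.0},   # hypothetical vendor-B payload
])
registry = {"pump-1": {"site": "plant-A", "limit_c": 90},
            "pump-2": {"site": "plant-B", "limit_c": 90}}
alerts = disseminate(analyze(contextualize(lake, registry)))
print(alerts)  # only pump-1 (100.0 C) exceeds its 90 C limit
```

In a real IoT deployment each function would be a distributed service rather than a local call, but the order of responsibilities is the same.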
Once you break down the fundamental processes of IoT and look past the hype and terminology, it isn’t so scary anymore. Yet to deal with the challenges of immense data volume, variety, and velocity, new technologies are certainly needed. Distributed big data computing platforms such as Hadoop have grabbed attention because they can economically handle the Store and Analyze phases of IoT. In the hardware space, sensors, wireless technologies, and high-speed networks capture and move data increasingly fast across the IoT. A key third piece is data virtualization, which provides the abstraction layers needed for ingesting disparate data, integrating contextual data in real time, and disseminating analysis and intelligence as IoT Data Services to users and applications. From my own experience, several leading companies are now using this combination of technologies to gain new value from IoT: predictive service and parts replacement for heavy equipment; location-based commerce offerings within navigation devices; climate-sensor-based crop insurance; young drivers saving on car insurance through vehicle telematics; and so on. I know this works, and it will for you too.
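The data virtualization idea can be illustrated with the telematics-insurance example: a logical view that joins a streaming source with a contextual system at query time, without copying either into a new store. The sources, field names, and thresholds below are invented for illustration; a real data virtualization product would expose such a view over live systems.

```python
# Hypothetical sketch of a data-virtualization-style "IoT Data Service":
# one logical view over two disparate sources, joined on demand.
# Both sources and all field names/thresholds are invented for illustration.

STREAM = [  # simulated vehicle telematics feed
    {"vin": "V1", "speed_kmh": 130},
    {"vin": "V2", "speed_kmh": 80},
]
CRM = {  # simulated contextual/customer system
    "V1": {"driver_age": 19},
    "V2": {"driver_age": 45},
}

def risky_young_drivers(max_age=25, speed_limit=120):
    """Virtual view: joins sources at query time, stores nothing itself."""
    for event in STREAM:
        profile = CRM.get(event["vin"], {})
        if profile.get("driver_age", 99) <= max_age and event["speed_kmh"] > speed_limit:
            # merge contextual and streaming fields into one result row
            yield {**event, **profile}

print(list(risky_young_drivers()))  # flags V1: a 19-year-old at 130 km/h
```

The point of the abstraction is that consumers query the view; where the underlying data lives, and whether it streams or rests, stays hidden behind the service.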
By Suresh Chandrasekaran