How data management unlocks the internet of things

Use cases for analyzing data

ATLANTA – Ron Bodkin, founder and president of Think Big, hosted a panel titled “The key to unlocking value in the internet of things? Managing data!” at the Teradata PARTNERS 2016 conference in Atlanta. He emphasized that a variety of data is a key element of any “internet of things” (IoT) system, and that figuring out which data is valuable is essential.

Teradata PARTNERS 2016

He gave a number of use cases associated with analyzing data that comes from IoT devices:

  • Predictive maintenance;
  • Search and view product detail;
  • Identify critical alerts;
  • Root cause analysis; and
  • Understand usage.

Bodkin said that companies have always gathered and organized data that is accessible and useful, but they have not modeled it for consumption by business intelligence tools or analysts.

A changing landscape of data management

Bodkin spoke about the changing landscape of data analytics brought on by IoT:

  • JSON-like structures: complex collections of relations, arrays and maps of items;
  • Graphs: storing complex relationships that change dynamically rather than staying static; and
  • Binary/CLOB/specialized data: the ability to execute specialized programs to interpret and process it.

He also listed new patterns seen in the IoT space when looking at big data systems:

  • Denormalized facts;
  • Profile;
  • Event history;
  • Timeline;
  • Network;
  • Distributed sources;
  • Late data;
  • Deep aggregates;
  • Recovery; and
  • Multiple active clusters.

He elaborated on four of these elements:

Event history:

  • Fact table about common events that allows analytics in context, e.g., wearable devices, telematics
    • Stored in columnar format
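
As a rough illustration of the event history pattern, the sketch below pivots a handful of denormalized event rows into a column-oriented layout and then aggregates over just the columns a query needs. The field names (`device_id`, `device_type`, `event`, `value`, `ts`) and the helpers `to_columns` and `mean_value` are hypothetical, chosen for illustration; the talk did not describe a specific schema or storage engine.

```python
from collections import defaultdict

# Hypothetical event rows. Each row carries its own context (device type),
# so no join is needed at query time: this is the "denormalized fact" idea.
EVENTS = [
    {"device_id": "w-1", "device_type": "wearable", "event": "heart_rate", "value": 72.0, "ts": 100},
    {"device_id": "w-1", "device_type": "wearable", "event": "heart_rate", "value": 75.0, "ts": 160},
    {"device_id": "t-9", "device_type": "telematics", "event": "speed", "value": 88.0, "ts": 120},
]


def to_columns(rows):
    """Pivot row-oriented events into one list per column.

    Assumes every row has the same keys, as in a fixed-schema fact table.
    """
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)


def mean_value(columns, event):
    """Average `value` for one event type, reading only two columns."""
    picked = [v for e, v in zip(columns["event"], columns["value"]) if e == event]
    return sum(picked) / len(picked) if picked else None


columns = to_columns(EVENTS)
heart_rate_mean = mean_value(columns, "heart_rate")
```

The point of the columnar layout is that an aggregate such as `mean_value` scans only the `event` and `value` columns and never touches the others, which matters when a fact table is wide and long.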

Timeline pattern:

  • Table of actors with events over time
    • Device history, usage in consumer journey;
    • Enable support/analytics on specific items, long-lived analysis; and
    • May have hierarchy of actors (clusters, device, components) or array of events.
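
The timeline pattern can be sketched as a table keyed by actor, where each actor holds its own time-ordered list of events and an optional link to a parent actor. Everything here (`Event`, `Actor`, `record`, `history`, and the example identifiers) is a hypothetical illustration under those assumptions, not a description of the system presented at the panel.

```python
from bisect import insort
from dataclasses import dataclass, field
from typing import Optional


@dataclass(order=True)
class Event:
    ts: int                           # event time; the only field used for ordering
    kind: str = field(compare=False)  # e.g. "power_on", "error"


@dataclass
class Actor:
    actor_id: str
    parent_id: Optional[str] = None   # hierarchy link, e.g. device -> cluster
    events: list = field(default_factory=list)


def record(timelines, actor_id, ts, kind, parent_id=None):
    """Add an event to an actor's timeline, keeping it sorted by time.

    `parent_id` is only used the first time an actor is seen.
    """
    actor = timelines.setdefault(actor_id, Actor(actor_id, parent_id))
    insort(actor.events, Event(ts, kind))


def history(timelines, actor_id, start, end):
    """Return the kinds of events for one actor with start <= ts <= end."""
    return [e.kind for e in timelines[actor_id].events if start <= e.ts <= end]


timelines = {}
record(timelines, "drive-1", 30, "error", parent_id="cluster-A")
record(timelines, "drive-1", 10, "power_on")
record(timelines, "drive-1", 20, "firmware_update")
early = history(timelines, "drive-1", 0, 25)
```

Because each actor's events stay sorted even when they are recorded out of order, questions about a specific item over a long period reduce to a range lookup on one timeline.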

Network: Ongoing status of configuration

  • Parts in assembly;
  • Related items; and
  • Peer groups.

“Most IoT is not about a single device but a complex assemblance of devices and how they work together,” Bodkin said.
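
One way to picture the network pattern is as a small graph that records which part currently sits in which assembly, and that can be updated as the configuration changes. The `DeviceNetwork` class and the device names below are hypothetical; this is a minimal sketch of the idea, not the data model used in the talk.

```python
from collections import defaultdict


class DeviceNetwork:
    """Tracks the current configuration: which part sits in which assembly."""

    def __init__(self):
        self.children = defaultdict(set)  # assembly -> parts directly inside it
        self.parent = {}                  # part -> assembly it currently sits in

    def attach(self, part, assembly):
        """Place a part in an assembly, moving it if it was attached elsewhere."""
        old = self.parent.get(part)
        if old is not None:
            self.children[old].discard(part)
        self.parent[part] = assembly
        self.children[assembly].add(part)

    def parts_of(self, assembly):
        """All parts below an assembly, followed transitively."""
        found, stack = set(), [assembly]
        while stack:
            for part in self.children.get(stack.pop(), ()):
                if part not in found:
                    found.add(part)
                    stack.append(part)
        return found

    def peers_of(self, part):
        """Peer group: the other parts sharing this part's assembly."""
        assembly = self.parent.get(part)
        if assembly is None:
            return set()
        return self.children[assembly] - {part}


net = DeviceNetwork()
net.attach("rack-1", "site")
net.attach("drive-a", "rack-1")
net.attach("drive-b", "rack-1")
```

Calling `net.attach("drive-b", "rack-2")` later would move the drive, so `peers_of` and `parts_of` always reflect the ongoing configuration rather than a fixed snapshot.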

Late data:

  • Delays from intermittent connectivity, upstream failures;
  • Lineage tracking is critical; and
  • Watermarks to identify when sufficient data has arrived.

But collecting and analyzing data comes with challenges. Delay is endemic, so it is important to have a system that accounts for it by recognizing when enough data is present to draw insights from it. Watermarking is one way to do so: it answers the question, “When do I have enough data in the system to reliably work with it?” It also tells the system when enough data has arrived to process, and allows processing to be triggered at different times.
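
The watermark idea can be sketched with a common stream-processing heuristic: treat the watermark as the largest event time seen so far minus an allowed lateness, and finalize a time window once the watermark passes its end. This is an assumption about how such a system might work, not the mechanism Bodkin described; the class `WatermarkWindows` and its parameters are hypothetical.

```python
from collections import defaultdict


class WatermarkWindows:
    """Fixed-size event-time windows that close when the watermark passes them."""

    def __init__(self, window_size, allowed_lateness):
        self.window_size = window_size
        self.allowed_lateness = allowed_lateness
        self.max_event_time = None
        self.open = defaultdict(list)  # window start -> values still accumulating
        self.closed = {}               # window start -> finalized values
        self.late = []                 # events that arrived after their window closed

    @property
    def watermark(self):
        """Heuristic: no event older than this is expected to arrive anymore."""
        if self.max_event_time is None:
            return None
        return self.max_event_time - self.allowed_lateness

    def add(self, event_time, value):
        start = event_time - event_time % self.window_size
        wm = self.watermark
        # The window for this event has already been passed by the watermark.
        if wm is not None and start + self.window_size <= wm:
            self.late.append((event_time, value))
            return
        self.open[start].append(value)
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time
        # Trigger: finalize every window whose end is at or before the watermark.
        for s in sorted(self.open):
            if s + self.window_size <= self.watermark:
                self.closed[s] = self.open.pop(s)


w = WatermarkWindows(window_size=10, allowed_lateness=5)
for event_time, value in [(1, "a"), (12, "b"), (8, "c"), (16, "d"), (3, "e")]:
    w.add(event_time, value)
```

In this run the event at time 8 arrives out of order but before the watermark passes its window, so it is still counted; the event at time 3 arrives after the window has closed and is set aside as late. A larger `allowed_lateness` trades slower results for fewer late events.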

Case Studies

Bodkin introduced two case studies illustrating how Teradata helped companies that required big data analytics.

The first case study featured an unnamed “global manufacturer” of storage devices: hard drives, solid-state drives and object storage. It produces hundreds of millions of products each year, built with complex components at geographically dispersed manufacturing sites. Making things even more difficult, the company had accumulated five years of data for warranty reasons. Its goal was to expose the entire DNA of a device, from development through manufacturing and reliability testing, as well as the “living” behavior of the device.

The company’s problem was that its engineers were having a hard time finding the correct data, and it needed to shorten the cycle time for new product development.

The technical challenges included:

  • Data silos across manufacturing facilities;
  • Difficulty storing and exposing binary and other data types;
  • Current data warehouses (DWs) unable to keep pace with volume; and
  • No platform for large-scale analytics.

Solution

Teradata helped bring the data together and solve new problems, which made it possible to scan 380 billion test points for 8 million products. The scan surfaced several irregular distributions, which let the team identify a software bug that caused failures, saving millions of dollars.

The second case study involved an unnamed “global health care device manufacturer.”

Its goal was to improve patient outcomes. To do so, it expanded data collection, and its storage grew from 50 terabytes to 20 petabytes after the initiative began.

The solution included:

  • Microservices architecture;
  • Operational real-time analytics;
  • Data lake to feed warehouse;
  • Public cloud/security-first approach; and
  • Agile production releases.