The Internet of Things (IoT) opens the door to use cases we once deemed impossible: connecting to and reading data from machines we used to treat as black boxes, such as elevators, home routers, and sensors measuring temperature, humidity, or room occupancy, in order to improve quality of life, ensure operational efficiency, proactively monitor and control factories, and more.
Engineers work on IoT applications to enable such use cases at scale. Data, whether raw, prepared, or consumable, whether real time or historical, is at the centre of most IoT use cases. IoT applications are complex and often affected by events outside of the engineers' control: the network provider drops the connection, or a large percentage of the device fleet goes offline and comes back online at the same time. Multiple points of failure in the end-to-end architecture, combined with such external factors, are just some of the things that can undermine data accuracy, consistency, completeness, and timeliness. Problems that seem minor at small scale, such as 1% of your fleet not sending data for 20 minutes, might break your use case at large scale. Is it one device that is misbehaving, or are 100,000 devices not performing as expected?
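To make the completeness question concrete, the sketch below shows one minimal way to flag devices that have stopped reporting. It assumes a hypothetical `last_seen` map of device IDs to last-report timestamps maintained by the ingestion pipeline; it illustrates the 20-minute example above and is not a prescribed implementation.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical last-report timestamps per device ID; in practice these would
# come from your ingestion pipeline or device registry, not a hard-coded dict.
last_seen = {
    "device-001": datetime.now(timezone.utc) - timedelta(minutes=2),
    "device-002": datetime.now(timezone.utc) - timedelta(minutes=45),
    "device-003": datetime.now(timezone.utc) - timedelta(minutes=3),
}

STALE_AFTER = timedelta(minutes=20)  # threshold from the 20-minute example above

def stale_devices(last_seen, now=None):
    """Return device IDs whose last report is older than STALE_AFTER."""
    now = now or datetime.now(timezone.utc)
    return [dev for dev, ts in last_seen.items() if now - ts > STALE_AFTER]

stale = stale_devices(last_seen)
fleet_size = len(last_seen)
print(f"{len(stale)} of {fleet_size} devices "
      f"({100 * len(stale) / fleet_size:.1f}%) have not reported in 20 minutes: {stale}")
```

At fleet scale, the same check would run against aggregated metrics rather than an in-memory dictionary, but the question it answers stays the same: is one device misbehaving, or is a meaningful slice of the fleet silent?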
In this talk, we will look at how we, as engineers, can build for resilience and ensure data quality at scale in our IoT applications, focusing on AWS IoT and Data services. We will examine concrete examples of what can go wrong at scale with fleets of devices and how it affects data quality, and how to design and build to mitigate such scenarios.
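As one example of designing for resilience against the reconnection storms mentioned above, the sketch below applies exponential backoff with full jitter on the device side. It is a minimal illustration under stated assumptions, not the approach prescribed by the talk: `connect` is a hypothetical callable standing in for whatever client library establishes the connection, and the parameters would depend on the fleet and the backend.

```python
import random
import time

def reconnect_with_backoff(connect, max_attempts=8, base_delay=1.0, max_delay=300.0):
    """Retry `connect` with exponential backoff and full jitter.

    `connect` is a hypothetical callable that raises on failure and returns
    a connection object on success.
    """
    for attempt in range(max_attempts):
        try:
            return connect()
        except Exception:
            # Full jitter: sleep a random amount between 0 and the capped
            # exponential delay, so devices do not retry in lockstep.
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, delay))
    raise RuntimeError("could not reconnect after %d attempts" % max_attempts)
```

Spreading retries randomly within a growing window keeps a fleet that lost connectivity at the same moment from reconnecting simultaneously and overloading the backend, which is one way small-scale nuisances are prevented from becoming large-scale outages.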