Data Management Aspects in IoT

Data management is a critical task in IoT, given the high number of devices (things) already available, in the order of tens of billions. Considering the data traffic each of them generates through, e.g., sensor networks, infotainment (soft news), surveillance systems, mobile social network clients, and so on, we are now beyond the zettabyte era (1 ZB = 10^21 bytes, roughly 2^70 bytes). This has opened up several new challenges in IoT data management, giving rise to data science and big data technologies. Such challenges should be regarded not only as issues to solve but also as significant opportunities, fuelling the digital economy with new directions such as Cloudonomics [1] and IoTonomics, where data is treated as a utility, a commodity to manage, curate, store, and trade appropriately. Properly managing data in IoT contexts is therefore not only critical but also of strategic importance for business players as well as for users, who are evolving into prosumers (producers-consumers).

From a technological perspective, the main aspects of IoT data management are:

  • Data source - data generation and production are an important part of IoT, involving sensors that probe the physical system. In a cyber-physical-social system view, such sensors can also be virtual (e.g. software) or even human (e.g. citizens, crowdsensing). The main issues in data production relate to the type and format of data and to heterogeneity in measurements. Semantics is the key to addressing these issues, through specific standards such as Sensor Web Enablement and Semantic Sensor Networks.
  • Data collection/gathering - once data are generated, they should be gathered and made available for processing. The collection process needs to ensure that the collected data are well defined and accurate, so that subsequent decisions based on the findings are valid. Types of data collection include census (data collected about everything in a group or statistical population), sample survey (data collected from only part of the total population), and administrative byproduct (data collected as a byproduct of an organization’s day-to-day operations). IoT smart objects and things usually deliver data to collection points over wireless communication technologies such as Zigbee, Bluetooth, LoRa, Wi-Fi and 3G/4G networks.
  • Filtering - a specific preprocessing activity, usually performed at data source or data collector (IoT) nodes (e.g. motes, base stations, hotspots, gateways), that cleans the data by removing noise and unhelpful information.
  • Aggregation/fusion - before being sent to processing nodes, data are further elaborated, compressed, aggregated and fused (sensor/data fusion) to reduce bandwidth usage and the overall volume of raw data to be transmitted and stored.
  • Processing - once data are adequately collected, filtered, aggregated, and fused, they can be processed. Processing can be local or remote and usually includes preprocessing activities that prepare data for the actual processing. Local processing, when possible, mainly consists of fast, lightweight computation at the edge (Edge computing) and in the Fog layer, quickly providing results and local analytics. More complex computations are usually delegated to remote (physical or virtual) servers, provided either by local nodes (e.g. communication servers, cloudlets) in a Fog computing fashion or by Cloud providers as virtual machines hosted in data centres. This kind of computation can also involve historical data, providing global analytics, but it can hardly meet the requirements of time-constrained and real-time applications.
  • Storage/archive - remote servers are also used for permanently storing and archiving data, making them available for further processing, even by third parties. Databases are often used for this purpose, mainly based on distributed NoSQL key-value store technologies to improve reliability and performance.
  • Delivering/presentation/visualization - the results of processing must then be delivered to requestors and users. They therefore have to be adequately organized and formatted, ready for end-users. Data visualization is becoming an integral part of the IoT: it displays the avalanche of collected data in meaningful ways that clearly present the insights hidden within this massive amount of information.
  • Security and privacy - data privacy and security are among the most critical issues in IoT data management. Reliable techniques for secure data transmission, such as TLS, are already available. As a result, IoT data security issues mainly concern [2] securing the IoT devices themselves: since these are usually resource-constrained, they often cannot adopt traditional cryptographic schemes for data encryption/decryption. Data privacy and integrity should also be enforced on remote storage servers, by anonymizing data and allowing owners to manage (monitor, remove) them properly while ensuring availability. Indeed, security and privacy issues vertically span the whole IoT stack. A promising technique to address IoT security issues, attracting growing interest from both academic and business communities, is blockchain [3].
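
The filtering and aggregation steps in the list above can be illustrated with a minimal Python sketch of what a collector node (e.g. a gateway) might do with one window of sensor readings before forwarding it. This is only an illustration under stated assumptions: the function names (filter_outliers, aggregate, process_window), the median-based outlier rule, and the max_dev threshold are hypothetical choices and not part of any IoT standard or library.

```python
from statistics import mean, median


def filter_outliers(readings, max_dev=5.0):
    """Filtering: drop readings that deviate from the window median
    by more than max_dev (an assumed, application-specific threshold)."""
    if not readings:
        return []
    mid = median(readings)
    return [r for r in readings if abs(r - mid) <= max_dev]


def aggregate(readings):
    """Aggregation: summarise a window of readings in one compact record,
    so that only the summary has to be transmitted and stored."""
    if not readings:
        return None
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }


def process_window(readings, max_dev=5.0):
    """Filter first, then aggregate, as a collector node might do."""
    return aggregate(filter_outliers(readings, max_dev))


if __name__ == "__main__":
    # Hypothetical temperature samples; 85.0 stands in for a sensor glitch.
    window = [21.0, 21.5, 22.0, 85.0, 21.5]
    print(process_window(window))
```

In this sketch, five raw samples are reduced to a single four-field record, and the glitch value is discarded before it can distort the mean. A real deployment would choose the filtering rule and the summary statistics according to the sensor type and the application.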
en/iot-open/introduction/introduction_to_data-related_design_questions_of_iot.txt · Last modified: 2023/11/23 16:08 by pczekalski
CC Attribution-Share Alike 4.0 International