Differences

This shows you the differences between two versions of the page.

en:iot-reloaded:iot_data_analysis [2024/12/10 15:41] blankaen:iot-reloaded:iot_data_analysis [2025/05/17 08:56] (current) agrisnik
Line 1: Line 1:
 ====== IoT Data Analysis ====== ====== IoT Data Analysis ======
-IoT systems are built to provide better insights into different processes and systems so that better decisions can be made. The insights are provided by measuring the statuses of the systems or process elements represented by data. Unfortunately, without properly interpreting the data content, the bits and bytes become useless. Therefore, providing a means for understanding data is an essential property of a modern IoT system. +IoT systems are built to provide better insights into different processes and systems to make better decisions. The insights are provided by measuring the statuses of the systems or process elements represented by data. Unfortunately, the bits and bytes become useless without adequately interpreting the data content. Therefore, providing a means for understanding data is an essential property of a modern IoT system. 
 Today, IoT systems produce a vast amount of data, which is very hard to use manually. Thanks to modern hardware and software developments, it is possible to develop fully or semi-automated systems for data analysis and interpretation, which may go further into decision-making and acting according to the decisions.  Today, IoT systems produce a vast amount of data, which is very hard to use manually. Thanks to modern hardware and software developments, it is possible to develop fully or semi-automated systems for data analysis and interpretation, which may go further into decision-making and acting according to the decisions. 
  
Line 11: Line 11:
 === Variety === === Variety ===
  
-Jain explained that big data is highly heterogeneous in terms of source, kind, and nature. Having different systems, processes, sensors, and other data sources, variety is usually a distinctive feature of practical IoT systems. For instance, a system of intelligent office buildings would need data from a building management system, appliances and independent sensors, and external sources like weather stations or forecasts from appropriate external weather forecast APIs (Application programming interfaces). Additionally, the given system might require historical data from other sources, like XML documents, CSV files or other sources, diversifying the sources even more. +Jain explained that Big Data is highly heterogeneous regarding source, kind, and nature. Having different systems, processes, sensors, and other data sources, variety is usually a distinctive feature of practical IoT systems. For instance, a system of intelligent office buildings would need data from a building management system, appliances and independent sensors, and external sources like weather stations or forecasts from appropriate external weather forecast APIs (Application programming interfaces). Additionally, the given system might require historical data from other sources, like XML documents, CSV files or other sources, diversifying the sources even more. 
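
The variety described above can be illustrated with a minimal sketch that normalises three heterogeneous inputs (a CSV history file, an XML export from a building management system, and a JSON weather response) into one common record shape. All payloads, field names, and the record layout are hypothetical assumptions made for illustration; they are not taken from any real system or API.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample payloads; every field name here is an assumption.
CSV_HISTORY = "timestamp,room,temp_c\n2024-12-10T15:00:00,101,21.5\n"
XML_BMS = '<readings><reading ts="2024-12-10T15:05:00" room="101" temp="21.9"/></readings>'
JSON_WEATHER = '{"time": "2024-12-10T15:10:00", "outdoor_temp_c": 3.2}'


def from_csv(text):
    """Yield common-format records from CSV history data."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {"ts": row["timestamp"], "source": "history_csv",
               "location": row["room"], "temp_c": float(row["temp_c"])}


def from_xml(text):
    """Yield common-format records from an XML export."""
    for node in ET.fromstring(text).iter("reading"):
        yield {"ts": node.get("ts"), "source": "bms_xml",
               "location": node.get("room"), "temp_c": float(node.get("temp"))}


def from_json(text):
    """Yield a common-format record from a JSON weather response."""
    doc = json.loads(text)
    yield {"ts": doc["time"], "source": "weather_api",
           "location": "outdoor", "temp_c": float(doc["outdoor_temp_c"])}


def unified_records():
    """Merge all sources and order them by ISO 8601 timestamp."""
    records = [*from_csv(CSV_HISTORY), *from_xml(XML_BMS), *from_json(JSON_WEATHER)]
    return sorted(records, key=lambda r: r["ts"])
```

Each source keeps its own parser, so adding a further format only requires one more adapter function that yields the same record shape.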
  
 === Veracity === === Veracity ===
Line 19: Line 19:
 === Velocity === === Velocity ===
  
-Data velocity characterises the data bound to the time and its importance during a specific period or at a particular time instant. A good example might be any real-time system like an industrial process control system, where reactions or decisions must be made during a fixed period of time, requiring data at particular time instants. In this case, data has a flow nature of a specific density. +Data velocity characterises the data bound to the time and its importance during a specific period or at a particular time instant. A good example might be any real-time system like an industrial process control system, where reactions or decisions must be made during a fixed period, requiring data at particular time instants. In this case, data has a flow nature of a specific density. 
  
 === Value === === Value ===
Line 26: Line 26:
  
 ====== ====== ====== ======
-Dealing with big data requires specific hardware and software infrastructure. While there is a certain number of typical solutions and a lot more customise, some of the most popular are explained here:+Dealing with Big Data requires specific hardware and software infrastructure. While there are a number of typical solutions and many more customised ones, some of the most popular are explained here:
  
 === Relational DB-based systems === === Relational DB-based systems ===
Line 36: Line 36:
   * Enables asynchronous reactions to events by triggering internal events.    * Enables asynchronous reactions to events by triggering internal events. 
   * Data reading might be scaled out using multiple entities, while writing might be scaled up using more productive servers.    * Data reading might be scaled out using multiple entities, while writing might be scaled up using more productive servers. 
-Unfortunately, scaling out data writing is not always possible and is usually supported at a high cost for software products. +Unfortunately, scaling out data writing is not always possible and is usually supported at a high cost for software products (figure 1).
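
As a rough illustration of reacting to events inside a relational system, the sketch below uses a database trigger to record an alert whenever a high reading is inserted. It uses Python's built-in `sqlite3` module purely for convenience; the table names, the trigger name, and the 30.0 threshold are hypothetical. Note that a SQLite trigger runs as part of the insert itself, so an asynchronous reaction would additionally need a separate consumer that reads the `alerts` table.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a trigger on new readings."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE readings (sensor_id TEXT, ts TEXT, temp_c REAL);
        CREATE TABLE alerts (sensor_id TEXT, ts TEXT, temp_c REAL);
        -- Hypothetical rule: any reading above 30.0 produces an alert row.
        CREATE TRIGGER high_temp AFTER INSERT ON readings
        WHEN NEW.temp_c > 30.0
        BEGIN
            INSERT INTO alerts VALUES (NEW.sensor_id, NEW.ts, NEW.temp_c);
        END;
    """)
    return conn


def insert_reading(conn, sensor_id, ts, temp_c):
    """Store one reading; the trigger decides whether an alert is raised."""
    conn.execute("INSERT INTO readings VALUES (?, ?, ?)", (sensor_id, ts, temp_c))
    conn.commit()


def pending_alerts(conn):
    """Return alerts that a separate worker could pick up and act on."""
    return conn.execute(
        "SELECT sensor_id, temp_c FROM alerts ORDER BY ts").fetchall()
```

A reader process can poll `pending_alerts` without blocking writers for long, which loosely mirrors the idea of scaling reads separately from writes.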
  
 <figure RelationalDBMS> <figure RelationalDBMS>
Line 45: Line 45:
 === Complex Event Processing (CEP) systems === === Complex Event Processing (CEP) systems ===
  
-CEP systems are very application-tailored, enabling significant productivity at a reasonable cost. High productivity is usually needed for processing data streams, such as voice or video. Maintaining a limited time window for data processing is possible, which is relevant for systems close to real-time. +CEP systems are very application-tailored, enabling significant productivity at a reasonable cost. High productivity is usually needed for processing data streams, such as voice or video. Maintaining a limited time window for data processing is possible, which is relevant for systems close to real-time (figure {{ref>CEP_systems}}).
 Some of the most common drawbacks to be considered are: Some of the most common drawbacks to be considered are:
   * It might be scaled up only by introducing higher productivity hardware, which is limited by the application-specific design. To some extent, the design might be more flexible if microservices and containerisation are applied.    * It might be scaled up only by introducing higher productivity hardware, which is limited by the application-specific design. To some extent, the design might be more flexible if microservices and containerisation are applied. 
-  * Due to the factors mentioned above and the complexity, the maintenance costs are usually higher than a universal design.+  * Due to the factors mentioned above and the complexity, the maintenance costs are usually higher than a universal design (figure 2).
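
The limited time window mentioned above can be sketched as a sliding window over a stream of timestamped values. This is a minimal illustration, not a full CEP engine: the class and function names, the 10-second window, and the threshold of 80.0 are all hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import deque


class SlidingWindow:
    """Keep only the events that fall within the last `window_s` seconds."""

    def __init__(self, window_s):
        self.window_s = window_s
        self.events = deque()  # (timestamp_s, value) pairs, oldest first

    def add(self, ts, value):
        self.events.append((ts, value))
        # Evict events that are no longer inside the time window.
        while self.events and self.events[0][0] <= ts - self.window_s:
            self.events.popleft()

    def mean(self):
        return sum(v for _, v in self.events) / len(self.events)


def detect_overheat(stream, window_s=10.0, threshold=80.0):
    """Return timestamps at which the windowed mean exceeds the threshold."""
    window = SlidingWindow(window_s)
    alerts = []
    for ts, value in stream:
        window.add(ts, value)
        if window.mean() > threshold:
            alerts.append(ts)
    return alerts
```

Because old events are evicted as new ones arrive, memory use is bounded by the event rate times the window length, which is what makes this style of processing suitable for near-real-time streams.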
  
-<figure CEP systems>+<figure CEP_systems>
 {{ :en:iot-reloaded:ceps.png?400 | CEP Systems}} {{ :en:iot-reloaded:ceps.png?400 | CEP Systems}}
 <caption>CEP Systems</caption> <caption>CEP Systems</caption>
Line 57: Line 57:
 === NoSQL systems === === NoSQL systems ===
  
-As the name suggests, the main characteristic is higher flexibility in data models, which overcomes the limitations of highly structured relational data models. NoSQL systems are usually distributed, where the distribution is the primary tool to enable supreme flexibility. In IoT systems, software typically gets older faster than hardware, which requires the maintenance of many versions of communication protocols and data formats to ensure back compatibility. Another reason is the variety of hardware suppliers, where some protocols or data formats are specific to the given vendor. +As the name suggests, the main characteristic is higher flexibility in data models, which overcomes the limitations of highly structured relational data models (figure {{ref>NoSQL_systems}}). NoSQL systems are usually distributed, where the distribution is the primary tool to enable supreme flexibility. In IoT systems, software typically gets older faster than hardware, which requires the maintenance of many versions of communication protocols and data formats to ensure back compatibility. Another reason is the variety of hardware suppliers, where some protocols or data formats are specific to the given vendor. 
 It also provides a means for scalability out and up, enabling high future tolerance and resilience. A typical approach uses a key-value or key-document approach, where a unique key indexes incoming data blocks or documents (JSON, for instance). It also provides a means for scalability out and up, enabling high future tolerance and resilience. A typical approach uses a key-value or key-document approach, where a unique key indexes incoming data blocks or documents (JSON, for instance).
-Some other designs might extend the SQL data models by others – object models, graph models, or the mentioned key-value models, providing highly purpose-driven and, therefore, productive designs. However, the complexity of the design raises problems of data integrity as well as the complexity of maintenance. +Some other designs might extend the SQL data models by others – object models, graph models, or the mentioned key-value models, providing highly purpose-driven and, therefore, productive designs. However, the complexity of the design raises problems of data integrity as well as the complexity of maintenance (figure 3).
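
The key-document approach can be sketched with a tiny in-memory store in which a unique key indexes each incoming JSON document. The sketch also shows how two versions of a device data format might coexist without a fixed schema. The `DocumentStore` class, both document layouts, and the `temperature_c` helper are hypothetical and serve only as an illustration; a real NoSQL system would add distribution, persistence, and replication.

```python
import json
import uuid


class DocumentStore:
    """Minimal in-memory key-document store (illustrative only)."""

    def __init__(self):
        self._docs = {}

    def put(self, document, key=None):
        """Store a document under a unique key and return that key."""
        key = key or str(uuid.uuid4())
        # Documents are kept as JSON text, so no schema is enforced.
        self._docs[key] = json.dumps(document)
        return key

    def get(self, key):
        return json.loads(self._docs[key])


def temperature_c(doc):
    """Read a temperature from either of two hypothetical format versions.

    Assumed v1: {"temp": 21.5}
    Assumed v2: {"version": 2, "temperature": {"value": 212, "unit": "F"}}
    """
    if doc.get("version", 1) == 1:
        return doc["temp"]
    t = doc["temperature"]
    return t["value"] if t["unit"] == "C" else (t["value"] - 32) * 5 / 9
```

The flexibility comes at the cost noted above: integrity rules such as "every document has a temperature" are no longer enforced by the store and must be handled in application code.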
  
-<figure NoSQL systems>+<figure NoSQL_systems>
 {{ :en:iot-reloaded:keyvalue.png?400 | NoSQL Systems}} {{ :en:iot-reloaded:keyvalue.png?400 | NoSQL Systems}}
 <caption>NoSQL Systems</caption> <caption>NoSQL Systems</caption>
en/iot-reloaded/iot_data_analysis.1733845308.txt.gz · Last modified: 2024/12/10 15:41 by blanka
CC Attribution-Share Alike 4.0 International