This shows you the differences between two versions of the page.
en:iot-reloaded:iot_data_analysis [2024/11/29 10:23] – pczekalski | en:iot-reloaded:iot_data_analysis [2025/05/17 08:56] (current) – agrisnik | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== IoT Data Analysis ====== | ====== IoT Data Analysis ====== | ||
- | IoT systems, in their essence, | + | IoT systems are built to provide |
Today, IoT systems produce a vast amount of data, which is very hard to use manually. Thanks to modern hardware and software developments, | Today, IoT systems produce a vast amount of data, which is very hard to use manually. Thanks to modern hardware and software developments, | ||
- | As various resources have stated, IoT in most cases, complies with the so-called big 5Vs of Big Data, where just one correspondence is needed to solve a Big Data problem. As has been explained by Jain et al. ((Jain, A., Mittal, S., Bhagat, A., Sharma, D.K. (2023). Big Data Analytics and Security Over the Cloud: Characteristics, | + | As various resources have stated, IoT, in most cases, complies with the so-called big 5Vs of Big Data, where just one correspondence is needed to solve a Big Data problem. As has been explained by Jain et al. ((Jain, A., Mittal, S., Bhagat, A., Sharma, D.K. (2023). Big Data Analytics and Security Over the Cloud: Characteristics, |
=== Volume === | === Volume === | ||
Line 11: | Line 11: | ||
=== Variety === | === Variety === | ||
- | As Jain explained, big data is highly heterogeneous | + | Jain explained |
=== Veracity === | === Veracity === | ||
Line 19: | Line 19: | ||
=== Velocity === | === Velocity === | ||
- | Data velocity characterises how closely data is bound to time and how important it is during a specific period or at a particular time instant. A good example is any real-time system, such as an industrial process control system, where reactions or decisions must be made within a fixed period | + | Data velocity characterises how closely data is bound to time and how important it is during a specific period or at a particular time instant. A good example is any real-time system, such as an industrial process control system, where reactions or decisions must be made within a fixed period, requiring data at particular time instants. In this case, data has a flow nature of a specific | ||
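The time-bound nature of such data can be pictured as a freshness check: a reading is only useful for a decision if it was sampled recently enough before the decision instant. The following is a minimal sketch under that assumption; the names `Reading`, `is_usable`, `latest_usable` and the `max_age_ms` parameter are hypothetical and are not taken from the chapter.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Reading:
    sensor: str
    value: float
    timestamp_ms: int  # time the sample was taken, in milliseconds


def is_usable(reading: Reading, decision_time_ms: int, max_age_ms: int) -> bool:
    """Return True if the reading is not from the future and not older than max_age_ms."""
    age_ms = decision_time_ms - reading.timestamp_ms
    return 0 <= age_ms <= max_age_ms


def latest_usable(
    readings: Iterable[Reading], decision_time_ms: int, max_age_ms: int
) -> Optional[Reading]:
    """Pick the most recent reading that is still fresh enough, or None if there is none."""
    usable = [r for r in readings if is_usable(r, decision_time_ms, max_age_ms)]
    return max(usable, key=lambda r: r.timestamp_ms, default=None)
```

A controller that must decide at a fixed instant would call `latest_usable` with its own deadline and fall back to a safe default when the result is `None`.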
=== Value === | === Value === | ||
- | Since the IoT systems and their data analysis subsystems are built to add value to their owner, the costs of the development and ownership should not exceed the returned value. | + | Since IoT systems and their data analysis subsystems are built to add value to their owners, the development and ownership | ||
====== ====== | ====== ====== | ||
- | Dealing with big data requires specific hardware and software infrastructure. While there is a certain number of typical solutions and a lot more customise, some of the most popular are explained here: | + | Dealing with Big Data requires specific hardware and software infrastructure. While there is a certain number of typical solutions and a lot more customised, some of the most popular are explained here: |
=== Relational DB-based systems === | === Relational DB-based systems === | ||
Those systems are based on well-known relational data models and appropriate database management systems like MS SQL Server, Oracle Server, MySQL, etc. There are some advantageous features of those systems, for instance: | Those systems are based on well-known relational data models and appropriate database management systems like MS SQL Server, Oracle Server, MySQL, etc. There are some advantageous features of those systems, for instance: | ||
- | * Advantages of SQL (Structured Query Language): enabling easy manipulation | + | * Advantages of SQL (Structured Query Language): enabling easy data manipulation while maintaining a relatively good expressiveness of the data model. | ||
- | * A well-designed set of software tools and interfaces enabling integration with a large number of different systems; | + | * A well-designed set of software tools and interfaces enabling integration with many different systems. |
* A lot of built-in data processing routines (stored procedures) provide higher development productivity. | * A lot of built-in data processing routines (stored procedures) provide higher development productivity. | ||
* Enables asynchronous reactions to events by triggering internal events. | * Enables asynchronous reactions to events by triggering internal events. | ||
* Data reading might be scaled out using multiple entities, while writing might be scaled up using more productive servers. | * Data reading might be scaled out using multiple entities, while writing might be scaled up using more productive servers. | ||
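The event-reaction feature listed above can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module rather than any of the servers named in the text, and the table names `readings` and `alerts`, the trigger name `high_temp_alert` and the 80.0 threshold are hypothetical. SQLite fires triggers synchronously inside the inserting statement, so the sketch only shows the idea of the database reacting to an event, not asynchronous delivery.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a trigger that records over-threshold readings."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE readings (
            id INTEGER PRIMARY KEY,
            sensor TEXT NOT NULL,
            temperature REAL NOT NULL
        );
        CREATE TABLE alerts (
            reading_id INTEGER NOT NULL,
            message TEXT NOT NULL
        );
        -- Hypothetical rule: any reading above 80.0 produces an alert row.
        CREATE TRIGGER high_temp_alert
        AFTER INSERT ON readings
        WHEN NEW.temperature > 80.0
        BEGIN
            INSERT INTO alerts (reading_id, message)
            VALUES (NEW.id, 'temperature above threshold on ' || NEW.sensor);
        END;
        """
    )
    return conn


def insert_reading(conn: sqlite3.Connection, sensor: str, temperature: float) -> None:
    """Store one reading; the trigger decides whether an alert is needed."""
    conn.execute(
        "INSERT INTO readings (sensor, temperature) VALUES (?, ?)",
        (sensor, temperature),
    )
    conn.commit()


def fetch_alerts(conn: sqlite3.Connection) -> list:
    """Return all alerts as (reading_id, message) tuples, oldest first."""
    return conn.execute(
        "SELECT reading_id, message FROM alerts ORDER BY reading_id"
    ).fetchall()
```

The application code only inserts readings; the rule that turns a reading into an alert lives in the database, which is the productivity argument made for built-in routines above.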
- | Unfortunately, | + | Unfortunately, |
<figure RelationalDBMS> | <figure RelationalDBMS> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
=== Complex Event Processing (CEP) systems === | === Complex Event Processing (CEP) systems === | ||
- | CEP systems are very application-tailored, | + | CEP systems are very application-tailored, |
Some of the most common drawbacks to be considered are: | Some of the most common drawbacks to be considered are: | ||
* It might be scaled up only by introducing higher productivity hardware, which is limited by the application-specific design. To some extent, the design might be more flexible if microservices and containerisation are applied. | * It might be scaled up only by introducing higher productivity hardware, which is limited by the application-specific design. To some extent, the design might be more flexible if microservices and containerisation are applied. | ||
- | * Due to the factors mentioned above and the complexity, the maintenance costs are usually higher than a universal design. | + | * Due to the factors mentioned above and the complexity, the maintenance costs are usually higher than a universal design |
- | < | + | < |
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
=== NoSQL systems === | === NoSQL systems === | ||
- | As the name suggests, the main characteristic is higher flexibility in data models, which overcomes the limitations of highly structured relational data models. NoSQL systems are usually distributed, | + | As the name suggests, the main characteristic is higher flexibility in data models, which overcomes the limitations of highly structured relational data models |
- | It also provides a means for scalability out and up, enabling high future tolerance and resilience. A typical approach | + | It also provides a means for scalability out and up, enabling high future tolerance and resilience. A typical approach |
- | Some other designs might extend the SQL data models by others – object models, graph models, or the mentioned key-value models, providing highly purpose-driven and, therefore, productive designs. However, the complexity of the design raises problems of data integrity as well as the complexity of maintenance. | + | Some other designs might extend the SQL data models by others – object models, graph models, or the mentioned key-value models, providing highly purpose-driven and, therefore, productive designs. However, the complexity of the design raises problems of data integrity as well as the complexity of maintenance |
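To make the key-value idea more concrete, the sketch below stores schema-less records under string keys and spreads them over several in-memory partitions by hashing the key. It is a simplified illustration only: the class name `PartitionedKeyValueStore` and its methods are hypothetical, the partitions are plain dictionaries standing in for separate nodes, and production systems typically add replication and more elaborate partitioning than the simple modulo used here.

```python
import hashlib
from typing import Any, List


class PartitionedKeyValueStore:
    """Toy key-value store that spreads keys over a fixed number of in-memory partitions."""

    def __init__(self, node_count: int = 3) -> None:
        if node_count < 1:
            raise ValueError("node_count must be at least 1")
        # Each dict stands in for one storage node.
        self._nodes: List[dict] = [{} for _ in range(node_count)]

    def _node_for(self, key: str) -> int:
        # A stable hash is used so that a key always maps to the same partition.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._nodes)

    def put(self, key: str, value: Any) -> None:
        self._nodes[self._node_for(key)][key] = value

    def get(self, key: str, default: Any = None) -> Any:
        return self._nodes[self._node_for(key)].get(key, default)

    def node_sizes(self) -> List[int]:
        """Number of keys held by each partition."""
        return [len(node) for node in self._nodes]
```

Because values are arbitrary objects, a temperature reading and a device status record with different fields can share one store, which is the flexibility the paragraph describes; the price is that nothing enforces integrity between records.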
- | < | + | < |
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 69: | Line 69: | ||
This is probably the most productive type of system, providing high flexibility, | This is probably the most productive type of system, providing high flexibility, | ||
- | * Hazelcast | + | * Hazelcast |
- | * JBOSS Infinispan https:// | + | * JBOSS Infinispan |
- | * IBM eXtreme Scale https:// | + | * IBM eXtreme Scale ((https:// |
- | * Gigaspace XAP Elastic caching edition www.gigaspaces.com/ | + | * Gigaspace XAP Elastic caching edition |
- | * Oracle Coherence | + | * Oracle Coherence |
- | * Terracotta enterprise suite www.terracotta.org/ | + | * Terracotta enterprise suite ((www.terracotta.org/ |
- | * Pivotal Gemfire | + | * Pivotal Gemfire |
- | ====== | ||
+ | <WRAP excludefrompdf> | ||
This chapter is devoted to the main groups of algorithms for numerical data analysis and interpretation, | This chapter is devoted to the main groups of algorithms for numerical data analysis and interpretation, | ||
Line 88: | Line 88: | ||
* [[en: | * [[en: | ||
* [[en: | * [[en: | ||
+ | </ |