Differences

--- en:iot-reloaded:introduction_to_time_series_analysis [2024/12/10 21:40] – pczekalski
+++ en:iot-reloaded:introduction_to_time_series_analysis [2025/05/13 14:59] (current) – [A cooling system case] pczekalski
@@ Line 3: / Line 3: @@
 In the context of IoT systems, there might be several reasons why time series analysis is needed.
-The most widely ones are the following:
+The most widely used ones are the following:
   * **Process dynamics forecasting** for higher-performing decision support systems. An IoT system, coupled with appropriate cloud computing or other computing infrastructure, can provide not only a rich insight into the process dynamics but also a reliable forecast using regression algorithms like the ones discussed in the regressions section or more advanced like autoregressive integrated moving average (ARIMA) and seasonal ARIMA (SARIMA) ((Hyndman, Rob J; Athanasopoulos, George. 8.9 Seasonal ARIMA models. oTexts. Retrieved 19 May 2015.)) ((Box, George E. P. (2015). Time Series Analysis: Forecasting and Control. WILEY. ISBN 978-1-118-67502-1.)).
   * **Anomaly detection** is a highly valued feature of IoT systems. In its essence, anomaly detection is a set of methods enabling the recognition of unwanted or abnormal behaviour of the system over a specific time period. Anomalies might be expressed in data differently:
@@ Line 20: / Line 20: @@
 ===== A cooling system case =====
-A given industrial cooling system has to maintain a specific temperature mode of around -18oC. Due to the technology specifics, it goes through a defrost cycle every few hours to avoid ice deposits, leading to inefficiency and potential malfunction. However, a relatively short power supply interruption has been noticed at some point, which needs to be recognised in the future for reporting appropriately. The logged data series is depicted in the following figure {{ref>Cooling_system}}:
+A given industrial cooling system has to maintain a specific temperature mode of around -18 °C. Due to the specifics of the technology, it goes through a defrost cycle every few hours to avoid ice deposits, which would otherwise lead to inefficiency and potential malfunction. However, a relatively short power supply interruption has been noticed at some point, which needs to be recognised in the future for reporting appropriately. The logged data series is depicted in the following figure {{ref>Cooling_system}}:
@@ Line 57: / Line 57: @@
   * Samples of different patterns are different in length.
   * Samples of the same pattern are of different lengths.
-  * The interested phenomena (spikes) are located at different locations within the samples and are slightly different.
+  * The interesting phenomena (spikes) are located at different locations within the samples and are slightly different.
 The abovementioned issues expose the problem of calculating distances from one example to another, since comparing data points index by index will produce misleading distance values. To avoid this, a Dynamic Time Warping (DTW) metric has to be employed ((Gold, Omer; Sharir, Micha (2018). "Dynamic Time Warping and Geometric Edit Distance: Breaking the Quadratic Barrier". ACM Transactions on Algorithms. 14 (4). doi:10.1145/3230734. S2CID 52070903.)). For practical implementations in Python, it is highly recommended to visit the tslearn library documentation ((Romain Tavenard, Johann Faouzi, Gilles Vandewiele, Felix Divo, Guillaume Androz, Chester Holtz, Marie Payne, Roman Yurchak, Marc Rußwurm, Kushal Kolar, & Eli Woods (2020). TSlearn, A Machine Learning Toolkit for Time Series Data. Journal of Machine Learning Research, 21(118), 1-6.)).
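As a rough illustration of why DTW helps with samples like these, the following pure-Python sketch compares a point-by-point (Euclidean) distance with a textbook DTW distance. The function names and the toy samples are assumptions made up for this example and are not taken from the cooling system data; in practice a library such as tslearn would normally be used instead.

```python
import math


def pointwise_distance(a, b):
    """Euclidean distance; only defined for samples of equal length."""
    if len(a) != len(b):
        raise ValueError("samples must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of a
    # with the first j points of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matches an earlier point of b
                cost[i][j - 1],      # b[j-1] also matches an earlier point of a
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


# Hypothetical samples: the same spike at different positions and lengths.
early_spike = [0, 0, 5, 0, 0, 0, 0, 0]
late_spike = [0, 0, 0, 0, 0, 5, 0, 0]
longer_spike = [0, 0, 0, 5, 0, 0, 0, 0, 0, 0]

print(pointwise_distance(early_spike, late_spike))  # large: spikes do not overlap
print(dtw_distance(early_spike, late_spike))        # spikes are warped onto each other
print(dtw_distance(early_spike, longer_spike))      # different lengths are allowed
```

The point-by-point distance penalises the shifted spike twice and cannot handle the longer sample at all, while DTW aligns the spikes before summing the differences.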
@@ Line 74: / Line 74: @@
 </figure>
-As it might be noticed, the query (black) samples are somewhat different from the ones found to be "closest" by the KNN. However, because of the DTW advantages, the classification is done perfectly.
+As might be noticed, the query (black) samples are somewhat different from the ones found to be "closest" by the KNN. However, because of the DTW advantages, the classification is done perfectly.
 The same idea demonstrated here might be used to detect unknown anomalies by setting a similarity threshold for DTW, to classify known anomalies as shown here, or even for simple forecasting.
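The similarity threshold idea can be sketched as a 1-nearest-neighbour lookup that returns a known label only when the closest reference sample lies within a DTW distance limit, and reports the query as unknown otherwise. The reference samples, the labels, the helper name `classify_or_flag` and the threshold value are all invented for illustration; a suitable threshold would have to be tuned on real data.

```python
def dtw_distance(a, b):
    """Textbook DTW with absolute difference as local cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[len(a)][len(b)]


def classify_or_flag(query, references, threshold):
    """Return (label, distance); label is 'unknown' if no reference is close enough."""
    best_label, best_dist = min(
        ((label, dtw_distance(query, sample)) for label, sample in references),
        key=lambda pair: pair[1],
    )
    if best_dist > threshold:
        return "unknown", best_dist
    return best_label, best_dist


# Hypothetical temperature patterns in degrees Celsius (not real measurements).
references = [
    ("defrost", [-18, -18, -10, -4, -10, -18, -18]),
    ("power_cut", [-18, -15, -12, -9, -6, -3, 0, -9, -18]),
]
THRESHOLD = 5.0  # assumed value, chosen only for this toy data

known_query = [-18, -18, -18, -11, -4, -9, -18]
odd_query = [-18, -18, 20, 20, 20, -18]

print(classify_or_flag(known_query, references, THRESHOLD))
print(classify_or_flag(odd_query, references, THRESHOLD))
```

Keeping the distance in the return value makes it possible to log how far an unknown pattern was from the nearest known one when reporting it.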

en/iot-reloaded/introduction_to_time_series_analysis.1733866821.txt.gz · Last modified: 2024/12/10 21:40 by pczekalski