This shows you the differences between two versions of the page.
en:iot-reloaded:regression_models [2024/12/02 17:13] – [Piecewise linear models] ktokarz | en:iot-reloaded:regression_models [2024/12/10 21:33] (current) – pczekalski | ||
---|---|---|---|
Line 15: | Line 15: | ||
<figure Galton' | <figure Galton' | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 22: | Line 22: | ||
<figure Linear model 1> | <figure Linear model 1> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 32: | Line 32: | ||
* β0 and β1 are the y-axis intercept and the slope of the linear function, respectively | * β0 and β1 are the y-axis intercept and the slope of the linear function, respectively | ||
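As an illustration of how β0 and β1 can be obtained from data, here is a minimal sketch. It assumes that the model error being minimised is the sum of squared errors (ordinary least squares). The function name `fit_simple_linear` and the demo data are hypothetical and not part of the original page.

```python
import numpy as np

def fit_simple_linear(x, y):
    """Return (beta0, beta1) minimising the sum of squared errors.

    Assumption: ordinary least squares; the closed-form solution is
    beta1 = cov(x, y) / var(x) and beta0 = mean(y) - beta1 * mean(x).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    beta1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1

# Hypothetical data lying exactly on the line y = 1 + 2x
x_demo = [0.0, 1.0, 2.0, 3.0]
y_demo = [1.0, 3.0, 5.0, 7.0]
beta0_demo, beta1_demo = fit_simple_linear(x_demo, y_demo)
```

For noisy data the same function returns the line with the smallest squared error rather than an exact fit.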
- | Unfortunately, | + | Unfortunately, |
It means that the following equation might describe the model: | It means that the following equation might describe the model: | ||
<figure Linear model 2> | <figure Linear model 2> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 46: | Line 46: | ||
<figure Model error> | <figure Model error> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 53: | Line 53: | ||
<figure Coefficient velues> | <figure Coefficient velues> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 65: | Line 65: | ||
<figure Galton' | <figure Galton' | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 74: | Line 74: | ||
<figure Coefficient velues> | <figure Coefficient velues> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 83: | Line 83: | ||
* ei - error of the model' | * ei - error of the model' | ||
- | Since the error for the i-th value yi might be positive or negative and the model itself minimises the overall error, one might expect that the error is normally | + | Since the error for the i-th value yi might be positive or negative and the model itself minimises the overall error, one might expect that the error is typically | ||
<figure Galton' | <figure Galton' | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 94: | Line 94: | ||
<figure Error_distribution_example> | <figure Error_distribution_example> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 101: | Line 101: | ||
<figure Error_distribution_example2> | <figure Error_distribution_example2> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
From this discussion, a few essential points should be noted: | From this discussion, a few essential points should be noted: | ||
* Error distributions (around 0) should be treated as carefully as the models themselves; | * Error distributions (around 0) should be treated as carefully as the models themselves; | ||
- | * In most cases, error distribution is hard to notice even if the errors are illustrated; | + | * In most cases, error distribution is complex |
* It is essential to look into the distribution to ensure that there are no regularities. | * It is essential to look into the distribution to ensure that there are no regularities. | ||
If any regularities are noticed, whether a simple increase in variance or a cyclic pattern, they point to something the model does not consider. They might indicate a lack of data, i.e., other factors that influence the modelled process but are not part of the model and are therefore exposed through the error distribution. They might also indicate an oversimplified view of the problem, in which case more complex models should be considered. In either case, a deeper analysis is warranted. | If any regularities are noticed, whether a simple increase in variance or a cyclic pattern, they point to something the model does not consider. They might indicate a lack of data, i.e., other factors that influence the modelled process but are not part of the model and are therefore exposed through the error distribution. They might also indicate an oversimplified view of the problem, in which case more complex models should be considered. In either case, a deeper analysis is warranted. | ||
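The checks described above can be approximated numerically. The sketch below is only one possible approach, not the page's own method: it fits a straight line and reports three simple indicators computed from the errors. The function name `residual_checks` and the choice of indicators (mean error, correlation of absolute error with x, lag-1 autocorrelation) are assumptions made for illustration.

```python
import numpy as np

def residual_checks(x, y):
    """Fit y = beta0 + beta1 * x and summarise regularities in the errors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.polyfit returns coefficients from the highest degree down: [slope, intercept]
    beta1, beta0 = np.polyfit(x, y, deg=1)
    residuals = y - (beta0 + beta1 * x)

    # A growing or shrinking spread shows up as a correlation between |error| and x
    spread_trend = np.corrcoef(x, np.abs(residuals))[0, 1]

    # Cyclic or curved patterns show up as correlation between neighbouring errors
    ordered = residuals[np.argsort(x)]
    lag1_autocorr = np.corrcoef(ordered[:-1], ordered[1:])[0, 1]

    return {
        "mean": residuals.mean(),
        "spread_trend": spread_trend,
        "lag1_autocorr": lag1_autocorr,
    }
```

Values of `spread_trend` or `lag1_autocorr` far from 0 suggest that the errors are not purely random and that the model misses part of the structure in the data. The thresholds for "far from 0" depend on the data set and are left to the analyst.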
Line 113: | Line 113: | ||
<figure Linear model> | <figure Linear model> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
- | Here, the error is considered to be normally distributed around 0, with its standard deviation sigma and variance sigma squared. Variance provides at least a numerical insight into the error distribution; | + | Here, the error is considered to be normally distributed around 0, with its standard deviation sigma and variance sigma squared. Variance provides at least a numerical insight into the error distribution; |
<figure Sigma> | <figure Sigma> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
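A minimal sketch of how the standard deviation sigma and the variance sigma squared of the error could be estimated from the data. The formula images are not available here, so the sketch assumes the common estimator that divides the sum of squared errors by n - 2 (two fitted coefficients, β0 and β1). The function name `residual_sigma` is hypothetical.

```python
import numpy as np

def residual_sigma(x, y):
    """Return (sigma, variance) of the errors of a fitted straight line.

    Assumption: variance = SSE / (n - 2), because two coefficients
    (intercept and slope) are estimated from the same data.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if len(x) <= 2:
        raise ValueError("at least three observations are needed")
    beta1, beta0 = np.polyfit(x, y, deg=1)  # [slope, intercept]
    residuals = y - (beta0 + beta1 * x)
    variance = np.sum(residuals ** 2) / (len(x) - 2)
    return np.sqrt(variance), variance
```

A smaller sigma means that the observations lie closer to the fitted line; it does not by itself show whether the errors are free of regularities.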
Line 127: | Line 127: | ||
<figure Variance> | <figure Variance> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 134: | Line 134: | ||
===== Multiple linear regression ===== | ===== Multiple linear regression ===== | ||
- | In many practical problems, the target variable Y might depend on more than one independent variable X, for instance, wine quality, which depends on its level of serenity, amount of sugars, acidity and other factors. In the case of applying a linear regression model, it seems much complicated, but it is still a linear model of the following form: | + | In many practical problems, the target variable Y might depend on more than one independent variable X, for instance, wine quality, which depends on its level of serenity, amount of sugars, acidity and other factors. In the case of applying a linear regression model that doesn' |
<figure Multiple linear model> | <figure Multiple linear model> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 144: | Line 144: | ||
<figure Multiple linear model error estimate> | <figure Multiple linear model error estimate> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
- | Unfortunately, | + | Unfortunately, due to the number of factors (dimensions), the results of multiple linear regression cannot be visualised in the same way as those of a single-variable linear regression. Therefore, the model has to be analysed and interpreted numerically. In many situations, numerical analysis is complicated and requires a semantic interpretation of the data and the model. To support it, the relation between the dependent variable and each independent variable can be visualised, which results in multiple graphs. Otherwise, the quality of the model is hard or even impossible to assess. | ||
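The numerical side of a multiple linear regression can be sketched as follows. This is an illustrative least-squares fit with NumPy, not the page's own implementation. The function name `fit_multiple_linear` is hypothetical, and the error estimate assumes division by n - p - 1, where p is the number of independent variables.

```python
import numpy as np

def fit_multiple_linear(X, y):
    """Fit y = beta0 + beta1*x1 + ... + betap*xp by least squares.

    Returns (coeffs, residuals, sigma), where coeffs[0] is the intercept.
    Assumption: sigma is estimated as sqrt(SSE / (n - p - 1)).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    if n <= p + 1:
        raise ValueError("more observations than coefficients are needed")
    # Prepend a column of ones so that the first coefficient is the intercept
    design = np.column_stack([np.ones(n), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coeffs
    sigma = np.sqrt(np.sum(residuals ** 2) / (n - p - 1))
    return coeffs, residuals, sigma
```

The returned `residuals` can then be plotted against each column of `X` separately, which gives one graph per independent variable.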
===== Piecewise linear models ===== | ===== Piecewise linear models ===== | ||
Line 155: | Line 155: | ||
<figure Piecewise linear model> | <figure Piecewise linear model> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 163: | Line 163: | ||
<figure Complex_data_example> | <figure Complex_data_example> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 170: | Line 170: | ||
<figure Piecewise_linear_model_two> | <figure Piecewise_linear_model_two> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||
Line 177: | Line 177: | ||
<figure Piecewise_linear_model_many> | <figure Piecewise_linear_model_many> | ||
- | {{ : | + | {{ : |
- | < | + | < |
</ | </ | ||