====== Random Forests ======
Random forests ((https://)) are ensembles of decision trees: many trees are trained on different bootstrap samples of the data, and their individual predictions are aggregated (for classification, typically by majority vote) into a single, more robust prediction.
Some advantages:
  * RF uses more knowledge than a single decision tree (a comparison is sketched after this list).
  * Furthermore, because each tree is trained on a different bootstrap sample, the ensemble is less sensitive to the peculiarities of any single data sample.
  * This is true because a single data source might suffer from data anomalies reflected in model anomalies.
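
The sketch below illustrates the first advantage on a synthetic dataset. It uses scikit-learn; the dataset, model parameters, and random seeds are illustrative assumptions rather than part of the original material, so the exact accuracies will vary.

<code python>
# Compare one decision tree against a random forest on synthetic data.
# All data and parameter choices here are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A single tree, trained on one view of the data.
tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# A forest of 200 trees, each trained on a different bootstrap sample.
forest = RandomForestClassifier(n_estimators=200,
                                random_state=42).fit(X_train, y_train)

print("Single tree accuracy:", tree.score(X_test, y_test))
print("Random forest accuracy:", forest.score(X_test, y_test))
</code>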
Each tree's strength depends on various factors, including its depth and the features it uses for splitting. However, there is a trade-off between correlation and strength. For example, reducing m (the number of features considered at each split) increases the diversity among the trees, lowering correlation. Still, it may also reduce the strength of each tree, as it may limit its access to highly predictive features.
Despite this trade-off, Random Forests balance these dynamics by optimising m to minimise the ensemble error. Generally, a moderate reduction in m lowers correlation without significantly compromising the strength of each tree, thus leading to an overall decrease in the forest's error rate.
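
In practice, this optimisation of m can be approached empirically. In scikit-learn, for example, m corresponds to the max_features parameter, and the out-of-bag (OOB) error provides a built-in estimate of the generalisation error. The following is a minimal sketch under an assumed synthetic dataset and an illustrative grid of m values.

<code python>
# Sweep m (max_features) and report the out-of-bag error for each value.
# The dataset and the grid of m values are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           random_state=0)

for m in (2, 4, 8, 16, 20):
    forest = RandomForestClassifier(
        n_estimators=300,
        max_features=m,   # m: number of features considered at each split
        oob_score=True,   # estimate generalisation error on out-of-bag samples
        random_state=0,
    ).fit(X, y)
    print(f"m = {m:2d}  OOB error = {1.0 - forest.oob_score_:.3f}")
</code>

Typically, such a sweep shows the error falling as m grows from very small values (stronger trees) and flattening or rising again as m approaches the total number of features (more correlated trees).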
**Implications for the Forest Error Rate:** The forest error rate in a Random Forest model is influenced by the correlation among the trees and the strength of each tree (both quantities are estimated empirically in the sketch after this list). Specifically:
  * Increasing correlation among trees typically increases the error rate, as it reduces the ensemble's ability to correct individual trees' errors.
  * Increasing the strength of each tree (i.e., reducing its error rate) generally decreases the forest error rate, as each tree becomes a more reliable classifier.
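
Both quantities can be estimated empirically. The sketch below, again under assumed synthetic data, uses the mean held-out accuracy of the individual trees as a proxy for strength, and the mean pairwise correlation of their predictions as a measure of tree correlation.

<code python>
# Estimate per-tree strength and pairwise correlation among trees.
# Data, seeds, and forest size are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

forest = RandomForestClassifier(n_estimators=50,
                                random_state=1).fit(X_train, y_train)

# Per-tree predictions on the held-out set (rows: trees, columns: samples).
preds = np.array([t.predict(X_test) for t in forest.estimators_])

# Strength: mean accuracy of the individual trees.
strength = (preds == y_test).mean()

# Correlation: mean pairwise correlation of the trees' predictions.
corr = np.corrcoef(preds)
mean_corr = corr[np.triu_indices_from(corr, k=1)].mean()

print(f"Mean tree strength (accuracy): {strength:.3f}")
print(f"Mean pairwise correlation:     {mean_corr:.3f}")
</code>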
Consequently, tuning a Random Forest comes down to balancing these two effects: keeping enough randomness that the trees remain weakly correlated, while giving each split enough informative features that the individual trees stay strong.