This shows you the differences between two versions of the page.
| en:iot-reloaded:dbscan [2024/12/02 21:25] – [Selecting eps and MinPts values] ktokarz | en:iot-reloaded:dbscan [2024/12/10 20:44] (current) – pczekalski | ||
|---|---|---|---|
| Line 6: | Line 6: | ||
| One of the essential concepts is the point' | One of the essential concepts is the point' | ||
| - | <figure Point' | + | {{ : |
| - | {{ : | + | |
| - | < | + | |
| - | </ | + | |
| where: | where: | ||
| Line 30: | Line 27: | ||
| * Points that are not core and are not reachable from any core point are considered noise or outliers. | * Points that are not core and are not reachable from any core point are considered noise or outliers. | ||
| - | < | + | < |
| - | {{ : | + | {{ : |
| - | < | + | < |
| </ | </ | ||
| Line 39: | Line 36: | ||
| * If it is a core point, form a cluster by grouping it with all directly density-reachable points. | * If it is a core point, form a cluster by grouping it with all directly density-reachable points. | ||
| * Move to the next unvisited point and return to step 1. | * Move to the next unvisited point and return to step 1. | ||
| - | * Border points are added to the nearest cluster, and points | + | * Border points are added to the nearest cluster, and points not reachable from any core point are marked as noise. |
| Line 51: | Line 48: | ||
| * It struggles with clusters of varying densities since eps is fixed. | * It struggles with clusters of varying densities since eps is fixed. | ||
| - | DBSCAN is great for discovering clusters in data with noise, especially when clusters are not circular or spherical. | + | DBSCAN is excellent |
| - | Some application examples: | + | Some application examples |
| - | < | + | < |
| - | {{ : | + | {{ : |
| - | < | + | < |
| </ | </ | ||
| - | < | + | < |
| - | {{ : | + | {{ : |
| - | < | + | < |
| </ | </ | ||
| - | A typical application in signal processing: | + | A typical application in signal processing |
| - | < | + | < |
| - | {{ : | + | {{ : |
| - | < | + | < |
| </ | </ | ||
| Line 75: | Line 72: | ||
| Usually, MinPts is selected using some prior knowledge of the data and its internal structure. Once MinPts is chosen, the following steps can be applied to select eps: | Usually, MinPts is selected using some prior knowledge of the data and its internal structure. Once MinPts is chosen, the following steps can be applied to select eps: | ||
| - | * Calculate the average distance between every point and its k-nearest neighbours, where k = MinPts; | + | * Calculate the average distance between every point and its k-nearest neighbours, where k = MinPts. |
| * The average distances are sorted and plotted on a chart, where x is the index of the sorted average distance and y is the distance value. | * The average distances are sorted and plotted on a chart, where x is the index of the sorted average distance and y is the distance value. | ||
| * The optimal eps value is when y increases rapidly, as shown in the following picture (figure {{ref> | * The optimal eps value is when y increases rapidly, as shown in the following picture (figure {{ref> | ||
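
The eps selection steps above can be sketched in a few lines of Python. This is a minimal illustration, not the page authors' implementation: the function names `k_distance_curve` and `suggest_eps`, the sample points, and the "largest jump" rule used to locate the rapid increase are all assumptions made for the example. In practice the sorted curve is usually inspected visually on a chart.

```python
import math


def k_distance_curve(points, k):
    """Return the sorted average distance from each point to its k nearest neighbours."""
    averages = []
    for i, p in enumerate(points):
        # Distances from p to every other point (the point itself is excluded).
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        averages.append(sum(dists[:k]) / k)
    return sorted(averages)


def suggest_eps(curve):
    """Hypothetical knee heuristic: return the value just before the largest jump."""
    jumps = [curve[i + 1] - curve[i] for i in range(len(curve) - 1)]
    knee = max(range(len(jumps)), key=jumps.__getitem__)
    return curve[knee]


# Assumed sample data: two tight groups plus one far-away outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]

curve = k_distance_curve(points, k=2)  # k plays the role of MinPts
eps = suggest_eps(curve)
print(curve)
print("suggested eps:", eps)
```

The brute-force neighbour search is quadratic in the number of points, which is acceptable for a small illustration; for larger datasets a spatial index would be the more usual choice.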