The objectives of this section are:
define density-based clustering
explain the major parts
introduce the DBSCAN algorithm
list the limitations and advantages of this method
By the time you have completed this section you will be able to:
explain the basic DBSCAN algorithm
label points into the appropriate group type
determine which scenarios this algorithm would yield good results.
Density-based clustering algorithms are algorithms that aim to discover areas of high density that are separated from each other by areas of low-density. DBSCAN is a popular density-based clustering algorithm but before we launch into a description one needs to understand some fundamental parts of a center-based approach.
In this approach density is estimated for a particular point in a data set by counting the number of points within a certain radius. The size of the radius is crucial because if the radius is too large then all points in the data set will have the same density and if the radius is too small then the density of each data point will be 1.
Data points can be classified into three categories
Figure 1 is a labeled diagram which shows the classification of certain points.