To Be Continued...
Now let's extract the two main features in our data: Annual Income and Spending Score.
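As a minimal sketch of this step, the snippet below picks the two columns out of a pandas DataFrame and turns them into a NumPy array. The tiny DataFrame and its column names (`Annual Income (k$)`, `Spending Score (1-100)`) are assumptions made for illustration; swap in your own data and column names.

```python
import pandas as pd

# Hypothetical stand-in for the customer data; the rows and the
# column names are assumptions, not the real dataset.
df = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4],
    "Age": [19, 21, 20, 23],
    "Annual Income (k$)": [15, 15, 16, 16],
    "Spending Score (1-100)": [39, 81, 6, 77],
})

# Keep only the two features we want to cluster on.
X = df[["Annual Income (k$)", "Spending Score (1-100)"]].to_numpy()

print(X.shape)  # (number of customers, 2)
```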
Now we have an array X which contains these two values. Let's import scikit-learn so that we can use KMeans to create the clusters. K-Means needs a fixed number of clusters, K, chosen up front. Instead of finding it by trial and error, we use the Elbow Method, with k-means++ to initialize the centroids, to find the optimal K. We can then pass that K to KMeans.
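Here is a rough sketch of the Elbow Method loop, assuming we try K from 1 to 10 and record the within-cluster sum of squares (WCSS), which scikit-learn exposes as `inertia_`. The synthetic `X` is only a stand-in so the snippet runs on its own, and the range of K, `n_init`, and `random_state` values are my own choices.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for X (assumption): five blobs in an
# income / spending-score style plane.
rng = np.random.default_rng(0)
centers = np.array([[25, 20], [25, 80], [55, 50], [88, 18], [88, 82]])
X = np.vstack([c + rng.normal(scale=4.0, size=(40, 2)) for c in centers])

# Fit KMeans once per candidate K and store the WCSS of each fit.
wcss = []
for k in range(1, 11):
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
    model.fit(X)
    wcss.append(model.inertia_)  # sum of squared distances to the closest centroid
```

Note that k-means++ only decides where the initial centroids are placed; it is the WCSS curve that tells us which K to pick.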
Now let's print out the statistics for each value of K.
The output shows that the optimal K is 5.
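One way to look at these statistics, sketched under the same assumptions as before (synthetic stand-in data, K from 1 to 10), is to print the WCSS for each K next to how much it dropped compared with the previous K. The "elbow" is the K after which the drop becomes small.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for X (assumption): five blobs.
rng = np.random.default_rng(0)
centers = np.array([[25, 20], [25, 80], [55, 50], [88, 18], [88, 82]])
X = np.vstack([c + rng.normal(scale=4.0, size=(40, 2)) for c in centers])

# WCSS for every candidate K, keyed by K.
wcss = {}
for k in range(1, 11):
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
    wcss[k] = model.fit(X).inertia_

# Print a small table: K, WCSS, and the drop relative to K-1.
print(f"{'K':>2}  {'WCSS':>12}  {'drop vs. K-1':>12}")
for k, value in wcss.items():
    drop = "" if k == 1 else f"{wcss[k - 1] - value:.1f}"
    print(f"{k:>2}  {value:>12.1f}  {drop:>12}")
```

The same values can also be drawn as a line plot of WCSS against K if you prefer to spot the elbow visually.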
Now let's apply KMeans to our data with K=5 and plot the clusters along with their centroids. Don't worry if the plotting code looks like a lot: most of its lines are only there to add detail to the plot :)
This gives us a graph of the 5 clusters, plotted by Annual Income and Spending Score.
The output shows that we have 5 different types of customers, each with a different combination of annual income and spending score. This tells us which groups to target and what to offer them, so that we can attract more customers and keep our current customers happy.
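A minimal sketch of this step could look like the following. The data is again a synthetic stand-in, and the marker sizes, title, and axis labels are my own choices rather than anything fixed by the dataset.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Synthetic stand-in for X (assumption): five blobs.
rng = np.random.default_rng(0)
centers = np.array([[25, 20], [25, 80], [55, 50], [88, 18], [88, 82]])
X = np.vstack([c + rng.normal(scale=4.0, size=(40, 2)) for c in centers])

# Fit KMeans with K=5 and get a cluster label for every point.
kmeans = KMeans(n_clusters=5, init="k-means++", n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

# One scatter call per cluster so each gets its own color and legend entry.
fig, ax = plt.subplots(figsize=(8, 6))
for cluster_id in range(5):
    members = X[labels == cluster_id]
    ax.scatter(members[:, 0], members[:, 1], s=40, label=f"Cluster {cluster_id + 1}")

# Draw the centroids on top of the clusters.
ax.scatter(
    kmeans.cluster_centers_[:, 0],
    kmeans.cluster_centers_[:, 1],
    s=200, c="black", marker="X", label="Centroids",
)

# Everything below only adds detail to the plot.
ax.set_title("Clusters of customers")
ax.set_xlabel("Annual Income (k$)")
ax.set_ylabel("Spending Score (1-100)")
ax.legend()
```

Call `plt.show()` afterwards to display the figure, or `fig.savefig(...)` to write it to a file.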