# To Be Continued...

Now let's extract the two features we care about from our data: **Annual Income** and **Spending Score**.

```
X = df.iloc[:, [3,4]].values
```
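Selecting columns by position works, but it breaks silently if the column order ever changes. As a hedged alternative, the sketch below selects the same two columns by name. The column names are assumed from the axis labels used later on this page, and the tiny DataFrame (its layout and values) is made up for illustration; it is not the real dataset.

```python
import pandas as pd

# Made-up stand-in for the real data; column order and names are assumptions
df_demo = pd.DataFrame({
    "CustomerID": [1, 2, 3],
    "Gender": ["Male", "Female", "Female"],
    "Age": [30, 40, 50],
    "Annual Income (k$)": [20, 55, 90],
    "Spending Score (1-100)": [30, 50, 80],
})

# Positional selection, as in the snippet above
X_by_position = df_demo.iloc[:, [3, 4]].values

# Name-based selection of the same two columns
X_by_name = df_demo[["Annual Income (k$)", "Spending Score (1-100)"]].to_numpy()

print(X_by_name.shape)
```

If the names match your DataFrame, both approaches should give the same array.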

Now we have an array **X** that contains these two values. Let's import **scikit-learn** so that we can use **KMeans** to create the clusters. **K-Means** needs a fixed number of clusters, **K**, chosen in advance. Instead of finding it by trial and error, we use **K-Means++** initialization together with the **Elbow Method** to find the optimal **K**, and then pass that **K** to **KMeans**.

```
# Building the model
# KMeans needs the number of clusters K up front. To choose it, we fit KMeans
# (with k-means++ initialization) for a range of K values and apply the Elbow Method.
from sklearn.cluster import KMeans

wcss = []

# The upper limit is hard-coded: 10 is used as the assumed maximum number of clusters
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=0)
    kmeans.fit(X)
    # inertia_ is the within-cluster sum of squares (WCSS): the sum of squared
    # distances from each point to its closest cluster centre
    wcss.append(kmeans.inertia_)
```
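To make the numbers collected in `wcss` less of a black box, here is a minimal sketch of the quantity being measured: the sum of squared distances from each point to the centroid of its cluster. The points, labels and the helper name `wcss_by_hand` are made up for illustration, and only NumPy is assumed.

```python
import numpy as np

def wcss_by_hand(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    diffs = points - centroids[labels]  # one row of differences per point
    return float(np.sum(diffs ** 2))

# Four made-up points forming two obvious groups
points = np.array([[1.0, 1.0], [1.0, 3.0], [9.0, 9.0], [11.0, 9.0]])
labels = np.array([0, 0, 1, 1])

# Each centroid is the mean of the points assigned to that cluster
centroids = np.array([points[labels == k].mean(axis=0) for k in range(2)])

total = wcss_by_hand(points, labels, centroids)
print(total)
```

A lower value means the points sit closer to their centroids, which is why the curve keeps falling as more clusters are added.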

Now let's plot the WCSS against the number of clusters.

```
import matplotlib.pyplot as plt

plt.plot(range(1, 11), wcss)
plt.title('The Elbow Method')
plt.xlabel('no of clusters')
plt.ylabel('wcss')
plt.show()
```

This produces the plot below, where the elbow of the curve suggests that the optimal **K** is 5.

![](https://1942897066-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M3otA1ynNF760FHDv2E%2F-M3uI62a7Ps5kxsxdf-R%2F-M3uuRbf-6AemLThrJeD%2FScreenshot%202020-04-02%20at%205.42.03%20PM.png?alt=media\&token=ccb97374-9d8f-49d1-9143-8316969de0f2)
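Reading the elbow off a plot is a judgment call. If you want a rough programmatic check, one possible heuristic (an assumption here, not part of the original tutorial) is to pick the point that lies farthest below the straight line joining the first and last points of the curve. The WCSS values below are invented for illustration and are not the output of the code above; the function name is also hypothetical.

```python
def elbow_by_chord_distance(ks, wcss_values):
    """Return the k whose (k, WCSS) point lies farthest below the line joining
    the first and last points, after scaling both axes to [0, 1].
    Assumes wcss_values decreases from its first entry to its last."""
    k_min, k_max = ks[0], ks[-1]
    w_max, w_min = wcss_values[0], wcss_values[-1]
    best_k, best_gap = ks[0], 0.0
    for k, w in zip(ks, wcss_values):
        x = (k - k_min) / (k_max - k_min)
        y = (w - w_min) / (w_max - w_min)
        gap = 1.0 - x - y  # proportional to the distance below the line x + y = 1
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k

ks = list(range(1, 11))
illustrative_wcss = [1000, 700, 450, 250, 100, 85, 70, 60, 50, 40]  # invented values
elbow = elbow_by_chord_distance(ks, illustrative_wcss)
print(elbow)
```

Treat the result as a second opinion on the plot rather than a replacement for looking at it.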

Now let's apply **KMeans** to our data with **K=5** and plot the clusters along with their centroids. Don't worry if the code looks long: most of the lines below only add detail to the plot :)

```
kmeansmodel = KMeans(n_clusters= 5, init='k-means++', random_state=0)
y_kmeans= kmeansmodel.fit_predict(X)
plt.scatter(X[y_kmeans == 0, 0], X[y_kmeans == 0, 1], s = 100, c = 'red', label = 'Cluster 1')
plt.scatter(X[y_kmeans == 1, 0], X[y_kmeans == 1, 1], s = 100, c = 'blue', label = 'Cluster 2')
plt.scatter(X[y_kmeans == 2, 0], X[y_kmeans == 2, 1], s = 100, c = 'green', label = 'Cluster 3')
plt.scatter(X[y_kmeans == 3, 0], X[y_kmeans == 3, 1], s = 100, c = 'cyan', label = 'Cluster 4')
plt.scatter(X[y_kmeans == 4, 0], X[y_kmeans == 4, 1], s = 100, c = 'magenta', label = 'Cluster 5')
plt.scatter(kmeansmodel.cluster_centers_[:, 0], kmeansmodel.cluster_centers_[:, 1], s = 300, c = 'yellow', label = 'Centroids')
plt.title('Clusters of customers')
plt.xlabel('Annual Income (k$)')
plt.ylabel('Spending Score (1-100)')
plt.legend()
plt.show()
```
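Once the centroids are known, K-Means labels a point by finding the centroid closest to it. The sketch below shows that nearest-centroid rule with plain NumPy. The centroids and the new customers are hypothetical values chosen for illustration, not results from the model above, and `nearest_centroid` is a made-up helper name.

```python
import numpy as np

def nearest_centroid(points, centroids):
    """Return, for each point, the index of the closest centroid."""
    # Squared distances between every point and every centroid,
    # giving an array of shape (n_points, n_centroids)
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Hypothetical centroids: [annual income (k$), spending score]
centroids_demo = np.array([
    [25.0, 20.0],
    [25.0, 80.0],
    [55.0, 50.0],
    [88.0, 17.0],
    [87.0, 82.0],
])

# Hypothetical new customers to place into a segment
new_customers = np.array([[30.0, 75.0], [90.0, 90.0], [60.0, 45.0]])

assigned = nearest_centroid(new_customers, centroids_demo)
print(assigned)
```

With a fitted model, `kmeansmodel.predict(new_points)` should give the same kind of index for each new point.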

This gives us the output, plotting **5 clusters** on the graph against **Annual Income** and **Spending Score**.

![](https://1942897066-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M3otA1ynNF760FHDv2E%2F-M3uI62a7Ps5kxsxdf-R%2F-M3uvHt36Q3mSp7NivcS%2FScreenshot%202020-04-02%20at%205.45.43%20PM.png?alt=media\&token=368b0e3d-08e2-4195-a073-4968df2cbc85)

This output shows five different types of customers, each with a distinct combination of annual income and spending score. It tells us which groups to target and what to offer them, so that we can attract new customers and keep our current customers happy.
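To turn the clusters into something you can act on, it helps to summarise each one. The following is a minimal sketch that computes the size, mean income and mean spending score per cluster. The data and labels are made up for illustration; with the real data, `X` and `y_kmeans` from the code above would take the place of `X_demo` and `labels_demo`.

```python
import numpy as np

# Made-up customers: columns are [annual income (k$), spending score]
X_demo = np.array([[20, 80], [25, 75], [85, 15], [90, 20], [55, 50]], dtype=float)
labels_demo = np.array([0, 0, 1, 1, 2])

summary = {}
for cluster in np.unique(labels_demo):
    members = X_demo[labels_demo == cluster]  # rows belonging to this cluster
    summary[int(cluster)] = {
        "size": len(members),
        "mean_income": float(members[:, 0].mean()),
        "mean_score": float(members[:, 1].mean()),
    }

for cluster, stats in summary.items():
    print(cluster, stats)
```

A cluster with high mean income but a low mean score, for example, might be a group worth targeting with offers.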


