Customer Segmentation
The problem we are solving here with clustering is customer segmentation: grouping customers based on their spending score. This helps retail store owners identify their potential customers.
First of all, let's import the libraries in the first cell and run it. The os module is imported for operating-system-related functions in case we need them, so it is always good to have it available. NumPy is a really good library for creating arrays and matrices and performing mathematical operations on them. Pandas provides DataFrames, which are similar to tables, and we can load our dataset into a DataFrame to perform any sort of transformation or manipulation. Matplotlib and Seaborn both provide functions and APIs for data visualization.
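As a rough sketch, the import cell described above could look like the following. The aliases (np, pd, plt, sns) are the common community conventions rather than something this article specifies, and the small sanity check at the end is only an illustration that the libraries loaded.

```python
import os                        # operating-system helpers (paths, environment variables)

import numpy as np               # arrays, matrices, and numerical operations
import pandas as pd              # DataFrames for tabular data
import matplotlib.pyplot as plt  # base plotting API
import seaborn as sns            # statistical plots built on top of Matplotlib

# Quick sanity check (illustrative only): build a tiny array and wrap it in a DataFrame.
arr = np.array([[1, 2], [3, 4]])
frame = pd.DataFrame(arr, columns=["a", "b"])
print(arr.mean(), frame.shape)
```

If any of these imports fails, the library is most likely not installed in the notebook environment.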
After the libraries are imported, we need to load the dataset so that we can start working with the data. Click the connections button at the top left and insert the data file into the code as a pandas DataFrame.
This will add the code below to the code cell. Run it and you will see the first 5 rows of the dataset as the output.
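The code that the notebook generates depends on your project and its data connection, so it is not reproduced here. As a minimal stand-in, the sketch below loads a few made-up rows into a DataFrame and calls head(). The rows, the exact column spellings, and the file name in the comment are all assumptions for illustration.

```python
import io

import pandas as pd

# Made-up rows for illustration only; they are NOT taken from the real dataset.
# Column names follow the article; the exact header spelling in the file is an assumption.
SAMPLE_CSV = """CustomerID,Age,Gender,Annual Income,Spending Score
1,21,Male,20,35
2,34,Female,45,80
3,52,Female,60,15
4,28,Male,75,90
5,43,Female,30,50
6,37,Male,55,60
"""

# With a real file you would pass its path instead, for example
# pd.read_csv("customers.csv") (hypothetical file name).
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# head() returns the first five rows by default.
first_rows = df.head()
print(first_rows)
```

The sample has six rows on purpose, so you can see that head() leaves the last one out.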
So that's the output of our head call, which displays the first five rows of our dataset.
As the result shows, our dataset contains CustomerID, Age, Gender, Annual Income, and a Spending Score that the mall assigns to each customer using its own criteria.
Before applying any algorithm to your data, you should run a few checks: look at the shape of your data, confirm that there are no null values, and so on. I have pasted the three functions. Run them separately in different cells to see their outputs.
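The three functions themselves are not shown in the text, so here is one plausible version of these checks. The shape and null checks come from the description above; using dtypes as the third check is my own guess. The sample rows are made up, with one Age value deliberately left blank so the null count is not all zeros.

```python
import io

import pandas as pd

# Made-up rows for illustration only; one Age value is left blank on purpose.
SAMPLE_CSV = """CustomerID,Age,Gender,Annual Income,Spending Score
1,21,Male,20,35
2,,Female,45,80
3,52,Female,60,15
4,28,Male,75,90
"""
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Check 1: number of rows and columns.
print(df.shape)

# Check 2: count of missing values in each column.
null_counts = df.isnull().sum()
print(null_counts)

# Check 3 (assumed): data type of each column.
print(df.dtypes)
```

In a notebook you can drop the print calls and put each check in its own cell, since the last expression in a cell is displayed automatically.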
Now let's move on to applying the algorithm to see the clusters in the data.