Solution: Unsupervised Learning

Follow the instructions for using unsupervised learning algorithms on preprocessed data.

Here’s how we can cluster our data using DBSCAN:

import pandas as pd
from collections import Counter
preprocessed = pd.read_csv("preprocessed.csv")
X_var = [
    'SeniorCitizen', 'tenure',
    'MonthlyCharges', 'TotalCharges', 'gender_Male',
    'Partner_Yes', 'Dependents_Yes', 'PhoneService_Yes',
    'MultipleLines_No phone service', 'MultipleLines_Yes',
    'InternetService_Fiber optic', 'InternetService_No',
    'OnlineSecurity_No internet service', 'OnlineSecurity_Yes',
    'OnlineBackup_No internet service', 'OnlineBackup_Yes',
    'DeviceProtection_No internet service', 'DeviceProtection_Yes',
    'TechSupport_No internet service', 'TechSupport_Yes',
    'StreamingTV_No internet service', 'StreamingTV_Yes',
    'StreamingMovies_No internet service', 'StreamingMovies_Yes',
    'Contract_One year', 'Contract_Two year', 'PaperlessBilling_Yes',
    'PaymentMethod_Credit card (automatic)', 'PaymentMethod_Electronic check',
    'PaymentMethod_Mailed check',
]
X = preprocessed[X_var]
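# DBSCAN
from sklearn.cluster import DBSCAN
dbscan = DBSCAN(eps=1.8, min_samples=50)  # eps: neighborhood radius; min_samples: points needed within it for a core point
dbscan.fit(X)
dbscan_labels = dbscan.labels_  # one cluster label per row; -1 marks noise
dbscan_cluster_count = Counter(dbscan_labels)  # number of rows assigned to each label
print(dbscan_cluster_count)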
  • The sklearn.cluster import statement: We import the DBSCAN class.

  • The three lines after the import: We implement the algorithm. A DBSCAN object ...
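The printed Counter maps each label to the number of rows that received it. In scikit-learn's DBSCAN, the label -1 is reserved for noise points, so it should not be counted as a cluster. The sketch below is one possible way to turn those counts into a short summary. The helper name summarize_dbscan_labels and the example labels are hypothetical and used for illustration only; in the lesson, dbscan_labels would be passed in instead.

```python
from collections import Counter


def summarize_dbscan_labels(labels):
    """Summarize DBSCAN-style labels, where -1 marks noise points."""
    counts = Counter(int(label) for label in labels)
    n_noise = counts.get(-1, 0)
    # Every label other than -1 is treated as a real cluster.
    cluster_sizes = {label: size for label, size in counts.items() if label != -1}
    return {
        "n_clusters": len(cluster_sizes),
        "n_noise": n_noise,
        "noise_share": n_noise / len(labels) if len(labels) else 0.0,
        "cluster_sizes": cluster_sizes,
    }


# Hypothetical labels for illustration only (not results from the lesson's data).
example_labels = [0, 0, 1, -1, 0, 1, -1, 2]
summary = summarize_dbscan_labels(example_labels)
print(summary)
```

A large noise share or a single dominant cluster may suggest that eps or min_samples is worth revisiting, although the right values depend on how the features were scaled during preprocessing.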