Solution: Unsupervised Learning
Follow the instructions for using unsupervised learning algorithms on preprocessed data.
Here’s how we can cluster our data using DBSCAN:
main.py
from collections import Counter

import pandas as pd

# Load the preprocessed dataset
preprocessed = pd.read_csv("preprocessed.csv")

# Feature columns used for clustering
X_var = [
    'SeniorCitizen', 'tenure',
    'MonthlyCharges', 'TotalCharges', 'gender_Male',
    'Partner_Yes', 'Dependents_Yes', 'PhoneService_Yes',
    'MultipleLines_No phone service', 'MultipleLines_Yes',
    'InternetService_Fiber optic', 'InternetService_No',
    'OnlineSecurity_No internet service', 'OnlineSecurity_Yes',
    'OnlineBackup_No internet service', 'OnlineBackup_Yes',
    'DeviceProtection_No internet service', 'DeviceProtection_Yes',
    'TechSupport_No internet service', 'TechSupport_Yes',
    'StreamingTV_No internet service', 'StreamingTV_Yes',
    'StreamingMovies_No internet service', 'StreamingMovies_Yes',
    'Contract_One year', 'Contract_Two year', 'PaperlessBilling_Yes',
    'PaymentMethod_Credit card (automatic)', 'PaymentMethod_Electronic check',
    'PaymentMethod_Mailed check',
]
X = preprocessed[X_var]

# DBSCAN
from sklearn.cluster import DBSCAN

dbscan = DBSCAN(eps=1.8, min_samples=50)
dbscan.fit(X)
dbscan_labels = dbscan.labels_

# Count how many samples received each cluster label
dbscan_cluster_count = Counter(dbscan_labels)
print(dbscan_cluster_count)
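As a rough sketch of how the printed Counter might be read: scikit-learn's DBSCAN assigns the label -1 to noise points, and every other label is a cluster ID. The helper name summarize_labels and the example labels below are hypothetical and not part of the lesson's code.

```python
from collections import Counter


def summarize_labels(labels):
    """Summarize DBSCAN-style labels, where -1 marks noise points."""
    counts = Counter(labels)
    total = len(labels)
    n_noise = counts.get(-1, 0)
    # Every label other than -1 identifies a cluster.
    cluster_sizes = {label: size for label, size in counts.items() if label != -1}
    return {
        "n_clusters": len(cluster_sizes),
        "n_noise": n_noise,
        "noise_share": n_noise / total if total else 0.0,
        "cluster_sizes": cluster_sizes,
    }


# Hypothetical labels, standing in for dbscan.labels_
example_labels = [0, 0, 1, -1, 0, 1, -1, 2]
summary = summarize_labels(example_labels)
print(summary)
```

With the lesson's script, passing dbscan_labels.tolist() to this helper should give the same kind of summary for the real data.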
First, we import the DBSCAN class.
Then, we implement the algorithm. A DBSCAN object ...
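To make the two arguments passed to DBSCAN concrete, here is a minimal pure-Python sketch of the density-based idea. A point with at least min_samples neighbors (itself included) within distance eps is a core point, clusters grow outward from core points, and anything unreachable is labeled -1. This is an illustration under those assumptions, not scikit-learn's implementation. The names dbscan_sketch and region_query and the toy points are hypothetical.

```python
from collections import Counter
from math import dist

NOISE = -1


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan_sketch(points, eps, min_samples):
    """Assign a cluster label to each point; NOISE (-1) marks outliers."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            # Not a core point; may later be claimed as a border point.
            labels[i] = NOISE
            continue
        # Start a new cluster from this core point and expand it.
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is a core point, keep growing
    return labels


# Toy data: two tight groups of four points and one isolated point
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan_sketch(points, eps=1.5, min_samples=3)
print(Counter(labels))
```

In this sketch, a larger eps or a smaller min_samples makes it easier for points to qualify as core points, which tends to merge clusters and reduce the number of points labeled as noise.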