Classification using SVM, KNN, RandomForestClassifier, and PCA
Learn how to classify multiple datasets using scikit-learn classification models.
Helper functions
Let’s create some helper functions to load the datasets and models.
Function to get the dataset
Let’s create a function named return_data() that helps us load the datasets.
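import pandas as pd
from sklearn.datasets import load_breast_cancer, load_iris, load_wine
from sklearn.model_selection import train_test_split

def return_data(dataset):
    # Load the dataset that matches the selected name
    if dataset == 'Wine':
        data = load_wine()
    elif dataset == 'Iris':
        data = load_iris()
    else:
        data = load_breast_cancer()
    # Build a DataFrame of the features plus the target column for display
    df = pd.DataFrame(data.data, columns=data.feature_names)
    df['Type'] = data.target
    # Hold out 20% of the samples as the test set
    X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, random_state=1, test_size=0.2)
    return X_train, X_test, y_train, y_test, df, data.target_names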
- The function return_data(dataset) takes a string that contains the name of the dataset the user selects.
- It loads the relevant dataset: Wine, Iris, or, for any other value, Breast Cancer.
- We create a DataFrame df that we can show in our UI.
- We use sklearn’s train_test_split() function to create the training sets (X_train, y_train) and testing sets (X_test, y_test).
- The function returns the training set, testing set, the DataFrame, and the target classes (X_train, X_test, y_train, y_test, df, data.target_names).
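To make the split concrete, here is a small sketch of what train_test_split() does with test_size=0.2 and random_state=1 on the Iris dataset (150 samples). It only illustrates the call used in return_data(); the variable names X_train2 and X_test2 are introduced here for the comparison and are not part of the lesson's code.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

data = load_iris()

# Hold out 20% of the samples for testing; random_state fixes the shuffle
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=1, test_size=0.2
)

print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)

# Repeating the call with the same random_state reproduces the same split
X_train2, X_test2, _, _ = train_test_split(
    data.data, data.target, random_state=1, test_size=0.2
)
print((X_test == X_test2).all())  # True
```

Fixing random_state means every run of the app evaluates the models on the same held-out samples, which keeps results comparable across runs.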
Let’s run the following code to load the datasets and display them on the console.
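A minimal sketch of such a run is shown below. It repeats the return_data() definition so that it runs on its own. The loop over dataset names, the 'Breast Cancer' label, and the print layout are assumptions made for illustration, not the lesson's original code.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer, load_iris, load_wine
from sklearn.model_selection import train_test_split


def return_data(dataset):
    # Same helper as above, repeated so this example is self-contained
    if dataset == 'Wine':
        data = load_wine()
    elif dataset == 'Iris':
        data = load_iris()
    else:
        data = load_breast_cancer()
    df = pd.DataFrame(data.data, columns=data.feature_names)
    df['Type'] = data.target
    X_train, X_test, y_train, y_test = train_test_split(
        data.data, data.target, random_state=1, test_size=0.2
    )
    return X_train, X_test, y_train, y_test, df, data.target_names


# 'Breast Cancer' is an illustrative label: any name other than
# 'Wine' or 'Iris' falls through to the else branch
for name in ['Wine', 'Iris', 'Breast Cancer']:
    X_train, X_test, y_train, y_test, df, classes = return_data(name)
    print(name)
    print('  training samples:', X_train.shape[0])
    print('  testing samples: ', X_test.shape[0])
    print('  classes:', list(classes))
    print(df.head())
```

Each iteration prints the split sizes, the class names, and the first five rows of the DataFrame, including the added Type column.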