Use PyCaret to Build a Classification Machine Learning Model

My first machine learning model in Python for a hackathon was quite a cumbersome block of code. I still remember the many lines of code it took to build an ensemble model – it would have taken a wizard to untangle that mess!

When it comes to building interpretable machine learning models, especially in industry (or when we want to explain our hackathon results to a client), writing concise, efficient code is key to success. That’s why I strongly recommend using the PyCaret library.

I wish PyCaret had been around during my rookie machine learning days! It is a flexible and useful library that I’ve leaned on quite a bit in recent months. I firmly believe anyone who aspires to succeed as a data science or analytics professional will benefit a lot from using PyCaret.

We’ll see what exactly PyCaret is and how to install it on your machine, and then we’ll dive into using PyCaret for building interpretable machine learning models, including ensemble models. There is a lot to learn, so let’s dig in.

Table of Contents

What is PyCaret and Why Should you Use it?
Installing PyCaret on your Machine
Let’s Get Familiar with PyCaret
Training our Machine Learning Model using PyCaret
Building Ensemble Models using PyCaret
Let’s Analyze our Model!
Time to Make Predictions
Save and Load the Model

What is PyCaret and Why Should you Use it?

PyCaret is an open-source machine learning library in Python that supports you from data preparation through to model deployment. It is easy to use, and most of the common tasks in a data science project take just one line of code each.

I’ve found PyCaret extremely handy. Here are two primary reasons why:

PyCaret, being a low-code library, makes you more productive. You spend less time on coding and can run more experiments.
It is an easy-to-use machine learning library that helps you perform end-to-end machine learning experiments, whether that’s imputing missing values, encoding categorical data, feature engineering, hyperparameter tuning, or building ensemble models.

Installing PyCaret on your Machine

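This is as straightforward as it gets. PyCaret installs directly with pip. At the time of writing, the first stable version was v1.0.0; pip fetches the latest release by default, so pin the version (pycaret==1.0.0) if you want to match the code in this article exactly. Run the command below in your Jupyter Notebook to get started: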

!pip3 install pycaret
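
Once the install finishes, it can be worth confirming which version actually landed in your environment, since pip installs the latest release unless you pin one. Here is a minimal sketch using only the standard library; the helper name installed_version is my own and is not part of PyCaret:

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if PyCaret is installed, or a hint if it is missing
version = installed_version("pycaret")
print(version if version is not None else "pycaret is not installed")
```

If the printed version differs from the one an example was written for, some function signatures may differ as well.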

Let’s Get Familiar with PyCaret

Problem Statement and Dataset

In this article, we are going to solve a classification problem. We have a bank dataset with features like customer age, experience, income, education, and whether they have a credit card or not. The bank wants to build a machine learning model that will help it identify the potential customers who have a higher probability of purchasing a personal loan.

The dataset has 5000 rows and we have kept 4000 for training our model and the remaining 1000 for testing the model. You can find the complete code and dataset used in this article here.
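
The files used in this article are already split, so you don’t need to do this yourself. If you start from a single file, though, an 80/20 split like the one described above can be sketched as follows. This is an illustration with placeholder records, not the exact procedure used to create the article’s files; the helper train_test_split_rows is my own:

```python
import random


def train_test_split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of `rows` and split it into (train, test) lists."""
    shuffled = list(rows)
    # A seeded generator makes the split reproducible between runs
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for the 5000 customer records (placeholder IDs, not the real data)
rows = [{"customer_id": i} for i in range(5000)]
train_rows, test_rows = train_test_split_rows(rows)
print(len(train_rows), len(test_rows))  # 4000 1000
```

Fixing the seed matters here: without it, every run would put different customers in the test set and the results would not be comparable.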

Let’s start by reading the dataset using the Pandas library:

# importing pandas to read the CSV file
import pandas as pd
# read the data
data_classification = pd.read_csv('datasets/loan_train_data.csv')
# view the top rows of the data
data_classification.head()

	# import the classification module
	from pycaret import classification
	# setup the environment; 'Personal Loan' is the target column we want to predict
	classification_setup = classification.setup(data=data_classification, target='Personal Loan')

Training our Machine Learning Model using PyCaret

	# build the decision tree model
	classification_dt = classification.create_model('dt')
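
create_model reports its metrics per cross-validation fold (10 folds by default in v1.0.0, as far as I know). To make that idea concrete, here is a rough sketch of how rows can be partitioned into folds. The helper k_fold_indices is my own, the row count of 1000 is arbitrary, and unlike the splitter PyCaret relies on, this simple version does not stratify by the target:

```python
def k_fold_indices(n_samples, n_folds=10):
    """Yield (train_indices, validation_indices) pairs for contiguous k-fold splits."""
    # Spread any remainder over the first folds so sizes differ by at most one
    fold_sizes = [
        n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        for i in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


folds = list(k_fold_indices(n_samples=1000, n_folds=10))
train_idx, val_idx = folds[0]
print(len(folds), len(train_idx), len(val_idx))  # 10 900 100
```

Each row lands in the validation set exactly once, so the averaged score uses every row for evaluation without ever scoring a model on rows it was trained on.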

	# build the xgboost model
	classification_xgb = classification.create_model('xgboost')

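	# build and tune the catboost model
	# (PyCaret 1.0.0 accepts the model ID string here; later releases expect a trained model object instead)
	tune_catboost = classification.tune_model('catboost')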

Building Ensemble Models using PyCaret

	# ensemble boosting
	boosting = classification.ensemble_model(classification_dt, method='Boosting')

	# Ensemble: blending
	blender = classification.blend_models(estimator_list=[classification_dt, classification_xgb])
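
Blending combines the predictions of several models by voting. To my knowledge, blend_models defaults to hard voting, where each model casts one vote per row and the majority label wins. Here is a small sketch of that idea in plain Python; the three prediction lists are made up for illustration and the helper hard_vote is my own, not PyCaret’s implementation:

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine per-model label lists into one list by majority vote per sample."""
    blended = []
    for sample_votes in zip(*predictions_per_model):
        # most_common(1) returns [(label, count)] for the most frequent label
        blended.append(Counter(sample_votes).most_common(1)[0][0])
    return blended


# Made-up predictions from three hypothetical models on five customers
model_a = [1, 0, 0, 1, 0]
model_b = [1, 1, 0, 0, 0]
model_c = [0, 1, 0, 1, 0]
print(hard_vote([model_a, model_b, model_c]))  # [1, 1, 0, 1, 0]
```

With an even number of models, such as the two blended above, ties are possible, so the tie-breaking rule starts to matter.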

	# compare performance of different classification models
	classification.compare_models()

Let’s Analyze our Model!

	# AUC-ROC plot
	classification.plot_model(classification_dt, plot = 'auc')
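
The AUC summarises how well the model’s scores rank likely buyers above non-buyers. One way to read the number: it is the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. A small sketch with made-up labels and scores (not taken from the loan dataset); the helper roc_auc is my own:

```python
def roc_auc(labels, scores):
    """Compute ROC AUC as the share of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    # A correctly ordered pair scores 1, a tie scores 0.5, a wrong order scores 0
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

A value of 0.5 means the scores rank no better than chance, and 1.0 means every buyer is scored above every non-buyer.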

	# Decision Boundary
	classification.plot_model(classification_dt, plot = 'boundary')

	# Precision Recall Curve
	classification.plot_model(classification_dt, plot = 'pr')

	# Validation Curve
	classification.plot_model(classification_dt, plot = 'vc')

	# evaluate model
	classification.evaluate_model(classification_dt)

	# interpret_model: SHAP
	classification.interpret_model(classification_xgb)

	# interpret model : Correlation
	classification.interpret_model(classification_xgb, plot='correlation')

Time to Make Predictions

	# read the test data
	test_data_classification = pd.read_csv('datasets/loan_test_data.csv')
	# make predictions
	predictions = classification.predict_model(classification_dt, data=test_data_classification)
	# view the predictions
	predictions

Save and Load the Model

	# save the model
	classification.save_model(classification_dt, 'decision_tree_1')

	# load model
	dt_model = classification.load_model(model_name='decision_tree_1')
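
Conceptually, save_model writes the trained object to a file on disk and load_model reads it back in a later session. The round trip can be sketched with the standard library’s pickle module. This only illustrates the idea with a stand-in dictionary; it is not PyCaret’s internal code, and the dictionary contents are invented:

```python
import os
import pickle
import tempfile

# Stand-in for a trained model: a plain dict (a real PyCaret model is an estimator object)
toy_model = {"feature": "Income", "threshold": 100}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "decision_tree_1.pkl")

    # Save: serialise the object to disk
    with open(model_path, "wb") as f:
        pickle.dump(toy_model, f)

    # Load: read it back, as you would in a later session
    with open(model_path, "rb") as f:
        restored_model = pickle.load(f)

print(restored_model == toy_model)  # True
```

As with any pickled file, only load models from sources you trust, since unpickling can execute arbitrary code.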
