Support Vector Machine with implementation | supervised learning Algorithm

April 07, 2020

Support vector machine (SVM) is a supervised learning algorithm, which means it works on labeled datasets. It can be used for both classification and regression, and it is one of the most popular algorithms for classification problems. The algorithm is powerful and still easy to understand, even for complex tasks.

Definition
SVM is a supervised algorithm that classifies cases by finding a separator. It involves the following two steps.
1. Mapping the data into a high-dimensional feature space.
2. Finding a separator in that space.

The figure above has three blocks. In the first block the model takes the dataset and is trained on it. In the second phase it is tested on some inputs, and at last we get the result.

Support vector machine generally deals with two types of data: linearly separable data and non-linearly separable data.

This is an example of non-linearly separable data.
We use kernelling (the kernel trick) to solve non-linear problems.

It is nothing but the conversion of a non-linear problem into a linear one by mapping the data with functions such as polynomials, the sigmoid function etc.

e.g. non-linear data with one variable can often be made linearly separable by using the following function, which maps each value x to the pair of features x and x^2.
f(x) = [ x : x^2 ]
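
To make the mapping concrete, here is a minimal sketch in plain Python. The sample values, the labels and the threshold of 2.5 are made up for illustration. On the original x axis no single cut-off separates the two classes, but after mapping each point to (x, x^2) a single straight line on the x^2 axis does.

```python
def feature_map(x):
    """Map a 1-D value to the 2-D feature vector (x, x^2)."""
    return (x, x * x)

# Hypothetical 1-D samples: class 1 sits on both sides of class 0,
# so no single threshold on x can separate them.
samples = [(-3.0, 1), (-2.0, 1), (-1.0, 0), (0.0, 0), (1.0, 0), (2.0, 1), (3.0, 1)]

# Apply the mapping to every sample.
mapped = [(feature_map(x), label) for x, label in samples]

# In the mapped space the horizontal line x^2 = 2.5 separates the classes.
# The threshold is picked by hand here, not learned by an SVM.
threshold = 2.5
predictions = [1 if features[1] > threshold else 0 for features, _ in mapped]
labels = [label for _, label in samples]

print(predictions == labels)
```

A real SVM would learn the separating line itself. The sketch only shows why the extra x^2 feature makes a straight-line separator possible.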

In this figure the data can be separated by a single straight line, which is why it is called linearly separable data.

Some terms that are related to SVM are given below.
1. Hyperplane - The boundary that separates the dataset into two classes is known as the hyperplane. In two dimensions it is a straight line. It is also known as the decision boundary.
2. Margin - Two straight lines are drawn parallel to the hyperplane, one on each side, each passing through the nearest data points of one class. The distance between the hyperplane and each parallel line is denoted by D- and D+ respectively, and the distance between the two parallel lines (D- plus D+) is known as the margin.
3. Support vectors - The data points that are nearest to the hyperplane, i.e. the points that lie on the two parallel lines, are known as support vectors.
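
These terms can be checked with a little arithmetic. The sketch below is plain Python on a made-up 2-D dataset. The hyperplane w.x + b = 0 is picked by hand (an assumption for illustration, not the result of training) and scaled so that the nearest points satisfy |w.x + b| = 1. With that scaling, the distance from the hyperplane to each parallel line is 1/||w|| and the distance between the two parallel lines is 2/||w||.

```python
import math

# Hypothetical toy dataset: (point, label) with labels +1 and -1.
data = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]

# Hand-picked hyperplane w.x + b = 0, scaled so the nearest points
# of each class give w.x + b = +1 or -1.
w = (0.5, 0.5)
b = -1.0

def decision(x):
    """Value of w.x + b; its sign is the predicted class."""
    return w[0] * x[0] + w[1] * x[1] + b

norm_w = math.hypot(w[0], w[1])

# Distance from the hyperplane to each parallel line (D+ and D-).
side_distance = 1.0 / norm_w
# Distance between the two parallel lines.
margin_width = 2.0 / norm_w

# Support vectors are the points that lie exactly on the parallel lines.
support_vectors = [x for x, _ in data if math.isclose(abs(decision(x)), 1.0)]

print(side_distance, margin_width, support_vectors)
```

Here the points (2, 2) and (0, 0) end up on the parallel lines, so they are the support vectors. The other two points lie further away and do not affect the boundary.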

Significance of Margin

We prefer the model that has the larger margin, because a model with a large margin tends to generalize better and so is better for future predictions.
As a rule of thumb, a wider margin usually goes with better accuracy on unseen data, although this is not a strict proportionality.

Advantages
1. Accurate in high-dimensional spaces.
2. Memory efficient, because the decision function uses only the support vectors.

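Disadvantages
1. Can lead to over-fitting if the kernel and regularization are not chosen carefully, especially when the number of features is much greater than the number of samples.
2. No direct probability estimates.
3. Best suited to small datasets, because training becomes slow on large ones.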

Applications
1. Image classification.
2. Text mining.
3. Handwriting recognition.
4. Gene expression classification.

Implementation of SVM algorithm

Here I am going to upload snapshots of the code. If you want the whole project based on support vector machine, just drop a comment and I'll explain the whole project to you.

After running this script you'll get the value 0.9635036496350365, which is close to 1. It means our model has approximately 96% accuracy, which is a good sign for future predictions.
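
For readers who want something to run directly, here is a minimal sketch of an SVM training script. It assumes scikit-learn is installed and uses its built-in breast cancer dataset, an RBF kernel, a 20% test split and a fixed random seed. All of these are my assumptions for illustration, and this is not necessarily the script shown in the snapshots, so the accuracy it prints will likely differ from the value quoted above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load a labeled dataset (assumed here; swap in your own features X and labels y).
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% of the rows for testing; the seed only makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=4
)

# Scale the features, then fit an SVM classifier with an RBF kernel.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_train, y_train)

# Predict on the held-out rows and measure the fraction classified correctly.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(accuracy)
```

The flow matches the three blocks described earlier: train on one part of the data, test on the held-out part, then read off the result.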

If you have any doubts, just drop a comment.
Connect with us on -
https://www.instagram.com/kavyansh.pandey

Thank You.
