Support Vector Machine with implementation | supervised learning Algorithm

A support vector machine (SVM) is a supervised learning algorithm, which means it is trained on a labeled dataset. It can be used for both classification and regression, and it is one of the most popular algorithms for classification problems. The algorithm is powerful on complex tasks while remaining conceptually clear.
Definition
SVM is a supervised algorithm that classifies cases by finding a separator. It involves the following two steps.
1. Mapping the data into a high-dimensional feature space.
2. Finding a separator in that space.

The figure above has three blocks. In the first block the model takes in the dataset and is trained on it. In the second block it is tested on some inputs, and in the last block we get the result.

A support vector machine generally deals with two types of data: linearly separable data and non-linearly separable data.
The figure here is an example of non-linearly separable data.
We use kernelling (the kernel trick) to solve non-linear problems.

It is nothing but mapping non-linearly separable data into a space where it becomes linearly separable, using functions such as polynomial or sigmoid kernels.

e.g. Non-linear data with one variable can often be made linearly separable by using the following mapping.
f(x) = (x, x^2)
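
The mapping above can be sketched in a few lines of plain Python. This is only an illustration: the toy values in xs and labels and the threshold 2.5 are made up for the example and are not from any real dataset.

```python
def feature_map(x):
    """Map a 1-D value x to the 2-D point (x, x^2)."""
    return (x, x * x)

# Hypothetical toy data: class 1 sits in the middle and class 0 on both sides,
# so no single cut-off on x alone can separate the two classes.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [0, 0, 1, 1, 1, 0, 0]

mapped = [feature_map(x) for x in xs]

# In the mapped space the horizontal line x^2 = 2.5 is a linear separator:
# points below the line are class 1, points above it are class 0.
threshold = 2.5
predicted = [1 if x_squared < threshold else 0 for (_, x_squared) in mapped]

print(predicted == labels)
```

In the original one-dimensional space you would need two cut-off points, but after adding the x^2 coordinate a single straight line is enough.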



In this figure the data can be separated by a single straight line, which is why it is called linearly separable data.










Some terms related to SVM are given below.
1. Hyperplane - The boundary that divides the dataset into two classes is known as the hyperplane. In two dimensions it is a straight line. It is also known as the decision boundary.
2. Margin - Two straight lines are drawn parallel to the hyperplane, one on each side, passing through the nearest data points of each class. The distance between the hyperplane and each of these parallel lines is known as the margin, denoted by D- and D+ respectively.
3. Support vectors - The data points that lie nearest to the hyperplane, on the two parallel lines, are known as support vectors.
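
To make these three terms concrete, here is a small sketch in plain Python. The hyperplane (w, b) and the toy points are hypothetical values picked for illustration. Nothing is trained here; the code only measures distances to a hyperplane that is already given.

```python
import math

def distance_to_hyperplane(w, b, point):
    """Distance from a point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, point)) + b) / norm

# Hypothetical hyperplane x1 + x2 - 4 = 0 and toy points on each side of it
w, b = (1.0, 1.0), -4.0
positive = [(3.0, 3.0), (4.0, 3.0), (5.0, 5.0)]
negative = [(1.0, 1.0), (0.0, 1.0), (-1.0, 0.0)]

# The support vector of each class is the point nearest to the hyperplane
sv_plus = min(positive, key=lambda p: distance_to_hyperplane(w, b, p))
sv_minus = min(negative, key=lambda p: distance_to_hyperplane(w, b, p))

# D+ and D- are the distances from the hyperplane to those nearest points
d_plus = distance_to_hyperplane(w, b, sv_plus)
d_minus = distance_to_hyperplane(w, b, sv_minus)

print("support vectors:", sv_plus, sv_minus)
print("D+ =", round(d_plus, 3), "D- =", round(d_minus, 3))
```

Here D+ and D- come out equal, which is what you expect when the hyperplane sits exactly midway between the two classes.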

Significance of Margin

We prefer the model with the larger margin, because a large margin leaves more room between the two classes and tends to give more accurate predictions on future data.
In general, the wider the margin, the better the model is expected to generalize.
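
One common way to write this down, assuming the standard hard-margin formulation (the symbols w, b, x_i and y_i are introduced here for illustration and do not appear in the post): if the hyperplane is w·x + b = 0 and it is scaled so that the nearest points of each class satisfy |w·x + b| = 1, then the margin and the training objective look like this.

```latex
D_{+} = D_{-} = \frac{1}{\lVert w \rVert},
\qquad
D_{+} + D_{-} = \frac{2}{\lVert w \rVert}

\min_{w,\,b} \ \frac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \,(w \cdot x_i + b) \ge 1 \ \ \text{for all } i
```

So picking the model with the greater margin is the same as picking the smallest ||w|| that still classifies every training point correctly.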

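Advantages
1. Accurate in high-dimensional spaces.
2. Memory efficient, because the decision function uses only the support vectors.

Disadvantages
1. Can lead to over-fitting if the kernel and its parameters are not chosen carefully.
2. No direct probability estimates.
3. Best suited to small datasets, since training becomes slow on large ones.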

Applications
1. Image classification.
2. Text mining.
3. Handwriting recognition.
4. Gene expression classification.

Implementation of SVM algorithm

Here I am going to upload snapshots of the code. If you want the whole project based on support vector machine, just drop a comment and I'll explain the whole project to you.








After running this script you'll get the value 0.9635036496350365, which is close to 1. This means our model has approximately 96% accuracy, which is promising for future predictions.
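
If you want something you can run straight away, here is a minimal sketch of a similar workflow. It assumes scikit-learn is installed and uses its built-in breast cancer dataset, which is not necessarily the dataset behind the number above. The split, the kernel and the parameter values are illustrative choices, so the accuracy you get will differ from 0.9635.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load a labeled dataset (assumption: scikit-learn's built-in breast cancer data)
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=4, stratify=y
)

# Scale the features, then fit an SVM with an RBF kernel
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)

# Predict on the held-out rows and measure accuracy
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(accuracy)
```

Scaling the features first matters for SVM because the margin is measured as a distance, so a feature with a large numeric range would otherwise dominate the others.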

If you have any doubts, just drop a comment.
Connect with us on -
https://www.instagram.com/kavyansh.pandey

Thank You.







