Support Vector Machines: What Is a Support Vector in SVM?
A support vector machine (SVM) is a supervised machine learning model used to classify complex datasets of small or medium size. Its name comes from the term support vectors.
Introduction
In machine learning, you will often hear the term support vector machine, or SVM. As you may know, it is a supervised learning model; in other words, it works with labeled data. SVMs are widely known in the data science community.
What Is a Support Vector Machine (SVM)?
The support vector machine is a powerful machine learning model. It can perform linear and nonlinear classification, as well as regression and outlier detection. It is mainly used for classification, and it is particularly well suited to complex datasets of small or medium size.
SVM: A Binary Classifier
An SVM is strictly a binary classifier. Still, various strategies can be used to perform multiclass classification, for example by combining multiple binary classifiers. The goal of a support vector machine is to fit the widest possible street between two classes.
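The idea of using multiple binary classifiers can be sketched as follows. This is only a minimal illustration, assuming scikit-learn is installed; the iris dataset and the one-vs-rest strategy are choices made for this example, not something the text above prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Iris has three classes, so a single binary SVM is not enough.
X, y = load_iris(return_X_y=True)

# One-vs-rest: train one binary LinearSVC per class (that class vs. all others).
ovr = OneVsRestClassifier(LinearSVC(max_iter=10000))
ovr.fit(X, y)

n_binary = len(ovr.estimators_)
predicted_classes = sorted(set(int(label) for label in ovr.predict(X)))

print("binary classifiers trained:", n_binary)
print("classes that appear in the predictions:", predicted_classes)
```

Each fitted estimator in `ovr.estimators_` is an ordinary two-class SVM; the wrapper picks the class whose classifier is most confident.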
Linear SVM as binary classifier 
Two classes that can be clearly separated with a straight line are said to be linearly separable. In the figure above, the two classes of training instances, represented by clusters of dots and bubbles, are linearly separable and are separated by the decision boundary (the continuous middle line).
The decision boundary separates the two classes while staying as far away as possible from the closest training instances on both sides. A decision boundary can be a line, a plane, or a hyperplane (in higher dimensions).
Together with the dotted lines on both sides, the decision boundary forms a street. As long as new training instances are not added on the street itself (inside the margin), they can be added anywhere else, in any quantity, without affecting the decision boundary at all.
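The claim that instances added off the street leave the boundary unchanged can be checked with a small experiment. The sketch below assumes scikit-learn and NumPy; the toy points are hypothetical, and a large C is used only to approximate a hard margin on separable data.

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable classes (hypothetical toy data).
X = np.array([[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

# A large C approximates a hard-margin linear SVM on separable data.
clf = SVC(kernel="linear", C=1000).fit(X, y)

# Add new instances far away from the street, each on its own class's side.
X_more = np.vstack([X, [[-5, -5], [-4, -6], [10, 10], [9, 12]]])
y_more = np.concatenate([y, [0, 0, 1, 1]])
clf_more = SVC(kernel="linear", C=1000).fit(X_more, y_more)

# The boundary is w . x + b = 0; compare w and b before and after.
print("before:", clf.coef_[0], clf.intercept_[0])
print("after: ", clf_more.coef_[0], clf_more.intercept_[0])

# The width of the street is 2 / ||w||.
street_width = 2 / np.linalg.norm(clf.coef_)
print("street width:", street_width)
```

Both fits should report essentially the same weights and intercept (up to solver tolerance), because only the instances on the edge of the street determine the boundary.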
Support Vectors
We have seen what an SVM is. Now we will look at the SV part of SVM: what is a support vector? The concept is of great importance. The figure above points out the support vectors.
Support vectors are the training instances (training points) located on the dotted lines on either side of the decision boundary of the SVM classifier.
Put more simply, they are the points that lie on the border between the classes. The support vector machine gets its name from these support vectors.
To make a prediction for a new point, the distance to each of the support vectors is taken into account, and the classification decision is based on those distances. In a kernelized SVM with the RBF kernel, that distance is measured by the Gaussian kernel.
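To make the role of the support vectors concrete, the sketch below recomputes the score of a new point by hand from the support vectors of a fitted RBF-kernel classifier. It assumes scikit-learn and NumPy; the `make_moons` data, the value of `gamma`, the helper `gaussian_kernel`, and the point `x_new` are all illustrative choices, not taken from the text above.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Synthetic two-class data, used only for illustration.
X, y = make_moons(n_samples=100, noise=0.1, random_state=0)

gamma = 0.5  # illustrative value, not a tuned setting
clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X, y)

print("number of support vectors:", len(clf.support_vectors_))

def gaussian_kernel(a, b, gamma):
    # exp(-gamma * squared Euclidean distance between a and b)
    return np.exp(-gamma * np.sum((a - b) ** 2))

x_new = np.array([0.5, 0.0])

# Similarity of the new point to every support vector.
similarities = np.array(
    [gaussian_kernel(x_new, sv, gamma) for sv in clf.support_vectors_]
)

# Weighted sum over the support vectors plus the intercept.
manual_score = similarities @ clf.dual_coef_[0] + clf.intercept_[0]
library_score = clf.decision_function([x_new])[0]

print("manual score:     ", manual_score)
print("decision_function:", library_score)
```

The sign of the score decides the predicted class; training points that are not support vectors do not enter the sum at all.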
Linear SVM
The linear support vector machine (linear SVM) is one of the most common linear classification algorithms. In scikit-learn it is implemented in svm.LinearSVC, where SVC stands for support vector classifier.
Linear models are very fast, both to train and to predict. They are also easy to understand, which makes it easy to see how a prediction is made.
LinearSVC
We can apply the LinearSVC model to the forge dataset and visualize the decision boundary found by the linear model.
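A minimal sketch of fitting LinearSVC and reading off its decision boundary is shown here. It assumes scikit-learn and NumPy; the data is a synthetic stand-in generated with `make_blobs`, not the dataset behind the figures in this post, and the plotting step itself is left out.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

# Synthetic two-class, two-feature data (a stand-in for illustration only).
X, y = make_blobs(n_samples=40, centers=2, cluster_std=1.0, random_state=0)

clf = LinearSVC(C=1.0, max_iter=10000).fit(X, y)

w0, w1 = clf.coef_[0]
b = clf.intercept_[0]

# The decision boundary is the line w0*x0 + w1*x1 + b = 0.
print(f"boundary: {w0:.3f}*x0 + {w1:.3f}*x1 + {b:.3f} = 0")

train_acc = clf.score(X, y)
print("training accuracy:", train_acc)

# Points along the boundary, ready to be passed to a plotting library:
# solve the boundary equation for x1 at evenly spaced x0 values.
x0_line = np.linspace(X[:, 0].min(), X[:, 0].max(), 50)
x1_line = -(w0 * x0_line + b) / w1
```

Plotting `x0_line` against `x1_line` on top of a scatter plot of `X` gives the kind of picture used in this post.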
LinearSVC has a parameter called C that determines the regularization strength, and it is of great importance. If you have studied regularized regression models, you may know the parameter alpha. C plays an analogous role, but in the opposite direction: a larger alpha means more regularization, while a larger C means less.
C And Classification
The higher the value of C, the weaker the regularization. A higher value of C puts more emphasis on classifying each individual data point correctly: it forces the model to draw the line so that it divides the training points as accurately as possible. Let's take a look at some examples.
In the first case, the value of C is very small, which corresponds to more regularization. With this much regularization, the decision boundary tends to be nearly horizontal and misclassifies two points.

In the second case, the value of C is a little larger than in the first example. The model tries to cover the two misclassified points, which causes the decision boundary to tilt.
Decision boundary of a linear SVM with C=100.00
Finally, a still higher value of C tilts the decision boundary (the line of separation) far enough that it correctly classifies all points in the lower class (represented by blue dots). However, this model is not a good one: it does not capture the overall layout of the classes well, so it is likely overfitting.
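The effect of C can also be observed numerically. The sketch below assumes scikit-learn and NumPy; the data is again a synthetic stand-in from `make_blobs`, and the three C values are illustrative rather than the exact settings behind the figures. Weaker regularization typically shows up as larger learned weights.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

# Slightly overlapping two-class data (a stand-in, not the data in the figures).
X, y = make_blobs(n_samples=60, centers=2, cluster_std=1.5, random_state=0)

weight_norms = {}
for C in [0.01, 10.0, 100.0]:
    # dual=False selects the primal solver, chosen here only because it
    # converges reliably on a small dense dataset like this one.
    clf = LinearSVC(C=C, dual=False, max_iter=10000).fit(X, y)
    weight_norms[C] = float(np.linalg.norm(clf.coef_))
    print(
        f"C={C:>6}: ||w|| = {weight_norms[C]:.3f}, "
        f"training accuracy = {clf.score(X, y):.2f}"
    )
```

Training accuracy alone does not show whether a model generalizes; comparing against held-out data is the usual way to detect the overfitting described above.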
Conclusion on C
From the figures above, you can see what C does in practice and why it matters.
A higher value of C generally results in higher accuracy on the training data, because the model tries harder to classify every training point correctly. This is not the same as a better model: as the C=100 example shows, raising C too far can lead to overfitting.
For linear models, the decision boundary is a linear function of the input.
The SVM classifier looks for the widest possible margin (the space between the two dotted lines) between the classes. You can think of it as classification with a margin, which is why it is called large margin classification.
Kernelized Support Vector Machines
Up to this point, we have looked at the linear SVM. Now let's have a glance at its kernelized form, which can handle more scattered data and still classify it well. You will often hear SVM used to refer to both the linear support vector machine and the kernelized support vector machine, but the two are not interchangeable. Kernelized support vector machines give us the ability to perform more complex tasks: by using kernels, we can learn more advanced decision boundaries that are not limited to lines and hyperplanes in the input space. The regularization parameter C, together with gamma (the RBF kernel parameter), controls the complexity and precision of the model, so both C and gamma need to be tuned.
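As a rough illustration of what the kernelized form adds, the sketch below compares a linear kernel with an RBF kernel on data that no straight line can separate, and then tries a small grid of C and gamma values. It assumes scikit-learn; the `make_circles` data and every parameter value are illustrative assumptions, not recommended settings.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Two concentric rings: not separable by a straight line.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_svm = SVC(kernel="linear", C=1.0).fit(X, y)
rbf_svm = SVC(kernel="rbf", C=1.0, gamma=1.0).fit(X, y)

linear_acc = linear_svm.score(X, y)
rbf_acc = rbf_svm.score(X, y)
print("linear kernel training accuracy:", linear_acc)
print("RBF kernel training accuracy:   ", rbf_acc)

# Tune C and gamma together with 5-fold cross-validation over a small grid.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.1, 1, 10]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5).fit(X, y)
print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", search.best_score_)
```

The grid is deliberately tiny; in practice the ranges for C and gamma are usually searched on a logarithmic scale.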
Kernel Trick
The kernel trick is a mathematical technique that lets an SVM work in a higher-dimensional feature space without explicitly computing the new features. We use it whenever the input dataset is not linearly separable, that is, when it cannot be separated with just a line or a hyperplane.
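A small numerical check can make the idea less abstract. The sketch below assumes NumPy and uses a degree-2 polynomial kernel as the example; the feature map `phi`, the function `poly_kernel`, and the two sample points are hypothetical names and values introduced only for this illustration.

```python
import numpy as np

def phi(x):
    # Explicit degree-2 feature map for a 2-D input:
    # (x1, x2) -> (x1^2, sqrt(2)*x1*x2, x2^2)
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly_kernel(x, z):
    # The same inner product, computed directly in the original 2-D space.
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

explicit = np.dot(phi(x), phi(z))  # map to 3-D first, then take the dot product
via_kernel = poly_kernel(x, z)     # never leaves 2-D

print("explicit feature map:", explicit)
print("kernel function:     ", via_kernel)
```

Both numbers agree, which is the point of the trick: the SVM only ever needs inner products between points, so the kernel function can stand in for the more expensive explicit mapping.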
That's all for this post. We will discuss the Kernelised SVM and gamma in another post.