**Tune Parameters**

Another important point is **to check the SVM algorithm parameters**. Like many machine learning algorithms, SVM has parameters that must be tuned to get good performance. This is very important: **SVM is very sensitive to the choice of parameters**. Even similar parameter values can lead to very different classification results. Really! To find the best values for your problem, test several candidates. A great tool for this job in R is the **tune.svm()** method from the e1071 package. It tries several values and returns the ones that minimize the classification error under 10-fold cross-validation.

Example of tune.svm() output:
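The original output is not reproduced here. A call that produces this kind of result might look like the following minimal sketch. It assumes the e1071 package is installed, uses the built-in iris dataset as a stand-in for your own data, and the candidate grids for gamma and cost are arbitrary choices for illustration.

```r
library(e1071)  # provides svm() and tune.svm()

set.seed(42)  # cross-validation folds are random; fix the seed for repeatability

# iris stands in for your own data; Species is the class column
tuned <- tune.svm(Species ~ ., data = iris,
                  gamma = 10^(-3:1),  # candidate gamma values (assumed grid)
                  cost  = 10^(0:2))   # candidate C values (assumed grid)

print(tuned$best.parameters)   # gamma/cost pair with the lowest cross-validated error
print(tuned$best.performance)  # the corresponding cross-validated error
```

The fitted model for the winning pair is also available as `tuned$best.model`, so it does not need to be trained again.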

The γ (gamma) parameter has to be tuned to better fit the hyperplane to the data. It controls how non-linear the decision boundary can be, which is why it is not used with linear kernels. **The smaller γ is, the more the hyperplane is going to look like a straight line**. **If γ is too large, the hyperplane will be more curvy** and might delineate the data too well, leading to overfitting.
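One way to see this effect is to fit the same model with a few γ values and compare accuracy on the training rows with accuracy on held-out rows. The sketch below assumes the e1071 package and the built-in iris dataset; the 100/50 split and the three γ values are hypothetical choices. A large gap between the two accuracies at high γ is the usual sign of overfitting.

```r
library(e1071)

set.seed(1)
idx   <- sample(nrow(iris), 100)  # hypothetical 100/50 train/test split
train <- iris[idx, ]
test  <- iris[-idx, ]

gammas <- c(0.01, 1, 100)  # small, moderate, and large gamma (assumed values)
results <- data.frame(gamma = gammas, train_acc = NA_real_, test_acc = NA_real_)

for (i in seq_along(gammas)) {
  fit <- svm(Species ~ ., data = train, kernel = "radial",
             gamma = gammas[i], cost = 1)
  # fraction of correctly classified rows on each subset
  results$train_acc[i] <- mean(predict(fit, train) == train$Species)
  results$test_acc[i]  <- mean(predict(fit, test)  == test$Species)
}

print(results)
```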

*Picture 2 - Large value of* **γ**

Another parameter to be tuned to help improve accuracy is C. It controls the size of the "soft margin" of SVM. The soft margin is a "gray" area around the hyperplane: training points are allowed to fall inside it, or even on the wrong side of the hyperplane, at a penalty that grows with C.

**The smaller the value of C, the greater the soft margin**.

*Picture 3 - Large values of C*

*Picture 4 - Small values of C*
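A rough way to observe the margin changing without a plot is to count the support vectors for different values of C, since a wider soft margin typically contains more training points. This is a minimal sketch, assuming the e1071 package and the built-in iris dataset; the three cost values and the fixed gamma are arbitrary.

```r
library(e1071)

costs <- c(0.01, 1, 100)  # small, moderate, and large C (assumed values)

# tot.nSV is the total number of support vectors in the fitted model
n_sv <- sapply(costs, function(C) {
  fit <- svm(Species ~ ., data = iris, kernel = "radial",
             gamma = 0.25, cost = C)
  fit$tot.nSV
})

print(data.frame(cost = costs, support_vectors = n_sv))
```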

**How to Prepare Data**

The svm() method in R expects a matrix or data frame with one column identifying the class of each row and several feature columns that describe it. The following table shows an example of two classes, 0 and 1, and some features. Each row is a data entry.

class f1 f2 f3
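The table's values are not reproduced here. A data frame with the same column layout could be built as in the sketch below; every number in it is made up for illustration. Converting the class column to a factor is what tells svm() to treat the task as classification.

```r
# Hypothetical values, for illustration only
my_data <- data.frame(
  class = c(0, 1, 0, 1, 1, 0),
  f1    = c(0.2, 1.5, 0.1, 1.7, 1.4, 0.3),
  f2    = c(3.1, 0.4, 2.9, 0.6, 0.5, 3.3),
  f3    = c(7.0, 5.5, 6.8, 5.1, 5.3, 7.2)
)

# Mark the label column as categorical rather than numeric
my_data$class <- factor(my_data$class)

str(my_data)  # one factor column with 2 levels, three numeric feature columns
```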

The input for the svm() method could be:

> svm(class ~., data = my_data, kernel = "radial", gamma = 0.1, cost = 1)

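Here "class" is the name of the column that holds the class labels and "my_data" is your dataset. For classification, the class column should be a factor; with a numeric column, svm() fits a regression model instead. Choose the parameter values that work best for your problem, for example the ones returned by tune.svm().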
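Putting the call in context, the following sketch trains the model and checks its predictions against the known labels. It assumes the e1071 package; the dataset is randomly generated here purely so the example runs on its own, and the gamma and cost values are the ones from the call above, not tuned values.

```r
library(e1071)

set.seed(7)
n <- 40

# Synthetic stand-in for my_data: two classes, f1 and f2 shifted between them
my_data <- data.frame(
  class = factor(rep(c(0, 1), each = n / 2)),
  f1    = c(rnorm(n / 2, mean = 0), rnorm(n / 2, mean = 2)),
  f2    = c(rnorm(n / 2, mean = 0), rnorm(n / 2, mean = 2)),
  f3    = rnorm(n)  # a feature unrelated to the class
)

model <- svm(class ~ ., data = my_data, kernel = "radial", gamma = 0.1, cost = 1)

# Predictions on the training rows, cross-tabulated against the true labels
pred <- predict(model, my_data)
print(table(predicted = pred, actual = my_data$class))
```

Accuracy measured on the training rows is optimistic; a held-out set or the cross-validated error from tune.svm() gives a fairer estimate.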
