Although a support vector machine model (binary classifier) is more commonly built by solving a quadratic programming problem in the dual space, it can be built fast by solving the primal optimization problem also. In this article a Support Vector Machine implementation is going to be described by solving the primal optimization problem with subgradient solver using stochastic gradient decent. The algorithm is called the Pegasosalgorithm, as described by Shai ShalevShwartz et al, in their original paper.
As shown in the next figure taken from the slides, the SoftSVM Primal Lagrangian can be represented as follows:
or as the following if the explicit bias term is discarded:
where the 01 loss is approximated by the hingeloss.
The following figure shows a simplified version of the algorithm:
The following python code implements the algorithm:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27

class SVMPegasos(LinearClassifier): """Implementation of SVM with SGD with PEGASOS Algorithm""" def __init__( self , n_iter = 10 , lambda1 = 1 ): self .n_iter = n_iter self .lambda1 = lambda1 def fit( self , X, Y): Y = list (Y) self .find_classes(Y) # convert all output values to +1 or 1 Yn = [sign(y, self .positive_class) for y in Y] X = X.toarray() m, n_features = X.shape[ 0 ], X.shape[ 1 ] self .w = numpy.zeros( n_features ) #print(m) for i in range ( self .n_iter): eta = 1. / ( self .lambda1 * (i + 1 )) j = numpy.random.choice(m, 1 )[ 0 ] x, y = X[j], Yn[j] score = self .w.dot(x) if y * score /code>

https://sandipanweb.files.wordpress.com/2018/04/ov.png?w=150&h=128 150w, https://sandipanweb.files.wordpress.com/2018/04/ov.png?w=300&h=255 300w" sizes="(maxwidth: 765px) 100vw, 765px" />
https://sandipanweb.files.wordpress.com/2018/04/svm_ov.png?w=150&am... 150w, https://sandipanweb.files.wordpress.com/2018/04/svm_ov.png?w=300&am... 300w" sizes="(maxwidth: 845px) 100vw, 845px" />
The next figure, taken from the same paper shows how the algorithm can be adapted to kernelSVM.
https://sandipanweb.files.wordpress.com/2018/04/f26.png?w=95&h=150 95w, https://sandipanweb.files.wordpress.com/2018/04/f26.png?w=190&h... 190w" sizes="(maxwidth: 846px) 100vw, 846px" />
https://sandipanweb.files.wordpress.com/2018/04/svm_flame.png?w=150... 150w, https://sandipanweb.files.wordpress.com/2018/04/svm_flame.png?w=300... 300w" sizes="(maxwidth: 883px) 100vw, 883px" />
https://sandipanweb.files.wordpress.com/2018/04/jain.png?w=150&... 150w, https://sandipanweb.files.wordpress.com/2018/04/jain.png?w=300&... 300w" sizes="(maxwidth: 783px) 100vw, 783px" />
https://sandipanweb.files.wordpress.com/2018/04/svm_jain.png?w=150&... 150w, https://sandipanweb.files.wordpress.com/2018/04/svm_jain.png?w=300&... 300w" sizes="(maxwidth: 1079px) 100vw, 1079px" />
The following figures show how by changing the loss function (from hingeloss to logloss) in the PEGASOS algorithm, a logistic regression model can be trained.
https://sandipanweb.files.wordpress.com/2018/04/f30.png?w=71 71w, https://sandipanweb.files.wordpress.com/2018/04/f30.png?w=142 142w, https://sandipanweb.files.wordpress.com/2018/04/f30.png 690w" sizes="(maxwidth: 676px) 100vw, 676px" /> https://sandipanweb.files.wordpress.com/2018/04/f36.png?w=150 150w, https://sandipanweb.files.wordpress.com/2018/04/f36.png?w=300 300w, https://sandipanweb.files.wordpress.com/2018/04/f36.png 754w" sizes="(maxwidth: 688px) 100vw, 688px" /> https://sandipanweb.files.wordpress.com/2018/04/f27.png?w=150&h=62 150w, https://sandipanweb.files.wordpress.com/2018/04/f27.png?w=300&h... 300w" sizes="(maxwidth: 721px) 100vw, 721px" /> https://sandipanweb.files.wordpress.com/2018/04/f28.png?w=150&h... 150w, https://sandipanweb.files.wordpress.com/2018/04/f28.png?w=300&h... 300w" sizes="(maxwidth: 682px) 100vw, 682px" />
https://sandipanweb.files.wordpress.com/2018/04/lr_ov.png?w=150&... 150w, https://sandipanweb.files.wordpress.com/2018/04/lr_ov.png?w=300&... 300w" sizes="(maxwidth: 635px) 100vw, 635px" />
Given the following dataset with 11914 texts, let’s fit a Perceptron model along with SVM / LR models with PEGASOS algorithm on 80% sample and predict to classify the sentiments of the 20% heldout text, followed by computing the accuracy. The next figure shows first few rows of the corpus and number of positive and negative sentiment examples.
https://sandipanweb.files.wordpress.com/2018/04/f37.png?w=150&h=96 150w, https://sandipanweb.files.wordpress.com/2018/04/f37.png?w=300&h... 300w" sizes="(maxwidth: 518px) 100vw, 518px" />
https://sandipanweb.files.wordpress.com/2018/04/f39.png?w=150&h=74 150w" sizes="(maxwidth: 202px) 100vw, 202px" />
The next figures show the distribution of the word lengths for the texts for different sentiments.
https://sandipanweb.files.wordpress.com/2018/04/f38.png?w=150 150w, https://sandipanweb.files.wordpress.com/2018/04/f38.png?w=300 300w, https://sandipanweb.files.wordpress.com/2018/04/f38.png?w=768 768w, https://sandipanweb.files.wordpress.com/2018/04/f38.png 935w" sizes="(maxwidth: 676px) 100vw, 676px" />
The following python code shows how the textclassification pipeline is implemented with SVMPegasos.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

def classify_with_svm(): # read all the documents X, Y = read_corpus( 'all_sentiment_shuffled.txt' ) # split into training and test parts Xtrain, Xtest, Ytrain, Ytest = train_test_split(X, Y, train_size = 0.8 , random_state = 0 ) classifier = Pipeline( [( 'vec' , CountVectorizer(preprocessor = lambda x: x, tokenizer = lambda x: x)), ( 'fs' , SelectKBest(k = 5000 )), ( 'cls' , SVMPegasos(n_iter = len (Xtrain) * 10 , lambda1 = 0.015 ))] ) t0 = time.time() classifier.fit(Xtrain, Ytrain) t1 = time.time() print ( 'Training time:' , t1  t0, 'seconds.' ) Yguess = classifier.predict(Xtest) print ( 'accuracy:' , accuracy_score(Ytest, Yguess)) 
As can be seen, even with very simple features, the performance of all the classifiers in terms of accuracy of prediction on the test set and the time taken to train the model are good and comparable to each other.
https://sandipanweb.files.wordpress.com/2018/04/f41.png?w=103&h... 103w" sizes="(maxwidth: 358px) 100vw, 358px" />
https://sandipanweb.files.wordpress.com/2018/04/f42.png?w=100&h... 100w" sizes="(maxwidth: 355px) 100vw, 355px" />
© 2020 Data Science Central ® Powered by
Badges  Report an Issue  Privacy Policy  Terms of Service
Most Popular Content on DSC
To not miss this type of content in the future, subscribe to our newsletter.
Other popular resources
Archives: 20082014  20152016  20172019  Book 1  Book 2  More
Most popular articles
You need to be a member of Data Science Central to add comments!
Join Data Science Central