Finding Classifiers with Support Vector Machines (SVM)

Published by Jared Kunz

This is a Machine Learning (ML) project written in the R language. It is part of a series I’ve worked on as part of graduate courses I’ve taken and other personal ML and programming projects. While the exercises and problem statements may be prescribed and sometimes related to different courses, the code, analysis, and results are my own. If you see similarities between these exercises and a course you are taking, please do not copy my code or my analysis verbatim and re-use it in your classes; please use my code and analysis only as an example of one approach. I plan to expand on these tutorials and go deeper into them, i.e. improve them over time as my knowledge increases and based upon any feedback I receive.

Two Exercises Using the R Language and a Credit Card Data Set

Step One: Download the data from the link below. Step Two: Run the code called “how_to_find_a_good_classifier.r” found in this repo. The files credit_card_data.txt (without headers) and credit_card_data-headers.txt (with headers) contain a dataset with 654 data points, 6 continuous and 4 binary predictor variables.

It has anonymized credit card applications with a binary response variable (last column) indicating if the application was positive or negative. The dataset is the “Credit Approval Data Set” from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Credit+Approval) without the categorical variables and without data points that have missing values.
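For reference, here is a minimal sketch of loading the data in R. It assumes the headerless file sits in your working directory, so that read.table assigns the default column names V1…V11 used throughout the snippets below:

# Load kernlab for ksvm; install.packages("kernlab") first if needed
library(kernlab)

# Read the headerless file so read.table assigns the default names V1..V11,
# matching the column names used in the console output further down
CreditData <- read.table("credit_card_data.txt", header = FALSE)

dim(CreditData)        # expect 654 rows and 11 columns
table(CreditData$V11)  # binary response: counts of 0s and 1s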

Exercise Part One – Using the support vector machine function ksvm contained in the R package kernlab, find a good classifier for this data. Show the equation of your classifier, and how well it classifies the data points in the full data set.

Hint: You might want to view the predictions your model makes; if C is too large or too small, they’ll almost all be the same (all zero or all one) and the predictive value of the model will be poor. Even finding the right order of magnitude for C might take a little trial-and-error.
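One way to handle that trial-and-error is to sweep C across several orders of magnitude and record the accuracy on the full data set at each value. A minimal sketch, continuing from the loading snippet above (the variable names here are my own, not the ones in the repo):

# Sweep C over several orders of magnitude and record accuracy on the full data set
c_values <- 10^(-3:4)
accuracy <- sapply(c_values, function(c_val) {
  model <- ksvm(as.matrix(CreditData[, 1:10]), as.factor(CreditData$V11),
                type = "C-svc", kernel = "vanilladot", C = c_val, scaled = TRUE)
  pred <- predict(model, CreditData[, 1:10])
  sum(pred == CreditData$V11) / nrow(CreditData)
})
data.frame(C = c_values, accuracy = accuracy)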

The term λ that we use with SVM to trade off the two components, correctness and the margin, is called C in ksvm. One of the challenges of this exercise is to find a value of C that works well; for many values of C, almost all predictions will be “yes” or almost all predictions will be “no”. Also, ksvm does not directly return the coefficients a0 and a1…am. Instead, you need to do the last step of the calculation yourself.
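For that last step, the coefficients can be recovered from the slots of the fitted kernlab model: the support vectors live in @xmatrix, their weights in @coef, and the (negated) intercept in @b. A sketch along those lines, assuming a vanilladot model fit with C = 43.5 as reported below (the exact scaling handling in the repo’s script may differ from this):

# Fit a linear (vanilladot) soft-margin classifier; C = 43.5 matches the value reported below
ModelScaledUp <- ksvm(as.matrix(CreditData[, 1:10]), as.factor(CreditData$V11),
                      type = "C-svc", kernel = "vanilladot", C = 43.5, scaled = TRUE)

# a1..a10: weighted column sums of the support vectors; a0: the negative of the b slot
aScaledUp  <- colSums(ModelScaledUp@xmatrix[[1]] * ModelScaledUp@coef[[1]])
a0ScaledUp <- -ModelScaledUp@b

aScaledUp
a0ScaledUp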

Exercise Part Two – Try other (nonlinear) kernels as well; they can sometimes be useful and might provide better predictions than vanilladot.
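A sketch of that comparison loops over several kernlab kernels at a fixed C and records accuracy plus the support-vector count (the kernel list and the C value of 43.5 are my choices here, mirroring the chart further down):

# Compare several kernels at a fixed C, recording accuracy and support-vector count
kernels <- c("vanilladot", "rbfdot", "polydot", "anovadot", "splinedot", "laplacedot", "tanhdot")
results <- lapply(kernels, function(k) {
  model <- ksvm(as.matrix(CreditData[, 1:10]), as.factor(CreditData$V11),
                type = "C-svc", kernel = k, C = 43.5, scaled = TRUE)
  pred  <- predict(model, CreditData[, 1:10])
  data.frame(kernel   = k,
             accuracy = sum(pred == CreditData$V11) / nrow(CreditData),
             n_sv     = nSV(model))
})
do.call(rbind, results)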

My code for this project can be found here on my GitHub:

https://github.com/jaredkunz/MLprojectsRlang/tree/main/001proj-svmclassifiers

For example, the equation of your classifier, i.e. the answer to Exercise Part One, might look like the console output below. The coefficient equation is as follows:

> aScaledUp
          V1           V2           V3           V4           V5           V6           V7           V8           V9 
-0.004180985  0.004628256  0.014836154  0.093885681  0.569911183 -0.222436309  0.158114398 -0.001308316 -0.019636394 
         V10 
 0.105437302 
> a0ScaledUp
[1] -0.05105346
> 
> # Finding C, a.k.a. the margin, with this equation: sum(aScaledUp over V1 to V10) + a0ScaledUp (-0.0511) = 0.6481975
> # So, using this scaling approach, the margin or C, i.e. a good constant-C classifier value between 0 and 1, is:
[1] 0.6481975
> 
> # see what the model predicts with ksvm predict() function
> PredictFunctScaledUp <- predict(ModelScaledUp, CreditData[, 1:10])
> 
> # percentage of observations in the full data set that are correctly classified
> # (percent() is assumed here to come from the scales package)
> percent(sum(PredictFunctScaledUp == CreditData$V11) / nrow(CreditData))
[1] "86.39%"
Final step: building the SV fluctuation chart. For the vanilladot model it reports:

Accuracy: 86.39%
Number of support vectors: 193
Constant C: 43.5
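A fluctuation chart like that can be rebuilt by sweeping C, recording the number of support vectors and the accuracy at each value, and plotting the result. A base-R sketch (the grid of C values here is illustrative, not the exact one used for the chart above):

# Sweep C, record support-vector count and accuracy, then plot how they fluctuate
c_grid <- 10^seq(-2, 3, by = 0.5)
sweep_results <- t(sapply(c_grid, function(c_val) {
  model <- ksvm(as.matrix(CreditData[, 1:10]), as.factor(CreditData$V11),
                type = "C-svc", kernel = "vanilladot", C = c_val, scaled = TRUE)
  pred <- predict(model, CreditData[, 1:10])
  c(C = c_val, n_sv = nSV(model),
    accuracy = sum(pred == CreditData$V11) / nrow(CreditData))
}))

# Number of support vectors versus C on a log scale
plot(sweep_results[, "C"], sweep_results[, "n_sv"], log = "x", type = "b",
     xlab = "C", ylab = "Number of support vectors")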

In summary, using ksvm with the vanilladot kernel to find a good soft-margin classifier, the constant C, i.e. the margin trade-off, comes out somewhere around 43.5 or 64.9.

To understand this exercise better, here is a good article on the subject: https://machinelearningmastery.com/support-vector-machines-for-machine-learning/#:~:text=The%20margin%20is%20calculated%20as,are%20called%20the%20support%20vectors.

“The smaller the value of C, the more sensitive the algorithm is to the training data (higher variance and lower bias). The larger the value of C, the less sensitive the algorithm is to the training data (lower variance and higher bias).”

The answer to Exercise Part Two might look like this type of visual (see below the visual for an explanation):

[Figure: accuracy and number of support vectors for each ksvm kernel]

laplacedot looks way out of bounds and not valid.

rbfdot may be a better choice, as it shows more support vectors and higher accuracy, but from what I’ve been taught, anything at 90% or above is suspect and needs heavy scrutiny.

splinedot has higher accuracy but fewer support vectors.

It appears the sweet spot is using vanilladot, anovadot, or maybe polydot, though you get fewer support vectors with polydot.

With tanhdot you get much lower accuracy and fewer support vectors, yet that doesn’t mean it’s not a possible option to explore as well.

