“Please explain Support Vector Machines (SVM) like I am a 5 year old.” #analytics #machinelearning #modeling
Courtesy of @copperking at Reddit: https://www.reddit.com/r/MachineLearning/comments/15zrpp/please_explain_support_vector_machines_svm_like_i
Direct quotation from Reddit:
- “We have 2 colors of balls on the table that we want to separate.
- We get a stick and put it on the table, this works pretty well right?
- Some villain comes and places more balls on the table, it kind of works but one of the balls is on the wrong side and there is probably a better place to put the stick now.
- SVMs try to put the stick in the best possible place by having as big a gap on either side of the stick as possible.
- Now when the villain returns the stick is still in a pretty good spot.
- There is another trick in the SVM toolbox that is even more important. Say the villain has seen how good you are with a stick so he gives you a new challenge.
- There’s no stick in the world that will let you split those balls well, so what do you do? You flip the table of course! Throwing the balls into the air. Then, with your pro ninja skills, you grab a sheet of paper and slip it between the balls.
- Now, looking at the balls from where the villain is standing, the balls will look split by some curvy line.
Boring adults call the balls data, the stick a classifier, the biggest gap trick optimization, flipping the table kernelling and the piece of paper a hyperplane.”
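For readers who like to see the story as code, here is a minimal sketch in plain Python. It does not train an SVM. It only measures the gap (the margin) around sticks I picked by hand, and then "flips the table" by adding a height coordinate to each ball. All coordinates, the names `margin`, `lift`, `tight_stick`, `wide_stick` and `paper`, and the choice of height z = x^2 + y^2 are my own assumptions for illustration, not part of the Reddit answer.

```python
import math

def margin(stick, balls):
    """Smallest signed distance from any ball to the stick (hyperplane).

    stick is (w, b), describing the points x where w.x + b = 0.
    balls is a list of (point, label) pairs with label +1 or -1.
    A negative result means at least one ball is on the wrong side.
    """
    w, b = stick
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
               for x, y in balls)

# Made-up toy data: red balls are +1, blue balls are -1.
balls = [((1, 1), +1), ((2, 1), +1), ((1, 2), +1),
         ((4, 4), -1), ((5, 4), -1), ((4, 5), -1)]

tight_stick = ((-1, -1), 3.5)  # the line x + y = 3.5, hugging the red balls
wide_stick = ((-1, -1), 5.5)   # the line x + y = 5.5, centred in the gap

tight_gap = margin(tight_stick, balls)
wide_gap = margin(wide_stick, balls)

# The villain returns and adds one more red ball.
more_balls = balls + [((2.5, 1.5), +1)]
tight_after = margin(tight_stick, more_balls)  # negative: ball on the wrong side
wide_after = margin(wide_stick, more_balls)    # still positive

# New challenge: red balls in the middle, blue balls around them.
ring = [((0.5, 0), +1), ((0, -0.5), +1), ((-0.5, 0.5), +1),
        ((2, 0), -1), ((0, 2), -1), ((-2, 0), -1), ((0, -2), -1)]

def lift(point):
    """Flip the table: give a 2-D ball a height z = x^2 + y^2."""
    x, y = point
    return (x, y, x * x + y * y)

lifted = [(lift(p), label) for p, label in ring]
paper = ((0, 0, -1), 2)        # the flat sheet of paper at height z = 2
paper_gap = margin(paper, lifted)

print("gap before villain:", tight_gap, wide_gap)
print("gap after villain: ", tight_after, wide_after)
print("gap around paper:  ", paper_gap)
```

Both sticks separate the original balls, but the one with the bigger gap still works after the villain adds a ball, while the tight one does not. In the ring example the sheet of paper at height 2 corresponds, seen from above, to a circle around the red balls, which is the "curvy line". A real SVM finds the stick by optimization and usually handles the lifting implicitly through a kernel function rather than computing the extra coordinate explicitly as done here.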
Well, for a practice-oriented guy the first question obviously is: so what? What can you do with it in practice?
I think it boils down to the nature of classification algorithms. They are widely used, e.g. in image and text recognition, so a machine can learn to tell an orange from an apple, for example. This can bring efficiency gains wherever routine classification work that people now do by hand can be automated.
In conclusion, in my quest to understand machine learning it has become obvious that the support vector machine is not the easiest concept to start with. However, since classification is an essential area of machine learning, one cannot avoid it for too long.