If you haven’t read my last post about frogs and rats, go do that first, because this one builds on it. Scaling a model from binary classification (either this or that) to a more useful classification with many categories is, on a higher level, not much harder. You just tell the function to do it.
On a lower level, this requires the model to have multiple outputs, one for each class that you’re trying to find. It also requires much more data. Think about it: if all you had to do was decide whether something was a pizza or not a pizza, you’d just have to look for roughly triangular shapes with cheese on them, and those are probably pizza. Now consider if all the categories were animals of similar size, almost all of them furry.
Fortunately, I didn’t have to painstakingly collect all this data myself. I got it from a dataset called “The Oxford-IIIT Pet Dataset”. It contains roughly 200 images for each of 37 breeds of cats and dogs. 200 sounds like a bunch, but for computer vision, it’s surprisingly little. Think about it: an expert must’ve spent thousands of hours looking at these 37 breeds and still maybe makes mistakes sometimes. For a computer to see just 200 examples of each and predict with high accuracy is truly nothing short of a miracle.
This is only possible because the machine learning model had help. It was a pretrained model that was just fine-tuned on this new data. This is called transfer learning, and I talk more about it in my last post. I say it like it’s simple, but this is truly the culmination of decades of work, and there is so much that I’ve abstracted away.
At the end, after the image has gone through all the layers, the model outputs multiple values, in this case 37, one per breed. Due to the way the model is built, these all add up to 1.0. This means that whichever output has the highest number is the class the model considers most likely for your image.
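To make that last step concrete, here is a minimal sketch in plain Python. I’m assuming the “add up to 1.0” part comes from a softmax at the end of the model, which is the usual way to do it, but I haven’t shown you my model’s internals here. The breed names and raw scores below are made up for illustration, and I only use 3 classes instead of 37 to keep it readable.

```python
import math

def softmax(scores):
    """Turn raw per-class scores into positive values that sum to 1.0."""
    highest = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical class names and raw model outputs (not from the real model).
breeds = ["beagle", "pug", "siamese"]
scores = [2.0, 0.5, -1.0]

probs = softmax(scores)

# The predicted class is simply the position of the largest value.
best = max(range(len(probs)), key=probs.__getitem__)

print(breeds[best], round(probs[best], 3))
print("sum of outputs:", round(sum(probs), 6))
```

The real model does the same thing, just with 37 values instead of 3.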
Now it’s your turn. Go be a good pet owner and find out what breed your dog or cat is at: https://stonkszain-pet-classifier.hf.space
Or just what breed of dog or cat you are for fun :D