Getting Started With Machine Learning

A random calming image to show I have no experience in writing content

They say the hardest part of a flight is the takeoff. Guess we’ll never know if we don’t go to the runway.

And why did I decide to start a blog post on machine learning when there are literally a million authentic resources out there, all over YouTube and MOOC websites? That is exactly the reason why. Gone are the days when ML was restricted to a select few PhD holders. Now every other guy with a laptop is training models and claiming results (rather blindly).

And it is very easy to get trapped in the jungle, not knowing where to start, how to deal with the jargon and, most importantly, how to appreciate the beauty of the algorithms without being daunted by their very nature. In due course we will also try to address the fundamental ethical questions involved in using a technology. But is machine learning a new, independent technology after all?

In 2012, Harvard Business Review dubbed Data Scientist the sexiest job of the 21st century (the difference between Data Science and Machine Learning is a topic for a future post). “AI is the new electricity.” Really? I view Machine Learning as just a tool in my kit to solve problems. This tool is evolving and getting better every day. And we contribute to it. Every selfie we click, every recording, every app permission and even this blog is generating data. Machine Learning helps us tap into this mine and make things better for us (that AI triple camera you are using now, or something as simple as auto-text).

Here’s the answer to the question you came here for! I am going to assume that all my readers are freshmen and will try to simplify the process as much as I can. Feel free to skip some steps depending on where you currently stand.

Step 0: Get hold of the basics of Python

This is step 0 because I assume you need at least one language to start with. Remember, machine learning is just a bunch of algorithms, and ideally the choice of language has little to do with it. Hell yeah, you can do it in C++ or Java, but the strongest arguments in favor of Python are its extremely large community and the huge number of supporting libraries. Don’t worry about these now, just learn the basics of Python, like how to create an object from a class, read a CSV file and so on. A Coursera course called ‘Python for machine learning’ will do, as will a simple search on YouTube.
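To give a feel for what "the basics" means here, below is a minimal sketch that reads CSV data and creates objects from a class. The `Student` class, the column names and the numbers are all made up for illustration, and the CSV lives in a string so the snippet runs without any file on disk.

```python
import csv
import io

# Hypothetical data standing in for a real CSV file on disk.
CSV_TEXT = """name,hours_studied,score
Asha,2,55
Ben,5,78
Chen,8,91
"""


class Student:
    """A tiny made-up class, just to show how objects are created."""

    def __init__(self, name, hours_studied, score):
        self.name = name
        self.hours_studied = float(hours_studied)
        self.score = float(score)

    def passed(self, threshold=60.0):
        # True when the score reaches the (arbitrary) pass mark.
        return self.score >= threshold


def load_students(file_obj):
    """Read CSV rows and turn each one into a Student object."""
    reader = csv.DictReader(file_obj)
    return [
        Student(row["name"], row["hours_studied"], row["score"])
        for row in reader
    ]


# With a real file you would pass open("students.csv", newline="") instead
# ("students.csv" is a hypothetical filename).
students = load_students(io.StringIO(CSV_TEXT))
for s in students:
    print(s.name, s.score, "passed" if s.passed() else "failed")
```

If you can read and tweak something like this comfortably, you know enough Python to move on.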

Step 1: The Math

Do this along with step 0 to get comfortable with a library called NumPy (short for Numerical Python). This is extremely important because NumPy helps us with matrix operations and other algebraic stuff. As you will learn later, implementations of ML algorithms are almost always just really big matrix operations, often in multiple dimensions.

So do not miss your algebra and calculus classes, and brush up on your statistics, so you’ll understand when someone says the median is less affected by extreme values than the mean.
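As a small taste of why matrix operations matter, here is a sketch of a linear model making predictions for several examples at once with a single matrix-vector product. The data, weights and bias are made-up numbers, not taken from any real model.

```python
import numpy as np

# Made-up data: 3 examples (rows), each with 2 features (columns).
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Made-up weights and bias for a linear model: prediction = X @ w + b
w = np.array([0.5, -1.0])
b = 2.0

# One matrix-vector product computes the prediction for every row at once.
predictions = X @ w + b

print(X.shape)       # (3, 2)
print(predictions)   # the three values are 0.5, -0.5 and -1.5
```

The same line of code works whether `X` has three rows or three million, which is a big part of NumPy's appeal.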
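The mean-versus-median claim is easy to check for yourself. The sketch below uses Python's built-in `statistics` module on a made-up list of numbers, then adds one extreme value and compares.

```python
import statistics

salaries = [30, 32, 35, 38, 40]      # made-up values
with_outlier = salaries + [1000]     # the same list plus one extreme value

mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print("mean:  ", mean_before, "->", round(mean_after, 2))
print("median:", median_before, "->", median_after)
```

The mean jumps from 35 to roughly 195.83, while the median only moves from 35 to 36.5.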

Step 2: Algorithms and Data Structures

It is already step 2 and you are now screaming, “Where is Machine Learning?”

Well, this step might be skipped, but if you are starting to get interested in coding, knowledge of data structures and algorithms is as important as shoes are for a runner: going far without it would be difficult. While there is a plethora of resources to learn the basics, I would recommend GeeksforGeeks. There is also a project-based specialization on Coursera that starts with The Algorithmic Toolbox, which is shorter and more concise, and its practice problems actually push you to think.

Step 3: Machine Learning

For this step, dive straight into one course from Stanford University, simply called Machine Learning, by Andrew Ng. This 12-week course covers the most basic algorithms and teaches you not just to implement them but also to visualize the mathematical models and how a machine “learns”.

An important question to ask is: why can’t we just go straight to the libraries, import them, compile and fit the models and run projects already?

What is the need to study the mathematical processes behind them? It is the same reason why some people bake their own pizza and others order takeaway (programmers and pizza jokes, lol). Unless you bake one from scratch, you will never appreciate the process and never properly understand the fine-tuning of hyperparameters (a fancy name for the settings you choose before training, such as the learning rate, as opposed to the values the model learns on its own). But if you do, maybe you will even invent your own algorithm or filter.

Core machine learning is a rapidly evolving field. Every day we see new breakthroughs, and it is never too early (or too late) to start contributing.
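To make "baking from scratch" a bit more concrete, here is a minimal sketch of gradient descent fitting a straight line in plain Python. It is only an illustration, not the course's own code: the function name `fit_line` and the data are made up. Note that `learning_rate` and `epochs` are hyperparameters in the usual sense (you pick them before training), while `w` and `b` are the values the model learns.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ~ w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0          # learned parameters, starting from zero
    n = len(xs)
    for _ in range(epochs):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) w.r.t. w and b.
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient; the step size is the learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up data generated from y = 2x + 1, so we know what to expect.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))   # close to 2 and 1
```

Try changing `learning_rate` to something much larger or much smaller and watch what happens to the result. That kind of experiment is exactly what you miss when you only call a library.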

Step 4: Implement and practice

How you implement depends on your field of interest. Nonetheless, Kaggle is a great, easy place to get started (post on Kaggle coming soon). Or just download some datasets and play with them, here are some great ones. Playing with data is how the actual learning happens. Don’t worry about the syntax, you can always Google that (just try to work out which algorithm is at work when you import something from a module called sklearn, and play with its input parameters). Make sure to install Jupyter Notebook for your OS. It is a great tool for experimenting, and its clean layout keeps you efficient (there are dark mode themes too, available as an add-on).

However, the human brain is prone to forgetfulness, so I recommend going through your old notes at least once a month. We are not getting into deep learning just yet.

Here is a quote from an interview I read: “As a ML engineer you need to program better than the average math teacher and know more math than the average programmer”. And this couldn’t be closer to the truth.
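If you want something to paste into your first notebook cell, here is a minimal sketch using sklearn's built-in Iris dataset and a k-nearest-neighbours classifier. The choice of dataset, model and parameter values is mine, purely for illustration. It assumes scikit-learn is installed.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small built-in dataset: flower measurements (X) and species (y).
X, y = load_iris(return_X_y=True)

# Hold back a quarter of the data to check the model on unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# n_neighbors is an input parameter worth playing with.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# score() returns the fraction of test examples classified correctly.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Change `n_neighbors` to 1 or 50 and rerun. Then ask yourself which algorithm is at work and why the number changes.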

I guess that will be all for this post!