How to Become a Machine Learning Engineer - Joe Rideg

After seven years of work as a CFD simulation engineer with a mechanical engineering background, I realised that I had reached a point in my career where I needed to learn something entirely new. So, after a year spent learning to program, first in Java and then in Python, my goal became clear to me: becoming a Machine Learning (ML) engineer!

I’m aware that this goal is still a “bit” too broad, and I’m sure that later in my learning process I will focus on a smaller niche. However, before I reach that point, I need a plan. So let’s go through the primary skills and milestones that seem crucial if you want to become a machine learning engineer!

1. Behind the scenes: Mathematics

It is tempting to jump immediately into applying different Machine Learning models, feeling as if you were awakening Skynet in a dark secret room, like a true “hacker man”. However, each profession has a solid foundation that you can’t skip. For doctors, it is Anatomy and Biochemistry, I guess. For architects, descriptive geometry and the ability to draw proper cubes on a piece of paper without fancy tools. For mechanical engineers, Mathematics, Thermodynamics and Newtonian Physics.

For ML, you need a thorough understanding of Mathematics: Linear Algebra, Calculus, Statistics and Probability. Without this theory, you are using black boxes without knowing what’s going on under the hood. Engineers working in Computational Fluid Dynamics (CFD) can easily imagine this trap: running a solver you don’t understand is dangerous, and the same goes for “lazy” ML engineers.
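To make “under the hood” a bit more concrete, here is a minimal sketch of where Calculus shows up: fitting y = w * x by gradient descent on the mean squared error. The data, the learning rate and the function names are all made up for illustration; real libraries do this in a far more general way.

```python
# Fit y = w * x by gradient descent on the mean squared error (MSE).
# The data and the learning rate are invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated with w = 2


def mse_gradient(w, xs, ys):
    """Derivative of mean((w * x - y)**2) with respect to w (chain rule)."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def fit(xs, ys, learning_rate=0.01, steps=500):
    """Start from w = 0 and repeatedly step against the gradient."""
    w = 0.0
    for _ in range(steps):
        w -= learning_rate * mse_gradient(w, xs, ys)
    return w


w = fit(xs, ys)
print(f"fitted w = {w:.4f}")
```

If you know why the gradient has this form and why the step size matters, the training loop of a bigger model stops being a black box.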

2. Choose a programming language: Python (or R)

There is a great debate about these two languages. I won’t fake being an expert. However, I see the growing popularity of Python in the TIOBE Index [2]. One of the hottest topics in machine learning is deep learning, and its two most popular libraries are TensorFlow and PyTorch. To use either of them, you need solid Python programming skills.

On the other hand, be ready to learn additional languages. Python might not be the perfect choice when you need fast calculations with large datasets. In addition, you might need to know C/C++ for CUDA (which “is a parallel computing platform and programming model that makes using a GPU for general-purpose computing simple and elegant” [4]).

Alternatively, maybe you should learn the Klingon language. Who knows?

Jokes aside, you do have to be able to collect and clean data, merge columns or vectors for better modelling, select the proper ML model, split your data into training and test data, train your model and, last but not least, visualise your results. I know certain people do not care about the Navier-Stokes equations. However, everybody loves colourful streamlines and plots of a flow field (and they don’t care whether they show temperature, velocity, pressure or turbulent kinetic energy, as long as they look great).
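As a rough sketch of that workflow, here is a tiny end-to-end example using only the Python standard library. The data are synthetic, the 80/20 split and the seed are arbitrary choices, and the “plot” is just text; in a real project you would probably reach for dedicated data, ML and plotting libraries instead.

```python
import random

# Hypothetical raw measurements as (x, y) pairs; None marks a missing value.
raw = [(float(i), 3.0 * i + 1.0) for i in range(20)] + [(5.0, None), (None, 2.0)]

# 1. Clean: drop rows with missing values.
rows = [(x, y) for x, y in raw if x is not None and y is not None]

# 2. Split into training and test data (80/20) with a fixed seed.
rng = random.Random(0)
rng.shuffle(rows)
cut = int(0.8 * len(rows))
train, test = rows[:cut], rows[cut:]


# 3. Train: ordinary least squares for y = slope * x + intercept.
def fit_line(data):
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(train)

# 4. Evaluate on the held-out test data.
test_mse = sum((slope * x + intercept - y) ** 2 for x, y in test) / len(test)
print(f"slope={slope:.3f} intercept={intercept:.3f} test MSE={test_mse:.3g}")

# 5. Visualise: a crude text bar per test point (bar length = prediction).
for x, y in sorted(test):
    print(f"x={x:5.1f} | " + "#" * int(round(slope * x + intercept)))
```

The point is the shape of the pipeline (clean, split, train, evaluate, visualise), not the particular model.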

3. Data science

Once you have learned or refreshed your background in Mathematics and built up your skills in a decent programming language, you need to learn the core concepts of data science and machine learning. Linear regression, decision trees and artificial neural networks should become your best friends during this phase. “Data Science from Scratch” by Joel Grus might be a good choice for learning these data science concepts. [1]
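To give a feel for one of these concepts, here is a minimal sketch of the simplest possible decision tree, a single split (often called a decision stump). The toy data and the function name are hypothetical, and the rule is fixed to “predict 1 if x is above the threshold”; a real decision tree tries many features, both directions and nested splits.

```python
# Hypothetical toy data: (feature value, label).
data = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]


def best_split(data):
    """Return (threshold, errors) for the rule 'predict 1 if x > threshold'."""
    values = sorted({x for x, _ in data})
    # Candidate thresholds lie midway between neighbouring feature values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in candidates:
        # Count the points that the rule would misclassify.
        errors = sum((x > t) != bool(label) for x, label in data)
        if best is None or errors < best[1]:
            best = (t, errors)
    return best


threshold, errors = best_split(data)
print(f"split at x > {threshold} with {errors} misclassified points")
```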

4. Start a side-project

You should find a topic in which you are interested and define a problem that you could solve with Machine Learning. It can be sport, movies, music or a public transport system: anything where you can find both data and motivation for your work. Once you can show that you solve problems with a proper understanding of ML models and mathematics, and that you can build a complete workflow from raw data to visualisation, you can be proud of yourself because you have learned a valuable skill. You can solve more problems than before (or the good old ones with better results).

Summary

You have to be aware of the mathematical foundations behind Machine Learning. You should also be confident programming in at least one language, preferably Python, for collecting and manipulating data, importing and using ML models, and visualising the results. Python and Mathematics can be learned in parallel or in either order; they are independent foundations for Machine Learning.

I am sure that there will be more and more demand for experts in all fields of science, engineering and medicine who combine their domain knowledge with the ability to use Machine Learning for problem-solving.

Sources

[1] “Roadmap: How to Learn Machine Learning in 6 Months” by Zach Miller, Senior Data Scientist at Metis https://www.youtube.com/watch?v=MOdlp1d0PNA

[2] TIOBE index: https://www.tiobe.com/tiobe-index/

[3] “Machine Learning for Absolute Beginners: A Plain English Introduction” by Oliver Theobald https://www.amazon.com/Machine-Learning-Absolute-Beginners-Introduction-ebook/dp/B06VXKBLNG

[4] “What is CUDA?” https://blogs.nvidia.com/blog/2012/09/10/what-is-cuda-2/