Note: This post was originally published in 2020 and has been updated as of Nov. 23, 2021.
Machine Learning (ML) is the study of computer algorithms that improve automatically through experience. ML is a lucrative field that is growing quickly: the ML market is predicted to reach $30.6 billion by 2024. If you’re pursuing a data science or software engineering role, you’ll go through a competitive interview process. You may be tested on your programming, data analysis, critical thinking, and system design skills in your interview.
System design skills can set you apart from other engineers. Top tech companies ask system design interview questions to see if you can efficiently solve real-world problems. Today we’ll discuss how you can ace machine learning interviews using system design concepts.
Ace your machine learning engineer interview
System design is an important component of any ML interview. Being able to efficiently solve open-ended machine learning problems is a key skill that can set you apart from other engineers and increase the level of seniority at which you’re hired. This course helps you build that skill, and goes over some of the most popularly asked interview problems at big tech companies. You’ll walk step-by-step through solving these problems, focusing in particular on how to design machine learning systems rather than just answering trivia-style questions. Once you’re done with the course, you’ll be able to not just ace the machine learning interview at any tech company, but impress them with your ability to think about systems at a high level. If you have a machine learning or system design interview coming up, you’ll find the course tremendously valuable.
ML aims to solve a multitude of complex problems. It has made rapid progress in areas like speech understanding, search ranking, and credit card fraud detection. Companies are leveraging these technologies across industries from healthcare and agriculture to manufacturing and retail.
A high level of technical skill is required in the machine learning field, particularly for machine learning engineers. In a machine learning interview, you’ll be asked open-ended questions that test your ability to solve ML system design problems, much like in a system design interview.
In an interview, you’ll be tested on the following:
During your interview, you may be asked to:
Our goal is to improve our metrics when working on an ML-based system. We also want to ensure that we meet the capacity and performance Service Level Agreement (SLA). A performance-based SLA ensures that we return results within a given time frame (e.g. 500ms) for 99% of queries. Capacity refers to the load that our system can handle (e.g. the system supports 1000 queries per second).
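To make the performance SLA concrete, here is a minimal sketch of how you might check whether 99% of queries finish within a 500ms budget. The function names and the sample latencies are hypothetical, and the nearest-rank percentile used here is just one of several common percentile definitions.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100.

    Returns the smallest sample such that at least pct% of all
    samples are less than or equal to it.
    """
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, which avoids float rounding issues.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def meets_latency_sla(latencies_ms, budget_ms=500, pct=99):
    """True if pct% of the observed queries finished within budget_ms."""
    return percentile(latencies_ms, pct) <= budget_ms


# Hypothetical sample: 99 fast queries and a single slow outlier.
latencies = [120] * 99 + [900]
print(meets_latency_sla(latencies))
```

With one outlier in 100 queries the 99th percentile is still 120ms, so the check passes. A second 900ms query would push the 99th percentile over the budget.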
There are two important discussions regarding performance and capacity when building an ML system:
The layered/funnel modeling approach is the best way to solve for scale and relevance while keeping performance and capacity in check. You’ll start with a relatively fast model when you have the highest number of documents (e.g. 100 million documents in the case of the search query “computer science”). In each later stage, you increase the model’s complexity (i.e. a model that is better optimized for prediction) and its execution time, while the model runs on a reduced number of documents. For example, your first stage could use a linear model and the final stage could use a deep neural network.
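The funnel idea can be sketched in a few lines. This is only an illustration under stated assumptions: `term_overlap` stands in for the fast first-stage model and `overlap_density` stands in for a slower, more precise final-stage model (in a real system these might be a linear model and a deep neural network). All function names and documents below are made up.

```python
def funnel_rank(query, documents, cheap_score, expensive_score, keep=2):
    """Two-stage funnel: a cheap scorer trims the pool, a costly one ranks it."""
    # Stage 1: score every document with the fast model and keep the top few.
    candidates = sorted(
        documents, key=lambda doc: cheap_score(query, doc), reverse=True
    )[:keep]
    # Stage 2: run the slower model only on the survivors.
    return sorted(
        candidates, key=lambda doc: expensive_score(query, doc), reverse=True
    )


def term_overlap(query, doc):
    """Stand-in for a fast model: number of distinct shared terms."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def overlap_density(query, doc):
    """Stand-in for a heavier model: share of document words matching the query."""
    query_terms = set(query.lower().split())
    words = doc.lower().split()
    return sum(word in query_terms for word in words) / len(words)


docs = [
    "computer science degree programs and admissions",
    "science fiction novels",
    "computer repair shop near me",
    "introduction to computer science",
    "gardening tips",
]
for doc in funnel_rank("computer science", docs, term_overlap, overlap_density):
    print(doc)
```

The expensive scorer runs on only `keep` documents no matter how large the original pool is, which is what keeps latency and capacity in check as the corpus grows.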
An ML model learns directly from the data it’s provided. It creates and refines its rules on a given task based on that data, which is called training data. To effectively develop such models, it’s essential to learn machine learning principles and techniques. Because the model’s rules come from its training data, it’s crucial to avoid inadequate, irrelevant, or biased data. For instance, a machine learning model based on racially biased data will simply learn to automate racial bias. Even the most performant algorithms are useless if they are not based on a quality dataset.
The quality and quantity of training data are big factors in determining how far you can go in your machine learning optimization task. Data collection techniques primarily involve user interactions, human labelers, or specialized labelers.
You can also make use of other creative data collection techniques. For example, you can build a personalized experience in your product by collecting data from users. If you’re working with a system that uses visual data, such as object detectors or image segmenters, you can use GANs (generative adversarial networks) to enhance the training data. Other things to consider include:
“Success” can be measured in numerous ways in machine learning system design. A successful machine learning system must gauge its performance by testing different scenarios. This can make a model’s design more innovative.
To run an online experiment, A/B testing is a great way to assess the impact of new features or changes in the system. In an A/B experiment, a second modified version of a webpage or screen is created. The original version is known as the control, and the modified version is the variation. From here, we can formulate two hypotheses:
We can also use this stage to measure long-term effects with backtesting and long-running A/B tests.
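One common way to decide whether the variation in an A/B test really differs from the control is a two-proportion z-test. Here is a minimal sketch, assuming the metric is a conversion rate (e.g. click-through). The function name and the sample counts are hypothetical, and other statistical tests may suit your metric better.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (a) and variation (b).

    Returns the z statistic and the two-sided p-value under the
    assumption that both groups share the same true conversion rate.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("conversion rates of exactly 0% or 100% cannot be tested")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 200/2000 conversions in control, 260/2000 in variation.
z, p_value = two_proportion_z_test(200, 2000, 260, 2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("The difference is unlikely to be due to chance alone.")
```

A small p-value suggests the difference between control and variation is unlikely to be noise. The 0.05 threshold is a convention, not a rule, and should be chosen before the experiment starts.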
Embeddings enable us to encode entities (e.g., words, docs, images, person) in a low-dimensional vector space in order to capture their semantic information. Two popular models used for word embeddings are:
We’ve gone over the main concepts and techniques used in ML interviews and system design. This is just an introduction to the techniques you will need to be successful in machine learning system design and interviews. More topics you’ll want to know are:
You’ll be expected to set up a system effectively in an ML interview. Let’s discuss the thought process required to answer an interviewer’s questions.
Interviewers will generally ask you to design a machine learning system for a particular task. This question is usually broad. The first thing you need to do is ask questions to narrow down the scope of the problem and pin down your system’s requirements. You should also ask questions about the performance and capacity considerations of the system.
Clarifying these questions will guide your system’s architecture. Knowing that you need to return results quickly will influence the depth and complexity of your models.
After asking questions, you should carefully choose your system’s performance metrics for both online and offline testing. These metrics will differ depending on the problem your system is trying to solve.
For example, if you are performing binary classification, you will use the following offline metrics: Area Under Curve (AUC), log loss, precision, recall, and F1-score.
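To show how these offline metrics relate to each other, here is a small pure-Python sketch that computes them from true labels and predicted probabilities. The function name, the 0.5 threshold, and the sample data are assumptions for illustration. In practice you would more likely use a library such as scikit-learn.

```python
import math


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Compute common offline metrics for a binary classifier."""
    y_pred = [1 if prob >= threshold else 0 for prob in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )

    # Log loss, with probabilities clipped to avoid log(0).
    eps = 1e-15
    clipped = [min(max(prob, eps), 1 - eps) for prob in y_prob]
    log_loss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, clipped)
    ) / len(y_true)

    # AUC as the share of (positive, negative) pairs ranked correctly;
    # ties count as half. Quadratic in size, so fine only for small samples.
    positives = [p for t, p in zip(y_true, y_prob) if t == 1]
    negatives = [p for t, p in zip(y_true, y_prob) if t == 0]
    pairs = len(positives) * len(negatives)
    wins = sum(
        1.0 if pos > neg else 0.5 if pos == neg else 0.0
        for pos in positives
        for neg in negatives
    )
    auc = wins / pairs if pairs else float("nan")

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "log_loss": log_loss,
        "auc": auc,
    }


# Hypothetical labels and model scores.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]
for name, value in binary_metrics(labels, scores).items():
    print(f"{name}: {value:.3f}")
```

Note that precision, recall, and F1 depend on the chosen threshold, while AUC and log loss are computed directly from the predicted probabilities.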
When deciding on online metrics, you may need both component-wise and end-to-end metrics. Component-wise metrics are used to evaluate the performance of ML systems that are plugged in to and used to improve other ML systems. End-to-end metrics evaluate a system’s performance after an ML model has been applied. For example, a metric for a search engine would be the users’ engagement and retention rate after your model has been plugged in.
The next step is to design your system’s architecture. You need to think about the system’s components and how the data will flow through those components. In this step, your aim is to design a model that can scale easily.
To build a scalable system, your design needs to efficiently deal with a large and continually increasing amount of data. For instance, an ML system that displays relevant ads to users can’t process every ad in the system at once. You could use the funnel approach, wherein each stage has fewer ads to process. This will yield a scalable system that quickly determines relevant ads for users despite the increase in data.
When you have nailed down all of your ML system’s requirements, you can proceed to building your model. This involves:
Now, we’ll move on to the task of building an entity linking system.
Named entity linking (NEL) is the process of detecting and linking named entities in a given text to corresponding entities in a target knowledge base. There are two parts to entity linking:
Let’s see entity linking in action in the following example:
The text says, “Michael Jordan is a machine learning professor at UC Berkeley.” First, NER detects and classifies the named entities Michael Jordan and UC Berkeley as person and organization. Next, disambiguation takes place. Assume that there are two ‘Michael Jordan’ entities in the given knowledge base: the UC Berkeley professor and the athlete. Michael Jordan in the text is linked to the UC Berkeley professor entity in the knowledge base. Similarly, UC Berkeley in the text is linked to the University of California entity in the knowledge base.
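The example above can be mimicked with a toy sketch. This is not how a production entity linker works: the dictionary lookup below stands in for a trained NER model, and the word-overlap score stands in for a learned disambiguation model. The knowledge base entries, IDs, and function names are all hypothetical.

```python
# Hypothetical knowledge base: each surface form maps to candidate entities.
KNOWLEDGE_BASE = {
    "Michael Jordan": [
        {
            "id": "michael_jordan_professor",
            "description": "professor of machine learning and statistics at UC Berkeley",
        },
        {
            "id": "michael_jordan_athlete",
            "description": "basketball player for the Chicago Bulls",
        },
    ],
    "UC Berkeley": [
        {
            "id": "university_of_california",
            "description": "public research university in Berkeley California",
        },
    ],
}


def find_mentions(text, kb):
    """Recognize: a dictionary lookup standing in for a real NER model."""
    return [name for name in kb if name in text]


def disambiguate(mention, text, kb):
    """Disambiguate: pick the candidate whose description best matches the text."""
    context = set(text.lower().replace(".", " ").split())

    def overlap(candidate):
        return len(context & set(candidate["description"].lower().split()))

    return max(kb[mention], key=overlap)["id"]


def link_entities(text, kb):
    """Map each detected mention to a knowledge base entity ID."""
    return {mention: disambiguate(mention, text, kb) for mention in find_mentions(text, kb)}


sentence = "Michael Jordan is a machine learning professor at UC Berkeley."
print(link_entities(sentence, KNOWLEDGE_BASE))
```

Because the sentence shares words like “machine”, “learning”, and “professor” with the professor’s description and none with the athlete’s, the mention is linked to the professor entity.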
Entity linking has applications in many natural language-processing tasks. Use cases can be broadly categorized as information retrieval, information extraction, and building knowledge graphs. These can be used in many systems, such as:
The aforementioned applications require a high-level representation of text. In this high-level representation, the concepts relevant to the application are separated from the text and other non-meaningful data.
The interviewer has asked you to design an entity linking system that:
The problem statement translates to the following machine learning problem:
“Given a text and knowledge base, find all the entity mentions in the text (Recognize) and then link them to the corresponding correct entry in the knowledge base (Disambiguate).”
These are some of the questions that an interviewer can put forth during a discussion on entity linking systems.
Congrats! You have learned about implementing introductory ML system concepts and how to approach system design interview questions. There’s still a lot to learn about ML system design.
You’ll need to master the following systems:
To help you master these concepts and strategies, check out Educative’s Grokking the Machine Learning Interview course. You’ll master machine learning system design and answer some of the most popular interview problems at big tech companies. You should come out of the course with the ability to impress interviewers by thinking about systems at a high level.
If you want even more practice with system design questions for machine learning interviews, check out Machine Learning System Design.