Feature Engineering

Let's engineer meaningful features to train the search ranking model.

An important first step in feature generation is to identify the main actors that will play a key role in the feature engineering process.

📝 The terms “features” and “signals” are generally used interchangeably, and we will use them that way here.

The four main actors in search are:

  1. Searcher
  2. Query
  3. Document
  4. Context

[Figure: Actors]

In the above figure, the context for a search query is the browser history. However, context covers a lot more than browsing history. It can also include the searcher’s age, gender, location, previous queries, and the time of day.
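To make the four actors concrete, here is a minimal sketch of how they might be represented in code. All class and field names are hypothetical, chosen for illustration only. The split of attributes between the searcher and the context is also an assumption, since demographic information could reasonably live in either.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Searcher:
    # Hypothetical profile fields; available only if the searcher is logged in.
    searcher_id: str
    age: Optional[int] = None
    gender: Optional[str] = None
    interests: List[str] = field(default_factory=list)


@dataclass
class Query:
    text: str


@dataclass
class Document:
    doc_id: str
    title: str


@dataclass
class Context:
    # Hypothetical context fields based on the examples in the text.
    location: Optional[str] = None
    hour_of_day: Optional[int] = None
    previous_queries: List[str] = field(default_factory=list)


@dataclass
class SearchEvent:
    # One (searcher, query, document, context) combination.
    # Features are derived from these actors and their interactions.
    searcher: Searcher
    query: Query
    document: Document
    context: Context


event = SearchEvent(
    searcher=Searcher(searcher_id="u1", age=30, interests=["sports"]),
    query=Query(text="running shoes"),
    document=Document(doc_id="d42", title="Best running shoes"),
    context=Context(location="Seattle", hour_of_day=9),
)
```

Grouping the actors this way keeps it clear which actor, or pair of actors, each feature comes from.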


Let’s go over the characteristics of these actors and their interactions to generate meaningful features/signals for your machine learning model.

This is essentially the process of feature engineering.

📝 Knowledge of feature engineering is highly significant from an interview perspective.

Features for ML model

We can generate a lot of features for the search ranking problem based on the actors identified above. A subset of these features is shown below.

[Figure: Features in the training data row]

Let’s discuss these features one by one.

Searcher-specific features

Assuming that the searcher is logged in, you can tailor the results according to their age, gender and interests by using this information as features for your model.
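As a rough illustration, searcher attributes could be turned into numeric model inputs with simple one-hot and multi-hot encodings. The sketch below is one possible approach, not a prescribed one. The age buckets, the gender vocabulary, the interest list, and the function name `searcher_features` are all assumptions made for this example.

```python
from typing import Dict, List, Optional

# Hypothetical vocabularies; a real system would derive these from its own data.
AGE_BUCKETS = [(0, 17), (18, 24), (25, 34), (35, 49), (50, 200)]
GENDERS = ["female", "male", "other"]
INTERESTS = ["sports", "technology", "fashion"]


def searcher_features(
    age: Optional[int],
    gender: Optional[str],
    interests: List[str],
) -> Dict[str, float]:
    """Encode searcher attributes as a flat dictionary of 0/1 features."""
    features: Dict[str, float] = {}

    # One-hot encode the age bucket; every bucket is 0 when age is unknown.
    for low, high in AGE_BUCKETS:
        in_bucket = age is not None and low <= age <= high
        features[f"age_{low}_{high}"] = 1.0 if in_bucket else 0.0

    # One-hot encode gender; every entry is 0 when gender is unknown.
    for g in GENDERS:
        features[f"gender_{g}"] = 1.0 if gender == g else 0.0

    # Multi-hot encode interests, since a searcher can have several.
    interest_set = set(interests)
    for topic in INTERESTS:
        features[f"interest_{topic}"] = 1.0 if topic in interest_set else 0.0

    return features


logged_in = searcher_features(30, "female", ["sports", "fashion"])
anonymous = searcher_features(None, None, [])
```

For a searcher who is not logged in, every feature is zero, which lets the model fall back on the query, document, and context signals.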

[Figure: Gender-based results]

Query-specific features

Let’s explore some query-related aspects that can also be useful as features.

Query historical engagement

For relatively popular queries, historical engagement can be very ...