Feature Engineering
Let's engineer meaningful features to train the search ranking model.
An important first step in feature generation is to identify the main actors that will play a key role in the feature engineering process.
📝 The terms “features” and “signals” are generally used interchangeably, and we will do the same here.
The four actors for search are:
- Searcher
- Query
- Document
- Context
In the above figure, the context for a search query is browser history. However, context is a lot more than just browser history. It can also include the searcher’s age, gender, and location, their previous queries, and the time of day.
Let’s go over the characteristics of these actors and their interactions to generate meaningful features/signals for your machine learning model.
This is essentially the process of feature engineering.
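One way to picture the four actors is as plain records from which features are later derived. The sketch below is only illustrative: the class names, their fields, and the `build_raw_features` helper are assumptions made for this example, not part of any particular search system. It shows one raw value per actor plus one interaction value between the query and the document.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# All classes and fields below are hypothetical, chosen only for illustration.
@dataclass
class Searcher:
    age: int
    gender: str
    interests: List[str] = field(default_factory=list)


@dataclass
class Query:
    text: str


@dataclass
class Document:
    doc_id: str
    title: str


@dataclass
class Context:
    time_of_day_hour: int
    location: str
    previous_queries: List[str] = field(default_factory=list)


def build_raw_features(
    searcher: Searcher, query: Query, document: Document, context: Context
) -> Dict[str, int]:
    """Collect raw values keyed by the actor (or actor pair) they come from."""
    query_terms = set(query.text.lower().split())
    title_terms = set(document.title.lower().split())
    return {
        "searcher.age": searcher.age,
        "query.num_terms": len(query.text.split()),
        "document.title_length": len(document.title.split()),
        "context.hour": context.time_of_day_hour,
        # Interaction between two actors: distinct query terms found in the title.
        "query_document.title_overlap": len(query_terms & title_terms),
    }
```

Prefixing each key with its source actor is just a naming choice for this sketch; it makes it easy to see which features describe a single actor and which describe an interaction.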
📝 Knowledge of feature engineering is highly significant from an interview perspective.
Features for ML model
We can generate a lot of features for the search ranking problem based on the actors identified above. A subset of these features is shown below.
Let’s discuss these features one by one.
Searcher-specific features
Assuming that the searcher is logged in, you can tailor the results to their age, gender, and interests by using this information as features for your model.
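As a minimal sketch of how such profile information might be turned into numeric model inputs, the example below one-hot encodes an age bucket and gender and multi-hot encodes interests. The bucket boundaries, the vocabularies, and the `encode_searcher` function are all assumptions made for illustration; a real system would derive them from its own data.

```python
from typing import Dict, List

# Hypothetical vocabularies, assumed only for this example.
AGE_BUCKETS = [(0, 17), (18, 24), (25, 34), (35, 49), (50, 200)]
GENDERS = ["female", "male", "other"]
INTERESTS = ["sports", "technology", "travel", "cooking"]


def encode_searcher(age: int, gender: str, interests: List[str]) -> Dict[str, float]:
    """Turn a logged-in searcher's profile into a flat dict of 0/1 features."""
    features: Dict[str, float] = {}

    # One-hot: exactly one age bucket is set for an age within the covered range.
    for low, high in AGE_BUCKETS:
        features[f"age_{low}_{high}"] = 1.0 if low <= age <= high else 0.0

    # One-hot: a gender value outside the vocabulary leaves all entries at 0.
    for g in GENDERS:
        features[f"gender_{g}"] = 1.0 if gender == g else 0.0

    # Multi-hot: a searcher can have several interests at once.
    interest_set = set(interests)
    for topic in INTERESTS:
        features[f"interest_{topic}"] = 1.0 if topic in interest_set else 0.0

    return features
```

For example, `encode_searcher(29, "female", ["travel", "sports"])` sets `age_25_34`, `gender_female`, `interest_travel`, and `interest_sports` to 1.0 and every other entry to 0.0.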
Query-specific features
Let’s explore some query-related aspects that can also be useful as features.
Query historical engagement
For relatively popular queries, historical engagement can be very ...