Feature Space

Learn feature engineering techniques by implementing feature space exploration, subspace analysis, and feature transformation.

In both supervised learning and clustering, the term “data point” has been utilized. Specifically, each data point x\bold x within the training dataset is represented as a dd-dimensional vector in Rd\R^d such that xRd\bold x \in \R^d. The elements of x\bold x are referred to as features. Therefore, x\bold x is also referred to as a feature vector.

Feature space

A feature space is a mathematical space that represents the features or attributes of a given dataset. Each observation in the dataset is represented by a vector in the feature space, where each dimension of the vector corresponds to a specific feature or attribute of the observation.

For example, let’s say we have a dataset of cars, and each car is described by its make, model, year, horsepower, and fuel efficiency. The feature space for this dataset would be a five-dimensional space, where each dimension corresponds to one of these features. By analyzing the patterns and relationships among the feature vectors in the feature space, we can gain insights into the underlying structure and characteristics of the dataset.

The feature space is the vector space that contains the feature vectors of a dataset, with each feature vector belonging to this space. The dataset is a sample collected from this feature space, where xRd\bold x \in \R^d. In most cases, the feature space is a subspace of RdR^d representing the underlying structure of the dataset.

Subspace example

Consider a dataset of cars D=(x1,y1),(x2,y2),,(xn,yn)D={(\bold x_1, y_1), (\bold x_2, y_2), \dots, (\bold x_n, y_n)}, where each xiRd\bold x_i \in \R^d ...