Embeddings and Random Walks

Learn the concepts of embeddings and random walks, and how to implement them in Python.

What are embeddings?

An embedding is an instance of one mathematical structure contained within another, defined by an injective mapping (a mapping in which each element of the codomain is mapped to by at most one argument) from a structure X into a structure Y.

This is another way of saying that the mathematical structure X is somehow contained in the mathematical structure Y.
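
As a small illustration of injectivity, the sketch below represents a mapping between two finite sets as a Python dictionary and checks that no two arguments share the same image. The function name `is_injective`, the mappings `f` and `g`, and the sample data are all made up for this example.

```python
def is_injective(mapping):
    """Return True if no two keys of the dict map to the same value."""
    images = list(mapping.values())
    return len(images) == len(set(images))


# Hypothetical mapping f from X = {"a", "b", "c"} into Y = {1, 2, 3, 4}
f = {"a": 1, "b": 2, "c": 4}

# Hypothetical mapping g that sends two arguments to the same element of Y
g = {"a": 1, "b": 1, "c": 4}

print(is_injective(f))  # True
print(is_injective(g))  # False
```

Note that `f` doesn't need to reach every element of Y (here, 3 is never used). It only needs to keep distinct elements of X distinct.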

The idea that embeddings give us is that they allow us to map structures to another type of representation more suitable to the task at hand.

For example, we can use embeddings to map images and text into a metric space and then calculate distances between those elements. This way, we can tell how far apart the images of a dog and a cat are, compared with how far each of them is from the image of a reptile.
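
To make the idea of distances in a metric space concrete, here is a minimal sketch using only the standard library. The three vectors are invented for illustration. In practice, they would come from a trained model, and they would usually have many more dimensions.

```python
import math

# Made-up 3-dimensional embeddings; a real model would produce these vectors
embeddings = {
    "dog": [0.9, 0.8, 0.1],
    "cat": [0.8, 0.9, 0.2],
    "reptile": [0.1, 0.2, 0.9],
}

# math.dist computes the Euclidean distance between two points (Python 3.8+)
dog_cat = math.dist(embeddings["dog"], embeddings["cat"])
dog_reptile = math.dist(embeddings["dog"], embeddings["reptile"])
cat_reptile = math.dist(embeddings["cat"], embeddings["reptile"])

print(f"dog-cat:     {dog_cat:.3f}")
print(f"dog-reptile: {dog_reptile:.3f}")
print(f"cat-reptile: {cat_reptile:.3f}")
```

With these example vectors, the dog and the cat end up much closer to each other than either is to the reptile.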

Embedding of images to a metric space

Many machine learning algorithms, especially deep learning ones, require some kind of vector data as input and therefore work well with embeddings. This is why we’ll mostly see embeddings that map non-tabular data, such as images, graphs, and text, to some form of tabular data, such as vectors or tensors.

Embedding applications for complex networks

When we apply embeddings to complex networks, we have three options, one for each of the main structures of a complex network: nodes, edges, and the entire graph.

Node embeddings

When we’re interested in embedding nodes, we usually want to represent each node in our graph as a vector of characteristics. Depending on our graph, these characteristics can be, for example, work tasks, protein structure, closest friends, or age.

Notice that when nodes have many of these characteristics, it’s not trivial to compare their similarity, especially when this similarity is related to their connections.

Once each node is represented as a vector, we can directly compare nodes, calculate the distance between them, and apply traditional machine learning models to node data, which isn’t feasible with the original graph representation.
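
As a rough sketch of comparing nodes through their vectors, the example below finds the node most similar to a given one using cosine similarity. The node names, the vector values, and the helper functions `cosine_similarity` and `most_similar` are assumptions made for this illustration. They aren't the output of any particular embedding algorithm.

```python
import math

# Hypothetical 3-dimensional node embeddings (values invented for illustration)
node_embeddings = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.8, 0.2, 0.4],
    "carol": [0.1, 0.9, 0.7],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(node, embeddings):
    """Return the other node whose embedding has the highest cosine similarity."""
    return max(
        (other for other in embeddings if other != node),
        key=lambda other: cosine_similarity(embeddings[node], embeddings[other]),
    )


print(most_similar("alice", node_embeddings))  # bob
```

Because each node is now a plain list of numbers, the same vectors could also be stacked into a feature matrix and passed to a traditional machine learning model.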

Edge embeddings

It isn’t common to embed edges directly as vectors. Usually, we first create a node embedding and then define the embedding of an edge as the average of the embeddings of the two nodes it links.
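
The averaging step can be sketched as follows. The node vectors, the edge list, and the helper name `edge_embedding` are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical 2-dimensional node embeddings
node_embeddings = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
}

# Hypothetical edge list of the graph
edges = [("a", "b"), ("b", "c")]


def edge_embedding(u, v, embeddings):
    """Element-wise average of the embeddings of the edge's two endpoints."""
    return [(x + y) / 2 for x, y in zip(embeddings[u], embeddings[v])]


edge_embeddings = {(u, v): edge_embedding(u, v, node_embeddings) for u, v in edges}

print(edge_embeddings[("a", "b")])  # [0.5, 0.5]
print(edge_embeddings[("b", "c")])  # [0.5, 1.0]
```

Since the average is symmetric, the edge (a, b) gets the same embedding as (b, a), which suits undirected graphs.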

Most ...