Similarity and Equivalence
Learn about similarity in complex networks and the idea of equivalence.
Similarity is an important concept in data science in general. Being able to compare instances and determine how similar or dissimilar they are opens up a lot of possibilities to improve our analysis and prescriptions.
Defining similarity in graphs
Similarity in graphs can be defined in several ways. We can define it at three levels, at least:
Node similarity
Edge similarity
Graph/subgraph similarity
Let’s explore how each one of them can be defined.
Node similarity
Node similarity asks whether two nodes are alike in some respect. This is tricky to pin down, because "similar" can mean many different things.
Two nodes can be considered similar if they have the same centrality measures. Another definition treats nodes as similar if they’re linked to the same set of other nodes. Yet another compares the amount of information that passes through each of them. There is no single definition, but each one can be useful depending on our objectives.
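To make two of these definitions concrete, here is a minimal sketch in plain Python. It assumes a small, made-up undirected graph stored as a dictionary of neighbor sets. The function names `jaccard_similarity` and `degree_difference` are illustrative, not part of any library, and the Jaccard index is only one of several possible ways to score neighbor overlap.

```python
# Toy undirected graph as an adjacency mapping (hypothetical data).
graph = {
    "A": {"C", "D", "E"},
    "B": {"C", "D", "F"},
    "C": {"A", "B"},
    "D": {"A", "B"},
    "E": {"A"},
    "F": {"B"},
}


def jaccard_similarity(graph, u, v):
    """Fraction of the combined neighbors of u and v that both share (0.0 to 1.0)."""
    neighbors_u, neighbors_v = graph[u], graph[v]
    union = neighbors_u | neighbors_v
    if not union:
        # Two isolated nodes: treat them as having no measurable overlap.
        return 0.0
    return len(neighbors_u & neighbors_v) / len(union)


def degree_difference(graph, u, v):
    """Absolute gap in degree, a very simple centrality-based comparison."""
    return abs(len(graph[u]) - len(graph[v]))


print(jaccard_similarity(graph, "A", "B"))  # 0.5: they share C and D out of C, D, E, F
print(degree_difference(graph, "A", "B"))   # 0: both have three neighbors
print(jaccard_similarity(graph, "A", "E"))  # 0.0: no neighbors in common
```

Here nodes A and B look similar under both measures, while A and E share no neighbors at all. Which measure is appropriate depends on the question being asked.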
By having some measure of similarity between nodes, we can try to make some inferences:
If node A is similar to node B and node B likes action movies, maybe node A will also like them.
If node A is similar to node B, maybe they’ll form an edge between them in the future.
If node A has a lot of common friends with node B, maybe they know each other and we should recommend that they send each other a friendship request.
If node
...