Puzzle 10: Explanation
Let’s learn how UTF-8 encoded character comparison works in Rust.
Test it out
Hit “Run” to see the code’s output.
fn main() {
    if 'X' == 'Χ' {
        println!("It matches!");
    } else {
        println!("It doesn't match.");
    }
}
Explanation
Unicode allows for homoglyphs: characters that look very similar or identical but have different code points and therefore different encodings. The first character, X, is the Latin capital letter X (U+0058), encoded in UTF-8 as 0x58. The second, Χ, is the Greek capital letter chi (U+03A7), encoded in UTF-8 as 0xCE 0xA7. Rust compares char values by code point, not by appearance, so the comparison is false and the program prints “It doesn’t match.” If we look closely, the two glyphs aren’t quite identical, but in some fonts, notably Consolas on Windows, they are indistinguishable.
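To check the encodings described above for yourself, a minimal sketch along the following lines may help. It prints each character's code point and UTF-8 bytes, then compares the two. The helper name utf8_bytes is a hypothetical one introduced here for illustration and isn't part of the lesson's code; the Greek chi is written as the escape '\u{03A7}' so the two letters can't be confused in the source.

```rust
// Hypothetical helper (not from the lesson): returns the UTF-8 bytes of a char.
fn utf8_bytes(c: char) -> Vec<u8> {
    // A single char never needs more than 4 bytes in UTF-8.
    let mut buf = [0u8; 4];
    c.encode_utf8(&mut buf).as_bytes().to_vec()
}

fn main() {
    let latin = 'X'; // U+0058, Latin capital letter X
    let greek = '\u{03A7}'; // U+03A7, Greek capital letter chi

    // Show the code point and the encoded bytes of each character.
    println!("{} -> U+{:04X}, UTF-8: {:02X?}", latin, latin as u32, utf8_bytes(latin));
    println!("{} -> U+{:04X}, UTF-8: {:02X?}", greek, greek as u32, utf8_bytes(greek));

    // char equality compares code points, so look-alikes are not equal.
    println!("equal: {}", latin == greek);
}
```

Writing a suspicious character as an escape like this is one way to make the difference visible in code review, since the source no longer depends on how a font renders the glyph.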
Homoglyphs are popular in ...