Homoglyph Substitutions

When characters look alike but are technically different, confusion and typos are inevitable.

What Are Homoglyphs?

Homoglyphs are characters, or short character sequences, that look similar or identical to one another but are encoded differently. In domain names, a homoglyph substitution swaps one character for a visually similar one, producing a domain that appears legitimate but is a different domain.

These substitutions are particularly concerning because they're often difficult for users to detect visually, making them a common technique in phishing attacks and brand impersonation.

Common Homoglyph Pairs

Characters that are frequently confused with each other

0 ↔ O

Zero and uppercase letter O

Example: google.com → g00gle.com

1 ↔ l ↔ I

Number one, lowercase L, uppercase i

Example: paypal.com → paypa1.com

5 ↔ S

Number five and uppercase S

Example: chase.com → cha5e.com

8 ↔ B

Number eight and uppercase B

Example: bitcoin.com → 8itcoin.com

m ↔ rn

Lowercase m and lowercase r+n

Example: amazon.com → arnazon.com

vv ↔ w

Double lowercase v and lowercase w

Example: www.site.com → vvvvvv.site.com

cl ↔ d

Lowercase c+l and lowercase d

Example: discord.com → cliscord.com

nn ↔ m

Double lowercase n and lowercase m

Example: gmail.com → gnnail.com
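
The pairs above can be written down as a small lookup table. The sketch below is illustrative only: `HOMOGLYPHS` and `single_substitutions` are hypothetical names, not part of any tool described on this page, and the table covers only the substitutions shown in the examples, in lowercase because domain names are case-insensitive.

```python
# Hypothetical lookup table built from the examples above.
# Keys are the original character; values are look-alike replacements.
HOMOGLYPHS = {
    "o": ["0"],
    "l": ["1"],
    "s": ["5"],
    "b": ["8"],
    "m": ["rn", "nn"],
    "w": ["vv"],
    "d": ["cl"],
}


def single_substitutions(label: str) -> set[str]:
    """Return every variant of `label` with exactly one character replaced."""
    variants = set()
    for i, ch in enumerate(label):
        for repl in HOMOGLYPHS.get(ch, []):
            variants.add(label[:i] + repl + label[i + 1:])
    return variants


if __name__ == "__main__":
    print(sorted(single_substitutions("amazon")))
```

A real table would be much larger, and which pairs are convincing depends on the font a user sees.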

Homoglyph Typosquatting

How homoglyphs are used in malicious domains

Homoglyph substitution is a common technique in typosquatting and phishing. By registering domains that look nearly identical to legitimate sites, attackers can carry out:

Phishing Attacks

Create fake login pages that appear legitimate to steal credentials.

Example: paypa1.com instead of paypal.com

Malware Distribution

Trick users into downloading malicious software from what appears to be a trusted source.

Example: g00gle.com instead of google.com

Brand Impersonation

Create fake websites that mimic legitimate brands to damage reputation or steal traffic.

Example: arnazon.com instead of amazon.com

Homoglyph attacks are particularly effective because they're difficult for users to detect visually.

Homoglyph Detection

How our system identifies potential homoglyph typos

Our typo analysis engine uses a comprehensive database of homoglyph pairs to identify potential substitutions in domain names. We assign a probability score of 55% to homoglyph substitutions, reflecting their deliberate nature.

Homoglyph Detection Algorithm

Map each character in the domain to potential homoglyphs
Generate all possible combinations of homoglyph substitutions
Filter out implausible combinations
Rank by likelihood based on character frequency and position
Prioritize substitutions that maintain visual similarity

This approach allows us to identify the most likely homoglyph typos for any domain name.
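
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the engine's actual implementation: the table `HOMOGLYPHS`, the function `homoglyph_candidates`, and the `max_subs` cutoff are hypothetical, and ranking by number of substitutions is a simple stand-in for the frequency- and position-based ranking described above.

```python
from itertools import product

# Hypothetical table; same substitutions as the examples on this page.
HOMOGLYPHS = {
    "o": ["0"], "l": ["1"], "s": ["5"], "b": ["8"],
    "m": ["rn", "nn"], "w": ["vv"], "d": ["cl"],
}


def homoglyph_candidates(label: str, max_subs: int = 2) -> list[str]:
    """Generate look-alike variants of `label`, fewest substitutions first."""
    # Step 1: map each character to itself plus its look-alikes.
    options = [[ch] + HOMOGLYPHS.get(ch, []) for ch in label]
    scored = set()
    # Step 2: enumerate every combination of choices.
    for combo in product(*options):
        subs = sum(1 for orig, new in zip(label, combo) if orig != new)
        # Step 3: drop the unchanged label and heavily altered variants
        # (assumed here to be the "implausible" ones).
        if 0 < subs <= max_subs:
            scored.add((subs, "".join(combo)))
    # Steps 4-5 (stand-in): fewer substitutions rank higher, ties alphabetical.
    return [variant for _, variant in sorted(scored)]


if __name__ == "__main__":
    for candidate in homoglyph_candidates("google"):
        print(candidate)
```

The number of combinations grows exponentially with the number of substitutable characters, so a practical version would prune during generation instead of filtering afterwards.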

Protecting Against Homoglyph Attacks

Defending against homoglyph-based typosquatting requires a proactive approach to domain registration and monitoring. Our typo generator can help you identify potential homoglyph variations of your domain that might be used in phishing attacks.
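
For monitoring, one possible approach is to work in the opposite direction: fold known look-alikes back to the character they imitate, then compare the result with your own domain. The sketch below assumes a hypothetical folding table `FOLD` and hypothetical helpers `skeleton` and `imitates`; it is not a description of how the typo generator works.

```python
# Hypothetical folding table: look-alike sequence -> character it imitates.
# Multi-character sequences are listed first so they are folded before digits.
FOLD = [
    ("rn", "m"), ("nn", "m"), ("vv", "w"), ("cl", "d"),
    ("0", "o"), ("1", "l"), ("5", "s"), ("8", "b"),
]


def skeleton(label: str) -> str:
    """Collapse known look-alike sequences to a canonical form."""
    out = label.lower()
    for lookalike, target in FOLD:
        out = out.replace(lookalike, target)
    return out


def imitates(candidate: str, brand: str) -> bool:
    """True if `candidate` differs from `brand` but folds to the same skeleton."""
    if candidate.lower() == brand.lower():
        return False
    return skeleton(candidate) == skeleton(brand)


if __name__ == "__main__":
    for seen in ["paypa1", "paypal", "arnazon", "example"]:
        print(seen, imitates(seen, "paypal"), imitates(seen, "amazon"))
```

Both sides are folded before comparison, so a brand that legitimately contains a sequence such as "cl" or "rn" still matches itself. The trade-off is false positives: unrelated names can share a skeleton, so matches are best treated as leads to review, not verdicts.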


International Homoglyphs

The problem of homoglyphs extends beyond the Latin alphabet. With the introduction of Internationalized Domain Names (IDNs), characters from different scripts can appear visually identical to Latin characters, creating even more opportunities for confusion.

For example, the Cyrillic letter 'о' (U+043E) looks identical to the Latin letter 'o' (U+006F) in most fonts, but they are different Unicode characters. This has led to sophisticated phishing attacks known as "IDN homograph attacks."
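
The difference is easy to confirm programmatically. This short sketch uses Python's standard `unicodedata` module; the variable names and the spoofed string are illustrative.

```python
import unicodedata

latin_o = "\u006f"     # Latin small letter o
cyrillic_o = "\u043e"  # Cyrillic small letter o

for ch in (latin_o, cyrillic_o):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# The two strings render alike but do not compare equal.
spoofed = "g" + cyrillic_o * 2 + "gle"
print(spoofed == "google")  # False
```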

Modern browsers implement various protections against these attacks, such as displaying the Punycode form (an ASCII encoding that begins with "xn--") of IDNs that mix scripts, but users should remain vigilant when clicking on links or typing domain names.
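
A rough version of a mixed-script check can be sketched with the standard library. This is a simplified heuristic, not how any particular browser decides: the helpers `scripts_in`, `is_mixed_script`, and `to_ascii` are hypothetical, the script is approximated from the first word of each character's Unicode name, and Python's built-in `idna` codec implements the older IDNA 2003 rules.

```python
import unicodedata


def scripts_in(label: str) -> set[str]:
    """Approximate the scripts used in `label` from Unicode character names."""
    found = set()
    for ch in label:
        if ch.isalpha():
            # Names look like "CYRILLIC SMALL LETTER O"; the first word is
            # used here as a rough stand-in for the Unicode script property.
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def is_mixed_script(label: str) -> bool:
    """True if the letters in `label` appear to come from more than one script."""
    return len(scripts_in(label)) > 1


def to_ascii(domain: str) -> str:
    """Encode a domain to its ASCII (Punycode) form with the built-in codec."""
    return domain.encode("idna").decode("ascii")


if __name__ == "__main__":
    spoofed = "g\u043e\u043egle.com"  # two Cyrillic letters in place of Latin o
    print(is_mixed_script(spoofed.split(".")[0]))
    print(to_ascii(spoofed))
```

A label written entirely in one non-Latin script passes this check, so mixed-script detection alone does not catch every look-alike.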
