
Multilingual Addresses

Started by October 26, 2009 08:43 AM
0 comments, last by SiCrane 15 years ago
So I just read on Google News that ICANN said it "would declare an end to the exclusive use of Latin characters for website addresses" 1, but the article doesn't really explain the significance of this change. It's my understanding that when you permit Unicode strings, you lose the ability to rely on the visual representation of a string to uniquely identify it. For instance:
http://www.gAmedev.net
A = 0x41 (ASCII, Latin 'A')
http://www.gAmedev.net
A = 0xCE91 (UTF-8, Greek capital alpha 'Α', U+0391)
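To make the look-alike problem concrete, here is a minimal Python sketch. The two labels are hypothetical, and Greek capital alpha (U+0391) is just one assumed stand-in for a character that renders like Latin 'A'; other scripts have similar look-alikes.

```python
import unicodedata

# Hypothetical labels for illustration: they render alike in many fonts
# but differ in the second character.
latin_label = "gAmedev"        # U+0041 LATIN CAPITAL LETTER A
greek_label = "g\u0391medev"   # U+0391 GREEK CAPITAL LETTER ALPHA


def describe(ch: str) -> str:
    """Return the code point, Unicode name, and UTF-8 bytes of one character."""
    utf8_hex = ch.encode("utf-8").hex().upper()
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} (UTF-8: 0x{utf8_hex})"


print(describe(latin_label[1]))
print(describe(greek_label[1]))
# The strings compare unequal even though they may look identical on screen.
print(latin_label == greek_label)
```

So a comparison in code can tell the two apart, but a person reading the address bar probably can't.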
ICANN is aware of this...
Quote: "ICANN is concerned about the potential exacerbation of homograph domain name spoofing as IDNs become more widespread, and is equally concerned about the implementation of countermeasures that may unnecessarily restrict the use and availability of IDNs." ICANN Statement on IDN Homograph Attacks
... but I'm not sure whether the change applies only to top-level domains or to all levels. It's my understanding that we were already at the point where domain names like "蔡依林.cn" are valid, but domain names such as "蔡依林.公司" are not. Any experts here mind filling in the gaps?
It's highly unlikely they'll allow arbitrary Unicode sequences as domain names. They'd probably force the use of a normalized form, most likely NFKC.
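As a rough sketch of what NFKC normalization would and wouldn't do, the following uses Python's standard `unicodedata` module. The `normalize_label` helper and the sample labels are hypothetical, and this is not a claim about what ICANN or any registry actually implements.

```python
import unicodedata


def normalize_label(label: str) -> str:
    """Apply NFKC normalization to a single (hypothetical) domain label."""
    return unicodedata.normalize("NFKC", label)


# "gamedev" written with fullwidth Latin letters (U+FF41..U+FF5A range).
fullwidth_label = "\uff47\uff41\uff4d\uff45\uff44\uff45\uff56"
# "gAmedev" with Greek capital alpha (U+0391) in place of Latin 'A'.
greek_label = "g\u0391medev"

# Compatibility variants such as fullwidth letters fold to plain ASCII.
print(normalize_label(fullwidth_label) == "gamedev")
# Greek alpha has no compatibility mapping to Latin 'A', so it stays distinct.
print(normalize_label(greek_label) == "gAmedev")
```

In this sketch NFKC folds the fullwidth variant but leaves the Greek alpha untouched, so normalization by itself may not be enough to rule out cross-script look-alikes.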

