Non-English Domain Names Likely Delayed

Mon Jul 18 13:48:51 UTC 2005

Stephane Bortzmeyer wrote:

>>Forwarded Message from Neil Harris <neil at tonal.clara.co.uk> ---
>>    
>>
>...
>  
>
>>After extensive analysis and discussion, the Mozilla community and Opera 
>>have already produced a fix for this,
>>    
>>
>
>Which is highly questionable, and which is rejected by most European
>ccTLDs.
>
>  
>
>>Already, some 21 TLDs are whitelisted, including .cn, .tw, a number
>>of European ccTLDs, .museum, and .info. Any other registrars who
>>want to be supported can simply E-mail Gerv at the Mozilla
>>Foundation, or his Opera counterpart, and give them a pointer to
>>their anti-spoofing rules.
>>    
>>
>
>The Polish registry has already refused to comply, saying that the Mozilla
>Foundation has no legitimacy to decide the registration rules in ".pl".
>
>  
>

Stephane, can I ask you what your detailed objections are to the 
Moz/Opera mechanism, and could you let me know your proposal for an 
alternative mechanism for preventing IDN spoofing?

I completely understand the need for registries to define and control 
their own rules, since every registry has different needs. Thus, I agree 
with you that the Mozilla Foundation does not have, and should not have, 
any right whatsoever to decide registries' registration rules.

However, by the same principle, Mozilla, Opera and other software 
vendors also have the right to choose their policy for how they display 
domain names in their products' GUI. Ultimately, the decision of what 
policy is used devolves to the user, who decides what software they want 
to install on their machine.

The Moz/Opera anti-spoofing mechanism is the result of widespread public 
analysis and discussion, and has the following advantages:
* it deals with the actual problem: the visual representation of 
characters to the user -- the problem is, quite literally, in the eye of 
the beholder
* it is simple to code and deploy: about ten lines of code for the 
Mozilla implementation
* it is based on simple and non-political principles
* it requires only a minimal amount of data to be distributed with the 
software
* it is the sole survivor of a large number of alternative proposals 
that were considered and rejected. Unlike most of the other rejected 
proposals, it does not need any modifications to the DNS protocol, or 
distribution of "language" codes for labels, nor does it require 
multiple DNS lookups, large character tables in the browser, or 
real-time access to WHOIS information. (I can tell you in great detail 
about some of the flawed alternative proposals, if you like).
* it is based on a much more thorough analysis of the problem than the 
earlier ICANN proposals, and builds on the experience of the Unicode 
community, and the earlier analysis of the spoofing problem for the CJK 
languages performed for RFC 3743. For example, simple script 
restrictıons alone, as per ICANN, do not solve the problem -- there are 
plenty of subtle homographs in the Latin alphabet, such as the one 
embedded in this sentence.
* it does not treat IDNs as second-class citizens
* it is language- and script-agnostic
* it is scalable on a per-registry basis, so there's no need for a "flag 
day", and requires no action on the part of the registry beyond that which 
might be expected as a service to their customers, who have a reasonable 
expectation that their domains not be easily spoofed.
* and, most of all, it uses human, and not technical, means to provide a 
chain of trust from the registry to the application to the user
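
To make the display policy concrete, here is a rough Python sketch of the idea; it is not the Mozilla or Opera code. The names IDN_WHITELIST and display_form are made up for illustration, the whitelist holds only the four TLDs named earlier in this thread, and Python's built-in "idna" codec stands in for whatever IDNA library a browser would actually use.

```python
# Hypothetical whitelist: only the TLDs named earlier in this thread.
# The real lists are maintained by the browser vendors.
IDN_WHITELIST = {"cn", "tw", "museum", "info"}


def display_form(hostname: str) -> str:
    """Return the form of `hostname` that the UI would show (illustrative only)."""
    # Normalise to the ASCII (punycode) form first.
    ascii_name = hostname.encode("idna").decode("ascii")
    tld = ascii_name.rsplit(".", 1)[-1].lower()
    if tld in IDN_WHITELIST:
        # The registry has published anti-spoofing rules: show Unicode.
        return ascii_name.encode("ascii").decode("idna")
    # Otherwise fall back to the raw punycode form.
    return ascii_name
```

With this sketch, a name under a whitelisted TLD is shown in Unicode, while the same label under any other TLD is shown in its "xn--" form, so the user can see that something unusual is going on.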

I must say that, from a user's perspective, I find it hard to understand 
why any registry would not want to put their anti-spoofing policy -- 
assuming they have one -- on public display, thus encouraging software 
vendors to regard their IDN labels as safe to display within their software.

In the long run, of course, it makes sense for best common registry 
anti-spoofing practices to be codified, probably in an RFC, or through 
the Unicode Consortium. However, until then, the maintenance of an 
ad-hoc list by software vendors seems to be a powerful incentive in the 
short term for registries to implement and publish anti-spoofing 
policies which encourage trust.

There are a vast number of possible policies which registries could 
introduce, any of which might serve this purpose.

For example, for .fr, it could be as simple as saying something like 
"labels in .fr must consist only of characters from the set -, 0, 1, 2, 
3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, 
r, s, t, u, v, w, x, y, z, à, â, æ, ç, è, é, ê, ë, î, ï, ô, ù, û, ü, ÿ, 
œ", putting that statement on their website, and letting the software 
makers know about it.
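
A rule of this kind is also trivial to check mechanically. As a minimal sketch, assuming labels are compared after NFC normalisation (my assumption, not part of the statement above), and with FR_ALLOWED and label_allowed being names invented for this example:

```python
import unicodedata

# Character set copied from the example .fr statement above.
FR_ALLOWED = set(unicodedata.normalize(
    "NFC", "-0123456789abcdefghijklmnopqrstuvwxyzàâæçèéêëîïôùûüÿœ"))


def label_allowed(label: str, allowed: set = FR_ALLOWED) -> bool:
    """True if every character of the non-empty label is in the allowed set."""
    # Assumption: compare in NFC so "e" + combining acute matches "é".
    normalized = unicodedata.normalize("NFC", label)
    return bool(normalized) and all(ch in allowed for ch in normalized)
```

A label containing a Cyrillic letter or a dotless i fails this check, because those characters are simply not in the published set.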

For .pl, which appears to want to support multiple character sets 
including the Cyrillic alphabet, it could be to say "we implement the 
character set restrictions of draft-bartosiewicz-idn-pltld-06.txt, 
together with blocking bundling using the confusables.txt table as per 
UTR #36-3".

In my opinion, either of these statements would persuade me that the 
registry was applying due diligence in avoiding homograph spoofs, and I 
would imagine that browser vendors would take the same view.
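
For the blocking approach mentioned for .pl, the general idea can be sketched as follows. The CONFUSABLES table here is a tiny hand-picked subset for illustration only; a registry would load the full confusables data published by the Unicode Consortium. The function names skeleton and is_blocked are likewise hypothetical.

```python
# Tiny illustrative subset of confusable mappings, NOT the real table.
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0131": "i",  # LATIN SMALL LETTER DOTLESS I
}


def skeleton(label: str) -> str:
    """Map each character to its look-alike representative, if it has one."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in label)


def is_blocked(candidate: str, registered: set) -> bool:
    """True if `candidate` has the same skeleton as an already registered label."""
    wanted = skeleton(candidate)
    return any(skeleton(existing) == wanted for existing in registered)
```

Under this sketch, an application for a label that differs from an existing registration only by look-alike characters would be refused.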

Again, if this is unworkable, please suggest a better alternative.

-- Neil