Abstract
The composition of traditional domain names is restricted to ASCII letters, digits, and hyphens (abbreviated as LDH). This makes it difficult for many users to name and access their Internet hosts in their native language. The IETF IDN (Internationalized Domain Name) Working Group proposes a mechanism, IDNA (Internationalizing Domain Names in Applications), for internationalized access to multilingual domain names. The proposal uses a preparation process that converts a Unicode IDN into an ACE (ASCII Compatible Encoding) string that uses only LDH, so applications can look up the ACE string through the existing DNS infrastructure. However, some domain name strings embedded in multilingual content carry no charset tag, so they cannot be converted reliably into ACE strings. We noticed that many Internet applications allow users to enter non-ASCII domain names. This motivated us to design an architecture for IDN resolution that also minimizes the cost of modifying legacy Internet applications. We focus specifically on designing an IDN server proxy, located on the domain name server side, that handles domain names in multiple encodings. In this article, we study several architecture design issues, including detection of charset encoding and routing of non-ACE IDN lookup requests.
To address these design issues, we present an IDN server proxy architecture that stores ACE IDNs in domain name servers. Traditional domain name servers can therefore be used without modification. An IDN server proxy, called Octopus, is deployed on the domain name server side. On receiving an IDN lookup request from remote users or autonomous systems, Octopus converts the non-ACE IDN string into ACE. The ACE string is then forwarded to the backend domain name servers (where the traditional domain names and ACE IDNs are stored) for further processing. This allows Internet users to access IDNs without having to upgrade their software.
Based on the design and implementation of Octopus, we deployed a CDN (Chinese domain name) trial in July 2002. In this article, we present the results of testing the Octopus IDN lookup functions as well as our experiences in providing CDN Web services. Several types of errors can occur if applications are unable to handle IDNs adequately. For example, a Web browser may parse an IDN within a URL incorrectly, many legacy Web servers are unable to process the IDN of a virtual host, and Web application servers may have trouble completing actions such as redirecting Web pages to alternative Web pages. Our studies help service providers understand the problems that can arise when non-ASCII domain names are used, and the best common practice at this stage. They also give software developers some guidance for developing IDNA-compliant Internet applications.
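The conversion step described above (guess the charset of an untagged name, then produce the ACE form) can be illustrated with a minimal Python sketch. This is not the Octopus implementation. The candidate charset list, the function names `detect_and_decode` and `to_ace`, and the use of trial decoding as the detection method are all assumptions made for illustration. The sketch relies on Python's built-in `idna` codec, which follows RFC 3490.

```python
from typing import Sequence

# Hypothetical candidate list (an assumption, not taken from the article).
# Order matters: the first charset that decodes without error is accepted.
CANDIDATE_CHARSETS: Sequence[str] = ("ascii", "utf-8", "big5")


def detect_and_decode(raw: bytes,
                      candidates: Sequence[str] = CANDIDATE_CHARSETS) -> str:
    """Guess the charset of an untagged domain name by trial decoding."""
    for charset in candidates:
        try:
            return raw.decode(charset)
        except UnicodeDecodeError:
            continue  # not valid in this charset, try the next candidate
    raise ValueError("no candidate charset decodes this domain name")


def to_ace(raw: bytes) -> str:
    """Convert a domain name in an unknown encoding to its ACE form."""
    name = detect_and_decode(raw)
    # The built-in "idna" codec applies nameprep and Punycode per label.
    return name.encode("idna").decode("ascii")


if __name__ == "__main__":
    for wire_bytes in ("中文.tw".encode("utf-8"),
                       "中文.tw".encode("big5"),
                       b"example.com"):
        print(wire_bytes, "->", to_ace(wire_bytes))
```

Trial decoding is only a heuristic. Legacy double-byte charsets such as Big5 and GB2312 overlap in their byte ranges, so the same bytes can decode cleanly under more than one of them, and a real proxy would need additional evidence to choose between such candidates.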
Index Terms
- IDN server proxy architecture for Internationalized Domain Name resolution and experiences with providing Web services