ABSTRACT
Unicode is becoming a dominant character representation format for information processing. This creates a serious usability and security problem for many applications, because many characters in the UCS (Universal Character Set) are visually and/or semantically similar to each other. Malicious parties can exploit this similarity to carry out Unicode attacks, which include spam attacks, phishing attacks, and web identity attacks. In this paper, we describe the potential attacks and propose a methodology for countering them. To evaluate the feasibility of our methodology, we construct a Unicode Character Similarity List (UC-SimList). We then implement a visual and semantic based edit distance (VSED), as well as a visual and semantic based Knuth-Morris-Pratt algorithm (VSKMP), to detect Unicode attacks. We develop a prototype Unicode attack detection tool, IDN-SecuChecker, which detects phishing weblinks and fake user name (account) attacks. We also discuss possible practical uses of Unicode attack detectors.
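To illustrate the idea behind a similarity-aware edit distance, the following is a minimal sketch, not the VSED implementation described in the paper. It assumes a tiny hand-written lookup table (SIM_PAIRS, a hypothetical stand-in for the UC-SimList) whose scores are invented for illustration, and it simply makes substitution cheaper in a standard Levenshtein recurrence when two characters are listed as similar. The function names similarity and weighted_edit_distance are likewise assumptions.

```python
# Hypothetical, tiny stand-in for a character similarity list.
# Keys are character pairs; values are assumed similarity scores in [0, 1].
SIM_PAIRS = {
    ("a", "\u0430"): 1.0,  # Latin 'a' vs Cyrillic small a
    ("o", "\u043e"): 1.0,  # Latin 'o' vs Cyrillic small o
    ("e", "\u0435"): 1.0,  # Latin 'e' vs Cyrillic small ie
    ("l", "1"): 0.8,       # Latin 'l' vs digit one
    ("o", "0"): 0.8,       # Latin 'o' vs digit zero
}


def similarity(x, y):
    """Return the assumed similarity of two characters (1.0 if identical)."""
    if x == y:
        return 1.0
    return SIM_PAIRS.get((x, y), SIM_PAIRS.get((y, x), 0.0))


def weighted_edit_distance(s, t):
    """Levenshtein distance where substituting similar characters costs less.

    Insertions and deletions cost 1.0; a substitution costs
    1.0 - similarity(x, y), so look-alike characters are nearly free.
    """
    prev = [float(j) for j in range(len(t) + 1)]
    for i, cs in enumerate(s, start=1):
        curr = [float(i)]
        for j, ct in enumerate(t, start=1):
            substitute = prev[j - 1] + (1.0 - similarity(cs, ct))
            delete = prev[j] + 1.0
            insert = curr[j - 1] + 1.0
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    genuine = "paypal"
    spoofed = "p\u0430yp\u0430l"  # both 'a' characters replaced by Cyrillic look-alikes
    print(weighted_edit_distance(genuine, spoofed))
    print(weighted_edit_distance(genuine, "paypax"))
```

Under these assumptions, a string that differs from a protected name only by look-alike characters receives a distance near zero even though the two strings are not byte-for-byte equal, which is the signal a detector could threshold on. An ordinary typo such as "paypax" still costs a full edit.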