ABSTRACT
Emails are important tools for communication and cooperation; they contain a large amount of information and connections to knowledge and data sources. It is therefore important to make their processing more efficient. This paper describes an email search system that integrates full-text search with social search and also processes attached and linked resources. The project is still in progress, so some proposed parts of the system are neither implemented nor validated yet. The proposed equation for determining the social importance of an email still has to be tuned during the final development phases and the evaluation phase. The parts already implemented include content extraction from email messages and from attached and linked resources, textual search, and social relation extraction. The next development phase includes tuning the social evaluation and integrating it with textual search.
Index Terms
- Full-text search in email archives using social evaluation, attached and linked resources