DOI: 10.1145/2723372.2749447
research-article

Exact Top-k Nearest Keyword Search in Large Networks

Published: 27 May 2015

Abstract

Top-k nearest keyword search has attracted interest because of applications ranging from keyword-based location search on road networks to information search over RDF repositories. A query specifies a vertex and a keyword, and the problem is to find the $k$ vertices nearest to the query vertex that contain the keyword. The known algorithms for this problem give only approximate answers. In this paper, we propose algorithms for top-k nearest keyword search that provide exact solutions and handle very large networks. Experiments on real datasets compare the performance of our solutions with that of the best-known approximation algorithms.
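
To make the query definition concrete, the following is a minimal sketch of a naive exact baseline: a Dijkstra expansion from the query vertex that stops once $k$ keyword-bearing vertices have been settled. It is not the method proposed in the paper, whose index structures are not described on this page. The function name `top_k_nearest_keyword`, the helper `build_undirected`, the adjacency-list representation, and the toy graph are all assumptions made for illustration. The sketch also assumes non-negative edge weights and counts the query vertex itself (at distance 0) if it contains the keyword.

```python
import heapq
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Graph = Dict[Hashable, List[Tuple[Hashable, float]]]


def build_undirected(edges: Iterable[Tuple[Hashable, Hashable, float]]) -> Graph:
    """Build an adjacency list from (u, v, weight) triples (hypothetical helper)."""
    graph: Graph = {}
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph


def top_k_nearest_keyword(
    graph: Graph,
    keywords: Dict[Hashable, Set[str]],
    source: Hashable,
    keyword: str,
    k: int,
) -> List[Tuple[Hashable, float]]:
    """Return up to k (vertex, distance) pairs, nearest first, whose vertices carry `keyword`.

    Naive baseline: plain Dijkstra from `source`, assuming non-negative weights.
    """
    dist = {source: 0.0}
    # The counter breaks ties so that vertices themselves are never compared.
    heap = [(0.0, 0, source)]
    counter = 1
    settled = set()
    result: List[Tuple[Hashable, float]] = []

    while heap and len(result) < k:
        d, _, u = heapq.heappop(heap)
        if u in settled:
            continue  # stale heap entry
        settled.add(u)
        # Vertices are settled in non-decreasing distance order,
        # so the first k matches are the k nearest.
        if keyword in keywords.get(u, ()):
            result.append((u, d))
        for v, w in graph.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, counter, v))
                counter += 1
    return result


# Toy data, invented for illustration only.
toy_graph = build_undirected(
    [("a", "b", 1), ("a", "c", 4), ("b", "c", 2), ("c", "d", 1), ("b", "e", 7)]
)
toy_keywords = {
    "b": {"cafe"},
    "c": {"cafe", "bank"},
    "d": {"cafe"},
    "e": {"bank"},
}

nearest_cafes = top_k_nearest_keyword(toy_graph, toy_keywords, "a", "cafe", 2)
```

On the toy graph, `nearest_cafes` holds the two closest vertices to `a` that carry the keyword `cafe`. A baseline like this may explore a large part of the graph when the keyword is rare, which is the kind of cost that index-based approaches aim to avoid on very large networks.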

    Published In

    SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
    May 2015
    2110 pages
    ISBN: 9781450327589
    DOI: 10.1145/2723372
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. 2-hop labeling
    2. keyword-lookup tree
    3. nearest keyword search

    Qualifiers

    • Research-article

    Funding Sources

    • RGC GRF Hong Kong

    Conference

    SIGMOD/PODS'15: International Conference on Management of Data
    May 31 - June 4, 2015
    Melbourne, Victoria, Australia

    Acceptance Rates

    SIGMOD '15 Paper Acceptance Rate 106 of 415 submissions, 26%;
    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Article Metrics

    • Downloads (last 12 months): 24
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 19 Feb 2025

    Cited By

    • (2025) Privacy-Preserving Hierarchical Top-k Nearest Keyword Search on Graphs. Electronics 14:4 (736). DOI: 10.3390/electronics14040736. Online publication date: 13-Feb-2025
    • (2024) Chameleon: A Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models. Proceedings of the VLDB Endowment 18:1 (42-52). DOI: 10.14778/3696435.3696439. Online publication date: 1-Sep-2024
    • (2024) DKWS: A Distributed System for Keyword Search on Massive Graphs. IEEE Transactions on Knowledge and Data Engineering (1-16). DOI: 10.1109/TKDE.2023.3313726. Online publication date: 2024
    • (2024) Hierarchical Taylor quantized kernel least mean square filter for data aggregation in wireless sensor network. International Journal of Communication Systems 37:18. DOI: 10.1002/dac.5952. Online publication date: 15-Aug-2024
    • (2023) An Efficient Dynamic Programming Algorithm for Finding Group Steiner Trees in Temporal Graphs. International Journal of Intelligent Systems 2023. DOI: 10.1155/2023/1974161. Online publication date: 1-Jan-2023
    • (2023) PSPC: Efficient Parallel Shortest Path Counting on Large-Scale Graphs. 2023 IEEE 39th International Conference on Data Engineering (ICDE) (896-908). DOI: 10.1109/ICDE55515.2023.00074. Online publication date: Apr-2023
    • (2023) Efficiently Answering Quality Constrained Shortest Distance Queries in Large Graphs. 2023 IEEE 39th International Conference on Data Engineering (ICDE) (856-868). DOI: 10.1109/ICDE55515.2023.00071. Online publication date: Apr-2023
    • (2022) Nearest close friend query in road-social networks. Computer Science and Information Systems 19:3 (1283-1304). DOI: 10.2298/CSIS210930031C. Online publication date: 2022
    • (2022) Path–Based Continuous Spatial Keyword Queries. Complexity 2022. DOI: 10.1155/2022/4091245. Online publication date: 1-Jan-2022
    • (2022) DAWAR. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (395-404). DOI: 10.1145/3477495.3531962. Online publication date: 6-Jul-2022
