In this paper, we attempt to build query networks from web search engine
query logs, with nodes representing queries and edges capturing the
semantic relatedness between queries. To build the network, users’ query histories
are extracted from query logs and are then segmented into query sessions.
Semantic relatedness of queries is modeled using three different statistical
measures: collocation, weighted dependence, and mutual information. We
compare the constructed query networks with comparable random networks
and conclude that query networks exhibit small-world properties. We also
propose a method for identifying community structures, which are representative
of semantic taxonomies, by applying Newman clustering to query networks.
The experimental evaluation demonstrates the effectiveness of our proposed
method against a baseline model.
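
As a rough illustration of the network-building step described above, the sketch below scores query pairs that co-occur in the same session and keeps the positively associated pairs as weighted edges. It is a minimal sketch under stated assumptions, not the paper's method: the sessions are assumed to be already segmented, pointwise mutual information (PMI) stands in for the mutual information measure named in the abstract, and the function name `build_query_network`, the `min_pmi` threshold, and the sample sessions are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def build_query_network(sessions, min_pmi=0.0):
    """Return {(query_a, query_b): pmi} for query pairs sharing a session.

    Assumption: PMI over session co-occurrence is used as the relatedness
    score; the paper's exact formulation may differ.
    """
    n_sessions = len(sessions)
    query_count = Counter()  # number of sessions containing each query
    pair_count = Counter()   # number of sessions containing both queries

    for session in sessions:
        unique = sorted(set(session))  # count each query once per session
        query_count.update(unique)
        pair_count.update(combinations(unique, 2))

    edges = {}
    for (a, b), n_ab in pair_count.items():
        # PMI = log( P(a, b) / (P(a) * P(b)) ), with session-level estimates
        pmi = math.log((n_ab * n_sessions) / (query_count[a] * query_count[b]))
        if pmi > min_pmi:
            edges[(a, b)] = pmi
    return edges


# Hypothetical, hand-made sessions used only to exercise the function.
sessions = [
    ["python tutorial", "python list comprehension"],
    ["python tutorial", "python list comprehension", "numpy array"],
    ["cheap flights", "hotel deals"],
    ["cheap flights", "hotel deals", "car rental"],
]
network = build_query_network(sessions)
for (a, b), weight in sorted(network.items()):
    print(f"{a} <-> {b}: {weight:.3f}")
```

Each key in the returned dictionary is an alphabetically ordered query pair, so the result can be passed to a graph library as a weighted edge list for the small-world comparison or community detection.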
by otis, 2007-09-17 19:41
NLP · information retrieval · paper · search