Abstract: Visitors enter a website through a variety of means, including web searches, links from other sites, and personal bookmarks. In some cases the first page loaded satisfies the visitor’s needs and no additional navigation is necessary. In other cases, however, the visitor is better served by content located elsewhere on the site, which must be found by navigating links. If the path between a user’s current location and his eventual goal is circuitous, then the user may never reach that goal or will have to exert considerable effort to reach it. By mining site access logs, we can draw conclusions of the form “users who load page p are likely to later load page q.” If there is no direct link from p to q, then it would be advantageous to provide one. The process of providing links to users’ eventual goals while skipping over the in-between pages is called shortcutting. Existing algorithms for shortcutting require substantial offline training, which makes them unable to adapt when access patterns change between training sessions. We present improved online algorithms for shortcut link selection that are based on a novel analogy drawn between shortcutting and caching. In the same way that cache algorithms predict which memory pages will be accessed in the future, our algorithms predict which web pages will be accessed in the future. Our algorithms are very efficient and are able to consider accesses over a long period of time, but give extra weight to recent accesses. Our experiments show significant improvement in the utility of shortcut links selected by our algorithms as compared to those selected by existing algorithms.
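To make the caching analogy concrete, the following is a minimal sketch of one way it could be read, not the algorithm from the paper. It assumes each page p owns a small cache of shortcut destinations, that a later load of q in the same session counts as an access to p's cache, and that eviction is least-recently-used. The class name `ShortcutCache`, the session format (an ordered list of page names), the `links` map, and the LRU policy are all illustrative assumptions.

```python
from collections import OrderedDict, defaultdict


class ShortcutCache:
    """Per-page LRU cache of candidate shortcut destinations (illustrative only)."""

    def __init__(self, links, capacity=2):
        self.links = links          # page -> set of pages it already links to
        self.capacity = capacity    # shortcut slots per page (assumed value)
        self.cache = defaultdict(OrderedDict)  # page -> destinations, oldest first

    def observe(self, session):
        """Update the caches online from one session (ordered list of pages)."""
        for i, p in enumerate(session):
            for q in session[i + 1:]:
                if q == p or q in self.links.get(p, set()):
                    continue  # a direct link already exists, so no shortcut is needed
                slots = self.cache[p]
                if q in slots:
                    slots.move_to_end(q)  # cache hit: mark q as most recently used
                else:
                    slots[q] = True
                    if len(slots) > self.capacity:
                        slots.popitem(last=False)  # evict the least recently used

    def shortcuts(self, p):
        """Return the current shortcut destinations for p, most recent first."""
        return list(reversed(self.cache[p]))


# Hypothetical site where a -> b -> c -> d are the only existing links.
site_links = {"a": {"b"}, "b": {"c"}, "c": {"d"}}
sc = ShortcutCache(site_links, capacity=2)
sc.observe(["a", "b", "c", "d"])
print(sc.shortcuts("a"))
```

Because each session updates the caches as it arrives, no separate offline training pass is needed in this sketch, and recently seen destinations displace older ones. Other cache replacement policies could be substituted for LRU to weigh long-term frequency against recency.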
Download: pdf
Citation
- Adaptive Website Design using Caching Algorithms (pdf, software)
J. Brickell, I. Dhillon, D. Modha.
ACM International Conference on Knowledge Discovery and Data Mining (KDD) (Workshop on Web Mining and Web Usage Analysis) (WebKDD), August 2006.