Micro-Mining and Segmented Log File Analysis: A Method for Enriching the Data Yield from Internet Log Files
- 1 October 2003
- journal article
- research article
- Published by SAGE Publications in Journal of Information Science
- Vol. 29 (5), 391-404
- https://doi.org/10.1177/01655515030295005
Abstract
The authors propose improved ways of analysing web server log files. Traditionally, web site statistics give a big (and shallow) picture based on all transaction log entries. That picture is, however, distorted by two problems: Internet protocol (IP) numbers cannot reliably be resolved to a single user, and IP numbers are often registered in a country other than the one where the user is located (cross-border IP registration). The authors argue that analysing extracted sub-groups and categories presents a more accurate picture of the data, and that analysing the online behaviour of selected individuals (rather than of very large groups) can add much to our understanding of how people use web sites and, indeed, any digital information source. The analysis is labelled 'micro' to distinguish it from traditional macro, big picture transactional log analysis. The methods are illustrated using the logs of the Surgery Door (www.surgerydoor.co.uk) consumer health web site. It was found that use attributed to academic users gave a better approximation of the geographical distribution of the site's users than an analysis based on all users. This is because academic institutions, unlike other user types, register in their host country. Selecting log entries where each user is allocated a unique IP number can be particularly beneficial, especially to analyses of returnees. Finally, the paper tracks the online behaviour of a small number of IP numbers as an example of the application of microanalysis.
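The segmentation and micro-tracking ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sample log lines, the hostnames, the Common Log Format layout, and the rule for recognising academic hosts (an `edu` or `ac` label in the resolved hostname) are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical entries in Common Log Format; hostnames are invented examples.
SAMPLE_LOG = """\
lib.example.ac.uk - - [01/Oct/2003:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 1024
lib.example.ac.uk - - [01/Oct/2003:10:02:00 +0000] "GET /health/a.html HTTP/1.0" 200 2048
proxy.example.com - - [01/Oct/2003:10:03:00 +0000] "GET /index.html HTTP/1.0" 200 1024
cs.example.edu - - [01/Oct/2003:10:05:00 +0000] "GET /health/b.html HTTP/1.0" 200 512
"""

# host, identity, user, [timestamp], "METHOD path protocol", status, size
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+)'
)


def parse(log_text):
    """Turn raw log text into a list of dicts, skipping malformed lines."""
    entries = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            host, timestamp, _method, path, _status, _size = match.groups()
            entries.append({"host": host, "time": timestamp, "path": path})
    return entries


def is_academic(host):
    """Assumed heuristic: '.edu', or an 'ac'/'edu' label before the country code."""
    parts = host.lower().split(".")
    if parts[-1] == "edu":
        return True
    return len(parts) >= 2 and parts[-2] in {"ac", "edu"}


def segment(entries):
    """Split entries into an academic sub-group and everything else."""
    groups = {"academic": [], "other": []}
    for entry in entries:
        key = "academic" if is_academic(entry["host"]) else "other"
        groups[key].append(entry)
    return groups


def trails_by_host(entries):
    """Micro view: the ordered list of pages requested by each host."""
    trails = defaultdict(list)
    for entry in entries:
        trails[entry["host"]].append(entry["path"])
    return dict(trails)


entries = parse(SAMPLE_LOG)
groups = segment(entries)
trails = trails_by_host(groups["academic"])
```

Here `groups` holds the segmented sub-groups that a macro analysis would lump together, while `trails` follows individual hosts page by page. A real analysis would also need to handle unresolved numeric IP addresses and shared proxies, which this sketch ignores.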