Log File Analysis


Log file analysis collects statistics about page accesses and key figures about the use of a website or web server. It is based on reading out the log files that the server records. Today, page tagging as used by web analytics tools has in many cases replaced traditional log file analysis; this tagging is not to be confused with social tagging.

Evaluation of Log Files

When webmasters evaluate the log files generated by accesses to a website, they usually have to process large amounts of data. For very small projects with few page views, the files could be read and the individual entries assigned by hand. However, as soon as the number of accesses grows and longer periods are to be covered, special programs are required into which the log files are loaded and which then output the data according to individual aspects.
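As a minimal sketch of such a program, the following streams an access log and counts hits per requested URL. The file name `access.log` and the field layout of the common log format (request line in quotes) are assumptions about the server configuration:

```python
from collections import Counter

def count_requests(path):
    """Stream an access log and count hits per requested URL.

    Assumes the common log format, where the request line
    (e.g. 'GET /page HTTP/1.1') is the first quoted field.
    """
    hits = Counter()
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            try:
                request = line.split('"')[1]   # e.g. 'GET /page HTTP/1.1'
                url = request.split()[1]       # the requested path
            except IndexError:
                continue                       # skip malformed lines
            hits[url] += 1
    return hits

# Example: print the ten most requested URLs
# for url, n in count_requests("access.log").most_common(10):
#     print(n, url)
```

Because the file is read line by line, this approach also copes with logs that are far too large to open in a spreadsheet.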

Aspects of a log file analysis

A log file analysis can break down basic key figures about the users of a website:

  • IP address and host name
  • Country or region of origin
  • Browser and operating system used
  • Direct access by the user, or referral from another website or advertising measure
  • Search engine used and search term entered
  • Duration of the visit and number of pages viewed by the user
  • Page from which the user left the website (exit page)

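Most of these key figures can be read directly from a single log entry. The sketch below parses one line in the Apache "combined" log format (an assumption about the server configuration; other formats need an adapted pattern) and flags whether the access was direct or came via a referrer:

```python
import re

# Regex for the Apache "combined" log format (assumed configuration;
# other log formats require an adapted pattern).
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Extract the basic key figures from one combined-format log line."""
    m = COMBINED.match(line)
    if not m:
        return None
    entry = m.groupdict()
    # A referrer of "-" means the user typed the URL or used a bookmark.
    entry["direct"] = entry["referrer"] in ("-", "")
    return entry

line = ('203.0.113.7 - - [10/Oct/2023:13:55:36 +0200] '
        '"GET /page.html HTTP/1.1" 200 2326 '
        '"https://www.example.com/start" "Mozilla/5.0 (Windows NT 10.0)"')
entry = parse_line(line)
# entry["ip"] is '203.0.113.7'; entry["direct"] is False (referred visit)
```

Country of origin, browser, and operating system are not in the log line itself; they are derived afterwards from the IP address (via a geolocation database) and the user-agent string.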
Advantages of a log file analysis

An analysis of the log files offers the following advantages:

  • Re-evaluation of historical data: web servers record log files continuously. If these files are archived, they can be evaluated flexibly at any later time.
  • Access figures remain within your own network: anyone who performs log file analysis themselves rather than outsourcing it to an external service provider retains control of their access data.
  • Measurement of aborted downloads: the web server log records every request for the files stored on the server that users can download. From the timestamps and transferred byte counts of individual hits, it can be determined how long a download took and how much was actually transferred. Problems with downloads can therefore be identified more precisely with log file analysis.
  • Firewalls do not interfere with logging: since logging takes place on the server itself, a user's firewall cannot block it, and the log file records the access exactly.
  • Automatic logging of crawlers: log files automatically record every visit to the web server, including visits by search engine bots.
  • No JavaScript or cookies required: in contrast to web analytics tools, log file analysis requires neither JavaScript code nor cookies. This makes it less susceptible to technical problems, and log files are still recorded even when users block web analytics tools.
  • Simple processing: if the log file is not too large, the data can be read and segmented with conventional spreadsheet programs such as Excel, so no complex software solutions are required.

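The automatic logging of crawlers mentioned above makes it possible to separate bot traffic from human traffic. A simple sketch filters by known substrings in the user-agent header; the signature list is a non-exhaustive assumption, and since user agents can be faked, reliable bot verification additionally checks the reverse DNS of the requesting IP:

```python
# Common crawler identifiers in the User-Agent header (a non-exhaustive,
# illustrative list; user agents can be faked, so serious verification
# also performs a reverse DNS lookup on the client IP).
BOT_SIGNATURES = ("googlebot", "bingbot", "yandexbot", "duckduckbot", "baiduspider")

def is_crawler(user_agent):
    """Heuristically decide whether a user-agent string belongs to a bot."""
    ua = user_agent.lower()
    return any(sig in ua for sig in BOT_SIGNATURES)

is_crawler("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")
# True: contains the "googlebot" signature
is_crawler("Mozilla/5.0 (Windows NT 10.0; Win64; x64)")
# False: an ordinary browser user agent
```

For SEO, splitting the log this way shows which pages search engine bots actually crawl and how often.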
Disadvantages of log file analysis

The disadvantages of log file analysis:

  • Caching and proxies: since a log file can only record data generated by direct server accesses, all accesses served from the browser cache or via proxy servers are missing from the log. Page traffic can therefore only be determined approximately with log file analysis.
  • Regular updates necessary: to ensure that log files always deliver correct figures, the webmaster has to keep the data-collection software up to date, which results in additional maintenance effort.
  • Additional storage requirements: because every server access is logged automatically, the volume of log data can quickly become very large for high-traffic sites. Anyone who performs log file analysis for large websites themselves therefore needs additional storage resources.
  • Complex data preparation for large amounts of data: for log file analysis, the individual log files must first be loaded into a data-preparation program, which means extra work, especially for large data sets.
  • No tracking of widgets or AJAX: a log file can only store data resulting from server requests. If actions within a page are carried out via AJAX or inside widgets, for example, they are not reflected as page views in the log file.
  • Inaccurate assignment of visits: if a user's IP address is assigned dynamically and changes while they access a website several times, the log file shows several accesses even though there was only one user. This makes the traffic count inaccurate. The same applies in reverse when several users share the same IP address: they are counted as a single visitor.
  • Less data: compared to web analytics tools, log file analysis delivers far less data. Important KPIs such as the bounce rate, for example, cannot be displayed.

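The inaccurate assignment of visits can be illustrated with a toy example. Counting visitors as distinct IP addresses (a naive but common approach; the data below is hypothetical) fails in both directions:

```python
def count_visitors_by_ip(entries):
    """Naive visitor count: the number of distinct client IPs."""
    return len({e["ip"] for e in entries})

# Hypothetical data: one real user whose provider reassigned the IP
# mid-session, and two colleagues behind the same NAT gateway.
dynamic_ip_user = [{"ip": "198.51.100.4"}, {"ip": "198.51.100.9"}]
shared_ip_users = [{"ip": "203.0.113.1"}, {"ip": "203.0.113.1"}]

count_visitors_by_ip(dynamic_ip_user)  # 2, although it was one user
count_visitors_by_ip(shared_ip_users)  # 1, although there were two users
```

Cookie-based analytics tools sidestep this by assigning each browser its own identifier, which is exactly the data a pure server log does not contain.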
Practical Use for SEO

With the help of log file analysis, SEOs can evaluate and process relevant visitor data themselves. At the same time, no data is transferred to external service providers, which avoids data protection problems. However, the analysis possibilities are limited, which is why log file analysis should not be used as the sole method of visitor analysis, but rather as a supplement to, or a cross-check for, common analytics tools such as Google Analytics. For larger websites, the analysis of log files also involves processing very large amounts of data, which in the long term requires a powerful IT infrastructure.