Why do the SmartSource Data Collector logs return different report metrics than ones based on web server logs?

Products

Webtrends Analytics 9.x
Webtrends Analytics 8.x

Issue

Why do the SmartSource Data Collector logs return different report metrics than ones based on web server logs?

Resolution

The technologies used to collect log file data differ greatly and can result in discrepancies, as explained below.

  • Counting Method – Traditional log analysis counts hits from web server log files; client-side analysis counts users (identified by cookies) and HTML pages viewed from browsers.
  • Crawlers/Spiders – Log file analysis can be configured to count hits from spiders and other crawlers; because such hits do not cause the JavaScript to be executed, client-side analysis does not count them.
  • Proxy Servers and Page Caching – With log file analysis, pages cached by proxy servers may not be counted; with client-side, all page views are counted (each page view contains a random number that prevents proxy servers from caching it).
  • Downloads and non-HTML Pages – With client-side, only viewable HTML pages can be counted (.html, .htm, .asp, etc.); log file analysis software can be configured to count all downloads and links to non-HTML files.
  • Error Pages – Error pages are counted and tabulated in a separate table by log file analysis; client-side counts viewable error pages only if they contain its JavaScript.
  • Redirects – Log file analysis software can be configured to count all redirects, but the JavaScript cannot be embedded in a virtual redirect, so client-side analysis cannot count them. This is especially important for sites that use redirects on their home page. If the first hit to a site is a virtual redirect, client-side analysis will never know the true referrer. For client-side analysis to capture referring information accurately, the script must be present on the first page viewed (even if it is an invisible redirect). If the first hit is in fact a redirect, it should be a hard-coded redirect.
  • New Visitor/Sessions – Log file analysis software can track new visitors and sessions from the log files and can use special cookies to further track visitor sessions; for client-side analysis to track new visitors and sessions, browsers must have JavaScript and cookies enabled.
  • JavaScript code not loading – If a user requests a page but clicks a link or returns to the previous page before the page finishes loading (for example, because it is loading slowly), the web server still logs the request. However, the JavaScript never executes on the page, so no request is recorded in the SDC/OnDemand log files. This happens most often when a site is slow: users tend to click links before the page loads completely, or leave the site before the JavaScript code is executed.
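
To see how much of a discrepancy the factors above might explain, it can help to sort the hits in a web server log by type. The following is a minimal sketch, not part of any Webtrends product. It assumes the server writes the Apache combined log format, and the function names (classify_hit, summarize), the crawler markers, and the page extensions are all hypothetical choices for illustration. It also assumes that any 3xx response is a redirect carrying no JavaScript tag, which may not hold for every site.

```python
import re
from collections import Counter

# Apache combined log format (an assumption about the server's configuration):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical heuristics; adjust them for your own site.
BOT_MARKERS = ("bot", "spider", "crawl")
PAGE_EXTENSIONS = (".html", ".htm", ".asp")


def classify_hit(line):
    """Return a rough category for one server log line."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return "unparsed"
    agent = match.group("agent").lower()
    if any(marker in agent for marker in BOT_MARKERS):
        return "crawler"      # crawlers do not trigger the JavaScript tag
    status = int(match.group("status"))
    if 300 <= status < 400:
        return "redirect"     # assumed to carry no JavaScript tag
    if status >= 400:
        return "error"        # counted client-side only if the error page is tagged
    path = match.group("path").split("?", 1)[0].lower()
    if path.endswith("/") or path.endswith(PAGE_EXTENSIONS):
        return "page"         # the kind of hit client-side analysis can count
    return "non_html"         # downloads, images, and other non-HTML files


def summarize(lines):
    """Count server log hits per category."""
    return Counter(classify_hit(line) for line in lines)


# Made-up sample lines, for illustration only.
SAMPLE = [
    '10.0.0.1 - - [01/Jan/2010:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [01/Jan/2010:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Googlebot/2.1"',
    '10.0.0.1 - - [01/Jan/2010:10:00:02 +0000] "GET /files/report.pdf HTTP/1.1" 200 90211 "-" "Mozilla/5.0"',
    '10.0.0.3 - - [01/Jan/2010:10:00:03 +0000] "GET /old HTTP/1.1" 302 0 "-" "Mozilla/5.0"',
    '10.0.0.3 - - [01/Jan/2010:10:00:04 +0000] "GET /missing.htm HTTP/1.1" 404 312 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Only the hits classified as "page" are candidates for the client-side count, and even those may be missed when the visitor has JavaScript or cookies disabled or leaves before the script runs. The server log alone cannot reveal those cases, so the sketch gives an upper bound rather than an exact reconciliation.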