How does client-side data tagging affect page view counts?

Products

Webtrends Analytics 9.x
Webtrends Analytics 8.x

Issue

How does client-side data tagging affect page view counts?

Resolution

Organizations are increasingly implementing an alternative technique to collect web site traffic information, rather than relying on web server log files. This technique is called client-side data collection, or “data tagging” for short. Popularized by Webtrends and other web analytics vendors that provide hosted services, data tagging solves many problems associated with web server log file analysis. With data tagging, web traffic data is more accurate because traffic normally hidden by cache or proxy servers is tracked. IT administration is eased because data collection is centralized in one location, rather than site data being spread across several log files from multiple web servers that may also be geographically dispersed. Web data can also be collected from specialized applications, such as application servers and browser applications. Despite these benefits, client-side collection has a few drawbacks. Implementing data tagging requires some development work to ensure that data tags are inserted and maintained on web pages.

What is a page view? A hit to any file classified as a page (such as html, htm, php, and aspx pages).

Why do client-side data tagging solutions sometimes produce lower page view counts than traditional server-side log file web analytics? The common causes are listed below.

Server-Side Redirects: These are captured in web server log files but not by tagging, because the redirect happens on the server and no page is delivered to the browser to run the tag. For example, when a visitor types in http://webtrends.com, they are redirected to http://www.webtrends.com. This shows as two page views in your log files (one for http://webtrends.com and one for http://www.webtrends.com) but counts as only one page view for tagging (the landing page, http://www.webtrends.com).
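The redirect discrepancy can be illustrated with a minimal sketch. The record layout (URL plus HTTP status code) and the function names are assumptions made for illustration; they do not reflect any Webtrends log format or API.

```python
# Hypothetical, simplified log records: (requested URL, HTTP status code).
log_hits = [
    ("http://webtrends.com/", 301),      # server-side redirect response
    ("http://www.webtrends.com/", 200),  # landing page delivered to the browser
]


def log_file_page_views(hits):
    """Count every logged page request, including redirect responses."""
    return len(hits)


def tagged_page_views(hits):
    """Count only requests that delivered a page (status 200).

    A redirect response never delivers a page to the browser, so the
    JavaScript tag has no chance to run for it.
    """
    return sum(1 for _url, status in hits if status == 200)


print(log_file_page_views(log_hits))  # 2
print(tagged_page_views(log_hits))    # 1
```

The same visit yields two page views from the log file but only one from the tag.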

Downloads: Downloads (.xls, .pdf, .doc, .swf, etc.) may be counted as page views by your log file solution, but by default they are not tracked by a JavaScript tag. If these downloads are not tracked, they won’t appear as page views in your tagging solution.
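As a rough sketch of how file-type settings change the count, the example below classifies requests by extension. The extension sets are taken from the examples in this article; the function names and the `count_downloads` switch are hypothetical and do not correspond to actual product settings.

```python
import posixpath
from urllib.parse import urlparse

# Extension lists taken from the examples in this article (assumed, not exhaustive).
PAGE_EXTENSIONS = {".html", ".htm", ".php", ".aspx"}
DOWNLOAD_EXTENSIONS = {".xls", ".pdf", ".doc", ".swf"}


def classify(url):
    """Return 'page', 'download', or 'other' based on the file extension."""
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    if extension in PAGE_EXTENSIONS:
        return "page"
    if extension in DOWNLOAD_EXTENSIONS:
        return "download"
    return "other"


def page_views(urls, count_downloads=False):
    """Count page views; optionally treat downloads as page views too."""
    counted = {"page", "download"} if count_downloads else {"page"}
    return sum(1 for url in urls if classify(url) in counted)


hits = ["/index.html", "/report.pdf", "/logo.gif", "/pricing.aspx"]
print(page_views(hits))                        # 2 (downloads not counted)
print(page_views(hits, count_downloads=True))  # 3 (report.pdf counted as a page view)
```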

File Type Configuration: Some web analytics tools are configured to count certain file types as page views. For example, images that signify a certain step in a process may be given page view status in order to complete a process or scenario. Some application processes use a servlet, so all pages look the same, and each step in the process is identified by passing a different image. The process is then measured by excluding all images except the ones that represent pages. This ultimately results in double counting of pages: one for the servlet and one for the image.

Bots, Crawlers, and Spiders: These terms all refer to the same thing: an automated program that goes from website to website, caching and processing pages for search engines. A spider looks at all the pages of your website and uses that information to rank you in search engines (how high you appear in search results) and to cache a copy of your pages on the search engine’s servers, both for quick reference and in case your site ever goes down. Spiders jump from link to link on the Internet and run endlessly, so even if you never submit your website to a search engine, odds are your site will still be spidered. Most sites experience some type of “log-file bloat.” This refers to the extra page views contained in a log file data source that would naturally be excluded by the more accurate client-side data tagging solution. Spiders and bots that are not counted by client-side data tagging solutions usually appear in log files to some degree. The reason is that more than 300 spiders, robots, and crawlers are currently hitting sites on the Internet, and log file solutions rely on filters in their configuration settings to exclude known spiders. Because anyone may write a crawler and release it on the Internet at any time, these manually applied configuration settings are usually out of date and do not represent the spiders active on the Internet at any given time.
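The limitation of manually maintained spider filters can be sketched as follows. The signature list, the user-agent strings, and the crawler name "BrandNewCrawler" are all hypothetical; real log file products use their own filter configuration.

```python
# Hypothetical, manually maintained filter list of known spider signatures.
KNOWN_BOT_SIGNATURES = ["googlebot", "bingbot"]


def is_known_bot(user_agent):
    """Return True if the user agent matches a signature on the filter list."""
    lowered = user_agent.lower()
    return any(signature in lowered for signature in KNOWN_BOT_SIGNATURES)


# Hypothetical log records: (requested page, user-agent string).
log_hits = [
    ("/index.html", "Mozilla/5.0 (Windows NT 10.0) Firefox/115.0"),
    ("/index.html", "Mozilla/5.0 (compatible; Googlebot/2.1)"),
    ("/index.html", "BrandNewCrawler/0.1"),  # new crawler, not on the list yet
]

page_views = [hit for hit in log_hits if not is_known_bot(hit[1])]
print(len(page_views))  # 2: one real visitor plus the unlisted crawler
```

The unlisted crawler still counts as a page view, which is the kind of “log-file bloat” described above.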

Web Monitoring Automated Tools & Scripts: Most enterprises have automated scripts and spiders that are used to test availability as well as periodically test application function and record uptime. These solutions all produce page views in a log file scenario but do not produce page views in a data tagging solution. Also, if the site includes frames, the frame pages will be counted by a log file solution which will also create “log-file bloat.”

Filters: New implementations of an analytics product can also result in lower page view counts. Most often this is caused by a filter set on the customer’s profile. Because log file analysis products count page views regardless of origin, this can cause a discrepancy between log file solutions and data tagging solutions. To avoid these problems, the customer can use SmartSource, which acts somewhat like an auto-filter, because only the pages of interest are tagged.

The Tag doesn’t execute: It’s possible for a visitor to click a link to go to another page or click stop in the browser before the Webtrends tag executes, for example if the tag is at the bottom of the page or the site has poor page performance. Customers concerned about this type of situation may elect to put the Webtrends tag at the top of the page.

The image request doesn’t reach the data collection servers: It’s possible for the JavaScript to execute but for the resulting image request never to reach the data collection servers. This could be the result of a proxy server or spyware program blocking the image request from being sent to the tag servers. For example, suppose you are tracking an internal website accessed by employees behind your company’s firewall. The JavaScript could execute but be blocked from reaching the data collection servers because the company’s proxy server doesn’t allow traffic to them.

JavaScript Disabled: If JavaScript isn’t enabled in the visitor’s browser, none of the information gathered by the script (user agent, referrer, etc.) is passed on. Estimates of the share of people who don’t use JavaScript range from under 2% to as high as 10% in some cases, depending on your visitors. Every website has a different audience with unique characteristics.

The Webtrends tag is not called from all the pages of your site: Your log files collect all requests to your web servers, whereas tagging collects information only from the pages containing the tag. If the tag is missing from certain pages, those page views will not be counted.

More Information

The JavaScript tag usually handles the task of automatically gathering the information, formatting the HTTP request, and then passing the data to the SmartSource Data Collection servers. In some cases, the JavaScript tag may not be the best method for implementing Webtrends On Demand tracking. The following are some examples of scenarios where it may be necessary to use server-side code to compose the HTTP request:

  • Your application server and/or content management server does not support the inclusion of JavaScript.
  • You are tracking commerce or other dynamic events, where the information you need to track is known to the application server, but not to the HTML page.
  • Your visitors’ browsers do not support JavaScript, or the visitors have disabled JavaScript support in their browsers.
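
For scenarios like these, composing the collection request on the server might look roughly like the sketch below. The collector URL and every parameter name here are hypothetical placeholders, not the actual SmartSource Data Collection endpoint or parameter set; consult the Webtrends documentation for the real values. The sketch only builds the request URL and does not send it.

```python
from urllib.parse import urlencode

# Hypothetical collection endpoint; NOT the real SmartSource server address.
COLLECTOR_URL = "https://collector.example.com/hit.gif"


def build_collection_url(page_uri, page_title, client_ip, user_agent):
    """Compose the image-request URL that the JavaScript tag would normally build.

    All query parameter names below are illustrative assumptions.
    """
    params = {
        "uri": page_uri,      # page being reported
        "title": page_title,  # page title known to the application server
        "ip": client_ip,      # visitor address, which a script cannot supply
        "ua": user_agent,     # visitor's browser identification
    }
    return COLLECTOR_URL + "?" + urlencode(params)


url = build_collection_url(
    "/checkout/complete", "Order Complete", "203.0.113.7", "Mozilla/5.0"
)
print(url)
```

The application server would then issue a request to this URL (or emit it as an image source in the page) each time the event of interest occurs.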