
Detecting spider and bot traffic through reports

Products

Webtrends Analytics 8.x
Webtrends Analytics 9.x

Cause

You suspect recent increases in your traffic may be the result of automated processes such as site testing, bots, or site indexing.

Resolution

Two reports can help identify automated visits to a web site and determine their impact:

1. Hits Trend
In the Complete View template, navigate to Site Performance > Technical Statistics > Hits Trend. From the calendar, select the daily view for a day that should show typical visitor metrics. For many web sites, normal traffic roughly resembles a bell curve: it peaks, possibly several times, and consistently drops off at times when most visitors would not be expected. If automated traffic is frequenting the site, it can instead produce a plateau that does not decay throughout the day. Because machines do not care about the time of day, this traffic can start high and stay high, producing a flat, table-top shape in the Hits Trend report. (A rough way to check this pattern, and the Top Visitors pattern below, is sketched after this list.)

2. Top Visitors
Also in the Complete View template, under Marketing > Visitors, the Top Visitors report ranks visitors by the number of visits made and also displays the number of hits each visitor generated. Automated processes create a large number of hits without ending their visit, so one visitor may have only a few visits but thousands of hits.
Note: The Top Visitors report sorts by visits in descending order by default. Click the “Hits” column header to sort visitors by hits in descending order.

The Top Visitors report is subject to table limits, so a visitor with few visits may not necessarily appear in the report. To work around this limitation, copy the profile and set the “From the Following Date:” value to the date from which traffic is suspected to be coming from automated processes.
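
Both of the checks above can also be approximated directly against a raw web server access log. The following Python sketch is only an illustration: the log path, the Common Log Format parsing, the 30-minute visit timeout, and the 0.5 flatness threshold are assumptions for this example, not Webtrends settings, and its results will not match Webtrends report figures exactly.

    # Rough, independent sanity check against a raw web server access log in
    # Common Log Format. The log path, the 30-minute visit timeout, and the
    # 0.5 "flatness" threshold are illustrative assumptions, not Webtrends settings.
    import re
    from collections import defaultdict
    from datetime import datetime, timedelta

    LOG_PATH = "access.log"                               # hypothetical log file
    LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')  # client IP and timestamp
    VISIT_TIMEOUT = timedelta(minutes=30)                 # assumed inactivity gap between visits

    hits_per_hour = defaultdict(int)   # hour of day -> hits (Hits Trend check)
    hits_per_ip = defaultdict(int)     # client IP   -> hits (Top Visitors check)
    visits_per_ip = defaultdict(int)   # client IP   -> approximate visits
    last_seen = {}

    with open(LOG_PATH) as log:
        for line in log:
            match = LINE_RE.match(line)
            if not match:
                continue
            ip, ts_text = match.groups()
            ts = datetime.strptime(ts_text.split()[0], "%d/%b/%Y:%H:%M:%S")
            hits_per_hour[ts.hour] += 1
            hits_per_ip[ip] += 1
            if ip not in last_seen or ts - last_seen[ip] > VISIT_TIMEOUT:
                visits_per_ip[ip] += 1    # a new visit starts after a long gap
            last_seen[ip] = ts

    # A bell-shaped day has quiet hours well below the peak; a plateau does not.
    if hits_per_hour:
        peak, quiet = max(hits_per_hour.values()), min(hits_per_hour.values())
        if quiet > 0.5 * peak:
            print("Hourly traffic never drops off; possible automated traffic.")

    # Clients with very few visits but very many hits match the Top Visitors pattern.
    for ip, hits in sorted(hits_per_ip.items(), key=lambda kv: kv[1], reverse=True)[:10]:
        print(f"{ip}: {hits} hits over {visits_per_ip[ip]} visit(s)")

Any client that generates thousands of hits over only a few visits, or that keeps hourly traffic flat around the clock, is a candidate for the filtering described under More Information below.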

More Information

Webtrends includes a built-in “All spiders and robots” browsers filter, which is based on the contents of the browsers.ini file and its list of publicly known bots and spiders. This filter cannot cover every form of automated traffic; internal testing and denial-of-service attacks, for example, will not be removed by it.

To remove this unwanted data from reports, identify the IP addresses of the unwanted visitors and filter out their entries prior to analysis.
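
If you prefer to scrub the raw logs directly rather than rely on a profile filter, a simple pre-processing pass can drop the offending lines before analysis. The sketch below assumes Common Log Format; the file names and addresses are placeholders, so substitute the IP addresses identified in the Top Visitors report.

    # Minimal pre-analysis scrub: drop log lines from unwanted IP addresses before
    # the profile analyzes the file. File names and addresses are placeholders;
    # substitute the IPs identified in the Top Visitors report.
    UNWANTED_IPS = {"192.0.2.10", "198.51.100.7"}  # example (documentation) addresses

    with open("access.log") as src, open("access.filtered.log", "w") as dst:
        for line in src:
            client_ip = line.split(" ", 1)[0]      # first field in Common Log Format
            if client_ip not in UNWANTED_IPS:
                dst.write(line)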

One indicator of automated traffic is that these clients rarely accept cookies. In cases where they do accept cookies, the client’s IP address will appear as the first part of the cookie value displayed in the Top Visitors report.
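
As a quick illustration of that heuristic, the suspect visitor’s IP address can be compared against the leading portion of the cookie value shown in the report; the values below are invented examples.

    # The values below are invented examples; compare against the cookie value and
    # IP address shown for a suspect visitor in the Top Visitors report.
    def cookie_value_matches_ip(cookie_value: str, client_ip: str) -> bool:
        return cookie_value.startswith(client_ip)

    print(cookie_value_matches_ip("203.0.113.5-1698701234.567", "203.0.113.5"))  # True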