
Why don’t these two reports match?

Products

Webtrends Analytics 8.x
Webtrends Analytics 9.x

Cause

Two reports that conceptually appear to show the same data do not match. This perceived discrepancy can reduce confidence in the accuracy of Webtrends data, so it’s important to understand why two apparently similar reports might show different data.

Resolution

The following checklist covers the most likely reasons for report discrepancies, roughly in order from most to least common. This information pertains primarily to custom reports but much of it (particularly the profile-level points) will also apply to standard reports.

1. Are the reports on different profiles?

a. Different datasources

If we are feeding different sets of data into the profile, we can expect different sets of data in the resulting reports.

b. Different profile-level filters

Corollary to 1a. Differences in the dataset will generally yield differences in report data.

c. Different profile-level configuration

Several points of configuration at the profile level will affect the report data.

i. Timezones

Two otherwise identical profiles set to different timezones will show the same data with an offset. The reports will trend toward similarity in larger time periods but will diverge in smaller time periods.
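As an illustration (Webtrends does not expose this logic as code), a minimal Python sketch of how one hit lands on different report days under different profile timezones; the timestamp and offsets are hypothetical:

```python
from datetime import datetime, timedelta, timezone

# One hit at 02:30 UTC, bucketed into a daily report by two profiles
# set to different timezones (UTC and UTC-8).
hit_utc = datetime(2024, 3, 1, 2, 30, tzinfo=timezone.utc)

def report_day(hit, tz_offset_hours):
    """Return the calendar day the hit falls on in the profile's timezone."""
    local = hit.astimezone(timezone(timedelta(hours=tz_offset_hours)))
    return local.date().isoformat()

day_profile_a = report_day(hit_utc, 0)   # UTC profile
day_profile_b = report_day(hit_utc, -8)  # UTC-8 profile

print(day_profile_a)  # 2024-03-01
print(day_profile_b)  # 2024-02-29
```

Over a month, the two profiles see nearly the same hits; on any single day, hits near midnight fall on different sides of the day boundary.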

ii. Page view determination

Page view determination behaves like a filter, removing non-pageview data from Pages based dimensions or Page View measures. Two profiles with different page view determination settings will likely show different data in the Pages report and in any other reports based on pages.

iii. URL Rebuilding

URL Rebuilding will add uniqueness to data in the Pages and URL dimensions, so while the overall number of page views for two profiles with different URL Rebuilding settings might match, the page views for specific pages might not.

iv. URL Search & Replace

URL Search & Replace can have a dramatic effect on the data analyzed in a profile – it can add or remove uniqueness (increasing or decreasing the cardinality of URLs), add or remove query parameters, or change the nature of URLs completely. If two profiles don’t have exactly identical URL S&R definitions, we would not expect the resulting report data to match.
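A sketch of the cardinality effect, assuming a hypothetical rule that strips a session-id query parameter (the URLs and parameter name are illustrative, not from any real profile):

```python
import re

# Three hits on the same page, each with a unique session id in the URL.
raw_urls = [
    "/products/view?id=1&sess=aaa",
    "/products/view?id=1&sess=bbb",
    "/products/view?id=1&sess=ccc",
]

def apply_rule(url):
    # Hypothetical Search & Replace rule: remove the sess parameter.
    return re.sub(r"&sess=[^&]+", "", url)

without_rule = set(raw_urls)                 # profile with no S&R rule
with_rule = {apply_rule(u) for u in raw_urls}  # profile with the rule

print(len(without_rule))  # 3 unique URLs in one profile
print(len(with_rule))     # 1 unique URL in the other
```

The overall page view count is the same in both profiles, but the Pages report shows three line items in one and a single line item in the other.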

v. VH settings

Profile-level Visitor History settings will affect the availability of several pieces of data stored in the per-profile VH database. Any reports that use dimensions or measures based on VH values (usually WT.vr_*) are dependent on these settings, and differences here will cause those reports to diverge. For example, if one profile has VH Campaign Retention set to 30 days, and the other has Campaign Retention set to 90 days, the second will show considerably more campaign data.
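The retention example above can be reduced to a window check; the 60-day figure below is hypothetical:

```python
# A campaign touch 60 days before a conversion is still credited under a
# 90-day retention window but falls outside a 30-day window.
days_since_campaign_touch = 60

credited_30 = days_since_campaign_touch <= 30  # profile A: 30-day retention
credited_90 = days_since_campaign_touch <= 90  # profile B: 90-day retention

print(credited_30)  # False: the 30-day profile shows no campaign for this visit
print(credited_90)  # True: the 90-day profile credits the campaign
```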

vi. Sessionization

Two profiles with different sessionization settings will show similar or identical data in hit-based reports but will diverge in visit-based reports, since sessionization directly affects our ability to stitch together hits into visits (and to determine when that visit has ended).
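A minimal sketch of the effect, assuming a simple inactivity-timeout model of sessionization (the hit times and timeouts are illustrative):

```python
# The same hit stream from one visitor, sessionized with two different
# inactivity timeouts: identical hit counts, different visit counts.
hit_times = [0, 5, 40, 45, 90]  # minutes since first hit

def count_visits(times, timeout):
    visits = 1
    for prev, cur in zip(times, times[1:]):
        if cur - prev > timeout:
            visits += 1  # gap exceeds the timeout: a new visit begins
    return visits

print(len(hit_times))               # 5 hits in both profiles
print(count_visits(hit_times, 30))  # 3 visits with a 30-minute timeout
print(count_visits(hit_times, 60))  # 1 visit with a 60-minute timeout
```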

d. Different profile analysis cycles

Even two profiles that are identical in every way will not necessarily show identical data unless they have analyzed exactly the same data. If two identical profiles run five minutes apart, the second profile may have five minutes of additional data, so numbers will be slightly increased.

e. Different analysis history

Corollary to 1d. If one profile was set up in January and the other was set up in February, their yearly reports will not match. When two profiles diverge on historic data but not on current data, always check the history of the profiles to see if one has been collecting data longer than the other.

2. Are the two reports on the same profile?

a. Different report-level filters

Differences in report-level filters will alter the available dataset for each report in exactly the same way profile-level filters (1b) would.

b. Different dimensions

If two reports are based on similar but different dimensions, we would not necessarily expect the data to match. Remember that dimensions can be assigned a different name at the report level, so always check the report configuration to ensure that two dimensions with the same report name are actually the same dimension.

i. Dimensions are based on the same parameter but have different accumulation types (configured at dimension level)

Accumulation types determine when we collect the value from the dimension in question (first occurrence in visit, last occurrence in visit, most recent value, etc). If two dimensions are both based on WT.z_value, but one is set to collect (for example) at the start of the visit, and the other is set to collect the most recent value, then their values will diverge as visitors encounter different WT.z_value parameters in the course of a single visit.
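A sketch of that divergence, using hypothetical WT.z_value data for one visit:

```python
# Three hits in one visit each carry a different WT.z_value. Two dimensions
# on the same parameter diverge purely by accumulation type.
hits = [{"WT.z_value": "red"}, {"WT.z_value": "blue"}, {"WT.z_value": "green"}]

first_occurrence = hits[0]["WT.z_value"]   # "first occurrence in visit"
most_recent = hits[-1]["WT.z_value"]       # "most recent value"

print(first_occurrence)  # red
print(most_recent)       # green
```

The same visit is attributed to "red" in one report and "green" in the other.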

ii. Dimensions are based on the same parameter but differ in pattern matching, regular expression or translation file settings (configured at dimension level)

If two dimensions are both based on the same value but one is altered in some way, the overall report numbers may match but line items will differ. The Fixed Pattern or Regular Expression options can either change the values as they appear in the report or act as filters to exclude information that does not match. Similarly, the translation options will alter parameter values (and possibly cause some individual values to roll together).
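A sketch of a regular expression acting as both a rewrite and a filter; the pattern and values are hypothetical:

```python
import re

# The same parameter values feed a plain dimension and a dimension with a
# regular expression. Values that fail the pattern drop out of the report.
values = ["SKU-100", "SKU-200", "promo"]
pattern = re.compile(r"SKU-(\d+)")

plain_dimension = values  # no pattern: all values kept as-is
regex_dimension = [m.group(1) for v in values
                   if (m := pattern.fullmatch(v))]  # rewrite + filter

print(plain_dimension)  # ['SKU-100', 'SKU-200', 'promo']
print(regex_dimension)  # ['100', '200'] -- 'promo' excluded
```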

iii. Dimensions are based on the same parameter but one has “Multiple Values Delimited by Semicolons” checked (configured at dimension level)

Checking this box will have no effect when parameters never contain semicolons, but in cases where they do, checking the box will result in a report with more individual line items, as the semicolon is used to separate distinct parameter values. In these cases, hit-based measures should continue to total as before, but visit-based measures will likely be higher than in a report without this option set.
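A sketch of the line-item effect, using hypothetical parameter values:

```python
from collections import Counter

# Two hits: one carries "shoes;sale" in the parameter, one carries "hats".
hits = ["shoes;sale", "hats"]

option_off = Counter(hits)  # box unchecked: the raw value is one line item
option_on = Counter(v for hit in hits for v in hit.split(";"))  # box checked

print(len(option_off))  # 2 line items: 'shoes;sale' and 'hats'
print(len(option_on))   # 3 line items: 'shoes', 'sale', 'hats'
```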

iv. Dimensions are based on the same parameter but one has “Exclude activity without dimension data” checked (configured at report level)

The “Exclude activity without dimension data” (aka “Exclude nones”) option can have a dramatic effect on the report totals. If a certain parameter only appears in ten visits out of a hundred, a report based on that parameter which has this box unchecked will report metrics on all 100 visits, with 90 of those visits appearing under a “None” line item. A similar report with the box checked will report metrics on only the 10 visits which had a value for the parameter.
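The 10-in-100 example above can be sketched directly (the parameter name is hypothetical):

```python
# 100 visits; only 10 carry the parameter. "Exclude activity without
# dimension data" drops the other 90 from the report entirely.
visits = [{"WT.z_value": "promo"}] * 10 + [{}] * 90

def report(visits, exclude_nones):
    rows = {}
    for v in visits:
        key = v.get("WT.z_value", "None")
        if exclude_nones and key == "None":
            continue
        rows[key] = rows.get(key, 0) + 1
    return rows

unchecked = report(visits, exclude_nones=False)
checked = report(visits, exclude_nones=True)

print(sum(unchecked.values()))  # 100 visits, 90 of them under "None"
print(sum(checked.values()))    # 10 visits, no "None" line item
```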

v. Dimensions are based on the same parameter but one has “Parameter Contains Drilldown Data” checked (configured at report level)

As above, checking this box will have no effect if there is no drilldown delimiter present in the parameter. If there is a delimiter present, Webtrends will interpret the delimited values as parts of a drilldown. This will add drilldown layers in the report. Both hit- and visit-based measures will likely be similar or identical, but individual line items may appear completely different and may roll together if their source values share common drilldown levels.
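A sketch of the roll-up behavior, assuming a semicolon as the drilldown delimiter and hypothetical values:

```python
# Two parameter values that share common drilldown levels.
values = ["Europe;France;Paris", "Europe;France;Lyon"]

flat_items = set(values)                     # box unchecked: two flat items
drilldowns = [v.split(";") for v in values]  # box checked: nested levels
shared_top = {levels[0] for levels in drilldowns}

print(len(flat_items))  # 2 separate flat line items
print(drilldowns[0])    # ['Europe', 'France', 'Paris']
print(shared_top)       # {'Europe'}: both values roll up under one top level
```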

c. Different measures

If two reports are using similar but different measures, we would not expect their values to match. Again, bear in mind that measures (like dimensions) can be assigned a different name at the report level, so check report configuration to ensure that two measures that appear identical are actually the same measure.

i. Measures are based on the same value but set to different accumulation types (configured at measure level)

As in 2b(i), the accumulation type (first occurrence in visit, last occurrence in visit, most recent, etc) can have a dramatic effect on the measure’s values.

ii. Measures are based on the same value but differ in pattern matching, regular expression or translation file settings (configured at measure level)

As in 2b(ii), differences in measure configuration when one measure is using pattern matching, regular expression matching or translation will yield different results. Pattern matching and regular expression can either alter the value or act as filters, while translation can replace the value of the measure entirely.

iii. Measures are based on the same value but one has “Multiple Values Delimited by Semicolons” box checked (configured at measure level)

As in 2b(iii), a measure which has this box checked and which contains semicolons will result in different report values than the same measure without the box checked.

iv. Measures are based on the same value but one is set to sum across visit (configured at measure level)

Setting a measure to sum across the visit will fundamentally change the way the values for that measure are calculated, so if two measures are otherwise identical but one has this option set, their values will match only occasionally.

v. Measures are based on the same value but use a different measure method (configured at report level)

One of the report-level options for certain measures is “Method”, which can be set to sum, average, minimum, maximum, or count. Two measures with different methods for calculating values will show dramatically different results in reports. For example, the built-in Orders and Revenue measures are nearly identical; both are based on WT.tx_s, and they differ only in calculation method – Orders are set to Count, while Revenue is set to Sum.
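The Orders vs. Revenue example reduces to a one-line difference; the transaction values below are hypothetical:

```python
# The same WT.tx_s values produce Orders (Method = Count) and
# Revenue (Method = Sum) depending only on the measure method.
tx_s_values = [19.99, 45.00, 5.50]  # one value per order

orders = len(tx_s_values)               # Method = Count
revenue = round(sum(tx_s_values), 2)    # Method = Sum

print(orders)   # 3
print(revenue)  # 70.49
```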

vi. Measures are identical but added to report in a different order (configured at report level)

The results in Webtrends reports are sorted by default by the values in the first measure, so two reports with the same measures in a different order will sort their data differently. Trimming takes place based on this same default sorting, so if reports are trimming in any way, the values trimmed vs those retained will vary depending on the sort order. (See also 2d for more on trimming.)
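A sketch of how measure order changes which rows survive a trim to the top two; the rows and values are hypothetical:

```python
# (item, visits, page views): the same rows trimmed to the top 2 after
# sorting by whichever measure comes first in the report.
rows = [("A", 100, 1), ("B", 50, 9), ("C", 10, 5)]

top2_visits_first = sorted(rows, key=lambda r: r[1], reverse=True)[:2]
top2_views_first = sorted(rows, key=lambda r: r[2], reverse=True)[:2]

print([r[0] for r in top2_visits_first])  # ['A', 'B']
print([r[0] for r in top2_views_first])   # ['B', 'C']
```

The same underlying data survives as different line items depending solely on which measure was added first.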

d. One or more report or analysis tables may be trimming

Unless the two reports are identical in every way, they won’t necessarily trim in exactly the same way. If two otherwise similar reports do not contain the same data, check the table limits on both to ensure that they both have the ability to display the same amount of data and that data is not being trimmed out of one report and not the other. In most cases, the totals on reports affected in this way will be identical (since any missing items should be summarized in an “Other” line item, but individual line items may vary from report to report). See also 2c(vi) for the effect of measure order and sorting on trimming.