What is Analysis Throttling and how can it be optimized?

Products

Webtrends Analytics 8.x
Webtrends Analytics 9.x

Cause

Analysis Throttling is a setting, available globally and per profile, that controls how much log data is processed in one analysis cycle. It should not be confused with the total amount of data processed over the course of an analysis job. During each cycle, the engine reads the specified amount of log data, beginning at the start of the data source (for new profiles) or at the point where the previous analysis left off (for existing profiles) and continuing through the end of that time span. The data is parsed and sorted, configuration options are applied, and the report tables are then built or updated. If unanalyzed log data remains, a new cycle begins, and the process repeats until no log data remains to be analyzed.
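The cycle described above can be pictured as a simple loop. The following Python sketch only illustrates the concept and is not Webtrends code; the function name `run_analysis`, its parameters and the use of timestamps to mark positions in the log data are assumptions made for the example.

```python
from datetime import datetime, timedelta

def run_analysis(last_analyzed, end_of_data, max_per_cycle):
    """Return the (start, end) time span handled by each cycle.

    Hypothetical model: every cycle reads at most `max_per_cycle` of log
    data, starting where the previous cycle left off.
    """
    cycles = []
    position = last_analyzed
    while position < end_of_data:
        cycle_end = min(position + max_per_cycle, end_of_data)
        # In the real engine, this is the point at which the data would be
        # parsed, sorted and written to the report tables.
        cycles.append((position, cycle_end))
        position = cycle_end
    return cycles

# Three and a half days of unanalyzed data, throttled to one day per cycle.
cycles = run_analysis(datetime(2010, 1, 1), datetime(2010, 1, 4, 12), timedelta(days=1))
print(len(cycles))  # 4
```

The throttling value changes the number of cycles, not the total span of data that is eventually analyzed.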

Note that Analysis Throttling does not specify how much data will be analyzed overall, but rather how much is analyzed per cycle (the exceptions being when the option to rerun analysis is unchecked or the stop-at-midnight functionality is enabled). The extent of data analyzed depends on which log files are available in the data source location and on how the data source path is defined. For example, the path could be defined with a wildcard pattern such as ex2010*.log, in which case only log files whose names begin with ex2010 would be included.
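To illustrate how a pattern such as ex2010*.log narrows the set of log files, the sketch below uses Python's shell-style wildcard matching. The file names are hypothetical, and Webtrends' own path matching may differ in detail from Python's `fnmatch` module.

```python
from fnmatch import fnmatchcase

# Hypothetical file names in a data source folder.
log_files = [
    "ex20091231.log",
    "ex20100101.log",
    "ex20100102.log",
    "ex20110101.log",
]
pattern = "ex2010*.log"

# Keep only the files whose names match the pattern.
selected = [name for name in log_files if fnmatchcase(name, pattern)]
print(selected)  # ['ex20100101.log', 'ex20100102.log']
```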

Resolution

The option to configure Analysis Throttling for all profiles is found under Web Analysis > Options > Analysis > Analysis Throttling. To apply it on a per-profile basis, edit the profile and navigate to Analysis > Analysis Throttling.

The Maximum Data to Analyze section allows users to specify how much log data is processed per cycle. The system default is one day. Other options include analyzing all available log data or a user-defined amount expressed in days, hours or minutes.

The default system value of one day suffices for many installations in which the volume of log data for this period, and the amount of memory required to process it, is not so great as to degrade performance or cause analysis failure.

If “Analyze all available log data” is selected, the application will attempt to analyze all available log data in a single cycle.

If a custom value is defined, performance may improve or degrade, depending on the volume of data to be analyzed. When the custom value is selected the following additional options are available:

-Rerun analysis immediately after maximum amount of data is analyzed
-Update report data between each analysis period

The first option instructs the application to start another cycle as soon as the current one completes; if it is not selected, analysis finishes after a single cycle. This option should almost always be selected, except when only a predetermined amount of data is to be analyzed. The second option writes the results of each cycle to the report database, which makes it possible to view report data incrementally while analysis is under way, assuming multiple cycles are required.
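The combined effect of the two options can be sketched with a small model. This is a simplified illustration under stated assumptions, not a description of the engine's internals: the function `analyze`, the abstract "units" of log data and the `fail_at_cycle` parameter are all hypothetical, and the model assumes that results not yet written to the report database are lost when analysis fails.

```python
def analyze(total_units, per_cycle, rerun=True, update_between=True,
            fail_at_cycle=None):
    """Return how many units of log data end up in the report database.

    `fail_at_cycle` (1-based) simulates an analysis failure in that cycle.
    """
    saved = 0    # results already written to the report database
    pending = 0  # results analyzed but not yet written
    cycle = 0
    while saved + pending < total_units:
        cycle += 1
        if cycle == fail_at_cycle:
            return saved  # unwritten progress is lost
        pending += min(per_cycle, total_units - saved - pending)
        if update_between:
            saved += pending  # write results after every cycle
            pending = 0
        if not rerun:
            break  # stop after a single cycle
    return saved + pending  # job ended normally, so results are written

print(analyze(10, 2))                                         # 10
print(analyze(10, 2, rerun=False))                            # 2
print(analyze(10, 2, fail_at_cycle=4))                        # 6
print(analyze(10, 2, update_between=False, fail_at_cycle=4))  # 0
```

In this model, a failure in the fourth cycle keeps three cycles of results when updates are written between cycles and keeps nothing when they are not.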



More Information

What are the optimal settings for Analysis Throttling?
The system default should be sufficient in many environments. However, there are times when selecting other options will improve performance and efficiency.

  • Select “Analyze all available log data” when the data source’s log files each contain a minimal amount of data and their combined volume, from the beginning to the end of the data source, does not impose a significant burden on system resources. This would be appropriate, for example, for a profile whose data source contains a year of daily log files averaging 20 KB each. The system default could be used, but the application would then process one day at a time and return for the next day, for an entire year, adding overhead for every daily cycle when the whole analysis could otherwise complete in one pass.
  • Select “Analyze a maximum of X days” when the data source’s log files are small enough to allow several days to be processed at once without issue, but the total volume increases the risk of failure when analyzing all available log data. This might be used, for example, to analyze an archive of several years of daily log files averaging 20 MB each. Depending on system resources, analyzing ten days or more at a time may be optimal.
  • Select “Analyze a maximum of X hours” when log files are so large that attempts to analyze one day of log data fail.
  • Select “Analyze a maximum of X minutes” when log files are so large that attempts to analyze one hour of log data fail.
  • Select “Update report data between each analysis period” when you want the assurance that, in the event of analysis failure, updates to the report database persist and the next attempt at analysis will begin from the point at which it was last saved.
  • Do not select “Update report data between each analysis period” when completion of a certain profile’s analysis is so time-sensitive that it is preferable to forgo updating the report database at the end of each analysis cycle (and there is certainty that the analysis will complete successfully). It must be emphasized that if analysis fails, all progress for the current analysis will be lost.
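The trade-off in the first two recommendations comes down to how many cycles a given setting produces. As a rough illustration for a year of daily log files (the helper `cycles_needed` is a hypothetical name, and the count ignores everything except the time span):

```python
import math

def cycles_needed(total_days, days_per_cycle):
    """Number of analysis cycles needed to cover `total_days` of log data."""
    return math.ceil(total_days / days_per_cycle)

print(cycles_needed(365, 1))    # 365 (system default of one day)
print(cycles_needed(365, 10))   # 37  (ten days per cycle)
print(cycles_needed(365, 365))  # 1   (all available data in one pass)
```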
The results of changing Analysis Throttling depend heavily on the system resources available. The examples above may hold true on a server that only meets the minimum system requirements, but on a machine with ample resources the threshold at which analysis problems become possible will almost certainly be higher. When assessing the capabilities of a Webtrends installation, it is recommended to run test analyses at varied levels of analysis throttling to determine where the “sweet spot” lies for greatest stability and efficiency. Similarly, when issues with analysis arise, it is recommended to reduce the Analysis Throttling settings incrementally or halve them (as they are usually too high, rather than too low) until analysis completes successfully and consistently.
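The halving approach can be written down as a short procedure. The sketch below is only an outline of the manual troubleshooting steps, not an automated Webtrends feature: `find_stable_setting` is a hypothetical helper, and `analysis_succeeds` stands in for running a test analysis at a given setting and checking whether it completed.

```python
def find_stable_setting(start_minutes, analysis_succeeds, minimum_minutes=1):
    """Halve the throttling value until a test analysis succeeds.

    Returns the first value (in minutes) that works, or None if even the
    smallest value fails.
    """
    value = start_minutes
    while value >= minimum_minutes:
        if analysis_succeeds(value):
            return value
        value //= 2  # too much data per cycle, so try half
    return None

# Pretend that analysis only completes when a cycle covers six hours or less.
result = find_stable_setting(24 * 60, lambda minutes: minutes <= 360)
print(result)  # 360
```

In practice, each test should be repeated at the chosen value to confirm that analysis completes consistently, not just once.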