Browser Cache and Test Results Interpretation
Probably the most critical stage in any testing project is interpreting the test results accurately and combining that interpretation with recommendations about where problems are coming from and how the system can be improved.
The accuracy and reliability of the results depend on how closely the tests resemble real life and the real user experience (i.e. the user model), but there are a few straightforward ways of simplifying system analysis and improving the readability and understanding of the results.
One way is to start with a set of very simple user transactions, analyze how the system behaves with those and only increase the complexity of the tests after the interpretation of previous sets of results has been completed successfully.
Another is to narrow down the scope of the testing at first by dividing the problem. (NB: let me stress that this approach is a starting point; I am not suggesting that the overall scope of the testing should remain narrow.) For example, even though in real life users may connect from various network locations, unless we can confirm that the system is capable of supporting the required user base, it would not matter if they were all working from the “server room”. In other words, there is little point in starting to examine the network’s impact on the overall user experience if the system will not cope with the requisite number of users.
Results can be easily obscured, often as a consequence of introducing too much “uncontrolled” variability too early in the test model. Search transactions are a good example of this, as real users would normally perform a number of different searches, thereby adding to the complexity of the test. A good test, however, would have these searches under tight control so that any cause/consequence analysis can be performed easily.
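As a concrete illustration of keeping searches under tight control, the sketch below (a hypothetical harness fragment, not part of any Scapa product) draws every virtual user's queries from a small, fixed pool with a fixed random seed, so each test run replays exactly the same input data:

```python
import random

# Hypothetical controlled search pool: instead of letting each virtual user
# invent arbitrary queries, every run draws from the same small, known set.
SEARCH_TERMS = ["printer offline", "password reset", "vpn access", "email quota"]

def controlled_searches(n_users, searches_per_user, seed=42):
    """Assign each simulated user a repeatable sequence of search terms."""
    rng = random.Random(seed)  # fixed seed: identical schedule on every run
    return {
        user: [rng.choice(SEARCH_TERMS) for _ in range(searches_per_user)]
        for user in range(n_users)
    }

schedule = controlled_searches(n_users=3, searches_per_user=2)
# Re-running yields the identical schedule, so any change in response times
# can be attributed to the system under test, not to different input data.
assert schedule == controlled_searches(n_users=3, searches_per_user=2)
```

Because the inputs are held constant between runs, any cause/consequence analysis can compare like with like.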
It is not unusual for a web performance testing project to combine all of the above types of “error” by trying to simulate the browser cache’s contribution to the overall user experience from the very start. The more successfully this simulation is done (with tests handling data variability, variability in the volumes of data being retrieved, and random caching levels), the more difficult it becomes to provide any meaningful interpretation of the results beyond a simple pass/fail. It is also hard to see whether there are any potential system weaknesses, or to pinpoint the source of any problems that are present.
Perfect load is random, but random load will produce random results that are hard to understand.
Most typical multitier web-delivered applications (BMC Remedy ITSM is a good example) have two sides to their behavior, capacity and scalability:
- Serving static content (images, static html, css)
- Serving dynamic content (for Remedy ITSM, Back Channel calls that talk to the Remedy server).
Web servers are very good at serving static content, and if problems are identified (and separated from any surrounding “noise”) these are normally easy to address. On some systems, static content is even delivered from a dedicated web server, in which case the design of the user experience or system performance test tends to reflect the underlying architecture. This allows a separate analysis of the two different types of content, even if these are delivered simultaneously to the end user browser.
A similar approach can be deployed – even if there are no dedicated image servers – by first creating, executing and analyzing a set of tests that assume the static content is always in the browser cache (or that simply ignore it!), so as not to obscure the results. In this case, we remove any “noise” created by delivering static content. While coming from the same system, static and dynamic content are processed and delivered in very different ways, with each of them having different characteristics, capacity limitations and “weaknesses”. Static content might hit limitations on the network, whereas dynamic content might highlight problems in the application server or database. Having the different types of content separated means that we can analyze their contribution individually before combining them into more realistic scenarios.
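One simple way to build such a test is to filter a recorded request list so that static content (assumed to be served from the browser cache) is dropped, leaving only the dynamic traffic. The sketch below is a minimal illustration under that assumption; the URLs, extension list and helper names are hypothetical:

```python
from urllib.parse import urlparse

# Hypothetical set of file extensions treated as static, cacheable content.
STATIC_EXTENSIONS = {".gif", ".png", ".jpg", ".css", ".js", ".html"}

def is_static(url):
    """Classify a request as static content by its path extension."""
    path = urlparse(url).path
    return any(path.endswith(ext) for ext in STATIC_EXTENSIONS)

def dynamic_only(recorded_urls):
    """Keep only dynamic requests for the initial, noise-free test runs."""
    return [u for u in recorded_urls if not is_static(u)]

recorded = [
    "http://example.com/logo.png",             # static: assumed cached
    "http://example.com/styles/main.css",      # static: assumed cached
    "http://example.com/BackChannel?param=1",  # dynamic, Remedy-style call
]
print(dynamic_only(recorded))  # only the BackChannel request survives
```

The same classifier can later be inverted to build a static-content-only test, so each side of the system's behavior is measured on its own before the two are recombined.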
Having the ability to control and measure them separately provides a powerful tool for analyzing the system and for answering questions like:
- “What is the most that we can get out of the system?”
- “Is it a network or a database issue?”
- “Will adding another web server help?”
- “What is the weakest link in our architecture?”
- and many more, similar questions.
It also makes the testing model more flexible, making it easy to compare the end user experience while taking into account different levels of client-side caching. Most of all, it makes any results produced easy to relate to the system being tested, yielding more relevant information and a greater degree of understanding than a simple pass or fail.
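To show how separating the two content types supports this kind of comparison, here is a back-of-envelope model of traffic per page view as the assumed client-side cache hit ratio varies. The payload sizes are invented for illustration only:

```python
# Assumed per-page payloads (hypothetical figures, for illustration only).
STATIC_BYTES_PER_PAGE = 180_000   # images, css, js
DYNAMIC_BYTES_PER_PAGE = 25_000   # back-channel / dynamic data

def bytes_per_page(cache_hit_ratio):
    """Bytes fetched per page view for a given client-side cache hit ratio.

    Only the static portion is cacheable; dynamic content is always fetched.
    """
    static_fetched = STATIC_BYTES_PER_PAGE * (1 - cache_hit_ratio)
    return static_fetched + DYNAMIC_BYTES_PER_PAGE

for ratio in (0.0, 0.5, 1.0):
    print(f"cache hit {ratio:.0%}: {bytes_per_page(ratio):,.0f} bytes/page")
```

Running the separated tests at a few such caching levels turns "how much does the cache matter?" from a guess into a measured curve.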
Scapa Expedite Methodology
Our testing approach is outlined in our Scapa Expedite Methodology white papers, which can be found on our website. The methodology is based on a sequence of standard test activities, each of which has specific objectives and can be applied singly or in combination at the points in a project where most benefit can be derived. It is designed to help you understand your system’s capabilities and, where problems or inadequacies are uncovered, to advise on how to improve performance in a timely and cost-effective manner.
Have you found this helpful or would you like more information or to speak to one of our testing consultants? Please get in touch to let us know.