Sunday, December 19, 2004

Capturing Performance Requirements for Desktop applications

The common performance objectives for desktop applications are as follows:

  • UI Responsiveness: No matter what the application is doing, never let the UI hang or become unresponsive. I would call this perceived performance: the end user sees the UI as continuously responsive. If you are performing a long-running blocking operation, showing a status bar helps far more than showing a hung application with the hourglass cursor.
    You should quantify this objective at the application level rather than the use case level. It can be quantified in Not to Exceed terms. For example, you may decide that the application should never be unresponsive for more than 2-4 seconds, no matter what the use case is.
  • Number of screen hops to complete a use case: This is another attribute that belongs to the perceived performance category rather than depending on the resources of the machine on which the application is installed. If an application requires multiple nested dialog boxes to complete a single use case, the end user may count this against its performance. The screen flow for any use case should be kept as short as possible without increasing the complexity of each screen.
    You should quantify this attribute at the application level rather than the use case level. It can be quantified in Not to Exceed terms. For example, an architect may decide that the screen flow should not exceed 3 nested dialog boxes for any use case.
  • Private Bytes: This is the number of bytes that have been allocated to the process and cannot be shared with any other process. Keep in mind that memory is a shared resource on a desktop. If they are available, you can also benchmark existing applications to set a target for yourself. The guidelines for setting the threshold for this objective are very subjective. It depends on various parameters, such as:
    Who is the target end user?: A developer working on a high end machine, or a normal home user? It makes a lot of difference, because a developer will most probably have a high end machine with ample physical memory compared to a home user. You need to remember that the application you are developing will co-exist and ‘co-run’ with other applications and hence share the total available memory.
    What type of application are you developing?: An application which constantly demands user attention and simply cannot work in the background has a distinct advantage: it may be allowed to hog more memory than an application which runs as a Windows service in the background.
    How critical is the application for the end user?: If the application is critical to the end user, he/she would not mind it gobbling up most of the desktop resources. However, if the application is just one among the mob of others fighting to co-exist and co-run on the desktop, then you had better make sure that it consumes as little memory as possible. Consider the example of Microsoft Word. When the user has it open and is busy typing something, it is among the top memory hoggers on the desktop. The user tolerates this (he/she does not have any other option in the Microsoft world :-) ) because he/she is actually busy using the application and chances are he/she is not working on anything else. However, it would be obscene for a Windows service running in the background to gobble up such a large amount of memory in the form of private bytes. Therefore you should keep in mind the nature of the application and how critical it is to the user.
    Do you perform any I/O operations?: Applications performing I/O (disk I/O or network I/O) may need to allocate buffers for storing data. This affects the private bytes consumed by the application. This factor is most significant in server applications, but it needs to be considered for desktop applications too.
  • CPU cycles: Like memory, the processor is a shared resource. Almost all the rules mentioned above for memory apply to processor utilization too. For desktop scenarios it is safe to assume a single processor desktop and allocate a CPU budget for the application. Depending upon the criticality of the application, you need to set the performance objective for the allowed CPU cycles. For example, if the application is supposed to run as a background service, you may decide that it should not hog more than 20% of the CPU at any point of time. However, if your application is supposed to be one of the critical ones running on the desktop, you may decide to raise the limit. A good example is a user playing music while working on a desktop. He/she may not like the fact that the music application is hogging more than 70% of the CPU. However, a disc jockey creating fusion music may not be bothered even if most of the CPU cycles are consumed by the music application.
    The important consideration for setting the limit on CPU cycles is again the deployment scenario. An important difference between Private Bytes and CPU Cycles is that you need to set the limit at both the application level and the use case level, as explained below:
    Application level: The application level limit can be set either in Not to Exceed terms or as a general figure which is applicable to most of the use cases. Certain use cases may exceed this limit because of the nature of their computation. Processor intensive use cases may deserve to exceed the limit set for the application.
    Use Case level: As explained above, certain use cases may deserve to breach the application limit. You need to limit the breach levels for these use cases too.
    In my next blog entry I shall write about the objectives which should be captured for server applications like a web service, ASP.NET web application, COM+ Server etc.
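
As a rough sketch of how the Not to Exceed limits from this entry could be written down and checked, the Python snippet below records one budget per application, with optional use case level overrides for CPU. The class name, the field names, the 150 MB memory figure and the "export_report" use case are all hypothetical; only the 4 second, 3 dialog and 20% figures echo the examples given above.

```python
from dataclasses import dataclass, field


@dataclass
class PerformanceBudget:
    """Not to Exceed limits for one desktop application (values are illustrative)."""
    max_unresponsive_seconds: float = 4.0   # UI must never block longer than this
    max_nested_dialogs: int = 3             # screen hops allowed for any use case
    max_private_bytes_mb: float = 150.0     # hypothetical memory threshold
    max_cpu_percent: float = 20.0           # application level CPU limit
    # Use cases allowed to breach the application level CPU limit,
    # each with its own capped breach level.
    cpu_overrides: dict = field(default_factory=dict)

    def cpu_limit_for(self, use_case):
        """Use case level limit if one is defined, else the application level limit."""
        return self.cpu_overrides.get(use_case, self.max_cpu_percent)

    def violations(self, use_case, unresponsive_seconds, nested_dialogs,
                   private_bytes_mb, cpu_percent):
        """Return the names of the objectives that a measured sample breaches."""
        checks = {
            "ui_responsiveness": unresponsive_seconds <= self.max_unresponsive_seconds,
            "screen_hops": nested_dialogs <= self.max_nested_dialogs,
            "private_bytes": private_bytes_mb <= self.max_private_bytes_mb,
            "cpu": cpu_percent <= self.cpu_limit_for(use_case),
        }
        return [name for name, ok in checks.items() if not ok]


budget = PerformanceBudget(cpu_overrides={"export_report": 60.0})
print(budget.violations("open_file", 1.0, 2, 100.0, 35.0))
print(budget.violations("export_report", 1.0, 2, 100.0, 35.0))
```

The same 35% CPU sample breaches the budget for an ordinary use case but not for the processor intensive one, which is the application level versus use case level distinction made for CPU cycles.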

What is Performance?

Performance is a very generic term. If you were to ask people how they would define performance, you would usually get answers along these lines: they know an application or a module is performing well because it feels blazingly fast, and they know a system is performing slowly because they can see a non-responsive UI or the hourglass sitting endlessly on their screen. Well, actually they are describing the symptoms of an application that performs poorly. They are nowhere near defining performance.
Defining performance is the first step to achieving it. So let us start with a definition: Performance is a qualifier of a system which is defined by certain characteristics called objectives or targets. The objectives and targets may depend upon the type of application, the deployment scenarios of the application and other custom requirements based on the use cases. For example, throughput (requests/sec) may be an important performance objective for a server application, whereas working set may be an important objective for a desktop application.

It is important to capture the performance objectives for an application because you then know what to measure throughout the development stage and during the post deployment stage, when the application is in production. Measuring throughout the development stage helps you ensure that the development is on track. Measuring in the post deployment stage helps ensure that the health of the application is good enough to support the peak load of users while still meeting the performance objectives. If I were to sum up the most important strategy for ensuring that the application meets its performance objectives, it would be: Measure, Measure and Measure.
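
To make "measure" concrete, here is a minimal Python sketch that times one operation and compares the result with a Not to Exceed objective. The helper name `measure`, the stand-in `sample_use_case` and the 2 second limit are assumptions made for illustration, not part of any particular tool.

```python
import time


def measure(operation, limit_seconds):
    """Run operation once; return its result, elapsed time and pass/fail flag."""
    start = time.perf_counter()
    result = operation()
    elapsed = time.perf_counter() - start
    return result, elapsed, elapsed <= limit_seconds


def sample_use_case():
    # Hypothetical stand-in for a real use case.
    return sum(range(100_000))


result, elapsed, within_limit = measure(sample_use_case, limit_seconds=2.0)
print(f"elapsed={elapsed:.4f}s within_limit={within_limit}")
```

Running the same check during development and again in production gives comparable numbers against the same objective.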

Capturing the performance objectives is particularly important for service providers developing applications for their customers because meeting these objectives would serve as the acceptance criteria for the application.
The next blog entry shall explain in detail some of the important objectives which need to be captured for desktop and server applications.

Monday, December 13, 2004

.NET Reflector

Lutz Roeder's Programming.NET C# VB CLR WinFX page includes a tool, .NET Reflector, which allows you to see the source code of .NET assemblies, including the ones available in the .NET Framework.

The tool can be used to view the source code and XML documentation of framework assemblies. You can use it to find out the inner workings of the .NET CLR and to analyze issues. The tool comes with a search feature, which is useful for finding a particular interface, class, method, etc.

I have used this tool to learn in detail about the System.ComponentModel namespace which helped me test the custom versions of Container, Component classes.

I strongly suggest using this tool as a quick and easy way to browse framework source code rather than browsing the rotor code online.

Performance Anti-Patterns

Anti-patterns are similar to patterns but they are bad solutions to a problem and produce negative consequences. They also tell you why the solution although attractive is bad and what good patterns can be applied instead. Some of the anti-patterns with their impact on performance are discussed below:

The Blob or “GOD” class anti-pattern

Problem
A “god” class can be either one of the following:

  • One which performs most of the work by the system and usually has weak helper classes with only accessor operations like get() or set(). The helper classes perform almost no computation. Consider the following example:

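    class Manager
    {
        void DoWork()
        {
            // Manager reads Helper's data and does the computation itself.
            Helper obj = new Helper();
            if (obj.CanDoWork)
            {
                double x = obj.SomeValue + obj.SomeOtherValue;
                obj.NewValue = x;
            }
        }
    }

    class Helper
    {
        // Data only; the class has no behavior of its own.
        public bool CanDoWork = true;
        public double SomeValue;
        public double SomeOtherValue;
        public double NewValue;
    }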
  • One which encapsulates most of the system’s data in one place and all other classes obtain the data from this class for performing the operation through the accessor functions.

The disadvantages are as follows:

  • Lots of message traffic between the classes, which leads to chatty communication.
  • Tight coupling between the classes: if one changes, the other also needs to change.
  • Poor object oriented design. It is still procedural programming, because the relevant data and the actions to be performed on that data are not grouped together in a logical entity known as a class.

Solution
Organize the data and the methods which act on that data in logical top level entities. An object should have most of the data it needs to make a decision. The above code can be refactored as follows:

class Manager
{
    void DoWork()
    {
        // Manager only delegates; Helper owns both the data and the logic.
        Helper obj = new Helper();
        obj.DoWork();
    }
}

class Helper
{
    bool CanDoWork = true;
    double SomeValue;
    double SomeOtherValue;
    double NewValue;

    public void DoWork()
    {
        if (CanDoWork)
            NewValue = SomeValue + SomeOtherValue;
    }
}

Excessive dynamic allocation anti-pattern

Problem

Developers, in their zeal to avoid memory overhead, create objects and destroy them immediately after use. This can hurt performance if the object is used very frequently, even though each use lasts only a short time. Each time the object is created, the memory to contain it must be allocated from the heap, and when you are done using it the necessary clean up must be performed to avoid memory leaks.
Repeating this cycle for every creation and destruction adds up to a significant performance overhead.

Solution

There are two possible solutions to avoid the performance overhead depending on the usage scenario:

  • Create a shared pool of resources
    Pre-allocate a set of objects in a shared pool. The pool is created at application start up. When an object is required, it is pulled out of the pool, and when you are done using it the object is returned to the pool. This significantly reduces the overhead of frequent creation and destruction. The benefit is even greater if initializing the objects takes a long time.
  • Share using the Flyweight pattern
    Using the Flyweight pattern eliminates the need to create objects frequently and allows all clients to share a single instance of the object.
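
The two options can be sketched in Python as follows. This is a minimal illustration, not a production pool: the `Buffer`, `BufferPool` and `get_glyph` names are hypothetical, the pool is not thread safe, and the flyweight assumes the shared object is immutable.

```python
class Buffer:
    """Hypothetical object that is costly to create."""
    created = 0  # counts how many instances were ever allocated

    def __init__(self, size=4096):
        Buffer.created += 1
        self.data = bytearray(size)

    def reset(self):
        # Clear the contents so the next user gets a clean buffer.
        self.data[:] = bytes(len(self.data))


class BufferPool:
    """Shared pool: objects are pre-allocated once and then reused."""

    def __init__(self, count):
        self._free = [Buffer() for _ in range(count)]

    def acquire(self):
        # Fall back to a fresh allocation only if the pool is exhausted.
        return self._free.pop() if self._free else Buffer()

    def release(self, buf):
        buf.reset()
        self._free.append(buf)


_glyphs = {}


def get_glyph(char):
    """Flyweight factory: every caller asking for the same char shares one tuple."""
    if char not in _glyphs:
        _glyphs[char] = (char, ord(char))
    return _glyphs[char]


pool = BufferPool(count=2)
for _ in range(1000):
    buf = pool.acquire()
    buf.data[0] = 1
    pool.release(buf)

print(Buffer.created)
```

A thousand uses of the buffer cost only the two allocations made at start up, and repeated calls to `get_glyph` with the same character return the same shared instance.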

Circuitous Treasure hunt anti-pattern

Problem

The anti-pattern describes a situation where an object must look in several places to find the information it needs. The lookup may involve crossing several process, machine or network boundaries. Besides the lookup time itself, there is the additional cost of making what are probably remote calls.
This anti-pattern is typically found in database applications. Data retrieved from the first table is used to query the second table, and only when you reach the third table do you get the actual results.
The situation can also occur with objects whose data is improperly distributed, i.e. the data relevant to a particular operation is spread across various classes.

Solution

If a database is involved in the scenario, try optimizing the data organization, but be careful: optimizing the data for one scenario may cause havoc for other scenarios. Therefore you should list all your scenarios up front and profile each of them to reduce the lookup and retrieval time.
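
The database form of the treasure hunt can be illustrated with Python's built-in `sqlite3` module. The schema and data below are invented for the example. The second function uses a joined query, which is one common way to shorten the lookup chain when the tables themselves cannot be reorganized; the text above does not prescribe it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE order_items (order_id INTEGER, product TEXT);
    INSERT INTO customers VALUES (1, 'alice');
    INSERT INTO orders VALUES (10, 1), (11, 1);
    INSERT INTO order_items VALUES (10, 'pen'), (11, 'ink'), (11, 'paper');
""")


def products_treasure_hunt(name):
    """Each result feeds the next query: customers, then orders, then items."""
    queries = 0
    (customer_id,) = conn.execute(
        "SELECT id FROM customers WHERE name = ?", (name,)).fetchone()
    queries += 1
    order_ids = [row[0] for row in conn.execute(
        "SELECT id FROM orders WHERE customer_id = ?", (customer_id,))]
    queries += 1
    products = []
    for order_id in order_ids:
        products += [row[0] for row in conn.execute(
            "SELECT product FROM order_items WHERE order_id = ?", (order_id,))]
        queries += 1
    return sorted(products), queries


def products_single_query(name):
    """One round trip: the database follows the chain itself."""
    rows = conn.execute("""
        SELECT i.product
        FROM customers c
        JOIN orders o ON o.customer_id = c.id
        JOIN order_items i ON i.order_id = o.id
        WHERE c.name = ?
    """, (name,))
    return sorted(row[0] for row in rows), 1


print(products_treasure_hunt("alice"))
print(products_single_query("alice"))
```

Both functions return the same products, but the first needs one query per hop and per order while the second needs a single query. Against a remote database each query is also a network round trip.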

If the scenario involves a distributed application, consider the following:

  • Modify the design to provide access paths which minimize the lookup time. This can be done by redistributing the data among classes.
  • If optimizing the data is not a possibility, try reducing the number of remote calls involved. Consider the following options:
    Business Façade: Wrap the complex functionality in a façade and expose only a minimal interface. Offload the work of lookup and processing to this interface, behind which only local calls are involved. In this way the number of remote calls and network trips can be significantly reduced.
    Adapter pattern: If you are dealing with incompatible interfaces and are looking for a suitable way to create a wrapper, use the Adapter pattern. The pattern makes heavy use of delegation, where the delegate is the adapter (or the wrapper) and the delegatee is the class being adapted.
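
The effect of a façade on the number of remote calls can be sketched in Python with a simulated network boundary. All names here (`RemoteProxy`, `Inventory`, `QuoteFacade`) and the stock data are hypothetical; `RemoteProxy` simply counts calls instead of doing any real networking.

```python
class RemoteProxy:
    """Counts every call that crosses the simulated network boundary."""

    def __init__(self, target):
        self._target = target
        self.round_trips = 0

    def call(self, method, *args):
        self.round_trips += 1
        return getattr(self._target, method)(*args)


class Inventory:
    """Fine-grained server-side object."""
    STOCK = {"pen": 5, "ink": 0, "paper": 12}
    PRICES = {"pen": 1.5, "ink": 4.0, "paper": 3.0}

    def in_stock(self, product):
        return self.STOCK[product] > 0

    def price(self, product):
        return self.PRICES[product]


class QuoteFacade:
    """Server-side facade: one coarse method that makes only local calls."""

    def __init__(self, inventory):
        self._inventory = inventory

    def quote(self, products):
        return {p: self._inventory.price(p)
                for p in products if self._inventory.in_stock(p)}


products = ["pen", "ink", "paper"]
inventory = Inventory()

# Chatty client: every fine-grained lookup is a separate remote call.
chatty = RemoteProxy(inventory)
chatty_quote = {}
for p in products:
    if chatty.call("in_stock", p):
        chatty_quote[p] = chatty.call("price", p)

# Client of the facade: a single remote call returns the finished result.
coarse = RemoteProxy(QuoteFacade(inventory))
facade_quote = coarse.call("quote", products)

print(chatty.round_trips, coarse.round_trips)
```

Both clients end up with the same quote. The difference is only in how many times the boundary is crossed to get it.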

One-Lane Bridge anti-pattern

Problem

The anti-pattern is related to contention for a resource that can be used by only one client at a given point of time. The problem can be explained by the analogy of a one lane bridge which can be crossed by only one vehicle at a time. The vehicles on both sides of the bridge have to wait for access, resulting in long queues, and there is a potential deadlock when two vehicles try to enter from opposite ends with neither willing to let the other go first.
A similar situation exists in software applications where multiple threads or processes try to access a resource concurrently in order to modify its state. For example, a lock on a database table ensures that all other queries trying to modify the table, or the locked portion of it, have to wait. The situation mirrors the one lane bridge if the updates are made to the same object most of the time.

Solution

The solution to the problem can again be found by solving the one lane problem. Constructing additional lanes or routing the traffic through other lanes can help reduce the traffic queues. Another option is to limit access to certain types of vehicles only, thereby restricting the availability of the bridge.

Similar solutions can be developed for software applications too:

  • Consider de-normalizing of data if database is involved.
  • Consider loose coupling to reduce concurrent access of resources.
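
One way to read "constructing additional lanes" in code is to split a single lock into several, so that updates to unrelated data no longer queue behind each other. The Python sketch below uses lock striping for a set of counters. This is a technique chosen for illustration and is not taken from the references; the `StripedCounters` name and the stripe count are assumptions. It shows the locking structure only and makes no claim about measured speed up.

```python
import threading


class StripedCounters:
    """Counters guarded by several locks (lanes) instead of one global lock."""

    def __init__(self, stripes=8):
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._counts = [{} for _ in range(stripes)]

    def _stripe(self, key):
        # The same key always maps to the same lane within one process.
        return hash(key) % len(self._locks)

    def increment(self, key):
        i = self._stripe(key)
        with self._locks[i]:  # only writers in the same lane wait here
            self._counts[i][key] = self._counts[i].get(key, 0) + 1

    def get(self, key):
        i = self._stripe(key)
        with self._locks[i]:
            return self._counts[i].get(key, 0)


def worker(counters, key, times):
    for _ in range(times):
        counters.increment(key)


counters = StripedCounters(stripes=8)
threads = [threading.Thread(target=worker, args=(counters, f"row-{n}", 1000))
           for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print([counters.get(f"row-{n}") for n in range(4)])
```

With `stripes=1` this degenerates into the one lane bridge: every update, whatever its key, waits on the same lock.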

References:

http://www.perfeng.com/papers/antipat.pdf

http://www.perfeng.com/papers/moreanti.pdf

http://pacinotti.isti.cnr.it/ISSTA2002/slides/07242002/sess12pr2.ppt

Monday, June 21, 2004

Web Services and application pools

We have often experienced scenarios where a web application (an ASP.NET web application i.e. ASPX or another web service i.e. ASMX) needs to call a web service on the local machine. Consider a scenario where a server in the application tier hosts multiple web services and one web service needs to call another web service on the same machine.
If Windows 2000 is your host operating system, this essentially means that all the web services are hosted in the same worker process. Calls to a web service require threads from the thread pool. If you load test this scenario, even after sufficiently tuning the thread pool (the minFreeThreads, minLocalRequestFreeThreads, maxconnection, maxWorkerThreads and maxIoThreads attributes), you may see a throughput curve which is not stable and wavers something like a sine curve. On Windows 2000 the only thing you can do is host the web services on separate machines, because tuning the thread pool may not sufficiently reduce the wavering in the throughput graph or in the graphs for resource utilization and response time.

However, if Windows Server 2003 is your host operating system, you can considerably stabilize the performance indicators by hosting the web services in separate application pools on the same machine and tuning the thread pool. To know more about tuning the thread pool, see Chapter 10 at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnpag/html/scalenetchapt10.asp