Saturday, 1 March 2008

7 ways to do Performance Optimization of an ASP.NET 3.5 Web 2.0 portal

By: Tanzim Saqib
Download Sample Code

This article explores some of the key performance issues that can occur while developing a Web 2.0 portal using server side multithreading and caching. It also demonstrates model driven application development using Windows Workflow Foundation.

The Performance

Performance is a vast area, and great results are never achieved by a single silver bullet; within the scope of this article we can only explore a few of the points that matter when developing a Web 2.0 portal. Web 2.0 applications are everywhere - even the DotNetSlackers website you are browsing right now is a perfect example. These applications often consume third party content, aggregate it, and turn it into something useful and meaningful for their users. For the past few years developers have been busy with such endeavors, and many of the resulting websites have not addressed performance issues, leading to an unpleasant experience for their users.

The Web 2.0 portal

The Web 2.0 portal you will develop in this article is a mashup of the Eventful.com, Upcoming.org, Zvents.com and Yahoo Geocode APIs. The main interface of the application consists of a Text Box, where the user types a location of interest, and a Button. The application validates the location by using the Yahoo Geocode API and tells the user, with a corresponding message, whether the location is valid. It then queries all three popular Web 2.0 services that store local events by location, and displays only the present and future local events, sorted by date and with no duplicates. In this article you will build the Local Events search engine shown in figure 1 on top of the .NET Framework 3.5.

Figure 1: The Local Events Web 2.0 portal

images/LocalEventsPortal.jpg

Why Performance

After I built this application with no performance optimization in place, I tested it over a dialup Internet connection to simulate the experience of low-bandwidth users. I found that it often failed, and sometimes it took 80-90 seconds to load. The question was: why wasn't it as fast as any other search service? The answer: the search result, in this case, is the combined result of searches performed against three different services over the Internet, hosted on three different servers, possibly in different countries or continents. We need them, since the application itself has no data of its own and yet must offer users a search feature, so that amount of data download is inevitable. What we can do is decide, design, and apply tricks smartly to make it faster.

The Developer APIs

We will need four third party APIs in this application. One is the Yahoo Geocode API, which validates whether the user's submitted location is a real place. The other three services make up our data source for this application. Documentation and API keys are available on each service's developer site.

They offer open APIs with numerous methods over several protocols. We choose REST because it is the simplest of them and can be driven by plain HTTP GET or POST requests.
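For example, a geocode lookup is nothing more than an HTTP GET against a parameterized URL, along these lines (illustrative only - consult each service's documentation for the exact endpoint, parameter names and key format):

```
http://local.yahooapis.com/MapsService/V1/geocode?appid=YOUR_APP_ID&location=New+York
```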

Performance Improvement #1: Make it AJAX

As I said before, the data download is inevitable and it takes time. Making the application postback-less decreases the user's waiting time. After the user types a location, the application simply displays a message saying that it is locating the events and, when they become available, displays them in a grid - without reloading the whole page. This removes the overhead of reloading a full page and makes the website more interactive.

Performance Improvement #2: Remove burden from client side

After AJAX-ifying it, the application might be designed so that each of the Web Services is called via AJAX, the complex business logic is performed in JavaScript, and the results are finally rendered with HTML and CSS. The key business logic that might end up on the client side includes sorting the events, removing duplicates, performing complex string operations, handling exceptions and controlling the flow. Instead, take every possible burden away from the client side and put most of it on the server. That keeps the site lighter and provides a smoother client experience. From my personal experience as a rational Internet user, I still haven't found a browser that handles heavy JavaScript comfortably. I wish I could have a flawless, less error-prone browser in my lifetime!

In this application, we will make an AJAX call to a single Web Service that is hosted by our application. The Web Service will perform all of our business logic, including complex operations, and return only an array of LocalEvent objects to the client side. After that, some AJAX control will take over the responsibility of rendering them. Also, an appropriate and adequate use of caching might also improve performance a lot. We will discuss it soon.
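The LocalEvent class itself is a plain data holder. A minimal sketch of what it might look like is shown below; the property names match those used in the listings later in this article, but the Equals/GetHashCode overrides are an assumption of this sketch (not taken from the sample code) - some such override is needed for the duplicate check with List&lt;T&gt;.Contains to work:

```csharp
[Serializable]
public class LocalEvent
{
    public string Title { get; set; }
    public string Summary { get; set; }
    public string URL { get; set; }
    public string EventDateTime { get; set; }
    public string Longitude { get; set; }
    public string Latitude { get; set; }

    // Two events from different providers are treated as duplicates
    // when their title and date match (an assumption for this sketch).
    public override bool Equals(object obj)
    {
        var other = obj as LocalEvent;
        if (other == null) return false;
        return Title == other.Title && EventDateTime == other.EventDateTime;
    }

    public override int GetHashCode()
    {
        return (Title ?? string.Empty).GetHashCode() ^
               (EventDateTime ?? string.Empty).GetHashCode();
    }
}
```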

Model-driven Development: Windows Workflow Foundation

Windows Workflow Foundation, which we will use as part of our Model-driven development, is a core capability that lets you explicitly or declaratively model the control flow of your application. Rather than embedding your application logic in code, in a workflow the logic is represented declaratively. As a result, you can inspect the application logic, visualize it, track its execution, and even change the logic at runtime. Workflow Foundation provides a higher level of abstraction and visual representation of your business processes that makes them easier to understand and design, by both developers and business domain experts. It's easy to change the flow and rules associated with business processes, often without having to recompile.

Compared to their UML counterpart Activity Diagrams, Workflow diagrams are first-class software artifacts that do not become outdated and diverge from business process logic because they are the business process logic. On the other hand, the Windows Workflow runtime provides a robust, scalable environment for your workflows to execute. Workflows can be persisted to a database when they become idle and reactivated when an external stimulus occurs.

The following figure shows the Workflow of our application.

Figure 2: The SearchWorkflow for searching events in three different sources

images/SearchWorkflow.jpg

This Workflow does the following:

  • It checks the Cache for a previous entry against the location passed to the Workflow
  • If there is one, it retrieves the LocalEvent array directly from the Cache and returns it to the invoker
  • If there is none, it verifies the location with the Yahoo Geocode API
  • If verification fails, it sets the ErrorMessage property and goes to the Terminate state of the Workflow
  • If the location is valid, it moves to a ParallelActivity made up of three CodeActivity instances, each of which performs the search against a particular event service
  • The last CodeActivity consolidates all the results into a single LocalEvent array and returns it to the invoker.

Note: ParallelActivity does not guarantee that the activities underneath run asynchronously. It is a nice representation of what we intend to do, though.
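Stripped of the Workflow plumbing, the control flow above is equivalent to roughly the following sketch (method and property names are assumed for illustration):

```csharp
// Procedural equivalent of the SearchWorkflow (sketch only)
LocalEvent[] Search(string location)
{
    var cached = Cache[location.ToUpper()] as LocalEvent[];
    if (cached != null)
        return cached;                      // cache hit: short-circuit

    if (!IsValidLocation(location))         // Yahoo Geocode check
        throw new Exception(ErrorMessage);  // terminate with a message

    // Conceptually parallel searches (see Performance Improvement #4)
    SearchEventful(location);
    SearchUpcoming(location);
    SearchZvents(location);

    return ConsolidateResults();            // merge, dedupe, sort, cache
}
```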

The Solution Structure

The solution structure is similar to the one shown in figure 3. It is divided into two projects. The LocalEvents.Business project contains the LocalEvent class, SearchWorkflow and other helper classes. On the other hand, the LocalEvents.Portal project is a standard ASP.NET application with a Style.css, Web.config, Global.asax and a WebService file named LocalEventsWS.asmx.

Figure 3: The Solution Structure

SolutionStructure.jpg

Performance Improvement #3: Initialize Workflow Runtime Engine Once and then Reuse It

The WorkflowHelper class encapsulates runtime hosting for our application. Its Start method creates a runtime and puts it in the Application object; before doing so, it looks for an existing one, and if found, that instance is reused. By default the workflow scheduler service is DefaultWorkflowSchedulerService, which spawns a thread for each workflow, meaning the workflows run asynchronously. In our case, however, we need the workflows to run synchronously: we will expose the workflow through a WebService, and the WebService has to wait for the workflow to finish processing in order to get the results. To achieve this, a ManualWorkflowSchedulerService instance is added to the runtime.

public static WorkflowRuntime Start()
{
    WorkflowRuntime workflowRuntime;

    if (HttpContext.Current == null)
        workflowRuntime = new WorkflowRuntime();
    else
    {
        if (HttpContext.Current.Application["WorkflowRuntime"] == null)
            workflowRuntime = new WorkflowRuntime();
        else
            return HttpContext.Current.Application["WorkflowRuntime"] as WorkflowRuntime;
    }

    var scheduler = new ManualWorkflowSchedulerService();
    workflowRuntime.AddService(scheduler);

    workflowRuntime.StartRuntime();

    if (null != HttpContext.Current)
        HttpContext.Current.Application["WorkflowRuntime"] = workflowRuntime;

    return workflowRuntime;
}

The following method is responsible for executing the workflow. It takes the workflow type and a Dictionary instance containing the necessary parameters, including the output parameters, and then runs the Workflow through the scheduler service we added before:

public static void ExecuteWorkflow(Type workflowType, Dictionary<string, object> properties)
{
    properties.Add("TheContext", HttpContext.Current);

    var workflowRuntime = Start();
    var scheduler = workflowRuntime.GetService<ManualWorkflowSchedulerService>();
    WorkflowInstance instance = workflowRuntime.CreateWorkflow(workflowType, properties);

    instance.Start();

    // ... code edited to save space

    scheduler.RunWorkflow(instance.InstanceId);
}

The Stop method terminates the Workflow Runtime engine and removes it from the Application object:

public static void Stop()
{
    WorkflowRuntime workflowRuntime = HttpContext.Current.Application["WorkflowRuntime"] as System.Workflow.Runtime.WorkflowRuntime;
    workflowRuntime.StopRuntime();
    HttpContext.Current.Application.Remove("WorkflowRuntime");
}

Start and stop the Workflow Runtime engine at the application level, so that the runtime is created only once in the application's lifetime. Add the following two event handlers to Global.asax.cs:

protected void Application_Start(object sender, EventArgs e)
{
    LocalEvents.Business.WorkflowHelper.Start();
}

protected void Application_End(object sender, EventArgs e)
{
    LocalEvents.Business.WorkflowHelper.Stop();
}

Inside the SearchWorkflow

The first activity in the Workflow looks for a cached LocalEvent array; if one is found, it returns the array immediately and terminates the Workflow:

private void IsNotInCache(object sender, ConditionalEventArgs e)
{
    // Determine whether the result is available in the Cache
    e.Result = TheContext.Cache[Location.ToUpper()] == null;

    // If it is available, return the result from the Cache
    // and do not proceed any further in the Workflow
    if (e.Result == false)
        LocalEventsData = (LocalEvent[])TheContext.Cache[Location.ToUpper()];
}

Depending on the e.Result property, the Workflow engine determines which way to go. If the LocalEvent array cannot be found in the cache, e.Result = true causes the flow to move on to the next activity, IsInvalidLocation, which performs a check using the Yahoo Geocode API. The GetResponse method fetches data from the specified URL; the code is pretty self-explanatory, so we leave it to you to explore. In this block of code you can see that, when the location can be resolved, we store its longitude and latitude in private variables. We will reuse them for events that come up in a search for that location but carry no valid longitude and latitude of their own.
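For reference, a minimal GetResponse might look like the following sketch (assuming using directives for System.Net and System.IO; this is not the exact code from the sample):

```csharp
private string GetResponse(string url)
{
    // Issue a plain HTTP GET and return the response body as a string.
    // WebException (e.g. Yahoo's 400 Bad Request) is left to the caller.
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "GET";

    using (var response = (HttpWebResponse)request.GetResponse())
    using (var reader = new StreamReader(response.GetResponseStream()))
    {
        return reader.ReadToEnd();
    }
}
```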

private void IsInvalidLocation(object sender, ConditionalEventArgs e)
{
    // A try/catch is required:
    // Yahoo throws Bad Request (400)
    // when it fails to recognize a location

    try
    {
        string response = GetResponse(string.Format(YAHOO_API_URL, YahooKey, Location));

        XNamespace yahooNamespace = "urn:yahoo:maps";
        var xml = XElement.Parse(response);
        var resultTag = xml.Element(yahooNamespace + "Result");
        _Latitude = resultTag.Element(yahooNamespace + "Latitude").Value;
        _Longitude = resultTag.Element(yahooNamespace + "Longitude").Value;

        e.Result = false;
    }
    catch (WebException wex)
    {
        ErrorMessage = "The location entered cannot be resolved. Please make sure you typed it correctly.";
        e.Result = true;
    }
}

If the location cannot be resolved, an exception is thrown; it will be handled in the WebService in order to return a meaningful message to the client, explaining what just happened.

Performance Improvement #4: Server-side Multithreading

The next thing to do in the Workflow, after validating the location, is invoking the event provider services: InvokeSearchEventful, InvokeSearchUpcoming and InvokeSearchZvents. Wait - we have got something to work on here. Instead of executing these searches sequentially, how about invoking each on a different thread? Remember when we talked about ParallelActivity? Do not forget that it is not responsible for running the activities on separate threads. So how do we achieve this?

We initialized a couple of variables to keep track of the threads to be spawned:

List<ManualResetEvent> locks = new List<ManualResetEvent>(MAX_SYNC_CALLS);
ManualResetEvent[] threadEvents = new ManualResetEvent[MAX_SYNC_CALLS];

As soon as the control flow reaches the InvokeSearchEventful activity, it adds the corresponding ManualResetEvent to the locks list, which we use to keep track of the spawned threads. It then queues the SearchEventful method to the ThreadPool, which assigns a worker thread to execute it:

private void InvokeSearchEventful(object sender, EventArgs e)
{
    locks.Add(threadEvents[EVENTFUL_SEQUENCE]);

    // Queue to the ThreadPool
    ThreadPool.QueueUserWorkItem(new WaitCallback(SearchEventful),
        (object)(new object[] { threadEvents[EVENTFUL_SEQUENCE] }));
}

The other two methods, InvokeSearchUpcoming and InvokeSearchZvents, are similar, so we are not going into them. The SearchEventful method retrieves data from the service and populates a LocalEvent object for each event using LINQ to XML, as shown in the following listing. As you can see at the end of the code block, the call to evt.Set() marks the thread as done. The other two search methods are essentially the same, so they are intentionally left out.

Listing 9: SearchEventful method

private void SearchEventful(object states)
{
    var evt = (ManualResetEvent)((object[])states)[0];

    // Retrieve the result from the API call
    string response = GetResponse(string.Format(EVENTFUL_API_URL, EventfulAppKey,
       EventfulUser, EventfulUser, Location));

    var events = XElement.Parse(response).Element("events").Elements("event");
    foreach (XElement anEvent in events)
    {
        LocalEvent localEvent = new LocalEvent();

        // These APIs often return outdated events, so filter by date
        localEvent.EventDateTime =
           DateTime.Parse(anEvent.Element("start_time").Value).ToString();
        if (!IsAcceptableDate(Convert.ToDateTime(localEvent.EventDateTime)))
            continue;

        localEvent.Title = anEvent.Element("title").Value;
        localEvent.Summary = anEvent.Element("description").Value;
        localEvent.URL = "http://eventful.com/events/" +
           anEvent.Attribute("id").Value;

        // If no longitude/latitude is found, fall back to what
        // the Yahoo Geocode API returned for the location
        localEvent.Longitude = anEvent.Element("longitude").Value == string.Empty ?
           _Longitude : anEvent.Element("longitude").Value;
        localEvent.Latitude = anEvent.Element("latitude").Value == string.Empty ?
           _Latitude : anEvent.Element("latitude").Value;

        // If it is not already in our list, add it.
        // Users often post the same event to multiple sites.
        if (!_LocalEvents.Contains(localEvent))
            _LocalEvents.Add(localEvent);
    }

    evt.Set();
}

The next thing to do in the Workflow is to wait for all the spawned threads to complete:

private void ConsolidateResult(object sender, EventArgs e)
{
    // Wait till all the spawned threads are done
    WaitHandle.WaitAll(locks.ToArray());

    // ... code truncated to save space
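The elided middle of ConsolidateResult still has to turn the merged results into the final, date-sorted array the article promised earlier. A plausible sketch of that step, assuming _LocalEvents is a List&lt;LocalEvent&gt; and using LINQ (available in .NET 3.5):

```csharp
// Merge is already done: each search method appended to _LocalEvents.
// Sort by date and expose the result as the workflow's output array.
LocalEventsData = _LocalEvents
    .OrderBy(le => Convert.ToDateTime(le.EventDateTime))
    .ToArray();
```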

As I said, when I first created this application a search took 80-90 seconds to complete. After I implemented server-side multithreading, performance improved dramatically: the same search took 25-28 seconds. Multithreading on the server side can definitely boost a website's performance from the client's point of view, but it may not be appropriate in every case, since it puts extra stress on the server. You will have to make tradeoffs depending on your business problems.

Performance Improvement #5: Cache the result

This is the last part of the ConsolidateResult method, which implements the caching. It caches the LocalEvent array for one hour; you may want to configure the timeout depending on your needs. The next time the Workflow starts, its first activity looks for a cached item for this particular location. If the item is found, the control flow does not go through the long tunnel of the Workflow; it simply returns the cached result.

// Cache the resultant array
TheContext.Cache.Insert(Location.ToUpper(), LocalEventsData, null,
    DateTime.Now.AddHours(1), Cache.NoSlidingExpiration);

One of the biggest performance improvements lies here. With caching in place, the combined events from all three data sources are served directly, and only, from our own server, which is of course much faster. Just before this improvement, retrieving search results took at most 28 seconds - 28536 ms, to be exact. Now the same operation takes 1302 ms, which is about 1.3 seconds.

Performance Improvement #6: Client side validation

When we hear this from somebody else, we wonder why anyone wouldn't validate data before passing it to server-side code. Yet it is a common mistake on a lot of websites (do you think it's because a lot of .NET coders just do not like writing JavaScript?). In our case, the fact is that if we do not check whether the text box is empty, we start a roundtrip that causes a WebService call, a Workflow run and a Yahoo Geocode API invocation. Carefully spotting such scenarios goes a long way toward making the site a real performer. "Doing something" is not always the right performance optimization; sometimes "not doing something" yields the better result.

if (searchText == '')
{
    _DivStatus.innerHTML = 'Please type a location.';
    $get('txtSearch').focus();
}
else
{
    Sys.Net.WebServiceProxy.invoke(
        "/LocalEventsWS.asmx",      // WebService Path
        "SearchLocalEvents",        // Method name
        true,                       // Use GET?
        { location: searchText },   // Parameters
        onEventsDownloadCompleted,  // On success callback
        onEventsDownloadError);     // On failure callback

    _DivStatus.innerHTML = 'Locating events...';
}

The Ajax Data Controls

You will certainly notice that we use a GridView AJAX control, which is part of a DotNetSlackers-hosted project called AjaxDataControls. The reason for choosing this awesome control library is its fantastic client-side programmability, which makes the life of an AJAX developer much easier.

The following snippet simply binds a LocalEvent array to this GridView:

var gridView = $find('<%= GridView1.ClientID %>');
gridView.set_dataSource(result);
gridView.dataBind();

The Web.config changes

The following code needs to be edited with your own API keys, User Name and Password. These credentials are used by the methods in the SearchWorkflow.

<appSettings>
  <add key="YahooKey" value="Yahoo API Key Here"/>
  <add key="ZventsKey" value="Zvents API Key here"/>
  <add key="UpcomingKey" value="Upcoming API Key here"/>
  <add key="EventfulAppKey" value="Eventful API Key here"/>
  <add key="EventfulUser" value="Eventful User Name here"/>
  <add key="EventfulPassword" value="Eventful Password here"/>
</appSettings>

Performance Improvement #7: Cache on the client side

Now let us look at the WebService code on the server side. It is invoked from the client, deals with the server-side cache, and invokes the event providers' web services: it is the bridge between the data and the client. It reads the Web.config file, prepares a Dictionary with the necessary objects and executes the SearchWorkflow we saw before. One significant thing to note is that WebService methods that can be invoked through AJAX must carry the ScriptMethod attribute. We set its UseHttpGet parameter because we want to issue AJAX HTTP GET requests; the reason will be discussed in the next section.

[WebMethod]
[ScriptMethod(UseHttpGet = true)]
public LocalEvent[] SearchLocalEvents(string location)
{
    LocalEvent[] list = null;

    // Prepare a Dictionary object that will hold all the parameters to be
    // passed to the workflow, including the output parameters, e.g. LocalEventsData
    Dictionary<string, object> properties = new Dictionary<string, object>();
    properties.Add("Location", location);
    properties.Add("ErrorMessage", null);
    properties.Add("LocalEventsData", list);
    properties.Add("YahooKey", ConfigurationManager.AppSettings["YahooKey"]);
    properties.Add("ZventsKey", ConfigurationManager.AppSettings["ZventsKey"]);
    properties.Add("UpcomingKey", ConfigurationManager.AppSettings["UpcomingKey"]);
    properties.Add("EventfulAppKey",
       ConfigurationManager.AppSettings["EventfulAppKey"]);
    properties.Add("EventfulUser", ConfigurationManager.AppSettings["EventfulUser"]);
    properties.Add("EventfulPassword",
       ConfigurationManager.AppSettings["EventfulPassword"]);

    WorkflowHelper.ExecuteWorkflow(typeof(SearchWorkflow), properties);

    list = (LocalEvent[])properties["LocalEventsData"];

    if (list == null)
    {
        if (properties["ErrorMessage"] == null)
            // No exception occurred;
            // we just couldn't find any event there
            throw new Exception("No local events found in the specified location");
        else
            throw new Exception(properties["ErrorMessage"] as string);
    }

    // ... code edited to save space

Now for the tricky part. Before we return a response from the WebService method to the client, we mark it as cacheable, so that subsequent identical AJAX calls are answered from the browser's cache instead of making a round trip to the server. This results in an awesome performance optimization. The following code is supposed to do exactly what we need:

TimeSpan duration = new TimeSpan(1, 0, 0);

Context.Response.Cache.SetCacheability(HttpCacheability.Public);
Context.Response.Cache.SetExpires(DateTime.Now.Add(duration));
Context.Response.Cache.SetMaxAge(duration);

return list;

This again sets a one-hour expiry time on the cache. Unfortunately, SetMaxAge alone didn't work: when used with ASP.NET AJAX, it requires a special Reflection hack to alter the private _maxAge field:

FieldInfo maxAge = Context.Response.Cache.GetType().GetField("_maxAge",
    BindingFlags.Instance | BindingFlags.NonPublic);
maxAge.SetValue(HttpContext.Current.Response.Cache, duration);

The browser now saves the request query and the associated response as a pair in its cache, so every time this method is invoked with the same location parameter within one hour, the data comes straight from the browser cache. My response time came down to 117 ms, which is almost 0.1 second.
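With the SetExpires/SetMaxAge calls and the Reflection fix in place, the HTTP response the browser receives should carry caching headers roughly like these (illustrative values; the exact header set depends on the ASP.NET version and output-cache configuration):

```
HTTP/1.1 200 OK
Cache-Control: public, max-age=3600
Expires: Sat, 01 Mar 2008 13:00:00 GMT
Content-Type: application/json; charset=utf-8
```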

This project is hosted on CodePlex at http://www.codeplex.com/LocalEvents for those of you who would like to develop it further. One enhancement you might want to implement is integrating a Virtual Earth or Google Maps map to pinpoint the events. It is very easy to do, because longitude and latitude information is available for each event passed to the client side.

We saw how a Web 2.0 portal's response time decreased from 90 seconds to 117 ms by applying 7 performance tips. There are, of course, numerous other ways to improve performance depending on the business problems you are trying to solve - even in this application. Not every issue can be addressed in such a short scope, but I hope to discuss them in the near future.

Summary

In this article you explored some of the key performance issues that arise while developing a Web 2.0 portal, using server-side multithreading and caching. You also learned how to do model driven application development using Windows Workflow Foundation.

Source: http://dotnetslackers.com/articles/aspnet/SevenWaysToDoPerformanceOptimizationOfAnASPNET35Web20Portal.aspx
