Tuesday, June 21, 2005

Points to consider to increase performance in ADO.NET – Part II

In my previous article, ten points on the general considerations on increasing the performance of ADO.NET were discussed. Let us examine in more detail on other factors that affect the performance.

1. Use of Transactions
Transactions are important for ensuring data integrity. But they can also have an operational cost if they are not rightly used. You must select the right transaction management mechanism for your application to significantly improve scalability and performance.
The following points should be remembered during your decision of using transactions:

a) Use SQL transactions when:
· You need server controlled transactions
· You can complete the transaction operation on a single data store in just one simple call to the datastore
b) Use ADO.NET transactions when:
· You need client-controlled transactions
· When you want to make multiple calls to a single datastore
c) Keep transactions as short as possible with as little code as possible to avoid locks.
d) Use the appropriate isolation level.
SQL Server and other database systems support various levels of isolation for transactions. Isolation shields database operations from the effect of other concurrent transactions.
There are basically four levels of isolation-
· Read uncommitted,
· Read committed,
· Repeatable read
· Serializable

The highest isolation level, serializable, reduces concurrency and provides the highest level of data integrity.
The lowest isolation level, read uncommitted, gives the opposite result.

By selecting the correct level of isolation that pertains to your application, you can improve performance and scalability.

2. Avoid code that can lead to deadlock.
Sometimes your code may get trapped in a deadlock when it is highly data centric. To avoid deadlocks, always access tables in the same order across transactions in your application. The likelihood of a deadlock increases when you access tables in a different order each time you access them.

3. Consider SET NOCOUNT ON for SQL Server
When you use SET NOCOUNT ON, the message that indicates the number of rows that are affected by the T-SQL statement is not returned as part of the results. When you use SET NOCOUNT OFF, the count is returned. Using SET NOCOUNT ON can improve performance because network traffic can be reduced.

4. The Parameter Collection
Use the Parameters collection when you call a stored procedure and when you build SQL commands. Also ensure that you explicitly set the data type for each parameter. This will avoid the round trips to server by checking the data type on the client and the likelihood that the Parameter object could set an invalid type.

3. Cache stored procedure SqlParameter objects.
Often, applications must run commands multiple times. To avoid recreating the SqlParameter objects each time, cache them so that they can be reused later. You can use a Hashtable object to cache SqlParameter objects.
For code samples for the parameter caching approach, check out this link:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnbda/html/daab-rm.asp.

4. Avoid moving Binary Large Objects Repeatedly
Avoid moving BLOB data more than one time from the server to client and vice versa.
For example, if you build a Web application that serves images, store the images on the file system and the file names in the database instead of storing the images as BLOBs in the database. Storing the images as BLOBs in the database means that you must read the BLOB from the database to the Web server and then send the image from the Web server to the browser. Reading the file name from the database and having the Web server send the image to the browser reduces the load on the database server.

5. Paging Records
Paging records is a common application scenario. The records that you need to page through can often be based on user input. For example, they can be based on a search keyword entered through a search screen. Or, the records can be common to all users, like a product catalogue.

There are two basic approaches to paging:
· You can return the whole result set from the database to the client. The client caches the result set and then displays the most relevant results to the user by using the correct filtering mechanism. But this will involve processing cost at the client and memory for caching.

· You can have the database assume the additional role of a filter by making the database return only the most relevant result set to the client. This will involve network cost for the amount of data sent across the network.

Depending upon the type of application and load on the server, we must try to decide the best between the above two approaches.

6. Exception Management
a) Avoid relying on an Exception error handler to detect connection state availability. use the ConnectionState.Open or ConnectionState.Close method to check the state before use.

b) Use try/finally more often than try/catch/finally. Using finally gives you the option to close the connection, even if an exception occurs.

c) Use specific handlers to catch specific exceptions. For example, if you know that your code may cause an SqlException, use catch(SqlException sqlEx) and then use a generic exception handler like catch(Exception ex).

I have covered most of the points I've learnt on increasing performance in ADO.NET.
One interesting topic which remains to be addressed is about the best use of DataSet and DataReader, which I think is addressed in numerous articles on web. Here is one:
http://geekswithblogs.net/ranganh/archive/2005/04/25/37618.aspx

Thursday, June 16, 2005

Ten points to consider to increase performance in ADO.NET

Here is a consolidated list of how you can improve the performance of your .NET apps:

1. Design your data access layer based on how the data is used.

In larger applications, it’s always better to go for a separate Data Access Layer (DAL) to abstract the underlying data store complexity and to provide a logical separation. Having the data access logic in the same presentation layer may increase performance but at the cost of maintainability.

You can use the Microsoft Application Blocks for simplifying your tasks of Data Access and Exception handling. (I personally prefer this)

2. Cache your data to avoid unnecessary round trips and network overhead.

Try to cache data that is used across your application, in the layer that is close to the consumer of data. This will reduce the latency in network to fetch data. Note that if your data in cache needs to be updated too frequently, then better you don’t cache that data.

3. Acquire late, release early.

Open database connections right only when you need them. Close the database connections as soon as you are finished. Acquire locks late, and release them early.

4. Close disposable resources.

Make sure that you call either the Dispose or Close method on resources that are disposable, as soon as you are finished with using the resource.

5. Reduce round trips.

· If you have some 3 or 4 SQL statements, try to use batch sql statements in a stored procedure to decrease round trips.
· Use ExecuteScalar method for getting a single result.
· Use ExecuteNonQuery method when you want to execute any DDL statement.
· Use connection pooling to help avoid extra round trips. By reusing connections from a connection pool, you can avoid the round trips that are associated with connection establishment and authentication.

6.Return only the data you need.

Evaluate the data that your application actually requires and return only that data. This will minimize the bandwidth consumption in the network.

7. Use Windows authentication.

From a security perspective, use Windows authentication instead of SQL authentication. This ensures that credentials are not passed over the network, database connection strings do not contain credentials, and you can apply standard Windows security policies to accounts. Remember to use connection pooling with your connection.

8. Use stored procedures.

Avoid embedded SQL statements and use Store Procedures (SP) instead. This has the following advantages:
· A Logical separation of Data access code from your Business Logic code
· Queries can be optimized for performance from SQL server
· Deployment becomes easier as, for any change in SP you don’t need to redeploy you application.
· SPs allow the batch execution of SQL commands
· You can impose specific restrictions on selected stored procedures for security reasons. This is very difficult to be done from embedded SQL.
· You can avoid the most dangerous SQL Injection by using parameterized SPs.

9. Consider how to handle exceptions.

You can use try/finally blocks to ensure that connections and other resources are closed, regardless of whether exceptions are generated or not. The best way for abstracting all exceptions from user is to log them to a file or Windows Event log.

10. Use appropriate normalization.

You may want a normalized database to minimize data duplication but be aware that you don’t over-normalize. This can affect the performance and scalability of your application

The above list is purely based on my working experience and there are many more points to add to this, which I’ll be adding in the next part of this article.