SQL Query Performance Improvement Cheat Sheet (DRAFT) by [deleted]

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Introduction

SQL query performance is a recurring point of discussion between developers and users. Users always want a fast response when they retrieve data, and developers do their best to return it in the shortest possible time. However, there is no straightforward way to define the best performance, and it is sometimes debatable whether a query's performance is good or bad. Overall, if you follow best practices during development, you can give users good query response times and avoid such discussions.

You can improve SQL query performance in several ways, which fall into categories such as rewriting the SQL query, creating and using indexes, and managing statistics properly.

Avoid Multiple Joins in a Single Query

Try to avoid writing a single SQL query that combines many joins, including outer joins, CROSS APPLY, OUTER APPLY and other complex subqueries. Such queries reduce the optimizer's choices of join order and join type. Sometimes the optimizer is forced to use nested loop joins, irrespective of the performance consequences, for queries with excessively complex CROSS APPLY operators or subqueries.
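
One possible way to act on this, offered as a sketch rather than a rule, is to materialize an intermediate result in a temporary table so that each statement has fewer joins to optimize. The tables dbo.Orders, dbo.OrderLines and dbo.Customers and all of their columns are hypothetical, and whether the split actually helps depends on the data and should be checked against the execution plans.

```sql
-- Hypothetical schema: dbo.Orders, dbo.OrderLines, dbo.Customers.
-- Step 1: reduce the large tables first and keep only what later steps need.
SELECT o.OrderID, o.CustomerID, SUM(ol.Quantity * ol.UnitPrice) AS OrderTotal
INTO #RecentOrderTotals
FROM dbo.Orders AS o
JOIN dbo.OrderLines AS ol ON ol.OrderID = o.OrderID
WHERE o.OrderDate >= '20240101'
GROUP BY o.OrderID, o.CustomerID;

-- Step 2: join the small intermediate result to the remaining table.
SELECT c.CustomerName, r.OrderID, r.OrderTotal
FROM #RecentOrderTotals AS r
JOIN dbo.Customers AS c ON c.CustomerID = r.CustomerID;

DROP TABLE #RecentOrderTotals;
```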

Eliminate Cursors from the Query

Try to remove cursors from the query and use a set-based query instead; set-based queries are more efficient than cursor-based ones. If you must use a cursor, then avoid dynamic cursors, as they tend to limit the choice of plans available to the query optimizer. For example, a dynamic cursor limits the optimizer to using nested loop joins.
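
As an illustration, the sketch below shows the same price update written first with a cursor and then as a single set-based statement. The table dbo.Products and its columns are hypothetical. The two versions are alternatives; running both would apply the increase twice.

```sql
-- Hypothetical table: dbo.Products(ProductID, CategoryID, Price).

-- Version 1: row-by-row update using a cursor (the pattern to avoid).
-- STATIC is used here because the text advises against dynamic cursors.
DECLARE @ProductID int;
DECLARE product_cursor CURSOR LOCAL FORWARD_ONLY STATIC READ_ONLY FOR
    SELECT ProductID FROM dbo.Products WHERE CategoryID = 3;
OPEN product_cursor;
FETCH NEXT FROM product_cursor INTO @ProductID;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.Products SET Price = Price * 1.10 WHERE ProductID = @ProductID;
    FETCH NEXT FROM product_cursor INTO @ProductID;
END;
CLOSE product_cursor;
DEALLOCATE product_cursor;

-- Version 2: set-based equivalent, one statement and one plan.
UPDATE dbo.Products
SET Price = Price * 1.10
WHERE CategoryID = 3;
```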

Avoid Use of Non-correlated Scalar Sub Query

You can rewrite your query so that a non-correlated scalar subquery runs as a separate query instead of as part of the main query. Store its output in a variable, which can then be referenced in the main query or in a later part of the batch. This gives the optimizer better options, which may help it produce accurate cardinality estimates along with a better plan.
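
A minimal sketch of this rewrite follows, assuming a hypothetical table dbo.Orders with a decimal Amount column. The OPTION (RECOMPILE) hint is an addition not mentioned above: without it, SQL Server generally cannot see the runtime value of a local variable when it compiles the statement, so the hint is one way to let the estimate use the actual value.

```sql
-- Hypothetical table: dbo.Orders(OrderID, CustomerID, OrderDate, Amount).

-- Before: the non-correlated scalar subquery is embedded in the main query.
SELECT OrderID, Amount
FROM dbo.Orders
WHERE Amount > (SELECT AVG(Amount) FROM dbo.Orders);

-- After: evaluate the subquery once and store the result in a variable.
DECLARE @AvgAmount decimal(38, 6);
SELECT @AvgAmount = AVG(Amount) FROM dbo.Orders;

-- The main query now references the variable instead of the subquery.
SELECT OrderID, Amount
FROM dbo.Orders
WHERE Amount > @AvgAmount
OPTION (RECOMPILE); -- assumption: added so the optimizer sees the runtime value
```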

Avoid Multi-statement Table Valued Functions

Multi-statement TVFs are more costly than inline TVFs. SQL Server expands inline TVFs into the main query, as it does with views, but it evaluates multi-statement TVFs in a separate context from the main query and materializes their results into temporary work tables. The separate context and the work table make multi-statement TVFs costly.
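
To make the difference concrete, the sketch below defines the same lookup both ways. The function names and the table dbo.Orders are hypothetical, and GO is the batch separator used by tools such as SSMS and sqlcmd rather than a T-SQL statement.

```sql
-- Hypothetical table: dbo.Orders(OrderID, CustomerID, Amount).

-- Multi-statement TVF: rows are first materialized in the table variable @result.
CREATE FUNCTION dbo.GetCustomerOrders_Multi (@CustomerID int)
RETURNS @result TABLE (OrderID int, Amount decimal(18, 2))
AS
BEGIN
    INSERT INTO @result (OrderID, Amount)
    SELECT OrderID, Amount FROM dbo.Orders WHERE CustomerID = @CustomerID;
    RETURN;
END;
GO

-- Inline TVF: a single SELECT that is expanded into the calling query.
CREATE FUNCTION dbo.GetCustomerOrders_Inline (@CustomerID int)
RETURNS TABLE
AS
RETURN
(
    SELECT OrderID, Amount FROM dbo.Orders WHERE CustomerID = @CustomerID
);
GO
```

Both functions are called the same way, for example SELECT * FROM dbo.GetCustomerOrders_Inline(42), so converting a simple multi-statement function to an inline one usually needs no change in the callers.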
 

Creation and Use of Indexes

Indexes can dramatically reduce data retrieval time, but they have the reverse effect on DML operations, because every index must be maintained on each insert, update and delete, which may degrade performance. This trade-off makes indexing a challenging task, but done well it helps to improve SQL query performance and response time.

Understand the Data

Before deciding to create an index, understand the data, its types and how queries retrieve it. If you understand the behavior of the data thoroughly, it will help you decide which columns should have a clustered index or a non-clustered index. If a clustered index is not on a unique column, SQL Server maintains uniqueness by adding a hidden uniquifier to duplicate key values, which adds overhead. To avoid this overhead, choose the column carefully or make the appropriate changes.
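
For instance, if the data is known to have a naturally unique column, declaring the clustered index as unique avoids the extra uniqueness overhead. The table, column and index names below are hypothetical, and the statement fails if the column contains duplicates or the table already has a clustered index.

```sql
-- Hypothetical table: dbo.Orders, where OrderID is assumed to be unique.
CREATE UNIQUE CLUSTERED INDEX CIX_Orders_OrderID
    ON dbo.Orders (OrderID);
```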

Create a Highly Selective Index

Selectivity is the percentage of qualifying rows in the table (number of qualifying rows / total number of rows). If this ratio is low, the index is highly selective and most useful. As a rule of thumb, a non-clustered index is most useful if the ratio is around 5% or less, which means the index can eliminate 95% of the rows from consideration. If an index returns more than about 5% of the rows in a table, it probably will not be used; either a different index will be chosen or created, or the table will be scanned.
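
The ratio can be measured directly before creating an index. The query below is a sketch for one candidate predicate; the table dbo.Orders, the Status column and the value 'Pending' are hypothetical.

```sql
-- Hypothetical table and predicate: dbo.Orders filtered on Status = 'Pending'.
SELECT
    COUNT(CASE WHEN Status = 'Pending' THEN 1 END) AS QualifyingRows,
    COUNT(*) AS TotalRows,
    -- NULLIF avoids a divide-by-zero error on an empty table.
    100.0 * COUNT(CASE WHEN Status = 'Pending' THEN 1 END)
          / NULLIF(COUNT(*), 0) AS QualifyingPercent
FROM dbo.Orders;
```

A QualifyingPercent well under 5 suggests that an index on the column is likely to be useful for this predicate.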

Position a Column in an Index

The order, or position, of a column in an index also plays a vital role in SQL query performance. An index can help a query if the query's criteria match the leftmost columns of the index key. As a best practice, place the most selective columns leftmost in the key of a non-clustered index.
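
The effect of column position can be sketched with a two-column index. The table, column and index names are hypothetical, and OrderID is assumed to be the clustered key; the actual plan chosen should always be confirmed in the execution plan.

```sql
-- Hypothetical table: dbo.Orders(OrderID, CustomerID, OrderDate).
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_OrderDate
    ON dbo.Orders (CustomerID, OrderDate);

-- Can seek: the predicate is on the leftmost key column.
SELECT OrderID FROM dbo.Orders WHERE CustomerID = 42;

-- Can seek on both key columns.
SELECT OrderID FROM dbo.Orders
WHERE CustomerID = 42 AND OrderDate >= '20240101';

-- Cannot seek on this index: the leftmost key column is not in the predicate,
-- so the best this index can offer is a scan.
SELECT OrderID FROM dbo.Orders WHERE OrderDate >= '20240101';
```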

Drop Unused Indexes

Dropping unused indexes can speed up data modifications without affecting data retrieval. You also need a strategy for batch processes that run infrequently and use certain indexes. In such cases, creating the indexes just before the batch processes run and dropping them when the batch processes are done helps to reduce the overhead on the database.
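
One way to find candidates, shown here as a sketch, is to compare reads and writes per index using the sys.dm_db_index_usage_stats view. Its counters are reset when SQL Server restarts, so the numbers only cover the period since the last restart and may miss infrequent batch jobs. The index named in the commented DROP statement is hypothetical.

```sql
-- List non-clustered indexes in the current database, least-read first.
SELECT
    OBJECT_NAME(i.object_id) AS TableName,
    i.name AS IndexName,
    ISNULL(s.user_seeks, 0) + ISNULL(s.user_scans, 0)
        + ISNULL(s.user_lookups, 0) AS Reads,
    ISNULL(s.user_updates, 0) AS Writes
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
    ON s.object_id = i.object_id
   AND s.index_id = i.index_id
   AND s.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.type_desc = 'NONCLUSTERED'
  AND i.is_primary_key = 0          -- keep indexes that enforce constraints
  AND i.is_unique_constraint = 0
ORDER BY Reads ASC, Writes DESC;

-- After reviewing the list, drop an index that is written but never read.
-- DROP INDEX IX_Orders_Status ON dbo.Orders;  -- hypothetical index name
```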

Statistics Creation and Updates

Take care of creating and regularly updating statistics for computed columns and for multiple columns referenced together in a query. The query optimizer uses statistics, which describe the distribution of values in one or more columns of a table, to estimate the cardinality, or number of rows, in the query result. These cardinality estimates enable the query optimizer to create a high-quality query plan.
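
A short sketch of both tasks follows. The table dbo.Orders, its columns and the statistics name are hypothetical; FULLSCAN reads every row, so a sampled update may be preferable on very large tables.

```sql
-- Create multi-column statistics on columns that are filtered together.
CREATE STATISTICS ST_Orders_CustomerID_Status
    ON dbo.Orders (CustomerID, Status);

-- Refresh all statistics on the table by reading every row.
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- Refresh a single statistics object using the default sampling.
UPDATE STATISTICS dbo.Orders ST_Orders_CustomerID_Status;
```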

Revisit Your Schema Definitions

Last but not least, revisit your schema definitions and check whether the appropriate FOREIGN KEY, NOT NULL and CHECK constraints are in place. Having the right constraint in the right place always helps query performance: a FOREIGN KEY constraint helps to simplify joins by converting some outer or semi-joins to inner joins, and a CHECK constraint also helps a bit by removing unnecessary or redundant predicates.
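
The statements below sketch how such constraints might be added to an existing table. The tables, columns and constraint names are hypothetical. WITH CHECK validates existing rows so that the constraint is trusted, which matters because the optimizer only relies on trusted constraints.

```sql
-- Hypothetical tables: dbo.Orders(CustomerID, Amount), dbo.Customers(CustomerID).

-- Declare that every order has a customer.
ALTER TABLE dbo.Orders
    ALTER COLUMN CustomerID int NOT NULL;

-- Declare the relationship; WITH CHECK validates existing rows.
ALTER TABLE dbo.Orders WITH CHECK
    ADD CONSTRAINT FK_Orders_Customers
    FOREIGN KEY (CustomerID) REFERENCES dbo.Customers (CustomerID);

-- Declare a domain rule that the optimizer can use to discard
-- predicates contradicting it, such as Amount < 0.
ALTER TABLE dbo.Orders WITH CHECK
    ADD CONSTRAINT CK_Orders_Amount_NonNegative CHECK (Amount >= 0);
```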

Conclusion

We discussed how SQL query performance can be improved by rewriting SQL queries, creating and using indexes, and managing statistics properly, and we revisited schema definitions. Many more areas can be explored to improve SQL query performance, such as query hints, table hints and plan hints.