Optimizing SQL Server

What Developers Can Do And
How DBAs Can Help

PASS MN • July 16, 2019

Andrew Brobston

Senior Software Developer, SJV & Associates

Copyright © 2019 Andrew Brobston. Licensed under CC-BY-SA 4.0.

To get us started…

Slides are available at brobston.com.

Contact me with questions! Twitter: @AndrewBrobston

SJV & Associates is hiring full-stack .NET/SQL Server developers! Check out the listing at https://stackoverflow.com/jobs/280663/net-software-developer-sjv-associates.

I would have put the job listing in an iframe, but frame-ancestors 'self'.

Small Disclaimer

  • I have not tested these strategies with SQL Server 2019.
  • Features such as Adaptive Query Processing could affect optimization.

Are you a developer? This session is for you if…

  • You develop something that uses Microsoft SQL Server, or you plan to.
  • You know what SELECT, INSERT, UPDATE, and DELETE do.
  • You have used those operations in hand-written queries and/or through ORMs.
  • That's it.
  • More experience is great too.

Are you a DBA? This session is for you if…

  • You have developers writing code interacting with your SQL Server databases.
  • You would like to improve that code's performance.
  • You like sanity.

When this talk ends, you should…

  • Have some tips for better-performing queries, either for you or for other developers.
  • Know some basics—and maybe not-so-basics—of using execution plans.

This talk won't get to…

  • Relational database design
  • Detailed index strategies
  • Administrative topics

Why optimize?

  1. Make your query faster. (Please don't leave yet…)
  2. Make the server's other queries faster by reducing CPU, memory, disk, and locking.
  3. Spend less on hardware, virtual servers, or even licensing.

…but premature optimization is the root of all evil!

  • The rationale for the saying is avoiding spending time on trivial optimizations that delay completion.
  • If you know some quick optimizations, you can save time by:
    • Waiting less for your queries to run while developing.
    • Spending less time troubleshooting performance.
  • If you don't optimize, and your organization succeeds, poor database design and unscalable queries WILL crash down on you, and not gradually.
  • A solution good enough for, say, 100 writes/hour to a main table may not scale to thousands per hour or more.
  • You don't need to check the execution plan for every query, but do so for critical or slow ones.

Just a few

👍 Rules of Thumb 👍

Or heuristics, if you prefer.

Or maybe tips.

Whatever works for you.

Rule 1:

Each rule, including this one, has exceptions.

I don't know them all.

Corollary: Rules can conflict. Everything is a tradeoff.

Rule 2: SARGability

SARGability

  • SARG is short for Search ARGument-able.
  • Mostly refers to WHERE and JOIN conditions.
  • Could SQL Server use an index to resolve the condition if it had one?
Rule 2: SARGability

Example:


                        SELECT a.Id, a.Whatever
                        FROM   dbo.OurTable a
                        WHERE  ISNULL(a.TextField, N'') = N''
                        

What's wrong with this query?

(Besides that it's horribly contrived.)

Rule 2: SARGability

Example:


                        SELECT a.Id, a.Whatever
                        FROM   dbo.OurTable a
                        WHERE  ISNULL(a.TextField, N'') = N''
                        

If dbo.OurTable has an index on TextField, we can't use it!

But shouldn't SQL Server use the index to check for both NULL and N''?

Maybe, but it doesn't.

SQL Server is NOT THAT INTELLIGENT!

Rule 2: SARGability

You need to check the values separately.


                        SELECT a.Id, a.Whatever
                        FROM   dbo.OurTable a
                        WHERE  a.TextField IS NULL OR a.TextField = N''
                        

Also, avoid implicit type conversions.


                            -- Bad because a.TextField is NVARCHAR.
                            WHERE  a.TextField = ''
                            -- Compare to NVARCHAR instead.
                            WHERE  a.TextField = N''
                            

Try to avoid explicit type conversions too.

One trick, when it's worth it: use a PERSISTED computed column in a temporary table to convert the value.

Rule 3: EXISTS

EXISTS and NOT EXISTS

  • Are you using IN or NOT IN with a subquery?
  • Are you using a LEFT JOIN just to see if another table has a certain value?
  • Do you suffer from slow queries? Too many rows in a result set?
  • Ask your presenter if EXISTS and NOT EXISTS are right for you!
  • (They are.)
Rule 3: EXISTS
mis-join (mɪs' ʤɔɪn) n. a join for which a zero-to-one or one-to-one relation is expected, but a one-to-many or many-to-many relation is obtained instead, leading to extra, unwanted records in the result set

I definitely made up that term. Maybe someone else has a better name.

Rule 3: EXISTS

                        -- Avoid:
                        SELECT a.Id, a.Whatever
                        FROM   dbo.OurTable a
                        WHERE  a.TextField IN (SELECT b.AnotherTextField
                                               FROM   dbo.AnotherTable b
                                               WHERE  b.Id = @SomeParameter)
                        -- NOT IN is even worse.
                        
                        -- Also avoid:
                        SELECT a.Id, a.Whatever
                        FROM   dbo.OurTable a
                        LEFT JOIN dbo.AnotherTable b ON a.TextField = b.AnotherTextField
                        WHERE  b.AnotherTextField IS NOT NULL

                        -- Use EXISTS or NOT EXISTS:
                        SELECT a.Id, a.Whatever
                        FROM   dbo.OurTable a
                        WHERE  EXISTS (SELECT 1 FROM dbo.AnotherTable b
                                       WHERE  a.TextField = b.AnotherTextField
                                       AND    b.Id = @SomeParameter)
                        
Rule 3: EXISTS

Exception

Don't use EXISTS or NOT EXISTS in the SELECT clause.


                        -- Avoid:
                        SELECT a.Id,
                               a.Whatever,
                               CASE WHEN EXISTS (SELECT 1 FROM dbo.AnotherTable b WHERE a.TextField = b.AnotherTextField) THEN 1
                                    ELSE 0 END AS DontDoThis
                        FROM   dbo.OurTable a
                        

Workarounds can be difficult. Consider a LEFT JOIN to a subquery or CTE with GROUP BY.

Rule 4: Table Variables

Table Variables

  • Don't use them.
  • So, no DECLARE @DontDoThis TABLE
  • Use temp tables instead.
  • CREATE TABLE #ThisIsBetter
Rule 5: User-Defined Functions

User-Defined Functions

  • Objects you create with CREATE FUNCTION statements.
  • Use cautiously.
  • Avoid scalar user-defined functions in the SELECT clause.
  • Such functions cause forced serialization.
  • Worse, the scalar function executes individually, row by row, for each row in the result set.
  • If you guessed that's really slow…
  • Some built-in functions, such as cryptographic functions, cause similar behavior.
  • If you must have such a function, consider splitting the query into two queries. First query inserts into a temp table without using the function. Second query runs the function against the temp table.
Rule 5: User-Defined Functions

User-Defined Functions

  • Table-valued functions also must execute against each row.
  • Same is true of built-in functions like STRING_SPLIT.
  • If you have a way to JOIN rather than APPLY — do something besides use the function directly — consider it.
Rule 6: Transaction Isolation Levels

Transaction Isolation Levels

One of the first things relational-database newcomers learn is ACID.

  • Atomicity
  • Consistency
  • Isolation
  • Durability

So that works for real-world SQL Server applications, right?

🤣

Rule 6: Transaction Isolation Levels

We can specify isolation levels in .NET (C#) application code as
using (var transaction = conn.BeginTransaction(IsolationLevel.ReadUncommitted))
or in SQL as
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED.

Isolation levels in SQL Server:

  • READ UNCOMMITTED
  • READ COMMITTED (which has two different meanings depending on database configuration)
  • SNAPSHOT (has special requirements)
  • REPEATABLE READ
  • SERIALIZABLE (has nothing to do with "forced serialization" from earlier)

FROM dbo.OurTable a WITH (NOLOCK) on static tables when not guaranteed to be in READ UNCOMMITTED

Rule 7: CTEs and Subqueries

Common table expressions (CTEs) and subqueries

  • Not persisted anywhere.
  • May be re-run (many times in extreme cases).
  • Can force serialization in some situations.
  • EXISTS / NOT EXISTS outside the SELECT clause is an exception (it's a type of join).
  • May need CTEs for updates (no aggregates in SET clause), window functions, and recursion when you have no better option.
Rule 8: Use the Application

Do work in the application tier instead of the database whenever reasonable

  • Sorting result sets is a great example if the result set would not already be sorted due to index order (can't use with paging).
  • Can allow receiving first results sooner. In Dapper, use buffered: false, then .AsParallel().OrderBy(x => x.Whatever).ToList()
  • Pass easily-used parameters — don't make the query do the work. Example: pass a bit rather than a string if the string would just be compared to a constant.
  • Cache static query results and possibly even use the cache to eliminate joins — but see shouldiblamecaching.com.
Rule 9: RBAR

Avoid RBAR

  • Row by Agonizing Row
  • SQL is designed for you to tell it the results you want involving data sets.
  • SQL is not designed for you to loop.
  • You might need a prepare/execute pattern for multi-row insert, but sometimes alternate means could perform as well. Example: JSON (though performance can also be bad).
  • Cursors have their place if you absolutely must loop, but use the least-resource-intensive cursor that makes sense. FAST_FORWARD usually, or STATIC with FORWARD_ONLY if you need unchanging results. (Look up these keywords if you need cursors.)
  • WHILE can make sense in some situations, but ensure no batch or application solution is feasible.
  • A standard query may still produce an RBAR execution plan. Remember scalar-valued user-defined functions?
Rule 10: Index Use

Index Use

A much larger topic than we can discuss in any detail in a 75-minute talk

  • Know what indexes you have.
  • If you're not using the first indexed field, subsequent fields don't help you.
  • If you can use an index-only query, do so. Key lookups can be expensive.
  • Indexes are not magic. Poor indexes can hurt overall performance.

Execution Plans

Also known as query plans

An execution plan results from SQL Server's compiling a query. You can look at execution plans and… learn things.

Getting Started

  • You need SHOWPLAN permissions on the database
  • Production vs. non-production
  • Estimated vs. actual vs. live
  • Will differ with different workloads, statistics, and (sometimes) parameters
  • Lots of ways to get plans, including:
    • SQL Server Management Studio (or whatever Microsoft calls it this week)
    • SQL Server Profiler
    • Query store (if enabled) and cache — use Brent Ozar's free "First Responder Kit" at www.brentozar.com/first-aid
    • SET STATISTICS XML ON
  • The server returns XML, but really, you're going to look at diagrams in your tool of choice.

Most Important Information

  • Basic nodes, especially select, sort (avoid), filter, and join
  • Total execution time
  • Memory grant — queries cannot start until SQL Server grants them the plan's required memory
  • Compilation statistics (time, cache, parallel, allocated memory)
  • Cardinality estimates (fancy way of saying "how many rows")
  • Warnings (sometimes)

What to Ignore

Query Costs

Also, percentages

Sometimes, it's all you really have.

Also, specific index suggestions

The plan's suggestion may tell you that the plan may benefit from some index, but the one it suggests is typically not a great choice.

Temporary Stored Procedures

  • Yes, that's a thing.
  • CREATE OR ALTER PROCEDURE #MyTempProcedure
  • Use to replicate any parameterized query or stored procedure.
  • Query plan differs if you just use variables.
  • Be sure to roll back transactions!

Resources

Thank you!

Contact me with questions!

Twitter: @AndrewBrobston

Woefully out-of-date blog (but I'm going to fix that!): https://brobston.com/