SQL Server Deadlock Analysis – RockyPC / tpch10.dbo.orders – 2026-05-05

Tuning Goal: Fix Deadlocks
Server: RockyPC Version: SQL Server 2022 (16.0.1175.1) Developer Edition Database: master (deadlocks occurred in tpch10)

Executive Summary

All five deadlocks captured between 2026-05-05 01:21:56 UTC and 01:22:40 UTC involve the same statement: the UPDATE dbo.orders in stored procedure tpch10.dbo.usp_UpdateOrdersAndLineitem joining a @BatchOrders table variable. The deadlocks are U-U key lock cycles on the clustered index orders_pk combined with intra-query parallelism exchange-event waits (parallel worker threads on the same SPID forming part of the cycle). Multiple SQLCMD sessions execute the same procedure concurrently with overlapping date ranges, so they target overlapping o_orderkey rows but in different orders.

Top priorities to eliminate these deadlocks:

  1. P1 Force a deterministic, ordered access path to orders (process keys in sorted order) and serialize the update using WITH (UPDLOCK, HOLDLOCK) on the driving set, eliminating the U/U conversion race. Confidence: High
  2. P1 Enable Read Committed Snapshot Isolation (RCSI) on tpch10 to remove unnecessary read/U-lock contention from the same and other workloads. Confidence: High
  3. P1 Suppress parallelism on this UPDATE statement (OPTION (MAXDOP 1)) — every captured deadlock graph contains parallel e_waitPortOpen exchange events, which is the proximate amplifier. Confidence: High
  4. P2 Replace the @BatchOrders table variable with a temp table that has a clustered PK on o_orderkey, providing the optimizer with cardinality and ordering it currently lacks. Confidence: High
  5. P2 Partition concurrent jobs by non-overlapping date ranges and/or serialize via sp_getapplock; current 1995-01-01–03-31 vs. 1995-02-15–05-15 ranges overlap. Confidence: High

Key Findings

#Timestamp (UTC)VictimResourcePattern
12026-05-05 01:22:40process140206ad468 (spid 69, ecid 7)KEY on orders_pkU↔U + parallel exchange
22026-05-05 01:22:03process140d354cca8 (spid 86, ecid 7)KEY on orders_pk4-process U-cycle + exchange
32026-05-05 01:22:00process14152bfa4e8 (spid 84, ecid 5)KEY on orders_pkU↔U + exchange
42026-05-05 01:21:59process1415f41a8c8 (spid 86, ecid 5)KEY on orders_pkU↔U + 2 exchange events
52026-05-05 01:21:56process13e78d1aca8 (spid 85, ecid 7)KEY on orders_pkU↔U + exchange
  • All processes ran the same statement at tpch10.dbo.usp_UpdateOrdersAndLineitem line 53.
  • All processes use read committed (2) isolation; RCSI is disabled.
  • All wait resources are KEY: 11:72057594047102976 on index orders_pk (clustered PK on dbo.orders).
  • Every graph includes one or more exchangeEvent … e_waitPortOpen entries — the UPDATE goes parallel, and parallel sibling threads (different ecid, same spid) are part of the cycle.
  • Two callers used overlapping ranges: 1995-01-01..1995-03-31 and 1995-02-15..1995-05-15.
  • Lock mode is U on both owner and waiter sides — classic update-scan/seek convert race when different workers hit the same keys via different orders.

Root Cause Analysis

1. Non-deterministic key access order under parallel UPDATE

The UPDATE joins dbo.orders to a @BatchOrders table variable. Because table variables expose cardinality 1 to the optimizer (no statistics), the planner often chooses a parallel plan with nested loop / hash join and an unordered driving stream. Multiple concurrent executions therefore acquire U locks on the same orders_pk keys in different orders, producing a classic deadlock cycle.

2. Parallelism amplifies the cycle

Every deadlock graph includes exchangeEvent resources of type e_waitPortOpen with multiple ecid entries on the same spid. This is intra-query parallel deadlock: one parallel worker holds a U lock another worker (in the same plan) is waiting for, while a different session is mirrored on the opposite side. Removing parallelism on this statement alone usually eliminates this class of deadlock.

3. Read Committed without RCSI

RCSI is disabled, so any concurrent reader/UPDATE on overlapping rows takes shared/U key locks instead of using row versioning. This significantly increases U-U conflict probability on hotspot rows.

4. Overlapping date partitions across callers

Sessions execute with @FromDate/@ToDate ranges that overlap (Jan–Mar vs. Feb–May 1995). They therefore write into the same o_orderkey set, guaranteeing contention.

5. Long-lived "user_transaction"

The graphs show transactionname="user_transaction" with a multi-second lasttranstarted gap. The procedure is doing additional work (likely also the lineitem update) inside the same transaction, holding U locks on orders longer than necessary.

Prioritized Recommendations

  1. P1 – Highest impact Force MAXDOP 1 on the offending UPDATE

    The intra-query parallel exchange events in every graph are a strong indicator. For an UPDATE that touches a relatively small filtered set joined to a table variable, parallelism rarely helps and creates deadlock surface area.

    UPDATE o
       SET o.o_comment = LEFT(CONCAT(o.o_comment, ' | BATCH_UPDATE ',
            CONVERT(varchar(10), @FromDate, 120), '-',
            CONVERT(varchar(10), @ToDate, 120)), 79)
    FROM dbo.orders AS o
    JOIN @BatchOrders AS bo
      ON bo.o_orderkey = o.o_orderkey
    OPTION (MAXDOP 1);

    Confidence: High. Direct mitigation of exchange-event deadlocks.

  2. P1 Enable RCSI on tpch10

    RCSI removes shared lock contention for readers and reduces conversion contention. It does not eliminate U-U writer conflicts but materially reduces deadlock frequency.

    -- Quiesce, then:
    ALTER DATABASE tpch10 SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
    -- Verify:
    SELECT name, is_read_committed_snapshot_on FROM sys.databases WHERE name = 'tpch10';

    Ensure tempdb has sufficient space and IO; monitor version_store usage. SQL Server 2022 + Enterprise tolerates this well.

    Confidence: High.

  3. P1 Process keys in deterministic order with explicit U-lock acquisition

    Materialize the target keys, sort them, and lock with UPDLOCK, HOLDLOCK in that order so all sessions take locks in the same sequence. This converts a deadlock-prone pattern into a blocking-only pattern.

    -- Inside usp_UpdateOrdersAndLineitem
    DECLARE @KeysToUpdate TABLE (o_orderkey BIGINT PRIMARY KEY);
    
    INSERT INTO @KeysToUpdate (o_orderkey)
    SELECT bo.o_orderkey
    FROM @BatchOrders AS bo
    ORDER BY bo.o_orderkey;   -- deterministic order
    
    UPDATE o
       SET o.o_comment = LEFT(CONCAT(o.o_comment, ' | BATCH_UPDATE ',
            CONVERT(varchar(10), @FromDate, 120), '-',
            CONVERT(varchar(10), @ToDate, 120)), 79)
    FROM dbo.orders AS o WITH (UPDLOCK, HOLDLOCK, ROWLOCK)
    JOIN @KeysToUpdate AS k
      ON k.o_orderkey = o.o_orderkey
    OPTION (MAXDOP 1, FORCE ORDER);

    Confidence: High. Combined with P1.1 + P1.2 this typically eliminates the cycle entirely.

  4. P2 Replace @BatchOrders table variable with a #temp table having a clustered PK

    Table variables expose 1-row cardinality to the optimizer (unless you use trace flag 2453 / OPTION (RECOMPILE)). This drives bad join orders and parallel plans that contribute to deadlocks. A #temp table with a clustered PK on o_orderkey gives accurate stats and a sorted seek into orders.

    CREATE TABLE #BatchOrders (o_orderkey BIGINT NOT NULL PRIMARY KEY CLUSTERED);
    INSERT INTO #BatchOrders (o_orderkey)
    SELECT o_orderkey FROM dbo.orders
    WHERE o_orderdate >= @FromDate AND o_orderdate < DATEADD(day,1,@ToDate);

    Confidence: High.

  5. P2 Eliminate overlapping ranges across callers

    The captured workloads overlap (Jan–Mar vs. Feb–May 1995). Either:

    • Use non-overlapping date partitions per caller, or
    • Serialize concurrent invocations with an application lock so only one update of the same logical partition runs at a time:
    BEGIN TRAN;
    DECLARE @rc INT;
    EXEC @rc = sp_getapplock
         @Resource    = N'usp_UpdateOrdersAndLineitem',
         @LockMode    = N'Exclusive',
         @LockOwner   = N'Transaction',
         @LockTimeout = 60000;
    IF @rc < 0
    BEGIN
       ROLLBACK; THROW 51000, 'Could not acquire applock for batch update.', 1;
    END
    -- ... do the update ...
    COMMIT;

    Confidence: High.

  6. P2 Shorten transaction scope & batch the update

    If the procedure also updates lineitem in the same transaction, split work into smaller, ordered batches (e.g., 1,000–10,000 keys per commit). Smaller transactions hold fewer U locks and reduce the deadlock window.

    -- pseudo-pattern
    WHILE EXISTS (SELECT 1 FROM #BatchOrders)
    BEGIN
       BEGIN TRAN;
    
       ;WITH cte AS (
           SELECT TOP (5000) o_orderkey
           FROM #BatchOrders
           ORDER BY o_orderkey
       )
       DELETE FROM cte
       OUTPUT deleted.o_orderkey INTO #Slice (o_orderkey);
    
       UPDATE o
          SET o.o_comment = ...
       FROM dbo.orders AS o WITH (UPDLOCK, HOLDLOCK, ROWLOCK)
       JOIN #Slice s ON s.o_orderkey = o.o_orderkey
       OPTION (MAXDOP 1);
    
       TRUNCATE TABLE #Slice;
       COMMIT;
    END

    Confidence: Medium-High.

  7. P3 Plan-level safety net: lock hint to disable lock escalation surprises

    Confirm orders does not escalate to TABLE locks under load:

    SELECT name, lock_escalation_desc
    FROM sys.tables WHERE name = 'orders';
    -- If anything other than TABLE/AUTO is unintended, leave AUTO.
    -- For very large batches, consider:
    ALTER TABLE dbo.orders SET (LOCK_ESCALATION = AUTO);

    Confidence: Medium. Diagnostic; only act if escalation is observed in extended events.

  8. P3 Set deadlock priority for batch jobs

    If after the above changes occasional deadlocks remain, mark the batch jobs as the preferred victim so OLTP traffic wins:

    SET DEADLOCK_PRIORITY LOW;

    Combine with retry logic (see next).

    Confidence: Medium.

  9. P3 Implement deterministic retry on error 1205
    DECLARE @attempt INT = 0;
    WHILE (1=1)
    BEGIN
       BEGIN TRY
          BEGIN TRAN;
          EXEC dbo.usp_UpdateOrdersAndLineitem @FromDate=@FromDate, @ToDate=@ToDate;
          COMMIT;
          BREAK;
       END TRY
       BEGIN CATCH
          IF XACT_STATE() <> 0 ROLLBACK;
          IF ERROR_NUMBER() = 1205 AND @attempt < 5
          BEGIN
             SET @attempt += 1;
             WAITFOR DELAY '00:00:00.250';
             CONTINUE;
          END
          THROW;
       END CATCH
    END

    Confidence: High as a defensive control; not a substitute for P1 fixes.

Recommended Refactor of usp_UpdateOrdersAndLineitem

Consolidated pattern incorporating P1 + P2 recommendations:

CREATE OR ALTER PROCEDURE dbo.usp_UpdateOrdersAndLineitem
    @FromDate date,
    @ToDate   date
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;

    -- 1) Materialize keys with statistics + ordering
    CREATE TABLE #BatchOrders (o_orderkey BIGINT NOT NULL PRIMARY KEY CLUSTERED);

    INSERT INTO #BatchOrders (o_orderkey)
    SELECT o_orderkey
    FROM dbo.orders
    WHERE o_orderdate >= @FromDate
      AND o_orderdate <  DATEADD(day, 1, @ToDate);

    -- 2) Serialize on logical partition (optional but strongly recommended)
    BEGIN TRAN;

    DECLARE @rc INT;
    EXEC @rc = sp_getapplock
         @Resource    = N'usp_UpdateOrdersAndLineitem',
         @LockMode    = N'Exclusive',
         @LockOwner   = N'Transaction',
         @LockTimeout = 60000;
    IF @rc < 0
    BEGIN
        ROLLBACK;
        THROW 51000, 'Could not acquire applock.', 1;
    END

    -- 3) Update orders in deterministic key order, no parallelism
    UPDATE o
       SET o.o_comment = LEFT(CONCAT(o.o_comment, ' | BATCH_UPDATE ',
              CONVERT(varchar(10), @FromDate, 120), '-',
              CONVERT(varchar(10), @ToDate, 120)), 79)
    FROM dbo.orders AS o WITH (UPDLOCK, HOLDLOCK, ROWLOCK)
    JOIN #BatchOrders AS b
      ON b.o_orderkey = o.o_orderkey
    OPTION (MAXDOP 1, FORCE ORDER);

    -- 4) Lineitem update should also key-order and MAXDOP 1
    -- UPDATE l ... FROM dbo.lineitem l WITH (UPDLOCK, HOLDLOCK, ROWLOCK)
    --   JOIN #BatchOrders b ON b.o_orderkey = l.l_orderkey
    --   OPTION (MAXDOP 1);

    COMMIT;
END
GO

Validation & Monitoring

  • Confirm RCSI is on: SELECT is_read_committed_snapshot_on FROM sys.databases WHERE name='tpch10';
  • Confirm parallel UPDATE no longer occurs — examine actual plan or query store: no Parallelism (Gather Streams) on the UPDATE node.
  • Track deadlocks via the system_health XEvent session:
    SELECT XEvent.value('(@timestamp)[1]', 'datetime2') AS ts,
           XEvent.query('.') AS deadlock_xml
    FROM (
        SELECT CAST(target_data AS XML) AS td
        FROM sys.dm_xe_session_targets t
        JOIN sys.dm_xe_sessions s ON s.address = t.event_session_address
        WHERE s.name = 'system_health' AND t.target_name = 'ring_buffer'
    ) x
    CROSS APPLY x.td.nodes('//RingBufferTarget/event[@name="xml_deadlock_report"]') AS X(XEvent)
    ORDER BY ts DESC;
  • Watch sys.dm_tran_version_store_space_usage after enabling RCSI.
  • Track sys.dm_exec_query_stats / Query Store regressions on the UPDATE after MAXDOP 1; expected to be neutral or positive.