SQL Server Deadlock Analysis – RockyPC / tpch10.dbo.orders – 2026-05-05

Tuning Goal: Fix Deadlocks

Server: RockyPC Version: SQL Server 2022 (16.0.1175.1) Developer Edition Database: master (deadlocks occurred in tpch10)

Executive Summary

All five deadlocks captured between 2026-05-05 01:21:56 UTC and 01:22:40 UTC involve the same statement: the UPDATE dbo.orders in stored procedure tpch10.dbo.usp_UpdateOrdersAndLineitem joining a @BatchOrders table variable. The deadlocks are U-U key lock cycles on the clustered index orders_pk combined with intra-query parallelism exchange-event waits (parallel worker threads on the same SPID forming part of the cycle). Multiple SQLCMD sessions execute the same procedure concurrently with overlapping date ranges, so they target overlapping o_orderkey rows but in different orders.

Top priorities to eliminate these deadlocks:

P1 Force a deterministic, ordered access path to orders (process keys in sorted order) and serialize the update using WITH (UPDLOCK, HOLDLOCK) on the driving set, eliminating the U/U conversion race. Confidence: High
P1 Enable Read Committed Snapshot Isolation (RCSI) on tpch10 to remove unnecessary read/U-lock contention from the same and other workloads. Confidence: High
P1 Suppress parallelism on this UPDATE statement (OPTION (MAXDOP 1)) — every captured deadlock graph contains parallel e_waitPortOpen exchange events, which is the proximate amplifier. Confidence: High
P2 Replace the @BatchOrders table variable with a temp table that has a clustered PK on o_orderkey, providing the optimizer with cardinality and ordering it currently lacks. Confidence: High
P2 Partition concurrent jobs by non-overlapping date ranges and/or serialize via sp_getapplock; current 1995-01-01–03-31 vs. 1995-02-15–05-15 ranges overlap. Confidence: High

Key Findings

#	Timestamp (UTC)	Victim	Resource	Pattern
1	2026-05-05 01:22:40	process140206ad468 (spid 69, ecid 7)	KEY on orders_pk	U↔U + parallel exchange
2	2026-05-05 01:22:03	process140d354cca8 (spid 86, ecid 7)	KEY on orders_pk	4-process U-cycle + exchange
3	2026-05-05 01:22:00	process14152bfa4e8 (spid 84, ecid 5)	KEY on orders_pk	U↔U + exchange
4	2026-05-05 01:21:59	process1415f41a8c8 (spid 86, ecid 5)	KEY on orders_pk	U↔U + 2 exchange events
5	2026-05-05 01:21:56	process13e78d1aca8 (spid 85, ecid 7)	KEY on orders_pk	U↔U + exchange

All processes ran the same statement at tpch10.dbo.usp_UpdateOrdersAndLineitem line 53.
All processes use read committed (2) isolation; RCSI is disabled.
All wait resources are KEY: 11:72057594047102976 on index orders_pk (clustered PK on dbo.orders).
Every graph includes one or more exchangeEvent … e_waitPortOpen entries — the UPDATE goes parallel, and parallel sibling threads (different ecid, same spid) are part of the cycle.
Two callers used overlapping ranges: 1995-01-01..1995-03-31 and 1995-02-15..1995-05-15.
Lock mode is U on both owner and waiter sides — classic update-scan/seek convert race when different workers hit the same keys via different orders.

Root Cause Analysis

1. Non-deterministic key access order under parallel UPDATE

The UPDATE joins dbo.orders to a @BatchOrders table variable. Because table variables expose cardinality 1 to the optimizer (no statistics), the planner often chooses a parallel plan with nested loop / hash join and an unordered driving stream. Multiple concurrent executions therefore acquire U locks on the same orders_pk keys in different orders, producing a classic deadlock cycle.

2. Parallelism amplifies the cycle

Every deadlock graph includes exchangeEvent resources of type e_waitPortOpen with multiple ecid entries on the same spid. This is intra-query parallel deadlock: one parallel worker holds a U lock another worker (in the same plan) is waiting for, while a different session is mirrored on the opposite side. Removing parallelism on this statement alone usually eliminates this class of deadlock.

3. Read Committed without RCSI

RCSI is disabled, so any concurrent reader/UPDATE on overlapping rows takes shared/U key locks instead of using row versioning. This significantly increases U-U conflict probability on hotspot rows.

4. Overlapping date partitions across callers

Sessions execute with @FromDate/@ToDate ranges that overlap (Jan–Mar vs. Feb–May 1995). They therefore write into the same o_orderkey set, guaranteeing contention.

5. Long-lived "user_transaction"

The graphs show transactionname="user_transaction" with a multi-second lasttranstarted gap. The procedure is doing additional work (likely also the lineitem update) inside the same transaction, holding U locks on orders longer than necessary.

Prioritized Recommendations

P1 – Highest impact Force MAXDOP 1 on the offending UPDATE
The intra-query parallel exchange events in every graph are a strong indicator. For an UPDATE that touches a relatively small filtered set joined to a table variable, parallelism rarely helps and creates deadlock surface area.
```
UPDATE o
   SET o.o_comment = LEFT(CONCAT(o.o_comment, ' | BATCH_UPDATE ',
        CONVERT(varchar(10), @FromDate, 120), '-',
        CONVERT(varchar(10), @ToDate, 120)), 79)
FROM dbo.orders AS o
JOIN @BatchOrders AS bo
  ON bo.o_orderkey = o.o_orderkey
OPTION (MAXDOP 1);
```
Confidence: High. Direct mitigation of exchange-event deadlocks.
P1 Enable RCSI on tpch10
RCSI removes shared lock contention for readers and reduces conversion contention. It does not eliminate U-U writer conflicts but materially reduces deadlock frequency.
```
-- Quiesce, then:
ALTER DATABASE tpch10 SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
-- Verify:
SELECT name, is_read_committed_snapshot_on FROM sys.databases WHERE name = 'tpch10';
```
Ensure tempdb has sufficient space and IO; monitor version_store usage. SQL Server 2022 + Enterprise tolerates this well.

Confidence: High.

P1 Process keys in deterministic order with explicit U-lock acquisition

Materialize the target keys, sort them, and lock with UPDLOCK, HOLDLOCK in that order so all sessions take locks in the same sequence. This converts a deadlock-prone pattern into a blocking-only pattern.

-- Inside usp_UpdateOrdersAndLineitem
DECLARE @KeysToUpdate TABLE (o_orderkey BIGINT PRIMARY KEY);

INSERT INTO @KeysToUpdate (o_orderkey)
SELECT bo.o_orderkey
FROM @BatchOrders AS bo
ORDER BY bo.o_orderkey;   -- deterministic order

UPDATE o
   SET o.o_comment = LEFT(CONCAT(o.o_comment, ' | BATCH_UPDATE ',
        CONVERT(varchar(10), @FromDate, 120), '-',
        CONVERT(varchar(10), @ToDate, 120)), 79)
FROM dbo.orders AS o WITH (UPDLOCK, HOLDLOCK, ROWLOCK)
JOIN @KeysToUpdate AS k
  ON k.o_orderkey = o.o_orderkey
OPTION (MAXDOP 1, FORCE ORDER);

Confidence: High. Combined with P1.1 + P1.2 this typically eliminates the cycle entirely.

P2 Replace @BatchOrders table variable with a #temp table having a clustered PK
Table variables expose 1-row cardinality to the optimizer (unless you use trace flag 2453 / OPTION (RECOMPILE)). This drives bad join orders and parallel plans that contribute to deadlocks. A #temp table with a clustered PK on o_orderkey gives accurate stats and a sorted seek into orders.
```
CREATE TABLE #BatchOrders (o_orderkey BIGINT NOT NULL PRIMARY KEY CLUSTERED);
INSERT INTO #BatchOrders (o_orderkey)
SELECT o_orderkey FROM dbo.orders
WHERE o_orderdate >= @FromDate AND o_orderdate < DATEADD(day,1,@ToDate);
```
Confidence: High.

P2 Eliminate overlapping ranges across callers

The captured workloads overlap (Jan–Mar vs. Feb–May 1995). Either:

Use non-overlapping date partitions per caller, or
Serialize concurrent invocations with an application lock so only one update of the same logical partition runs at a time:

BEGIN TRAN;
DECLARE @rc INT;
EXEC @rc = sp_getapplock
     @Resource    = N'usp_UpdateOrdersAndLineitem',
     @LockMode    = N'Exclusive',
     @LockOwner   = N'Transaction',
     @LockTimeout = 60000;
IF @rc < 0
BEGIN
   ROLLBACK; THROW 51000, 'Could not acquire applock for batch update.', 1;
END
-- ... do the update ...
COMMIT;

Confidence: High.

P2 Shorten transaction scope & batch the update

If the procedure also updates lineitem in the same transaction, split work into smaller, ordered batches (e.g., 1,000–10,000 keys per commit). Smaller transactions hold fewer U locks and reduce the deadlock window.

-- pseudo-pattern
WHILE EXISTS (SELECT 1 FROM #BatchOrders)
BEGIN
   BEGIN TRAN;

   ;WITH cte AS (
       SELECT TOP (5000) o_orderkey
       FROM #BatchOrders
       ORDER BY o_orderkey
   )
   DELETE FROM cte
   OUTPUT deleted.o_orderkey INTO #Slice (o_orderkey);

   UPDATE o
      SET o.o_comment = ...
   FROM dbo.orders AS o WITH (UPDLOCK, HOLDLOCK, ROWLOCK)
   JOIN #Slice s ON s.o_orderkey = o.o_orderkey
   OPTION (MAXDOP 1);

   TRUNCATE TABLE #Slice;
   COMMIT;
END

Confidence: Medium-High.

P3 Plan-level safety net: lock hint to disable lock escalation surprises

Confirm orders does not escalate to TABLE locks under load:

SELECT name, lock_escalation_desc
FROM sys.tables WHERE name = 'orders';
-- If anything other than TABLE/AUTO is unintended, leave AUTO.
-- For very large batches, consider:
ALTER TABLE dbo.orders SET (LOCK_ESCALATION = AUTO);

Confidence: Medium. Diagnostic; only act if escalation is observed in extended events.

P3 Set deadlock priority for batch jobs
If after the above changes occasional deadlocks remain, mark the batch jobs as the preferred victim so OLTP traffic wins:
```
SET DEADLOCK_PRIORITY LOW;
```
Combine with retry logic (see next).

Confidence: Medium.

P3 Implement deterministic retry on error 1205

DECLARE @attempt INT = 0;
WHILE (1=1)
BEGIN
   BEGIN TRY
      BEGIN TRAN;
      EXEC dbo.usp_UpdateOrdersAndLineitem @FromDate=@FromDate, @ToDate=@ToDate;
      COMMIT;
      BREAK;
   END TRY
   BEGIN CATCH
      IF XACT_STATE() <> 0 ROLLBACK;
      IF ERROR_NUMBER() = 1205 AND @attempt < 5
      BEGIN
         SET @attempt += 1;
         WAITFOR DELAY '00:00:00.250';
         CONTINUE;
      END
      THROW;
   END CATCH
END

Confidence: High as a defensive control; not a substitute for P1 fixes.

Recommended Refactor of `usp_UpdateOrdersAndLineitem`

Consolidated pattern incorporating P1 + P2 recommendations:

CREATE OR ALTER PROCEDURE dbo.usp_UpdateOrdersAndLineitem
    @FromDate date,
    @ToDate   date
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;

    -- 1) Materialize keys with statistics + ordering
    CREATE TABLE #BatchOrders (o_orderkey BIGINT NOT NULL PRIMARY KEY CLUSTERED);

    INSERT INTO #BatchOrders (o_orderkey)
    SELECT o_orderkey
    FROM dbo.orders
    WHERE o_orderdate >= @FromDate
      AND o_orderdate <  DATEADD(day, 1, @ToDate);

    -- 2) Serialize on logical partition (optional but strongly recommended)
    BEGIN TRAN;

    DECLARE @rc INT;
    EXEC @rc = sp_getapplock
         @Resource    = N'usp_UpdateOrdersAndLineitem',
         @LockMode    = N'Exclusive',
         @LockOwner   = N'Transaction',
         @LockTimeout = 60000;
    IF @rc < 0
    BEGIN
        ROLLBACK;
        THROW 51000, 'Could not acquire applock.', 1;
    END

    -- 3) Update orders in deterministic key order, no parallelism
    UPDATE o
       SET o.o_comment = LEFT(CONCAT(o.o_comment, ' | BATCH_UPDATE ',
              CONVERT(varchar(10), @FromDate, 120), '-',
              CONVERT(varchar(10), @ToDate, 120)), 79)
    FROM dbo.orders AS o WITH (UPDLOCK, HOLDLOCK, ROWLOCK)
    JOIN #BatchOrders AS b
      ON b.o_orderkey = o.o_orderkey
    OPTION (MAXDOP 1, FORCE ORDER);

    -- 4) Lineitem update should also key-order and MAXDOP 1
    -- UPDATE l ... FROM dbo.lineitem l WITH (UPDLOCK, HOLDLOCK, ROWLOCK)
    --   JOIN #BatchOrders b ON b.o_orderkey = l.l_orderkey
    --   OPTION (MAXDOP 1);

    COMMIT;
END
GO

Validation & Monitoring

Confirm RCSI is on: SELECT is_read_committed_snapshot_on FROM sys.databases WHERE name='tpch10';
Confirm parallel UPDATE no longer occurs — examine actual plan or query store: no Parallelism (Gather Streams) on the UPDATE node.

Track deadlocks via the system_health XEvent session:

SELECT XEvent.value('(@timestamp)[1]', 'datetime2') AS ts,
       XEvent.query('.') AS deadlock_xml
FROM (
    SELECT CAST(target_data AS XML) AS td
    FROM sys.dm_xe_session_targets t
    JOIN sys.dm_xe_sessions s ON s.address = t.event_session_address
    WHERE s.name = 'system_health' AND t.target_name = 'ring_buffer'
) x
CROSS APPLY x.td.nodes('//RingBufferTarget/event[@name="xml_deadlock_report"]') AS X(XEvent)
ORDER BY ts DESC;

Watch sys.dm_tran_version_store_space_usage after enabling RCSI.
Track sys.dm_exec_query_stats / Query Store regressions on the UPDATE after MAXDOP 1; expected to be neutral or positive.