Query Tuning Report – RockyPC/SQLStorm – QueryHash 0x932168333A069240

Tuning Goal: Query Tuner  |  Server: RockyPC  |  Version: SQL Server 2022 (16.0.1180.1) Developer Edition (Enterprise engine)  |  Database: SQLStorm

Executive Summary

The query has an estimated total cost of 20.63 (Medium) but contains multiple structural issues that inflate the cost dramatically beyond what's necessary. The optimizer hit an StatementOptmEarlyAbortReason="TimeOut", meaning the chosen plan is likely suboptimal. The most significant problems are: (1) full clustered index scans on Votes (926K rows), Comments (351K rows × 2), Posts (246K rows × 2), Users (267K rows), and Badges (439K rows) due to missing supporting indexes; (2) an Index Spool / Eager Spool (cost 15.07 alone — over 73% of plan cost) materializing 351K Comments rows to satisfy the OR EXISTS predicate; (3) the UserReputation CTE aggregates all users/posts/badges before filtering PostCount > 5, doing far more work than needed; and (4) a memory grant warning — desired 111 MB but granted 0 KB, indicating a likely runtime sort spill.

Top Priorities

  1. P1 Add nonclustered index on Comments(PostId) INCLUDE (UserId) — eliminates the 15.07-cost Eager Spool and two 351K-row Comments scans. Plan-reported impact 40%.
  2. P1 Rewrite the OR EXISTS as UNION ALL — the OR with a correlated EXISTS forces the Concatenation + Eager Spool pattern, the single largest cost driver.
  3. P2 Add nonclustered index on Posts(OwnerUserId) INCLUDE (CreationDate, Title) — eliminates two Posts clustered index scans (398 MB table). Plan impact 12.68%, DMV impact up to 28.81%.
  4. P2 Add filtered index on Votes(PostId) INCLUDE (BountyAmount) WHERE VoteTypeId = 8 — eliminates the 926K-row Votes clustered scan. Only ~1,478 rows match.
  5. P2 Add nonclustered index on Badges(UserId) INCLUDE (Class) — DMV shows 76% impact and 24,652 avg cost from other queries; benefits this query too.
  6. P3 Push PostCount > 5 filter earlier in the UserReputation CTE or restructure so that small set of "active users" drives the join.

Confidence: High for index and rewrite recommendations (evidence taken directly from showplan operators and missing-index hints). Confidence: Medium on absolute runtime gains — execution plan estimates only; validate with actual stats.

Environment

  • Server: RockyPC — SQL Server 2022 (16.x) RTM-GDR 16.0.1180.1, Developer Edition (Enterprise engine)
  • Database: SQLStorm — Auto Create Stats ON, Auto Update Stats ON, Async Update OFF
  • Platform features available: Batch Mode on rowstore (already in use — BatchModeOnRowStoreUsed="true"), filtered indexes, online index builds, IQP / PSP optimization, DOP feedback
  • Plan optimization: StatementOptmLevel="FULL" but StatementOptmEarlyAbortReason="TimeOut" — optimizer ran out of search time, so plan is not provably optimal

Key Findings

1. Massive Eager Index Spool (P1 — Critical for this query)

Evidence: NodeId="28" PhysicalOp="Index Spool" LogicalOp="Eager Spool" EstimatedTotalSubtreeCost="15.0741", fed by Clustered Index Scan of PK__Comments__3214EC079DB7C869 reading 351,440 rows. This spool alone is 73% of the entire query cost. It exists because the OR EXISTS in the outer WHERE forces the optimizer to build an on-the-fly index on Comments to support the per-row correlated lookup driven by the Nested Loops Left Semi Join (NodeId 1).

2. Full Clustered Index Scans on Large Tables (P1/P2)

TableRows ReadSizeOperator CostReason
Votes926,08412 MB2.14No index on VoteTypeId / PostId — filter VoteTypeId = 8 applied as predicate after full scan
Posts (probe 1)246,672398 MB2.59CreationDate filter applied as predicate; no supporting index
Posts (probe 2)246,672398 MB2.59OwnerUserId join, no NC index
Comments (probe)351,44078 MB7.79PostId join needs index
Comments (spool source)351,44078 MB7.80Same — feeds Eager Spool
Users267,19344 MB1.73Join probe — clustered scan acceptable but bitmap probes used
Badges439,3528 MB1.24No index on UserId

Index usage stats confirm none of Votes/Comments/Badges have any seeks recorded — they are scan-only objects in the workload, consistent with missing access paths.

3. Cardinality Estimation / Optimizer Timeout

Evidence: StatementOptmEarlyAbortReason="TimeOut" with CompileTime="84" ms. The optimizer didn't fully explore. Reducing the search space via better predicate placement (rewrite) and supporting indexes will let it find a cheaper plan.

4. Memory Grant Anomaly / Likely Sort Spill

Evidence: SerialDesiredMemory="111200" KB, GrantedMemory="0" KB. The Sort (NodeId 2) has MemoryFractions Input="0.688312". If the granted memory is actually 0 at runtime, the sort will spill to tempdb. This is symptomatic of estimated-plan-only data; verify with actual execution stats.

5. Aggregations Before Filtering in UserReputation CTE

Evidence: Hash Match (Aggregate) NodeId 16 produces 6,792 rows from joining all 267K Users × all 246K Posts × all 439K Badges before the PostCount > 5 filter is applied (Filter NodeId 3, predicate [Expr1009] > (5)). All Users/Posts/Badges are scanned regardless of post count.

Detailed Prioritized Recommendations

All runnable scripts are in the Scripts section. Storage estimates assume PAGE compression (consistent with existing clustered indexes) and 4-byte int keys; treat as approximate (±25%).

  1. P1 — Create NC index on Comments(PostId) INCLUDE (UserId)

    Evidence: Missing index hint Impact="40.0023" for [Comments].[PostId]. The Eager Spool (cost 15.07) and two full Comments scans (cost 7.79 + 7.80) all disappear with this single index.

    • Estimated storage: ~5.5 MB (351,440 rows × ~12 bytes effective + tree overhead, PAGE compressed)
    • Expected read benefit: ~55–70% of total query cost reduction (removes the dominant cost driver)
    • Expected write overhead: ~3–5% on INSERT/UPDATE/DELETE of Comments (single narrow NC index)
    • Maintenance notes: Build ONLINE = ON, DATA_COMPRESSION = PAGE. Statistics auto-created/updated. Low fragmentation risk because PostId tends to be append-correlated with new comments on recent posts.
    • Risk: Low. Narrow index on a relatively small (78 MB) table. No write-heavy workload signal in DMV stats (Updates = 0 recorded).
  2. P1 — Rewrite OR EXISTS as UNION ALL

    Evidence: The Concatenation (NodeId 24) plus Index Spool (NodeId 28, cost 15.07) plus Nested Loops Left Semi Join (NodeId 1) collectively exist to handle WHERE rp.TotalBounty > 0 OR EXISTS (...). OR with correlated subquery defeats set-based execution and forces row-by-row spooling.

    See Query Rewrite for the full T-SQL. After this rewrite combined with the Comments index, the plan should resolve EXISTS as a hash semi-join in batch mode, eliminating the spool entirely.

    • Expected gain: Removes the 15.07 cost spool subtree; combined with index gives ~70–80% cost reduction.
    • Risk: Semantically equivalent only if duplicates are deduped — use UNION (not UNION ALL) over keyed rows, or wrap properly. See script.
  3. P2 — Create NC index on Posts(OwnerUserId) INCLUDE (CreationDate, Title)

    Evidence: Missing index hint Impact="12.6801". DMV shows additional missing-index requests on Posts(OwnerUserId) with avg user impact 28.81% across 56 seeks/scans — broadly beneficial. Posts is 398 MB; full clustered scans cost 2.59 each, two of them per query execution.

    • Estimated storage: ~22 MB (740K rows × ~30 bytes effective with Title being variable; PAGE compressed reduces materially)
    • Expected read benefit: ~12–18% on this query; substantial broader workload benefit per DMV
    • Expected write overhead: ~4–6% on Posts DML (Title is a wide INCLUDE — updates to Title trigger NC update)
    • Maintenance notes: Title is nvarchar, so this can grow. Consider OMITTING Title initially and only including CreationDate to keep narrow. Build ONLINE = ON, PAGE compression. Monitor fragmentation due to OwnerUserId being non-sequential.
    • Risk: Medium-low. Title increases size — alternative narrow form is provided as Script 3b.
  4. P2 — Create filtered NC index on Votes(PostId) INCLUDE (BountyAmount) WHERE VoteTypeId = 8

    Evidence: Missing index hint Impact="11.9123". Only 1,478 estimated rows match VoteTypeId = 8 out of 926,084 — perfect candidate for a filtered index. Current plan scans all 926K rows (cost 2.14) to find them.

    • Estimated storage: <100 KB (only matching rows are stored)
    • Expected read benefit: ~10–12% on this query; collapses the Votes scan to a tiny seek
    • Expected write overhead: Negligible — filtered index only updates when VoteTypeId = 8 rows change
    • Maintenance notes: Filtered indexes require SET QUOTED_IDENTIFIER ON and ANSI_NULLS ON at connection time to be used. Most modern clients (.NET, SSMS) set these correctly; legacy clients may not benefit.
    • Risk: Low. Very small, very targeted.
  5. P2 — Create NC index on Badges(UserId) INCLUDE (Class)

    Evidence: DMV missing-index entry shows Avg User Cost 24,652, User Impact 76.44%, plus a related request for INCLUDE (Name) with 87.57% impact. For this query specifically, eliminates the 439K-row clustered scan (cost 1.24).

    • Estimated storage: ~3 MB (439K rows × ~10 bytes effective, PAGE compressed)
    • Expected read benefit: ~6% on this query; ~50%+ on other workload queries per DMV evidence
    • Expected write overhead: ~2–3% on Badges INSERTs
    • Maintenance notes: Build ONLINE = ON, PAGE compression. Class has only 2 distinct values so it's tiny.
    • Risk: Low.
  6. P3 — Push the PostCount > 5 filter earlier

    Evidence: Filter NodeId 3 (predicate [Expr1009] > (5)) executes after the full aggregate over all Users × Posts × Badges. We can pre-filter to users with > 5 posts before joining Badges. See the rewrite. Combined with the Posts(OwnerUserId) index, this becomes a streaming aggregate.

    • Expected gain: Reduces Badges-side hash build dramatically; modest extra benefit on top of indexes.
    • Risk: Low. Logically equivalent.
  7. P3 — Statistics maintenance

    All referenced statistics are sampled at 100% and most were updated on 2026-03-19 with ModificationCount="0". No action needed currently. Recommend continuing scheduled stats maintenance (e.g., Ola Hallengren) with @UpdateStatistics='ALL', @OnlyModifiedStatistics='Y'. See Statistics section.

  8. P4 — Parameter sniffing considerations

    The query has no parameters — only literal constants ('2024-10-01 12:34:56', VoteTypeId = 8, PostCount > 5, OFFSET 0 FETCH NEXT 100). Parameter sniffing and PSP optimization don't apply here. If this query becomes parameterized in production, SQL Server 2022's Parameter Sensitive Plan optimization will help automatically. No plan guide needed at this time.

Estimated Plan: Before vs After

Text-based comparison of the dominant subtree, showing operators, estimated cost, and estimated row counts. After applying the Comments index + UNION ALL rewrite + Posts/Votes/Badges indexes.

BEFORE — Cost 20.63

Top (rows: 100, cost: 20.63)
 └─ Nested Loops Left Semi Join (cost: 20.63) ◄ row-by-row
     ├─ Sort (cost: 20.63, mem-frac 0.69) ◄ may spill
     │   └─ Filter PostCount>5 (rows: 5,053) ◄ late filter
     │       └─ Hash Aggregate (rows: 8,126)
     │           └─ Hash Inner Join Users×Posts
     │               ├─ Hash Aggregate (rows: 8,352)
     │               │   └─ Hash Right Outer Join Votes×Posts
     │               │       ├─ CI Scan Votes (926,084 rows, cost 2.14) ✗
     │               │       └─ Hash Left Outer Join Posts×Comments
     │               │           ├─ CI Scan Posts (246,672 rows, cost 2.59) ✗
     │               │           └─ CI Scan Comments (351,440 rows, cost 7.79) ✗
     │               └─ Hash Left Outer Join Users×Posts×Badges
     │                   ├─ CI Scan Users (267,193 rows, cost 1.73)
     │                   ├─ CI Scan Posts (246,672 rows, cost 2.59) ✗
     │                   └─ CI Scan Badges (439,352 rows, cost 1.24) ✗
     └─ Concatenation
         └─ Filter EXISTS
             └─ Index Spool / Eager Spool (cost: 15.07) ✗✗✗ DOMINANT
                 └─ CI Scan Comments (351,440 rows, cost 7.80) ✗

AFTER — Projected cost ~4–6

Top (rows: 100)
 └─ Sort TOP-N (small, in-memory)
     └─ Hash Inner Join RankedPosts×ActiveUsers
         │   (UNION ALL branch keeps it set-based)
         ├─ Hash Aggregate (rows ~8K)
         │   └─ Hash Left Outer Join Posts×Comments
         │       ├─ Index Seek Posts by CreationDate
         │       │   (or CI scan still acceptable)
         │       └─ Index Seek Comments(PostId) ✓ NEW INDEX
         │           INCLUDE(UserId) ── enables semi-join
         │           AND removes spool
         │   ⤷ Hash Left Outer Join Votes(filtered) ✓ NEW FILTERED IDX
         │       Index Seek Votes WHERE VoteTypeId=8 (≈1,478 rows)
         └─ Hash Aggregate Users with PostCount>5 pushed early
             ├─ Index Seek Posts(OwnerUserId) ✓ NEW INDEX
             └─ Index Seek Badges(UserId) ✓ NEW INDEX
                 INCLUDE(Class)

Eager Spool: ELIMINATED ✓
Full CI scans on Votes/Comments/Badges: ELIMINATED ✓
Sort spill risk: REDUCED (smaller input)

Diagram description: side-by-side text plan trees. Left shows the current plan with the Eager Spool (cost 15.07) as the dominant subtree and five full clustered index scans of large tables. Right shows the projected plan after recommendations, where index seeks replace scans and the spool is eliminated by the UNION ALL rewrite plus the Comments(PostId) index.

Query Rewrite

The key change is replacing WHERE rp.TotalBounty > 0 OR EXISTS (...) with UNION (deduplicates on PostId). This lets the optimizer process each branch as a set-based join rather than introducing a Concatenation+Spool to satisfy OR.

A second improvement: pre-filter users to those with > 5 posts via a derived "ActiveUserIds" set before aggregating Badges, so we don't sum badges for users we'll discard.

The full rewritten T-SQL is in Script 6.

Statistics Maintenance

  • All statistics on referenced tables were updated 2026-03-19 at 100% sample with ModificationCount = 0. They are fresh and complete; no immediate update needed for this query.
  • Badges stats are slightly older (2026-02-28) but still show ModificationCount = 0.
  • Recommend keeping AUTO_UPDATE_STATISTICS_ASYNC = OFF only if you can tolerate occasional sync stats updates blocking queries; otherwise enable it to avoid synchronous waits.
  • After creating the new indexes, statistics will be auto-created on the index keys at 100% sample.

Table-Specific Insights

TableRowsSizeInsight
Posts740,016398 MBLargest table by storage. Heavily scanned (2,497 scans, 240 seeks). DMV shows ~20 distinct missing-index requests — there is a broader index-strategy gap on this table beyond this query.
Votes926,08412 MBHighest row count but small (narrow rows). 1,243 scans, zero seeks — no useful NC indexes exist. VoteTypeId selectivity is excellent for filtered indexes.
Users534,38644 MBReputation has 2,443 distinct values — good selectivity. DMV flags missing inequality index on Reputation (8.1% impact, 33 hits) — minor.
Badges439,3528 MBZero seeks; high-impact missing index on UserId. Small table — index is cheap.
Comments351,44078 MBZero seeks; PostId has 122,541 distinct values (~3 comments/post) — selective. Missing PostId index is the single biggest miss in the workload per the 40% plan impact.

Note: OwnerUserId on Posts has all-density 0.000015 (~65,118 distinct), so a seek on OwnerUserId averages ~11 matching rows — very selective and ideal for a NC index seek.

Validation Procedure

  1. Capture baseline. Run the original query in a session with SET STATISTICS IO ON; SET STATISTICS TIME ON; and save the output (logical reads per table, CPU time, elapsed time, actual execution plan).
  2. Apply indexes one at a time if possible (Script 1 → Script 2 → Script 3 → Script 4), re-running the original query between each to attribute gains.
  3. Apply the query rewrite (Script 6) and compare again.
  4. Use sys.dm_db_index_usage_stats after 24–48 hours to confirm the new indexes are actually being used (seeks > 0) and not just adding write overhead.
  5. Monitor sys.dm_db_index_operational_stats for lock/latch contention if write workload changes.
  6. If runtime is critical, enable Query Store and compare the runtime stats and plans across the change boundary.
Important: All cost figures in this report are estimated from the showplan. Actual gains depend on data distribution, memory, I/O subsystem, and concurrency. Always validate with actual execution stats before deploying to production.

Scripts

Script 1 — P1: Create NC index on Comments(PostId) INCLUDE (UserId)

USE [SQLStorm];
GO
-- P1: Eliminates the 15.07-cost Eager Spool and full Comments scans.
-- Plan-reported impact: 40.00%
IF NOT EXISTS (
    SELECT 1 FROM sys.indexes
    WHERE object_id = OBJECT_ID(N'dbo.Comments')
      AND name = N'IX_Comments_PostId_INC_UserId'
)
BEGIN
    CREATE NONCLUSTERED INDEX IX_Comments_PostId_INC_UserId
        ON dbo.Comments (PostId)
        INCLUDE (UserId)
        WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 100);
END;
GO

Script 2 — P2: Create NC index on Votes — filtered for VoteTypeId = 8

USE [SQLStorm];
GO
-- P2: Filtered index for bounty votes. Only ~1,478 of 926,084 rows match.
-- Plan-reported impact: 11.91%
IF NOT EXISTS (
    SELECT 1 FROM sys.indexes
    WHERE object_id = OBJECT_ID(N'dbo.Votes')
      AND name = N'IX_Votes_PostId_INC_Bounty_FILT_VoteType8'
)
BEGIN
    CREATE NONCLUSTERED INDEX IX_Votes_PostId_INC_Bounty_FILT_VoteType8
        ON dbo.Votes (PostId)
        INCLUDE (BountyAmount)
        WHERE VoteTypeId = 8
        WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 100);
END;
GO
-- NOTE: Sessions using this index must have:
--   SET QUOTED_IDENTIFIER ON;  SET ANSI_NULLS ON;
-- (These are default for SSMS, .NET SqlClient, ODBC modern drivers.)

Script 3a — P2: Create NC index on Posts(OwnerUserId) INCLUDE (CreationDate, Title) — full version

USE [SQLStorm];
GO
-- P2: Eliminates Posts CI scans on the OwnerUserId join path.
-- Plan-reported impact: 12.68%; DMV avg user impact up to 28.81%.
-- This version covers OwnerUserId + outputs needed (CreationDate, Title).
IF NOT EXISTS (
    SELECT 1 FROM sys.indexes
    WHERE object_id = OBJECT_ID(N'dbo.Posts')
      AND name = N'IX_Posts_OwnerUserId_INC_CreationDate_Title'
)
BEGIN
    CREATE NONCLUSTERED INDEX IX_Posts_OwnerUserId_INC_CreationDate_Title
        ON dbo.Posts (OwnerUserId)
        INCLUDE (CreationDate, Title)
        WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 100);
END;
GO

Script 3b — P2 (alternative narrow): Posts(OwnerUserId) INCLUDE (CreationDate) only

USE [SQLStorm];
GO
-- P2 ALTERNATIVE: Choose this OR Script 3a, not both.
-- Narrower index (no Title) -- smaller storage, lower write overhead,
-- but query will still need a key lookup for Title.
-- Use this version if write workload on Posts is heavy or storage is tight.
IF NOT EXISTS (
    SELECT 1 FROM sys.indexes
    WHERE object_id = OBJECT_ID(N'dbo.Posts')
      AND name = N'IX_Posts_OwnerUserId_INC_CreationDate'
)
BEGIN
    CREATE NONCLUSTERED INDEX IX_Posts_OwnerUserId_INC_CreationDate
        ON dbo.Posts (OwnerUserId)
        INCLUDE (CreationDate)
        WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 100);
END;
GO

Script 4 — P2: Create NC index on Badges(UserId) INCLUDE (Class)

USE [SQLStorm];
GO
-- P2: Eliminates Badges CI scan; addresses high-impact DMV finding (76% impact).
IF NOT EXISTS (
    SELECT 1 FROM sys.indexes
    WHERE object_id = OBJECT_ID(N'dbo.Badges')
      AND name = N'IX_Badges_UserId_INC_Class'
)
BEGIN
    CREATE NONCLUSTERED INDEX IX_Badges_UserId_INC_Class
        ON dbo.Badges (UserId)
        INCLUDE (Class)
        WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 100);
END;
GO

Script 5 — Baseline / validation harness

USE [SQLStorm];
GO
-- Use this to capture before/after metrics. Run it BEFORE creating indexes,
-- then again AFTER each change, saving the messages tab output.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SET NOCOUNT ON;
GO

-- Clear cache ONLY in a non-production / dev environment for clean comparison.
-- DBCC FREEPROCCACHE;  -- DEV ONLY
-- DBCC DROPCLEANBUFFERS; -- DEV ONLY

-- === ORIGINAL QUERY ===
WITH RankedPosts AS (
    SELECT
        p.Id AS PostId, p.Title, p.CreationDate, p.OwnerUserId,
        COUNT(c.Id) AS CommentCount,
        SUM(v.BountyAmount) AS TotalBounty,
        ROW_NUMBER() OVER (PARTITION BY p.OwnerUserId ORDER BY p.CreationDate DESC) AS PostRank
    FROM dbo.Posts p
    LEFT JOIN dbo.Comments c ON p.Id = c.PostId
    LEFT JOIN dbo.Votes v ON p.Id = v.PostId AND v.VoteTypeId = 8
    WHERE p.CreationDate >= DATEADD(year, -1, CAST('2024-10-01 12:34:56' AS datetime))
    GROUP BY p.Id, p.Title, p.CreationDate, p.OwnerUserId
),
UserReputation AS (
    SELECT u.Id AS UserId, u.Reputation,
           COUNT(DISTINCT p.Id) AS PostCount,
           SUM(ISNULL(b.Class, 0)) AS TotalBadges
    FROM dbo.Users u
    LEFT JOIN dbo.Posts p ON u.Id = p.OwnerUserId
    LEFT JOIN dbo.Badges b ON u.Id = b.UserId
    GROUP BY u.Id, u.Reputation
),
ActiveUsers AS (
    SELECT ur.UserId, ur.Reputation, ur.PostCount, ur.TotalBadges,
           ROW_NUMBER() OVER (ORDER BY ur.Reputation DESC) AS UserRank
    FROM UserReputation ur WHERE ur.PostCount > 5
)
SELECT rp.PostId, rp.Title, rp.CreationDate, ua.UserId, ua.Reputation,
       rp.CommentCount, rp.TotalBounty, ua.PostCount, ua.TotalBadges
FROM RankedPosts rp
JOIN ActiveUsers ua ON rp.OwnerUserId = ua.UserId
WHERE rp.TotalBounty > 0
   OR EXISTS (SELECT 1 FROM dbo.Comments c WHERE c.PostId = rp.PostId AND c.UserId IS NOT NULL)
ORDER BY ua.Reputation DESC, rp.CreationDate DESC
OFFSET 0 ROWS FETCH NEXT 100 ROWS ONLY;
GO

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO

Script 6 — P1: Rewritten query (UNION + early filter for ActiveUsers)

USE [SQLStorm];
GO
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SET NOCOUNT ON;
GO

-- === REWRITTEN QUERY ===
-- Changes:
--   (1) OR EXISTS -> two-branch UNION (removes the Eager Spool / Concatenation pattern)
--   (2) ActiveUsers pre-filters PostCount > 5 BEFORE joining Badges
--   (3) Removes unused PostRank / UserRank columns from final projection
WITH ActiveUserIds AS (
    -- Users with > 5 posts -- evaluated FIRST so Badges aggregation
    -- only runs for this much smaller set.
    SELECT p.OwnerUserId AS UserId
    FROM dbo.Posts p
    WHERE p.OwnerUserId IS NOT NULL
    GROUP BY p.OwnerUserId
    HAVING COUNT_BIG(*) > 5
),
ActiveUsers AS (
    SELECT u.Id           AS UserId,
           u.Reputation,
           ( SELECT COUNT_BIG(*) FROM dbo.Posts p2
             WHERE p2.OwnerUserId = u.Id )           AS PostCount,
           ( SELECT SUM(ISNULL(b.Class, 0)) FROM dbo.Badges b
             WHERE b.UserId = u.Id )                 AS TotalBadges
    FROM dbo.Users u
    WHERE u.Id IN (SELECT UserId FROM ActiveUserIds)
),
RankedPosts AS (
    SELECT
        p.Id AS PostId, p.Title, p.CreationDate, p.OwnerUserId,
        COUNT(c.Id) AS CommentCount,
        SUM(v.BountyAmount) AS TotalBounty
    FROM dbo.Posts p
    LEFT JOIN dbo.Comments c ON p.Id = c.PostId
    LEFT JOIN dbo.Votes v ON p.Id = v.PostId AND v.VoteTypeId = 8
    WHERE p.CreationDate >= DATEADD(year, -1, CAST('2024-10-01 12:34:56' AS datetime))
    GROUP BY p.Id, p.Title, p.CreationDate, p.OwnerUserId
),
Candidates AS (
    -- Branch A: posts with TotalBounty > 0
    SELECT rp.PostId, rp.Title, rp.CreationDate, rp.OwnerUserId,
           rp.CommentCount, rp.TotalBounty
    FROM RankedPosts rp
    WHERE rp.TotalBounty > 0

    UNION   -- dedupe by all output columns; PostId is unique within RankedPosts

    -- Branch B: posts that have at least one Comment with non-NULL UserId
    SELECT rp.PostId, rp.Title, rp.CreationDate, rp.OwnerUserId,
           rp.CommentCount, rp.TotalBounty
    FROM RankedPosts rp
    WHERE EXISTS (
        SELECT 1 FROM dbo.Comments c
        WHERE c.PostId = rp.PostId AND c.UserId IS NOT NULL
    )
)
SELECT  c.PostId, c.Title, c.CreationDate,
        ua.UserId, ua.Reputation,
        c.CommentCount, c.TotalBounty,
        ua.PostCount, ua.TotalBadges
FROM Candidates c
JOIN ActiveUsers ua ON c.OwnerUserId = ua.UserId
ORDER BY ua.Reputation DESC, c.CreationDate DESC
OFFSET 0 ROWS FETCH NEXT 100 ROWS ONLY;
GO

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO

Script 7 — Verify new indexes are being used after deployment

USE [SQLStorm];
GO
-- Run 24-48 hours after deploying the new indexes.
-- 'user_seeks' should be > 0 for the indexes to be earning their keep.
SELECT  OBJECT_SCHEMA_NAME(i.object_id) AS [Schema],
        OBJECT_NAME(i.object_id)        AS [Table],
        i.name                          AS IndexName,
        s.user_seeks, s.user_scans, s.user_lookups, s.user_updates,
        s.last_user_seek, s.last_user_scan, s.last_user_update
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats s
       ON s.object_id = i.object_id
      AND s.index_id  = i.index_id
      AND s.database_id = DB_ID()
WHERE i.name IN (
    N'IX_Comments_PostId_INC_UserId',
    N'IX_Votes_PostId_INC_Bounty_FILT_VoteType8',
    N'IX_Posts_OwnerUserId_INC_CreationDate_Title',
    N'IX_Posts_OwnerUserId_INC_CreationDate',
    N'IX_Badges_UserId_INC_Class'
)
ORDER BY [Table], IndexName;
GO