Query Tuner Report – SQLStorm @ RockyPC – QueryHash 0x932168333A069240

Tuning Goal: Query Tuner

Server: RockyPC  |  Database: SQLStorm  |  SQL Server 2022 (16.0.1180.1) Developer Edition (Enterprise engine)

Executive Summary

The statement has an estimated cost of 20.63 (Medium). The optimizer hit a StatementOptmEarlyAbortReason="TimeOut", meaning the chosen plan may be sub-optimal. The dominant problems are an Eager Index Spool re-scanning all 351,440 Comments rows for the correlated EXISTS, and several full clustered index scans on large tables (Votes 926K, Posts 740K, Comments 351K) because no supporting non-clustered indexes exist — every referenced table is being accessed only via its primary key clustered index.

  1. Critical Eliminate the Eager Spool over Comments (NodeId 28, subtree cost 15.07 ≈ 73% of total) caused by the OR EXISTS predicate. Rewrite with UNION / pre-aggregation and add a Comments index.
  2. High Create IX_Comments_PostId (PostId) INCLUDE (UserId) — plan missing-index impact 40%; supports both the join and the EXISTS.
  3. Medium Create a filtered index on Votes for VoteTypeId = 8 to replace the 926K-row scan (currently reads all rows, returns 1,478).
  4. Medium Create IX_Posts_OwnerUserId (plan impact 12.68%) to support the UserReputation join and the final join.
  5. Low Rewrite query to avoid recomputation: the Posts → Comments join is materialized twice (once in the CTE aggregate, once in the EXISTS spool).

Overall confidence: High for index/spool findings (direct plan evidence); Medium for exact percent-benefit estimates.

Plan Overview

All five referenced tables are read through their clustered PK only (no non-clustered index exists on any of them per the usage stats). Batch mode on rowstore is engaged (good), but the volumes scanned are large.

Operator (Node)TableRows ReadRows OutSubtree Cost
Eager Index Spool (28)Comments351,440~115.07
Clustered Index Scan (14)Comments351,44035,1447.79
Clustered Index Scan (12 / 21)Posts246,672 ea.8,352 / 2,4672.59 ea.
Clustered Index Scan (10)Votes926,0841,4782.14
Clustered Index Scan (19)Users267,1932,6721.73
Clustered Index Scan (23)Badges439,3524,3941.24
Operator cost breakdown — the Comments Eager Spool alone is ~73% of the plan.
Evidence: StatementOptmEarlyAbortReason="TimeOut" and Index Spool NodeId 28 EstimatedTotalSubtreeCost="15.0741" with EstimatedRowsRead="351440".

Detailed Prioritized Recommendations

1. Eliminate the Eager Spool from the OR EXISTS Critical

The final WHERE rp.TotalBounty > 0 OR EXISTS(...) forces a row-by-row correlated check. The optimizer built an Eager Index Spool (NodeId 28) that scans the entire Comments clustered index (351,440 rows) and is rebound ~100 times (EstimateRebinds="99.9946"). This single branch is the largest cost in the plan.

Evidence: PhysicalOp="Index Spool" LogicalOp="Eager Spool", subtree cost 15.0741; feeding Clustered Index Scan of Comments TableCardinality="351440".

Recommended approach: replace the disjunctive OR with a UNION ALL of the two qualifying conditions, or pre-compute a "post has a non-null-user comment" flag during the existing Comments aggregation in RankedPosts. Both remove the need for a separate correlated lookup. See Script 1. Combined with the Comments index (Rec 2) the optimizer can satisfy the EXISTS with a seek instead of a full scan + spool.

Confidence: High. Expected to remove ~70%+ of plan cost.

2. Index Comments on PostId INCLUDE UserId High

Comments is joined and probed by PostId in three places yet has no non-clustered index. The plan's own missing-index list rates this the top opportunity.

Evidence: <MissingIndex Table="[Comments]"> EQUALITY PostId, Impact 40.00%; DMV missing index [PostId] avg impact 18.40%, and index usage shows Comments PK Scans=860, Seeks=0.

Including UserId covers the EXISTS predicate (c.UserId IS NOT NULL) and the comment-count aggregation. See Script 2.

MetricEstimate
estimatedStorageBytes~7.0 MB (351,440 rows × ~12 B key+include + overhead, PAGE-compressed)
expectedReadBenefitPercent~40%
expectedWriteOverheadPercent~3–5% on Comments INSERT/UPDATE of PostId/UserId
maintenanceNotesSingle narrow index; low fragmentation risk (PostId roughly monotonic via FK). Rebuild ONLINE available (Enterprise).

Risk: Low. Comments has 0 recorded updates in usage window; write overhead minimal.

Confidence: High.

3. Filtered index on Votes for VoteTypeId = 8 Medium

The Votes join filters VoteTypeId = 8 but reads the full 926,084-row clustered index to return only 1,478 rows. VoteTypeId has only 13 distinct values, making a filtered index ideal.

Evidence: NodeId 10 Clustered Index Scan EstimatedRowsRead="926084"EstimateRows="1478"; predicate [v].[VoteTypeId]=(8).

A filtered index on (PostId) INCLUDE (BountyAmount) WHERE VoteTypeId = 8 reduces this to a tiny seek/scan. See Script 3.

MetricEstimate
estimatedStorageBytes~40 KB (filtered to ~1,500 qualifying rows)
expectedReadBenefitPercent~12% (plan impact 11.91%)
expectedWriteOverheadPercent<1% (only VoteTypeId=8 rows maintained)
maintenanceNotesFiltered index; ensure session SET options (ARITHABORT etc.) are compatible for writes. ARITHABORT is OFF in this plan — verify connections setting ARITHABORT ON for DML against the filtered index.

Risk: Low storage/write cost. Note the filtered-index DML caveat: the plan shows ARITHABORT="false"; sessions modifying Votes must have the seven required SET options ON or DML will fail.

Confidence: High on read benefit, Medium on write-path SET-option risk.

4. Index Posts on OwnerUserId Medium

Posts is scanned in full twice (RankedPosts and UserReputation) and joined on OwnerUserId. A supporting index assists the UserReputation aggregation and the final join.

Evidence: <MissingIndex Table="[Posts]"> EQUALITY OwnerUserId, Impact 12.68%; NodeId 21 scan reads 246,672 rows for the OwnerUserId hash probe.

Because the date filter and Title/CreationDate columns are also needed, a covering variant helps the RankedPosts branch too. See Script 4.

MetricEstimate
estimatedStorageBytes~6 MB key-only (OwnerUserId); ~20 MB if covering CreationDate/Title
expectedReadBenefitPercent~12%
expectedWriteOverheadPercent~2–4% on Posts inserts
maintenanceNotesPosts is the largest table (399 MB). Keep includes minimal (Title is wide, avg 24 B). Use the key-only version unless RankedPosts scan dominates after Rec 1.

Risk: Medium — Posts is large and write-active in production; prefer the narrow key-only index first, measure, then widen.

Confidence: Medium.

5. Rewrite the OR predicate as UNION-friendly logic Low–Medium

Disjunctive OR across an aggregate (TotalBounty) and a correlated EXISTS defeats index seeks. Pre-aggregate a comment indicator inside RankedPosts so the outer predicate becomes a simple column comparison. Full rewrite in Script 5.

Evidence: Concatenation NodeId 24 + two Filters represent the OR-expansion; the EXISTS branch carries the 15.07 spool cost.

Confidence: Medium — semantics preserved by adding MAX(CASE WHEN c.UserId IS NOT NULL THEN 1 ELSE 0 END) to the existing Comments join.

6. UserReputation CTE re-scans Posts/Badges/Users Low

The UserReputation CTE aggregates Users ⋈ Posts ⋈ Badges (cost ≈ 5.6) only to filter PostCount > 5. Posts is scanned a second time here. The Posts OwnerUserId index (Rec 4) and a Badges UserId index reduce these scans.

Evidence: NodeId 23 Badges Clustered Index Scan reads 439,352 rows; NodeId 21 second Posts scan reads 246,672.

Optional Badges index in Script 6. Lower priority because these are batch-mode hash joins with reasonable cost relative to the spool.

Confidence: Medium.

7. Statistics & Parameter Considerations Informational

  • Statistics are healthy: all referenced stats show ModificationCount="0", SamplingPercent="100", updated 2026-03-19. No stats refresh required now.
  • Optimizer timeout: StatementOptmEarlyAbortReason="TimeOut" indicates the plan search was cut short — simplifying the query (Rec 5) and adding indexes (Rec 2–4) gives the optimizer cheaper seek-based options it can find faster.
  • Parameter sniffing: The query uses literal/constant predicates only (the date is computed from a constant; VoteTypeId = 8, PostCount > 5 are literals). There are no parameters, so parameter-sniffing mitigations and plan guides are not applicable. SQL Server 2022 PSP optimization is also irrelevant here.
  • No Query Store history exists for this text, so no regression baseline is available — capture one after changes.

Confidence: High.

Before / After Plan Comparison

Text representation of the expected plan shape change after applying Recs 1–3.

Before vs After plan operators BEFORE (cost ≈ 20.63) Eager Index Spool over Comments (351,440 rows) — 15.07 Votes Clustered Scan 926,084 → 1,478 — 2.14 Comments Clustered Scan 351,440 — 7.79 AFTER (projected ↓ ~70%) Pre-aggregated comment flag (no spool) Votes filtered-index Seek ~1,478 rows Comments IX_PostId Seek/range scan Diagram: the dominant Eager Spool branch (left) is replaced by an aggregated indicator computed once, and full clustered scans become index seeks. Projected total cost drops below ~6.

Table-Specific Insights

TableRowsSizeIndex usageKey insight
Votes926,08412 MBScans 1,248 / Seeks 0VoteTypeId only 13 distinct → filtered index ideal.
Posts740,016399 MBScans 2,507 / Seeks 240Largest table; OwnerUserId ~65K distinct — good seek selectivity.
Users534,38644 MBScans 1,763 / Seeks 533Reputation only 2,443 distinct; sort on it is cheap.
Comments351,44078 MBScans 860 / Seeks 0No NC index → drives the spool. Top fix target.
Badges439,3528 MBScans 696 / Seeks 0UserId ~154K distinct; optional supporting index.
Every referenced table currently relies solely on its PK clustered index (0 seeks on Votes/Comments/Badges).

Validation: Run SET STATISTICS IO ON; and SET STATISTICS TIME ON; before and after applying scripts and compare logical reads and elapsed time for each referenced table.

Scripts

Script 1 + 5 — Rewrite query: pre-aggregate comment indicator, remove Eager Spool (Recs 1 & 5)

USE [SQLStorm];
GO
WITH RankedPosts AS (
    SELECT
        p.Id            AS PostId,
        p.Title,
        p.CreationDate,
        p.OwnerUserId,
        COUNT(c.Id)     AS CommentCount,
        MAX(CASE WHEN c.UserId IS NOT NULL THEN 1 ELSE 0 END) AS HasUserComment,
        SUM(v.BountyAmount) AS TotalBounty
    FROM dbo.Posts p
    LEFT JOIN dbo.Comments c ON p.Id = c.PostId
    LEFT JOIN dbo.Votes    v ON p.Id = v.PostId AND v.VoteTypeId = 8
    WHERE p.CreationDate >= DATEADD(year, -1, CAST('2024-10-01 12:34:56' AS datetime))
    GROUP BY p.Id, p.Title, p.CreationDate, p.OwnerUserId
),
UserReputation AS (
    SELECT
        u.Id AS UserId,
        u.Reputation,
        COUNT(DISTINCT p.Id)        AS PostCount,
        SUM(ISNULL(b.Class, 0))     AS TotalBadges
    FROM dbo.Users u
    LEFT JOIN dbo.Posts  p ON u.Id = p.OwnerUserId
    LEFT JOIN dbo.Badges b ON u.Id = b.UserId
    GROUP BY u.Id, u.Reputation
),
ActiveUsers AS (
    SELECT UserId, Reputation, PostCount, TotalBadges
    FROM UserReputation
    WHERE PostCount > 5
)
SELECT
    rp.PostId, rp.Title, rp.CreationDate,
    ua.UserId, ua.Reputation,
    rp.CommentCount, rp.TotalBounty,
    ua.PostCount, ua.TotalBadges
FROM RankedPosts rp
JOIN ActiveUsers ua ON rp.OwnerUserId = ua.UserId
WHERE rp.TotalBounty > 0
   OR rp.HasUserComment = 1     -- replaces correlated EXISTS / Eager Spool
ORDER BY ua.Reputation DESC, rp.CreationDate DESC
OFFSET 0 ROWS FETCH NEXT 100 ROWS ONLY;
GO

Script 2 — Comments supporting index (Rec 2)

USE [SQLStorm];
GO
CREATE NONCLUSTERED INDEX IX_Comments_PostId_inclUserId
    ON dbo.Comments (PostId)
    INCLUDE (UserId)
    WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);
GO

Script 3 — Votes filtered index for VoteTypeId = 8 (Rec 3)

USE [SQLStorm];
GO
-- NOTE: sessions performing DML on dbo.Votes must have ARITHABORT ON
-- (and the other required SET options) for filtered-index maintenance.
CREATE NONCLUSTERED INDEX IX_Votes_Bounty8_PostId
    ON dbo.Votes (PostId)
    INCLUDE (BountyAmount)
    WHERE VoteTypeId = 8
    WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);
GO

Script 4 — Posts OwnerUserId index (Rec 4, start with key-only)

USE [SQLStorm];
GO
CREATE NONCLUSTERED INDEX IX_Posts_OwnerUserId
    ON dbo.Posts (OwnerUserId)
    WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);
GO

-- Optional covering variant for the RankedPosts branch (evaluate storage first):
-- CREATE NONCLUSTERED INDEX IX_Posts_OwnerUserId_cover
--     ON dbo.Posts (OwnerUserId)
--     INCLUDE (CreationDate, Title)
--     WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);
-- GO

Script 6 — Optional Badges UserId index (Rec 6)

USE [SQLStorm];
GO
CREATE NONCLUSTERED INDEX IX_Badges_UserId_inclClass
    ON dbo.Badges (UserId)
    INCLUDE (Class)
    WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);
GO

Script 7 — Validation harness (run before and after)

USE [SQLStorm];
GO
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO
-- Execute the rewritten query from Script 1 here, capture reads/time, compare.
GO
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO