Query Tuning Report – SQLStorm.RankedPosts (QueryHash 0xCB06A3190DEA642C) – 2026-05-17
Executive Summary
The query has an estimated subtree cost of 201.56 (Critical band). The dominant cost is concentrated in a single
operator subtree: the Nested Loops with a LIKE '%' + t.TagName + '%' predicate against dbo.Posts.Tags,
contributing ~193 of the 201 cost units (≈96%). The remaining plan (the user/votes/badges aggregation) is well-shaped and costs only ~7 units.
Additionally, the final LEFT JOIN RankedPosts rp ON tp.UserId = rp.PostId joins UserId to PostId, which is almost certainly a logic bug — those
ID domains do not match. This means the RankedPosts CTE is computed at enormous cost (~104K rows × 1,232 tags via a non-sargable wildcard scan) only to produce
essentially incidental matches in the final result. Fixing this single defect is the largest available win.
- Rewrite the Posts–Tags relationship to remove the leading-wildcard
LIKEjoin (useSTRING_SPLITor a normalized PostTags table). Critical - Correct the final join predicate —
tp.UserId = rp.PostIdis semantically wrong; join onOwnerUserIdor remove the join. Critical - Add targeted indexes on
Posts(OwnerUserId,PostTypeId),Votes(PostId) INCLUDE(VoteTypeId), andBadges(UserId). High - Pre-aggregate
VotesandBadgesper user before joining, to remove the cross-product effect feedingCOUNT(DISTINCT). Medium - Remove unnecessary
SELECT DISTINCTafter the rewrite to avoid the final Hash Aggregate / Sort spill risk. Low
Top Findings (Evidence)
-
Non-sargable LIKE drives 96% of cost.
Nested Loops (NodeId 8)
EstimateCPU="67.2275", subtree cost193.282; predicatePosts.Tags LIKE '%' + Tags.TagName + '%'with a Table Spool (NodeId 10) rewound 104,435 times. -
Wrong-domain join.
Query text:
LEFT JOIN RankedPosts rp ON tp.UserId = rp.PostId.UserIdis adbo.Users.Iddomain (267,193 distinct);PostIdbelongs todbo.Posts.Id(246,672 distinct). Overlap of value ranges is coincidental. -
Posts full scan.
Clustered Index Scan on
PK__Posts__3214EC07AF21AC3E(NodeId 9) readsEstimatedRowsRead="246672", AvgRowSize 2051, table size 399 MB. No usable index for thePostTypeId=1predicate or for the tag join. -
COUNT(DISTINCT) over a join cross-product.
Hash Match Aggregate (NodeId 18) ingests ~
1,700,990rows becausePosts ⨝ Votes ⨝ Badgesmultiplies row counts beforeCOUNT(DISTINCT p.Id)/COUNT(DISTINCT b.Id)deduplicate. -
Memory grant pressure risk.
SerialDesiredMemory="1224792"KB requested; multiple Sort/Hash operators with batch-mode and MemoryFractions across the plan; tempdb spill possible at higher concurrency. -
Missing-index DMV hints align with rewrite targets.
DMV:
Votes(PostId) INCLUDE(VoteTypeId)impact 19.36%;Posts(OwnerUserId)impact 13.58%;Badges(UserId)impact 21.41%.
Detailed Prioritized Recommendations
1. Eliminate the LIKE '%' + TagName + '%' join on Posts.Tags Critical
Why: Posts.Tags is a delimited string column (e.g. <sql-server><tuning>). Joining to Tags.TagName with a double-wildcard LIKE is non-sargable: the optimizer cannot seek, must scan all 246,672 posts, and re-evaluates the predicate per tag (1,232 tags) using a Table Spool. This is the root cost.
Before (current):
FROM Posts p
JOIN Tags t ON p.Tags LIKE '%' + t.TagName + '%'
WHERE p.PostTypeId = 1
After — Option A: use STRING_SPLIT with the < > delimiter (SQL 2022 supports ordinal):
FROM Posts p
CROSS APPLY STRING_SPLIT(REPLACE(REPLACE(p.Tags,'>',''),'<','|'), '|') s
JOIN Tags t ON t.TagName = s.value
WHERE p.PostTypeId = 1
AND s.value <> '';
After — Option B (best): normalize via a PostTags bridge table (one-time materialization). This converts the LIKE pattern join into a hash/merge join on integer keys.
Confidence: High on the impact (≥80% cost reduction for this subtree).
Risk: Option B requires a one-time backfill and a maintenance trigger or scheduled rebuild; Option A is zero-schema but still scans Posts once.
2. Correct the final join predicate Critical
LEFT JOIN RankedPosts rp ON tp.UserId = rp.PostId joins user identifiers to post identifiers. This is almost certainly unintended. Two valid interpretations:
- Latest post owned by the user: include
OwnerUserIdinRankedPostsand jointp.UserId = rp.OwnerUserId AND rp.rn = 1. - Latest post per tag (independent list): remove the join entirely and return the latest-per-tag list separately.
Recommended rewrite (latest post per user):
WITH UserLatestPost AS (
SELECT p.OwnerUserId, p.Id AS PostId, p.Title, p.CreationDate,
ROW_NUMBER() OVER (PARTITION BY p.OwnerUserId ORDER BY p.CreationDate DESC) AS rn
FROM Posts p
WHERE p.PostTypeId = 1
)
SELECT ... FROM TopUsers tp
LEFT JOIN UserLatestPost rp ON rp.OwnerUserId = tp.UserId AND rp.rn = 1;
This removes Tags from the plan entirely if tag context is not actually required by the report. Confidence: High.
3. Pre-aggregate Votes & Badges before joining Medium
The current shape multiplies rows: Users ⨝ Posts ⨝ Votes ⨝ Badges produces ≈1.7M rows feeding a Hash Aggregate with COUNT(DISTINCT). Pre-aggregating to one row per user keeps cardinality low and eliminates the need for DISTINCT.
WITH PostAgg AS (
SELECT p.OwnerUserId, COUNT(*) AS TotalPosts
FROM Posts p WHERE p.OwnerUserId IS NOT NULL
GROUP BY p.OwnerUserId
),
VoteAgg AS (
SELECT p.OwnerUserId,
SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 ELSE 0 END) AS UpVotes,
SUM(CASE WHEN v.VoteTypeId = 3 THEN 1 ELSE 0 END) AS DownVotes
FROM Posts p
JOIN Votes v ON v.PostId = p.Id
GROUP BY p.OwnerUserId
),
BadgeAgg AS (
SELECT b.UserId, COUNT(*) AS BadgeCount
FROM Badges b GROUP BY b.UserId
)
SELECT ...
FROM Users u
LEFT JOIN PostAgg pa ON pa.OwnerUserId = u.Id
LEFT JOIN VoteAgg va ON va.OwnerUserId = u.Id
LEFT JOIN BadgeAgg ba ON ba.UserId = u.Id
WHERE pa.TotalPosts > 10;
Confidence: High. Expected reduction of intermediate row volume by ~10×.
4. Targeted supporting indexes High
The following indexes directly serve the rewritten query and align with DMV missing-index hints.
| Index | Est. Storage | Read Benefit | Write Overhead | Maintenance Notes / Risk |
|---|---|---|---|---|
IX_Posts_OwnerUserId_PostTypeId_INC_CreationDate_Title |
~28–34 MB | ~55–70% | Low–Medium (Posts is moderate write) | Supports PostAgg, UserLatestPost CTE, and Posts→Users join. PAGE compression recommended. Use ONLINE = ON. |
IX_Votes_PostId_INC_VoteTypeId |
~16–20 MB | ~40–55% | Medium (Votes is write-heavy) | Replaces full Votes scan (926K rows). Watch insert hotspots; PAGE compression. |
IX_Badges_UserId |
~6–8 MB | ~30–45% | Low | Tiny; high benefit for BadgeAgg. Aligns with DMV impact 21.41%. |
IX_Posts_PostTypeId_INC_OwnerUserId_CreationDate_Title (only if Option A LIKE rewrite retained) |
~26–32 MB | ~25–35% (filter only) | Low–Medium | Replaces the 399 MB clustered scan with a narrower scan; does not eliminate LIKE cost. |
Risk assessment: Posts and Votes are large; each new index adds insert/update cost and a small extra log volume. Build with ONLINE = ON, DATA_COMPRESSION = PAGE. Combined estimated added storage is ~60–90 MB — acceptable on this database.
5. Remove SELECT DISTINCT and the trailing ORDER BY on a left-joined column Low
Once joins are corrected, DISTINCT is no longer needed and the final Hash Match Aggregate (NodeId 2, est. 67,958 rows) plus Sort (NodeId 1) can be removed.
Also, ORDER BY tp.UserRank, rp.CreationDate DESC orders by a column that can be NULL from the LEFT JOIN — keep UserRank only or add NULLS LAST semantics via CASE if needed.
6. Statistics & parameter considerations Low
- All referenced stats are 100% sampled and current (latest update 2026-03-19). No statistics action required immediately.
- The query has no parameters — Parameter Sensitive Plan (PSP) optimization and parameter sniffing are not applicable. Do not introduce a plan guide.
- Consider enabling
AUTO_UPDATE_STATISTICS_ASYNCon SQLStorm to avoid compile-time stalls on large tables (current: Disabled). - SQL Server 2022 Batch Mode on Rowstore is already engaged (
BatchModeOnRowStoreUsed="true") — keep CE 160.
Table-Specific Insights
| Table | Rows | Observed Usage | Insight |
|---|---|---|---|
| dbo.Posts | 740,016 (399 MB) | 170 scans, 20 seeks | Dominated by scans; large row size (avg 2051 bytes). Highest-leverage table for new indexes. |
| dbo.Votes | 926,084 (12 MB) | 93 scans, 0 seeks | Zero seeks indicates no non-clustered support on PostId or UserId. Strong DMV hints confirm. |
| dbo.Users | 534,386 (44 MB) | 120 scans, 36 seeks | Used as the driving aggregation key — well-served by clustered PK once Posts/Votes/Badges are indexed. |
| dbo.Badges | 439,352 (8 MB) | 43 scans, 0 seeks | Small and write-light — safe to add UserId index. |
| dbo.Tags | 1,232 | 4 scans | Tiny; only problematic because of how it's joined (LIKE %tag%). |
Validation Steps
- Capture a baseline:
SET STATISTICS IO ON; SET STATISTICS TIME ON;then run the original query, save outputs. - Apply scripts in order: indexes first, then rewritten query.
- Re-run with
STATISTICS IO/TIMEand compare logical reads on Posts/Votes, CPU time, and elapsed time. Target: >5× reduction in elapsed time, >10× reduction in logical reads on Posts. - Inspect the new actual plan for absence of the Nested Loops + LIKE predicate and absence of Table Spool.
- Verify result correctness by comparing row counts and a sample of
UserRank, DisplayName, TotalPostsvalues against the original (or document intended semantic change for the join fix).
Scripts
Script 1 — Baseline measurement (run before changes)
USE [SQLStorm];
GO
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SET NOCOUNT ON;
GO
-- Run the original query here and capture STATISTICS IO/TIME output and the actual plan.
-- (Paste the original query in this batch when executing the baseline.)
GO
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
Script 2 — Create supporting indexes (Recommendation #4)
USE [SQLStorm];
GO
-- Posts: support OwnerUserId joins, PostTypeId filter, and latest-post lookup.
CREATE NONCLUSTERED INDEX [IX_Posts_OwnerUserId_PostTypeId_INC_CreationDate_Title]
ON [dbo].[Posts] ([OwnerUserId], [PostTypeId])
INCLUDE ([CreationDate], [Title])
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 90, SORT_IN_TEMPDB = ON);
GO
-- Votes: support PostId join with VoteTypeId aggregation.
CREATE NONCLUSTERED INDEX [IX_Votes_PostId_INC_VoteTypeId]
ON [dbo].[Votes] ([PostId])
INCLUDE ([VoteTypeId])
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95, SORT_IN_TEMPDB = ON);
GO
-- Badges: support UserId aggregation.
CREATE NONCLUSTERED INDEX [IX_Badges_UserId]
ON [dbo].[Badges] ([UserId])
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95, SORT_IN_TEMPDB = ON);
GO
-- Optional: Posts secondary covering index for PostTypeId-only scans (keep only if Option A LIKE rewrite is retained).
-- CREATE NONCLUSTERED INDEX [IX_Posts_PostTypeId_INC_OwnerUserId_CreationDate_Title]
-- ON [dbo].[Posts] ([PostTypeId])
-- INCLUDE ([OwnerUserId], [CreationDate], [Title])
-- WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 90, SORT_IN_TEMPDB = ON);
-- GO
Script 3 — Rewritten query (Recommendations #1, #2, #3, #5)
USE [SQLStorm];
GO
-- Rewritten version: pre-aggregates, removes LIKE join, fixes UserId/PostId mismatch by
-- joining latest post per user via OwnerUserId. Adjust if the original intent was latest-per-tag.
WITH UserLatestPost AS (
SELECT p.OwnerUserId,
p.Id AS PostId,
p.Title,
p.CreationDate,
ROW_NUMBER() OVER (PARTITION BY p.OwnerUserId
ORDER BY p.CreationDate DESC) AS rn
FROM dbo.Posts p
WHERE p.PostTypeId = 1
AND p.OwnerUserId IS NOT NULL
),
PostAgg AS (
SELECT p.OwnerUserId, COUNT(*) AS TotalPosts
FROM dbo.Posts p
WHERE p.OwnerUserId IS NOT NULL
GROUP BY p.OwnerUserId
),
VoteAgg AS (
SELECT p.OwnerUserId,
SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 ELSE 0 END) AS UpVotes,
SUM(CASE WHEN v.VoteTypeId = 3 THEN 1 ELSE 0 END) AS DownVotes
FROM dbo.Posts p
JOIN dbo.Votes v ON v.PostId = p.Id
GROUP BY p.OwnerUserId
),
BadgeAgg AS (
SELECT b.UserId, COUNT(*) AS BadgeCount
FROM dbo.Badges b
GROUP BY b.UserId
),
TopUsers AS (
SELECT u.Id AS UserId,
u.DisplayName,
pa.TotalPosts,
ISNULL(va.UpVotes, 0) AS UpVotes,
ISNULL(va.DownVotes, 0) AS DownVotes,
ISNULL(ba.BadgeCount,0) AS BadgeCount,
DENSE_RANK() OVER (ORDER BY pa.TotalPosts DESC,
ISNULL(va.UpVotes,0) - ISNULL(va.DownVotes,0) DESC) AS UserRank
FROM dbo.Users u
JOIN PostAgg pa ON pa.OwnerUserId = u.Id
LEFT JOIN VoteAgg va ON va.OwnerUserId = u.Id
LEFT JOIN BadgeAgg ba ON ba.UserId = u.Id
WHERE pa.TotalPosts > 10
)
SELECT tp.UserRank,
tp.DisplayName,
tp.TotalPosts,
tp.UpVotes,
tp.DownVotes,
tp.BadgeCount,
rp.Title AS LatestPostTitle,
rp.CreationDate AS LatestPostDate,
CASE WHEN rp.rn = 1 THEN 'Latest' ELSE 'Older' END AS PostStatus
FROM TopUsers tp
LEFT JOIN UserLatestPost rp
ON rp.OwnerUserId = tp.UserId
AND rp.rn = 1
ORDER BY tp.UserRank;
GO
Script 4 — Alternative rewrite preserving Tags semantics via STRING_SPLIT (Recommendation #1, Option A)
USE [SQLStorm];
GO
-- Use this variant only if the original "latest post per Tag" semantics are required.
-- Replaces the non-sargable LIKE join with STRING_SPLIT against the Tags column.
WITH RankedPosts AS (
SELECT p.Id AS PostId,
p.Title,
p.CreationDate,
t.Id AS TagId,
t.TagName,
ROW_NUMBER() OVER (PARTITION BY t.Id ORDER BY p.CreationDate DESC) AS rn
FROM dbo.Posts p
CROSS APPLY STRING_SPLIT(REPLACE(REPLACE(p.Tags, '>', ''), '<', '|'), '|') s
JOIN dbo.Tags t ON t.TagName = s.value
WHERE p.PostTypeId = 1
AND s.value <> ''
)
SELECT TagId, TagName, PostId, Title, CreationDate
FROM RankedPosts
WHERE rn = 1
ORDER BY TagName;
GO
Script 5 — Optional: enable async auto-update stats (Recommendation #6)
USE [master];
GO
ALTER DATABASE [SQLStorm] SET AUTO_UPDATE_STATISTICS_ASYNC ON WITH NO_WAIT;
GO
Script 6 — Post-change validation
USE [SQLStorm];
GO
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SET NOCOUNT ON;
GO
-- Re-run the rewritten query (Script 3) here and capture STATISTICS IO/TIME + actual plan.
-- Compare logical reads on Posts/Votes/Badges and elapsed time against the baseline (Script 1).
GO
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
Script 7 — Rollback (drop new indexes if needed)
USE [SQLStorm];
GO
IF EXISTS (SELECT 1 FROM sys.indexes WHERE name = 'IX_Posts_OwnerUserId_PostTypeId_INC_CreationDate_Title' AND object_id = OBJECT_ID('dbo.Posts'))
DROP INDEX [IX_Posts_OwnerUserId_PostTypeId_INC_CreationDate_Title] ON [dbo].[Posts];
GO
IF EXISTS (SELECT 1 FROM sys.indexes WHERE name = 'IX_Votes_PostId_INC_VoteTypeId' AND object_id = OBJECT_ID('dbo.Votes'))
DROP INDEX [IX_Votes_PostId_INC_VoteTypeId] ON [dbo].[Votes];
GO
IF EXISTS (SELECT 1 FROM sys.indexes WHERE name = 'IX_Badges_UserId' AND object_id = OBJECT_ID('dbo.Badges'))
DROP INDEX [IX_Badges_UserId] ON [dbo].[Badges];
GO