AI SQL Tuner Recommendations
sp00060Executive Summary
This query has an estimated subtree cost of 199.87, placing it in the CRITICAL category. The dominant bottleneck is the Nested Loops with LIKE '%…%' join between Posts (246K rows) and Tags (1,232 rows), which effectively produces a cross join with per-row LIKE evaluation costing ~193 out of 199.87 (96.7%) of total plan cost. Additionally, the query contains a logical bug in the final LEFT JOIN that joins on the wrong column, and produces an inflated intermediate row set of 1.7M rows from a Cartesian-like expansion via Badges. The requested memory grant of ~5.6 GB will very likely spill to tempdb on most systems.
Cost Overview
| Plan Component | Est. Cost | % of Total | Severity |
|---|---|---|---|
| Nested Loops (Posts × Tags LIKE) | 193.28 | 96.7% | Critical |
| UserStats CTE (Hash Aggregates + Joins) | 5.47 | 2.7% | Low |
| Final Hash Join + Sort | 1.12 | 0.6% | Low |
Top Priorities
- Critical Fix the LIKE '%…%' cross join pattern — The join
p.Tags LIKE '%' + t.TagName + '%'forces a Nested Loops scan of 246K Posts × 1,232 Tags = ~304M comparisons. Replace with a relational tag-to-post mapping table or use full-text search. - Critical Fix the logical bug in the final JOIN —
LEFT JOIN RankedPosts rp ON tp.UserId = rp.PostIdincorrectly joins UserId to PostId. This should beON tp.UserId = rp.PostOwnerUserId(or an appropriate column). The current join produces meaningless results. - High Eliminate Cartesian expansion from Badges — The UserStats CTE joins Users → Posts → Votes → Badges with LEFT JOINs and COUNT(DISTINCT), producing 1.7M intermediate rows. Pre-aggregate each dimension separately before joining.
- Medium Add a covering index on Posts for PostTypeId filter — The clustered index scan of the entire 399MB Posts table could be narrowed with an index on PostTypeId that includes the necessary columns.
- Low Monitor memory grant — SerialDesiredMemory is 5.8GB; on real execution this will almost certainly spill to tempdb.
Detailed Prioritized Recommendations
1. Eliminate the LIKE '%…%' Cross Join Pattern
Critical Confidence: 98%
The LIKE '%' + t.TagName + '%' join predicate between Posts and Tags is the single largest cost driver. It is a non-SARGable predicate applied inside a Nested Loops join. The optimizer scans all 246,672 Posts rows (filtered to ~104K by PostTypeId=1), then for each row, evaluates the LIKE against all 1,232 Tags via a Lazy Spool. This results in approximately 128 million string comparisons.
[Posts].[Tags] LIKE [Expr1020].
Option A: Create a PostTags Bridge Table (Recommended)
Since StackOverflow-style Posts store tags in a delimited string like <c#><.net><linq>, the most effective fix is to normalize this into a relational mapping table:
-- One-time ETL to create a proper many-to-many relationship
CREATE TABLE dbo.PostTags (
PostId INT NOT NULL,
TagId INT NOT NULL,
CONSTRAINT PK_PostTags PRIMARY KEY CLUSTERED (PostId, TagId),
CONSTRAINT FK_PostTags_Posts FOREIGN KEY (PostId) REFERENCES dbo.Posts(Id),
CONSTRAINT FK_PostTags_Tags FOREIGN KEY (TagId) REFERENCES dbo.Tags(Id)
);
-- Create reverse index for tag-centric lookups
CREATE NONCLUSTERED INDEX IX_PostTags_TagId
ON dbo.PostTags (TagId, PostId)
WITH (DATA_COMPRESSION = PAGE);
-- Populate (one-time)
INSERT INTO dbo.PostTags (PostId, TagId)
SELECT DISTINCT p.Id, t.Id
FROM dbo.Posts p
CROSS APPLY STRING_SPLIT(REPLACE(REPLACE(p.Tags, '<', ''), '>', ' '), ' ') s
INNER JOIN dbo.Tags t ON t.TagName = LTRIM(RTRIM(s.value))
WHERE s.value <> '' AND p.Tags IS NOT NULL;
| Metric | Value |
|---|---|
| estimatedStorageBytes | ~7 MB (PK) + ~7 MB (IX) = ~14 MB total |
| expectedReadBenefitPercent | 95–99% cost reduction for tag-related joins |
| expectedWriteOverheadPercent | Negligible for reads; inserts/updates to Posts need ETL sync |
| maintenanceNotes | Requires trigger or application-layer logic to keep in sync with Posts.Tags column |
Option B: Use Full-Text Search (Alternative)
If schema changes are not feasible, create a full-text index on Posts.Tags and use CONTAINS or FREETEXT. This is less optimal than a bridge table but eliminates the LIKE cross join.
Rewritten RankedPosts CTE (using bridge table):
-- BEFORE (current — ~193 cost units)
FROM Posts p
JOIN Tags t ON p.Tags LIKE '%' + t.TagName + '%'
WHERE p.PostTypeId = 1
-- AFTER (with PostTags bridge — estimated <2 cost units)
FROM Posts p
INNER JOIN dbo.PostTags pt ON p.Id = pt.PostId
INNER JOIN Tags t ON pt.TagId = t.Id
WHERE p.PostTypeId = 1
2. Fix the Logical Bug: Final JOIN on Wrong Column
Critical Confidence: 99%
The final SELECT joins TopUsers tp LEFT JOIN RankedPosts rp ON tp.UserId = rp.PostId. This joins a User's ID to a Post's ID — these are different entities and will produce semantically incorrect results. The RankedPosts CTE selects p.Id AS PostId but does not expose the Post's OwnerUserId. The intent appears to be joining on the user who authored the post.
[Posts].[Id], HashKeysProbe references [Users].[Id]. The CTE output PostId is p.Id (Post primary key), not a user identifier.
Fix:
-- BEFORE (buggy)
WITH RankedPosts AS (
SELECT
p.Id AS PostId,
p.Title,
p.CreationDate,
ROW_NUMBER() OVER (PARTITION BY t.Id ORDER BY p.CreationDate DESC) AS rn,
t.TagName
FROM Posts p
JOIN Tags t ON p.Tags LIKE '%' + t.TagName + '%'
WHERE p.PostTypeId = 1
)
...
LEFT JOIN RankedPosts rp ON tp.UserId = rp.PostId -- BUG: UserId ≠ PostId
-- AFTER (corrected — add OwnerUserId to CTE)
WITH RankedPosts AS (
SELECT
p.Id AS PostId,
p.OwnerUserId, -- ADD THIS
p.Title,
p.CreationDate,
ROW_NUMBER() OVER (PARTITION BY t.Id ORDER BY p.CreationDate DESC) AS rn,
t.TagName
FROM Posts p
INNER JOIN dbo.PostTags pt ON p.Id = pt.PostId
INNER JOIN Tags t ON pt.TagId = t.Id
WHERE p.PostTypeId = 1
)
...
LEFT JOIN RankedPosts rp ON tp.UserId = rp.OwnerUserId -- FIXED
3. Eliminate Cartesian Expansion in UserStats CTE
High Confidence: 95%
The UserStats CTE performs Users LEFT JOIN Posts LEFT JOIN Votes LEFT JOIN Badges in a single pass, producing a Cartesian-like expansion to 1.7M intermediate rows before aggregating with COUNT(DISTINCT). A user with 10 posts, 50 votes across those posts, and 5 badges would contribute 10 × 50 × 5 = 2,500 rows instead of just needing 3 pre-aggregated lookups.
Rewrite: Pre-aggregate Each Dimension
-- BEFORE (Cartesian explosion to 1.7M rows)
UserStats AS (
SELECT u.Id AS UserId, u.DisplayName,
COUNT(DISTINCT p.Id) AS TotalPosts,
SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 ELSE 0 END) AS UpVotes,
SUM(CASE WHEN v.VoteTypeId = 3 THEN 1 ELSE 0 END) AS DownVotes,
COUNT(DISTINCT b.Id) AS BadgeCount
FROM Users u
LEFT JOIN Posts p ON u.Id = p.OwnerUserId
LEFT JOIN Votes v ON p.Id = v.PostId
LEFT JOIN Badges b ON u.Id = b.UserId
GROUP BY u.Id, u.DisplayName
)
-- AFTER (pre-aggregate, ~300K rows max per subquery)
PostCounts AS (
SELECT OwnerUserId, COUNT(*) AS TotalPosts
FROM Posts
GROUP BY OwnerUserId
),
VoteCounts AS (
SELECT p.OwnerUserId,
SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 ELSE 0 END) AS UpVotes,
SUM(CASE WHEN v.VoteTypeId = 3 THEN 1 ELSE 0 END) AS DownVotes
FROM Posts p
INNER JOIN Votes v ON p.Id = v.PostId
WHERE v.VoteTypeId IN (2, 3)
GROUP BY p.OwnerUserId
),
BadgeCounts AS (
SELECT UserId, COUNT(*) AS BadgeCount
FROM Badges
GROUP BY UserId
),
UserStats AS (
SELECT
u.Id AS UserId,
u.DisplayName,
ISNULL(pc.TotalPosts, 0) AS TotalPosts,
ISNULL(vc.UpVotes, 0) AS UpVotes,
ISNULL(vc.DownVotes, 0) AS DownVotes,
ISNULL(bc.BadgeCount, 0) AS BadgeCount
FROM Users u
LEFT JOIN PostCounts pc ON u.Id = pc.OwnerUserId
LEFT JOIN VoteCounts vc ON u.Id = vc.OwnerUserId
LEFT JOIN BadgeCounts bc ON u.Id = bc.UserId
)
This eliminates COUNT(DISTINCT) and reduces intermediate rows from 1.7M to approximately 267K (one per user).
4. Complete Query Rewrite
Medium Confidence: 90%
Combining all fixes from recommendations 1–3, here is the complete rewritten stored procedure:
ALTER PROCEDURE sp00060
AS
BEGIN
SET NOCOUNT ON;
WITH PostCounts AS (
SELECT OwnerUserId, COUNT(*) AS TotalPosts
FROM dbo.Posts
WHERE OwnerUserId IS NOT NULL
GROUP BY OwnerUserId
HAVING COUNT(*) > 10 -- Push filter early
),
VoteCounts AS (
SELECT p.OwnerUserId,
SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 ELSE 0 END) AS UpVotes,
SUM(CASE WHEN v.VoteTypeId = 3 THEN 1 ELSE 0 END) AS DownVotes
FROM dbo.Posts p
INNER JOIN dbo.Votes v ON p.Id = v.PostId
WHERE v.VoteTypeId IN (2, 3)
AND p.OwnerUserId IS NOT NULL
GROUP BY p.OwnerUserId
),
BadgeCounts AS (
SELECT UserId, COUNT(*) AS BadgeCount
FROM dbo.Badges
GROUP BY UserId
),
TopUsers AS (
SELECT
u.Id AS UserId,
u.DisplayName,
pc.TotalPosts,
ISNULL(vc.UpVotes, 0) AS UpVotes,
ISNULL(vc.DownVotes, 0) AS DownVotes,
ISNULL(bc.BadgeCount, 0) AS BadgeCount,
DENSE_RANK() OVER (
ORDER BY pc.TotalPosts DESC,
ISNULL(vc.UpVotes, 0) - ISNULL(vc.DownVotes, 0) DESC
) AS UserRank
FROM dbo.Users u
INNER JOIN PostCounts pc ON u.Id = pc.OwnerUserId
LEFT JOIN VoteCounts vc ON u.Id = vc.OwnerUserId
LEFT JOIN BadgeCounts bc ON u.Id = bc.UserId
),
RankedPosts AS (
SELECT
p.Id AS PostId,
p.OwnerUserId,
p.Title,
p.CreationDate,
ROW_NUMBER() OVER (
PARTITION BY pt.TagId
ORDER BY p.CreationDate DESC
) AS rn,
t.TagName
FROM dbo.Posts p
INNER JOIN dbo.PostTags pt ON p.Id = pt.PostId
INNER JOIN dbo.Tags t ON pt.TagId = t.Id
WHERE p.PostTypeId = 1
)
SELECT
tp.UserRank,
tp.DisplayName,
tp.TotalPosts,
tp.UpVotes,
tp.DownVotes,
tp.BadgeCount,
rp.Title AS LatestPostTitle,
rp.CreationDate AS LatestPostDate,
CASE
WHEN rp.rn = 1 THEN 'Latest'
ELSE 'Older'
END AS PostStatus
FROM TopUsers tp
LEFT JOIN RankedPosts rp ON tp.UserId = rp.OwnerUserId
ORDER BY tp.UserRank, rp.CreationDate DESC;
END;
Expected impact: Estimated cost reduction from ~200 to <10. Memory grant reduction from ~5.6GB to <200MB. Execution time improvement of 90%+.
5. Index Recommendations
5a. PostTags Bridge Table Indexes (if created per Rec #1)
Already specified in Recommendation 1. The PK on (PostId, TagId) and the secondary index on (TagId, PostId) are essential.
5b. Index on Posts for PostTypeId with Covering Columns
Currently the plan scans the full 399MB clustered index on Posts to filter PostTypeId = 1 (returning ~104K of 246K rows, a 42% selectivity). With the rewrite, an index becomes more beneficial since the bridge table join changes the access pattern.
CREATE NONCLUSTERED INDEX IX_Posts_PostTypeId_Covering
ON dbo.Posts (PostTypeId, CreationDate DESC)
INCLUDE (OwnerUserId, Title, Tags)
WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);
| Metric | Value |
|---|---|
| estimatedStorageBytes | ~25 MB (with PAGE compression; Posts avg Tags=13.5 bytes, Title=24 bytes) |
| expectedReadBenefitPercent | 60–75% for RankedPosts CTE; avoids 399MB clustered scan |
| expectedWriteOverheadPercent | 3–5% on Posts INSERT/UPDATE (already has 2 NC indexes) |
| maintenanceNotes | PAGE compression recommended; ONLINE build available on Enterprise/Developer. Consider dropping if PostTags bridge eliminates need for Tags column in this index. |
5c. Filtered Index on Votes for VoteTypeId IN (2, 3)
The rewritten VoteCounts CTE filters on VoteTypeId IN (2, 3). A filtered index avoids scanning all 926K votes (most of which are type 1 = AcceptedByOriginator, type 5 = Favorite, etc.).
CREATE NONCLUSTERED INDEX IX_Votes_PostId_UpDownVotes
ON dbo.Votes (PostId)
INCLUDE (VoteTypeId)
WHERE VoteTypeId IN (2, 3)
WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);
| Metric | Value |
|---|---|
| estimatedStorageBytes | ~3–5 MB (filtered; upvotes + downvotes are a subset of 926K) |
| expectedReadBenefitPercent | 40–60% for VoteCounts CTE; reads only relevant vote types |
| expectedWriteOverheadPercent | 1–2% (only maintained for VoteTypeId 2 or 3 inserts) |
| maintenanceNotes | Filtered indexes not usable with parameterized queries unless OPTION(RECOMPILE) or the filter matches. Safe here since values are hardcoded in the SP. |
5d. Consider Dropping Unused Indexes
All index usage stats show 0 seeks, 0 scans, 0 lookups across every index. This suggests either the server was recently restarted, statistics were just rebuilt, or these indexes are genuinely unused. After the server has been running with production workload for a sufficient period, review and consider dropping:
IX_Votes_UserId_VoteTypeId(8.8 MB) — if no queries seek by UserId on VotesIX_Votes_PostId_VoteTypeId_UpDown_Filtered(8.3 MB) — may be superseded by the new filtered index above
6. Statistics Maintenance
Low Confidence: 95%
Statistics are in excellent shape:
- All referenced statistics are sampled at 100%
- All have ModificationCount = 0
- Auto Create/Update Statistics are enabled
- Last updates range from 2026-02-28 to 2026-03-28 — all recent
Recommendation: No immediate action needed. Consider enabling AUTO_UPDATE_STATISTICS_ASYNC to avoid query compile delays on large tables:
ALTER DATABASE [SQLStorm] SET AUTO_UPDATE_STATISTICS_ASYNC ON;
7. Parameter Sniffing Considerations
Low Confidence: 90%
This stored procedure takes no parameters, so traditional parameter sniffing is not a concern. The query uses only literal constants (PostTypeId = 1, VoteTypeId = 2, VoteTypeId = 3, TotalPosts > 10). The plan will always be compiled with the same values.
However, note that ARITHABORT is set to false in the plan's StatementSetOptions. This can cause the plan to not be shared with SSMS sessions (which default to ARITHABORT ON), potentially leading to two cached plans. Consider adding SET ARITHABORT ON to the stored procedure.
ALTER PROCEDURE sp00060
AS
BEGIN
SET NOCOUNT ON;
SET ARITHABORT ON; -- Ensure plan sharing with SSMS
...
SQL Server 2022's Parameter Sensitive Plan (PSP) optimization and DOP feedback features are available on this instance but are not applicable since there are no parameters and the query runs parallel (Branches=1 indicates a parallel plan).
Table-Specific Insights
| Table | Rows | Size | Indexes | Key Observations |
|---|---|---|---|---|
| Posts | 246,672 (1.23M total incl. all types) |
399 MB (CI) + 23 MB (NC) |
3 | Largest table. Tags column (avg 13.5 bytes) stores delimited tag names — the root cause of the LIKE anti-pattern. PostTypeId has only 5 distinct values; PostTypeId=1 selects ~42% of rows. Title has ~104K distinct values (avg 24 bytes). |
| Votes | 926,084 | 12 MB (CI) + 31 MB (NC) |
4 | Most indexes by count. PostId has ~243K distinct values. VoteTypeId has 13 distinct values. The NC indexes total 31MB — more than 2.5× the clustered index, suggesting potential over-indexing. Only VoteTypeId 2 and 3 are needed by this query. |
| Badges | 439,352 | 8 MB (CI) + 5 MB (NC) |
2 | 154K distinct UserIds means avg ~2.85 badges per user. IX_Badges_UserId includes Class column but this query only needs UserId and Id. The existing index is adequate. |
| Users | 267,193 | 45 MB (CI) | 1 | Only clustered index exists. DisplayName avg 10.2 bytes, 227K distinct values. No additional indexes needed for this query pattern since Users is the dimension table being probed via hash joins. |
| Tags | 1,232 | 0.1 MB | 1 | Tiny table, no compression (consider PAGE compression though benefit is negligible at 0.1MB). TagName avg 10.6 bytes. The entire table fits in a single data page extent — scan is appropriate. |
Validation Steps
After implementing the recommended changes, validate performance improvements by comparing before/after metrics:
-- Run BEFORE and AFTER applying changes
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
EXEC sp00060;
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
Key Metrics to Compare
- Logical Reads — Expect a reduction from hundreds of thousands to low thousands
- CPU Time — Expect 90%+ reduction by eliminating the LIKE cross join
- Elapsed Time — Expect significant wall-clock improvement
- Memory Grant (KB) — Check actual vs granted in
sys.dm_exec_query_memory_grants; expect reduction from ~5.6GB to <200MB - TempDB Spills — Check Sort and Hash warnings in the actual execution plan; expect elimination of spills
Additional Validation
-- Check for tempdb spills in actual plan
SET STATISTICS XML ON;
EXEC sp00060;
SET STATISTICS XML OFF;
-- Look for SpillToTempDb="1" in Sort and Hash operators
-- Monitor memory grants
SELECT
session_id,
requested_memory_kb,
granted_memory_kb,
used_memory_kb,
max_used_memory_kb
FROM sys.dm_exec_query_memory_grants
WHERE session_id = @@SPID;
Verify PostTags Bridge Table Accuracy
-- Spot-check: compare LIKE-based results with bridge table results
SELECT TOP 100 p.Id, p.Tags, t.TagName
FROM dbo.Posts p
INNER JOIN dbo.PostTags pt ON p.Id = pt.PostId
INNER JOIN dbo.Tags t ON pt.TagId = t.Id
WHERE p.PostTypeId = 1
ORDER BY p.Id;
-- Count comparison
SELECT COUNT(*) FROM dbo.PostTags;
-- Should approximate the ~104K rows estimated by the LIKE join