Query Tuning Report - Procedure SP00060 in SQLStorm - 2026-03-28

Executive summary

Priority 1 — Rewrite the procedure to remove the Tags LIKE join and fix the final join logic.
- The most expensive branch is the Nested Loops over Posts and Tags with predicate p.Tags LIKE '%' + t.TagName + '%', followed by sort/windowing.
- The final join appears logically incorrect: tp.UserId = rp.PostId. This likely multiplies work and may also produce incorrect results.
Priority 2 — Pre-aggregate Posts, Votes, and Badges separately before joining to Users.
- The current UserStats CTE joins Users → Posts → Votes → Badges before aggregation, producing an estimated 1,700,990 rows into a hash aggregate.
- This creates unnecessary row explosion and forces COUNT(DISTINCT ...) work.
Priority 3 — Add one targeted Posts index and one filtered Votes index if the rewrite is implemented.
- A composite Posts index on (OwnerUserId, PostTypeId, CreationDate DESC) can support latest-post retrieval and reduce scans.
- A filtered Votes index for VoteTypeId IN (2,3) can shrink reads significantly for this procedure.
Priority 4 — Statistics are generally healthy; no urgent stats repair is indicated.
- Referenced statistics show ModificationCount = 0 and mostly SamplingPercent = 100.
- This is primarily a query shape / access path problem, not a stale-statistics problem.

Evidence
1) Plan cost is 200.01, which falls in the critical range.
2) The costly path includes Nested Loops (Est. Cost 193.282), Sort (193.392), and Window Aggregate (193.393) driven by Posts.Tags LIKE '%'+Tags.TagName+'%'.

Confidence: 0.96

Detailed prioritized recommendations

Rewrite RankedPosts and remove the non-SARGable tag-matching join

The current RankedPosts CTE is the dominant bottleneck. It scans question posts, then evaluates a leading-wildcard LIKE against every tag value, sorts the result, computes ROW_NUMBER(), and later joins it back using tp.UserId = rp.PostId.

Evidence
• Node 8: Nested Loops, estimated cost 193.282, predicate [Posts].[Tags] like [Expr1020].
• Node 9: Clustered Index Scan on Posts, estimated rows 104,436, estimated rows read 246,672; predicate only PostTypeId = 1.

Expected impact
Very high. Eliminating this branch should remove the single largest cost center and most likely cut both CPU and memory demand dramatically.

Why it matters: A predicate of the form LIKE '%value%' is not seekable on a normal b-tree index. On SQL Server 2022 Enterprise, you have advanced features available, but none of them make this pattern efficient on a denormalized tag string.

Confidence: 0.98
Pre-aggregate Posts, Votes, and Badges independently to avoid fanout

The current UserStats CTE aggregates after joining multiple one-to-many relationships, forcing distinct counts and a large hash aggregate over an expanded rowset.

Evidence
• Node 20: Hash Match right outer join outputs estimated 1,700,990 rows.
• Node 18: Hash Match aggregate then performs COUNT(DISTINCT p.Id) and COUNT(DISTINCT b.Id) over that expanded set.

Expected impact
High. This usually reduces cardinality inflation, sort pressure, aggregate work, and memory grant needs.

Confidence: 0.97
Fix the final join predicate

The final statement joins TopUsers tp to RankedPosts rp on tp.UserId = rp.PostId. For a “latest post per user” result, that is almost certainly incorrect and also prevents the optimizer from using a straightforward user-to-post access path.

Evidence
• Final query text: LEFT JOIN RankedPosts rp ON tp.UserId = rp.PostId.
• Plan Node 4 builds hash on [Posts].[Id] and probes with [Users].[Id], confirming the join is UserId = PostId.

Expected impact
Very high for correctness, high for performance once the query is rewritten to join on OwnerUserId.

Confidence: 0.99
Add a targeted Posts index to support latest-post retrieval by user

If you rewrite to use OUTER APPLY (SELECT TOP 1 ... FROM Posts WHERE OwnerUserId = ...), the current indexes are not ideal for retrieving the latest question per user efficiently.

Evidence
• Existing IX_Posts_PostTypeId_CreationDate_Score is keyed on PostTypeId, CreationDate, Score, Id, not OwnerUserId.
• DMV missing index for Posts(OwnerUserId): avg user impact 36.74%.

Confidence: 0.91
Add a filtered Votes index for up/down vote counting

This procedure only counts vote types 2 and 3. A filtered index can materially reduce read volume versus scanning the whole Votes table or clustered index.

Evidence
• Node 21: clustered scan of Votes, estimated rows read 926,084 in the plan branch.
• Table metadata shows Votes has 2,778,252 rows; existing nonclustered index IX_Votes_PostId_VoteTypeId exists but the current query shape does not exploit it well.

Confidence: 0.88
No immediate parameter sniffing action

This procedure takes no parameters, so classic parameter sniffing is not the driver here. SQL Server 2022 Parameter Sensitive Plan optimization is available, but it is not applicable to this procedure as written.

Evidence
• User query is exec sp00060 with no parameters.
• Showplan does not expose parameterized predicates for this statement.

Confidence: 0.99

Query rewrite examples

Before

WITH RankedPosts AS (
    SELECT 
        p.Id AS PostId,
        p.Title,
        p.CreationDate,
        ROW_NUMBER() OVER (PARTITION BY t.Id ORDER BY p.CreationDate DESC) AS rn,
        t.TagName
    FROM Posts p
    JOIN Tags t ON p.Tags LIKE '%' + t.TagName + '%'
    WHERE p.PostTypeId = 1
),
UserStats AS (
    SELECT 
        u.Id AS UserId,
        u.DisplayName,
        COUNT(DISTINCT p.Id) AS TotalPosts,
        SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 ELSE 0 END) AS UpVotes,
        SUM(CASE WHEN v.VoteTypeId = 3 THEN 1 ELSE 0 END) AS DownVotes,
        COUNT(DISTINCT b.Id) AS BadgeCount
    FROM Users u
    LEFT JOIN Posts p ON u.Id = p.OwnerUserId
    LEFT JOIN Votes v ON p.Id = v.PostId
    LEFT JOIN Badges b ON u.Id = b.UserId
    GROUP BY u.Id, u.DisplayName
),
TopUsers AS (
    SELECT 
        us.UserId,
        us.DisplayName,
        us.TotalPosts,
        us.UpVotes,
        us.DownVotes,
        us.BadgeCount,
        DENSE_RANK() OVER (ORDER BY us.TotalPosts DESC, us.UpVotes - us.DownVotes DESC) AS UserRank
    FROM UserStats us
    WHERE us.TotalPosts > 10
)
SELECT 
    tp.UserRank,
    tp.DisplayName,
    tp.TotalPosts,
    tp.UpVotes,
    tp.DownVotes,
    tp.BadgeCount,
    rp.Title AS LatestPostTitle,
    rp.CreationDate AS LatestPostDate,
    CASE 
        WHEN rp.rn = 1 THEN 'Latest'
        ELSE 'Older'
    END AS PostStatus
FROM TopUsers tp
LEFT JOIN RankedPosts rp ON tp.UserId = rp.PostId
ORDER BY tp.UserRank, rp.CreationDate DESC;

After — recommended rewrite

CREATE OR ALTER PROCEDURE dbo.sp00060
AS
BEGIN
    SET NOCOUNT ON;

    ;WITH PostCounts AS
    (
        SELECT
            p.OwnerUserId AS UserId,
            COUNT_BIG(*) AS TotalPosts
        FROM dbo.Posts AS p
        WHERE p.OwnerUserId IS NOT NULL
        GROUP BY p.OwnerUserId
    ),
    VoteCounts AS
    (
        SELECT
            p.OwnerUserId AS UserId,
            SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 ELSE 0 END) AS UpVotes,
            SUM(CASE WHEN v.VoteTypeId = 3 THEN 1 ELSE 0 END) AS DownVotes
        FROM dbo.Posts AS p
        LEFT JOIN dbo.Votes AS v
            ON v.PostId = p.Id
           AND v.VoteTypeId IN (2,3)
        WHERE p.OwnerUserId IS NOT NULL
        GROUP BY p.OwnerUserId
    ),
    BadgeCounts AS
    (
        SELECT
            b.UserId,
            COUNT_BIG(*) AS BadgeCount
        FROM dbo.Badges AS b
        GROUP BY b.UserId
    ),
    TopUsers AS
    (
        SELECT
            u.Id AS UserId,
            u.DisplayName,
            CONVERT(int, ISNULL(pc.TotalPosts, 0)) AS TotalPosts,
            CONVERT(int, ISNULL(vc.UpVotes, 0)) AS UpVotes,
            CONVERT(int, ISNULL(vc.DownVotes, 0)) AS DownVotes,
            CONVERT(int, ISNULL(bc.BadgeCount, 0)) AS BadgeCount,
            DENSE_RANK() OVER
            (
                ORDER BY
                    ISNULL(pc.TotalPosts, 0) DESC,
                    ISNULL(vc.UpVotes, 0) - ISNULL(vc.DownVotes, 0) DESC
            ) AS UserRank
        FROM dbo.Users AS u
        LEFT JOIN PostCounts AS pc ON pc.UserId = u.Id
        LEFT JOIN VoteCounts AS vc ON vc.UserId = u.Id
        LEFT JOIN BadgeCounts AS bc ON bc.UserId = u.Id
        WHERE ISNULL(pc.TotalPosts, 0) > 10
    )
    SELECT
        tu.UserRank,
        tu.DisplayName,
        tu.TotalPosts,
        tu.UpVotes,
        tu.DownVotes,
        tu.BadgeCount,
        lp.Title AS LatestPostTitle,
        lp.CreationDate AS LatestPostDate,
        CASE WHEN lp.PostId IS NULL THEN NULL ELSE 'Latest' END AS PostStatus
    FROM TopUsers AS tu
    OUTER APPLY
    (
        SELECT TOP (1)
            p.Id AS PostId,
            p.Title,
            p.CreationDate
        FROM dbo.Posts AS p
        WHERE p.OwnerUserId = tu.UserId
          AND p.PostTypeId = 1
        ORDER BY p.CreationDate DESC, p.Id DESC
    ) AS lp
    ORDER BY tu.UserRank, lp.CreationDate DESC;
END;

This rewrite removes the non-SARGable tag join.
It removes the join fanout between Posts, Votes, and Badges before aggregation.
It fixes the likely logical issue by retrieving the latest post by OwnerUserId, not by matching UserId to PostId.
It also removes unnecessary COUNT(DISTINCT ...) work.

If tag logic is truly required

-- Best long-term fix: normalize tags into a bridge table
-- Example structure:
CREATE TABLE dbo.PostTags
(
    PostId int NOT NULL,
    TagId  int NOT NULL,
    CONSTRAINT PK_PostTags PRIMARY KEY CLUSTERED (TagId, PostId),
    CONSTRAINT FK_PostTags_Post FOREIGN KEY (PostId) REFERENCES dbo.Posts(Id),
    CONSTRAINT FK_PostTags_Tag  FOREIGN KEY (TagId)  REFERENCES dbo.Tags(Id)
);

CREATE INDEX IX_PostTags_PostId_TagId ON dbo.PostTags(PostId, TagId);

For this specific procedure, normalization is optional only because the current output does not actually return TagName. If tags are not needed, removing that branch is the best choice.

Index recommendations

Priority	Index	Purpose	estimatedStorageBytes	expectedReadBenefitPercent	expectedWriteOverheadPercent
1	IX_Posts_OwnerUserId_PostTypeId_CreationDate	Supports latest-post lookup by user and improves grouping/join access by OwnerUserId	~41,000,000	55%	8%
2	IX_Votes_PostId_VoteTypeId_UpDown_Filtered	Reduces vote-read volume to only VoteTypeId 2/3 used by this procedure	~18,000,000	35%	3%

1) Create index on Posts for latest-post-per-user access

CREATE INDEX IX_Posts_OwnerUserId_PostTypeId_CreationDate
ON dbo.Posts
(
    OwnerUserId,
    PostTypeId,
    CreationDate DESC,
    Id
)
INCLUDE
(
    Title
)
WITH
(
    ONLINE = ON,
    DATA_COMPRESSION = PAGE,
    SORT_IN_TEMPDB = ON
);

estimatedStorageBytes: ~41,000,000
expectedReadBenefitPercent: 55%
expectedWriteOverheadPercent: 8%
maintenanceNotes: Moderate maintenance cost; keep PAGE compression; online rebuilds are supported in your edition. Reassess if a broader OwnerUserId index already exists later.

Risk assessment
Moderate write overhead on a 986,688-row table. Acceptable if this reporting procedure is important, but avoid adding further overlapping OwnerUserId indexes without consolidation.

Evidence
• Missing index DMV suggests Posts(OwnerUserId) with avg user impact 36.74%.
• Existing index IX_Posts_PostTypeId_CreationDate_Score is not keyed by OwnerUserId, so it is not ideal for OUTER APPLY TOP (1) by user.

2) Create filtered Votes index for vote types 2 and 3

CREATE INDEX IX_Votes_PostId_VoteTypeId_UpDown_Filtered
ON dbo.Votes
(
    PostId,
    VoteTypeId
)
WHERE VoteTypeId IN (2,3)
WITH
(
    ONLINE = ON,
    DATA_COMPRESSION = PAGE,
    SORT_IN_TEMPDB = ON
);

estimatedStorageBytes: ~18,000,000
expectedReadBenefitPercent: 35%
expectedWriteOverheadPercent: 3%
maintenanceNotes: Low-to-moderate maintenance because the filter limits rows. Particularly valuable when vote-type counts are a recurring pattern.

Risk assessment
Low write overhead relative to a full-table nonclustered index, but it adds maintenance and should only be kept if the vote-type 2/3 pattern is used repeatedly.

Evidence
• Plan Node 21 scans Votes clustered index instead of leveraging a narrow vote-type-specific access path.
• Column cardinality shows VoteTypeId has only about 13 distinct values, making a filtered index attractive for a frequently queried subset.

Indexes not recommended for this specific query

Tags(TagName): not useful for LIKE '%TagName%' on the Posts side due to the leading wildcard pattern and denormalized storage.
Additional Badges(UserId) index: current IX_Badges_UserId already provides the needed access pattern, and clustered key Id is implicitly available at the leaf level.
Drop recommendations: none are advised based on the provided query alone.

Statistics maintenance recommendations

No urgent corrective statistics action is required for this query.
Evidence
• Referenced statistics show ModificationCount = 0.
• Sampling is mostly 100% on key stats such as IX_Posts_PostTypeId_CreationDate_Score, IX_Votes_PostId_VoteTypeId, and IX_Badges_UserId.
After creating any recommended index, update statistics on the affected table or let index creation generate fresh stats.
```
UPDATE STATISTICS dbo.Posts WITH FULLSCAN;
UPDATE STATISTICS dbo.Votes WITH FULLSCAN;
```
Consider scheduled FULLSCAN for skew-sensitive reporting tables only if compile quality becomes unstable.
- Because this environment is SQL Server 2022 and current stats quality is already high, scheduled blanket FULLSCAN on all tables is not necessary.
- Target Posts and Votes first if plan quality regresses after data growth.

Confidence: 0.92

Parameter sniffing / plan guidance

Parameter sniffing: Not applicable to this procedure in its current form because there are no parameters.
Plan guides / hints: Not recommended as the primary fix. The problem is query design, not parameter instability.
SQL Server 2022 PSP: Available on this platform, but not relevant without parameters.
Memory grant / spills: No spill warnings are present in the provided plan XML, so there is no direct evidence of tempdb spill. However, the plan has a large desired memory estimate.

Evidence
• Query text is exec sp00060 with no arguments.
• MemoryGrantInfo shows SerialDesiredMemory="474904"; no explicit spill warning is present in the XML.

Confidence: 0.95

Table-specific insights

dbo.Posts

Largest referenced table by storage: 986,688 rows, 411 MB total.
Current query scans the clustered index for the RankedPosts branch and scans IX_Posts_PostTypeId_CreationDate_Score for the UserStats branch.
Owner-based retrieval is under-indexed for this workload.

Evidence
• Clustered index usage: Scans 2234 vs Seeks 400.
• Missing index DMV entry for OwnerUserId on Posts with avg impact 36.74%.

dbo.Votes

Highest row count: 2,778,252 rows.
The query only needs vote types 2 and 3, yet current shape leads to a broad scan.
Existing IX_Votes_PostId_VoteTypeId is useful, but rewrite is needed for the optimizer to favor a narrower path consistently.

Evidence
• Clustered PK usage: Scans 1709, Seeks 0.
• Existing IX_Votes_PostId_VoteTypeId: Seeks 68, Scans 6, indicating it already has value in other workloads.

dbo.Badges

878,704 rows but compact at 13 MB; existing IX_Badges_UserId is appropriate for counting badges by user.
No additional index is required specifically for this procedure after rewrite.

Evidence
• Plan Node 23 uses IX_Badges_UserId directly.
• Index usage shows the index exists and is actively used with low scan volume compared with table size.

dbo.Users

534,386 rows, 44 MB total; full clustered scan is reasonable when ranking many users.
No direct Users index change is needed for this query.

Evidence
• Plan Node 26 scans Users clustered index with estimated rows 267,193.
• Missing index DMV entries for Users(Reputation) are unrelated to this procedure.

dbo.Tags

Small table at 1,232 rows, but it still becomes expensive because it is used in a nested loops string comparison against many Posts rows.
The issue is not Tags size; it is the join method and string pattern.

Evidence
• Tags scan itself is cheap: Node 12 estimated cost 0.0091.
• The expensive part is repeated evaluation inside Node 8 Nested Loops against Posts rows.

Validation steps

Validate each recommended change by comparing before/after metrics using SET STATISTICS IO ON and SET STATISTICS TIME ON.

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

EXEC dbo.sp00060;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

Capture logical reads for Posts and Votes.
Compare CPU time and elapsed time before/after the rewrite.
Confirm that the expensive Posts ↔ Tags LIKE branch disappears from the plan.
Confirm that latest-post retrieval uses the new Posts index if created.
Confirm vote aggregation uses the filtered Votes index if created.