Index Tuning Report – SQLStorm (RockyPC) – 2026-05-13
Executive Summary
The SQLStorm database on RockyPC is experiencing severe query performance problems driven by full-table scans on the three largest tables: PostHistory (847 K rows, 820 MB), Votes (926 K rows, 11.8 MB), and Posts (246 K rows, 398 MB). The workload — consisting entirely of analytical CTEs aggregating user activity, post statistics, and vote counts — accumulates over 3.7 million seconds of total CPU time in Query Store with individual queries running for 5–48 hours wall-clock time. All tables are read-only in the current workload (zero user_updates on all indexes), making index additions risk-free from a write-overhead perspective.
The five highest-impact actions are:
- Create a nonclustered index on PostHistory(PostId) INCLUDE (PostHistoryTypeId, UserId, CreationDate, Comment) — eliminates the costliest missing-index miss (29 K avg cost, 99.6% improvement).
- Create a nonclustered index on Votes(PostId) INCLUDE (VoteTypeId, UserId, BountyAmount) — the most-scanned table (315 scans) used in virtually every query in the workload.
- Create a nonclustered index on Votes(UserId) INCLUDE (VoteTypeId, BountyAmount) — directly addresses the second missing-index recommendation and the pattern
JOIN Votes V ON u.Id = V.UserId. - Create a nonclustered index on Posts(CreationDate) INCLUDE (Id, OwnerUserId, Score, PostTypeId, ViewCount, Title, AnswerCount, AcceptedAnswerId) — supports the date-range filter present in roughly half of the workload queries.
- Rewrite correlated subquery patterns in multiple queries that perform per-row SELECT COUNT(*) FROM Comments/Votes WHERE PostId = …, replacing them with aggregated JOINs or CTEs to eliminate the O(n) subquery fan-out generating millions of logical reads.
Automatic tuning (FORCE_LAST_GOOD_PLAN) is already enabled and in the correct state — no action required. No forced plan failures were detected. All existing indexes have excellent page density (≥ 90%) and low fragmentation — no rebuild is needed at this time.
Environment
| Property | Value |
|---|---|
| Server Name | RockyPC |
| Database | SQLStorm |
| SQL Server Version | 16.0.1175.1 (SQL Server 2022 RTM-GDR, KB5084815) — Mar 13 2026 |
| Edition | Developer Edition (Engine: Enterprise — full feature set available) |
| Platform | Windows 10 Home 10.0 x64 (Hypervisor) |
| Columnstore Indexes | Available ✔ |
| Online Index Operations | Available ✔ |
| Advanced Compression | Available ✔ |
| Auto Create / Update Stats | Both ON; Async Update = OFF |
| FORCE_LAST_GOOD_PLAN | ON (desired = ON, actual = ON) ✔ |
| Forced Plan Failures | None detected ✔ |
Top Priorities at a Glance
| # | Action | Target | Priority | Confidence |
|---|---|---|---|---|
| 1 | Create NCI on PostHistory(PostId) with includes | dbo.PostHistory | CRITICAL | 95% |
| 2 | Create NCI on Votes(PostId) with includes | dbo.Votes | CRITICAL | 95% |
| 3 | Create NCI on Votes(UserId) with includes | dbo.Votes | HIGH | 90% |
| 4 | Create NCI on Posts(CreationDate) with includes | dbo.Posts | HIGH | 88% |
| 5 | Create NCI on Comments(PostId) with includes | dbo.Comments | HIGH | 88% |
| 6 | Create NCI on Badges(UserId) INCLUDE (Class, Date, Name) | dbo.Badges | MEDIUM | 82% |
| 7 | Rewrite correlated subquery patterns (Comments/Votes per post) | Multiple queries | CRITICAL | 92% |
| 8 | Eliminate LIKE '%tag%' pattern on Posts.Tags | Multiple queries | HIGH | 90% |
| 9 | Evaluate nonclustered columnstore on Votes / PostHistory | dbo.Votes, dbo.PostHistory | MEDIUM | 72% |
| 10 | Enable Auto Update Stats Async | SQLStorm database | LOW | 85% |
Detailed Prioritized Recommendations
Problem
The missing-index DMV reports an average cost of 29,307 with a 99.6% improvement estimate for a seek on PostId with PostHistoryTypeId included. The clustered index is scanned 124 times with zero writes against the table. At 847,593 rows and 820 MB (PAGE compressed), each full scan is extremely expensive. Multiple high-CPU queries in the plan cache and Query Store join to PostHistory on PostId or filter on PostHistoryTypeId, including the 2.3-billion-CPU-millisecond query (hash 2C1078E11002C34F) and the 900,881-second Query Store entry.
Workload Evidence
- Missing index DMV: PostId equality, PostHistoryTypeId included, 12 seeks/scans, avg cost 29,307, 99.6% impact.
- Query hash
4DE12E1909540955: 13.9 M avg CPU, 369 K avg logical reads, joins PostHistory filtering onPostHistoryTypeId IN (10, 11). - Query hash
7C2C5483DB636EB3: 12.7 M avg CPU, references PostHistory for post type name. - Query Store top entry (900,881 s total): references
ClosedPostHistoryCTE that filtersPostHistoryTypeIdand joins onPostId.
Cardinality
PostId has ~246 K distinct values (density 4.05e-6) — highly selective, ideal index key. PostHistoryTypeId has only 29 distinct values — keep as INCLUDE, not key column. Additional high-value includes: UserId, CreationDate, Comment cover the query projections without widening the key.
Recommendation
Create a PAGE-compressed nonclustered index. Use ONLINE = ON (Enterprise-compatible) to avoid blocking. Fill factor 90 is appropriate given the read-only workload (no random inserts observed).
Problem
The Votes clustered index is scanned 315 times with zero writes. At 926,084 rows, every scan reads the full 11.8 MB table. The overwhelming majority of workload queries perform the pattern LEFT JOIN Votes V ON P.Id = V.PostId followed by conditional aggregation on VoteTypeId. Without an index on PostId, SQL Server must scan all votes for every post lookup. PostId has ~242 K distinct values (very high cardinality, density 4.12e-6).
Workload Evidence
- Query hash
558D53CC7558B81F: 366 M avg CPU, 4.17 M avg logical reads, 25,864 max spills — groups Votes by PostId with VoteTypeId filter. - Query hash
F50DF27C41C97904: 10.97 M avg CPU, 6.51 M avg logical reads — correlated subquerySELECT COUNT(*) FROM Votes WHERE PostId = p.Id AND VoteTypeId = 2/3. - Query hash
EE4BAB640222BBF0: 9 M avg CPU, 6.5 M avg logical reads — same correlated subquery pattern. - Query hash
779F0850AFC28347: 8.3 M avg CPU, 6.43 M avg logical reads — correlated subquery on Votes by PostId. - Virtually every Query Store tracked query joins Votes through PostId.
Recommendation
Create a PAGE-compressed NCI on PostId including VoteTypeId, UserId, and BountyAmount to cover all SELECT patterns seen in the workload. This is the single most impactful index change available.
Problem
The missing-index DMV recommends an index on UserId with VoteTypeId and BountyAmount included (avg cost 3,255, 99.4% improvement, 4 seeks/scans). Several queries join Votes directly to Users on u.Id = v.UserId rather than going through Posts — notably query hash 6013F73BCD0BD360 (92.5 M avg CPU, 5.14 M avg reads) and D5E6A0CB09A238C7 (37.5 M avg CPU). Note that UserId has only ~929 distinct non-null values in Votes (density 0.00107) due to many anonymous votes (NULL UserId), meaning selectively for non-NULL UserId queries can be good.
Cardinality Note
The density statistics show UserId in Votes is low-cardinality for the non-NULL population (~929 distinct values out of 926 K rows). A filtered index on UserId IS NOT NULL will be significantly smaller and more efficient for the user-activity join pattern. For queries that aggregate all vote types per user, this filtered approach is preferred.
Recommendation
Create a filtered NCI on UserId WHERE UserId IS NOT NULL including VoteTypeId and BountyAmount. PAGE compressed.
Problem
At least 12 distinct queries in the workload filter Posts with WHERE p.CreationDate > DATEADD(YEAR, -1, ...) or DATEADD(DAY, -30, ...). Without a supporting index, each such query performs a full clustered index scan of 246 K rows across 398 MB (3,125 pages). CreationDate has ~245 K distinct values (density 4.08e-6) — very selective. The existing IX_Posts_OwnerUserId_Covering does not cover CreationDate as a leading key.
Workload Evidence
- Query hash
6013F73BCD0BD360: 92.5 M avg CPU —WHERE p.CreationDate > DATEADD(YEAR, -1, '2024-10-01') - Query hash
F37272E929AEF449: 11.3 M avg CPU, 1.1 M avg reads —WHERE p.CreationDate >= DATEADD(YEAR, -1, ...) - Query hash
255E97B49AAD8F59: 31.4 M avg CPU —WHERE p.CreationDate >= DATEADD(year, -1, ...) - Query hash
8859736AE77722F7: 10.8 M avg CPU —WHERE p.PostTypeId = 1 AND p.Score > 0 AND p.CreationDate >= DATEADD(YEAR, -1, ...) - Query Store: multiple entries with date filters on Posts.
Recommendation
Create a PAGE-compressed NCI on CreationDate with a wide INCLUDE covering the most commonly projected columns. For the compound filter including PostTypeId, a composite key (CreationDate, PostTypeId) would be slightly more efficient; however a single key on CreationDate with generous includes handles the largest portion of the workload with one index.
Problem
The Comments clustered index is scanned 196 times with zero writes. At 351,440 rows and 78 MB, repeated full scans are costly. The majority of workload queries perform LEFT JOIN Comments c ON p.Id = c.PostId or correlated subqueries SELECT COUNT(*) FROM Comments WHERE c.PostId = p.Id. PostId in Comments has ~122 K distinct values (density 8.16e-6), making it reasonably selective.
Workload Evidence
- Query hash
7C2C5483DB636EB3: 12.7 M avg CPU —LEFT JOIN Comments c ON p.Id = c.PostId - Query hash
255E97B49AAD8F59: 31.4 M avg CPU — same join pattern - Query hashes
F50DF27C41C97904,EE4BAB640222BBF0,779F0850AFC28347: each with 6.4–6.5 M avg logical reads using correlated subquery on Comments per PostId
Recommendation
Create a PAGE-compressed NCI on PostId including Id, UserId, CreationDate, Score. This covers the COUNT and aggregation patterns without requiring a lookup to the clustered index.
Problem
The Badges clustered index is scanned 164 times with zero writes. At 439,352 rows and 8 MB, the per-scan cost is lower than Votes or PostHistory, but the frequency is high. The universal pattern is LEFT JOIN Badges B ON U.Id = B.UserId with aggregation on Class. UserId in Badges has ~154 K distinct values — high cardinality.
Recommendation
Create a PAGE-compressed NCI on UserId including Class, Date, Name to cover all observed projection patterns.
Problem
Several queries use correlated subqueries of the form:
COALESCE((SELECT COUNT(*) FROM Comments c WHERE c.PostId = p.Id), 0) AS CommentCount
and
COALESCE((SELECT COUNT(*) FROM Votes v WHERE v.PostId = p.Id AND v.VoteTypeId = 2), 0) AS UpVotes
These patterns execute one additional query per row in the outer Posts result set. Even with the new indexes recommended above, this still results in O(n) round-trips to the index. These three query hashes are the top offenders by logical reads:
F50DF27C41C97904: 6,509,095 avg logical reads, 10.97 M avg CPU — two correlated subqueries on Comments and two on Votes per post row.EE4BAB640222BBF0: 6,500,252 avg logical reads, 9 M avg CPU — same pattern.779F0850AFC28347: 6,428,952 avg logical reads, 8.3 M avg CPU — correlated subquery on Votes per post in a 30-day window.
Recommendation
Replace correlated subqueries with pre-aggregated CTEs or derived tables using GROUP BY, then LEFT JOIN the result. This transforms O(n) subquery executions into a single aggregate pass. Example rewrite approach (illustrative):
Instead of inline correlated subqueries per post, pre-aggregate in a CTE: WITH CommentCounts AS (SELECT PostId, COUNT(*) AS Cnt FROM Comments GROUP BY PostId) then LEFT JOIN CommentCounts cc ON p.Id = cc.PostId. Apply the same pattern for Votes grouped by PostId and VoteTypeId. See the Scripts section for a full rewrite example of query hash F50DF27C41C97904.
Additional Code Issue: Redundant Badge Counting
Query hash 6DA5B718F2D3A8A5 computes COUNT(b.Id) AS GoldBadges, COUNT(b.Id) AS SilverBadges, COUNT(b.Id) AS BronzeBadges — all three columns are identical because the WHERE clause filter b.Class = 1 is applied at the JOIN but COUNT(*) is not conditional. The query should use SUM(CASE WHEN b.Class = 1 THEN 1 ELSE 0 END) etc., or fix the join-filter logic. This is a logic bug in addition to a performance issue.
Problem
Two high-frequency queries in Query Store use the anti-pattern Posts.Tags LIKE '%' + T.TagName + '%' to simulate a tag-to-post relationship. This is a cross-join with a leading wildcard that cannot use any index and forces a full scan of the Posts clustered index for every tag row in the Tags table. The Tags table has ~20 rows scanned per query execution.
Workload Evidence
- Query Store entry with total time 138,468 s (3 executions, avg 46,156 s each):
JOIN Tags t ON p.Tags LIKE '%' + t.TagName + '%'— 866,861 avg reads. - Query Store entry with total time 6,997 s:
JOIN Posts P ON P.Tags LIKE '%' + T.TagName + '%'— 137,024 avg reads.
Recommendation
Normalize the tag relationship into a separate PostTags junction table with proper foreign keys and a covering index on (TagId, PostId). This is the correct long-term architectural fix. As a short-term mitigation, consider using full-text search (CONTAINS) on Posts.Tags if the schema cannot be changed immediately. Any index-based solution is blocked by the leading wildcard; schema change is required for full benefit.
Rationale
Both Votes (926 K rows) and PostHistory (847 K rows) have analytical workloads — aggregations, GROUP BY, conditional SUM — that are natural candidates for columnstore. However, row counts are just below the 1-million threshold used as a general guideline, and the nonclustered row store indexes recommended above (items 1–3) should address the most acute problems first. After applying those indexes, re-evaluate whether residual full-scan aggregation queries still dominate the workload before adding columnstore.
Columnstore Conditions Favorable
- Zero writes on both tables — no delta-store churn.
- Enterprise-compatible instance (Developer edition).
- All queries perform aggregate functions (SUM, COUNT) over large row sets.
- Batch mode execution available in SQL Server 2022.
Recommendation
After deploying items 1–5 and measuring improvement, consider adding a nonclustered columnstore index on Votes covering (VoteTypeId, PostId, UserId, BountyAmount, CreationDate) and on PostHistory covering (PostId, PostHistoryTypeId, UserId, CreationDate). These can coexist with the B-tree indexes and SQL Server's optimizer will choose the best access path.
Problem
AUTO_UPDATE_STATISTICS_ASYNC is currently OFF. With synchronous auto-update, a query that triggers a statistics update must wait for the update to complete before it can compile and execute. Given the large table sizes and the read-heavy analytical nature of this workload, enabling async statistics updates prevents occasional query compilation stalls.
Recommendation
Enable AUTO_UPDATE_STATISTICS_ASYNC = ON for the SQLStorm database. Queries will use the existing (slightly stale) statistics while the update runs in the background, avoiding compilation delays.
Query Store & Automatic Tuning
Query Store Workload Highlights
The Query Store captures a deeply problematic workload. Key observations:
- The single worst query has accumulated 900,881 seconds of total time across 3 executions (avg 300,294 s each), with avg CPU of 299 billion ms — indicating a query that effectively never completes in reasonable time. This is the Tag LIKE pattern combined with aggregation over the full Posts table.
- The second and third worst queries each ran for ~300,000 s wall-clock time (5 minutes execution = approximately ~50 days of CPU equivalent in a single-core interpretation). Both involve the Tags LIKE pattern and full-table aggregations.
- The fourth worst query (hash
2C1078E11002C34F, 274,459 s total) is the UserActivity CTE joining Users → Posts → Votes with a three-way LEFT JOIN aggregate. This will benefit directly from the Votes(PostId) and Posts indexes. - Multiple queries produce zero actual rows (avg_actual_rows = 0.00) despite enormous CPU expenditure, suggesting they may be timing out or returning empty result sets from extremely broad scans.
- Queries with large spills: hashes
558D53CC(25,864 KB spills),6DA5B718(28,032 KB spills),0D1DCC1B(33,801 KB spills),4DE12E19(18,800 KB spills),F37272E9(17,336 KB spills) — all spilling due to large hash joins/aggregates without sufficient memory grants relative to data sizes. The recommended indexes will reduce input row counts to hash operators, reducing or eliminating spills.
Parameter Sensitivity
All Query Store entries show distinct sets = 1 and stdev duration = 0.00, indicating these queries are not parametrized and execute with literal constants (e.g., '2024-10-01 12:34:56'). SQL Server 2022's Parameter Sensitive Plan (PSP) optimization is not applicable to unparameterized queries, but the hardcoded date literals also prevent plan reuse across different date values. Consider using sp_executesql with parameters if these queries are issued programmatically.
Index Page Density Analysis
| Table / Index | Type | Avg Page Density | Avg Fragmentation | Pages | Assessment |
|---|---|---|---|---|---|
| dbo.Posts / IX_Posts_OwnerUserId_Covering | NONCLUSTERED | 90.23% | 1.10% | 2,093 | HEALTHY |
| dbo.Comments / PK (clustered) | CLUSTERED | 98.29% | 0.05% | 9,998 | HEALTHY |
| dbo.Posts / PK (clustered) | CLUSTERED | 99.18% | 0.10% | 3,125 | HEALTHY |
| dbo.PostHistory / PK (clustered) | CLUSTERED | 99.37% | 0.07% | 8,249 | HEALTHY |
| dbo.Users / PK (clustered) | CLUSTERED | 99.55% | 0.00% | 1,936 | HEALTHY |
| dbo.Votes / PK (clustered) | CLUSTERED | 99.83% | 0.33% | 1,508 | HEALTHY |
| dbo.Badges / PK (clustered) | CLUSTERED | 99.85% | 0.00% | 1,023 | HEALTHY |
Unused / Low-Benefit Index Review
The existing nonclustered index IX_Posts_OwnerUserId_Covering on dbo.Posts shows 77 seeks and 6 scans with zero writes — it is being used and beneficial. No indexes are candidates for dropping at this time. The small lookup-type tables (LinkTypes, PostHistoryTypes, PostTypes, VoteTypes, CloseReasonTypes, Tags) all have trivially small clustered indexes; no action needed.
Statistics Recommendations
- Auto Create and Auto Update Statistics are both ON — no manual statistics creation is required for existing columns.
- The new indexes created in the Scripts section will automatically generate associated statistics.
- After deploying the new indexes, execute
UPDATE STATISTICS dbo.Votes WITH FULLSCANandUPDATE STATISTICS dbo.PostHistory WITH FULLSCANto ensure accurate cardinality estimates for the new access paths, given the large table sizes where sampled updates may be imprecise.
Scripts
Script 1 — Create NCI on dbo.PostHistory(PostId) [Recommendation #1]
USE [SQLStorm];
GO
-- Recommendation #1: Nonclustered index on PostHistory(PostId)
-- Addresses missing index with avg cost 29,307, 99.6% improvement
-- Covers join patterns and PostHistoryTypeId filter used in multiple high-CPU queries
-- PAGE compression matches existing clustered index; ONLINE = ON avoids blocking
CREATE NONCLUSTERED INDEX [IX_PostHistory_PostId_Covering]
ON [dbo].[PostHistory] ([PostId])
INCLUDE ([PostHistoryTypeId], [UserId], [CreationDate], [Comment])
WITH (
DATA_COMPRESSION = PAGE,
FILLFACTOR = 90,
ONLINE = ON,
SORT_IN_TEMPDB = ON
);
GO
Script 2 — Create NCI on dbo.Votes(PostId) [Recommendation #2]
USE [SQLStorm];
GO
-- Recommendation #2: Nonclustered index on Votes(PostId)
-- Addresses 315 full clustered index scans, covers conditional aggregation on VoteTypeId
-- Includes UserId and BountyAmount to cover all observed query projections
-- PAGE compression matches existing clustered index
CREATE NONCLUSTERED INDEX [IX_Votes_PostId_Covering]
ON [dbo].[Votes] ([PostId])
INCLUDE ([VoteTypeId], [UserId], [BountyAmount], [CreationDate])
WITH (
DATA_COMPRESSION = PAGE,
FILLFACTOR = 90,
ONLINE = ON,
SORT_IN_TEMPDB = ON
);
GO
Script 3 — Create Filtered NCI on dbo.Votes(UserId) [Recommendation #3]
USE [SQLStorm];
GO
-- Recommendation #3: Filtered nonclustered index on Votes(UserId)
-- Addresses missing index recommendation (avg cost 3,255, 99.4% impact)
-- Filtered on UserId IS NOT NULL: excludes anonymous votes (~90%+ of rows)
-- making the index significantly smaller and more selective
-- Covers LEFT JOIN Votes v ON u.Id = v.UserId patterns in UserActivity/UserScore queries
CREATE NONCLUSTERED INDEX [IX_Votes_UserId_Filtered_Covering]
ON [dbo].[Votes] ([UserId])
INCLUDE ([VoteTypeId], [BountyAmount], [PostId], [CreationDate])
WHERE [UserId] IS NOT NULL
WITH (
DATA_COMPRESSION = PAGE,
FILLFACTOR = 90,
ONLINE = ON,
SORT_IN_TEMPDB = ON
);
GO
Script 4 — Create NCI on dbo.Posts(CreationDate) [Recommendation #4]
USE [SQLStorm];
GO
-- Recommendation #4: Nonclustered index on Posts(CreationDate)
-- Supports date-range filters present in ~12 distinct queries in the workload
-- e.g. WHERE p.CreationDate >= DATEADD(YEAR, -1, '2024-10-01 12:34:56')
-- Wide INCLUDE covers most projected columns to avoid key lookups to the clustered index
CREATE NONCLUSTERED INDEX [IX_Posts_CreationDate_Covering]
ON [dbo].[Posts] ([CreationDate])
INCLUDE (
[Id],
[OwnerUserId],
[PostTypeId],
[Score],
[ViewCount],
[Title],
[AnswerCount],
[AcceptedAnswerId],
[CommentCount],
[ParentId]
)
WITH (
DATA_COMPRESSION = PAGE,
FILLFACTOR = 90,
ONLINE = ON,
SORT_IN_TEMPDB = ON
);
GO
Script 5 — Create NCI on dbo.Comments(PostId) [Recommendation #5]
USE [SQLStorm];
GO
-- Recommendation #5: Nonclustered index on Comments(PostId)
-- Eliminates 196 full clustered index scans on Comments (351K rows, 78MB)
-- Covers COUNT(*) subqueries and LEFT JOIN Comments c ON p.Id = c.PostId patterns
-- Includes columns needed for aggregation without clustered index lookups
CREATE NONCLUSTERED INDEX [IX_Comments_PostId_Covering]
ON [dbo].[Comments] ([PostId])
INCLUDE ([Id], [UserId], [CreationDate], [Score])
WITH (
DATA_COMPRESSION = PAGE,
FILLFACTOR = 90,
ONLINE = ON,
SORT_IN_TEMPDB = ON
);
GO
Script 6 — Create NCI on dbo.Badges(UserId) [Recommendation #6]
USE [SQLStorm];
GO
-- Recommendation #6: Nonclustered index on Badges(UserId)
-- Eliminates 164 full clustered index scans on Badges (439K rows, 8MB)
-- Covers LEFT JOIN Badges B ON U.Id = B.UserId with Class aggregation patterns
CREATE NONCLUSTERED INDEX [IX_Badges_UserId_Covering]
ON [dbo].[Badges] ([UserId])
INCLUDE ([Class], [Date], [Name], [Id])
WITH (
DATA_COMPRESSION = PAGE,
FILLFACTOR = 90,
ONLINE = ON,
SORT_IN_TEMPDB = ON
);
GO
Script 7 — Enable Auto Update Statistics Asynchronously [Recommendation #10]
USE [master];
GO
-- Recommendation #10: Enable async statistics updates for SQLStorm
-- Prevents query compilation stalls when auto-update is triggered on large tables
-- Queries will compile using slightly stale statistics while update runs in background
ALTER DATABASE [SQLStorm]
SET AUTO_UPDATE_STATISTICS_ASYNC ON
WITH NO_WAIT;
GO
-- Verify the setting
SELECT name, is_auto_update_stats_async_on
FROM sys.databases
WHERE name = 'SQLStorm';
GO
Script 8 — Update Statistics with FULLSCAN After Index Creation [Recommendation #6 / General]
USE [SQLStorm];
GO
-- Run after deploying new indexes (Scripts 1-6)
-- FULLSCAN ensures accurate cardinality estimates for large tables
-- where sampled auto-update may produce imprecise histograms
UPDATE STATISTICS [dbo].[Votes] WITH FULLSCAN;
GO
UPDATE STATISTICS [dbo].[PostHistory] WITH FULLSCAN;
GO
UPDATE STATISTICS [dbo].[Posts] WITH FULLSCAN;
GO
UPDATE STATISTICS [dbo].[Comments] WITH FULLSCAN;
GO
UPDATE STATISTICS [dbo].[Badges] WITH FULLSCAN;
GO
Script 9 — Rewrite Correlated Subquery Pattern [Recommendation #7]
USE [SQLStorm];
GO
-- Recommendation #7: Example rewrite of correlated subquery pattern
-- Original pattern (query hash F50DF27C41C97904 / EE4BAB640222BBF0):
-- SELECT ... COALESCE((SELECT COUNT(*) FROM Comments c WHERE c.PostId = p.Id), 0)
-- , COALESCE((SELECT COUNT(*) FROM Votes v WHERE v.PostId = p.Id AND v.VoteTypeId = 2), 0)
-- , COALESCE((SELECT COUNT(*) FROM Votes v WHERE v.PostId = p.Id AND v.VoteTypeId = 3), 0)
-- FROM Posts p WHERE p.CreationDate > DATEADD(year, -1, '2024-10-01 12:34:56')
--
-- REWRITTEN using pre-aggregated CTEs (eliminates O(n) correlated subquery fan-out):
WITH FilteredPosts AS (
SELECT
p.Id AS PostId,
p.Title,
p.CreationDate,
p.Score,
p.ViewCount,
p.OwnerUserId
FROM dbo.Posts AS p
WHERE p.CreationDate > DATEADD(year, -1, '2024-10-01 12:34:56')
),
CommentAgg AS (
SELECT c.PostId, COUNT(*) AS CommentCount
FROM dbo.Comments AS c
WHERE EXISTS (
SELECT 1 FROM FilteredPosts fp WHERE fp.PostId = c.PostId
)
GROUP BY c.PostId
),
VoteAgg AS (
SELECT
v.PostId,
SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 ELSE 0 END) AS UpVotes,
SUM(CASE WHEN v.VoteTypeId = 3 THEN 1 ELSE 0 END) AS DownVotes
FROM dbo.Votes AS v
WHERE v.VoteTypeId IN (2, 3)
AND EXISTS (
SELECT 1 FROM FilteredPosts fp WHERE fp.PostId = v.PostId
)
GROUP BY v.PostId
)
SELECT
fp.PostId,
fp.Title,
fp.CreationDate,
fp.Score,
fp.ViewCount,
COALESCE(ca.CommentCount, 0) AS CommentCount,
COALESCE(va.UpVotes, 0) AS UpVotes,
COALESCE(va.DownVotes, 0) AS DownVotes
FROM FilteredPosts AS fp
LEFT JOIN CommentAgg AS ca ON fp.PostId = ca.PostId
LEFT JOIN VoteAgg AS va ON fp.PostId = va.PostId
ORDER BY fp.Score DESC;
GO
Script 10 — Fix Logic Bug: Duplicate Badge Class Count [Recommendation #7 / Code Correction]
USE [SQLStorm];
GO
-- Recommendation #7 (Code fix): Query hash 6DA5B718F2D3A8A5
-- Original (INCORRECT - all three columns return identical counts because
-- COUNT(b.Id) ignores the JOIN filter by class):
--
-- FROM Users u LEFT JOIN Badges b ON u.Id = b.UserId AND b.Class = 1
-- SELECT COUNT(b.Id) AS GoldBadges,
-- COUNT(b.Id) AS SilverBadges, -- WRONG: same as GoldBadges
-- COUNT(b.Id) AS BronzeBadges -- WRONG: same as GoldBadges
--
-- CORRECTED version using conditional aggregation (no join filter by class):
WITH UserBadges AS (
SELECT
u.Id AS UserId,
SUM(CASE WHEN b.Class = 1 THEN 1 ELSE 0 END) AS GoldBadges,
SUM(CASE WHEN b.Class = 2 THEN 1 ELSE 0 END) AS SilverBadges,
SUM(CASE WHEN b.Class = 3 THEN 1 ELSE 0 END) AS BronzeBadges,
COUNT(b.Id) AS TotalBadges
FROM dbo.Users AS u
LEFT JOIN dbo.Badges AS b ON u.Id = b.UserId
GROUP BY u.Id
)
SELECT
ub.UserId,
ub.GoldBadges,
ub.SilverBadges,
ub.BronzeBadges,
ub.TotalBadges
FROM UserBadges AS ub
ORDER BY ub.GoldBadges DESC;
GO
Script 11 — Verify New Indexes Are Being Used (Post-Deployment Check)
USE [SQLStorm];
GO
-- Run after workload executes against new indexes
-- Check that new indexes are accumulating seeks (not being ignored by optimizer)
SELECT
OBJECT_NAME(ix.object_id) AS TableName,
ix.name AS IndexName,
ix.type_desc AS IndexType,
ius.user_seeks,
ius.user_scans,
ius.user_lookups,
ius.user_updates,
ius.last_user_seek,
ius.last_user_scan
FROM sys.indexes AS ix
INNER JOIN sys.dm_db_index_usage_stats AS ius
ON ix.object_id = ius.object_id
AND ix.index_id = ius.index_id
AND ius.database_id = DB_ID()
WHERE OBJECT_NAME(ix.object_id) IN (
'PostHistory', 'Votes', 'Posts', 'Comments', 'Badges'
)
AND ix.name IN (
'IX_PostHistory_PostId_Covering',
'IX_Votes_PostId_Covering',
'IX_Votes_UserId_Filtered_Covering',
'IX_Posts_CreationDate_Covering',
'IX_Comments_PostId_Covering',
'IX_Badges_UserId_Covering'
)
ORDER BY TableName, IndexName;
GO