SQL Server Code Review – RockyPC.SQLStorm – 2026-05-10

Tuning Goal: Code Review
Server: RockyPC  |  Database: SQLStorm  |  Version: Microsoft SQL Server 2022 (RTM-GDR) 16.0.1175.1 – Developer Edition (64-bit)

Executive Summary

The reviewed objects are predominantly read-only reporting procedures over a Stack Exchange-shaped schema (Posts, Users, Votes, Comments, PostHistory, Badges). The code base shows consistent anti-patterns: full-table aggregations followed by TOP N, fan-out joins between multiple one-to-many tables (causing duplicate-row inflation), LIKE '%tag%' scans of Posts.Tags, join-on-display-name (non-unique), and several outright correctness bugs (Cartesian joins, wrong key joins, joining Users.Id to Posts.Id). One write procedure (usp_UpdateTablesWithDelay) intentionally holds an open transaction with WAITFOR DELAY '00:00:05', which is the most dangerous item in the set: it serializes all readers/writers blocked behind it for 5+ seconds per call, amplified by the calling looper usp_UpdateTablesInLoop.

94
Stored Procedures Reviewed
0
Functions / Triggers / Views Provided
8
Tables > 1,000 Rows
14
Prioritized Findings

Note: Only the most recent stored procedures were provided; older objects, functions, triggers, and views were not in scope.

Review Scope & Object Counts

Object TypeCount Reviewed
SQL_STORED_PROCEDURE94
SQL_SCALAR_FUNCTION / TVF0 (none provided)
VIEW0 (none provided)
TRIGGER0 (none provided)

Only the most recent objects were available for review.

Top Priorities (in order)

  1. Critical Eliminate the 5-second WAITFOR inside an open transaction in usp_UpdateTablesWithDelay; refactor the calling loop usp_UpdateTablesInLoop.
  2. Critical Fix correctness bugs: Cartesian joins (sp09974, sp09895, sp09833, sp09881), wrong-key join in sp09826 (Posts.Id = Users.Id), sp00990 (RankedPosts.PostId = UserReputation.UserId), sp09833 (RankedPosts.PostId IS NOT NULL as join key), and join-by-DisplayName in many procs.
  3. High Replace SELECT TOP 1 ... ORDER BY NEWID() per iteration in usp_UpdateTablesInLoop (full scans of Users and Posts per call).
  4. High Stop fan-out: pre-aggregate Votes, Comments, PostHistory, Badges in CTEs before joining (e.g., sp09994, sp09991, sp09972, sp09907, sp09851).
  5. High Replace p.Tags LIKE '%' + t.TagName + '%' with a normalized PostTags bridge table (sp09982, sp09990, sp00992, sp09881, sp09886).
  6. Medium Repair broken date arithmetic: CAST('2024-10-01 ...' AS DATETIME) - DATEADD(YEAR, 1, 0) does not mean "one year ago" (sp09978, sp09921, sp09905, sp09844).
  7. Medium Add targeted indexes (see §9). The existing IX_Posts_OwnerUserId_Covering is unused — confirm/clean.

Detailed Prioritized Recommendations

1. Critical Long-held transaction with WAITFOR DELAY Confidence: Very High

dbo.usp_UpdateTablesWithDelay holds an explicit transaction open for 5+ seconds while UPDATE Users and UPDATE Posts retain X locks on the modified key(s) (and U-locks during seek). With XACT_ABORT ON + BEGIN TRAN, any reader using default RC will block. usp_UpdateTablesInLoop magnifies this by calling it repeatedly, while itself doing ORDER BY NEWID() against multi-million-row tables.

Fix: Never wait inside a transaction. If you must simulate latency, do it outside.

-- Refactored
CREATE OR ALTER PROCEDURE dbo.usp_UpdateTablesWithDelay
    @UserId INT, @NewReputation INT, @PostId INT, @NewScore INT
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;

    BEGIN TRY
        BEGIN TRAN;
            UPDATE dbo.Users
               SET Reputation = @NewReputation, LastAccessDate = SYSUTCDATETIME()
             WHERE Id = @UserId;

            UPDATE dbo.Posts
               SET Score = @NewScore, LastActivityDate = SYSUTCDATETIME()
             WHERE Id = @PostId;
        COMMIT;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK;
        THROW;
    END CATCH;

    -- Optional pacing OUTSIDE the transaction
    -- WAITFOR DELAY '00:00:05';
END;

Also enable READ_COMMITTED_SNAPSHOT on SQLStorm to reduce reader/writer blocking globally, and avoid PRINT in hot procedures (it forces network round-trips).

2. High ORDER BY NEWID() for random picks Confidence: Very High

In usp_UpdateTablesInLoop:

SELECT TOP 1 @RandomUserId = Id FROM dbo.Users    ORDER BY NEWID();   -- full scan, 267k rows
SELECT TOP 1 @RandomPostId = Id FROM dbo.Posts    ORDER BY NEWID();   -- full scan, 246k rows

Each iteration scans both tables. Use TABLESAMPLE or a key-range pick:

DECLARE @MaxUserId INT = (SELECT MAX(Id) FROM dbo.Users);
DECLARE @MaxPostId INT = (SELECT MAX(Id) FROM dbo.Posts);

SELECT TOP (1) @RandomUserId = Id
FROM dbo.Users
WHERE Id >= ABS(CHECKSUM(NEWID())) % @MaxUserId
ORDER BY Id;

3. Critical Cartesian / CROSS JOIN correctness bugs Confidence: High

  • sp09974: UserStats CROSS JOIN PopularPosts — every user × every popular post (10× row inflation, no filter).
  • sp09895: same anti-pattern with a redundantly duplicated inner query.
  • sp09833: BadgeDistribution bd ON bd.Reputation > 0 — a non-equi predicate that yields a Cartesian-like product.
  • sp09881: UserVoteSummary u CROSS JOIN PopularTags t then joined by display name (also wrong, see #4).
  • sp09982: JOIN TagSummary ts ON ts.PostCount > 5 — non-correlated predicate, Cartesian.
  • sp09861, sp09909, sp09823, sp00990: similar non-correlated ON clauses (e.g., ON UA.PostsChanged > 0, ON ups.PostCount > 5).

Fix: Every JOIN ... ON must include an equality predicate between the two tables. If you really need a "Top 10 tags overall × Top 10 users", do that explicitly and document it; otherwise add the missing equi-join.

4. Critical Wrong-key joins Confidence: High

  • sp09826: JOIN Users u ON tp.PostId = u.Id — joins post id to user id; results are coincidental matches.
  • sp00990: JOIN UserReputation ur ON rp.PostId = ur.UserId — same class of bug.
  • sp09909, sp09867, sp09895, sp09886, sp09831, sp09828, sp09824, sp09815: join on DisplayName, which is not unique in Users and produces duplicate/incorrect results.
  • sp09891: LEFT JOIN PostTypes pt ON tp.PostId = pt.Id — joins post id to post-type id (1..7).
  • sp09833: JOIN PostTypes pt ON rp.PostId IS NOT NULL — no predicate.
  • sp09825: subquery SELECT TOP 1 DisplayName FROM Users WHERE Id = b.UserId referencing the outer alias b incorrectly.

Fix: Always join on surrogate keys. Add a UNIQUE constraint or accept that DisplayName is non-unique and never join on it.

5. High p.Tags LIKE '%' + TagName + '%' Confidence: Very High

Used in sp09982, sp09990, sp00992, sp09881, sp09886, sp09915. This pattern is unsargable, scans all 246k Posts per Tag row, and is functionally incorrect (e.g., sql matches mssql).

Fix: Normalize tags into a bridge:

CREATE TABLE dbo.PostTags(
    PostId INT NOT NULL,
    TagId  INT NOT NULL,
    CONSTRAINT PK_PostTags PRIMARY KEY (PostId, TagId)
);
CREATE INDEX IX_PostTags_TagId_PostId ON dbo.PostTags(TagId, PostId);

-- Or, if you must keep Posts.Tags '|sql|tsql|', use STRING_SPLIT in 2022
-- with proper bracket boundaries and an exact match, not LIKE '%...%'.

6. High Pre-aggregate to avoid fan-out Confidence: Very High

Many procedures LEFT JOIN Posts, Comments, Votes, and Badges in one shot, then GROUP BY u.Id. With Votes (926k) and PostHistory (847k), this multiplies row counts by 10–100×. Examples: sp09991, sp09972, sp09994, sp09907, sp09851, sp09908, sp09812, sp09955.

-- Wrong (fan-out: posts × votes × badges)
FROM Users u
LEFT JOIN Posts  p ON u.Id = p.OwnerUserId
LEFT JOIN Votes  v ON p.Id = v.PostId
LEFT JOIN Badges b ON u.Id = b.UserId
GROUP BY u.Id;

-- Right (independent aggregates, then join)
;WITH P AS (SELECT OwnerUserId AS UserId, COUNT(*) AS PostCnt FROM Posts GROUP BY OwnerUserId),
      V AS (SELECT p.OwnerUserId AS UserId,
                   SUM(CASE WHEN v.VoteTypeId=2 THEN 1 ELSE 0 END) AS Up,
                   SUM(CASE WHEN v.VoteTypeId=3 THEN 1 ELSE 0 END) AS Down
            FROM Votes v JOIN Posts p ON p.Id=v.PostId
            GROUP BY p.OwnerUserId),
      B AS (SELECT UserId, COUNT(*) AS BadgeCnt FROM Badges GROUP BY UserId)
SELECT u.Id, u.DisplayName, P.PostCnt, V.Up, V.Down, B.BadgeCnt
FROM Users u
LEFT JOIN P ON P.UserId=u.Id
LEFT JOIN V ON V.UserId=u.Id
LEFT JOIN B ON B.UserId=u.Id;

sp00060Fixed already follows this pattern correctly and is a good template.

7. High Broken date arithmetic Confidence: Very High

Several procs use:

p.CreationDate >= CAST('2024-10-01 12:34:56' AS DATETIME) - DATEADD(YEAR, 1, 0)
-- DATEADD(YEAR,1,0) => '1901-01-01'
-- DATETIME - DATETIME treats RHS as a number of days, producing nonsense.

Found in sp09978, sp09921, sp09905, sp09844, sp09963, sp00990 (similar). Replace with:

p.CreationDate >= DATEADD(YEAR, -1, CAST('2024-10-01 12:34:56' AS datetime2(0)))

Also remove hard-coded "current date" literals — parameterize @AsOf.

8. Medium Correlated subqueries instead of joins Confidence: High

sp09974, sp09946, sp09940, sp09904, sp09908, sp09891, sp09099: multiple per-row (SELECT COUNT(*) FROM Votes WHERE PostId = ...) calls cause 1 scan per outer row. Use a single grouped CTE joined once. sp09908's STRING_AGG over Votes×Users for every top post will be especially expensive on 926k Votes.

9. Medium Indexing recommendations Confidence: Medium

Usage stats show all listed nonclustered indexes have 0 seeks/scans, suggesting the workload either runs cold or lookups are missing. Validate, then ensure these exist:

-- Posts: most queries filter by OwnerUserId, PostTypeId, CreationDate
CREATE INDEX IX_Posts_OwnerUserId_Type_Date
    ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate)
    INCLUDE (Score, ViewCount, Title, AnswerCount, CommentCount, AcceptedAnswerId)
    WITH (DATA_COMPRESSION = PAGE);

CREATE INDEX IX_Posts_CreationDate_Type
    ON dbo.Posts(CreationDate, PostTypeId)
    INCLUDE (Score, ViewCount, OwnerUserId, Title)
    WITH (DATA_COMPRESSION = PAGE);

-- Votes: extreme fan-out source
CREATE INDEX IX_Votes_PostId_VoteTypeId      ON dbo.Votes(PostId, VoteTypeId)
    INCLUDE (UserId, BountyAmount) WITH (DATA_COMPRESSION = PAGE);
CREATE INDEX IX_Votes_UserId_VoteTypeId      ON dbo.Votes(UserId, VoteTypeId)
    WITH (DATA_COMPRESSION = PAGE);

-- Comments
CREATE INDEX IX_Comments_PostId  ON dbo.Comments(PostId)  INCLUDE (UserId, Score, CreationDate);
CREATE INDEX IX_Comments_UserId  ON dbo.Comments(UserId)  INCLUDE (PostId, Score, CreationDate);

-- Badges
CREATE INDEX IX_Badges_UserId_Class ON dbo.Badges(UserId, Class) INCLUDE (Date, Name);

-- PostHistory (847k rows, 820 MB — biggest object)
CREATE INDEX IX_PostHistory_PostId_Type_Date
    ON dbo.PostHistory(PostId, PostHistoryTypeId, CreationDate) INCLUDE (UserId);

-- Consider columnstore for analytical procs (Enterprise/Developer Edition)
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Posts
    ON dbo.Posts(OwnerUserId, PostTypeId, Score, ViewCount, AnswerCount, CommentCount, CreationDate);

Investigate why IX_Posts_OwnerUserId_Covering shows 0 seeks despite many procs filtering on OwnerUserId — likely the queries are scanning the clustered index due to SELECT *-style projections or non-sargable predicates.

10. Medium Misuse of ORDER BY in CTE / inline TOP Confidence: Very High

sp09886, sp09895 contain ORDER BY in a CTE without TOP/OFFSET — this is invalid or non-deterministic. sp09950, sp09923, sp09929… use RANK()/ROW_NUMBER() over a single key with no tiebreaker, producing non-deterministic ordering across runs. Always include a tiebreaker (e.g., ... ORDER BY Score DESC, Id DESC).

sp09971: ROW_NUMBER() OVER (PARTITION BY p.Id ORDER BY p.CreationDate DESC) over the primary key always returns 1 — the partition is pointless.

11. Medium Reserved words and style hazards Confidence: High

  • Rank is a reserved built-in function name; many procs use it as a column alias (sp09994, sp09908, sp09887, sp09947, etc.). Rename to RankNo/PostRank and bracket if necessary.
  • Mixed case for the same column (p.ANSWERCOUNT in sp09995) — works but defeats text matching, plan cache, and casing consistency on case-sensitive collations.
  • SELECT * from CTEs (sp09837, sp09891, sp09889, sp09820) — fragile and wastes I/O.

12. Medium SET options & error handling Confidence: High

  • All SELECT procs include SET NOCOUNT ON ✅, but none set SET XACT_ABORT ON (good for read-only) and none specify SET TRANSACTION ISOLATION LEVEL READ COMMITTED SNAPSHOT/SNAPSHOT as appropriate. Enable RCSI database-wide to make these heavy reporters non-blocking against OLTP writers.
  • usp_UpdateTablesWithDelay uses SELECT ... AS Status to communicate errors and THROW. Pick one — do not return result sets from CATCH; let the exception propagate.
  • usp_UpdateTablesInLoop swallows errors with BEGIN CATCH ... PRINT .... Log to a table or rethrow categorically.

13. Low Leverage SQL Server 2022 features Confidence: Medium

  • These reporting procs vary heavily by parameter (often none, but joining different filters via constants) — once parameterized, Parameter Sensitive Plan optimization (PSP) will adapt for skewed predicates like OwnerUserId and PostTypeId automatically.
  • DOP feedback and memory grant feedback (percentile) in 2022 will benefit the large GROUP BY / RANK procs over Votes/PostHistory; ensure database COMPATIBILITY_LEVEL = 160.
  • Use GENERATE_SERIES, STRING_SPLIT(... , separator, enable_ordinal) for tag parsing if you keep the denormalized Posts.Tags column.
  • Consider Nonclustered Columnstore on Posts, Votes, PostHistory (Developer/Enterprise edition is licensed for it) — the workload is exactly what columnstore wins at (large GROUP BY, COUNT/SUM, top-N).

14. Low Naming and maintainability Confidence: High

  • Procedure names like sp09994 are opaque. Use a verb-noun convention: usp_Report_TopUsersByReputation.
  • Avoid the sp_/sp00... prefix — SQL Server scans master first for objects starting with sp_, and the numeric prefix invites confusion with system procedures.
  • Many procs hard-code '2024-10-01 12:34:56'. Parameterize @AsOfDate datetime2(0) and @TopN int so the plan cache is reused and the report is repeatable.
  • Several procs (sp09994, sp09940, sp09824) embed (SELECT OwnerUserId FROM Posts WHERE Id = tp.PostId) — fragile if duplicates emerge; replace with a proper join using surrogate keys.