SQL Server Code Review – RockyPC.SQLStorm – 2026-05-10

Tuning Goal: Code Review

Server: RockyPC | Database: SQLStorm | Version: Microsoft SQL Server 2022 (RTM-GDR) 16.0.1175.1 – Developer Edition (64-bit)

Executive Summary

The reviewed objects are predominantly read-only reporting procedures over a Stack Exchange-shaped schema (Posts, Users, Votes, Comments, PostHistory, Badges). The code base shows consistent anti-patterns: full-table aggregations followed by TOP N, fan-out joins between multiple one-to-many tables (causing duplicate-row inflation), LIKE '%tag%' scans of Posts.Tags, join-on-display-name (non-unique), and several outright correctness bugs (Cartesian joins, wrong key joins, joining Users.Id to Posts.Id). One write procedure (usp_UpdateTablesWithDelay) intentionally holds an open transaction with WAITFOR DELAY '00:00:05', which is the most dangerous item in the set: it serializes all readers/writers blocked behind it for 5+ seconds per call, amplified by the calling looper usp_UpdateTablesInLoop.

Stored Procedures Reviewed

Functions / Triggers / Views Provided

Tables > 1,000 Rows

Prioritized Findings

Note: Only the most recent stored procedures were provided; older objects, functions, triggers, and views were not in scope.

Review Scope & Object Counts

Object Type	Count Reviewed
SQL_STORED_PROCEDURE	94
SQL_SCALAR_FUNCTION / TVF	0 (none provided)
VIEW	0 (none provided)
TRIGGER	0 (none provided)

Only the most recent objects were available for review.

Top Priorities (in order)

Critical Eliminate the 5-second WAITFOR inside an open transaction in usp_UpdateTablesWithDelay; refactor the calling loop usp_UpdateTablesInLoop.
Critical Fix correctness bugs: Cartesian joins (sp09974, sp09895, sp09833, sp09881), wrong-key join in sp09826 (Posts.Id = Users.Id), sp00990 (RankedPosts.PostId = UserReputation.UserId), sp09833 (RankedPosts.PostId IS NOT NULL as join key), and join-by-DisplayName in many procs.
High Replace SELECT TOP 1 ... ORDER BY NEWID() per iteration in usp_UpdateTablesInLoop (full scans of Users and Posts per call).
High Stop fan-out: pre-aggregate Votes, Comments, PostHistory, Badges in CTEs before joining (e.g., sp09994, sp09991, sp09972, sp09907, sp09851).
High Replace p.Tags LIKE '%' + t.TagName + '%' with a normalized PostTags bridge table (sp09982, sp09990, sp00992, sp09881, sp09886).
Medium Repair broken date arithmetic: CAST('2024-10-01 ...' AS DATETIME) - DATEADD(YEAR, 1, 0) does not mean "one year ago" (sp09978, sp09921, sp09905, sp09844).
Medium Add targeted indexes (see §9). The existing IX_Posts_OwnerUserId_Covering is unused — confirm/clean.

Detailed Prioritized Recommendations

1. Critical Long-held transaction with `WAITFOR DELAY` Confidence: Very High

dbo.usp_UpdateTablesWithDelay holds an explicit transaction open for 5+ seconds while UPDATE Users and UPDATE Posts retain X locks on the modified key(s) (and U-locks during seek). With XACT_ABORT ON + BEGIN TRAN, any reader using default RC will block. usp_UpdateTablesInLoop magnifies this by calling it repeatedly, while itself doing ORDER BY NEWID() against multi-million-row tables.

Fix: Never wait inside a transaction. If you must simulate latency, do it outside.

-- Refactored
CREATE OR ALTER PROCEDURE dbo.usp_UpdateTablesWithDelay
    @UserId INT, @NewReputation INT, @PostId INT, @NewScore INT
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;

    BEGIN TRY
        BEGIN TRAN;
            UPDATE dbo.Users
               SET Reputation = @NewReputation, LastAccessDate = SYSUTCDATETIME()
             WHERE Id = @UserId;

            UPDATE dbo.Posts
               SET Score = @NewScore, LastActivityDate = SYSUTCDATETIME()
             WHERE Id = @PostId;
        COMMIT;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK;
        THROW;
    END CATCH;

    -- Optional pacing OUTSIDE the transaction
    -- WAITFOR DELAY '00:00:05';
END;

Also enable READ_COMMITTED_SNAPSHOT on SQLStorm to reduce reader/writer blocking globally, and avoid PRINT in hot procedures (it forces network round-trips).

2. High `ORDER BY NEWID()` for random picks Confidence: Very High

In usp_UpdateTablesInLoop:

SELECT TOP 1 @RandomUserId = Id FROM dbo.Users    ORDER BY NEWID();   -- full scan, 267k rows
SELECT TOP 1 @RandomPostId = Id FROM dbo.Posts    ORDER BY NEWID();   -- full scan, 246k rows

Each iteration scans both tables. Use TABLESAMPLE or a key-range pick:

DECLARE @MaxUserId INT = (SELECT MAX(Id) FROM dbo.Users);
DECLARE @MaxPostId INT = (SELECT MAX(Id) FROM dbo.Posts);

SELECT TOP (1) @RandomUserId = Id
FROM dbo.Users
WHERE Id >= ABS(CHECKSUM(NEWID())) % @MaxUserId
ORDER BY Id;

3. Critical Cartesian / CROSS JOIN correctness bugs Confidence: High

sp09974: UserStats CROSS JOIN PopularPosts — every user × every popular post (10× row inflation, no filter).
sp09895: same anti-pattern with a redundantly duplicated inner query.
sp09833: BadgeDistribution bd ON bd.Reputation > 0 — a non-equi predicate that yields a Cartesian-like product.
sp09881: UserVoteSummary u CROSS JOIN PopularTags t then joined by display name (also wrong, see #4).
sp09982: JOIN TagSummary ts ON ts.PostCount > 5 — non-correlated predicate, Cartesian.
sp09861, sp09909, sp09823, sp00990: similar non-correlated ON clauses (e.g., ON UA.PostsChanged > 0, ON ups.PostCount > 5).

Fix: Every JOIN ... ON must include an equality predicate between the two tables. If you really need a "Top 10 tags overall × Top 10 users", do that explicitly and document it; otherwise add the missing equi-join.

4. Critical Wrong-key joins Confidence: High

sp09826: JOIN Users u ON tp.PostId = u.Id — joins post id to user id; results are coincidental matches.
sp00990: JOIN UserReputation ur ON rp.PostId = ur.UserId — same class of bug.
sp09909, sp09867, sp09895, sp09886, sp09831, sp09828, sp09824, sp09815: join on DisplayName, which is not unique in Users and produces duplicate/incorrect results.
sp09891: LEFT JOIN PostTypes pt ON tp.PostId = pt.Id — joins post id to post-type id (1..7).
sp09833: JOIN PostTypes pt ON rp.PostId IS NOT NULL — no predicate.
sp09825: subquery SELECT TOP 1 DisplayName FROM Users WHERE Id = b.UserId referencing the outer alias b incorrectly.

Fix: Always join on surrogate keys. Add a UNIQUE constraint or accept that DisplayName is non-unique and never join on it.

5. High `p.Tags LIKE '%' + TagName + '%'` Confidence: Very High

Used in sp09982, sp09990, sp00992, sp09881, sp09886, sp09915. This pattern is unsargable, scans all 246k Posts per Tag row, and is functionally incorrect (e.g., sql matches mssql).

Fix: Normalize tags into a bridge:

CREATE TABLE dbo.PostTags(
    PostId INT NOT NULL,
    TagId  INT NOT NULL,
    CONSTRAINT PK_PostTags PRIMARY KEY (PostId, TagId)
);
CREATE INDEX IX_PostTags_TagId_PostId ON dbo.PostTags(TagId, PostId);

-- Or, if you must keep Posts.Tags '|sql|tsql|', use STRING_SPLIT in 2022
-- with proper bracket boundaries and an exact match, not LIKE '%...%'.

6. High Pre-aggregate to avoid fan-out Confidence: Very High

Many procedures LEFT JOIN Posts, Comments, Votes, and Badges in one shot, then GROUP BY u.Id. With Votes (926k) and PostHistory (847k), this multiplies row counts by 10–100×. Examples: sp09991, sp09972, sp09994, sp09907, sp09851, sp09908, sp09812, sp09955.

-- Wrong (fan-out: posts × votes × badges)
FROM Users u
LEFT JOIN Posts  p ON u.Id = p.OwnerUserId
LEFT JOIN Votes  v ON p.Id = v.PostId
LEFT JOIN Badges b ON u.Id = b.UserId
GROUP BY u.Id;

-- Right (independent aggregates, then join)
;WITH P AS (SELECT OwnerUserId AS UserId, COUNT(*) AS PostCnt FROM Posts GROUP BY OwnerUserId),
      V AS (SELECT p.OwnerUserId AS UserId,
                   SUM(CASE WHEN v.VoteTypeId=2 THEN 1 ELSE 0 END) AS Up,
                   SUM(CASE WHEN v.VoteTypeId=3 THEN 1 ELSE 0 END) AS Down
            FROM Votes v JOIN Posts p ON p.Id=v.PostId
            GROUP BY p.OwnerUserId),
      B AS (SELECT UserId, COUNT(*) AS BadgeCnt FROM Badges GROUP BY UserId)
SELECT u.Id, u.DisplayName, P.PostCnt, V.Up, V.Down, B.BadgeCnt
FROM Users u
LEFT JOIN P ON P.UserId=u.Id
LEFT JOIN V ON V.UserId=u.Id
LEFT JOIN B ON B.UserId=u.Id;

sp00060Fixed already follows this pattern correctly and is a good template.

7. High Broken date arithmetic Confidence: Very High

Several procs use:

p.CreationDate >= CAST('2024-10-01 12:34:56' AS DATETIME) - DATEADD(YEAR, 1, 0)
-- DATEADD(YEAR,1,0) => '1901-01-01'
-- DATETIME - DATETIME treats RHS as a number of days, producing nonsense.

Found in sp09978, sp09921, sp09905, sp09844, sp09963, sp00990 (similar). Replace with:

p.CreationDate >= DATEADD(YEAR, -1, CAST('2024-10-01 12:34:56' AS datetime2(0)))

Also remove hard-coded "current date" literals — parameterize @AsOf.

8. Medium Correlated subqueries instead of joins Confidence: High

sp09974, sp09946, sp09940, sp09904, sp09908, sp09891, sp09099: multiple per-row (SELECT COUNT(*) FROM Votes WHERE PostId = ...) calls cause 1 scan per outer row. Use a single grouped CTE joined once. sp09908's STRING_AGG over Votes×Users for every top post will be especially expensive on 926k Votes.

9. Medium Indexing recommendations Confidence: Medium

Usage stats show all listed nonclustered indexes have 0 seeks/scans, suggesting the workload either runs cold or lookups are missing. Validate, then ensure these exist:

-- Posts: most queries filter by OwnerUserId, PostTypeId, CreationDate
CREATE INDEX IX_Posts_OwnerUserId_Type_Date
    ON dbo.Posts(OwnerUserId, PostTypeId, CreationDate)
    INCLUDE (Score, ViewCount, Title, AnswerCount, CommentCount, AcceptedAnswerId)
    WITH (DATA_COMPRESSION = PAGE);

CREATE INDEX IX_Posts_CreationDate_Type
    ON dbo.Posts(CreationDate, PostTypeId)
    INCLUDE (Score, ViewCount, OwnerUserId, Title)
    WITH (DATA_COMPRESSION = PAGE);

-- Votes: extreme fan-out source
CREATE INDEX IX_Votes_PostId_VoteTypeId      ON dbo.Votes(PostId, VoteTypeId)
    INCLUDE (UserId, BountyAmount) WITH (DATA_COMPRESSION = PAGE);
CREATE INDEX IX_Votes_UserId_VoteTypeId      ON dbo.Votes(UserId, VoteTypeId)
    WITH (DATA_COMPRESSION = PAGE);

-- Comments
CREATE INDEX IX_Comments_PostId  ON dbo.Comments(PostId)  INCLUDE (UserId, Score, CreationDate);
CREATE INDEX IX_Comments_UserId  ON dbo.Comments(UserId)  INCLUDE (PostId, Score, CreationDate);

-- Badges
CREATE INDEX IX_Badges_UserId_Class ON dbo.Badges(UserId, Class) INCLUDE (Date, Name);

-- PostHistory (847k rows, 820 MB — biggest object)
CREATE INDEX IX_PostHistory_PostId_Type_Date
    ON dbo.PostHistory(PostId, PostHistoryTypeId, CreationDate) INCLUDE (UserId);

-- Consider columnstore for analytical procs (Enterprise/Developer Edition)
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Posts
    ON dbo.Posts(OwnerUserId, PostTypeId, Score, ViewCount, AnswerCount, CommentCount, CreationDate);

Investigate why IX_Posts_OwnerUserId_Covering shows 0 seeks despite many procs filtering on OwnerUserId — likely the queries are scanning the clustered index due to SELECT *-style projections or non-sargable predicates.

10. Medium Misuse of `ORDER BY` in CTE / inline TOP Confidence: Very High

sp09886, sp09895 contain ORDER BY in a CTE without TOP/OFFSET — this is invalid or non-deterministic. sp09950, sp09923, sp09929… use RANK()/ROW_NUMBER() over a single key with no tiebreaker, producing non-deterministic ordering across runs. Always include a tiebreaker (e.g., ... ORDER BY Score DESC, Id DESC).

sp09971: ROW_NUMBER() OVER (PARTITION BY p.Id ORDER BY p.CreationDate DESC) over the primary key always returns 1 — the partition is pointless.

11. Medium Reserved words and style hazards Confidence: High

Rank is a reserved built-in function name; many procs use it as a column alias (sp09994, sp09908, sp09887, sp09947, etc.). Rename to RankNo/PostRank and bracket if necessary.
Mixed case for the same column (p.ANSWERCOUNT in sp09995) — works but defeats text matching, plan cache, and casing consistency on case-sensitive collations.
SELECT * from CTEs (sp09837, sp09891, sp09889, sp09820) — fragile and wastes I/O.

12. Medium SET options & error handling Confidence: High

All SELECT procs include SET NOCOUNT ON ✅, but none set SET XACT_ABORT ON (good for read-only) and none specify SET TRANSACTION ISOLATION LEVEL READ COMMITTED SNAPSHOT/SNAPSHOT as appropriate. Enable RCSI database-wide to make these heavy reporters non-blocking against OLTP writers.
usp_UpdateTablesWithDelay uses SELECT ... AS Status to communicate errors and THROW. Pick one — do not return result sets from CATCH; let the exception propagate.
usp_UpdateTablesInLoop swallows errors with BEGIN CATCH ... PRINT .... Log to a table or rethrow categorically.

13. Low Leverage SQL Server 2022 features Confidence: Medium

These reporting procs vary heavily by parameter (often none, but joining different filters via constants) — once parameterized, Parameter Sensitive Plan optimization (PSP) will adapt for skewed predicates like OwnerUserId and PostTypeId automatically.
DOP feedback and memory grant feedback (percentile) in 2022 will benefit the large GROUP BY / RANK procs over Votes/PostHistory; ensure database COMPATIBILITY_LEVEL = 160.
Use GENERATE_SERIES, STRING_SPLIT(... , separator, enable_ordinal) for tag parsing if you keep the denormalized Posts.Tags column.
Consider Nonclustered Columnstore on Posts, Votes, PostHistory (Developer/Enterprise edition is licensed for it) — the workload is exactly what columnstore wins at (large GROUP BY, COUNT/SUM, top-N).

14. Low Naming and maintainability Confidence: High

Procedure names like sp09994 are opaque. Use a verb-noun convention: usp_Report_TopUsersByReputation.
Avoid the sp_/sp00... prefix — SQL Server scans master first for objects starting with sp_, and the numeric prefix invites confusion with system procedures.
Many procs hard-code '2024-10-01 12:34:56'. Parameterize @AsOfDate datetime2(0) and @TopN int so the plan cache is reused and the report is repeatable.
Several procs (sp09994, sp09940, sp09824) embed (SELECT OwnerUserId FROM Posts WHERE Id = tp.PostId) — fragile if duplicates emerge; replace with a proper join using surrogate keys.