My Boss Needed a Report in 10 Minutes. It Was Going to Take 45. Here's How I Fixed It Forever.
The Walk of Shame to My Manager's Desk
It was 9:52am. The Monday operations meeting started at 10. My manager wanted the weekly performance report on her screen before she walked in. The query had been running since 9:38. The progress bar — if SQL Server Management Studio had one — would have been laughing at me.
I hit F5 at 9:38. It was still running at 9:55. I walked to her desk and did the thing you never want to do: I told her the report wasn't ready. She nodded the way people nod when they've stopped expecting better.
I went back to my desk and spent the rest of that day not working on anything else. I was going to fix this query or die trying. What I discovered changed how I write SQL permanently.
What the Query Was Doing Wrong
The original query was 340 lines. It had nested subqueries seven levels deep. It was doing full table scans on a 40-million-row fact table. It was using SELECT * everywhere. It had no indexes on any of the join columns. It was recalculating the same date logic five times across five different subqueries.
It worked. It was just doing the computational equivalent of digging a tunnel with a spoon.
Fix 1: Kill SELECT * Immediately
Every SELECT * pulls every column across the wire. On a table with 60 columns where you need 4, you're transferring 15× the data you need — and preventing the query optimizer from using covering indexes.
-- Before (pulling 60 columns, killing index coverage)
SELECT * FROM dbo.sales_fact sf
JOIN dbo.dim_date d ON sf.date_key = d.date_key
-- After (explicit columns, optimizer can use covering index)
SELECT
sf.transaction_id,
sf.revenue,
sf.region_key,
d.fiscal_week
FROM dbo.sales_fact sf
JOIN dbo.dim_date d ON sf.date_key = d.date_key
This change alone dropped execution time by 18% on the fact table joins.
Fix 2: CTEs Over Nested Subqueries
Seven levels of nested subqueries doesn't just hurt readability — it forces the database to materialize each result set inside-out. CTEs let the optimizer see the full query plan and make smarter decisions about execution order.
-- Before: nested subquery hell
SELECT region, total_rev
FROM (
SELECT region, SUM(revenue) as total_rev
FROM (
SELECT * FROM sales_fact WHERE date_key > 20250101
) filtered
GROUP BY region
) grouped
WHERE total_rev > 10000
-- After: CTEs that the optimizer can reason about
WITH filtered_sales AS (
SELECT region, revenue
FROM dbo.sales_fact
WHERE date_key > 20250101
),
regional_totals AS (
SELECT region, SUM(revenue) AS total_rev
FROM filtered_sales
GROUP BY region
)
SELECT region, total_rev
FROM regional_totals
WHERE total_rev > 10000
Cleaner, faster, and your future self won't curse your name.
Fix 3: Index the Join Columns
The 40-million-row fact table had no non-clustered indexes on date_key, region_key, or product_key — the three columns used in every join. Every join was a full scan.
-- Add covering indexes on frequently joined columns
CREATE NONCLUSTERED INDEX IX_sales_fact_date_region
ON dbo.sales_fact (date_key, region_key)
INCLUDE (revenue, transaction_id);
-- Check what the optimizer is actually doing
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
After adding three targeted indexes, the fact table scans became seeks. Query time dropped from 45 minutes to 12 minutes. We were getting somewhere.
Fix 4: Pre-aggregate Before Joining
The original query was joining the raw 40M-row fact table to dimension tables, then aggregating. This means every join operation happened on 40 million rows before any aggregation reduced the data set.
Flip it: aggregate first, join the much smaller result to your dimensions.
WITH daily_region_totals AS (
-- Aggregate the 40M rows DOWN to ~10,000 rows first
SELECT date_key, region_key, SUM(revenue) AS revenue
FROM dbo.sales_fact
GROUP BY date_key, region_key
)
-- NOW join the small result to your dimensions
SELECT
d.fiscal_week,
r.region_name,
drt.revenue
FROM daily_region_totals drt
JOIN dbo.dim_date d ON drt.date_key = d.date_key
JOIN dbo.dim_region r ON drt.region_key = r.region_key
This took us from 12 minutes to 4 minutes. The joins were now operating on thousands of rows, not millions.
Fix 5: Partition Pruning for Date Ranges
For queries that always filter by date — which is almost every operational report — table partitioning by date key lets SQL Server skip entire file groups instead of scanning the full table.
The database shouldn't read January's data to answer a question about March. Partitioning makes sure it doesn't.
Combined, these five changes took the query from 45 minutes to 4 minutes and 12 seconds. The next Monday, the report was on my manager's screen at 9:45. She didn't say anything. She didn't need to.
The Checklist
- Replace every
SELECT *with explicit columns - Convert nested subqueries to CTEs
- Index your join and filter columns (and include the columns you SELECT)
- Aggregate before you join, not after
- Use
SET STATISTICS IO ONto see what the optimizer is actually doing - Partition large tables on date ranges if you always filter by date