Hey fellow Rails developers! Ever found yourself wrestling with complex SQL queries that involve aggregations, subqueries, or needing to join against computed results? Today, we’re diving into one of the most powerful yet underutilized techniques in ActiveRecord’s toolkit: derived tables. They’re not just for SQL purists anymore!
What Are Derived Tables Anyway?
In SQL parlance, a derived table is essentially a subquery that behaves like a table. You create it on the fly within your main query, and then join against it as if it were a regular database table. In Rails, this pattern lets us break down complex queries into manageable, composable pieces while maintaining performance.
You can read more about it here.
Let’s explore this concept through a real-world example I recently implemented:
The Problem: Finding "Stale" Client Accounts
Imagine you’re building a security findings platform. You need to identify clients who haven’t had a security finding in the last 25 days – what we call "stale" accounts… that includes Clients which have never had a finding as well.
So far so good, this would be very straightforward to implement, something along the lines of:
Client.left_joins(:findings)
.group(:id)
.having(
"MAX(findings.created_at) IS NULL OR MAX(findings.created_at) < ?",
25.days.ago
)Sigh, if only things in the real world were so simple… However, we can’t simply count all findings like that… imagine we have Groups of Findings and state of Findings and different conditions around counting them or not… this query starts to become quite complex and ugly with all kinds of CASEs inside (if you go that route)…
On top of that, you need to perform this fetching in a filterable and pageable way and, of course, build it to be scalable/performant. If you’re not convinced how difficult this can get yet, take a look at the solution I got to using derived tables:
def fetch_top_stale_accounts
stale_threshold = 25.days.ago
group_name_filter = params[:group_name].present? ? params[:group_name] : Group::TYPES_WE_CARE_ABOUT
findings_scope = Finding
.published
.joins(:group)
.where(groups: {group_name: group_name_filter})
last_finding_subquery = findings_scope
.select("DISTINCT ON (findings.client_id) findings.client_id, findings.published_at AS last_published_at, groups.group_name AS last_finding_group_name")
.reorder("findings.client_id, findings.published_at DESC")
relation = Client.active
.joins("LEFT JOIN (#{last_finding_subquery.to_sql}) AS last_findings ON clients.id = last_findings.client_id")
.where("last_findings.last_published_at < ? OR last_findings.last_published_at IS NULL", stale_threshold)
.where(id: Group.active.where(group_name: group_name_filter).select(:client_id))
.select("clients.*, last_findings.last_published_at, last_findings.last_finding_group_name")
.reorder(Arel.sql("last_findings.last_published_at ASC NULLS LAST"))
relation
endBreaking Down the Magic
Step 1: Building the Derived Table
last_finding_subquery = findings_scope
.select("DISTINCT ON (findings.client_id) findings.client_id, findings.published_at AS last_published_at, groups.group_name AS last_finding_group_name")
.reorder("findings.client_id, findings.published_at DESC")Here, we’re using PostgreSQL’s DISTINCT ON to get the most recent finding for each client… This, together with the sorting, effectively replaces the group(id).having(MAX) logic in the straightforward example I mentioned before.
This subquery becomes our derived table – essentially a virtual table containing each client’s latest finding as well as all the data from different tables we’ll ultimately need to display.
Step 2: The Derived Table Join
.joins("LEFT JOIN (#{last_finding_subquery.to_sql}) AS last_findings ON clients.id = last_findings.client_id")This is where the magic happens! We’re:
Converting the ActiveRecord relation to raw SQL with
.to_sqlEmbedding it directly in our JOIN clause
Giving it an alias (
last_findings) so we can reference it like a regular table
Step 3: Querying Against Our Derived Data
.where("last_findings.last_published_at < ? OR last_findings.last_published_at IS NULL", stale_threshold)
.select("clients.*, last_findings.last_published_at, last_findings.last_finding_group_name")Now we can filter and sort from our derived table just like any other table! And ActiveRecord is clever enough to put those "derived fields" in the in-memory Client object so that we can grab those pieces of data later without needing to fetch anything again.
Why Not Use Regular ActiveRecord Associations?
You might be thinking: There’s got to be a way to do this step by step with cleaner common ActiveRecord methods, without resorting to writing pieces of SQL.
I don’t blame you, I struggle with that notion myself a lot as well, and to be honest, maybe you could, but that would ultimately have worse performance or memory costs in some shape or form, I believe… besides, having a "one query fits all" really helps handling sorting and pagination.
And don’t forget, databases can optimize derived table queries better than what you can probably do with multiple separate queries…
Alternative Approaches (And Why We Didn’t Use Them)
1. ActiveRecord Subqueries (Rails 6.1+)
# Good for simple cases, but limited
Client.where(id: Finding.select(:client_id).distinct)2. CTEs (Common Table Expressions)
# Also powerful, but sometimes overkill and verbose syntax (to not say weird)
last_findings = Finding.select(...).to_cte('last_findings')
Client.joins("LEFT JOIN #{last_findings.to_sql} ON ...")3. Application-Side Processing
# The "easy" way that doesn't scale
clients = Client.all
clients.select { |c| c.findings.max_by(&:published_at) < 25.days.ago }
# ⚠️ Warning: This loads everything into memory!Pro Tips for Derived Table Success
Test SQL Output: Always check
.to_sqlduring developmentParameter Safety: Use bind variables (
.where("... < ?", value)) to prevent SQL injectionIndex Strategically: Ensure your derived table's join columns are indexed
Profile Performance: Use
EXPLAIN ANALYZEon complex queries to compare them & local memory profilers to get a sense of when the "simple" way is not working anymoreKeep It Readable: Add comments for complex derived tables
(one of my sins :p)
When to Reach for Derived Tables
You need to join against aggregated data (like "latest" or "max" records)
Your query involves complex levels of grouping/filtering/sorting
Performance matters and you want to avoid multiple round trips
The logic is too complex for ActiveRecord's standard query methods
Final Thoughts
Derived tables bridge the gap between ActiveRecord's convenience and raw SQL's power. They let you write performant, complex queries while still keeping your code in the Rails ecosystem.
I, personally, find them to be an elegant way to modularizing SQL logic.
The next time you find yourself reaching for raw SQL or writing inefficient application-side logic, consider whether a derived table could be your solution. It might look intimidating at first, but once you master this pattern, you'll unlock a new level of query capabilities.
But remember: With great power comes great responsibility. Use derived tables judiciously, and always document complex queries for your teammates (and future you!).
Happy querying! 🚀
Reference
[derived_tables]: https://www.geeksforgeeks.org/plsql/pl-sql-derived-tables/ "here"
We want to work with you. Check out our Services page!

