In this last piece, we are going to look a bit deeper into queries and play with a few more advanced scenarios. We will cover relationships of Active Record models a bit more in this article, but I'll stay away from examples that could be too confusing for programming newbies. Before you move forward, stuff like the example below should not cause any confusion:
Mission.last.agents.where(name: 'James Bond')
If you are new to Active Record queries and SQL, I recommend that you take a look at my previous two articles before you continue. This one might be hard to swallow without the knowledge that I was building up so far. Up to you, of course. On the flip side, this article is not going to be as long as the other ones if you just want to look at these slightly advanced use cases. Let’s dig in!
Topics
- Scopes & Associations
- Slimmer Joins
- Merge
- has_many
- Custom Joins
Scopes & Associations
Let’s reiterate. We can query Active Record models right away, but associations are also fair game for queries—and we can chain all these thingies. So far so good. We can pack finders into neat, reusable scopes in your models as well, and I briefly mentioned their similarity to class methods.
Rails
class Agent < ActiveRecord::Base belongs_to :mission scope :find_bond, -> { where(name: 'James Bond') } scope :licenced_to_kill, -> { where(licence_to_kill: true) } scope :womanizer, -> { where(womanizer: true) } scope :gambler, -> { where(gambler: true) } end # => Agent.find_bond # => Agent.licenced_to_kill # => Agent.womanizer # => Agent.gambler # => Mission.last.agents.find_bond # => Mission.last.agents.licenced_to_kill # => Mission.last.agents.womanizer # => Mission.last.agents.gambler # => Agent.licenced_to_kill.womanizer.gambler # => Mission.last.agents.womanizer.gambler.licenced_to_kill
So you can also pack them into your own class methods and be done with it. Scopes are not iffy or anything, I think—although people mention them as being a bit magic here and there—but since class methods achieve the same thing, I’d opt for that.
Rails
class Agent < ActiveRecord::Base belongs_to :mission def self.find_bond where(name: 'James Bond') end def self.licenced_to_kill where(licence_to_kill: true) end def self.womanizer where(womanizer: true) end def self.gambler where(gambler: true) end end # => Agent.find_bond # => Agent.licenced_to_kill # => Agent.womanizer # => Agent.gambler # => Mission.last.agents.find_bond # => Mission.last.agents.licenced_to_kill # => Mission.last.agents.womanizer # => Mission.last.agents.gambler # => Agent.licenced_to_kill.womanizer.gambler # => Mission.last.agents.womanizer.gambler.licenced_to_kill
These class methods read just the same, and you don’t need to stab anybody with a lambda. Whatever works best for you or your team; it's up to you which API you want to use. Just don’t mix and match them—stick with once choice! Both versions let you easily chain these methods inside another class method, for example:
Rails
class Agent < ActiveRecord::Base belongs_to :mission scope :licenced_to_kill, -> { where(licence_to_kill: true) } scope :womanizer, -> { where(womanizer: true) } def self.find_licenced_to_kill_womanizer womanizer.licenced_to_kill end end # => Agent.find_licenced_to_kill_womanizer # => Mission.last.agents.find_licenced_to_kill_womanizer
Rails
class Agent < ActiveRecord::Base belongs_to :mission def self.licenced_to_kill where(licence_to_kill: true) end def self.womanizer where(womanizer: true) end def self.find_licenced_to_kill_womanizer womanizer.licenced_to_kill end end # => Agent.find_licenced_to_kill_womanizer # => Mission.last.agents.find_licenced_to_kill_womanizer
Let’s take this one little step further—stay with me. We can use a lambda in associations themselves to define a certain scope. It looks a bit weird at first, but they can be pretty handy. That makes it possible to call these lambdas right on your associations.
This is pretty cool and highly readable with shorter methods chaining going on. Beware of coupling these models too tight, though.
Rails
class Mission < ActiveRecord::Base has_many :double_o_agents, -> { where(licence_to_kill: true) }, class_name: "Agent" end # => Mission.double_o_agents
Tell me this isn’t cool somehow! It's not for everyday use, but dope to have I guess. So here Mission
can “request” only agents that have the licence to kill.
A word about the syntax, since we strayed from naming conventions and used something more expressive like double_o_agents
. We need to mention the class name in order not to confuse Rails, which might otherwise expect to look for a class DoubleOAgent
. You can, of course, have both Agent
associations in place—the usual and your custom one—and Rails won’t complain.
Rails
class Mission < ActiveRecord::Base has_many :agents has_many :double_o__agents, -> { where(licence_to_kill: true) }, class_name: "Agent" end # => Mission.agents # => Mission.double_o_agents
Slimmer Joins
When you query the database for records and you don't need all of the data, you can choose to specify what exactly you want returned. Why? Because the data returned to Active Record will eventually be built into new Ruby Objects. Let’s look at one simple strategy to avoid memory bloat in your Rails app:
Rails
class Mission < ActiveRecord::Base has_many :agents end class Agent < ActiveRecord::Base belongs_to :mission end
Rails
Agent.all.joins(:mission)
SQL
SELECT "agents".* FROM "agents" INNER JOIN "missions" ON "missions"."id" = "agents"."mission_id"
So this query returns a list of agents with a mission from the database to Active Record—which then in turn sets out to build Ruby objects out of it. The mission
data is available since the data from these rows is joined onto the rows of the agent’s data. That means, the joined data is available during the query but will not get returned to Active Record. So you will have this data to perform calculations, for example.
It’s especially cool because you can make use of data that does not also get sent back to your app. Fewer attributes that need to get built into Ruby objects—that take up memory—can be a big win. In general, think about sending only the absolute necessary rows and columns back that you need. That way you can avoid bloat quite a bit.
Rails
Agent.all.joins(:mission).where(missions: { objective: "Saving the world" })
Just a quick aside about the syntax here: because we are not querying the Agent
table via where
, but the joined :mission
table, we need to specify that we are looking for specific missions
in our WHERE
clause.
SQL
SELECT "agents".* FROM "agents" INNER JOIN "missions" ON "missions"."id" = "agents"."mission_id" WHERE "missions"."objective" = ? [["objective", "Saving the world"]]
Using includes
here would also return missions to Active Record for eager loading and take up memory building Ruby objects.
Merge
A merge
comes in handy, for example, when we want to combine a query on agents and their associated missions that have a specific scope that is defined by you. We can take two ActiveRecord::Relation
objects and merge their conditions. Sure, no biggie, but merge
is useful if you want to use a certain scope while using an association.
In other words, what we can do with merge
is filter by a named scope on the joined model. In one of the previous examples, we used class methods to define such named scopes ourselves.
Rails
class Mission < ActiveRecord::Base has_many :agents def self.dangerous where(enemy: "Ernst Stavro Blofeld") end end class Agent < ActiveRecord::Base belongs_to :mission end
Rails
Agent.joins(:mission).merge(Mission.dangerous)
SQL
SELECT "agents".* FROM "agents" INNER JOIN "missions" ON "missions"."id" = "agents"."mission_id" WHERE "missions"."enemy" = ? [["enemy", "Ernst Stavro Blofeld"]]
When we encapsulate what a dangerous
mission is within the Mission
model, we can tuck it on a join
via merge
this way. So moving the logic of such conditions to the relevant model where it belongs is on the one side a nice technique to achieve looser coupling—we don’t want our Active Record models to know a lot of details about each other—and on the other hand, it gives you a nice API in your joins without blowing up in your face. The example below without merge would not work without an error:
Rails
Agent.all.merge(Mission.dangerous)
SQL
SELECT "agents".* FROM "agents" WHERE "missions"."enemy" = ? [["enemy", "Ernst Stavro Blofeld"]]
When we now merge an ActiveRecord::Relation
object for our missions onto our agents, the database doesn’t know which missions we are talking about. We need to make clear which association we need and join the mission data first—or SQL gets confused. One last cherry on top. We can encapsulate this even better by involving the agents as well:
Rails
class Mission < ActiveRecord::Base has_many :agents def self.dangerous where(enemy: "Ernst Stavro Blofeld") end end class Agent < ActiveRecord::Base belongs_to :mission def self.double_o_engagements joins(:mission).merge(Mission.dangerous) end end
Rails
Agent.double_o_engagements
SQL
SELECT "agents".* FROM "agents" INNER JOIN "missions" ON "missions"."id" = "agents"."mission_id" WHERE "missions"."enemy" = ? [["enemy", "Ernst Stavro Blofeld"]]
That’s some sweet cherry in my book. Encapsulation, proper OOP, and great readability. Jackpot!
has_many
Above we have seen the belongs_to
association in action a lot. Let’s take a look at this from another perspective and bring secret service sections into the mix:
Rails
class Section < ActiveRecord::Base has_many :agents end class Mission < ActiveRecord::Base has_many :agents end class Agent < ActiveRecord::Base belongs_to :mission belongs_to :section end
So in this scenario, agents would not only have a mission_id
but also a section_id
. So far so good. Let’s find all sections with agents with a specific mission—so sections that have some sort of mission going on.
Rails
Section.joins(:agents)
SQL
SELECT "sections".* FROM "sections" INNER JOIN "agents" ON "agents"."section_id" = "sections."id"
Did you notice something? A small detail is different. The foreign keys are flipped. Here we are requesting a list of sections but use foreign keys like this: "agents"."section_id" = "sections."id"
. In other words, we are looking for a foreign key from a table we are joining.
Rails
Agent.joins(:mission)
SQL
SELECT "agents".* FROM "agents" INNER JOIN "missions" ON "missions"."id" = "agents"."mission_id"
Previously our joins via a belongs_to
association looked like this: the foreign keys were mirrored ("missions"."id" = "agents"."mission_id"
) and were looking for the foreign key from the table we are starting.
Going back to your has_many
scenario, we now would get a list of sections that are repeated because they have multiple agents in each section, of course. So for each agent column that gets joined on, we get a row for this section or section_id—in short, we are duplicating rows basically. To make this even more dizzying, let’s bring missions into the mix as well.
Rails
Section.joins(agents: :mission)
SQL
SELECT "sections".* FROM "sections" INNER JOIN "agents" ON "agents"."section_id" = "sections"."id" INNER JOIN "missions" ON "missions"."id" = "agents"."mission_id"
Check out the two INNER JOIN
parts. Still with me? We are “reaching” through agents to their missions from the agent’s section. Yeah, stuff for fun headaches, I know. What we get are missions that indirectly associate with a certain section.
As a result, we get new columns joined on, but the number of rows is still the same that gets returned by this query. What gets sent back to Active Record—resulting in building new Ruby objects—is also still the list of sections. So when we have multiple missions going on with multiple agents, we would get duplicated rows for our section again. Let’s filter this some more:
Rails
Section.joins(agents: :mission).where(missions: { enemy: "Ernst Stavro Blofeld" })
SQL
SELECT "sections".* FROM "sections" INNER JOIN "agents" ON "agents"."section_id" = "sections"."id" INNER JOIN "missions" ON "missions"."id" = "agents"."mission_id" WHERE "missions"."enemy" = 'Ernst Stavro Blofeld'
Now we only get sections returned that are involved in missions where Ernst Stavro Blofeld is the involved enemy. Cosmopolitan as some super villains might think of themselves, they could operate in more than one section—say section A and C, the United Sates and Canada respectively.
If we have multiple agents in a given section who are working on the same mission to stop Blofeld or whatever, we would again have repeated rows returned to us into Active Record. Let’s be a bit more distinct about it:
Rails
Section.joins(agents: :mission).where(missions: { enemy: "Ernst Stavro Blofeld" }).distinct
SQL
SELECT DISTINCT "sections".* FROM "sections" INNER JOIN "agents" ON "agents"."section_id" = "sections"."id" INNER JOIN "missions" ON "missions"."id" = "agents"."mission_id" WHERE "missions"."enemy" = 'Ernst Stavro Blofeld'
What this gives us is the number of sections that Blofeld operates out of—that are known—which has agents active in missions with him as an enemy. As a final step, let’s do some refactoring again. We extract this into a nice “little” class method on class Section
:
Rails
class Section < ActiveRecord::Base has_many :agents def self.critical joins(agents: :mission).where(missions: { enemy: "Ernst Stavro Blofeld" }).distinct end end class Mission < ActiveRecord::Base has_many :agents end class Agent < ActiveRecord::Base belongs_to :mission belongs_to :section end
You can refactor this even more and split up the responsibilities to achieve looser coupling, but let’s move on for now.
Custom Joins
Most of the time you can rely on Active Record writing the SQL that you want for you. That means that you stay in Ruby land and don’t need to worry about database details too much. But sometimes you need to poke a hole into SQL land and do your own thing. For example, if you need to use a LEFT
join and break out of Active Record’s usual behaviour of doing an INNER
join by default. joins
is a little window to write your own custom SQL if needed. You open it, plug in your custom query code, close the “window” and can continue to add on Active Record query methods.
Let’s demonstrate this with an example that involves gadgets. Let’s say a typical agent usually has_many
gadgets, and we want to find agents that are not equipped with any fancy gadgetry to help them in the field. A usual join would not yield good results since we are actually interested in nil
—or null
in SQL speak—values of these spy toys.
Rails
class Mission < ActiveRecord::Base has_many :agents end class Agent < ActiveRecord::Base belongs_to :mission has_many :gadgets end class Gadget < ActiveRecord::Base belongs_to :agent end
When we do a joins
operation, we will get only agents returned that are already equipped with gadgets because the agent_id
on these gadgets is not nil. This is the expected behaviour of a default inner join. The inner join builds upon a match on both sides and returns only rows of data that match this condition. A non-existent gadget with a nil
value for an agent who carries no gadget does not match that criterion.
Rails
Agent.joins(:gadgets)
SQL
SELECT "agents".* FROM "agents" INNER JOIN "gadgets" ON "gadgets"."agent_id" = "agents"."id"
We are, on the other hand, looking for schmuck agents that are in dire need of some love from the quartermaster. Your first guess might have looked like the following:
Rails
Agent.joins(:gadgets).where(gadgets: {agent_id: nil})
SQL
SELECT "agents".* FROM "agents" INNER JOIN "gadgets" ON "gadgets"."agent_id" = "agents"."id" WHERE "gadgets"."agent_id" IS NULL
Not bad, but as you can see from the SQL output, it does not play along and still insists on the default INNER JOIN
. That’s a scenario where we need an OUTER
join, because one side of our “equation” is missing, so to speak. We are looking for results for gadgets that don’t exist—more precisely, for agents without gadgets.
So far, when we passed a symbol to Active Record in a join, it was expecting an association. With a string passed in, on the other hand, it expects it to be an actual fragment of SQL code—a part of your query.
Rails
Agent.joins("LEFT OUTER JOIN gadgets ON gadgets.agent_id = agents.id").where(gadgets: {agent_id: nil})
SQL
SELECT "agents".* FROM "agents" LEFT OUTER JOIN gadgets ON gadgets.agent_id = agents.id WHERE "gadgets"."agent_id" IS NULL
Or, if you are curious about lazy agents without missions—hanging possibly in Barbados or wherever—our custom join would look like this:
Rails
Agent.joins("LEFT OUTER JOIN missions ON missions.id = agents.mission_id").where(missions: { id: nil })
SQL
SELECT "agents".* FROM "agents" LEFT OUTER JOIN missions ON missions.id = agents.mission_id WHERE "missions"."id" IS NULL
The outer join is the more inclusive join version since it will match all records from the joined tables, even if some of these relationships don’t exist yet. Because this approach is not as exclusive as inner joins, you will get a bunch of nils here and there. This can be informative in some cases, of course, but inner joins are nevertheless usually what we are looking for. Rails 5 will let us use a specialized method called left_outer_joins
instead for such cases. Finally!
One little thing for the road: keep these peeping holes into SQL land as small as possible if you can. You will do everybody—including your future self—a tremendous favor.
Final Thoughts
Getting Active Record to write efficient SQL for you is one of the main skills you should take away from this mini-series for beginners. That way, you will also get code that is compliant with whatever database it supports—meaning that the queries will be stable across databases. It is necessary that you understand not only how to play with Active Record but also the underlying SQL, which is of equal importance.
Yes, SQL can be boring, tedious to read and not elegant looking, but don’t forget that Rails wraps Active Record around SQL and you should not neglect understanding this vital piece of technology—just because Rails makes it very easy not to care most of the time. Efficiency is crucial for database queries, especially if you build something for larger audiences with heavy traffic.
Now go out on the internets and find some more material on SQL to get it out of your system—once and for all!
Comments