Rails includes vs left (outer) join

Includes

product = Post.create(title: "Product - 1")
product.store_products.create(store_name: "Store - 1", price: 100)
product.store_products.create(store_name: "Store - 2", price: 105)
product = Post.create(title: "Product - 2")
product.store_products.create(store_name: "Store - 1", price: 65)
product.store_products.create(store_name: "Store - 2", price: 63)

If we use the following code to fetch products and associated store products

products = Product.allproducts.each do |product|
  product.store_product.size
end

SELECT `products`.* FROM `products

SELECT COUNT(*) FROM `store_products` WHERE `store_products`.`product_id` = 1
SELECT COUNT(*) FROM `store_products` WHERE `store_products`.`product_id` = 1
SELECT COUNT(*) FROM `store_products` WHERE `store_products`.`product_id` = 2
SELECT COUNT(*) FROM `store_products` WHERE `store_products`.`product_id` = 2

This is known as an N+1 query situation, where a single query ran to select all products, then a separate query for its association.

This is where includes come for the rescue, let see the result with includes

products = Product.includes(:store_products)products.each do |product|
  product.store_product.size
end

SELECT `products`.* FROM `products
SELECT `store_products`.* FROM `store_products` WHERE `store_products`.`product_id` IN (1, 2)

Includes loads the association data in a single separate query, much fewer reads from the database

Includes with references

Now what if you want to have a condition on association

products = Product.includes(:store_products)
  .where("store_products.price > 65")

Mysql2::Error: Unknown column 'store_products.price' in 'where clause'

Now this is what happens (at least in Rails 4), to fix this we can use references

products = Product.includes(:store_products)
  .references(:store_producs)
  .where("store_products.price > 65")

# products.size => 2

products.each do |product|
  product.store_product.size
end

SELECT `products`.`id` AS t0_r0, `products`.`title` AS t0_r1, `store_products`.`id` AS t1_r0,
    `store_products`.`price` AS t1_r1
FROM `products` LEFT OUTER JOIN `store_producs` ON `store_products`.`product_id` = `products`.`id`

Now a left outer join is used to get the association data

Includes with references vs left (outer) join

If includes with references is using left join what is the difference between Includes with references and left join

Let’s do the left join query

products = Product.joins(“left join store_producs on store_producs.product_id = products.id”)
  .select(“products.*, store_producs.*”)
  .where("store_products.price > 65")

# products.size => 4

Left join returns 4 records (in a way duplicate result).

2 products x 2 site products
Product - 1 | Store-1 Product
Product - 1 | Store-2 Product
Product - 2 | Store-1 Product
Product - 2 | Store-2 Product

The queries generated

products = Product.joins(“left outer join store_producs on store_producs.product_id = products.id”)
  .select(“products.*, store_producs.*”)
  .where("store_products.price > 65")

products.each do |product|
  product.store_product.size
end

SELECT products.*, store_products.*
FROM `products` left outer join store_products on store_products.product_id = products.id
WHERE (store_products.price_in_cents > 65)

SELECT COUNT(*) FROM `store_products` WHERE `store_products`.`product_id` = 1
SELECT COUNT(*) FROM `store_products` WHERE `store_products`.`product_id` = 1
SELECT COUNT(*) FROM `store_products` WHERE `store_products`.`product_id` = 2
SELECT COUNT(*) FROM `store_products` WHERE `store_products`.`product_id` = 2

Oh!!! when used left join it runs seperate query when we accessed the association.

Again let’s see the includes with references result

products = Product.includes(:store_products).references(:store_producs).where("store_products.price > 65")

products.each do |product|
  product.store_product.size
end
# products.size => 2

SELECT `products`.`id` AS t0_r0, `products`.`title` AS t0_r1, `store_products`.`id` AS t1_r0,
  `store_products`.`price` AS t1_r1
FROM `products` LEFT OUTER JOIN `store_producs` ON `store_products`.`product_id` = `products`.`id`
WHERE (store_products.price_in_cents > 65)

Includes only return two records, as it internally merges the similar products records. Also, the association is not queried separately when accessed

Includes with reference is much smarter than left join

But if you run the following, you can there are actually four records in includes also

products = Product.includes(:store_products)
  .references(:store_producs)
  .where("store_products.price > 65")
products.size => 2
products.pluck(:id).size => 4 # not sure a feature or bug in Rails 4

Note: I am using Rails 4 while writing this.

Leave a Reply

Your email address will not be published.