ActiveRecord::Batches provides public methods like find_each, find_in_batches and in_batches to work with the records in batches, which helps in reducing memory consumption.
Before Rails 6.1, these methods always ordered records by the primary key, in ascending order.
User.find_each(batch_size: 5) do |user|
  puts "Processing #{user.id}"
end
User Load (0.2ms) SELECT "users".* FROM "users" ORDER BY "users"."id" ASC LIMIT ? [["LIMIT", 5]]
Processing 1
Processing 2
Processing 3
Processing 4
Processing 5
User Load (0.4ms) SELECT "users".* FROM "users" WHERE "users"."id" > ? ORDER BY "users"."id" ASC LIMIT ? [["id", 5], ["LIMIT", 5]]
Processing 6
In many cases, we might need to process newer records before older ones.
This wasn’t possible even if we set the order manually on the scope used for batching, because find_each, find_in_batches and in_batches override it with their own primary key ordering.
With Rails 6.1
Rails 6.1 adds an order option to the find_each, find_in_batches and in_batches methods. It accepts :asc (the default) or :desc.
find_each
Let’s take a look at how this works now for find_each.
User.find_each(batch_size: 5, order: :desc) do |user|
  puts "Processing #{user.id}"
end
User Load (0.3ms) SELECT "users".* FROM "users" ORDER BY "users"."id" DESC LIMIT ? [["LIMIT", 5]]
Processing 6
Processing 5
Processing 4
Processing 3
Processing 2
User Load (0.4ms) SELECT "users".* FROM "users" WHERE "users"."id" < ? ORDER BY "users"."id" DESC LIMIT ? [["id", 2], ["LIMIT", 5]]
Processing 1
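The two queries in the log can be read as keyset pagination: each batch is fetched with ORDER BY and LIMIT, and the next batch starts from the last id seen, using > for ascending and < for descending order. The following is a minimal plain-Ruby sketch of that idea, not Rails' actual implementation. USER_IDS and each_id_batch are hypothetical names, and an in-memory array stands in for the users table.

```ruby
# Hypothetical in-memory stand-in for the users table: ids 1..6.
USER_IDS = (1..6).to_a

# Yields ids batch by batch, mimicking the queries in the logs above:
#   ORDER BY id ASC|DESC LIMIT batch_size
# followed by WHERE id > last_id (asc) or WHERE id < last_id (desc).
def each_id_batch(ids, batch_size:, order: :asc)
  raise ArgumentError, "order must be :asc or :desc" unless %i[asc desc].include?(order)

  last_id = nil
  loop do
    # WHERE clause: keep only rows beyond the last id already seen.
    candidates = ids.select do |id|
      last_id.nil? || (order == :asc ? id > last_id : id < last_id)
    end

    # ORDER BY ... LIMIT batch_size
    sorted = candidates.sort
    sorted.reverse! if order == :desc
    batch = sorted.first(batch_size)
    break if batch.empty?

    yield batch

    # A short batch means there is nothing left to fetch.
    break if batch.size < batch_size

    last_id = batch.last
  end
end

desc_batches = []
each_id_batch(USER_IDS, batch_size: 5, order: :desc) { |batch| desc_batches << batch }
p desc_batches # => [[6, 5, 4, 3, 2], [1]]
```

With order: :desc the cursor after the first batch is 2, which matches the ["id", 2] bind value in the second query of the log.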
find_in_batches
The same option works for find_in_batches:
User.find_in_batches(batch_size: 5, order: :desc) do |user_group|
  user_group.each do |user|
    puts "Processing #{user.id}"
  end
end
User Load (0.5ms) SELECT "users".* FROM "users" ORDER BY "users"."id" DESC LIMIT ? [["LIMIT", 5]]
Processing 6
Processing 5
Processing 4
Processing 3
Processing 2
User Load (0.3ms) SELECT "users".* FROM "users" WHERE "users"."id" < ? ORDER BY "users"."id" DESC LIMIT ? [["id", 2], ["LIMIT", 5]]
Processing 1
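The output matches the find_each example because find_each is built on top of find_in_batches: it takes each yielded batch and yields its records one at a time. The sketch below illustrates that layering in plain Ruby. The names find_in_batches_sketch and find_each_sketch are hypothetical, and a simple each_slice over a sorted array stands in for the real batched queries.

```ruby
# Hypothetical stand-in for find_in_batches: yields arrays of ids,
# batch_size at a time, in the requested primary-key order.
def find_in_batches_sketch(ids, batch_size:, order: :asc)
  sorted = order == :desc ? ids.sort.reverse : ids.sort
  sorted.each_slice(batch_size) { |batch| yield batch }
end

# Hypothetical stand-in for find_each: same batches, but yields one id at a time.
def find_each_sketch(ids, batch_size:, order: :asc)
  find_in_batches_sketch(ids, batch_size: batch_size, order: order) do |batch|
    batch.each { |id| yield id }
  end
end

batches = []
find_in_batches_sketch((1..6).to_a, batch_size: 5, order: :desc) { |b| batches << b }
p batches # => [[6, 5, 4, 3, 2], [1]]

seen = []
find_each_sketch((1..6).to_a, batch_size: 5, order: :desc) { |id| seen << id }
p seen # => [6, 5, 4, 3, 2, 1]
```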
in_batches
The option is supported by in_batches as well:
User.in_batches(of: 5, order: :desc) do |user_relation|
  user_relation.each do |user|
    puts "Processing #{user.id}"
  end
end
(0.5ms) SELECT "users"."id" FROM "users" ORDER BY "users"."id" DESC LIMIT ? [["LIMIT", 5]]
User Load (1.2ms) SELECT "users".* FROM "users" WHERE "users"."id" IN (?, ?, ?, ?, ?) [["id", 6], ["id", 5], ["id", 4], ["id", 3], ["id", 2]]
Processing 2
Processing 3
Processing 4
Processing 5
Processing 6
(0.3ms) SELECT "users"."id" FROM "users" WHERE "users"."id" < ? ORDER BY "users"."id" DESC LIMIT ? [["id", 2], ["LIMIT", 5]]
User Load (0.3ms) SELECT "users".* FROM "users" WHERE "users"."id" = ? [["id", 1]]
Processing 1
As shown in the above example, in_batches with order: :desc selects the ids from the users table with a descending ORDER BY clause, so the batches themselves arrive in descending order.
Note that in_batches just yields a relation for each batch.
That relation carries no ordering of its own (the second query in each pair has no ORDER BY), so the records within a batch come back in whatever order the database returns them. In this example that happens to be ascending primary key order.
We can make this behave like the examples above by applying an order to the intermediate relation:
User.in_batches(of: 5, order: :desc) do |user_relation|
  user_relation.order(id: :desc).each do |user|
    puts "Processing #{user.id}"
  end
end
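The two-step pattern visible in the in_batches log (first select the ids for a batch, then load the rows for those ids) can be sketched in plain Ruby to show where the ordering is lost and how an explicit order restores it. Row, TABLE, pluck_batch_ids and load_rows are hypothetical names. The sketch also assumes the storage returns unordered rows in primary-key order, which is what the log above shows but is not something a database guarantees.

```ruby
# Hypothetical in-memory table, stored in primary-key order.
Row = Struct.new(:id)
TABLE = (1..6).map { |id| Row.new(id) }

# Step 1, mirroring SELECT id ... ORDER BY id DESC LIMIT n:
# pick the ids that make up one batch.
def pluck_batch_ids(table, batch_size:, below: nil)
  ids = table.map(&:id)
  ids = ids.select { |id| id < below } if below
  ids.sort.reverse.first(batch_size)
end

# Step 2, mirroring SELECT * ... WHERE id IN (...): there is no ORDER BY,
# so rows come back in storage order (assumed here to be primary-key order).
def load_rows(table, ids)
  table.select { |row| ids.include?(row.id) }
end

ids = pluck_batch_ids(TABLE, batch_size: 5)
unordered = load_rows(TABLE, ids).map(&:id)
ordered = load_rows(TABLE, ids).sort_by { |row| -row.id }.map(&:id)

p ids       # => [6, 5, 4, 3, 2]
p unordered # => [2, 3, 4, 5, 6]
p ordered   # => [6, 5, 4, 3, 2]
```

The sort_by call plays the role that order(id: :desc) plays on the yielded relation: the batch boundaries are unchanged, only the order inside each batch differs.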