While ActiveRecord works pretty great across most scenarios, it struggles with multi-database architectures. Fortunately, the past few releases have had a lot of work done in this aspect.
One major drawback is the way transactions work in a multi-database architecture. A transaction is basically a set of database operations that are performed as a single unit. Here’s an example,
irb(main):001:1* ActiveRecord::Base.transaction do
irb(main):002:1* Company.first.touch
irb(main):003:1* Company.last.touch
irb(main):004:0> end
TRANSACTION (0.1ms) begin transaction
Company Load (0.2ms) SELECT "companies".* FROM "companies" ORDER BY "companies"."id" ASC LIMIT ? [["LIMIT", 1]]
Company Update (1.2ms) UPDATE "companies" SET "updated_at" = ? WHERE "companies"."id" = ? [["updated_at", "2022-08-21 10:48:22.853590"], ["id", 1]]
Company Load (0.1ms) SELECT "companies".* FROM "companies" ORDER BY "companies"."id" DESC LIMIT ? [["LIMIT", 1]]
Company Update (0.1ms) UPDATE "companies" SET "updated_at" = ? WHERE "companies"."id" = ? [["updated_at", "2022-08-21 10:48:22.860248"], ["id", 2]]
TRANSACTION (1.1ms) commit transaction
...
Though there are two database operations (Company.first.touch
and Company.last.touch
),
they are performed within a single transaction.
However, when you use a transaction across multiple databases, it is essentially left up-to the developer to ensure that the transactions are distributed across the databases in a way that is consistent with the intended use case. This is because transactions work on a single database connection.
Before
Let’s take the example of a multi-database architecture where there is a Company model on the primary database
and a Freelancer model on the secondary database.
Both these models are connected via the has_many :through
association
via the Contract
model on the primary database.
irb(main):001:0> company = Company.create! name: 'Saeloun'
...
irb(main):002:0> freelancer1 = Freelancer.create! name: 'Joe'
...
irb(main):003:0> freelancer2 = Freelancer.create! name: 'Schmoe'
...
irb(main):004:0> company.freelancers = [freelancer1, freelancer2]
TRANSACTION (0.1ms) begin transaction
Contract Create (0.8ms) INSERT INTO "contract" ("company_id", "freelancer_id", "created_at", "updated_at") VALUES (?, ?, ?, ?) [["company_id", 1], ["freelancer_id", 1], ["created_at", "2022-08-21 07:23:41.521082"], ["updated_at", "2022-08-21 07:23:41.521082"]]
TRANSACTION (1.2ms) commit transaction
TRANSACTION (0.0ms) begin transaction
Contract Create (0.4ms) INSERT INTO "contract" ("company_id", "freelancer_id", "created_at", "updated_at") VALUES (?, ?, ?, ?) [["company_id", 2], ["freelancer_id", 1], ["created_at", "2022-08-21 07:23:41.525693"], ["updated_at", "2022-08-21 07:23:41.525693"]]
TRANSACTION (0.6ms) commit transaction
...
As you can see, two transactions take place to insert two records into the contract
table.
This is because the company.freelancers = [freelancer1, freelancer2]
operation is performed on the primary database connection
and the Contract.create!
operation is performed on the secondary database connection.
It is up to the developer to wrap this around a double transaction block to ensure that the two operations are performed within a single transaction.
irb(main):001:1* Company.transaction do
irb(main):002:2* Freelancer.transaction do
irb(main):003:2* company.freelancers = [freelancer1, freelancer2]
irb(main):004:1* end
irb(main):005:0> end
TRANSACTION (0.1ms) begin transaction
Contract Create (0.7ms) INSERT INTO "contract" ("company_id", "freelancer_id", "created_at", "updated_at") VALUES (?, ?, ?, ?) [["company_id", 1], ["freelancer_id", 1], ["created_at", "2022-08-21 07:32:41.351072"], ["updated_at", "2022-08-21 07:32:41.351072"]]
Contract Create (0.1ms) INSERT INTO "contract" ("company_id", "freelancer_id", "created_at", "updated_at") VALUES (?, ?, ?, ?) [["company_id", 2], ["freelancer_id", 1], ["created_at", "2022-08-21 07:32:41.351083"], ["updated_at", "2022-08-21 07:32:41.351083"]]
TRANSACTION (1.3ms) commit transaction
...
After
Thanks to this PR that updates
how through_reflection
distributes transactions across database connections,
we can now perform the same operation without the need for a double transaction block.
irb(main):001:0> company.freelancers = [freelancer1, freelancer2]
TRANSACTION (0.1ms) begin transaction
Contract Create (0.7ms) INSERT INTO "contract" ("company_id", "freelancer_id", "created_at", "updated_at") VALUES (?, ?, ?, ?) [["company_id", 1], ["freelancer_id", 1], ["created_at", "2022-08-21 07:32:41.351072"], ["updated_at", "2022-08-21 07:32:41.351072"]]
Contract Create (0.1ms) INSERT INTO "contract" ("company_id", "freelancer_id", "created_at", "updated_at") VALUES (?, ?, ?, ?) [["company_id", 2], ["freelancer_id", 1], ["created_at", "2022-08-21 07:32:41.351083"], ["updated_at", "2022-08-21 07:32:41.351083"]]
TRANSACTION (1.3ms) commit transaction
...
This was achieved via a very simple update to the ThroughAssociation module.
The transaction
method opens up the second block with the connection
of the through_reflection
instead of the source_reflection
.
module ActiveRecord
module Associations
module ThroughAssociation
...
private
def transaction(&block)
through_reflection.klass.transaction(&block)
end
...
end
end
end