Rails supports sharding as a way to connect to multiple databases, which is considerably more involved than plain replication. Each shard can be either a vertical or a horizontal slice of the application's data.
With vertical sharding, different tables live in different databases, so the application can read and write tables beyond those in the primary database. With horizontal sharding, the rows of the same tables are split across databases according to a resolver rule (for example, id greater than x, or domain equal to x).
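To make the idea of a resolver rule concrete, here is a minimal plain-Ruby sketch. The method name `shard_for`, the cutoff value, and the domain are all hypothetical and chosen only for illustration; the shard names match the ones used later in this post.

```ruby
# Hypothetical routing rule: choose a shard name from a record's attributes.
# SHARD_ID_CUTOFF and SHARDED_DOMAIN are made-up values for this sketch.
SHARD_ID_CUTOFF = 1_000_000
SHARDED_DOMAIN = "example.org"

def shard_for(id: nil, domain: nil)
  # "domain equal to x" rule
  return :shard_two if domain == SHARDED_DOMAIN
  # "id greater than x" rule
  return :shard_two if id && id > SHARD_ID_CUTOFF
  # Everything else stays on the default shard.
  :default
end
```

A rule like this only decides which shard a row belongs to; switching the connection is a separate step.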
Here’s how the database is configured:
default: &default
  adapter: postgresql
  encoding: unicode
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
development:
  primary:
    <<: *default
    database: primary_database
  primary_replica:
    <<: *default
    database: primary_database
    replica: true
  primary_shard_one:
    <<: *default
    database: primary_shard_one
    migrations_paths: db/primary_shard_one_migrate
  primary_shard_one_replica:
    <<: *default
    database: primary_shard_one
    replica: true
  primary_shard_two:
    <<: *default
    database: primary_shard_two
  primary_shard_two_replica:
    <<: *default
    database: primary_shard_two
    replica: true
We will use an example of horizontal sharding, configured in ApplicationRecord:
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
  connects_to shards: {
    default: { writing: :primary, reading: :primary_replica },
    shard_two: { writing: :primary_shard_two, reading: :primary_shard_two_replica }
  }
end
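One way to read the `connects_to` mapping is as a two-level lookup: a shard name and a role together select one entry from the database configuration. The plain-Ruby sketch below copies the hash from ApplicationRecord; `SHARD_MAP` and `config_name_for` are hypothetical names used only to illustrate the lookup and are not part of Rails.

```ruby
# Plain-Ruby copy of the connects_to mapping shown above.
SHARD_MAP = {
  default:   { writing: :primary,           reading: :primary_replica },
  shard_two: { writing: :primary_shard_two, reading: :primary_shard_two_replica }
}.freeze

# Hypothetical helper: which database configuration serves this shard and role?
# Hash#fetch raises KeyError for an unknown shard or role.
def config_name_for(shard:, role:)
  SHARD_MAP.fetch(shard).fetch(role)
end
```

Under this reading, a block run with role `:reading` on shard `:default` should be served by the `primary_replica` configuration.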
Rails allows for both manual and automatic shard switching in both vertical and horizontal sharding.
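For automatic switching, recent Rails versions let an application supply a resolver through `config.active_record.shard_resolver`, alongside `config.active_record.shard_selector`; the resolver receives the request and returns a shard name. The sketch below shows only the shape of such a resolver in plain Ruby. `FakeRequest`, `HOST_SHARDS`, and the host names are hypothetical stand-ins, and a real application would more likely look the shard up in a tenants table.

```ruby
# Stand-in for a Rails request object; only #host is used in this sketch.
FakeRequest = Struct.new(:host)

# Hypothetical host-to-shard table for illustration.
HOST_SHARDS = { "two.example.com" => :shard_two }.freeze

# Resolver: request in, shard name out. Unknown hosts fall back to :default.
SHARD_RESOLVER = ->(request) { HOST_SHARDS.fetch(request.host, :default) }

# Example call with a fake request.
SHARD_RESOLVER.call(FakeRequest.new("two.example.com"))
```

Because the resolver is just a callable, it can be exercised in isolation before it is wired into the application configuration.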
Each ActiveRecord connection (checked out per thread) carries information about the database it is currently accessing. That information is available via:
> ActiveRecord::Base.connection_db_config
=> #<ActiveRecord::DatabaseConfigurations::HashConfig:0x00007fc1831657f0 @env_name="development", @name="primary", @configuration_hash={:adapter=>"postgresql", :encoding=>"unicode", :pool=>5, :database=>"primary_database"}>
Before
ActiveRecord provides connected_to to easily swap out shards.
ActiveRecord::Base.connected_to(role: :reading, shard: :default) do
  puts ActiveRecord::Base.connection_db_config.name
  Blog.count
end
primary_replica
Blog Count (3.3ms) SELECT COUNT(*) FROM "blogs"
=> 3
Let’s swap to shard_two.
ActiveRecord::Base.connected_to(role: :reading, shard: :shard_two) do
  puts ActiveRecord::Base.connection_db_config.name
  Blog.count
end
primary_shard_two_replica
Blog Count (3.3ms) SELECT COUNT(*) FROM "blogs"
=> 0
Zero blogs since this shard has not been written to yet!
After
Using ActiveRecord::Base.prohibit_shard_swapping, we can prevent attempts to change the shard within a block. When a single shard is used for the lifecycle of an entire request, it’s often desirable to ensure that the shard is not unintentionally changed. This option is also thread-safe.
Let’s see this in action!
ActiveRecord::Base.connected_to(role: :reading, shard: :shard_two) do
  puts ActiveRecord::Base.connection_db_config.name
  puts Blog.count
  ActiveRecord::Base.prohibit_shard_swapping do
    ActiveRecord::Base.connected_to(role: :reading, shard: :default) do
      puts ActiveRecord::Base.connection_db_config.name
      puts Blog.count
    end
  end
end
primary_shard_two_replica
Blog Count (1.5ms) SELECT COUNT(*) FROM "blogs"
0
Traceback (most recent call last):
3: from (irb):84
2: from (irb):88:in `block in irb_binding'
1: from (irb):89:in `block (2 levels) in irb_binding'
ArgumentError (cannot swap `shard` while shard swapping is prohibited.)
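To build intuition for the behaviour above, here is a toy model in plain Ruby. It is not Rails' implementation: `ToyShards` and its methods are hypothetical, and the sketch assumes the guard is a per-thread flag that a `connected_to`-style method checks before switching. It also assumes that any shard switch is rejected while the flag is set.

```ruby
# Toy model of shard switching with a "prohibit swapping" guard.
# ToyShards is hypothetical and only illustrates the idea.
module ToyShards
  # Shard for the calling thread; :default when nothing has been set.
  def self.current_shard
    Thread.current[:toy_shard] || :default
  end

  def self.swapping_prohibited?
    Thread.current[:toy_prohibit_swap] == true
  end

  # Run the block on the given shard, then restore the previous shard.
  def self.connected_to(shard:)
    if swapping_prohibited?
      raise ArgumentError, "cannot swap `shard` while shard swapping is prohibited."
    end
    previous = current_shard
    Thread.current[:toy_shard] = shard
    begin
      yield
    ensure
      Thread.current[:toy_shard] = previous
    end
  end

  # Set the guard for the duration of the block, then restore its old value.
  def self.prohibit_shard_swapping
    previous = Thread.current[:toy_prohibit_swap]
    Thread.current[:toy_prohibit_swap] = true
    begin
      yield
    ensure
      Thread.current[:toy_prohibit_swap] = previous
    end
  end
end
```

Because the flag lives in `Thread.current`, one thread prohibiting swaps does not affect another in this sketch, which mirrors the thread-safety mentioned earlier. The `ensure` blocks restore the previous shard and flag even when the block raises.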