In our opinion, this PR provides one of the simplest yet most performant speed improvements to Rails 7. Ruby recently introduced an inline cache for class variable reads. This allows for values to be read from the cache instead of traversing a complex inheritance tree to resolve a class variable value. When class variables are read, Ruby needs to check each class in the inheritance tree to ensure that the class variable isn’t set on any other classes in the tree.
As you can imagine this becomes an O(n) problem. As the number of nodes increases in a tree, the read performance degrades linearly.
Let’s look at a demonstration using a class that inherits 1 module, 30 modules and finally 100 modules.
require "benchmark/ips"
MODULES = ["B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "AA", "BB", "CC", "DD", "EE", "FF", "GG", "HH", "II", "JJ", "KK", "LL", "MM", "NN", "OO", "PP", "QQ", "RR", "SS", "TT", "UU", "VV", "WW", "XX", "YY", "ZZ", "AAA", "BBB", "CCC", "DDD", "EEE", "FFF", "GGG", "HHH", "III", "JJJ", "KKK", "LLL", "MMM", "NNN", "OOO", "PPP", "QQQ", "RRR", "SSS", "TTT", "UUU", "VVV", "WWW", "XXX", "YYY", "ZZZ", "AAAA", "BBBB", "CCCC", "DDDD", "EEEE", "FFFF", "GGGG", "HHHH", "IIII", "JJJJ", "KKKK", "LLLL", "MMMM", "NNNN", "OOOO", "PPPP", "QQQQ", "RRRR", "SSSS", "TTTT", "UUUU", "VVVV", "WWWW"]
class A
  @@foo = 1
  def self.foo
    @@foo
  end
  eval <<-EOM
    module #{MODULES.first}
    end
    include #{MODULES.first}
  EOM
end
class Athirty
  @@foo = 1
  def self.foo
    @@foo
  end
  MODULES.take(30).each do |module_name|
    eval <<-EOM
      module #{module_name}
      end
      include #{module_name}
    EOM
  end
end
class Ahundred
  @@foo = 1
  def self.foo
    @@foo
  end
  MODULES.each do |module_name|
    eval <<-EOM
      module #{module_name}
      end
      include #{module_name}
    EOM
  end
end
Benchmark.ips do |x|
  x.report "1 module" do
    A.foo
  end
  x.report "30 modules" do
    Athirty.foo
  end
  x.report "100 modules" do
    Ahundred.foo
  end
  x.compare!
endThis is how Ruby performs without cache:
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s
Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slowerNow, let’s look at the performance with the cache.
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s
Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within errorOn Ruby master, including 100 modules is 8.5x slower than including 1 module. However, with the cache, there is no performance difference between including 1 module and including 100 modules!
Now let’s look at how the Rails core team bought this performance improvement to Rails.
Before
ActiveRecord::Base.logger is a cvar that has 63 modules in the inheritance tree.
We can check it out ourselves by calling:
ActiveRecord::Base.ancestors.size
# => 62Opening up ActiveRecord core,
this is how logger is defined.
mattr_accessor :logger, instance_writer: falseSince it is defined as a mattr_accessor and not a class variable,
it is unable to utilize the performance improvements introduced by the latest Ruby.
After
This
PR to Rails,
changes how logger is defined.
class_attribute :logger, instance_writer: falseNow let’s compare the performance improvement!
Calculating -------------------------------------
              logger      1.700M (± 0.9%) i/s -      8.667M in   5.097595s
             clogger     11.556M (± 0.9%) i/s -     58.806M in   5.089282s
Comparison:
             clogger: 11555754.2 i/s
              logger:  1700280.4 i/s - 6.80x  (± 0.00) slowerNow that’s a huge spike in improvement – nearly 7x! This is a great example of a real-world improvement to Rails applications.
Gotchas
There a just a few gotchas with this change.
Since ActiveRecord::Base.logger is now a class_attribute,
we can no longer access it directly through @@logger,
nor can setting logger = on a subclass change the parent’s logger anymore.
However,
this is a very small inconvenience as most use logger as just a method –
but we can never be careful enough!
Now, time to comb through your codebases and see how you can utilize this performance improvement!
