More everyday performance rules for Ruby on Rails developers
A compilation of good practices to write optimized Ruby code that you can apply to any Rails application.

Previously, I wrote Everyday Performance Rules for Ruby on Rails Developers. Here is another round of good practices that I hope will help you speed up your application as well.
Rendering a collection is faster than calling a partial in a loop
<%# Slower %>
<% records.each do |record| %>
<%= render "partial", record: record %>
<% end %>
<%# Faster %>
<%= render partial: "partial", collection: records, as: :record %>
Because render "partial"
looks up the partial N times and generates N instrumentations,
whereas rendering a collection looks up the partial only once and generates a single instrumentation.
As always, less is faster. Here is a benchmark to demonstrate that rendering a collection is a no-brainer when calling the same partial multiple times consecutively.
<%# index.html.erb %>
<% Benchmark.ips do |x| %>
<% collection = 100.times.to_a %>
<% x.report "loop" do %>
<% collection.each { |i| %><%= render "empty_partial", item: i %><% } %>
<% end %>
<% x.report "collection" do %>
<%= render partial: "empty_partial", collection: collection, as: :item %>
<% end %>
<%= x.compare!(order: :baseline) %>
<% end %>
Before showing the result, I must give some disclaimers.
This is a microbenchmark, mostly measuring ActionView, since the partial is empty.
In a real application, the partial could trigger IOs or heavy computations, which are much slower than an ActionView lookup.
The collection here has 100 items, but changing that number has an impact on the result (for reference, 25 is the default page size of most paginations).
Finally, the benchmark is run with RAILS_ENV=production
to achieve a less verbose output and reduce logging.
RAILS_ENV=production rails server
rails 8.0.2
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
loop 272.000 i/100ms
collection 739.000 i/100ms
Calculating -------------------------------------
loop 2.758k (± 3.0%) i/s (362.64 μs/i) - 13.872k in 5.035324s
collection 5.551k (±15.3%) i/s (180.16 μs/i) - 27.343k in 5.027300s
Comparison:
loop: 2757.5 i/s
collection: 5550.7 i/s - 2.01x faster
Again, this is a microbenchmark, so rendering a collection won’t speed up your application by a factor of 2.
However, each time a partial is rendered inside a loop, it’s low-hanging fruit to call render
with the collection option.
Note that there is also ActiveSupport::Benchmarkable,
which is convenient for timing any part of a view.
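For instance, the benchmark helper it provides can wrap any part of a template. A minimal sketch, where the label and the wrapped partial are arbitrary:
<% benchmark "Render sidebar" do %>
<%= render "sidebar" %>
<% end %>
The elapsed time is then written to the log under that label.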
Caching collections is also faster than loops
Render has an option to enable collection caching, which is much faster compared to rendering cached partials in a loop.
<%# Slower %>
<% records.each do |record| %>
<%= render "partial", record: record %>
<% end %>
<%# Faster %>
<%= render partial: "partial", collection: records, as: :record, cached: true %>
<%# _partial.html.erb %>
<% cache record do %>
<%# ... %>
<% end %>
Indeed, all fragments are retrieved in a single call thanks to read_multi.
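Conceptually, it boils down to something like this (a simplified sketch, not Rails’ exact internals, which also mix the template digest into the keys):
# One cache round-trip for the whole collection instead of one read per partial
keys = records.map(&:cache_key_with_version)
fragments = Rails.cache.read_multi(*keys)
Let’s run a benchmark to verify: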
<% Benchmark.ips do |x| %>
<% collection = 25.times.to_a %>
<% x.report "loop cached partial" do %>
<% collection.each do |i| %>
<%= render "cached_sleep_100ms", item: i %>
<% end %>
<% end %>
<% x.report "collection cached" do %>
<%= render partial: "cached_sleep_100ms", collection: collection, as: :item, cached: true %>
<% end %>
<%= x.compare!(order: :baseline) %>
<% end %>
<%# _cached_sleep_100ms.html.erb %>
<% cache item do %>
<%= sleep 0.1 %>
<% end %>
RAILS_ENV=production rails server
rails 8.0.2
config.cache_store = :mem_cache_store
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
loop cached partial 189.000 i/100ms
collection cached 295.000 i/100ms
Calculating -------------------------------------
loop cached partial 1.525k (± 1.1%) i/s (655.73 μs/i) - 7.749k in 5.081897s
collection cached 2.575k (± 1.9%) i/s (388.35 μs/i) - 12.980k in 5.042672s
Comparison:
loop cached partial: 1525.0 i/s
collection cached: 2575.0 i/s - 1.69x faster
The result shows that collection caching is 69% faster. Be aware that this is in the specific case of a 100% cache hit ratio, since all cache entries have been written during the warmup. If you switch to Solid Cache and run the same benchmark, the result is even more impressive: the loop is 90% slower.
Small partials are slower than you expect
Splitting views into many small partials is beneficial for modularity and reuse, but it comes with a performance penalty. Indeed, the lookup for the template is not trivial, because it has to resolve relative and absolute paths across many view paths. I suggest avoiding small partials that are called often. However, if you want modularity and reuse, you can use helpers or view components instead. Let’s run a benchmark.
# app/components/avatar_component.rb
class AvatarComponent < ViewComponent::Base
  erb_template %Q(<img src="<%= @url %>" alt="<%= @name %>'s avatar"/>)

  def initialize(url:, name:)
    @url, @name = url, name
  end
end
<%# index.html.erb %>
<%
# Helpers defined here for convenience
def avatar_tag(url, name)
tag.img(src: url, alt: "#{name}'s avatar")
end
def avatar_concat(url, name)
%Q(<img src="#{h(url)}" alt="#{h(name)}'s avatar"/>).html_safe
end
%>
<%= Benchmark.ips do |x| %>
<% url, name = "data:,", "Name" %>
<% x.report "partial" do %>
<%= render "avatar", url: url, name: name %>
<% end %>
<% x.report "helper tag" do %>
<%= avatar_tag url, name %>
<% end %>
<% x.report "helper concat" do %>
<%= avatar_concat url, name %>
<% end %>
<% x.report "component" do %>
<%= render AvatarComponent.new(url: url, name: name) %>
<% end %>
<%= x.compare!(order: :baseline) %>
<% end %>
The benchmark is run with RAILS_ENV=production
to get a less verbose output and reduce logging.
RAILS_ENV=production rails server
rails 8.0.2
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
partial 24.137k i/100ms
helper tag 45.905k i/100ms
helper concat 52.996k i/100ms
component 33.276k i/100ms
Calculating -------------------------------------
partial 176.451k (± 3.3%) i/s (5.67 μs/i) - 893.069k in 5.067688s
helper tag 344.303k (± 1.0%) i/s (2.90 μs/i) - 1.744M in 5.066987s
helper concat 530.749k (± 4.1%) i/s (1.88 μs/i) - 2.650M in 5.003223s
component 340.409k (± 2.7%) i/s (2.94 μs/i) - 1.730M in 5.087353s
Comparison:
partial: 176450.9 i/s
helper concat: 530748.8 i/s - 3.01x faster
helper tag: 344303.1 i/s - 1.95x faster
component: 340408.9 i/s - 1.93x faster
The interpolation is by far the fastest, but it is also the ugliest. Furthermore, it’s less secure, since it’s easy to forget to escape a string to prevent XSS attacks. For basic partials called very often, it sounds reasonable to trade less attractive code for some precious milliseconds. However, for more complex partials that are called frequently and contain branching, switching to View Component appears to be a good option, as it’s significantly faster than a partial invocation. Nevertheless, I’m not saying to switch all partials to interpolations or View Component. For a partial called rarely or once per request, or doing any IO, the lookup time is probably insignificant.
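In a real application, such a helper would live in a helper module rather than being defined inside the template. A sketch, with a hypothetical AvatarsHelper module:
# app/helpers/avatars_helper.rb
module AvatarsHelper
  # Same markup as the partial, without the template lookup cost
  def avatar_tag(url, name)
    tag.img(src: url, alt: "#{name}'s avatar")
  end
end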
Maybe later is faster than surely now
If an area of the page is unlikely to be seen by the visitor, it is probably best not to display it immediately. A typical example is an article that is longer than the screen’s height, with comments shown below. Only a few visitors will reach the comment section. So, rendering comments each time is a waste.
The idea is to replace a placeholder only when it becomes visible, thanks to IntersectionObserver. It’s different from basic lazy loading. Here, if the placeholder is not reached, there are ZERO requests. It’s a 100% optimization.
You can achieve that with the following JavaScript.
var placeholder = document.getElementById("placeholder")
new IntersectionObserver(function(entries, observer) {
  if (entries[0].intersectionRatio > 0) {
    // The observer is passed to the callback; stop observing once visible
    observer.unobserve(placeholder)
    // Load content and replace placeholder
  }
}).observe(placeholder)
With Turbo, you can do it easily with lazy frames:
<turbo-frame id="comments" src="/articles/123/comments" loading="lazy">
Comments placeholder
</turbo-frame>
However, by handling IntersectionObserver directly,
you can use the rootMargin
option to trigger loading earlier, so visitors won’t see the placeholder.
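For example (a sketch; the 200px margin is an arbitrary value to tune):
// Start loading 200px before the placeholder enters the viewport
new IntersectionObserver(function(entries, observer) {
  if (entries[0].intersectionRatio > 0) {
    observer.unobserve(placeholder)
    // Load content and replace placeholder
  }
}, { rootMargin: "200px" }).observe(placeholder)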
Create lots of records quickly
For each Model#create
call, there are three round-trips to the database to execute queries:
BEGIN
INSERT INTO ...
COMMIT
The order of magnitude of the network latency is approximately 10 microseconds on the same host and 100 microseconds within the same data center. When creating records in bulk, it is not optimal to wait 3 times for each record.
ActiveRecord provides the method insert_all
, which creates N records in a single query.
However, keep in mind that there are neither validations, callbacks, nor instantiations.
If the model has business logic in callbacks or validations, that won’t help.
On the other hand, skipping them saves even more time and object allocations.
array_of_attributes = [
  {name: "Foo", email: "foo@email.example"},
  {name: "Bar", email: "bar@email.example"},
]
# Slower => 2 * (BEGIN + INSERT + COMMIT)
array_of_attributes.each { |attributes| User.create(attributes) }
# BEGIN
# INSERT INTO users (name, email) VALUES ('Foo', 'foo@email.example')
# COMMIT
# BEGIN
# INSERT INTO users (name, email) VALUES ('Bar', 'bar@email.example')
# COMMIT
# Faster => 1 INSERT
User.insert_all(array_of_attributes)
# INSERT INTO users (name, email) VALUES
# ('Foo', 'foo@email.example'),
# ('Bar', 'bar@email.example')
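Note that with very large batches, a single INSERT may hit query-size limits on some databases. Slicing the attributes keeps each query reasonable (a sketch; the 1,000 batch size is an arbitrary value):
# One INSERT per 1,000 records instead of one gigantic query
array_of_attributes.each_slice(1_000) do |batch|
  User.insert_all(batch)
end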
Importing a lot of records and their associations quickly
ActiveRecord’s method insert_all
is fast but limited when dealing with associations.
Fortunately, the gem activerecord-import
has come to the rescue.
It triggers a single query per table to avoid round-trips with the database, making it significantly faster than a loop that calls create on each iteration.
It comes with many options (validations, batch size, all-or-nothing, …), and this code example is just one usage.
data = [
  {
    user: {email: "user@mail.test", name: "User"},
    ideas: [
      {title: "Foo", description: "Lorem ipsum"},
      {title: "Bar", description: "Lorem ipsum"},
    ]
  },
  # Many more entries ...
]

users = data.map do |params|
  user = User.new(params[:user])
  user.ideas = params[:ideas].map { |attrs| Idea.new(attrs) }
  user
end

User.import(users, recursive: true)
# Triggers 2 queries only:
# 1. INSERT INTO users (email, name) VALUES (...), (...), (...), ...
# 2. INSERT INTO ideas (title, description) VALUES (...), (...), (...), ...
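For instance, validations and batching can be enabled explicitly (a sketch showing two of the options mentioned above):
# Run ActiveRecord validations and insert in chunks of 1,000 rows per query
User.import(users, validate: true, batch_size: 1_000)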
For more examples, I encourage you to read the README of the Git repository.
Stop waiting for Redis responses with pipelining
Redis is fast, but each command still requires waiting for a network round-trip. Pipelining sends multiple commands without waiting for each one individually. Instead of N round-trips, there is only one, so the code spends less time idle. Of course, that’s not possible when you need the result of the previous command to build the next one.
Here is an example where we need to store some analytics on each request:
# Store analytics on each request into Redis with the minimum extra cost
# by sending all commands in a single round-trip.
class ApplicationController < ActionController::Base
  before_action :collect_analytics

  private

  # Assumes the redis-rb gem; adapt to however your app configures its client.
  def redis
    @redis ||= Redis.new
  end

  def collect_analytics
    # Send 5 commands in a single round-trip
    redis.pipelined do |pipe|
      pipe.hincrby("pages", request.path, 1)
      pipe.hincrby("referrers", request.referrer, 1)
      pipe.hincrby("actions", "#{controller_name}##{action_name}", 1)
      pipe.hincrby("ips", request.remote_ip, 1)
      pipe.hincrby("users", current_user.id, 1) if current_user
    end
  end
end
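Note that inside the block each command returns a future rather than a value; pipelined itself returns the array of replies once the round-trip completes. A short sketch:
counts = redis.pipelined do |pipe|
  pipe.hincrby("pages", "/", 1)
  pipe.hincrby("ips", "127.0.0.1", 1)
end
# counts is an array with the new value of each counter, e.g. [12, 3]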
For more details, I encourage you to read the pipelining section of the README of redis-rb. And of course, the pipelining documentation of Redis.
Enqueuing a lot of jobs really fast
Enqueuing via ActiveJob::Base#perform_later
triggers callbacks and generates one round-trip for each job.
However, perform_all_later
skips callbacks and enqueues all jobs in a single step.
require "benchmark"
sections = Section.limit(1_000).load
Benchmark.measure do
  sections.map { |section| NormalizeDataJob.perform_later(section) }
end # @real=0.18012950796401128

Benchmark.measure do
  jobs = sections.map { |section| NormalizeDataJob.new(section) }
  ActiveJob.perform_all_later(jobs)
end # @real=0.05221099697519094
# Sidekiq with Redis running locally and config.log_level = :error
Enqueuing 1000 jobs is 3.45 times faster. The result may vary according to Active Job’s adapter.
Moreover, bulk enqueuing is not supported by all job backends, but if you’re using Solid Queue, GoodJob, or Sidekiq, you’re good. Finally, bulk enqueuing is only available from Rails 7.1 onward.
Enqueuing a lot more jobs
The previous example works fine with a few thousand jobs, but it would probably eat too much memory if you loaded many more ActiveRecord instances. The trick is to load only the IDs, as long as you do not need to access any attributes or methods.
require "benchmark"
ids = Section.limit(100_000).ids
Benchmark.measure do
  ids.map { |id| NormalizeDataJob.perform_later(id) }
end # @real=17.353170770045836

Benchmark.measure do
  jobs = ids.map { |id| NormalizeDataJob.new(id) }
  ActiveJob.perform_all_later(jobs) # @real=5.2103117069927976
end
# Sidekiq with Redis running locally and config.log_level = :error
Only 5 seconds are required to enqueue 100,000 jobs, which is 3.33 times faster.
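Beyond that scale, you can also bound memory by enqueuing batch by batch. A sketch combining in_batches with bulk enqueuing (NormalizeDataJob comes from the example above; the 10,000 batch size is arbitrary):
Section.in_batches(of: 10_000) do |batch|
  # Each iteration loads only 10,000 IDs
  jobs = batch.ids.map { |id| NormalizeDataJob.new(id) }
  ActiveJob.perform_all_later(jobs)
end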
Avoid returning records or relations from a model’s method
Because a SQL query is triggered on each call.
Memoizing the result is not necessarily better, since the memoized value won’t be invalidated when reload
is called.
Instead, switch that method to an association.
# Avoid returning records or relations from a model's method
class User < ApplicationRecord
  has_many :subscriptions

  # Bad because a SQL query is triggered on each call 👎
  def active_subscription
    subscriptions.active.first
  end
end

user = User.first
user.active_subscription # SELECT * FROM subscriptions ...
user.active_subscription # SELECT * FROM subscriptions ... ☹️
user.active_subscription # SELECT * FROM subscriptions ... ☹️

# Memoizing the result is not necessarily better since the memoized value
# won't be invalidated when reload is called. Instead, switch to an association:
class User < ApplicationRecord
  has_many :subscriptions

  # Good because the association is cached until reload is called 👍
  # (assumes Subscription defines an `active` scope)
  has_one :active_subscription, -> { active }, class_name: "Subscription"
end

user = User.first
user.active_subscription # SELECT * FROM subscriptions ...
user.active_subscription # Cached, no queries 🙂
user.active_subscription # Cached, no queries 🙂
user.reload
user.active_subscription # SELECT * FROM subscriptions ... 🙂
Array#find does not scale
Because scanning sequentially becomes linearly slower as the array grows, whereas looking up data in a sorted array is extremely fast!
Indeed, sorted data allows binary searching: at each step, the remaining range is split in half, then in half again, and so on, instead of scanning all entries sequentially. The complexity drops from O(n) to O(log(n)).
For example, finding a value among 1 million sorted entries takes at most about 20 comparisons (2^20 ≈ 1,000,000), whereas a sequential scan may need up to 1 million. It’s much faster. It’s like an index. And it’s possible only because the data is sorted.
We are lucky because Ruby provides Array#bsearch
.
Let’s compare it with Array#find
against 1 million entries from 1 to 1,000,000.
123,456 is at the beginning of the array, and 654,321 is a bit after the middle. To be fair, we will look for both values and see the difference at the end.
require "benchmark/ips"

array = (1..1_000_000).to_a # [1, 2, 3, ..., 1000000]

Benchmark.ips do |x|
  x.config(time: 1, warmup: 1)

  x.report("Array#bsearch(123456)") do
    array.bsearch { |n| 123456 <=> n }
  end
  x.report("Array#bsearch(654321)") do
    array.bsearch { |n| 654321 <=> n }
  end
  x.report("Array#find(123456)") do
    array.find { |n| 123456 == n }
  end
  x.report("Array#find(654321)") do
    array.find { |n| 654321 == n }
  end

  x.compare!
  # Comparison:
  #   Array#bsearch(123456): 1664871.8 i/s
  #   Array#bsearch(654321): 1459509.3 i/s - 1.14x slower
  #   Array#find(123456):    275.6 i/s - 6041.51x slower
  #   Array#find(654321):    51.5 i/s - 32320.07x slower
end
Bsearch is always a lot faster. It’s impossible to give an exact figure, as it depends on the size of the array and the location of the data. Do not forget that the array must be sorted, and the block’s condition must be consistent with that order.
Moreover, the execution time of bsearch does not vary much,
whereas Array#find(654321)
is a lot slower than Array#find(123456)
because the value is located further into the array.
That is why Array#find
does not scale.
Looking through an array of numbers is not a real use case. Let’s see how we can create an index for real data.
# Let's find all countries with a population between 100 million and 1 billion.
# We will fetch data from the World Bank's API and load the JSON into Country objects.
# Then the countries are sorted by population, and finally the range is found with bsearch_index.
require "json"
require "net/http"
Country = Struct.new(:name, :population)
# Download countries with population
url = "http://api.worldbank.org/v2/country/all/indicator/SP.POP.TOTL?format=json&date=2024&per_page=300"
File.write("population.json", Net::HTTP.get(URI(url))) if !File.exist?("population.json")
# Load JSON into Country objects
data = JSON.parse(File.read("population.json"))[1]
data = data[49..-1] # Skip regions to keep countries only
# Keep only entries with population data, since the API may return nil values
countries = data.filter_map { |hash| Country.new(hash["country"]["value"], hash["value"]) if hash["value"] }
# Countries must be sorted before using bsearch_index
countries_per_population = countries.sort_by(&:population)
# Find lower and upper indexes of countries between 100 million and 1 billion population
from = countries_per_population.bsearch_index { |country| country.population >= 100_000_000 }
to = countries_per_population.bsearch_index { |country| country.population >= 1_000_000_000 }
# The upper index must be excluded
countries_per_population[from...to].each { |c| puts "#{c.name}: #{c.population}" }
Now you’ve got a way to find data super fast. Note that, as with a database index, it’s relevant only if you have more reads than writes, because of the upfront cost of sorting. You can go further by reading Ruby’s Binary Searching documentation.
Conclusion
We went through a bunch of optimization good practices that cover the entire spectrum of a Rails application, from the frontend to the database, without forgetting views:
- Creating a lot of records quickly with insert_all and activerecord-import
- Enqueuing many jobs really fast with perform_all_later and IDs
- Decreasing round-trips thanks to Redis pipelining
- Faster views by rendering collections and avoiding small partials
- Loading content only when viewable thanks to IntersectionObserver or Turbo lazy frames
- Fast array search with bsearch