More everyday performance rules for Ruby on Rails developers
A compilation of good practices to write optimized Ruby code that you can apply to any Rails application.

Previously, I wrote Everyday Performance Rules for Ruby on Rails Developers. Here is another round of good practices that I hope will help you speed up your application as well.
Rendering a collection is faster than calling a partial in a loop
<%# Slower %>
<% records.each do |record| %>
<%= render "partial", record: record %>
<% end %>
<%# Faster %>
<%= render partial: "partial", collection: records, as: :record %>
Because render "partial"
looks up the partial N times and generates N instrumentations,
whereas rendering a collection looks up the partial only once and generates a single instrumentation.
As always, less is faster. Here is a benchmark to demonstrate that rendering a collection is a no-brainer when calling the same partial multiple times consecutively.
<%# index.html.erb %>
<% Benchmark.ips do |x| %>
<% collection = 100.times.to_a %>
<% x.report "loop" do %>
<% collection.each { |i| %><%= render "empty_partial", item: i %><% } %>
<% end %>
<% x.report "collection" do %>
<%= render partial: "empty_partial", collection: collection, as: :item %>
<% end %>
<%= x.compare!(order: :baseline) %>
<% end %>
Before showing the result, I must give some disclaimers.
This is a microbenchmark, mostly measuring ActionView, since the partial is empty.
In a real application, the partial could trigger IOs or heavy computations, which are much slower than an ActionView lookup.
The collection here has 100 items, but changing that number has an impact on the result (for reference, 25 is the default page size of most paginations).
Finally, the benchmark is run with RAILS_ENV=production
to achieve a less verbose output and reduce logging.
RAILS_ENV=production rails server
rails 8.0.2
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
loop 272.000 i/100ms
collection 739.000 i/100ms
Calculating -------------------------------------
loop 2.758k (± 3.0%) i/s (362.64 μs/i) - 13.872k in 5.035324s
collection 5.551k (±15.3%) i/s (180.16 μs/i) - 27.343k in 5.027300s
Comparison:
loop: 2757.5 i/s
collection: 5550.7 i/s - 2.01x faster
Again, this is a microbenchmark, so rendering a collection won’t speed up your application by a factor of 2.
However, each time a partial is rendered inside a loop, it’s low-hanging fruit to call render
with the collection option.
Note that there is also ActiveSupport::Benchmarkable,
which is convenient for timing any part of a view.
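For instance, the benchmark helper it provides can wrap any part of a template. A minimal sketch, where the label and the wrapped partial are arbitrary:
<% benchmark "Render sidebar" do %>
<%= render "sidebar" %>
<% end %>
The elapsed time is then written to the log under that label.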
Caching collections is also faster than loops
Render has an option to enable collection caching, which is much faster compared to rendering cached partials in a loop.
<%# Slower %>
<% records.each do |record| %>
<%= render "partial", record: record %>
<% end %>
<%# Faster %>
<%= render partial: "partial", collection: records, as: :record, cached: true %>
<%# _partial.html.erb %>
<% cache record do %>
<%# ... %>
<% end %>
Indeed, all fragments are retrieved in a single call thanks to read_multi.
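Conceptually, it boils down to something like this (a simplified sketch, not Rails’ exact internals, which also mix the template digest into the keys):
# One cache round-trip for the whole collection instead of one read per partial
keys = records.map(&:cache_key_with_version)
fragments = Rails.cache.read_multi(*keys)
Let’s run a benchmark to verify: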
<% Benchmark.ips do |x| %>
<% collection = 25.times.to_a %>
<% x.report "loop cached partial" do %>
<% collection.each do |i| %>
<%= render "cached_sleep_100ms", item: i %>
<% end %>
<% end %>
<% x.report "collection cached" do %>
<%= render partial: "cached_sleep_100ms", collection: collection, as: :item, cached: true %>
<% end %>
<%= x.compare!(order: :baseline) %>
<% end %>
<%# _cached_sleep_100ms.html.erb %>
<% cache item do %>
<%= sleep 0.1 %>
<% end %>
RAILS_ENV=production rails server
rails 8.0.2
config.cache_store = :mem_cache_store
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
loop cached partial 189.000 i/100ms
collection cached 295.000 i/100ms
Calculating -------------------------------------
loop cached partial 1.525k (± 1.1%) i/s (655.73 μs/i) - 7.749k in 5.081897s
collection cached 2.575k (± 1.9%) i/s (388.35 μs/i) - 12.980k in 5.042672s
Comparison:
loop cached partial: 1525.0 i/s
collection cached: 2575.0 i/s - 1.69x faster
The result shows that collection caching is 69% faster. Be aware that this is in the specific case of a 100% cache hit ratio, since all cache entries have been written during the warmup. If you switch to Solid Cache and run the same benchmark, the result is even more impressive: the loop is 90% slower.
Small partials are slower than you expect
Splitting views into many small partials is beneficial for modularity and reuse, but it comes with a performance penalty. Indeed, the lookup for the template is not trivial, because it has to resolve relative and absolute paths across many view paths. I suggest avoiding small partials that are called often. However, if you want modularity and reuse, you can use helpers or view components instead. Let’s run a benchmark.
# app/components/avatar_component.rb
class AvatarComponent < ViewComponent::Base
  erb_template %Q(<img src="<%= @url %>" alt="<%= @name %>'s avatar"/>)

  def initialize(url:, name:)
    @url, @name = url, name
  end
end
<%# index.html.erb %>
<%
# Helpers defined here for convenience
def avatar_tag(url, name)
tag.img(src: url, alt: "#{name}'s avatar")
end
def avatar_concat(url, name)
%Q(<img src="#{h(url)}" alt="#{h(name)}'s avatar"/>).html_safe
end
%>
<%= Benchmark.ips do |x| %>
<% url, name = "data:,", "Name" %>
<% x.report "partial" do %>
<%= render "avatar", url: url, name: name %>
<% end %>
<% x.report "helper tag" do %>
<%= avatar_tag url, name %>
<% end %>
<% x.report "helper concat" do %>
<%= avatar_concat url, name %>
<% end %>
<% x.report "component" do %>
<%= render AvatarComponent.new(url: url, name: name) %>
<% end %>
<%= x.compare!(order: :baseline) %>
<% end %>
The benchmark is run with RAILS_ENV=production
to get a less verbose output and reduce logging.
RAILS_ENV=production rails server
rails 8.0.2
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
partial 24.137k i/100ms
helper tag 45.905k i/100ms
helper concat 52.996k i/100ms
component 33.276k i/100ms
Calculating -------------------------------------
partial 176.451k (± 3.3%) i/s (5.67 μs/i) - 893.069k in 5.067688s
helper tag 344.303k (± 1.0%) i/s (2.90 μs/i) - 1.744M in 5.066987s
helper concat 530.749k (± 4.1%) i/s (1.88 μs/i) - 2.650M in 5.003223s
component 340.409k (± 2.7%) i/s (2.94 μs/i) - 1.730M in 5.087353s
Comparison:
partial: 176450.9 i/s
helper concat: 530748.8 i/s - 3.01x faster
helper tag: 344303.1 i/s - 1.95x faster
component: 340408.9 i/s - 1.93x faster
The interpolation is by far the fastest, but it is also the ugliest. Furthermore, it’s less secure, since it’s easy to forget to escape a string to prevent XSS attacks. For basic partials called very often, it sounds reasonable to trade less attractive code for some precious milliseconds. However, for more complex partials that are called frequently and contain branching, switching to View Component appears to be a good option, as it’s significantly faster than a partial invocation. Nevertheless, I’m not saying to switch all partials to interpolations or View Component. For a partial called rarely or once per request, or doing any IO, the lookup time is probably insignificant.
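In a real application, such a helper would live in a helper module rather than being defined inside the template. A sketch, with a hypothetical AvatarsHelper module:
# app/helpers/avatars_helper.rb
module AvatarsHelper
  # Same markup as the partial, without the template lookup cost
  def avatar_tag(url, name)
    tag.img(src: url, alt: "#{name}'s avatar")
  end
end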
Maybe later is faster than surely now
If an area of the page is unlikely to be seen by the visitor, it is probably best not to display it immediately. A typical example is an article that is longer than the screen’s height, with comments shown below. Only a few visitors will reach the comment section. So, rendering comments each time is a waste.
The idea is to replace a placeholder only when it becomes visible, thanks to IntersectionObserver. It’s different from basic lazy loading. Here, if the placeholder is not reached, there are ZERO requests. It’s a 100% optimization.
You can achieve that with the following JavaScript.
var placeholder = document.getElementById("placeholder")
new IntersectionObserver(function(entries, observer) {
  if (entries[0].intersectionRatio > 0) {
    // The observer is passed to the callback; stop observing once visible
    observer.unobserve(placeholder)
    // Load content and replace placeholder
  }
}).observe(placeholder)
With Turbo, you can do it easily with lazy frames:
<turbo-frame id="comments" src="/articles/123/comments" loading="lazy">
Comments placeholder
</turbo-frame>
However, by handling IntersectionObserver directly,
you can use the rootMargin
option to trigger loading earlier, so visitors won’t see the placeholder.
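For example (a sketch; the 200px margin is an arbitrary value to tune):
// Start loading 200px before the placeholder enters the viewport
new IntersectionObserver(function(entries, observer) {
  if (entries[0].intersectionRatio > 0) {
    observer.unobserve(placeholder)
    // Load content and replace placeholder
  }
}, { rootMargin: "200px" }).observe(placeholder)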
Create lots of records quickly
For each Model#create
call, there are three round-trips to the database to execute queries:
BEGIN
INSERT INTO ...
COMMIT
The order of magnitude of the network latency is approximately 10 microseconds on the same host and 100 microseconds within the same data center. When creating records in bulk, it is not optimal to wait 3 times for each record.
ActiveRecord provides the method insert_all
, which creates N records in a single query.
However, keep in mind that there are neither validations, callbacks, nor instantiations.
If the model has business logic in callbacks or validations, that won’t help.
On the other hand, skipping them saves even more time and object allocations.
array_of_attributes = [
  {name: "Foo", email: "foo@email.example"},
  {name: "Bar", email: "bar@email.example"},
]
# Slower => 2 * (BEGIN + INSERT + COMMIT)
array_of_attributes.each { |attributes| User.create(attributes) }
# BEGIN
# INSERT INTO users (name, email) VALUES ('Foo', 'foo@email.example')
# COMMIT
# BEGIN
# INSERT INTO users (name, email) VALUES ('Bar', 'bar@email.example')
# COMMIT
# Faster => 1 INSERT
User.insert_all(array_of_attributes)
# INSERT INTO users (name, email) VALUES
# ('Foo', 'foo@email.example'),
# ('Bar', 'bar@email.example')
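Note that with very large batches, a single INSERT may hit query-size limits on some databases. Slicing the attributes keeps each query reasonable (a sketch; the 1,000 batch size is an arbitrary value):
# One INSERT per 1,000 records instead of one gigantic query
array_of_attributes.each_slice(1_000) do |batch|
  User.insert_all(batch)
end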
Importing a lot of records and their associations quickly
ActiveRecord’s method insert_all
is fast but limited when dealing with associations.
Fortunately, the gem activerecord-import
has come to the rescue.
It triggers a single query per table to avoid round-trips with the database, making it significantly faster than a loop that calls create on each iteration.
It comes with many options (validations, batch size, all-or-nothing, …), and this code example is just one usage.
data = [
  {
    user: {email: "user@mail.test", name: "User"},
    ideas: [
      {title: "Foo", description: "Lorem ipsum"},
      {title: "Bar", description: "Lorem ipsum"},
    ]
  },
  # Many more entries ...
]

users = data.map do |params|
  user = User.new(params[:user])
  user.ideas = params[:ideas].map { |attrs| Idea.new(attrs) }
  user
end

User.import(users, recursive: true)
# Triggers 2 queries only:
# 1. INSERT INTO users (email, name) VALUES (...), (...), (...), ...
# 2. INSERT INTO ideas (title, description) VALUES (...), (...), (...), ...
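For instance, validations and batching can be enabled explicitly (a sketch showing two of the options mentioned above):
# Run ActiveRecord validations and insert in chunks of 1,000 rows per query
User.import(users, validate: true, batch_size: 1_000)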
For more examples, I encourage you to read the README of the Git repository.
Stop waiting for Redis responses with pipelining
Redis is fast, but each command still requires waiting for a network round-trip. Pipelining sends multiple commands without waiting for each one individually. Instead of N round-trips, there is only one, so the code spends less time idle. Of course, that’s not possible when you need the result of the previous command to build the next one.
Here is an example where we need to store some analytics on each request:
# Store analytics on each request into Redis with the minimum extra cost
# by sending all commands in a single round-trip.
class ApplicationController < ActionController::Base
  before_action :collect_analytics

  private

  # Assumes the redis-rb gem; adapt to however your app configures its client.
  def redis
    @redis ||= Redis.new
  end

  def collect_analytics
    # Send 5 commands in a single round-trip
    redis.pipelined do |pipe|
      pipe.hincrby("pages", request.path, 1)
      pipe.hincrby("referrers", request.referrer, 1)
      pipe.hincrby("actions", "#{controller_name}##{action_name}", 1)
      pipe.hincrby("ips", request.remote_ip, 1)
      pipe.hincrby("users", current_user.id, 1) if current_user
    end
  end
end
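Note that inside the block each command returns a future rather than a value; pipelined itself returns the array of replies once the round-trip completes. A short sketch:
counts = redis.pipelined do |pipe|
  pipe.hincrby("pages", "/", 1)
  pipe.hincrby("ips", "127.0.0.1", 1)
end
# counts is an array with the new value of each counter, e.g. [12, 3]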
For more details, I encourage you to read the pipelining section of the README of redis-rb. And of course, the pipelining documentation of Redis.
Enqueuing a lot of jobs really fast
Enqueuing via ActiveJob::Base#perform_later
triggers callbacks and generates one round-trip for each job.
However, perform_all_later
skips callbacks and enqueues all jobs in a single step.
require "benchmark"
sections = Section.limit(1_000).load
Benchmark.measure do
  sections.map { |section| NormalizeDataJob.perform_later(section) }
end # @real=0.18012950796401128

Benchmark.measure do
  jobs = sections.map { |section| NormalizeDataJob.new(section) }
  ActiveJob.perform_all_later(jobs)
end # @real=0.05221099697519094
# Sidekiq with Redis running locally and config.log_level = :error
Enqueuing 1000 jobs is 3.45 times faster. The result may vary according to Active Job’s adapter.
Moreover, bulk enqueuing is not supported by all job backends, but if you’re using Solid Queue, GoodJob, or Sidekiq, you’re good. Finally, bulk enqueuing is only available from Rails 7.1 onward.
Enqueuing a lot more jobs
The previous example works fine with a few thousand jobs, but it would probably eat too much memory if you loaded many more ActiveRecord instances. The trick is to load only the IDs, as long as you do not need to access any attributes or methods.
require "benchmark"
ids = Section.limit(100_000).ids
Benchmark.measure do
  ids.map { |id| NormalizeDataJob.perform_later(id) }
end # @real=17.353170770045836

Benchmark.measure do
  jobs = ids.map { |id| NormalizeDataJob.new(id) }
  ActiveJob.perform_all_later(jobs) # @real=5.2103117069927976
end
# Sidekiq with Redis running locally and config.log_level = :error
Only 5 seconds are required to enqueue 100,000 jobs, which is 3.33 times faster.
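Beyond that scale, you can also bound memory by enqueuing batch by batch. A sketch combining in_batches with bulk enqueuing (NormalizeDataJob comes from the example above; the 10,000 batch size is arbitrary):
Section.in_batches(of: 10_000) do |batch|
  # Each iteration loads only 10,000 IDs
  jobs = batch.ids.map { |id| NormalizeDataJob.new(id) }
  ActiveJob.perform_all_later(jobs)
end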
Avoid returning records or relations from a model’s method
Because a SQL query is triggered on each call.
Memoizing the result is not necessarily better, since the memoized value won’t be invalidated when reload
is called.
Instead, switch that method to an association.
# Avoid returning records or relations from a model's method
class User < ApplicationRecord
  has_many :subscriptions

  # Bad because a SQL query is triggered on each call 👎
  def active_subscription
    subscriptions.active.first
  end
end

user = User.first
user.active_subscription # SELECT * FROM subscriptions ...
user.active_subscription # SELECT * FROM subscriptions ... ☹️
user.active_subscription # SELECT * FROM subscriptions ... ☹️

# Memoizing the result is not necessarily better since the memoized value
# won't be invalidated when reload is called. Instead, switch to an association:
class User < ApplicationRecord
  has_many :subscriptions

  # Good because the association is cached until reload is called 👍
  # (assumes Subscription defines an `active` scope)
  has_one :active_subscription, -> { active }, class_name: "Subscription"
end

user = User.first
user.active_subscription # SELECT * FROM subscriptions ...
user.active_subscription # Cached, no queries 🙂
user.active_subscription # Cached, no queries 🙂
user.reload
user.active_subscription # SELECT * FROM subscriptions ... 🙂
Array#find does not scale
Because scanning sequentially becomes linearly slower as the array grows, whereas looking up data in a sorted array is extremely fast!
Indeed, sorted data allows binary searching: at each step, the remaining range is split in half, then in half again, and so on, instead of scanning all entries sequentially. The complexity drops from O(n) to O(log(n)).
For example, finding a value among 1 million sorted entries takes at most about 20 comparisons (2^20 ≈ 1,000,000), whereas a sequential scan may need up to 1 million. It’s much faster. It’s like an index. And it’s possible only because the data is sorted.
We are lucky because Ruby provides Array#bsearch
.
Let’s compare it with Array#find
against 1 million entries from 1 to 1,000,000.
123,456 is at the beginning of the array, and 654,321 is a bit after the middle. To be fair, we will look for both values and see the difference at the end.
require "benchmark/ips"

array = (1..1_000_000).to_a # [1, 2, 3, ..., 1000000]

Benchmark.ips do |x|
  x.config(time: 1, warmup: 1)

  x.report("Array#bsearch(123456)") do
    array.bsearch { |n| 123456 <=> n }
  end
  x.report("Array#bsearch(654321)") do
    array.bsearch { |n| 654321 <=> n }
  end
  x.report("Array#find(123456)") do
    array.find { |n| 123456 == n }
  end
  x.report("Array#find(654321)") do
    array.find { |n| 654321 == n }
  end

  x.compare!
  # Comparison:
  #   Array#bsearch(123456): 1664871.8 i/s
  #   Array#bsearch(654321): 1459509.3 i/s - 1.14x slower
  #   Array#find(123456):    275.6 i/s - 6041.51x slower
  #   Array#find(654321):    51.5 i/s - 32320.07x slower
end
Bsearch is always a lot faster. It’s impossible to give an exact figure, as it depends on the size of the array and the location of the data. Do not forget that the array must be sorted, and the block’s condition must be consistent with that order.
Moreover, the execution time of bsearch does not vary much,
whereas Array#find(654321)
is a lot slower than Array#find(123456)
because the value is located further into the array.
That is why Array#find
does not scale.
Looking through an array of numbers is not a real use case. Let’s see how we can create an index for real data.
# Let's find all countries with a population between 100 million and 1 billion.
# We will fetch data from the World Bank's API and load the JSON into Country objects.
# Then the countries are sorted by population, and finally the range is found with bsearch_index.
require "json"
require "net/http"
Country = Struct.new(:name, :population)
# Download countries with population
url = "http://api.worldbank.org/v2/country/all/indicator/SP.POP.TOTL?format=json&date=2024&per_page=300"
File.write("population.json", Net::HTTP.get(URI(url))) if !File.exist?("population.json")
# Load JSON into Country objects
data = JSON.parse(File.read("population.json"))[1]
data = data[49..-1] # Skip regions to keep countries only
# Keep only entries with population data, since the API may return nil values
countries = data.filter_map { |hash| Country.new(hash["country"]["value"], hash["value"]) if hash["value"] }
# Countries must be sorted before using bsearch_index
countries_per_population = countries.sort_by(&:population)
# Find lower and upper indexes of countries between 100 million and 1 billion population
from = countries_per_population.bsearch_index { |country| country.population >= 100_000_000 }
to = countries_per_population.bsearch_index { |country| country.population >= 1_000_000_000 }
# The upper index must be excluded
countries_per_population[from...to].each { |c| puts "#{c.name}: #{c.population}" }
Now you’ve got a way to find data super fast. Note that, as with a database index, it’s relevant only if you have more reads than writes, because of the upfront cost of sorting. You can go further by reading Ruby’s Binary Searching documentation.
Conclusion
We went through a bunch of optimization good practices that cover the entire spectrum of a Rails application, from the frontend to the database, without forgetting views:
- Creating a lot of records quickly with insert_all and activerecord-import
- Enqueuing many jobs really fast with perform_all_later and IDs
- Decreasing round-trips thanks to Redis pipelining
- Faster views by rendering collections and avoiding small partials
- Loading content only when viewable thanks to IntersectionObserver or Turbo lazy frames
- Fast array search with bsearch