Database migrations are an everyday task and, as such, are rarely considered as exciting as software architecture or other hot topics in the industry. Yet they are essential in any application that uses a database!
Ruby on Rails handles a lot of migration problems for us. However, there are still some details everyone must remember to avoid trouble. That’s why I would like to share some simple tips on how to improve your database migrations, which I hope you will find useful!
Here are the tips you should follow:
- Redefine ActiveRecord classes
- Watch out for irreversible migrations
- Clean up old migrations
- Remember about bulk migrations
- Handle complex data migrations in temporary rake tasks
- Get familiar with ActiveRecord
1. Redefine ActiveRecord classes
Often, we need to refer to Active Record models inside migrations, for example, to update some fields. So, let’s imagine we want to introduce a new :status column to the users table and set the :status of every user to "active".
It might look like this:
class AddStatusToUser < ActiveRecord::Migration[5.1] # ❌
  def up
    add_column :users, :status, :string
    User.update_all(status: "active")
  end
  def down
    remove_column :users, :status, :string
  end
end
But there’s a better way to do this by stubbing out the model in the migrations. It gives us two main advantages. First of all, it guards against the case where a model is removed from the codebase but is still being called in a migration. Secondly, it prevents validations from being run, as well as eliminates the associations overhead.
The improved version would look like this:
class AddStatusToUser < ActiveRecord::Migration[5.1] # ✅
  # Migration-local stand-in for the application model
  class User < ActiveRecord::Base
  end
  def up
    add_column :users, :status, :string
    User.update_all(status: "active")
  end
  def down
    remove_column :users, :status, :string
  end
end
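To see why the stub takes precedence over the application model, here is a minimal sketch in plain Ruby, without Rails. Both classes and the source method are hypothetical stand-ins used only for illustration; in a real migration the nested class would inherit from ActiveRecord::Base. The point is only that Ruby resolves the User constant lexically, so code inside the migration class finds the nested class first.

```ruby
# Hypothetical stand-in for the application model (app/models/user.rb).
class User
  def self.source
    "app/models/user.rb"
  end
end

# Hypothetical stand-in for a migration class (no ActiveRecord involved).
class AddStatusToUser
  # Migration-local stub, defined inside the migration's namespace.
  class User
    def self.source
      "migration-local stub"
    end
  end

  def up
    # Lexical constant lookup finds AddStatusToUser::User before ::User.
    User.source
  end
end

puts AddStatusToUser.new.up
```

Because the lookup is lexical, the stub keeps working even if the top-level model is later renamed or deleted.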
2. Irreversible migrations
Always make sure your migration is reversible (you should be able to run rails db:rollback on it). Rails has a #change method that is primarily used in writing migrations. It works for most cases, namely those in which Active Record knows how to reverse a migration automatically.
However, if you modify the data, for example, update the :status column of User, Rails won’t know how to roll back that migration. Hence, never rely on the #change method alone in such a scenario. Use #up/#down methods instead, where #down either raises ActiveRecord::IrreversibleMigration or implements the logic to reverse the change introduced in #up.
class AddStatusToUser < ActiveRecord::Migration[5.1] # ❌
  def change
    User.update_all(status: "active")
  end
end
class AddStatusToUser < ActiveRecord::Migration[5.1] # ✅
  class User < ActiveRecord::Base
  end
  def up
    User.update_all(status: "active")
  end
  def down
    raise ActiveRecord::IrreversibleMigration
  end
end
class AddStatusToUser < ActiveRecord::Migration[5.1] # ✅
  class User < ActiveRecord::Base
  end
  def up
    add_column :users, :status, :string
    User.update_all(status: "active")
  end
  def down
    remove_column :users, :status, :string
  end
end
Alternatively, you can use the #reversible block/helper in the #change method as shown in the example below:
class AddStatusToUser < ActiveRecord::Migration[5.1] # ✅
  class User < ActiveRecord::Base
  end
  def change
    reversible do |dir|
      dir.up do
        User.update_all(status: "active")
      end
      dir.down do
        raise ActiveRecord::IrreversibleMigration
      end
    end
  end
end
Active Record has a class (ActiveRecord::Migration::CommandRecorder) that records the commands run during a migration and knows how to reverse them. Currently, there are around 27 commands, and most of them are reversible, like #create_table or #add_column. Examples of non-reversible commands are #execute, #execute_block, and #transaction. If you would like to find out more, please see the ActiveRecord::Migration::CommandRecorder documentation.
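The idea of recording commands and inverting them can be sketched in a few lines of plain Ruby. This is a toy model, not the Rails implementation: ToyRecorder, IrreversibleCommand, and the INVERSES table are hypothetical names invented for this example, and only three commands are covered. It records commands in order, then produces their inverses in reverse order, and raises when it meets a command it cannot invert.

```ruby
# Toy model only: NOT how Rails implements CommandRecorder.
class IrreversibleCommand < StandardError; end

class ToyRecorder
  # Maps a recorded command to a lambda that builds its inverse.
  INVERSES = {
    add_column: ->(args) { [:remove_column, args] },
    create_table: ->(args) { [:drop_table, args] },
    rename_column: lambda do |args|
      table, from, to = args
      [:rename_column, [table, to, from]]
    end
  }.freeze

  attr_reader :commands

  def initialize
    @commands = []
  end

  # Remember a command instead of executing it.
  def record(name, *args)
    @commands << [name, args]
  end

  # Inverse commands, last recorded command first.
  def inverse
    @commands.reverse.map do |name, args|
      inverter = INVERSES.fetch(name) do
        raise IrreversibleCommand, "don't know how to reverse #{name}"
      end
      inverter.call(args)
    end
  end
end

recorder = ToyRecorder.new
recorder.record(:create_table, :users)
recorder.record(:add_column, :users, :status, :string)
p recorder.inverse
```

Recording an :execute command on the same recorder and calling inverse again would raise IrreversibleCommand, which mirrors why raw SQL or data updates need an explicit #down.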
3. Clean up old migrations
It might be a good idea to clean up old migrations when you can no longer run all of them (rails db:migrate) on a fresh database. Otherwise, it might not be necessary, as there aren’t many benefits besides having fewer files in db/migrate.
In GAT, we only did it once because of old, poorly written migrations. Cleaning up legacy migrations is fairly simple; you just need to create one migration file that holds the current state of the database, in other words, a copy of your schema.rb. Now you can forget about all the troubles when running migrations on a fresh database:
class InitSchema < ActiveRecord::Migration[5.1]
  def up
    create_table "users" do |t|
      t.string "first_name", null: false
      t.string "last_name", null: false
      t.datetime "created_at", null: false
      t.datetime "updated_at", null: false
    end
    # etc…
  end
  def down
    raise ActiveRecord::IrreversibleMigration, "The initial migration is not reversible"
  end
end
4. Bulk migrations
Let’s say that we need to add :first_name, :last_name, :address, :phone_number to the Users table.
Instead of adding each field separately we can do it in bulk mode and add them at once. This way the query will be executed once instead of four times, and in the database world, executing a single query is much faster than executing four individual queries one by one even though their results are the same.
Compare these two examples:
class AddFieldsToUser < ActiveRecord::Migration[5.1] # ❌
  def change
    add_column :users, :first_name, :string
    add_column :users, :last_name, :string
    add_column :users, :address, :string
    add_column :users, :phone_number, :string
  end
end
class AddFieldsToUser < ActiveRecord::Migration[5.1] # ✅
  def change
    change_table :users, bulk: true do |t|
      t.string :first_name
      t.string :last_name
      t.string :address
      t.string :phone_number
    end
  end
end
In general, it’s 2-3 times faster, which is quite a time saving, especially with a large database, given how simple the improvement is.
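The difference between the two migrations above comes down to how many statements reach the database. The plain Ruby sketch below only builds illustrative SQL strings to make that visible; the helper names separate_statements and bulk_statement are hypothetical, the varchar type is an assumption, and the exact SQL that Rails emits depends on the database adapter.

```ruby
# Illustrative only: approximates the shape of the SQL, not Rails' exact output.
COLUMNS = {
  first_name: "varchar",
  last_name: "varchar",
  address: "varchar",
  phone_number: "varchar"
}.freeze

# One ALTER TABLE per column, as with four separate add_column calls.
def separate_statements(table, columns)
  columns.map { |name, type| "ALTER TABLE #{table} ADD COLUMN #{name} #{type}" }
end

# A single ALTER TABLE with every column, as with bulk: true
# (on adapters that support combining alterations).
def bulk_statement(table, columns)
  additions = columns.map { |name, type| "ADD COLUMN #{name} #{type}" }
  "ALTER TABLE #{table} #{additions.join(', ')}"
end

puts separate_statements(:users, COLUMNS)
puts bulk_statement(:users, COLUMNS)
```

On a large table, each ALTER TABLE can be expensive on its own, which is why collapsing four of them into one tends to pay off.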
5. Handle complex data migrations in temporary rake tasks
By definition, Rails migrations should only be used for schema changes and not for actual data changes in the database. It might be a good idea to handle complex data migrations in temporary rake tasks. That way, we decouple our data changes from the deployment. Furthermore, data migrations are one-off operations rather than business logic, so they shouldn’t stay in the codebase forever.
That said, it’s not the worst idea to add a small data change after adding a new column, especially if you stick to tip #1 from this article or use strong migrations (see "Database migration tips in Rails. Part 2 - Strong migrations"). It shouldn’t cause you any trouble, and there are also cool gems like after_party and maintenance_tasks that you might find helpful.
class AddStatusToUser < ActiveRecord::Migration[5.1] # ❌
  def change
    User.update_all(status: "active")
  end
end
namespace :data_migration do # ✅
  desc "Sets the default user status"
  task set_user_status: :environment do
    puts "Setting up default users status"
    User.find_each do |user|
      begin
        user.status = "active"
        user.save!
      rescue StandardError => e
        # Report the failure and carry on with the remaining users
        puts "Error updating #{user.id}: #{e.message}"
      end
    end
    puts "All Done! 🎉"
  end
end
I consider the code below fine as well, as long as we keep in mind tip #1 from this article, and we know that migration won’t take too much time:
class AddStatusToUser < ActiveRecord::Migration[5.1]
  class User < ActiveRecord::Base
  end
  def up
    add_column :users, :status, :string
    User.update_all(status: "active")
  end
  def down
    remove_column :users, :status, :string
  end
end
6. Get familiar with Active Record
When dealing with large datasets, be aware of what SQL query Active Record generates under the hood. You can call to_sql on the Active Record query to do so. It may turn out that your query is not performant enough, and even though it was fine in the dev environment, it takes too long to execute in production.
Here’s a short list of Active Record tips that might help you with trying to speed up your query:
- Use find_each over each -> It reduces memory consumption as it’s more efficient than looping through a collection because it doesn’t instantiate all the objects at once. Instead, it allows us to work with individual records.
- Use update_all/delete_all instead of update/delete -> It will update/delete all records with one SQL query, which is far more performant than doing it one by one.
- Use insert_all/upsert_all -> It inserts multiple records in a single SQL INSERT statement without instantiating models or triggering Active Record’s callbacks.
- Watch out for destroy/destroy_all -> Remember that destroying might be an expensive operation as it instantiates, executes callbacks, and deletes each record.
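To make the find_each tip more concrete, here is a rough sketch of batched iteration in plain Ruby. It is not the Rails implementation: FakeTable, fetch_after, and each_in_batches are hypothetical names, the "table" is just an in-memory array of positive ids, and the real find_each works on records ordered by primary key with a default batch size of 1000. The sketch only shows that a handful of records is held at any time, fetched with a query per batch.

```ruby
# Hypothetical in-memory stand-in for a database table of positive ids.
class FakeTable
  attr_reader :queries

  def initialize(ids)
    @ids = ids.sort
    @queries = 0
  end

  # Mimics: SELECT ... WHERE id > last_id ORDER BY id LIMIT limit
  def fetch_after(last_id, limit)
    @queries += 1
    @ids.select { |id| id > last_id }.first(limit)
  end
end

# Yields every id while holding at most batch_size ids at a time.
def each_in_batches(table, batch_size: 1000)
  last_id = 0 # assumes ids start above zero
  loop do
    batch = table.fetch_after(last_id, batch_size)
    break if batch.empty?

    batch.each { |id| yield id }
    last_id = batch.last
    # A short batch means there is nothing left to fetch.
    break if batch.size < batch_size
  end
end

table = FakeTable.new((1..25).to_a)
each_in_batches(table, batch_size: 10) { |id| puts id }
```

The trade-off is a few more queries in exchange for bounded memory, which usually matters far more on production-sized tables.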
Summary
Migrations are essential in database applications. Some frameworks, like Ruby on Rails, will do a lot for us under the hood, but we shouldn’t forget how important migrations are. As such, it’s worth still giving them plenty of attention.
Hopefully, these tips will prove helpful and easy to apply to your project, but we’re not done yet! In part two we’ll take a look at strong migrations and see how we can make our migrations even stronger!
*Originally posted at GAT Engineering Blog