Always check in schema.rb

Sometimes developers are unsure about Rails' schema.rb file and whether or not it should be checked in to source control. The answer is simple: yes.

As you probably know, Rails encourages developers to alter the database state through migrations. One migration adds a table, and later migrations add more, change something, or remove something. This is a great way to keep developers in sync with each other's changes to the database structure.

With each migration you add and subsequently run, your schema.rb file is updated to represent the current state of your database. Because this is done for you automatically, some developers think the file should be left out of source control. It should not.

Don't store it just because David Heinemeier Hansson said you should

First things first: this is a terrible reason to add schema.rb to your source control. In an argument about something like this, it's easy to look to an authority, find a relevant quote and say "See, he said so." Yes, there is a commit in Rails where he suggests you store schema.rb, but this was done to end an argument; it is not an argument itself.

Authorities are authorities because of their deep understanding of something, not because they command you to do things. But often people will allow authorities to command them to do things because they are authorities. David Heinemeier Hansson is an authority on Rails and his control of the project allowed him to make that commit and force the direction of Rails. If you agree with this commit, agree because of the reasoning, not because DHH said so.

Don't store generated files, except for schema.rb

As a general rule of thumb in compiled languages, you don't store generated files. The reason is not that an authority said so or that many consider it a good rule to follow, but that you want your code to generate the proper files for correct execution of the application.

If you add your generated files to your source control, you may in the future find that you've altered your code in a way that prevents a particular file from being generated at all. But because the checked-in copy of that file still exists despite the bug you introduced, the application keeps using it and you're likely to find errors in the application's behavior. Tracking that down may be tough until you realize that you've checked in your generated files.

Additionally, if you store generated file X in source control but change the code to instead generate file Y, you'll leave unused code in your project, possibly giving other developers the misconception that it is important in some way. Worse yet, depending on your application, the mere presence of file X could affect the execution of your application. I've heard things like this appropriately called "code turds." If it's not going to be used, it shouldn't be there.

Regardless of all of this, schema.rb doesn't affect the execution of your application so there is no danger in storing it in source control. Avoiding generated files in source control is a good rule to follow, but knowing when to break that rule is important too. Leave the code turds out and schema.rb in.

File churn in source control is not an issue with schema.rb

If you are concerned about the amount of churn (that is, frequent changes) in your schema.rb file, then either you are actively developing your database, or you have a problem elsewhere and need to get your team to agree on a clearer picture of what your database should do.

Churn in schema.rb can actually be valuable, however. It's easy to overlook multiple migration files changing things in your database, but when you review commits, heavy churn in one area of schema.rb can reveal problems within your development team.

Conflicts in schema.rb are valuable

If your team is making dueling commits over the purpose of a database field, your problem is not with resolving conflicts in schema.rb; it's with resolving conflicts among your developers about the structure of your database. Keeping schema.rb in source control will help to reveal this.

Before you commit any of your files to your master/production/whatever branch, you should:

  1. run your tests
  2. pull down and merge any other changes
  3. re-run your migrations if any new ones were pulled down
  4. re-run your tests
  5. commit/push your changes (including schema.rb)

Following those steps ensures that your tests are run against the database that everyone else will have, and that the schema.rb file you commit is up to date. Maybe you don't want to run your tests twice. That's fine, but be sure to run them after you pull down the latest code for the project and merge your changes.
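
The checklist above could be scripted along these lines. This is only a sketch: the command strings (rake test, git pull, git push) are assumptions about a typical Rails and git setup, and run_checklist is a hypothetical helper, not something Rails provides. The runner is injectable so the sequence can be dry-run without touching the shell.

```ruby
# Ordered pre-push checklist. The command strings are assumptions about a
# typical Rails + git project; substitute your own test runner and remote.
STEPS = [
  ["run tests",                  "rake test"],
  ["pull and merge",             "git pull"],
  ["run new migrations",         "rake db:migrate"],
  ["re-run tests",               "rake test"],
  ["push (including schema.rb)", "git push"]
].freeze

# Runs each step through `runner`, which receives the command string and
# returns a truthy value on success. Stops at the first failing step and
# returns the names of the steps that completed.
def run_checklist(steps = STEPS, runner: ->(command) { system(command) })
  completed = []
  steps.each do |name, command|
    break unless runner.call(command)
    completed << name
  end
  completed
end
```

Calling run_checklist with no arguments would shell out for real; passing runner: ->(command) { puts command; true } prints the commands instead. Stopping at the first failure matters here: if the tests fail after the merge, nothing gets pushed.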

Store it because schema.rb is a representation of the current state of the database

At any point in development, you can look at schema.rb for an accurate representation of your database structure. Other developers can check out the project, run rake db:schema:load, and almost instantly be ready to develop (barring any sample data they need to load).

For a new team member, there is absolutely no need to change or rename anything in the database, as replaying the migrations would. Your ultimate goal when you begin development is to have a database in the desired state. It doesn't matter a bit if a field name was changed from "login" to "username" or if your :text field was once a :string field in your first migration. For your database structure, the final state is all that matters, and schema.rb gives you exactly that.
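
To make the point concrete, here is a toy model in plain Ruby. It is not ActiveRecord: the "database" is just a hash, the three migrations are made-up lambdas, and SCHEMA_SNAPSHOT stands in for schema.rb. It only illustrates that replaying the history and loading the snapshot arrive at the same structure.

```ruby
# Toy model: a "database" is a hash of table name => { column => type }.
# Each migration is a lambda that mutates that hash. This is NOT ActiveRecord.
MIGRATIONS = [
  ->(db) { db["users"] = { "login" => :string, "bio" => :string } },  # create
  ->(db) { db["users"]["username"] = db["users"].delete("login") },   # rename
  ->(db) { db["users"]["bio"] = :text }                               # retype
].freeze

# Plays the role of schema.rb: only the final state, none of the history.
SCHEMA_SNAPSHOT = { "users" => { "username" => :string, "bio" => :text } }.freeze

# Replays every migration, in order, against an empty database.
def migrate_from_scratch
  MIGRATIONS.each_with_object({}) { |migration, db| migration.call(db) }
end

# Builds the database directly from the snapshot, copying each table
# definition so later changes do not touch the snapshot itself.
def load_schema
  SCHEMA_SNAPSHOT.transform_values(&:dup)
end

puts migrate_from_scratch == load_schema  # prints true
```

The snapshot path never sees "login" at all, which is exactly what a new team member wants.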

Application code mixed with migrations can cause problems

Sometimes you may need to alter your structure, do something like ProductType.reset_column_information in a migration, and add some data. For now, I'll avoid the discussion on whether or not that's appropriate, but if you are doing this, a problem may arise when at some point you remove the ProductType model or rename it to Category. In that case, you'll need to go back and maintain your migrations. Read that again: you'll need to maintain your migrations. This is a pointless exercise: use schema.rb.

This is also an example of why you shouldn't mix database seeding and migrations.
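
The failure mode can be reproduced in plain Ruby without Rails. In this sketch, OLD_MIGRATION is a stand-in for the body of an old migration, and ProductType is deliberately left undefined to mimic a model that has since been removed or renamed. The create! call and try_migration helper are illustrative assumptions.

```ruby
# Stand-in for an old data migration that leans on an application model.
# ProductType is intentionally undefined here, mimicking a model that was
# deleted or renamed to Category after the migration was written.
OLD_MIGRATION = lambda do
  ProductType.reset_column_information
  ProductType.create!(name: "Default")
end

# Returns :ok if the migration body runs, or the error message if the
# constant it depends on no longer exists.
def try_migration(migration)
  migration.call
  :ok
rescue NameError => e
  e.message
end

# Prints an "uninitialized constant" message naming ProductType.
puts try_migration(OLD_MIGRATION)
```

Anyone replaying the full migration history hits this error; anyone loading schema.rb never runs the old migration body in the first place.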

Migrations are slow

Relying on migrations to get up to speed for development is slow and gets slower as your application's database changes and as the number of your migrations increases. Because schema.rb skips over changes and represents the final (and desired) state of your database, it's fast.

Your blank slate production database only needs the final state

A production database only needs the final state, assuming, of course, that the database is a blank slate. Running all of the migrations in production to get where schema.rb would be is not necessary. schema.rb weeds out all of the changes for you and gets the job done quickly.

"But what about a database that already has a structure?" you may ask. Then all you need is the migrations that haven't been run; you'll never run rake db:schema:load on an existing production database.

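Working out which migrations "haven't been run" is essentially a set difference: Rails records applied versions in the schema_migrations table and compares them with the timestamped files in db/migrate. The sketch below models just that comparison; the version numbers are made up and pending_versions is a hypothetical helper, not a Rails API.

```ruby
# all_versions:     timestamp prefixes of the files in db/migrate
# applied_versions: versions recorded in the schema_migrations table
# Returns the versions that still need to run, oldest first.
def pending_versions(all_versions, applied_versions)
  (all_versions - applied_versions).sort
end

# Made-up example data.
migration_files = %w[20240101090000 20240215120000 20240310083000]
applied_in_prod = %w[20240101090000 20240215120000]

puts pending_versions(migration_files, applied_in_prod).inspect
# prints ["20240310083000"]
```
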
Keep schema.rb with your code

schema.rb in your project's source control adds value for all developers. It loads the desired state quickly, it gives you a clear representation of your database structure, it reveals conflicts and unnecessary churn, and it doesn't affect the execution of your application. Add schema.rb to your source control and add value for your team.