Monday, October 26, 2009

Tips for using Rails migrations

The concept of a migration is that each change made to the structure of the database is captured in a version-controlled script.  Any time you want to create a table, add a field, change a field type, etc., you do it via a migration.  Migrations are generated along with models, views, and controllers when a Rails scaffold is created to manage a resource. In addition, migrations are used to create a test database for each user, which allows for rapid, automated quality checks on the application. If you have ever worked on a project where it is difficult to create a development "sandbox" database for each developer, where databases are copied via backup processes that leave weird permissions behind (I'm looking at you, SQL Server), or where keeping the development database in sync with the application is something you even have to think about, you will easily see the benefits of migrations.
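
To make the idea concrete, here is a minimal sketch in plain Ruby of what "one version-controlled script per change" buys you. It is a toy model, not how Rails implements migrations: the `Migration` struct, the `migrate` and `rollback` helpers, and the hash standing in for a database are all made up for illustration.

```ruby
# Toy model of versioned migrations (NOT Rails' actual implementation).
# The "database" is a Hash mapping table names to arrays of column names.
Migration = Struct.new(:version, :name, :up, :down)

MIGRATIONS = [
  Migration.new(1, "create_users",
                ->(db) { db["users"] = ["id", "name"] },
                ->(db) { db.delete("users") }),
  Migration.new(2, "add_email_to_users",
                ->(db) { db["users"] << "email" },
                ->(db) { db["users"].delete("email") })
]

# Apply every migration newer than the recorded version, in version order.
# Returns the new current version.
def migrate(db, current_version, migrations)
  migrations.sort_by(&:version).each do |m|
    next if m.version <= current_version
    m.up.call(db)
    current_version = m.version
  end
  current_version
end

# Undo the most recently applied migration and return the previous version.
def rollback(db, current_version, migrations)
  m = migrations.find { |x| x.version == current_version }
  return current_version if m.nil?
  m.down.call(db)
  migrations.map(&:version).select { |v| v < current_version }.max || 0
end

# A fresh developer sandbox is just the full list replayed from version 0.
sandbox = {}
version = migrate(sandbox, 0, MIGRATIONS)
```

Because every change lives in an ordered script, any developer (or test run) can rebuild the same structure from nothing, which is the property the rest of these tips try to protect.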

Using Rails migrations can be a bit of a change for teams. In some cases, developers are not very familiar with relational database concepts and object modeling principles and can create overly complex or inefficient structures. Having the overall model reviewed by a database modeling expert and an object modeling expert can be quite helpful.  I recommend Coad's "Java Modeling in Color with UML", Fowler's "Analysis Patterns" and David Hay's "Data Model Patterns" for getting yourself up to speed with this. However, it is also worth communicating with the rest of the team that may be touching the model.

Here are a few things we've done in the process of keeping migrations under control:

Don't edit an existing migration, create a new one.
     Editing an existing migration defeats the whole point: anyone who has already run it will never pick up your change.  You might be able to get away with it if no one else has run it.  One obvious exception is when someone creates a migration that doesn't run and checks it in. They must get the pig.

Update and run new migrations before you check in a migration.
      While this might seem obvious, it is easy to check a new migration file in to the repository without noticing that someone else has created one that conflicts with yours, since the file names will be different.  If you don't update and run the incoming migrations first, you risk checking in a migration that conflicts with them or fails once they have been applied. If you have already run your own migration locally but not checked it in, and an update brings in a new migration, you need to roll back your own migration and change its number so that it runs after the migrations you just received from your version control system.
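
      The renumbering step can be sketched as a small helper. This is a hypothetical script, not something Rails ships: `version_of` and `renumber` are names invented here, and it assumes migration files are named `<digits>_<description>.rb`. It only computes the new file name; you would still roll back your own migration first and rename the file yourself.

```ruby
# Hypothetical helper: pick a file name for a local, not-yet-committed
# migration so that it sorts after everything already in the repository.
# Assumes file names look like "<digits>_<description>.rb".

# Extract the leading version number from a migration file name.
def version_of(filename)
  File.basename(filename)[/\A(\d+)_/, 1].to_i
end

# Return the name the local migration should have. The name is unchanged if
# the local migration already sorts after every repository migration.
def renumber(local_file, repo_files)
  name = File.basename(local_file)
  digits = name[/\A\d+/].length                  # keep the zero padding width
  next_version = repo_files.map { |f| version_of(f) }.max.to_i + 1
  return name if version_of(name) >= next_version

  format("%0#{digits}d_%s", next_version, name.sub(/\A\d+_/, ""))
end

incoming = ["006_create_orders.rb", "007_add_status_to_orders.rb"]
new_name = renumber("007_add_email_to_users.rb", incoming)
```

      The same comparison works with timestamp-style version numbers, since it only relies on the leading digits sorting in the order the migrations should run.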

Check in the models that go with the migration when you check in the migration
     One key benefit of migrations is keeping the code changes in sync with the database changes.  Add new tables and models in a single, logical commit.

Assign models to class owners
      The practice of XP that I have been least successful with is common code ownership.  It is one of the easiest practices to apply in theory, but without the sense of collective responsibility and the compensating practices of pair programming and continuous integration, it can be problematic.  In larger teams, it often makes more sense to have people or subteams assigned to manage a particular model so that changes can be coordinated.  In theory, there can be enough communication to make common ownership work. In reality, many teams are full of introverts who would rather rewrite huge swaths of code (and tests) than ask a question.

     Changes to the object model can impact the whole application. These changes should be communicated to the rest of the team so that the reasoning behind them can be better understood and the data can be used correctly.  There is a risk of someone criticizing or trying to change your proposal, but this is ultimately a lower risk than building something that others fail to understand.

Roll up all migrations into a big file after a major release
    Once your project has been going for a while and reaches roughly 100-150 migrations, it is probably worth rolling the migrations prior to the current point into a single migration.

Generate SQL to run on the production database
   In many cases it is preferable to generate SQL to run on the production database, as opposed to running the migrate task directly against the database. This also allows for some more extensive code review on structural changes.
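
   As a rough illustration of the idea (not a Rails feature), the sketch below renders a list of structural changes as SQL text that can be reviewed and then run by hand. The change format, the `SQL_TYPES` mapping, and the `to_sql` and `sql_script` helpers are all assumptions made for this example, and the exact SQL will vary by database.

```ruby
# Sketch: render structural changes as SQL text for review instead of
# executing them. The change format and type names are assumptions, and
# identifiers are not quoted or escaped, so only use trusted input.
SQL_TYPES = { string: "VARCHAR(255)", integer: "INTEGER", boolean: "BOOLEAN" }

# Turn one change description into a single SQL statement.
def to_sql(change)
  case change[:action]
  when :add_column
    type = SQL_TYPES.fetch(change[:type])
    "ALTER TABLE #{change[:table]} ADD COLUMN #{change[:column]} #{type};"
  when :drop_column
    "ALTER TABLE #{change[:table]} DROP COLUMN #{change[:column]};"
  else
    raise ArgumentError, "unsupported action: #{change[:action].inspect}"
  end
end

# Join the statements into one script, one statement per line.
def sql_script(changes)
  changes.map { |c| to_sql(c) }.join("\n")
end

changes = [
  { action: :add_column, table: "users", column: "email", type: :string },
  { action: :drop_column, table: "users", column: "nickname" }
]
puts sql_script(changes)
```

   The point of producing text rather than touching the database is that the script can go through the same review as any other change before anyone runs it in production.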

Don't forget to create constraints and indexes (indices?)
  If you are using foreign keys in your database, it's often worth the cost to create a constraint on the column to prevent any data integrity issues.  It's also a good practice to create an index on those columns to speed joins.
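
  Here is a small sketch of the two statements involved, written as plain Ruby string builders. The helper names `foreign_key_sql` and `index_sql` and the `fk_...` constraint naming are my own assumptions; inside a real migration you would more likely use `add_index` for the index and pass the constraint statement to `execute`, and the exact syntax depends on your database.

```ruby
# Sketch: build the SQL for a foreign key constraint and a matching index.
# Helper names and the constraint naming scheme are assumptions, and
# identifiers are not quoted or escaped.

# Constraint that keeps table.column pointing at an existing row.
def foreign_key_sql(table, column, ref_table, ref_column = "id")
  name = "fk_#{table}_#{column}"
  "ALTER TABLE #{table} ADD CONSTRAINT #{name} " \
    "FOREIGN KEY (#{column}) REFERENCES #{ref_table} (#{ref_column});"
end

# Index on the same column so joins against it do not scan the whole table.
def index_sql(table, column)
  "CREATE INDEX index_#{table}_on_#{column} ON #{table} (#{column});"
end

puts foreign_key_sql("orders", "user_id", "users")
puts index_sql("orders", "user_id")
```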

This is really just the tip of the iceberg, but it's a yummy water ice iceberg. Learning to use database migrations is a key skill, not just for Rails development, but for any agile development. A project which cannot create its database from scripts accessible to every developer is missing something important. By breaking those scripts down into small chunks and coordinating them with changes to the code, the situation where the application and database are out of sync simply does not exist.
