What's the advantage?

shimon · on Dec 8, 2010

Consistent ordering even if two developers happen to add migrations starting from the same source version.

mgedmin · on Dec 9, 2010

Could you elaborate on that?

nerme · on Dec 9, 2010

Schema version is at 200.

Developer A creates a migration numbered 201. Developer B creates a migration numbered 201. They push their code, pull, and get conflicts.

So instead, they use a timestamp based system. Developer A created his and got 20100604181252. Developer B got 20100604182530. Now, everyone is happy, and no one has to change their migrations.

alextgordon · on Dec 9, 2010

I think I prefer the first version. At least the conflict is obvious.

Developer A creates migration 20100604181252. Developer B creates migration 20100604181253. Developer B doesn't notice Developer A's migration: there's no way to know from the timestamps that there might be a conflict. Everything breaks.

vs

Developer A creates migration 201. Developer B creates migration 201. Developer B gets a handy notification that something has happened in the meantime, and gets a chance to patch the migration if necessary.

binbasti · on Dec 9, 2010

Actual conflicts from working with the same fields in the same tables are so rare, I haven't even experienced it once.

However, the second scenario would happen every single day in a team.

tim_iles · on Dec 9, 2010

I have occasionally modified a sproc, only to find that a conflicting check-in has also modified that same sproc. One of our modifications has been overwritten by the other.

I do concede it is rare, but it happened, and we only discovered the bug too late.

rahoulb · on Dec 9, 2010

The real issue is when you have feature branches and merge them in to master/trunk - so it's not a single file conflicting, but maybe a week or more's worth of work from both developers.

As you say, it would rely on notifications that there was the possibility of clashes, but unless you monitor every other developer's local feature branch, you would only know when the branches are merged into master.

kemiller · on Dec 9, 2010

Two developers pull code with migration version 4. Each writes a new feature involving a migration. Both get the number 5, which then collides when they both check in. But it doesn't collide at the source level, so you only find out when the db doesn't work right. This was a big pain in the ass in Rails until they fixed it.

kiddo · on Dec 9, 2010

Why don't they collide at the source level? I'm assuming the db updates get checked into your central source code repo. Wouldn't the collision happen at that time?

What happens when developers create branches and use time stamps? Developer A working on trunk might check in db updates at 4pm and 5pm. Developer B working on a branch checks db updates into his branch at 4:30pm and 5:30pm. When working on the branch his db updates don't affect the "trunk" db. As soon as he merges does his 4:30pm update get run on the "trunk" db, or is it skipped over because the 5pm db update already ran, and only the 5:30pm change gets run?

Where I come from we use int values for our db update script names. When I merge from a branch I renumber my db update script file names starting at the tip int value +1 of the trunk db update.

fyi - I'm not coming from a Ruby/Rails perspective, if that matters.

rahoulb · on Dec 9, 2010

There's a table listing the migrations that have already been run on the database. The list of migrations in source control are compared against the list in the database, the outstanding ones sorted into timestamp order and then run.

So if the two branches are separate there should be no issue - all my changes get run in order, irrespective of what the other developers have been doing.

kemiller · on Dec 9, 2010

They don't collide because the file names include more than just the number.

And yes, manually renumbering is exactly what you'd have to do. Timestamps obviate that.

Theoretically, you can still have collisions (two migrations alter the same column, eg) or violated prereqs this way, so some communication is still required, but not to the degree before.

gchpaco · on Dec 9, 2010

Posit repository A which is at schema state S(A) = N. Developer X branches it to make B and adds a new database migration, increasing S(B) to N+1. Developer Y branches it to make C and adds a new database migration, increasing S(C) to N+1. Now, merge B and C.

It's not necessary that schema versions be that rigid. All you really need is that they be monotonically increasing; a timestamp in UTC is sufficient for those purposes, although technically you could run into a situation where a UUID is necessary.

stavros · on Dec 8, 2010

Ah, very interesting. I'm sold.

wladimir · on Dec 9, 2010

Timestamps, UUIDs, and such have no dependency on a central source (and thus, single source fo failure) for generating them. Much better scalability.

Even with a single server, not relying on IDs being sequental is useful when you want to do a database migration without temporarily shutting down the site. It is much simpler to merge the two databases together.