[OPEN-ILS-DEV] Database upgrades (recent approach)

Tue Sep 22 00:56:13 EDT 2009

On Mon, 2009-09-21 at 16:44 -0400, Bill Erickson wrote:
> 
> 
> On Mon, Sep 21, 2009 at 3:23 PM, Dan Scott <dan at coffeecode.net> wrote:
>         It's great to see the atomic database migration scripts
>         (better name
>         suggestions appreciated) getting committed recently! Is there
>         a design
>         document that lays out how these are expected to be used & how
>         other
>         developers should create these?
> 
> 
> Thanks for the poke Dan, I've been meaning to discuss and request
> feedback on this.  The problem of managing schema changes has gotten
> out of hand when working on a development server.  The goal was to
> start moving in the direction of atomic changes then work out the
> rough edges.   Here's a rough brain dump of where things stand
> currently: 
> 
> Each branch has it's own upgrade directory in Open-ILS/src/sql/Pg/, if
> needed.
> 
> Every schema change or addition of seed data to a branch results in a
> new file in that directory.  
> 
> Clumping multiple changes into a single file is OK if they are tightly
> linked (e.g. seed data for 2 related org unit settings).
> 
> Each new file uses the next unique numeric prefix.  0001, 0002, etc.  
> 
> Following the numbers is ".schema" or ".data", indicating a schema
> change or new seed data.

Hmm. How about the case that I outlined previously about adding an
enhanced subject index, in which we would generate new rows for every
existing bib record? It's munging existing data and creating new data,
but it's not seed data. I think that we could put that in the data
category and name it a .data file, but we wouldn't put it into the
950.sql seed data file.

Just a picky detail.

> Anything after that is just descriptive.
> 
> If it's necessary to back-port a DB change, just merge the .sql file
> back into the branch's upgrade directory. 
> 
> Each new file tells the DB that it has been inserted with "INSERT INTO
> config.upgrade_log (version) VALUES ('$ENTRY');"  The value for $ENTRY
> needs discussion.  I belive most of the existing scripts put the
> actual file name in there, while 002.schema.config.sql (which defines
> the baseline for a newly created DB) is currently just storing the
> numeric prefix.  
> 
> The value in 002.schema.config.sql will also need to be bumped up with
> each additional script.
> 
> For schema changes, the db upgrade scripts capture the ALTER TABLE,
> etc. statements.  Changes to the table definitions still need to be
> added to the base schema file for new installations.  
> 
> Similarly, seed data should still go into 950.data.seed-values.sql as
> well.  At least, that's the current thought.  I could see an argument
> for changing that.

I kind of like one big 950.data.seed-values.sql file, as it makes it a
bit easier to see everything in one place. That said, there's something
that seems a bit wrong about duplicating data... chances of typos
creeping into one version or the other, for example.

> At minimun, this makes it possible for a developer to easily see what
> DB changes need to be applied since a given revision.  The longer term
> goal is automated DB upgrades.
> 
> I'm sure I'm missing a thing or two.  Does this sound reasonable?

It sounds very reasonable to me. The only thing that I've noticed
missing is I18N, but we can probably handle that by teaching the current
db-seed-i18n.py script to handle migration scripts as well... I think.
Will need to wrestle with it for a little bit.

> I'll be glad to wiki-fy this after we come to concensus, work out the
> edges, etc.

Awesome. I don't think there are too many edges to smooth out, you've
laid things out nice and logically (as usual).

>         I also think we have a number of atomic scripts to create
>         retroactively (?)
> 
> 
> It may be simpler to just account for missing changes in the first
> batch of upgrade scripts.  IOW, going back and adding all the changes
> could be a big job and not worth the effort.  I'll defer judgement on
> that, though.

Right, I agree that it could be a big job. So you're suggesting rolling
those changes into a traditional big upgrade SQL script for this kick at
the can? If so, I would support that.

Thanks for the careful design and quick response, Bill!