[Evergreen-dev] Thoughts on db upgrade scripts
Jason Stephenson
jason at sigio.com
Tue Jun 13 11:27:50 EDT 2023
Good questions, Jason. My responses are below.
On 6/13/23 11:09, Jason Boyer via Evergreen-dev wrote:
> After doing this system admin thing for a while you'll frequently find
> situations where you've applied a fix to a bug that's still in progress
> (oh, this horrible perf issue just needs an index? Do it!) but when the
> upgrade script finally makes its way to a release the upgrade script
> fails because you have the index but not the upgrade_log entry. Bummer.
I've had that happen in the past. However, I usually catch it on a
test database before it happens in production. I try to remember to
keep an eye on these things, and I will go back and insert the
appropriate config.upgrade_log entry when I become aware that I need to
do that. The same goes for other back ported SQL fixes.
I use git to manage upgrade branches and customization, so I also make
sure to cherry-pick the relevant commits so that I can use a script to
make database upgrade scripts and avoid the conflicts in the future.
>
> We regularly use the "CREATE OR REPLACE" construction for functions and
> views, who has thoughts about moving to a format where the fresh-install
> sql is the usual INSERT (), CREATE INDEX ..., etc. while upgrade scripts
> are the guarded versions of such:
>
> INSERT INTO table (fields) SELECT (values) WHERE NOT EXISTS (SELECT
> values FROM table);
> CREATE (INDEX,TABLE) IF NOT EXISTS ...;
> ALTER TABLE (most options) ...;
> etc.
>
> I realize the immediate issue is further drift between the upgrade
> scripts and the seed data for INSERTs and UPDATEs though it shouldn't be
> too difficult to spot issues. (There's no reason to avoid IF NOT EXISTS
> in the seed data aside from churn caused by adding them.) It probably
> wouldn't hurt to re-work some of the enormous piles of INSERTs to also
> make seed data patches easier to manage, but that's just a separate
> thought I'm putting out there.
>
> What say you lot?
I think it is fine to use "IF NOT EXISTS" with all of the appropriate
commands, i.e. all that offer it as an option. There should be no harm
in using it in the seed data. If we started doing that everywhere, the
only drift would be with the older data that hasn't been changed.
CREATE OR REPLACE with functions can cause issues when the parameter
list has changed. Hopefully, this is caught when the db upgrades are
tested, but you can't replace a function with the parameter list has
changed, so you have to drop the old one, first. There is also an issue
when a function is overloaded to have different parameter lists. The
latter has bitten a couple of sites, CW MARS being one, when an old
function definition was still around having fewer parameters than the
one that was replaced. (This wasn't really part of the question, but
something that occurred to me while answering, and I felt it important
enough to bring up.)
Cheers,
Jason Stephenson
More information about the Evergreen-dev
mailing list