[Evergreen-dev] Thoughts on db upgrade scripts

Jason Stephenson jason at sigio.com
Tue Jun 13 11:27:50 EDT 2023


Good questions, Jason.  My responses are below.

On 6/13/23 11:09, Jason Boyer via Evergreen-dev wrote:
> After doing this system admin thing for a while you'll frequently find 
> situations where you've applied a fix to a bug that's still in progress 
> (oh, this horrible perf issue just needs an index? Do it!) but when the 
> upgrade script finally makes its way to a release the upgrade script 
> fails because you have the index but not the upgrade_log entry. Bummer.

I've had that happen in the past.  However, I usually catch it  on a 
test database before it happens in production.  I try to remember to 
keep an eye on these things, and I will go back and insert the 
appropriate config.upgrade_log entry when I become aware that I need to 
do that.  The same goes for other back ported SQL fixes.

I use git to manage upgrade branches and customization, so I also make 
sure to cherry-pick the relevant commits so that I can use a script to 
make database upgrade scripts and avoid the conflicts in the future.


> 
> We regularly use the "CREATE OR REPLACE" construction for functions and 
> views, who has thoughts about moving to a format where the fresh-install 
> sql is the usual INSERT (), CREATE INDEX ..., etc. while upgrade scripts 
> are the guarded versions of such:
> 
> INSERT INTO table (fields) SELECT (values) WHERE NOT EXISTS (SELECT 
> values FROM table);
> CREATE (INDEX,TABLE) IF NOT EXISTS ...;
> ALTER TABLE (most options) ...;
> etc.
> 
> I realize the immediate issue is further drift between the upgrade 
> scripts and the seed data for INSERTs and UPDATEs though it shouldn't be 
> too difficult to spot issues. (There's no reason to avoid IF NOT EXISTS 
> in the seed data aside from churn caused by adding them.) It probably 
> wouldn't hurt to re-work some of the enormous piles of INSERTs to also 
> make seed data patches easier to manage, but that's just a separate 
> thought I'm putting out there.
> 
> What say you lot?

I think it is fine to use "IF NOT EXISTS" with all of the appropriate 
commands, i.e. all that offer it as an option.  There should be no harm 
in using it in the seed data.  If we started doing that everywhere, the 
only drift would be with the older data that hasn't been changed.

CREATE OR REPLACE with functions can cause issues when the parameter 
list has changed.  Hopefully, this is caught when the db upgrades are 
tested, but you can't replace a function with the parameter list has 
changed, so you have to drop the old one, first.  There is also an issue 
when a function is overloaded to have different parameter lists.  The 
latter has bitten a couple of sites, CW MARS being one, when an old 
function definition was still around having fewer parameters than the 
one that was replaced.  (This wasn't really part of the question, but 
something that occurred to me while answering, and I felt it important 
enough to bring up.)

Cheers,
Jason Stephenson



More information about the Evergreen-dev mailing list