[OPEN-ILS-GENERAL] Testing and Evergreen's quality (was: Database schema deprecation/supersedes stuff)

Wed Jun 22 14:10:01 EDT 2011

Top-posting because my response (or, more appropriately, follow-up) is
in the form of a public Google doc.

Short version: We agree on all the goals, so I propose a way forward
that leverages the size (bigger than pre-2.0 and growing) of our
community and IMO gives us the tools to enact change toward those
shared goals.

Long version: https://docs.google.com/document/d/1EyLZ9PH25kwQvbC9uvYo0z8cYG5Ui3bFrzlkp2C2FYA/edit?hl=en_US

Thanks, Dan, for clearly articulating your thoughts.  I hope the doc
above does the same.

All, please, PLEASE take a few minutes to read that.  Feedback
strongly encouraged, both general and specific.  If this thread grows
a lot I'll happily open that doc for editing.

--miker

On Mon, Jun 6, 2011 at 1:34 AM, Dan Scott <dan at coffeecode.net> wrote:
> On Sun, Jun 5, 2011 at 11:17 AM, Mike Rylander <mrylander at gmail.com> wrote:
>> On Sun, Jun 5, 2011 at 1:04 AM, Dan Scott <dan at coffeecode.net> wrote:
>
>>>  * Are we ready to start making use of pgTAP? Changing the database
>>> schema seems like a perfect use case for unit tests, to ensure that
>>> expected behaviour is maintained through the upgrade, and to
>>> demonstrate that buggy behaviour is fixed or non-existent behaviour
>>> comes into existence via the upgrade.
>>
>> Ready? Sure. Tuit-ful? Not I...
>
> I'm not sure how to respond to this tactfully, so I won't try to be
> clever or cute, I'll just be blunt. The alternative to putting in time
> upfront on quality is to spend more time addressing quality problems
> later after a release, and we've done a lot of the latter. We've had
> trouble publishing high-quality initial releases. Production sites
> have been finding too many problems that affect their patrons and
> staff, and it's not good for the Evergreen name. My hands are far
> from clean on
> this front (hello, sites who upgraded from 1.6 -> 2.0 and ran into
> problems with authorities), which is one of the reasons that I have
> invested much of my own time in getting the continuous integration
> server running again and creating a skeleton set of unit tests (and
> thanks to Kevin for his efforts in that area too). It's also why I've
> been a proponent of getting sign-off on branches from another
> contributor instead of committing your own work directly to a core
> branch.
>
> I believe that we can begin to address some of these quality issues
> via more unit test coverage. I don't think that we're going to get
> very far, though, if we just have one or two people trying to add unit
> tests to other people's work - and those people are likely to have
> their own areas of new functionality that they want to contribute to
> Evergreen, rather than spending all of their time writing tests for
> other people's code. The people creating new functionality or
> modifying existing functionality are the ones who are in the best
> position to know what inputs and outputs to expect from a given
> chunk of code, and therefore to create basic unit tests demonstrating
> those expectations - which helps other contributors weeks, months, or
> years later know whether their own changes will break expectations.
> But we need to adopt the approach as a team, not as individuals.
> Tackling the database schema via pgTAP as modifications happen seems
> like a small, reasonable step to take in this direction. It's not
> trying to boil the ocean by saying that we need unit tests for every
> function and every table in the database immediately; it's suggesting
> that, when you modify the schema, you commit tests at the same time
> that demonstrate that your changes do what you say they do (and
> maintain existing behaviour). And eventually, I bet we would get a lot
> of the database schema covered with this gradual approach.
>
> Unit tests alone won't prevent all of the problems that we've run into
> with new releases, of course. I've been guilty of introducing new
> functionality that performed poorly at scale until indexes were
> added, or that had problems that only showed up when data was migrated
> from a previous release rather than loaded directly into the new
> release.
> Bug #788379 ("broad searches are slow") is an example of a serious
> performance regression in 2.0 that has yet to be addressed.
> constrictor gives us some great tools on the performance testing
> front, but it takes time to set up a clean environment loaded up with
> sufficient data to trigger noticeable performance problems (let alone
> tracking performance over time) or to run that environment through an
> upgrade process and put the resulting environment through its paces.
> We need repeatable upgrade tests and performance tests - maybe a
> community environment that runs a standard set of system tests on a
> regular basis and tracks those results over time?
>
> In summary, I don't think I'm the only person who feels that we've had
> quality problems. There are probably ways to address these problems
> that I haven't raised here, and I'd be happy to hear about
> alternatives from people who are prepared to adopt them. I just don't
> want to see a 2.1.0 release that isn't really ready for prime time
> until 2.1.6, and a 3.0 release that isn't ready for adoption until
> 3.0.6, and I don't want libraries playing a game of chicken to see who
> is willing to be the early adopter of a new release. I want libraries
> confident that they can adopt a *.*.0 release, and I want them to be
> proud of Evergreen's quality and able to recommend it without
> reservations to other libraries.
>

-- 
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / Your Library's Guide to Open Source
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  miker at esilibrary.com
 | web:  http://www.esilibrary.com