[OPEN-ILS-GENERAL] RE: Evergreen & Software Performance Analysis

Scott Myers sMyers at catalystitservices.com
Wed Sep 25 17:51:44 EDT 2013


Mike,

The multithreaded reingest project was shared during the hackathon at the last Evergreen conference.

Here is a link to what we ended up running for moving KCLS from 2.1 to 2.2.

https://github.com/CatalystIT/multithread_2_2_update

The files to pay attention to are data_update_driver.pl and update_driver.pl; both have POD files attached with quite a few comments on how they work.

To clear up what that means: we created driver files that divide large amounts of data into smaller chunks and run those chunks over multiple connections for CPU-bound updates. A good example is the 2.1->2.2 upgrade, which changed how data is stored in the metabib field entry tables. This was a very CPU-bound update, and running it with 32 simultaneous connections reduced the estimated completion time from 5 days to 4 hours.
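
As a rough illustration of the chunk-and-fan-out idea described above (not the actual Perl drivers in the repository), a minimal Python sketch might look like the following. The function names, the ID range, the chunk size, and the stand-in worker are all assumptions made for the example; a real driver would open one database connection per worker and run the update for its ID range there.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_ranges(first_id, last_id, chunk_size):
    """Split the inclusive ID range [first_id, last_id] into (start, end) chunks."""
    return [
        (start, min(start + chunk_size - 1, last_id))
        for start in range(first_id, last_id + 1, chunk_size)
    ]


def run_chunks(chunks, process_chunk, connections):
    """Call process_chunk(start, end) for every chunk, at most `connections` at once.

    Threads are enough on the client side in this sketch, on the assumption
    that the CPU-heavy work happens inside the database server.
    """
    with ThreadPoolExecutor(max_workers=connections) as pool:
        return list(pool.map(lambda c: process_chunk(*c), chunks))


def fake_update(start, end):
    # Hypothetical stand-in for "update rows with id BETWEEN start AND end".
    return end - start + 1  # number of rows "processed"


if __name__ == "__main__":
    chunks = chunk_ranges(1, 100_000, 1_000)
    done = run_chunks(chunks, fake_update, connections=32)
    print(f"{len(chunks)} chunks, {sum(done)} rows processed")
```

For scale, 5 days is 120 hours, so finishing in 4 hours is roughly a 30x reduction across 32 connections.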

Let me know if you have questions on how this can be set up or run.

Thanks

Scott Myers

-----Original Message-----
From: open-ils-general-bounces at list.georgialibraries.org [mailto:open-ils-general-bounces at list.georgialibraries.org] On Behalf Of Mike Rylander
Sent: Wednesday, September 25, 2013 1:41 PM
To: Evergreen Discussion Group
Subject: Re: [OPEN-ILS-GENERAL] Evergreen & Software Performance Analysis

Scott,

I echo Rogan's down-thread thanks for following up here.

I'm curious where the multi-threaded reingest project is shared.  I can't find anything like that searching any of the Evergreen mailing lists or Launchpad for terms like "ingest" and "multi".
Perhaps I'm just missing it.  Some interest was expressed in the community IRC channel, but also some confusion as to what exactly that means.

TIA,

--
Mike Rylander
 | Director of Research and Development
 | Equinox Software, Inc. / Your Library's Guide to Open Source  | phone:  1-877-OPEN-ILS (673-6457)  | email:  miker at esilibrary.com  | web:  http://www.esilibrary.com


On Wed, Sep 25, 2013 at 3:50 PM, Scott Myers
<sMyers at catalystitservices.com> wrote:
> Hi Rogan,
>
>
>
> The db work Command Prompt has done for KCLS is mostly configuration:
> work_mem, max_connections, etc. They have been fine-tuning all those
> settings to get the best performance. These settings wouldn't help other
> people, as they depend on each library's load. Another change made
> by Command Prompt was to remove Slony replication and move to pgpool. If
> anyone needs help doing the same with their database I would highly
> recommend Command Prompt.
>
>
>
> As for work done by Catalyst, all work that is directly applicable and
> beneficial to the community has been added. Kyle Tomita
> https://launchpad.net/~tomitakyle and Fred Parks
> https://launchpad.net/~fparks have been the most active community members
> from our team with Kyle being the 9th on the top contributors list as of
> 9/24/13.
>
>
>
> Catalyst also shared a multithreaded bib reingest that greatly reduces the
> time needed to do a full reingest. We also plan to share the way that
> Catalyst deploys code to KCLS without downtime.
>
>
>
> Catalyst considers itself part of the community and is actively working to
> add more value. We have developed a strong relationship with KCLS and greatly
> enjoy working with them, and that relationship has allowed us to gain a
> strong understanding of Evergreen. We've got some interesting work that we
> are going to be doing in the near future for KCLS, and, as in the
> past, whatever is beneficial to the community will be shared.
>
>
>
> If you would like detail on any of these items now, feel free to reach out
> to me. You have my cell phone number.
>
>
>
> Thanks
>
>
>
> Scott Myers
>
>
>
>
>
> From: open-ils-general-bounces at list.georgialibraries.org
> [mailto:open-ils-general-bounces at list.georgialibraries.org] On Behalf Of
> Rogan Hamby
> Sent: Tuesday, September 24, 2013 7:10 AM
> To: Joshua D. Drake
> Cc: Evergreen Discussion Group
> Subject: Re: [OPEN-ILS-GENERAL] Evergreen & Software Performance Analysis
>
>
>
> Picking back up an old thread...
>
>
>
> I was hoping at some point to hear more about the db work Command Prompt has
> done for KCLS and perhaps see some work in git. I was sad to see in the
> new LJ article that Jed Moffitt said that at this point KCLS has forked
> Evergreen, so I suppose the work Catalyst and Command Prompt have done isn't
> relevant to the rest of the Evergreen community.  I suppose that also means
> that any experience gained in working on the KCLS system isn't
> transferable.
>
>
> On Thu, Aug 22, 2013 at 11:05 AM, Rogan Hamby <rogan.hamby at yclibrary.net>
> wrote:
>
> Hi Joshua,
>
>
>
> I don't know if you had a chance to see my message below, so I'll copy you in
> directly as well and maybe touch base again after Labor Day.  The
> Evergreen community has a rich collection of input from various
> contributors (many, like yourself, paid to do individual development by
> community members), all participating in the open source spirit and putting
> their code out there, allowing others to build on top of it, modify it, or
> package it into master. It would be exciting to see this work, since you've
> indicated it's had a big impact for your customers.
>
>
>
> I did a quick MarkMail search, since I sometimes lose emails to spam filters,
> and noticed that back in February you mentioned that your Evergreen customer has
> been KCLS.  I know that at the conference they talked about setting up a
> public repo that would be available right after the conference.  Maybe they
> can chime in with an update on that?
>
>
>
>
>
> On Fri, Aug 9, 2013 at 11:52 AM, Rogan Hamby <rogan.hamby at yclibrary.net>
> wrote:
>
> Hi Josh,
>
>
>
> Can you share with folks some more specifics?
>
>
>
> For example:
>
>
>
> In regards to optimizing the conf file can you share what kind of
> optimizations and the benchmarks?  E.g. with X records we see Y performance
> in activity Z.
>
>
>
> Many of the other changes obviously involve code and/or schema
> changes.  Are these going to be released on a public repo or fed back into
> master?
>
>
> On Thu, Aug 8, 2013 at 2:01 PM, Joshua D. Drake <jd at commandprompt.com>
> wrote:
>
>
> On 08/07/2013 10:12 AM, Rogan Hamby wrote:
>
> I'm guessing maybe Joshua doesn't keep track of the listserv, but is
> there someone else from Command Prompt, or from whomever they did the
> development work for, who could chime in?  When he says they've made
> improvements, do those include GPLed code?
>
>
>
> Sorry folks, I do watch this list but not as much as the postgresql lists.
> We have also been very busy. Here are some of the basic things we have done:
>
> 1. Optimized postgresql.conf; it is amazing how much you can get from
> some minor tweaks after some performance analysis.
>
> 2. Converted some of the procedures to C, for example translate_isbn1013.
>
> 3. Modified the holds process to use a lookup table.
>
> 4. Changed the process for holds so they don't exist indefinitely but get
> migrated out for reporting, so they no longer affect performance of the
> active table.
>
> 5. Partitioning of larger tables
>
> 6. Upgraded versions of PostgreSQL to more modern versions (this can also
> result in noticeable gains in performance).
>
> 7. Lots of query tuning, adding indexes where appropriate, increasing
> maintenance on particular tables to reduce bloat more aggressively, etc.
>
> As well as various other things (stabilizing the system so there aren't weird
> overloads, unexpected Apache load events, etc.). It certainly has been a
> rather wild ride over the last 9 months as we get further and further into
> the adventure that is the Evergreen software.
>
> Sincerely,
>
> Joshua D. Drake
>
>
>
>
> --
> Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
>
>
> PostgreSQL Support, Training, Professional Services and Development
>
> High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
> For my dreams of your image that blossoms
>    a rose in the deeps of my heart. - W.B. Yeats
>
>
>
>
>
> --
>
>
>
> Rogan Hamby, MLS, CCNP, MIA
>
> Managers Headquarters Library and Reference Services,
>
> York County Library System
>
>
>
> "You can never get a cup of tea large enough or a book long enough to suit
> me."
> -- C.S. Lewis
>
>
>
>
>
