[OPEN-ILS-GENERAL] Programmatic Merging of Bibliographic Records

Blake Henderson blake at mobiusconsortium.org
Mon Apr 25 15:46:22 EDT 2016


> Whatever method you use I heartily recommend doing so on a testing 
> system and having catalogers look over the results first.
> You may have already done all the due diligence but I say it for 
> anyone reading along as well.  I've never had problems with
> this method and have heard from others who had success with it 
> as well, but I also heard from at least one whose data
> was apparently different enough that it was not a clean merge.  Caveat 
> usor, let the user beware.

We used this method to identify duplicate records. We found that it 
merged electronic resources with books, and it connected other formats 
as well. We learned the hard way that we need better MARC records 
before running such a tool. Many MARC records from LOC, for example, 
include the ISBNs of all the related formats. We subsequently wrote an 
enormous amount of code to "guess" the correct format for all of our 
bibs before deduping them. It uses phrase matching in the MARC. We 
presented this at the 2015 Evergreen conference. Slides here: 
http://slides.mobiusconsortium.org/blake/evergreencatclean/#/
If you are curious, ping me.
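
For anyone reading along, the idea of guessing a format first and only 
then deduping can be sketched roughly as below. This is a minimal Python 
illustration, not the actual MOBIUS code: the phrase table, the record 
layout (id, isbns, text) and both function names are assumptions made up 
for the example. It only lists candidate pairs and merges nothing.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical phrase table: these phrases are illustrative only,
# not the list used in the real cleanup code.
FORMAT_PHRASES = {
    "ebook": ("electronic resource", "online resource", "ebook"),
    "audiobook": ("sound recording", "audiobook", "compact disc"),
    "large_print": ("large print", "large type"),
}


def guess_format(marc_text):
    """Return the first format whose phrase occurs in the text, else 'book'."""
    lowered = marc_text.lower()
    for fmt, phrases in FORMAT_PHRASES.items():
        if any(phrase in lowered for phrase in phrases):
            return fmt
    return "book"


def merge_candidates(records):
    """Pair records that share an ISBN AND the same guessed format."""
    groups = defaultdict(set)
    for rec in records:
        fmt = guess_format(rec["text"])
        for isbn in rec["isbns"]:
            groups[(isbn, fmt)].add(rec["id"])
    pairs = set()
    for ids in groups.values():
        pairs.update(combinations(sorted(ids), 2))
    return sorted(pairs)


# Made-up records: 1 and 2 carry the ISBNs of both formats,
# which is the situation that caused bad merges.
records = [
    {"id": 1, "isbns": ["9780000000001", "9780000000002"],
     "text": "xiv, 320 pages ; 24 cm"},
    {"id": 2, "isbns": ["9780000000001", "9780000000002"],
     "text": "1 online resource (320 pages)"},
    {"id": 3, "isbns": ["9780000000001"],
     "text": "320 p. ; 24 cm."},
]

print(merge_candidates(records))
```

Matching on ISBN alone would pair record 2 (the online resource) with 
the two print records; keying on ISBN plus guessed format keeps it out. 
The resulting pairs are meant for cataloger review on a test system 
before anything is merged.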

-Blake-
Conducting Magic
MOBIUS

On 4/25/2016 2:04 PM, Rogan Hamby wrote:
> Hi Jim,
>
> It is available. To be clear I helped create the de-duplication 
> algorithm but the actual coding was done by Galen Charlton of 
> Equinox. You can find it here:
>
> http://git.esilibrary.com/?p=migration-tools.git;h=300a04108fc6a3d14424c6d365329be334114f7d
>
> The full scope of the script goes a bit beyond the original question 
> as it also does de-duplication before the merging.  The merging work 
> is done by the merge_record_assets function that Jason referenced.
>
>
> On Mon, Apr 25, 2016 at 2:36 PM, swills at beyond-print.com wrote:
>
>     Rogan Hamby shared his work with me.  It's a set of SQL procedures
>     that produce a 'best bib' and then identify the less interesting
>     duplicate, and it seems to work well.  I modified it so that it
>     produces the candidates but doesn't actually do the merge, since we
>     like to have that personal touch up in Maine.  I'm not sure whether
>     it is in the Evergreen repos or not.
>
>     Rogan, can you help?  Thanks again.
>
>     Steve Wills
>
>>     On April 25, 2016 at 2:24 PM Jim Taylor <jtaylor at jtdata.com>
>>     wrote:
>>
>>     I raised the question at the conference regarding the ability to
>>     merge records outside the program interface and was told there
>>     was a procedure/function that would allow this to be done.  Does
>>     anyone know where I can find this function?   My searching has
>>     availed me naught.  I found something under the Vandelay tables
>>     but not sure it is what I am needing as the above mentioned
>>     function is supposed to take two tcn numbers.
>>
>>     Thanks.
>>
>>     Jim
>>
>
>
>
>
> -- 
> --------------------------------------------------------------
> Rogan R. Hamby, Data and Project Analyst
> Equinox - Open Your Library
> rogan at esilibrary.com
> 1-877-OPEN-ILS | www.esilibrary.com
>
