[OPEN-ILS-GENERAL] Here we grow again! Link checker functionality in Evergreen
Duimovich, George
George.Duimovich at NRCan-RNCan.gc.ca
Fri May 11 18:37:21 EDT 2012
Paul,
Thanks also for your suggestion re: politeness. That's something our current linkchecker doesn't specifically address, except by coincidence (we currently don't sort the targeted URLs to be checked by domain).
George
George Duimovich
NRCan Library / Bibliothèque de RNCan
-----Original Message-----
From: open-ils-general-bounces at list.georgialibraries.org [mailto:open-ils-general-bounces at list.georgialibraries.org] On Behalf Of Paul Hoffman
Sent: May 11, 2012 15:14
To: open-ils-general at list.georgialibraries.org; open-ils-dev at list.georgialibraries.org
Subject: [OPEN-ILS-GENERAL] Re: Here we grow again! Link checker functionality in Evergreen
On Fri, May 11, 2012 at 02:55:42PM -0400, Suzannah Lipscomb wrote:
> Equinox Software, Inc. is excited to announce the development of link
> checker functionality in Evergreen.
Uh-oh!
> Evergreen currently has no built-in mechanism for verifying the
> validity of URLs stored in MARC records. The ability to verify URLs
> will be of particular benefit to locations with large electronic
> resource collections. The requirements for this project are being
> developed in partnership with NRCan Library and Statistics Canada
> Library. The technical specifications for this project will be shared
> with the Evergreen Community once they are ready. Equinox developers
> estimate that coding will be completed no later than the end of the
> third quarter of 2012.
As someone who has had to deal with poorly written link checkers and the havoc they wreak, I sincerely hope that one of the requirements will be *politeness* -- specifically, a throttling mechanism that keeps the link checker from hammering away at a server that happens to serve a large number of resources linked to from an Evergreen catalog. (This is actually quite simple in most cases: just shuffle the list of URLs that you check, and don't check too fast. If you have 860,400 links in a catalog you can check ten per second and still finish in 24 hours, but you'd better make sure that those ten per second aren't all requested from the same server!)
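The shuffle-and-throttle idea above could be sketched roughly as follows. This is only an illustration, not anything taken from the Evergreen project or its specifications: the function names polite_order and schedule, and the parameters global_rate and per_host_delay, are hypothetical, and the sketch only plans request start times rather than fetching anything.

```python
import random
from collections import defaultdict, deque
from urllib.parse import urlsplit


def polite_order(urls, seed=None):
    """Shuffle URLs within each host, then interleave hosts round-robin
    so that consecutive requests rarely go to the same server."""
    rng = random.Random(seed)
    by_host = defaultdict(list)
    for url in urls:
        by_host[urlsplit(url).hostname or ""].append(url)

    queues = []
    for host_urls in by_host.values():
        rng.shuffle(host_urls)
        queues.append(deque(host_urls))
    rng.shuffle(queues)

    ordered = []
    while queues:
        # Take one URL from every host that still has URLs left.
        for queue in queues:
            ordered.append(queue.popleft())
        queues = [queue for queue in queues if queue]
    return ordered


def schedule(urls, global_rate=10.0, per_host_delay=1.0):
    """Plan (start_time_in_seconds, url) pairs, in the given order.

    Assumed policy: at most global_rate requests per second overall,
    and at least per_host_delay seconds between requests to one host.
    URLs are taken strictly in order, so a host that is not yet ready
    delays everything queued behind it.
    """
    global_gap = 1.0 / global_rate
    next_free = 0.0   # earliest time the next request may start
    host_ready = {}   # host -> earliest time it may be contacted again
    plan = []
    for url in urls:
        host = urlsplit(url).hostname or ""
        start = max(next_free, host_ready.get(host, 0.0))
        plan.append((start, url))
        next_free = start + global_gap
        host_ready[host] = start + per_host_delay
    return plan
```

A real checker would sleep until each planned start time before issuing its request. For scale, 860,400 links at ten per second is 86,040 seconds, or a little under 24 hours, provided no single host holds so many of the links that its per-host delay dominates the run.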
And -- though it's not clear if this is what's intended or not -- *please* let's not develop an *integrated* link checker for Evergreen at all. A link checker is properly a separate special-purpose tool with a simple, well-defined interface that allows it to be used by any number of applications; it shouldn't be all tangled up in an ILS. There might even be a perfectly good link checker that fits the bill now -- I don't know, as I haven't looked into the matter very closely.
OK, I'll get down off my soap box now. :-)
Paul.
--
Paul Hoffman <paul at flo.org>
Systems Librarian
Fenway Libraries Online
c/o Wentworth Institute of Technology
550 Huntington Ave.
Boston, MA 02115
(617) 445-2914
(617) 442-2384 (FLO main number)