[OPEN-ILS-DEV] Serials Schema Proposal - Further De-emphasis of MARC as Record Format

Mike Rylander mrylander at gmail.com
Mon May 24 14:04:35 EDT 2010


On Mon, May 24, 2010 at 11:03 AM, Dan Wells <dbw2 at calvin.edu> wrote:
> Hello,
>
> I will try to be briefer than usual, as I really want to get some initial reactions before I go too far in trying to think this out.  Basically, when I began working on serials, I kept to the idea that an MFHD record would be central to both predictions and holdings.  Because MFHD is such an opaque and complex format, this adds some fairly serious complexity and overhead to serials management (predicting, receiving, claiming, discarding, and displaying).  What we really want is to maintain full compatibility with the MFHD standard without the overhead of maintaining a full MFHD record.  The simplest way to do this is to keep MFHD field data in some stringified form (where appropriate), but to associate this data directly with the element it represents.  This is precisely what we have done with the issuance table, as it has fields for storing the enumeration/chronology statement (863-5) which the issuance represents.  We have also parted out the textual holdings (866-8) to be directly associated with the appropriate holding library via the distribution table (and friends), and location and item information are handled by the item and unit tables.
>
> At this point we really need to ask, what's left?  The answer: not a whole lot.  Chief among the survivors are the caption and pattern statements (853-5).  I am proposing we get those out as well.  We will continue using the MFHD data, but it will be stored where it is most relevant, not as a central record (which we could always regenerate for export purposes).  We will need at least one new table, something like:
>
> serial.caption_and_pattern (
>    id      SERIAL  PRIMARY KEY,
>    base    INT     NOT NULL REFERENCES serial.base (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
>    type    TEXT    NOT NULL CHECK (type IN ('basic','supplement','index')),
>    active  BOOL    NOT NULL DEFAULT FALSE,
>    marc    TEXT    NOT NULL,
>    enum_1  TEXT    DEFAULT NULL,
>    enum_2  TEXT    DEFAULT NULL,
>    ...
> );
>
> The first five columns are essential, but I can see benefits from separate columns for each level of enumeration and chronology captions as well.  I also propose (as evidenced by column 2) that we create a new root table called 'serial.base' (or something similar).

I'm confused about what serial.base is.  I think Scott mentioned this as well.

> What do we gain by doing these things?
>
> 1) In the most recent schema, holdings data is linked to its matching caption/pattern via subfield 8s in various fields 'hidden' within the marc column of serial.record_entry.  While it is therefore well-defined, it is also invisible to the DB layer.  A separate caption_and_pattern table will allow the holdings data to reference its caption/pattern data directly.
> 2) If we proceed (as planned) to create columns in both caption_and_pattern and serial.issuance to hold the non-repeatable enumeration and chronology fields, we can reliably query for and display a single issuance or group of issuances without consulting the MARC at all.
> 3) We can easily augment places where the MFHD standard is weak.  One very obvious place is its failure to identify whether a pattern should be considered active or not, a problem this table easily rectifies.
> 4) The only 'serial' table currently in production use is serial.record_entry.  Rather than altering or repurposing it, we can allow it to hang on as-is for those libraries (like us, Laurentian, and perhaps others) which have invested in it (we have around 1,650 of these records in use).  Doing so will allow us to phase in the new system on our own terms.
>
> Well, this is getting longish, so I'm going to stop there and hopefully get the first reactions I am seeking.  Thank you very much for your time and thoughts.
>

How about leaving serial.record_entry where it is (no marc column on
serial.caption_and_pattern, unless I'm misunderstanding the point of
that column) and pointing from serial.caption_and_pattern to
serial.record_entry?
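
To make that concrete, here is a rough, runnable sketch of the shape I
have in mind. It is only an illustration: it uses Python's sqlite3
module as a stand-in for PostgreSQL, the table names drop the serial.
schema prefix, the record_entry column name on caption_and_pattern is
my own placeholder, and I have left out DEFERRABLE to keep the demo
simple. The MARC stays on record_entry and each c-a-p row just points
at it.

```python
import sqlite3


def build_schema():
    """Create stand-in tables: caption_and_pattern points at record_entry."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless asked.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE record_entry (          -- stand-in for serial.record_entry
            id   INTEGER PRIMARY KEY,
            marc TEXT NOT NULL               -- the MFHD keeps living here
        );
        CREATE TABLE caption_and_pattern (   -- no marc column of its own
            id           INTEGER PRIMARY KEY,
            record_entry INTEGER NOT NULL    -- hypothetical column name
                         REFERENCES record_entry (id) ON DELETE CASCADE,
            type         TEXT NOT NULL
                         CHECK (type IN ('basic', 'supplement', 'index')),
            active       INTEGER NOT NULL DEFAULT 0
        );
    """)
    return conn
```

With this layout, a c-a-p row cannot exist without its MFHD, and
deleting the record_entry row takes its c-a-p rows with it.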

Is there a reason to have more than one active caption_and_pattern row
of each type (basic, supplement, index) for a given MFHD? If not, we
can constrain the c-a-p rows to one active row per type per MFHD with
a unique conditional (partial) index.
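
A minimal sketch of that constraint, again using Python's sqlite3 as a
stand-in for PostgreSQL (the CREATE UNIQUE INDEX ... WHERE form exists
in both). The index name, the record_entry column, and the add_pattern
helper are placeholders of mine, not part of the proposed schema. The
index only covers rows where active is true, so any number of inactive
patterns can coexist, but a second active row of the same type for the
same MFHD is rejected.

```python
import sqlite3


def build_demo():
    """Create a stand-in caption_and_pattern table with the partial index."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE caption_and_pattern (
            id           INTEGER PRIMARY KEY,
            record_entry INTEGER NOT NULL,   -- hypothetical link to the MFHD
            type         TEXT NOT NULL
                         CHECK (type IN ('basic', 'supplement', 'index')),
            active       INTEGER NOT NULL DEFAULT 0
        );
        -- Uniqueness applies only to active rows.
        CREATE UNIQUE INDEX cap_one_active_per_type
            ON caption_and_pattern (record_entry, type)
            WHERE active = 1;
    """)
    return conn


def add_pattern(conn, record_entry, cap_type, active):
    """Insert a row; return False if the database rejects it."""
    try:
        conn.execute(
            "INSERT INTO caption_and_pattern (record_entry, type, active) "
            "VALUES (?, ?, ?)",
            (record_entry, cap_type, int(active)),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

For example, after adding one active 'basic' pattern for MFHD 1, an
inactive 'basic' pattern for the same MFHD still goes in, while a
second active one does not.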

Overall, +1.  Thanks, Dan!

-- 
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  miker at esilibrary.com
 | web:  http://www.esilibrary.com

