[OPEN-ILS-DEV] Re: Serials Schema Proposal - Further De-emphasis of MARC as Record Format

Dan Wells dbw2 at calvin.edu
Wed May 26 18:12:24 EDT 2010


Hello Scott,

First, sorry for the conflation of suggested changes in my original email.  It is really two separate proposals:

1) add the serial.caption_and_pattern table, as outlined
2) create a new table (serial.serial (aka serial.base)) in the schema to function in place of serial.record_entry going forward (retaining serial.record_entry for legacy use only).

The logic for (1) was fairly well stated (I think).  The logic for (2) is:
 a. moving the caption/pattern fields out of the 'marc' column (aka MFHD record) in serial.record_entry doesn't leave enough of value in the 'marc' column, so the column should be dropped or at least nullable
 b. it doesn't make much sense to have a null 'marc' value in a table named 'record_entry' (implying a MARC record entry, e.g. biblio.record_entry), so the table should be renamed (e.g. serial.serial)
 c. if we are both repurposing and renaming the table, it makes sense to keep serial.record_entry around untouched for legacy use (i.e. the libraries who have already loaded MFHD records using the current basic functionality).  serial.serial will be largely the same, but with NO MARC AT ALL :)

So...

> As I understand it, the marc in serial.caption_and_pattern would *not*
> be a copy of the marc in serial.record_entry, but a subset of it, or
> somehow derived from a subset of it.  Is that right?

Yes, correct.  Just a stringified version of the directly related 85X field would be kept here.

> Your proposed serial.caption_and_pattern table contains columns named
> enum_1, enum_2, and apparently (judging from the ellipsis) a series of
> enum_[0-9]* columns.  On its face, that doesn't look very normalized to
> me.  Is there a firm, well behaved limit on the number of enum columns?
> Is that a reflection of how MFHD records work (of which I am supremely
> ignorant)?  Or would it make sense to add a child table to hold the
> enums?

Yes, there is a firm limit; sorry for being lazy.  The table will have 6 enum fields and 5 chron fields.  That is the extent of the standard, and certainly a reasonable limit.
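
To make the fixed column count concrete, here is a rough sketch of the table shape, built in an in-memory SQLite database from Python.  Only enum_N and 'marc' are names from the proposal; the chron_N names, the 'id' column, and all column types are assumptions for illustration, and SQLite has no 'serial' schema, so the table name is unqualified here.

```python
import sqlite3

ENUM_LEVELS = 6   # enum_1 .. enum_6, per the limit stated above
CHRON_LEVELS = 5  # chron_1 .. chron_5 (column naming is an assumption)

def caption_and_pattern_ddl():
    """Build a CREATE TABLE statement with a fixed set of caption columns."""
    cols = ["id INTEGER PRIMARY KEY", "marc TEXT"]  # types are assumptions
    cols += [f"enum_{n} TEXT" for n in range(1, ENUM_LEVELS + 1)]
    cols += [f"chron_{n} TEXT" for n in range(1, CHRON_LEVELS + 1)]
    return "CREATE TABLE caption_and_pattern (\n    " + ",\n    ".join(cols) + "\n)"

def column_names(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

conn = sqlite3.connect(":memory:")
conn.execute(caption_and_pattern_ddl())
print(column_names(conn, "caption_and_pattern"))
```

Because the number of levels is bounded, the flat column layout stays small, and a child table for the enums buys little.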

> The name "caption_and_pattern" bothers me a bit too -- not because the
> name itself matters much, but because it suggests, or induces, a bit of
> muddlement.  Does a row in this table contain a caption *and* a pattern?
> Or maybe one or the other?  Or maybe both?  Or neither?  How do we store
> a caption differently from a pattern -- in different enums?

The caption parts are distinct and knowable.  The pattern parts are much more fluid.  I think it is reasonable to model the caption parts directly, but the pattern parts will be stored in blob form (i.e. the 'marc' column).  As for keeping them in one table, there are three related reasons.  One, they are in the same field in the MFHD standard (maybe not the best reason, but a convenient one).  Two, a pattern only makes sense in the context of the caption (e.g. we have 4 "No." per "V."), so it makes sense to edit and store them together.  Three, due to reason two, captions and patterns will always exist in a one-to-one relationship; storing them together makes that clear.
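
As a toy illustration of reason two, the sketch below pairs two captions with a single pattern value and uses them together to label the next issue.  The class, the 'issues_per_volume' field (a stand-in for what would really live in the 'marc' blob), and the assumption that issue numbering restarts with each volume are all hypothetical, not part of the proposal.

```python
from dataclasses import dataclass

@dataclass
class CaptionAndPattern:
    enum_1: str             # first-level caption, e.g. "V."
    enum_2: str             # second-level caption, e.g. "No."
    issues_per_volume: int  # hypothetical stand-in for the pattern data

def next_issue(cp, volume, number):
    """Advance one issue, assuming numbering restarts with each new volume."""
    if number < cp.issues_per_volume:
        return volume, number + 1
    return volume + 1, 1

def label(cp, volume, number):
    """Format an issue using the captions, e.g. 'V.13 No.1'."""
    return f"{cp.enum_1}{volume} {cp.enum_2}{number}"

cp = CaptionAndPattern(enum_1="V.", enum_2="No.", issues_per_volume=4)
print(label(cp, *next_issue(cp, 12, 4)))  # V.13 No.1
```

The "4 per volume" value means nothing without knowing which caption levels it relates, which is the argument for keeping both in one row.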

> "Serial.serial" does have a certain alliterative appeal -- like "Sirhan
> Sirhan", or "Boutros Boutros-Ghali."

This might have been a joke, but I'll gladly give 'serial.serial' a +1.

> We could also consider creating a whole new "ser" schema, with a
> "ser.serial" table.  Anybody using the old "serial" schema could keep
> it around without interference until they're ready to blow it away, or
> forever if they want.

The *only* table being used in the current serial schema is record_entry.  Keeping our new tables in 'serial' and letting 'record_entry' stick around as legacy shouldn't cause too much confusion, IMO.

Ultimately, as Mike suggested in IRC, this is really not a critical change.  I am currently coding with this setup in mind, but reverting/revising later won't be the end of the world.

Thanks again for all the help,
Dan





-- 
*********************************************************************************
Daniel Wells, Library Programmer Analyst dbw2 at calvin.edu
Hekman Library at Calvin College
616.526.7133


>>> On 5/26/2010 at 11:18 AM, Scott McKellar <mck9 at swbell.net> wrote:
> --- On Mon, 5/24/10, Dan Wells <dbw2 at calvin.edu> wrote:
> 
>> >>> On 5/24/2010 at 12:25 PM, Scott McKellar
>> <mck9 at swbell.net>
> 
> <snip>
> 
>> 1-3.  Sorry for not being more clear, but I think
>> serial.base (or whatever it is called) will be a direct
>> replacement for serial.record_entry everywhere it is used in
>> the new schema.  So for starters it will have all the
>> fields in record_entry minus 'marc'.  We might consider
>> going without the 'edit' related fields as well, since there
>> won't be much to edit there anymore.
> 
> I believe Mike Rylander has proposed leaving serial.record_entry in
> place, to serve the same role as your serial.base.  The new
> serial.caption_and_pattern table would then be a child of
> serial.record_entry.  We might want to make the marc column nullable
> in serial.record_entry.
> 
> Maybe a different table name is in order, e.g. record_entry_detail.
> 
> <snip>
> 
>> > 5. Can we come up with a better name than
>> "serial.base"?  It's too
>> > vague.  It could represent sodium hydroxide, the
>> std::basic_string
>> > class, or Wright-Patterson Air Force Base.  Maybe
>> "serial.periodical"?
>> > 
>> 
>> 5.  I agree that serial.base seems too generic. 
>> Honestly it should probably be called 'serial.serial', as
>> some would argue that the term 'periodical' doesn't
>> technically include newspapers (and probably a few other
>> minor things).  Again, however, I am willing to be more
>> pragmatic than technical if people are opposed to a
>> 'serial.serial' table.
> 
> 
> Scott McKellar

