[OPEN-ILS-DEV] Monograph Parts

Dan Wells dbw2 at calvin.edu
Thu Feb 17 18:54:18 EST 2011


Hello Mike,

Thank you for the detailed response.  I'll *try* to keep this reply more
brief, and see if I can highlight a few things which still concern me (and
places where I was not clear).

>> It makes good sense, but I think we could ultimately benefit by putting less
>> emphasis on a bib record point of view and tilting things a bit more towards
>> the item point of view.
> 
> I don't see why items (particularly, barcoded physical items) should
> be the focus.  On the other hand, records are the core of a
> bibliographic system -- everything else (items and their attributes,
> access restrictions for electronic resources, libraries-as-places,
> etc) is a set of filter axes to apply atop the record when searching
> for or manipulating it.  The record is the nucleus, and everything
> else enhances/vivifies/subdivides it.
> 

I do not agree with this, at least not entirely.  Bibliographic records are
very important, but that is in large part due to the current reality and how we
got here.  I think we can agree that libraries exist to organize and provide
access to content (via 'items', whether physical or digital).  A monolithic
record is a convenient descriptive tool, but not the only one, and in the
future may not be the best one.  Slightly loosening the link between items and
records may be just one small step forward.

>>  From the item perspective these proposals are modeling
>> the same thing, a mapping of items to contents, and the fewer ways we have to
>> do that, the better (as long as we cover all the cases).  With a simpler
>> mapping table of item to record(-part), we easily traverse in either direction,
>> and we have ultimate flexibility.
> 
> But I contend that we don't traverse in either direction for a given
> relationship, nor do we have a need for ultimate flexibility at the
> cost of complexity.  (I'm not referring to schema complexity here, but
> code complexity -- the need for inferences that will certainly come
> from commingling aggregation and dis-integration.)
> 
> The direction of traversal is critical, and should be ensconced in the
> schema.  Not only does this make the code for each function simpler
> (we don't have to infer a relationship, it's dictated by the fact that
> we're using part or multi-home) but it models what libraries actually
> do: barcode parts of a work (volumes, disks, etc); or, collect
> manifestations of many records into a big binder (or e-reader) with
> one barcode on the outside.
> 
>>  So, if I have a bib record on my screen, and
>> I ask the question, "which items' contents does this record represent?", we can
>> simply go record->part(s)->item(s).
> 
> IMO, the question one would ask is, "what and where are the things
> (nominally, barcoded physical items) that contain what I describe?"
> ISTM that it's very important to know, and perhaps even critical for
> efficient workflow, to list separately subsets of what a record
> describes (parts) and "bound" items that contain the described work
> along with others.  With a unified map there's no mechanism other than
> a magic value (or a human hoping to interpret another human's label
> correctly) to distinguish these concepts.  With parts and multi-home
> separate, it's obviously a natural property.
> 

***EDIT***
I might (finally!) understand your perspective; see the paragraph near the end
***/EDIT***
I think this is the point where I am missing something.  Why distinguish the
concepts?  I think all we need to model is the concept of "contains" (copy
contains part).  If we are dealing with a record/part, we can list the copies
which contain it, and if we are dealing with a copy, we can list what it
contains.  What is the source of the ambiguity?  We dis-integrate first (where
needed), then aggregate the parts.
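
A minimal sketch of the single "contains" relation, indexed in both
directions.  The barcodes, record names and part labels are made up for
illustration, and none of this is Evergreen code:

```python
from collections import defaultdict

# Hypothetical (copy barcode, part) pairs; a part is (record, label).
CONTAINS = [
    ("12345", ("Record A", "Vol. 1")),
    ("12345", ("Record B", "Whole")),
    ("67890", ("Record A", "Vol. 2")),
]

def build_indexes(pairs):
    """Index the one 'copy contains part' relation in both directions."""
    parts_by_copy = defaultdict(set)
    copies_by_part = defaultdict(set)
    for copy, part in pairs:
        parts_by_copy[copy].add(part)
        copies_by_part[part].add(copy)
    return parts_by_copy, copies_by_part

def copies_for_record(record, copies_by_part):
    """record -> part(s) -> item(s)"""
    found = set()
    for (rec, _label), copies in copies_by_part.items():
        if rec == record:
            found |= copies
    return sorted(found)

def records_for_copy(copy, parts_by_copy):
    """item -> part(s) -> record(s)"""
    return sorted({rec for rec, _label in parts_by_copy.get(copy, ())})

parts_by_copy, copies_by_part = build_indexes(CONTAINS)
```

Both questions are answered from the same rows; nothing in the data says
whether a row came from splitting a record or from binding items together.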

>> On the other hand, if I have an item, and
>> I ask the question "what are the contents of this item?", we can go
>> item->part(s)->record(s).  Naturally we can traverse related records (via
>> items) and related items (via records/parts) as well.
> 
> This is directly supported by multi-homed items, with the exception
> that you do need to look at the call number to get the primary record.
>  I don't see a practical drawback to this, since that's what the code
> already does, and will still have to do as long as the record field
> exists on asset.call_number (null-ability or elimination of which is
> mentioned below).
> 
>> This also eliminates the
>> primacy of call numbers when managing items, which I see as a benefit.
>>
> 
> There are three problems I see here:
> 
>   * Call numbers will always have a first-class billing, regardless of
> how they're implemented, since they represent something physical. Two
> things, actually: the location in a range of other items (shelf order
> and position), and a tag pasted to the spine of the item.
>   * I can't see any obvious benefit to eliminating the
> record<->call_number link (mentioned directly below, and intimated
> here)
>   * The mounds and mounds of code that assume and depend on the
> existence of the record->call number->copy hierarchy that will
> instantly break
> 

The immediate benefit to breaking this link is that an item (and by
association its call number) can now fully exist in the context of any record
which describes it, even if only in part.  We could transition by using code
which builds the current hierarchy dynamically (that is, go
record->copy->call_number, then attach the call number to the current record
context).  So if item 12345 with Call Number ABC123 is linked through the
contents map to both Record A and Record B, when viewing A we see:

Record A
--ABC123
----12345

and of course the same with B:

Record B
--ABC123
----12345

The item row might somehow indicate its 'special'-ness (which is going to be
needed in some way regardless), but would be otherwise transparent.  It is also
not strictly necessary to null the call_number.record_id value, as we can just
as easily overwrite it temporarily as needed, and it could be a useful
fallback.
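
As a rough sketch of building that hierarchy dynamically, the following uses
an in-memory SQLite database with simplified, hypothetical stand-ins for the
tables (plain names instead of the asset/biblio schemas, and a text record
name instead of a record id).  It walks record->part->copy->call_number and
then prints the call number under whichever record is being viewed:

```python
import sqlite3

def build_db():
    """Create a tiny, simplified stand-in for the tables under discussion."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE call_number (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
        CREATE TABLE copy (
            id INTEGER PRIMARY KEY,
            barcode TEXT NOT NULL,
            call_number INTEGER NOT NULL REFERENCES call_number (id)
        );
        CREATE TABLE part (
            id INTEGER PRIMARY KEY,
            record TEXT NOT NULL,
            label TEXT NOT NULL
        );
        CREATE TABLE copy_contents_map (
            part INTEGER NOT NULL REFERENCES part (id),
            target_copy INTEGER NOT NULL REFERENCES copy (id)
        );
        INSERT INTO call_number VALUES (1, 'ABC123');
        INSERT INTO copy VALUES (1, '12345', 1);
        INSERT INTO part VALUES (1, 'Record A', 'Whole');
        INSERT INTO part VALUES (2, 'Record B', 'Whole');
        INSERT INTO copy_contents_map VALUES (1, 1);
        INSERT INTO copy_contents_map VALUES (2, 1);
    """)
    return db

def holdings_tree(db, record):
    """Render record / call number / barcode for the record being viewed."""
    rows = db.execute("""
        SELECT DISTINCT cn.label, cp.barcode
          FROM part p
          JOIN copy_contents_map m ON m.part = p.id
          JOIN copy cp ON cp.id = m.target_copy
          JOIN call_number cn ON cn.id = cp.call_number
         WHERE p.record = ?
         ORDER BY cn.label, cp.barcode
    """, (record,)).fetchall()
    lines = [record]
    seen_labels = set()
    for label, barcode in rows:
        if label not in seen_labels:  # print each call number once
            seen_labels.add(label)
            lines.append("--" + label)
        lines.append("----" + barcode)
    return "\n".join(lines)

db = build_db()
print(holdings_tree(db, "Record A"))
print(holdings_tree(db, "Record B"))
```

The call number is reached through the copy here, so it never needs a record
link of its own to appear under both records.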

>> Or stated more simply, I feel our foundational assumptions in relating items
>> to records should be:
>> 1) Records describe contents
>> 2) Items contain contents
>> 3) Item content boundaries can overlap record content boundaries in various
>> ways
>>
> 
> I see this as an oversimplification from the conceptual point of view
> -- it fails to recognize that the arity of the relationship (which I
> call direction) is important and different.  IOW, record<<->item
> serves a completely different function from record<->>item, and
> forcing them both through a record<<->>item relational model does both
> a disservice.
> 

I can't agree with "completely" different, and if you view the record/part as
a sort of really expressive tag of some kind, I feel like they are not so
different at all.

>> All that said, I know from experience to trust your judgement (most of the
>> time ;).  For my own future benefit, do you have cases already in mind where
>> this flexibility would end up causing 'split-brain' logic?  (Or maybe I have a
>> split brain...)
>>
> 
> Split-brain is probably a misnomer ... we have to commingle the logic
> for aggregation and dis-integration (disaggregation?) wherever we use
> either.
> 
> From a practical point of view, here are some more random-ish thoughts
> that don't seem to fit directly into this response elsewhere ... ;)
> 
> When going from records to items (via the Monograph Parts
> infrastructure as described), we need to be able to label the
> subdivision that the part represents in relation to the record as a
> whole -- we need to be able to say "barcode X contains only Volume 1
> of the content described by record A".  This is not something we need
> to do for binding in the general case (note, however, that you can
> indeed use both at the same time -- multi-home and parts -- to get the
> effect of "volume 1 of record A is bound with some other things").

Correct me if I am wrong, but you can only do this if that "some other thing"
is not another part.  So if, for instance, I have a set of books, each with a
different record, and each including a 'CD supplement', I cannot create a copy
which is a binder containing all the CD supplements.  Or, if I have a
multi-volume work in two languages, I cannot bind the English and French V.1s
(etc.) together.  Or if I buy a few e-book Bibles, I cannot put all the Old
Testaments on reader 1 and all the New Testaments on reader 2.  These limits
are a direct result of one-part-per-copy, and multi-home doesn't change that,
does it?
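
To make the limit concrete, here is a small sketch with hypothetical rows and
names.  It only counts parts per copy, so it shows which of the cases above
a one-part-per-copy rule would reject; it is not a model of the proposed
implementation:

```python
from collections import Counter

# Hypothetical (copy, record, part label) rows for the CD-binder case.
ROWS = [
    ("CD-BINDER", "Book 1", "CD supplement"),
    ("CD-BINDER", "Book 2", "CD supplement"),
    ("BOOK1-TEXT", "Book 1", "Text"),
]

def copies_over_limit(rows, limit=1):
    """Return the copies that are mapped to more parts than `limit` allows."""
    counts = Counter(copy for copy, _record, _label in rows)
    return sorted(copy for copy, n in counts.items() if n > limit)

print(copies_over_limit(ROWS))           # copies a one-part-per-copy rule rejects
print(copies_over_limit(ROWS, limit=2))  # with two parts allowed, none are rejected
```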

> Also, the only purpose of the record-to-item path is to dis-integrate
> the record into constituent, separately barcoded items, so there is
> only one relationship type.
> 
> However, going in the other direction, from items to records (via
> Multi-homed Items as described) we do not need a label -- what we need
> instead is a /reason/ for the relationship.  Bound-with, e-reader,
> etc.  IOW, there are multiple potential causes for the relationship
> being created.
> 

With the possible exception of bilingual, it seems to me that the records
themselves have no special relationship, but rather that the relationship only
exists at the item level.  As such, we don't actually need a reason.  These
labels can usefully describe the character of an item, so it makes sense to
include them as a copy attribute if one does not wish to make a new item type.

> Not surfacing these differences explicitly (in my case, by using
> separate, though admittedly superficially similar mechanisms) is
> inviting trouble down the road, IMO.
> 
> Now (fastforwarding to your schema outline below), IIUC, what you're
> attempting to do with the copy_type table is to have a magic value of
> "Multi-part" inform us that the direction is from record to item, and
> all others are the other direction.  From a bibliographic point of
> view this is incorrect -- it's not the copy that is Multi-part, it is
> the record.  From a normalization point of view, this is not modeling
> reality IMO, and because it uses magic rows in a table it's brittle
> against DML.
> 

That was not my intention.  The copy_type does not need to be set at all,
other than for convenience of labeling as I noted above.  "Multi-part" is not
intended as magic, just a generic way to say "this item shows up on more than
one record, but the reason why can't be neatly expressed in a label" (and maybe
not the best choice of term at that, especially since I used the word 'part'
('multi-record'?)).  Probably should have left it out!

>> Also, I think this quote from Elaine deserves a bit more attention:
>>
>>>> I'm particularly interested in how this would function in a consortium
>>>> like PINES where different libraries might process a multipart set
>>>> differently. For example, one library might process and circulate a 3
>>>> part DVD set as one item, where another might put each in a separate
>>>> container with a separate barcode.
>>
>> If we want the complete-set copy from Library A to conclusively fulfill a
>> P-level hold from Library B, we will want to allow multiple parts per copy.  Or
>> am I missing something?
>>
> 
> You're not ... I interpreted what she was saying differently (that
> different libraries would be /able/ to split records along different
> lines), and I see what you're saying.  We could allow a copy to belong
> to multiple parts (it's a trivial change to the schema), but it would
> be the responsibility of the cataloger with the item in hand to make
> sure that the copy is in the appropriate parts -- not hard, except
> that some parts may not exist yet. ;)  (And, of course, this
> existential problem exists no matter the scheme*.)
> 

I was not expecting that libraries in the same system could divvy up the
record differently, but rather that the parts would be set globally at a
reasonable common denominator and then assembled locally as needs dictated.  I
am certainly fine with allowing local divvying to happen, but by not even
allowing multiple parts per copy, we are effectively forcing an immediate
choice between local part-bundling practice and accurate resource sharing.
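
A sketch of the resource-sharing case, using Elaine's 3-part DVD set.  The
barcodes and part labels are hypothetical; the only assumption about design
is that parts are defined once for the record and a copy may hold several of
them:

```python
# Hypothetical copies of one DVD-set record whose parts are Disc 1..3.
COPY_PARTS = {
    "LIB-A-SET": {"Disc 1", "Disc 2", "Disc 3"},  # Library A: one barcode for the set
    "LIB-B-D1": {"Disc 1"},                       # Library B: one barcode per disc
    "LIB-B-D2": {"Disc 2"},
    "LIB-B-D3": {"Disc 3"},
}

def copies_fulfilling(part_label, copy_parts):
    """Return every copy that contains the part a part-level hold asks for."""
    return sorted(
        barcode for barcode, parts in copy_parts.items() if part_label in parts
    )

print(copies_fulfilling("Disc 2", COPY_PARTS))
```

With one part per copy, Library A's set could be attached to at most one of
the three discs, so it could not be offered for holds on the other two.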

> Converting from one part per copy to multiple is simple at the
> database level, and would be nearly trivial in higher level code, but
> until we have use in the field I think it's a solution without a
> problem, because of the cataloging overhead of trying to keep every
> copy current across all parts as parts are added to a bib when each
> library adds their own subdivision scheme for the bib.  For that
> reason I left it out explicitly.  (*It also invites the desire for a
> "collection of parts" concept that is a much bigger, and more
> importantly, controversial project.  That too, though, is not barred
> from the future with the design as it stands.)
> 
>> Finally, for those it may help, here is a quick version of a simple
>> item-record schema.  The part concerning copy_type is optional, but I wanted it
>> to show a more complete replacement for the proposed tables:
>>
>> CREATE TABLE biblio.part (
>>        id SERIAL PRIMARY KEY,
>>        record BIGINT NOT NULL REFERENCES biblio.record_entry (id),
>>        label TEXT NOT NULL,
>>        label_sortkey TEXT NOT NULL,
>>        CONSTRAINT record_label_unique UNIQUE (record,label)
>> );
>>
>> CREATE TABLE asset.copy_contents_map (
>>        id SERIAL PRIMARY KEY,
>>        --record BIGINT NOT NULL REFERENCES biblio.record_entry (id),
>>        -- (optional path to partless items, or we force records to have at
>>        -- least one part)
>>        part INT NOT NULL REFERENCES biblio.part (id) ON DELETE CASCADE,
>>        target_copy BIGINT NOT NULL -- points to asset.copy
>> );
>>
>> CREATE TABLE asset.copy_type (
>>        id SERIAL PRIMARY KEY,
>>        name TEXT NOT NULL UNIQUE -- i18n
>> );
>>
>> INSERT INTO asset.copy_type (name) VALUES
>>        ('Bound Volume'),
>>        ('Bilingual'),
>>        ('Back-to-back'),
>>        ('Set'),
>>        ('Multi-part');  --generic type
>>
>> -- ALSO:
>> -- asset.copy grows a nullable reference to asset.copy_type
>> -- asset.call_number.record is nullable (should be null for new-style copies)
>>
> 
> Given the codebase, that will be a large and separate project, if ever
> undertaken, and is not something we can look at now if we want
> anything discussed here to happen in a near-term release.  I won't
> discount it out of hand for all time, just for this time. ;)
> 

While it took me a (long) while to realize it, I think the source of our
disagreement may be what I will call the "is-ness" factor.  Does a bib record
tell us what an item *contains*, or does it tell us what an item *is*?  Well,
traditionally it tries to do both, and it has always been a problem.  I have been
unwittingly assuming that describing contents matters more and more, and
describing containers matters less and less.  Doing so makes it difficult to
truly represent a content-less container record (like an e-book reader record),
but if we no longer need such things (because the item already appears wherever
the contents are described), maybe it is not such a loss.

I understand that my perspective is not always (ever?) the most realistic.  My
aim is only to try to encourage a little more pain now if it even *might* save
us from greater pain in the future.  Since I know you are a speedy and tireless
worker, it may be best at this point to just wait and see the code, which will
probably illuminate for me some of the issues I don't yet see.

Dan

> --miker
> 
>> Dan
>>
>>
>>
>>> --miker
>>>
>>>> Dan
>>>>
>>>> --
>>>>
>>>> *********************************************************************************
>>>> Daniel Wells, Library Programmer Analyst dbw2 at calvin.edu 
>>>> Hekman Library at Calvin College
>>>> 616.526.7133
>>>>
>>>>
>>>>>>> On 2/15/2011 at 3:09 PM, Mike Rylander <mrylander at gmail.com> wrote:
>>>>> I'll be starting work on an implementation of Monograph Parts (think:
>>>>> Disks 1-3; Volume A-D; etc), well, right now, in a git branch that
>>>>> I'll push to http://git.esilibrary.com/?p=evergreen-equinox.git;a=summary
>>>>> but I wanted to get the basic plan out there for comment.  So,
>>>>> attached you'll find a PDF outlining the plan.  Comments and feedback
>>>>> are welcome, but time for significant changes is slim.
>>>>>
>>>>> This is not intended to cover every single possible use of the concept
>>>>> of Monograph Parts, but I believe it is a straight-forward design that
>>>>> offers a very good code-to-feature ratio and should be readily used by
>>>>> existing sites after upgrading to a version containing the code.
>>>>
>>>>
>>>
>>
> 

