[OPEN-ILS-DEV] Monograph Parts

Mike Rylander mrylander at gmail.com
Wed Feb 16 20:27:44 EST 2011


First, thanks, Dan, for taking the time to consider the design.  I
know we've been down this road before, and I'm committed to moving
this functionality forward, so I've made some cuts and simplifications
to the earlier grand designs we've discussed.  Also, having ruminated
on this for quite a while in the context of practice, and discussions
about day-to-day needs from a relatively broad set of libraries, my
thoughts have crystallized fairly well.  That's what's below -- a
cold, technical, emotionless review of my thinking based on (what I
understand to be) working needs and practice, cast in the need for
some expedience and to spec projects that can be undertaken within a
single dev/release cycle.  Can we do more to generalize the
architecture? Indeed we can, and should, over time.  And thanks for
pushing for that.

Now let the wild rumpus begin! :)

...

On Wed, Feb 16, 2011 at 4:44 PM, Dan Wells <dbw2 at calvin.edu> wrote:
> Hello Mike,
>
>>>> On 2/16/2011 at 12:37 PM, Mike Rylander <mrylander at gmail.com> wrote:
>> On Tue, Feb 15, 2011 at 4:35 PM, Dan Wells <dbw2 at calvin.edu> wrote:
>>> Hello Mike,
>>>
>>> At first glance I think this is a very welcome development, and I have
>>> just a few comments.  First, I would advocate for some kind of a
>>> 'label_sortkey' on biblio.monograph_part.  Even if all it did was pad
>>> numbers, it would solve 95% of 'part' sorting problems.
>>
>> That's a very good idea ... I will make it so.  Proposed initial
>> algorithm: strip non-spacing marks, force everything to upper case,
>> remove all spaces, left-pad numeric strings with '0' to 5 characters.
>> Thoughts?
>
>
> Sounds good to me.
>
>
>>
>> (NOTE: I'm kinda loath to invent something like the call number
>> classification normalizer setup for this, and I don't think that will
>> work directly with these strings.  And without some field testing we
>> won't do a good job of covering our bases with anything non-trivial.)
>>
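
To make that concrete, here's a rough, untested sketch of the sort of
normalizer I have in mind (the function name is a placeholder, and
contrib's unaccent() is only standing in for proper non-spacing-mark
stripping):

    -- placeholder name; strip marks, upper-case, drop whitespace, and
    -- left-pad each run of digits with '0' to at least 5 characters
    CREATE OR REPLACE FUNCTION biblio.normalize_part_label (TEXT)
    RETURNS TEXT AS $$
    DECLARE
        txt    TEXT;
        chunk  TEXT;
        buf    TEXT := '';
    BEGIN
        txt := REGEXP_REPLACE(UPPER(unaccent($1)), E'\\s+', '', 'g');

        -- walk the string as alternating digit / non-digit runs
        FOR chunk IN
            SELECT (REGEXP_MATCHES(txt, '([0-9]+|[^0-9]+)', 'g'))[1]
        LOOP
            IF chunk ~ '^[0-9]+$' THEN
                -- GREATEST() keeps runs longer than 5 digits from
                -- being truncated by LPAD
                buf := buf || LPAD(chunk, GREATEST(5, LENGTH(chunk)), '0');
            ELSE
                buf := buf || chunk;
            END IF;
        END LOOP;

        RETURN buf;
    END;
    $$ LANGUAGE PLPGSQL;

That would turn 'Vol. 2' into 'VOL.00002' and 'pt 10b' into 'PT00010B',
which should sort about right for the common cases.
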
>>> Second, and perhaps this was already discarded as a simplification measure,
>>> but I think we should consider dropping the primary key on
>>> asset.copy_part_map.target_copy to allow for multiple parts per copy.  This
>>> would not only better reflect reality in certain cases, but I think it could
>>> also lay some groundwork for future bound-with functionality (put 'part's on
>>> your boundwith records (or let the map point to a part *or* a record), then
>>> sever the link from call_number to record).
>>>
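
(For anyone skimming along, the change being suggested amounts to
something like this -- sketch only, with illustrative constraint and
column details:)

    -- instead of the copy alone being the key ...
    --     target_copy  BIGINT  PRIMARY KEY REFERENCES asset.copy (id)
    -- ... key the map on the (copy, part) pair, so one copy can carry
    -- several parts:
    CREATE TABLE asset.copy_part_map (
        id          SERIAL  PRIMARY KEY,
        target_copy BIGINT  NOT NULL REFERENCES asset.copy (id),
        part        BIGINT  NOT NULL REFERENCES biblio.monograph_part (id),
        CONSTRAINT  copy_part_once UNIQUE (target_copy, part)
    );
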
>>
>> Before spec'ing this out, I'd already begun working up something
>> separate to cover (among several other use-cases) bound-with.  I'll
>> post that soon (hopefully today).  The short version of why I
>> intentionally kept bound-with and monograph parts separate is that the
>> former is about aggregating multiple bib records (several metadata
>> records involved in one physical thing) and the latter is about
>> dis-integration (one metadata record covering multiple physical
>> things).  While we /could/ design a subsystem that goes both ways, the
>> implicit complexity (and split-brain logic) required outweighs both
>> the design and maintenance simplicity of single-function
>> infrastructure.  I'm normally in favor of infrastructure re-use, but
>> in this case the concepts being modeled have opposite purposes (from
>> the bib record point of view).
>>
>> Is that too ramble-y to make sense? ;)
>>
>
>
> It makes good sense, but I think we could ultimately benefit by putting less
> emphasis on a bib record point of view and tilting things a bit more towards
> the item point of view.

I don't see why items (particularly, barcoded physical items) should
be the focus.  On the other hand, records are the core of a
bibliographic system -- everything else (items and their attributes,
access restrictions for electronic resources, libraries-as-places,
etc) is a set of filter axes to apply atop the record when searching
for or manipulating it.  The record is the nucleus, and everything
else enhances/vivifies/subdivides it.

>  From the item perspective these proposals are modeling
> the same thing, a mapping of items to contents, and the fewer ways we have to
> do that, the better (as long as we cover all the cases).  With a simpler
> mapping table of item to record(-part), we easily traverse in either direction,
> and we have ultimate flexibility.

But I contend that we don't need to traverse in both directions for a
given relationship, nor do we have a need for ultimate flexibility at the
cost of complexity.  (I'm not referring to schema complexity here, but
code complexity -- the need for inferences that will certainly come
from commingling aggregation and dis-integration.)

The direction of traversal is critical, and should be ensconced in the
schema.  Not only does this make the code for each function simpler
(we don't have to infer a relationship, it's dictated by the fact that
we're using part or multi-home) but it models what libraries actually
do: barcode parts of a work (volumes, disks, etc); or, collect
manifestations of many records into a big binder (or e-reader) with
one barcode on the outside.
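
In schema terms (purely illustrative -- the multi-home table below is a
made-up name for this email, not the design I'll be posting), the two
directions end up as two narrow maps, and which map a row lives in *is*
the direction:

    -- dis-integration: one record, many barcoded pieces
    --   biblio.monograph_part.record references exactly one
    --   biblio.record_entry, and asset.copy_part_map hangs copies
    --   off of those parts

    -- aggregation: one barcoded thing holding works from other records
    CREATE TABLE asset.copy_multi_home_map (
        id          SERIAL  PRIMARY KEY,
        target_copy BIGINT  NOT NULL REFERENCES asset.copy (id),
        peer_record BIGINT  NOT NULL REFERENCES biblio.record_entry (id),
        CONSTRAINT  copy_peer_once UNIQUE (target_copy, peer_record)
    );

No row in either map has to say which kind of relationship it
represents; the table it lives in already says so.
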

>  So, if I have a bib record on my screen, and
> I ask the question, "which items' contents does this record represent?", we can
> simply go record->part(s)->item(s).

IMO, the question one would ask is, "what and where are the things
(nominally, barcoded physical items) that contain what I describe?"
ISTM that it's very important, and perhaps even critical for efficient
workflow, to list separately the subsets of what a record describes
(parts) and the "bound" items that contain the described work along
with others.  With a unified map there's no mechanism other than
a magic value (or a human hoping to interpret another human's label
correctly) to distinguish these concepts.  With parts and multi-home
separate, it's obviously a natural property.
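
Concretely, the two lists stay two trivially separate queries (sketch
only, reusing the invented multi-home map from above):

    -- subsets of what record :rec describes (parts), and the copies
    -- that carry them
    SELECT bmp.label, acpm.target_copy
      FROM biblio.monograph_part bmp
      LEFT JOIN asset.copy_part_map acpm ON (acpm.part = bmp.id)
     WHERE bmp.record = :rec;

    -- "bound" items that contain record :rec along with other works
    SELECT achm.target_copy
      FROM asset.copy_multi_home_map achm
     WHERE achm.peer_record = :rec;

No "relationship type" column, and no label convention for a human (or
the code) to interpret.
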

> On the other hand, if I have an item, and
> I ask the question "what are the contents of this item?", we can go
> item->part(s)->record(s).  Naturally we can traverse related records (via
> items) and related items (via records/parts) as well.

This is directly supported by multi-homed items, with the exception
that you do need to look at the call number to get the primary record.
I don't see a practical drawback to this, since that's what the code
already does, and will still have to do as long as the record field
exists on asset.call_number (null-ability or elimination of which is
mentioned below).
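
In query form (again a sketch, with the invented multi-home map
standing in for the design I'll post separately):

    -- "what are the contents of this item?" for copy :copy
    SELECT acn.record AS primary_record,
           bmp.label  AS part_label
      FROM asset.copy acp
      JOIN asset.call_number acn          ON (acn.id = acp.call_number)
      LEFT JOIN asset.copy_part_map acpm  ON (acpm.target_copy = acp.id)
      LEFT JOIN biblio.monograph_part bmp ON (bmp.id = acpm.part)
     WHERE acp.id = :copy

    UNION ALL

    -- ... plus any additional records the copy is multi-homed to
    SELECT achm.peer_record, NULL
      FROM asset.copy_multi_home_map achm
     WHERE achm.target_copy = :copy;
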

> This also eliminates the
> primacy of call numbers when managing items, which I see as a benefit.
>

There are three problems I see here:

  * Call numbers will always have a first-class billing, regardless of
how they're implemented, since they represent something physical. Two
things, actually: the location in a range of other items (shelf order
and position), and a tag pasted to the spine of the item.
  * I can't see any obvious benefit to eliminating the
record<->call_number link (mentioned directly below, and intimated
here)
  * The mounds and mounds of code that assume and depend on the
existence of the record->call number->copy hierarchy (sketched below)
will instantly break
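
For reference, that hierarchy is, at its simplest, the traversal
existing code performs everywhere:

    -- record -> call number -> copy
    SELECT acp.*
      FROM biblio.record_entry bre
      JOIN asset.call_number acn ON (acn.record = bre.id)
      JOIN asset.copy acp        ON (acp.call_number = acn.id)
     WHERE bre.id = :rec;
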

> Or stated more simply, I feel our foundational assumptions in relating items
> to records should be:
> 1) Records describe contents
> 2) Items contain contents
> 3) Item content boundaries can overlap record content boundaries in various
> ways
>

I see this as an oversimplification from the conceptual point of view
-- it fails to recognize that the arity of the relationship (which I
call direction) is important, and differs between the two cases.  IOW,
record<<->item
serves a completely different function from record<->>item, and
forcing them both through a record<<->>item relational model does both
a disservice.

> All that said, I know from experience to trust your judgement (most of the
> time ;).  For my own future benefit, do you have cases already in mind where
> this flexibility would end up causing 'split-brain' logic?  (Or maybe I have a
> split brain...)
>

Split-brain is probably a misnomer ... we have to commingle the logic
for aggregation and dis-integration (disaggregation?) wherever we use
either.


