
Library of Congress classifications and subject headings (those are two separate things, for those unfamiliar) are not perfect, but they're pretty good, apply to a huge corpus, and, to my mind most importantly, have evolved over a bit more than a century under numerous circumstances, including an absolute explosion of published materials, substantial changes in how the organisation and classification of knowledge are understood, and a growing awareness of the social and cultural aspects of these (as well as the institutional bias that's often embodied within them). That is, they have evolved a change management process.

The Classifications are substantively hierarchical, though that's really an outgrowth of the fact that they're used to locate books within physical shelf space, in which a record must occupy an address (a physical location). Given that the Library's settled on subject classification as its storage and retrieval basis, this means mapping the multidimensional subject classification onto what's effectively a folded linear structure (shelf space). It's not ideal, but it's workable. And many of the quirks of the LoCCS come out of the fact that it addresses both the composition (comprehensive, but still US-centred) and process (shelving, search, and retrieval) of the Library.
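
To make the "folded linear structure" point concrete, here's a rough Python sketch of flattening call numbers into a single shelf order. The parsing rule (class letters, then a decimal number, then everything else compared as text) is my own simplification and not the Library's actual filing rules, and the call numbers in `SAMPLE` are only illustrative inputs. The names `shelf_key` and `shelf_order` are made up for this example.

```python
import re

# Assumption: a call number starts with 1-3 class letters, then a number with
# an optional decimal part, then "the rest" (Cutters, year, and so on).
CALL_NUMBER = re.compile(r"^([A-Z]{1,3})\s*(\d+(?:\.\d+)?)\s*(.*)$")


def shelf_key(call_number):
    """Turn a call number into a tuple that sorts in (approximate) shelf order."""
    match = CALL_NUMBER.match(call_number.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised call number: {call_number!r}")
    letters, number, rest = match.groups()
    # Letters sort alphabetically, the class number sorts numerically,
    # and the remainder is compared as plain text (a simplification).
    return (letters, float(number), rest)


def shelf_order(call_numbers):
    """Return the call numbers in the order they would sit on one long shelf."""
    return sorted(call_numbers, key=shelf_key)


# Illustrative inputs only.
SAMPLE = ["QA76.73.P98", "QA9 .B6", "Q175 .K95", "QA76.9.D3", "PZ3 .A1"]

for position, call_number in enumerate(shelf_order(SAMPLE)):
    # The position is the "address": every record gets exactly one.
    print(position, call_number)
```

Note that a naive string sort puts QA76 ahead of QA9, which is one small instance of the hierarchy and the linear address space not lining up for free.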

The Subject Headings are not hierarchical, though they're structured. In particular, they're relational, with numerous subject headings referring to others. There are some parent-child relations (though the top-level hierarchy is broad), numerous retired classifications, and many "use that instead of this" notes.
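
A minimal sketch of that relational structure, in Python, might look like the following. The field names loosely follow the USE / BT (broader term) / RT (related term) reference types, but the `Heading` class, the helper functions, and the toy records in `HEADINGS` are all hypothetical and are not taken from the actual subject headings data.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Heading:
    label: str
    use: Optional[str] = None                       # "use that instead of this"
    broader: Set[str] = field(default_factory=set)  # parent-ish links (BT)
    related: Set[str] = field(default_factory=set)  # sideways links (RT)


def resolve(headings: Dict[str, Heading], label: str) -> str:
    """Follow USE references until reaching a heading that is still current."""
    seen = set()
    while True:
        if label in seen:
            raise ValueError(f"USE cycle involving {label!r}")
        seen.add(label)
        target = headings[label].use
        if target is None:
            return label
        label = target


def broader_closure(headings: Dict[str, Heading], label: str) -> Set[str]:
    """Collect every broader heading reachable from a heading, transitively."""
    found: Set[str] = set()
    stack = [resolve(headings, label)]
    while stack:
        current = stack.pop()
        for parent in headings[current].broader:
            parent = resolve(headings, parent)
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


# Toy records, invented for illustration.
HEADINGS = {
    "Vehicles": Heading("Vehicles"),
    "Automobiles": Heading("Automobiles", broader={"Vehicles"}),
    "Motorcars": Heading("Motorcars", use="Automobiles"),  # retired term
    "Electric automobiles": Heading("Electric automobiles", broader={"Automobiles"}),
}

print(resolve(HEADINGS, "Motorcars"))
print(broader_closure(HEADINGS, "Electric automobiles"))
```

The point of modelling it as a graph with redirects rather than a tree is that a heading can have several broader terms, and retired terms stay addressable while pointing at their replacements.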

(I've made ... some progress ... at a structured parsing of the subject headings, though that work's been stranded Because Reasons.)


