I'm hoping to write a parser for Dewey numbers, but have been a little overwhelmed by the wide variety of syntaxes I've found. How do I read a number like "614.0979 I4 C5 B6p", and is there a guide I can refer to?

asked Feb 13 '12 at 11:21

Hi Bill,

Apologies if I'm covering ground familiar to you... I'm not sure exactly what you're trying to do, so my comments might be way off base. But I thought I'd share a few considerations anyway.

If you are just trying to parse/use the Dewey number, you can use the characters up to the first space. The rest is Cutter numbers and shelfmarks, there so that the books order 'properly' on the shelf within that Dewey 'subject'.
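As a rough sketch of that split in Python (the function name split_call_number is just something I made up, and it assumes the class number really is the first whitespace-delimited token, which local practice may not guarantee):

```python
def split_call_number(call_number):
    """Split a call number into (class_number, rest) at the first whitespace.

    Assumption: the Dewey class number is the first whitespace-delimited
    token, and everything after it is cutters/shelfmarks.
    """
    parts = call_number.strip().split(None, 1)
    if not parts:
        return "", ""
    class_number = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    return class_number, rest


print(split_call_number("614.0979 I4 C5 B6p"))  # ('614.0979', 'I4 C5 B6p')
```

Call numbers with a prefix in front of the class number (a collection code, say) would need extra handling that this sketch does not attempt.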

If you're trying to use the whole call number, then the library that you're working with might have documentation of their cuttering scheme. The library will have their own conventions for cuttering, and within a library collection, the cuttering might vary depending on the type of material or topic. As an example, most books in my library get one cutter (for author/title); however, some books might get two cutters (1st for artist, 2nd for author).

As Becky said, DDC is under copyright, so I don't think there are many free guides or open linked data sets on the web. However, there is the OCLC Developer Network. I don't know if they do anything with DDC, but it's a possibility. http://www.oclc.org/developer/

I noticed they allow access to a "portion of Dewey Decimal classification represented as Linked Data" here: http://www.oclc.org/developer/services/dewey

That being said, when classifying with Dewey, the cataloger generally 'builds' numbers by adding additional numbers onto a base Dewey number. The most common place to find these additions is the Dewey tables (standard subdivisions, geographic facets, and subdivisions for arts, languages, or people groups, for example).

Generally a "0" denotes the start of an addition to the main number (essentially an additional facet). So a number like 614.0979 could be broken down as 614 (forensic medicine, public preventative medicine), followed by a geographic facet 0979 (Great Basin and Pacific Slope of United States).

If you are working from the 082 field in a MARC record, you may find that there are slashes or some other mark indicating a logical break in the class number. As an example, an 082 MARC field might have 745.2/0947/0904, meaning 745.2 (industrial design) 0947 (in Russia) 0904 (in the 20th century). So taking a look at the 082 field might give you some ideas about breaking down those DDC numbers. However, the slashes/marks indicating a logical break are not always consistent.
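If the 082 value does carry slashes, pulling the segments apart is simple. Here is a small Python sketch; split_segments is a hypothetical helper name, and it only handles "/" as the break mark, so records that use a different mark or none at all will come back as a single segment:

```python
def split_segments(class_number):
    """Split an 082-style class number on '/' segmentation marks.

    Returns (segments, plain), where plain is the class number with the
    marks removed. Assumption: '/' is the only break mark in use.
    """
    segments = [s for s in class_number.split("/") if s]
    return segments, "".join(segments)


segments, plain = split_segments("745.2/0947/0904")
print(segments)  # ['745.2', '0947', '0904']
print(plain)     # 745.209470904
```

Keeping both the segments and the joined number lets you match against call numbers that were stored without the marks.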

And I don't mean to muddy the waters, but I don't think 614.0979 is a valid number, since I don't see any instructions under 614 that allow the addition of numbers from the tables. But I'm no expert... this might be a case where the addition of numbers from the tables is implicit? But that's probably beside the point.

Good luck! @lagina


answered Feb 13 '12 at 20:58

Thanks for the pointers. The answer is, as I'd feared, "Well, it really depends, and you can't count on a damn thing." Have I mentioned how much I love working with library data?

(Feb 13 '12 at 22:23) BillDueber

Yep, unfortunately Dewey numbers are not easily 'read' by looking at character position or some other standard means. Of course, it's made worse by catalogers not building numbers properly, or truncating class numbers arbitrarily. We've worked for such a long time without considering the computing applications of our data.

Just as another note, I've found these DDC to LC crosswalks helpful in the past, but they're basically just lists: http://www.questionpoint.org/crs/html/help/it/ask/ask_map_ddctolcc.html

(Feb 14 '12 at 15:28) lagina

I was typing out the answer when I realized I should ask you if you are dealing with call numbers from your own institution or if you're pulling them from multiple institutions. Call numbers usually have some localization in them, so you need to be mindful of that. If you're pulling from just one institution, your best bet is to talk to those who know the call number policies. That way you can fine tune your parser with local data in mind. If you're pulling from multiple institutions, plan for basic patterns and hope for the best since you're dealing with multiple local practices for classification.
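For the multiple-institution case, "basic patterns" could look something like the Python sketch below. The pattern and the name parse_basic are my own assumptions, not anything from a DDC specification: three digits, an optional decimal extension, then whatever follows the first whitespace. Anything that doesn't fit comes back as None so you can set it aside for review instead of guessing.

```python
import re

# Assumed "basic pattern": three digits, an optional decimal extension,
# then anything else (cutters, shelfmarks) after whitespace.
BASIC_DEWEY = re.compile(r"^\s*(\d{3})(?:\.(\d+))?(?:\s+(.*\S))?\s*$")


def parse_basic(call_number):
    """Return a dict of parts, or None if the string doesn't fit the pattern."""
    m = BASIC_DEWEY.match(call_number)
    if m is None:
        return None
    main, decimals, rest = m.groups()
    return {"main": main, "decimals": decimals or "", "rest": rest or ""}


print(parse_basic("614.0979 I4 C5 B6p"))
print(parse_basic("FIC SMI"))  # None: no leading three-digit class
```

Local prefixes, such as a collection code before the class number, will fall through to None here, which is the "hope for the best" part.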

Anywho, some resources for syntax info and how DDC call numbers come to exist:

Disclaimer - I do LC, not Dewey. YMMV.

Because DDC is under copyright (OCLC), there's a limited number of free resources that you can refer to for DDC syntax. OCLC does have the first three summaries (aka the first three digits in the call number, before the period) available online for DDC 22. If your library subscribes to WebDewey, that will be a more complete resource to refer to, as will the print DDC edition.

For cutter numbers, you can refer to the cutter table, which can be found in various places.


answered Feb 13 '12 at 19:39

I don't know if http://dewey.info/ would be helpful for you.

That number you cite as an example "614.0979 I4 C5 B6p" is very odd. Why are there three Cutters?

Also, there are different Cutter tables. My library uses a three-letter Cutter-Sanborn one, and we are a Dewey library. So you'll have to know which table your library uses.

Hope this helps! Jen


answered Feb 14 '12 at 10:58


