2021-10-20 Metadata meeting notes

 

Microsoft Teams meeting

Participants

  • @Elizabeth Roke

  • @Valerie Addonizio

  • @Kevin Schlottmann

  • @Jared Campbell

  • @James Griffin

  • @Regine Heberlein

Minutes

  • @Valerie Addonizio

Previous Agendas

https://archivesspace.atlassian.net/wiki/spaces/AC/pages/2889777230

https://archivesspace.atlassian.net/wiki/spaces/AC/pages/2901704756

Discussion Topics

Time

Item

Notes

5 min

New member!

Please welcome our new colleague Regine Heberlein, Library IT Data Analyst at Princeton!

(it’s re-ghee-nah, which I'm declaring so that she doesn't have to!)

Regine recently worked on an EAD2002 migration and has recent notes and experience with the importer; she also has a huge supply of test data.

10 min

Review and confirm work plan

2021-2022 Work Plan

Remaining issues from 2021-06-08 minutes have been added to this term's workplan. This makes the workplan more detailed than usual, but brings this info together into one place.

Regarding a more proactive role in the development cycle, please list ideas for that. I know there are some ongoing, but I don’t have them written down.

25 min

Update on progress

Action items from last agenda: https://archivesspace.atlassian.net/wiki/spaces/AC/pages/2901704756/2021-09-16+Metadata+meeting+notes#Action-Items

*https://archivesspace.atlassian.net/browse/ANW-1382

The ticket writer is correct, this is technically true: https://www.loc.gov/marc/authority/ad024.html

In AS 3.0, record identifier is a required field (correctly), but only 024$a maps there (AS code). If we were to recommend a change, the logic would be:

If no 024$a, then use 024$0 as identifier; if no 024$0 either, then use 024$1. If 024 exists but none of ${a,0,1} are present, reject the import.

However, this is an edgy edge case; I’d suggest we state the above but note that this is not a development priority. That said, it is probably an easy fix for someone looking to dip a toe into pushing code. (kws)
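
For illustration only, the fallback order proposed above can be sketched in a few lines of Python. This is not the ArchivesSpace importer (which is written in Ruby); the function name and the dictionary representation of a single 024 field's subfields are assumptions made for this sketch.

```python
from typing import Mapping

# Order of preference proposed in the notes: $a, then $0, then $1.
FALLBACK_ORDER = ("a", "0", "1")


def pick_record_identifier(subfields: Mapping[str, str]) -> str:
    """Pick a record identifier from one 024 field.

    `subfields` is a hypothetical representation: subfield code -> value.
    """
    for code in FALLBACK_ORDER:
        value = subfields.get(code, "").strip()
        if value:
            return value
    # The 024 exists but carries none of $a, $0, $1: reject the import.
    raise ValueError("024 has none of $a, $0, $1; rejecting import")
```

A 024 with only $0 and $1 would yield the $0 value; a 024 carrying only $2 would be rejected.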

10 min

New/Ongoing ticket review

Check for new tickets. If there are no new tickets, we will move on.

https://archivesspace.atlassian.net/browse/ANW-943?filter=13658

https://archivesspace.atlassian.net/browse/ANW-1416

Thanks to Kevin for taking a look at 1416:

The MARC spec for 008 positions 15-17 points to this MARC country code list

However, when serializing to MARC, AS finds this data by looking up the country listing for the repository, and then finding the ISO 3166 country code from a local controlled value list (country_iso_3166).

ISO 3166 is not remotely equivalent to the MARC list.

So the ticket writer is correct that the wrong value for the Netherlands is getting into their MARC export. (This is avoided in the US because of a special exception that places the xxu country code in all US-American records.)

One could tell the ticket writer to edit (i.e. break) the local ISO 3166 controlled value list to map to the correct MARC country code for the Netherlands. This, however, would make the EAD export invalid, since EAD points to ISO 3166.

One could also suggest that the ticket writer do some local post-export processing on their MARC records (which we do here at Columbia for similar issues). However, this presupposes a fair bit of technical knowledge and access to the various implicated systems. And it is brittle.

The *right* answer is to add the MARC country code list to the controlled value lists that ship with AS, and have AS look to that list when serializing MARC.  
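
As a rough sketch of that idea, in Python rather than the Ruby that AS is written in: keep the ISO 3166 value for EAD, and consult a separate MARC country list only when serializing 008/15-17. The table name, the function name, and the choice to key the table on ISO 3166 alpha-2 codes are assumptions for illustration; only three entries are shown.

```python
# Hypothetical crosswalk from ISO 3166 alpha-2 codes to MARC country codes.
# A shipped list would cover the full MARC Code List for Countries.
ISO_TO_MARC_COUNTRY = {
    "NL": "ne",   # Netherlands
    "DE": "gw",   # Germany
    "US": "xxu",  # United States
}


def marc_008_country(iso_code: str) -> str:
    """Return the three-character value for 008 positions 15-17.

    Two-character MARC codes are left-justified and padded with a blank.
    Codes missing from the table fall back to "xx" (no place, unknown,
    or undetermined) in this sketch.
    """
    marc = ISO_TO_MARC_COUNTRY.get(iso_code.upper(), "xx")
    return marc.ljust(3)
```

With a list like this in place, the EAD export can keep reading country_iso_3166 untouched, and the special case for US records becomes an ordinary table entry.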

5 min

Order of events and individual momentum

  1. In general, it’s my intention to finish old work before beginning new work. Some of us are still working on tasks and themes from last term.

  2. But also, in general, not everyone needs to work on the same topics. I propose @Regine Heberlein, @James Griffin and @Jared Campbell proceed with EAD2002 import mapping, while @Elizabeth Roke, @Kevin Schlottmann and I finish up old business, but with the expectation that we’ll all discuss any topic as it is raised.

5 min

Next steps/homework

 

Action Items

@Regine Heberlein Will brainstorm how to transform or add to the EAD import spreadsheet to better reflect handling at different levels. For example, to document how a unittitle is handled in the top-matter versus as a component. Some ideas floated: adding every single element, attribute, and level; XPath. Spreadsheet | Code
@Elizabeth Roke Will list the DACS elements missing from the MARC importer and provide to the group. Elizabeth, please also confirm which version (most recent?) of DACS you’re using.
@Elizabeth Roke Will add two columns to the EAD2002 importer spreadsheet to match the new columns on the MARC importer
@James Griffin Will begin a code review of the EAD2002 importer: Spreadsheet | Code
@Jared Campbell Will begin a code review of the EAD2002 importer: Spreadsheet | Code
@Kevin Schlottmann Will comment on https://archivesspace.atlassian.net/browse/ANW-1382
@Kevin Schlottmann Will comment on https://archivesspace.atlassian.net/browse/ANW-1416
@Kevin Schlottmann Will draft a ticket requesting that the AS importer code spit out any element that is not handled (though I did not note whether you meant for MARC, EAD, or both)
@Valerie Addonizio Add minutes and action items and find unfinished business; revisit workplan; draft new agenda; check for new tickets
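
To make the "report any element that is not handled" request above more concrete, here is a minimal Python sketch for the EAD case. It is not importer code: the HANDLED set is a made-up stand-in for whatever the real importer maps, and the function name is hypothetical. It only shows the kind of report the ticket might ask for.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical stand-in for the element names an importer knows how to map.
HANDLED = {"ead", "archdesc", "did", "unittitle", "unitdate"}


def unhandled_elements(xml_text, handled=HANDLED):
    """Count every element in the document whose name is not in `handled`."""
    counts = Counter()
    for element in ET.fromstring(xml_text).iter():
        name = element.tag.rsplit("}", 1)[-1]  # drop any namespace prefix
        if name not in handled:
            counts[name] += 1
    return counts
```

Run against a test file, this returns a tally such as one `odd` and one `materialspec`, which could be logged alongside the import results.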

Minutes

  • Elizabeth's review of the MARC importer versus DACS revealed that AS does not support every DACS element in the DACS crosswalk, and we will be following up

  • We discussed and agreed that we are using DACS to judge support of the MARC importer because:

    • ArchivesSpace is not bibliographic software

    • We are aligning with the dominant standard in our field

  • We will make recommendations that support for MARC fields be deprecated, but not until a certain year, so that the community has plenty of notice that this is going away

  • For EAD2002, we are still feeling our way in how much to document, but agreed in general that we should first focus on what is already present in the code. We will begin by getting organized and oriented.

    • We expect to start with the NOT IMPORTED and MIXED CONTENT sections.

  • We are splitting attention and members between old business and new business but will discuss everything as a group