2020-01-09 Meeting Notes

 

Date and Time

Thursday 01/09/20, 3pm Eastern

 

Zoom URL

https://lyrasis.zoom.us/j/897871318

 

Participants

  • @Kevin Schlottmann

  • @James Griffin (Unlicensed) (Note taker)

  • @Bria Lynn Parker (Unlicensed)

  • @Christine Di Bella

  • @Daniel Michelson

  • @Jared Campbell

  • @Wiedeman, Gregory

  • @Dallas Pillen

 

Goals

 

Discussion topics

Time

Item

Presenter

Notes

5 min

Ice Breaker Question: Favorite kind of weather / season

@Kevin Schlottmann

 

15 min

Standing item: review metadata tickets

@Kevin Schlottmann

Link to board

Specific tickets flagged for us by @Christine Di Bella

https://archivesspace.atlassian.net/browse/ANW-974

https://archivesspace.atlassian.net/browse/ANW-969

See below for Kevin’s discussion prompts / draft replies

10 min

Ideas from TAC discussion

 

  • There is a desire to see us engage with RiC. Suggestions included a blog post about RiC for the AS community and inviting EGAD to present to TAC.

  • We should make explicit which AS versions the import/export mappings are keyed to. Speed of AS updates vs. review might be an issue. Thoughts?

5 min

Deliverables from previous meeting

 

James will provide updates on the GitHub repositories

Kevin will show the draft instructions for import/export review: https://archivesspace.atlassian.net/wiki/spaces/AC/pages/941588554

Kevin created ticket for 245f/g issue:

Christine created a dedicated test server for the Metadata Standards team, available at http://metadata.lyrtech.org/staff/ (admin/admin), for testing against the current version using/creating data that is reliably and predictably there.

10 min

DACS tooltips review - updates on scope and process

@Daniel Michelson

Update on the spreadsheet

10 min

MARCXML /EAD import review

 

@Kevin Schlottmann @Wiedeman, Gregory

Updates on the Google Sheet:

 

5 min

Anything else?

 

 

Action items

 

Notes

Introductions

Metadata Tickets

Proposed reply for

“As Rachel correctly notes, the 044 is incorrect, since country of publication is determined at the repository level and in AS there can only be one. Per MARC rules, the country code therefore should be (and is, by AS) encoded only in the 008, positions 15-17. Light testing indicates that the country codes are being exported correctly, including the US code ‘xxu’ as discussed in ANW-673.

The metadata subgroup of TAC agrees with the reporter and recommends that the 044 field be removed from the MARC exporter.”

Other comments:

The 008 AS->MARC mapping needs to be updated (I made a note in our Google sheet).

In the sandbox (2.7.0), I created a test repo with the country set to Germany. The MARC export writes the code “de” to the 008 and 044. The code is correct, and the placement in the 008 is correct.

MARC 008 definition: https://www.loc.gov/marc/bibliographic/bd008.html

MARC 044 definition: https://www.loc.gov/marc/bibliographic/bd044.html

MARC country code list: http://www.itsmarc.com/crs/mergedProjects/councod/councod/contents.htm
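For reference, the position check described above can be sketched in a few lines of Python. This is a minimal illustration only: the 008 strings below are fabricated placeholders showing the fixed-field layout, not actual ArchivesSpace export output, and the helper function is hypothetical.

```python
# Sketch: verifying the country-of-publication code in a MARC 008 fixed field.
# Per the LC definition linked above, the place-of-publication code occupies
# character positions 15-17 (0-indexed) of the 40-character 008 field.

def country_code(field_008: str) -> str:
    """Return the place-of-publication code from an 008 field (positions 15-17)."""
    return field_008[15:18].strip()

# Illustrative 40-character 008 with 'xxu' (United States) at positions 15-17:
sample_us = "200109i19002000xxu" + " " * 17 + "eng d"
assert len(sample_us) == 40
print(country_code(sample_us))  # 'xxu'

# Illustrative 008 with 'de' (Germany), as in the sandbox test described above:
sample_de = "200109i19002000de " + " " * 17 + "eng d"
print(country_code(sample_de))  # 'de'
```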

 

Looking at the spec, current AS->EAD2002 behavior matches the implementation request (the encoded language of description is NOT exported to EAD2002; only the free-text language of description is exported). I also note that langusage is optional in EAD2002 and not required by DACS(!).

 

All agreed to this proposed solution; it will be posted to the JIRA Issue on 01/10/20

Question: What is the process for these types of Issues for Dev. Pri.?

For something like this, when the Issue is submitted and then delivered to this Sub-Team, there is enough information to know that this MARC field should be suppressed

Dev. Pri. shouldn’t need any additional information, and this would be routed as “ready for implementation”

Prioritization should be minor for this issue; we want to ensure that the Co-Chair for Dev. Pri. is okay with this

Maybe it does make sense for this to be placed into the queue for Dev. Pri. - the ticket itself does not need priority if that is the case



Proposed Reply for

We should discuss, but if this were open for a do-over, I would propose that the EAD2002 export behavior of language of description mimic that of language of materials (in short: if a free-text note exists, export that; otherwise export the translation of the encoded language as a text string with the codes in attributes, with some boilerplate prepended). E.g.:

<langusage>Finding aid written in: <language langcode="[Language (code value)]" scriptcode="[Script (code value)]">[Language (translation value)]</language></langusage>

However, the spec appears to have been drafted pretty carefully, so perhaps we should reach out to Corey Nimer and see if this was intentional.

Spec (see PDF linked in ticket):

EAD:  https://www.loc.gov/ead/tglib/elements/langusage.html

DACS:  https://www2.archivists.org/standards/DACS/part_I/chapter_4/5_languages_and_scripts_of_the_material

Feedback from the Sub-Team:

  • This has a weird history: people have not encoded language; they have chosen to use mixed content

  • Language of description is not a DACS element; only language of material is

  • We should still attempt to export this information in a structured way

  • We’re encoding EADs with mixed content given that the toolkit encoded it as such

  • EAD3 was the intended version for enhanced support for extended language encoding

  • EAD2002 is far more restrictive in structure

  • Next steps: Reach out to Corey to determine if there was any intent here, and if there was, determine what the next step would be for the Sub-Team

 

Another Ticket:

  • This was raised by Dev. Pri., but the process for forwarding this to the Metadata Standards was not in place

  • The issue was how the “published” status should work; there seemed to be a lot of discussion in the comments on this

  • The interaction between the “published” status for the Digital Object in ArchivesSpace, and how this is encoded within the EAD

  • This will be added to the agenda for the next scheduled meeting

 

Ideas from December TAC Meeting

Desire to see us engage with RiC

  • Blog posts, particularly focusing upon our efforts to engage with the EGAD Steering Group (https://www.ica.org/en/egad-steering-committee-0)

  • Invite members of the Working Group in order to discuss their work with the TAC

  • How might we engage with Records in Context in other ways?

    • Daniel Pitti is the current Chair

  • There is also another individual who serves as the representative for the US

  • No members of the Sub-Team have any close relationships with members of the Steering Group

  • It would make more sense to contact Daniel directly in order to engage with them

    • Daniel Michelson volunteered to contact Daniel Pitti

    • This invitation would be for the next joint TAC meeting, and we should check with Maggie first

 

Making explicit which ASpace versions apply to the import/export mappings

  • We could just work with version 2.7 until we have a completed mapping, and then work with 2.7.4 for the next update cycle in a separate iteration

  • Daniel Michelson: If things change in the upgrade, wouldn’t that be a good way of knowing what they are?

  • Christine: From a development perspective, we would know which issues have been worked on

  • Daniel: Changes to the exporter in the new version would presumably have already been tested, and we wouldn’t need to perform a separate test ourselves

  • Christine: There shouldn’t be additional testing required, but the mapping itself will still need to be updated. Should developers be feeding any of this information to this Sub-Team?

  • Greg: We need to think about what we are doing; our pace is slow and this is very labor-intensive

  • Updating the mappings for “unittitle” elements, as an example, required several variations of “unittitle” elements to be provided for the test import

  • It is actually easier to understand the mappings by looking directly at the code for the exporter

  • It might be possible to have things generated from comments from the code

  • Ruby YARD: This is in place for ArchivesSpace, but the rest of the documentation is moving away from being so closely attached to the codebase

  • The downside to this is that we are limiting the audience who might be able to contribute to this

  • We might still try this, but we need to be certain that this is even possible

  • Christine: One of the things the ArchivesSpace program did try was to extract this information from the code itself, but it could not be successfully implemented. It might still be possible.

  • Greg: For reference:

    • Greg had tested the conversion of “unittitle” elements with

  • Daniel: For some types of data, there are relatively straightforward cases for testing imports

    • Is there any way to determine which of these cases might be more straightforward?

  • Request to have Greg write up a description for what they found when testing the importation of “unittitle” elements

    • Perhaps invite Laney to the next meeting to determine whether automating documentation generation from the code base is possible

    • It should be noted that, if this is indeed possible, it is more efficient to update the code comments during updates to the code base

 

Documenting our Review Process

  • Kevin drafted a Confluence page ()

  • These directions should prove to be quite useful for providing guidance to future Sub-Teams

  • Perhaps referencing the code in certain cases might be more straightforward for discussing the import process and mappings

 

Bulk Import Issue ()

  • Daniel: This should go directly to the Dev. Pri. meeting given that this has already been evaluated by this group

 

Christine created a dedicated test server at http://metadata.lyrtech.org/staff

 

DACS Tooltips Spreadsheet Update

  • Most of the progress has been quite straightforward

  • The priority of this is significantly less than the Importer/Exporter testing

  • It might be more efficient to set aside a month of dedicated focus on this and finalize it

 

MARC-XML and EAD Import/Export Sheet Updates

  • Bugs are extremely difficult to test

  • With a massive MARC import, Columbia did not find that many of the more obscure fields were used

  • The MARC 342 tag is probably a very extreme edge case

  • Paring down the imports for this process might make testing much easier

  • We should explore defining a core group of minimum supported fields

  • Prioritizing testing by determining which elements in MARC and EAD are used

  • There have been usage surveys to try to find points at which we could reduce the complexity of MARC support where use cases aren’t relevant

 

Meeting adjourned at 16:00 EST