Required Schematron Assertions
EAD:
- Collection-level level attribute*
- Collection-level title*
- still need to test how the importer handles multiple title elements
- Collection-level date*
- can occur within did/unittitle
- Collection-level identifier*
- Collection-level extent statement* (but may need to be tested differently)
- Component-level level attribute*
- Component-level title OR component-level date*
- If there's a level attribute set to 'otherlevel', then the otherlevel attribute must be set*
- Normalized unitdates
- must be valid dates (e.g. 2015-02-30 will probably raise an error, but still need to verify this)
- the end date cannot occur before the begin date (but the end date can equal the begin date).*
- No more than 3 container groups per level of description
- (should we also ensure that container elements have id and parent attributes, or can we just change the importer so that it only uses these values when they're present in the EAD source files? i.e. if there are no id and parent attributes, then assume that all sibling container elements constitute the same group.)
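The two normalized unitdate checks listed above (valid dates, and end date not before begin date) could be prototyped outside Schematron with something like the Python sketch below. It assumes the normal attribute holds either a single ISO 8601 date or a begin/end pair separated by a slash, in the forms YYYY, YYYY-MM, or YYYY-MM-DD. The function names are hypothetical, and this says nothing about how the importer itself reacts to bad dates, which still needs to be verified.

```python
import calendar
from datetime import date


def parse_iso_part(text, round_up=False):
    """Parse YYYY, YYYY-MM, or YYYY-MM-DD into a date.

    Missing month/day are filled with the earliest possible value, or the
    latest possible value when round_up is True. Raises ValueError for
    anything that is not a real calendar date (e.g. 2015-02-30).
    """
    parts = [int(p) for p in text.split("-")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected date form: {text!r}")
    year = parts[0]
    month = parts[1] if len(parts) > 1 else (12 if round_up else 1)
    if len(parts) > 2:
        day = parts[2]
    elif round_up:
        day = calendar.monthrange(year, month)[1]  # last day of that month
    else:
        day = 1
    return date(year, month, day)


def check_normal(value):
    """Return a list of problems found in one normal attribute value."""
    begin_text, _, end_text = value.partition("/")
    try:
        begin = parse_iso_part(begin_text)
        # A single date is treated as its own end date.
        end = parse_iso_part(end_text or begin_text, round_up=True)
    except ValueError as exc:
        return [f"invalid date in '{value}': {exc}"]
    if end < begin:
        return [f"end date precedes begin date in '{value}'"]
    return []
```

Rounding the end date up means a value such as 1950-05/1950 is not flagged, and equal begin and end dates pass, which matches the rule as stated above.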
EAD database and data type constraints
- title fields can only be 8704 characters (including mixed-content element tags, I believe). Is that right?
- the following can only be 255 characters (do we need to enforce these requirements, or will the importer still work by truncating the input value? if the latter, we could raise warnings and/or let the importer add this information to a log file).
- unitid
- ead_id
- ead_location
- finding_aid_date
- finding_aid_language
- finding_aid_revision_date
- otherlevel attribute
- extent number
- dimensions (when not a note)
- container value
- container barcode (don't test for this, though, since barcodes are not importable now, and when they are, the barcode length will be validated by the value set in the configuration file).
- also need to list out which fields can only be text, and enforce that they don't exceed 65536 characters (or is there a ruby gem that limits this to 65000?).
- what else?
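If we go the warning route for the length limits above, a pre-import check could look roughly like this Python sketch. The limits are copied straight from these notes and are still unverified, and the field names and the flat dict of extracted values are assumptions made for illustration, not the importer's actual data model.

```python
# Limits copied from the notes above; all of them still need to be verified.
MAX_LENGTHS = {
    "title": 8704,
    "unitid": 255,
    "ead_id": 255,
    "ead_location": 255,
    "finding_aid_date": 255,
    "finding_aid_language": 255,
    "finding_aid_revision_date": 255,
    "otherlevel": 255,
    "extent_number": 255,
    "dimensions": 255,
    "container_value": 255,
}


def length_warnings(record):
    """Return one warning per field whose value exceeds its known limit.

    `record` is a hypothetical dict mapping field names to the string
    values extracted from an EAD file. Fields without a known limit
    are ignored.
    """
    warnings = []
    for field, value in record.items():
        limit = MAX_LENGTHS.get(field)
        if limit is not None and len(value) > limit:
            warnings.append(
                f"{field}: {len(value)} characters exceeds limit of {limit}"
            )
    return warnings
```

Note that len() counts characters rather than bytes, so if the database limits turn out to be byte-based, values with non-ASCII characters would need to be measured after encoding.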
EAC-CPF:
MARC:
???:
(anything marked with an asterisk is currently being tested by the schematron file).