handling orphaned records

Description

Filing this as a feature request and not a bug, since it's an issue with a known workaround.

Having just upgraded from 1.5.1 to 2.1.1 on our test instance, we're seeing the following:

  • Perform upgrade tasks as described in docs

  • Delete all files under indexer_state/

  • Start application

  • Application begins adding records to index by repository numeric order

  • When application reaches the end of one of our repositories (in our case, repositories/7), the log helpfully reports:

  • Indexer then starts over with repository 7, throws an error upon completion, restarts indexing, and so on.

From what I can tell, the indexer gets hung up on orphaned records. The workaround is to identify orphaned records with db queries, then remove the orphaned records with API calls. Following that, the indexer can proceed.
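
As a rough illustration of that workaround, the sketch below finds child rows whose parent record is gone and builds the record URIs that would then be deleted through the API. It is not the actual queries used on our instance: the table and column names (archival_object, resource, root_record_id, repo_id) form a simplified, hypothetical schema, and the URI pattern is assumed from the repositories/7 path mentioned above. No HTTP request is made; the demo runs against an in-memory SQLite database, so check everything against your own database and a backup before deleting anything.

```python
import sqlite3

# Assumption: simplified, hypothetical schema. The real tables have many more
# columns, and the relationships worth checking may differ by version.
ORPHAN_QUERY = """
SELECT child.id, child.repo_id
FROM archival_object AS child
LEFT JOIN resource AS parent ON parent.id = child.root_record_id
WHERE parent.id IS NULL
ORDER BY child.id
"""


def find_orphans(conn):
    """Return (id, repo_id) pairs for rows whose parent record does not exist."""
    return conn.execute(ORPHAN_QUERY).fetchall()


def delete_uris(orphans):
    """Build the URIs a DELETE request would target (assumed URI pattern)."""
    return [
        f"/repositories/{repo_id}/archival_objects/{rec_id}"
        for rec_id, repo_id in orphans
    ]


def demo():
    """Run the query against a tiny in-memory database with one orphan."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE resource (id INTEGER PRIMARY KEY, repo_id INTEGER);
        CREATE TABLE archival_object (
            id INTEGER PRIMARY KEY,
            repo_id INTEGER,
            root_record_id INTEGER
        );
        INSERT INTO resource VALUES (1, 7);
        -- Row 10 points at an existing resource; row 11 points at nothing.
        INSERT INTO archival_object VALUES (10, 7, 1), (11, 7, 99);
        """
    )
    try:
        return delete_uris(find_orphans(conn))
    finally:
        conn.close()


if __name__ == "__main__":
    for uri in demo():
        print(uri)
```

Keeping the query step separate from the delete step means the list of URIs can be reviewed by hand before any record is actually removed.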

Is there a way to handle orphaned records more gracefully so that they don't interfere with indexing?

Complexity

None

Activity

Christine Di Bella, March 17, 2020 at 2:05 PM

Closing because we recommend Lora's plugin to handle this issue: https://github.com/archivesspace-plugins/orphan_finder_job

Lora Woodford, March 21, 2019 at 7:27 PM

A first run at this plugin is now up at https://github.com/archivesspace-plugins/orphan_finder_job

It’s not 100% ready for primetime yet, but if anyone is feeling bold it would be great to have some initial/informal testing done (with backups of dbs of course!).

Lora Woodford, June 14, 2018 at 5:32 PM

Hi,

At present, the plugin I'm working on will search for and provide the option to delete orphaned subjects, agents (all types), and instances. Are there other record types that you've found cause this trouble? The select above seems to suggest that you're ending up with multiple orphaned archival_objects as well? It should be fairly trivial to add additional record types to this plugin.

Lora

Laney McGlohon, June 12, 2018 at 2:17 PM

Lora Woodford is working on a plugin for cleaning up orphaned records that should take care of this ticket.

Jason Loeffler, February 1, 2018 at 10:13 PM

Hi, this is a persistent issue with our ArchivesSpace instance, and I've seen a few other reports here and there. What's the best approach for designing and implementing a cleanup script? Where would that process live? migrate_db.rb?

Done

Details

Created August 22, 2017 at 6:30 PM
Updated March 17, 2020 at 2:05 PM
Resolved March 17, 2020 at 2:03 PM