How we get things done: the transcription workflow

Diagram of image preparation workflow showing the process from digitization, to image processing and CSV creation, to Omeka S item creation, to DataScribe transcription
Bills of Mortality Workflow

Once items are added to DataScribe and the datasets are ready for transcription, the transcription workflow begins. The project owner can assign users one of two roles: reviewer or transcriber. Reviewers can edit all records and items, regardless of the item’s status. For Bills of Mortality, Reviewers include the staff members on the project and our Digital History Research Assistants. Transcribers can only edit records and items which are locked to them. The Bills of Mortality transcription team is made up of undergraduate and graduate students.

The Bills of Mortality transcription process starts with a Reviewer assigning items to the transcribers. Each item in DataScribe begins as an unlocked and new item in a dataset. When the Reviewer opens the dataset, they use filters to find these unlocked items and then assign them to individual members of the transcription team. The batch action dropdown menu allows Reviewers to select and edit multiple items at one time.

DataScribe interface showing three unassigned items. Two have a blue checkbox next to their title. There is an open dropdown menu with options for “Edit selected” and “Edit all”
Batch edit in action

After selecting items and choosing “Edit selected”, the Reviewer proceeds to the action options. In the Lock action dropdown menu, the Reviewer selects the person to lock the items to and then clicks save. Reviewers do have to be careful to click “Edit selected” and not “Edit all”, or they will mistakenly assign 1,600 items to a single transcriber. A certain reviewer and co-author of this post may have done that once and then he never did it again.

Locking an item to a transcriber assigns that item to that person. They are now the only transcriber who is able to edit that item. To find the items that have been assigned to them, the Transcriber can filter the dataset, much as the Reviewer did. The Transcriber will filter for “My items (items locked to me),” looking for those which are new or in progress. They open the item they want to work on and then select “Add new record.” Transcription work happens at the record level, even for projects like the Bills where each item only has one record. Any item with multiple records means something went wrong and the Transcriber should check with the Reviewer.
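
For readers who think in code, the locking and filtering rules above can be sketched in a few lines of Python. This is only an illustration of the logic, not DataScribe's actual code or API: the `Item` class, its field names, the status strings, and the user names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """Hypothetical stand-in for a DataScribe item (not the real data model)."""
    title: str
    status: str = "new"              # assumed values: "new", "in progress", ...
    locked_to: Optional[str] = None  # user the item is locked to, if any
    records: List[dict] = field(default_factory=list)


def can_edit(user: str, role: str, item: Item) -> bool:
    """Reviewers can edit anything; Transcribers only items locked to them."""
    if role == "reviewer":
        return True
    return item.locked_to == user


def my_open_items(user: str, items: List[Item]) -> List[Item]:
    """Mimic the "My items (items locked to me)" filter for new or in-progress work."""
    return [
        item for item in items
        if item.locked_to == user and item.status in ("new", "in progress")
    ]


def has_extra_records(item: Item) -> bool:
    """On the Bills each item should have one record; more suggests a mistake."""
    return len(item.records) > 1


# Made-up example data.
items = [
    Item("Bill A", locked_to="transcriber_a"),
    Item("Bill B", status="in progress", locked_to="transcriber_a"),
    Item("Bill C", locked_to="transcriber_b"),
    Item("Bill D"),  # still unlocked, waiting for a Reviewer to assign it
]
my_titles = [item.title for item in my_open_items("transcriber_a", items)]
print(my_titles)
```

The point of the sketch is that the lock doubles as both an assignment and a permission: the same `locked_to` value decides what shows up in a Transcriber's filter and what they are allowed to edit.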

With a new record, the Transcriber begins the transcription by filling in the form. They can save the record as they go along – a useful feature for a project with very long forms! If the transcription goes smoothly, without any need for questions or clarifications, the Transcriber saves the record and marks “submit for review.”

If the Transcriber encounters any issues, they can reach out to the Reviewer through the DataScribe interface or other channels. If the record includes any oddities or if the Transcriber has a question or concern, there is a field in the sidebar where they can leave a note for the Reviewer.

Close up of transcriber and reviewer notes. The transcriber notes discuss a few minor errors and illegible numbers. The reviewer’s reply confirms the reading of the illegible numbers.
Notes fields

The Bills of Mortality team also uses Slack to deal with questions and other issues. Our current transcription team is led by our Reviewer Dan with Atta, Emily, Katie, and Kayleigh working as Transcribers. Each week Dan checks in via Slack to let the Transcribers know of any changes to the workflow or any general feedback (it’s always good), but more importantly he always includes a photo of a dog. Members of the transcription team can message Dan through Slack or email him directly, or they can post to the project’s Slack channel where Megan or Principal Investigator Jessica can field questions. The most common expected errors are discussed in a blog post by Dan and Emily: 7 Problems to Expect when You’re Transcribing Historical Data and How to Avoid Them.

A brindle boxer leaning her head over the back of a couch
A very good dog

After the Transcriber has done their work, the item goes back to the Reviewer. In order to find completed transcriptions, the Reviewer filters the dataset for “All Items that need review”. These items can be sorted by item number, submission date, review date, prioritized records, or title. Sometimes the Principal Investigator wants a continuous set of items completed, and then it’s more helpful to sort by item title or number, but most of the time Bills of Mortality reviews are done by submission date to catch any transcription problems early. If the transcription is without error, the Reviewer will check the box to “mark as approved” and save the item as a completed transcription. If the record has a minor issue, such as an “is missing” that needs to be checked off or a typo leaving off the second digit of a number, the Reviewer can use their discretion to correct it before marking it as approved. Only items which have been approved are included in dataset exports.
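
As a rough sketch of the two ideas in this step, the review queue sorted by submission date and the rule that only approved items are exported, here is some illustrative Python. The dictionary keys, status strings, titles, and dates are invented for the example and do not reflect how DataScribe stores or exports data.

```python
from datetime import date

# Made-up items; "submitted" is None until the Transcriber submits for review.
items = [
    {"title": "Bill 3", "status": "needs review", "submitted": date(2022, 3, 9)},
    {"title": "Bill 1", "status": "needs review", "submitted": date(2022, 3, 2)},
    {"title": "Bill 2", "status": "approved", "submitted": date(2022, 3, 1)},
    {"title": "Bill 4", "status": "in progress", "submitted": None},
]


def review_queue(items, sort_key="submitted"):
    """Items waiting for review, oldest submission first by default."""
    pending = [item for item in items if item["status"] == "needs review"]
    return sorted(pending, key=lambda item: item[sort_key])


def exportable(items):
    """Only approved items make it into a dataset export."""
    return [item for item in items if item["status"] == "approved"]


queue_titles = [item["title"] for item in review_queue(items)]
title_order = [item["title"] for item in review_queue(items, sort_key="title")]
export_titles = [item["title"] for item in exportable(items)]
print(queue_titles, title_order, export_titles)
```

Passing `sort_key="title"` stands in for the case where a continuous run of items is wanted; the default mirrors the usual practice of reviewing by submission date.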

On occasion, transcriptions need more than a quick fix. The Reviewer can mark an item as “Not Approved” to send it back to the Transcriber to fix. This is most often the result of a record going unsaved before submission, or of a Transcriber losing track of which items are in progress and which are completed. In this project we try to be particularly attentive with newer Transcribers to help them become comfortable with the sources and the form with a minimum of errors. The current Bills of Mortality Transcriber team submits few mistakes, and it’s exceptionally rare that a Bill needs substantial work. If any Bills are sent back, the Reviewer leaves a comment in the notes section and the Transcriber corrects the record before submitting for review again.
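
Taken together, the life of an item in this workflow looks like a small state machine. The Python below is one way to picture it; the status and action names are our shorthand for this post and are assumptions, not DataScribe's internal labels.

```python
# Hypothetical (status, action) -> next status table for the workflow above.
TRANSITIONS = {
    ("new", "start"): "in progress",
    ("in progress", "submit for review"): "needs review",
    ("needs review", "approve"): "approved",
    ("needs review", "send back"): "not approved",
    ("not approved", "submit for review"): "needs review",
}


def advance(status: str, action: str) -> str:
    """Return the next status, or raise if the step is not part of the workflow."""
    key = (status, action)
    if key not in TRANSITIONS:
        raise ValueError(f"cannot {action!r} an item that is {status!r}")
    return TRANSITIONS[key]


# A Bill that is sent back once, corrected, and then approved.
status = "new"
history = [status]
for action in ["start", "submit for review", "send back",
               "submit for review", "approve"]:
    status = advance(status, action)
    history.append(status)
print(" -> ".join(history))
```

Writing the steps as a lookup table makes the one loop in the process easy to see: a “Not Approved” item can only return to review through the Transcriber submitting it again.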

— Dan Howlett and Megan Brett