Deep History, Deepening Collaborations

I, Historian
Researching the Past in the Age of Artificial Intelligence
By Dr David Brown, Archival Discovery Lead, Virtual Record Treasury of Ireland

Although Artificial Intelligence (AI) burst into the public consciousness towards the end of 2022 with the launch of ChatGPT, AI has been a core research interest for the Virtual Record Treasury of Ireland since the early days of the project. Starting in 2018, the Beyond 2022 | Virtual Record Treasury of Ireland research programme developed a suite of deep-learning ‘models’, built from perfectly curated transcriptions, to train an AI system to read the handwriting in digital images of historical documents relating to Ireland and to convert it into searchable text. In 2020, the Library of Trinity College Dublin, a Core Partner within the Virtual Record Treasury of Ireland, became a founding member of the READ Consortium, the developer behind the ‘Transkribus’ platform for machine transcription of historical documents.

The English-language handwriting models developed by the Virtual Treasury were trained on some 7,500 images of documents. The model is now very powerful, capable of transcribing most handwritten texts in English from the period 1600–1850. The models are designed to produce a reasonably accurate first draft, good enough to make the document searchable. We have made our most capable general-purpose transcription model publicly available for anyone to use.

This automatic conversion of handwritten historical documents into searchable text is the latest major step in the digitisation of our written cultural heritage. Digital images have become the preferred format in libraries and archives because they are more accessible and convenient than older formats like microfilm. Digital images can be viewed on any device and shared easily, and they facilitate collaboration among researchers. They also offer customisation options for users with visual impairments.
However, digital images can be expensive to store, are prone to degradation, and can proliferate excessively. AI has the potential to preserve all of the benefits of digital images while reducing these drawbacks. This new ability to find a person, place or event among thousands, or even millions, of pages of handwritten text could be the first step in an exciting AI-powered world that might enable more complex historical hypotheses to be tested and questions to be answered at a speed and level of detail unimaginable only a few short years ago.

Further advancements in the AI environment are being made at a rapid pace, with emerging technologies such as Large Language Models (LLMs) powering human-like interfaces like ChatGPT. Careful experimental research by the Virtual Treasury team is investigating how LLMs might improve transcription accuracy, summarise dense documents effectively, extract valuable information for knowledge graphs, and even organise documents chronologically.

History is only as reliable as the sources from which it is written. As the new generation of AI already has the ability to ‘write’ history from any content available on the internet, the prudent approach adopted by the Virtual Record Treasury of Ireland is essential.
