AI in the Archive: Using Data Analytic Tools and Social Network Techniques to Uncover Hidden Histories

In this blog post, Joanne Peryer draws on the experience of writing her IHR MA dissertation to reflect on uses of AI in historical research.

The 21st century has seen the rapid and continuing expansion of the digitisation and online searchability of archives—many widely available through commercial platforms. Digitisation—whilst not completely error free—has enabled not only the easier identification of individuals but also the discovery of much more information about them. This is helpful and often offers intriguing new insights, but such knowledge cannot in itself reveal much that is significant about the communities and social dynamics of the society in which individuals lived and operated.

The sheer volume of digitised material now accessible has created both opportunities and challenges for historical research. Researchers can examine far more records than would have been feasible through traditional archival visits, and this in itself has democratised research. In his February 2025 National Humanities Lecture, David Olusoga recognised that genealogical websites are now amongst the most visited worldwide, and that family history is now a huge commercial industry. However, this abundance of data needs new methodological approaches to make sense of it all. This is where AI and machine learning techniques become essential tools. These tools are now also available to anyone inside or outside the academy, further democratising the ability to undertake this type of research.

Computational Approaches to Historical Networks

Network analysis can be a particularly powerful tool for understanding historical communities. By mapping the relationships between individuals documented in archival sources, researchers can identify clusters of connection that might otherwise remain invisible. These techniques allow us to move beyond the study of individual lives to understand the broader social structures within which people lived and worked.

My recent dissertation for the MA in History, Place and Community at the IHR applied some of these methods to 17th-century Quaker birth records (documents that have traditionally been used primarily for genealogical purposes) as a use case to demonstrate the untapped potential within familiar archival sources. The Quaker birth registers, with their unique practice of recording the names of both midwives and birth witnesses, provided a glimpse into Quaker women’s birth support networks. However, it is only through systematic quantitative analysis that patterns within this data can be discerned. This approach has revealed structures and relationships that no amount of close reading of individual records could have uncovered.

Until now it has been almost impossible to uncover most aspects of the early modern birthing chamber due to childbirth being a female, and therefore an almost completely undocumented, experience. Adrian Wilson—the foremost historian of this topic[1]—has drawn on (mostly male) existing narrative sources to estimate, for example, the numbers of ‘gossips’ (or ‘witnesses’ in Quaker parlance) in attendance at a birth. He has also speculated on how midwives were co-constituted in concert with their prospective clientele. Doreen Evenden’s seminal work on the midwives of 17th-century London[2] was clear that Quaker women in this period were happy to make use of Church of England licensed midwives. There is also an oft repeated assumption that labouring women were attended by friends and family.

Network analysis of these records, following the methodological approach set out below, identified the range and average number of birth witnesses, as well as clusters of family or friendship groups. Reciprocal attendance at births by friends did not prove to be the norm that had been anticipated. The analysis was also able to identify named midwives and to show where and how frequently they worked. It refuted the notion that any but a tiny number of the midwives used by Quakers were licensed by the Church of England. Most strikingly, it became clear that, in Southwark’s Quaker community at least, a small cadre of experienced women attended a majority of births, rather than simply family and friends. It was also possible to glimpse midwives’ ‘training’ relationships and to identify birth witnesses who went on to become midwives themselves.
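The kinds of summary described here (the range and average number of birth witnesses, how often each midwife appears, and how concentrated attendance was) come down to simple counting once the registers are in tabular form. The following is a minimal Python sketch, not the analysis actually run for the dissertation: the field names `midwife` and `witnesses`, the helper `share_of_births_attended`, and every name in the sample data are invented placeholders.

```python
from collections import Counter
from statistics import mean

# Invented placeholder records; the field names are assumptions,
# not the layout of the actual transcription spreadsheets.
records = [
    {"midwife": "Midwife 1", "witnesses": ["Witness A", "Witness B", "Witness C"]},
    {"midwife": "Midwife 1", "witnesses": ["Witness A", "Witness D"]},
    {"midwife": "Midwife 2", "witnesses": ["Witness A", "Witness B", "Witness E", "Witness F"]},
    {"midwife": "Midwife 1", "witnesses": ["Witness G"]},
]

# Range and average number of witnesses per birth.
witnesses_per_birth = [len(r["witnesses"]) for r in records]
witness_summary = {
    "min": min(witnesses_per_birth),
    "max": max(witnesses_per_birth),
    "mean": mean(witnesses_per_birth),
}

# How many births each named midwife and each witness attended.
births_per_midwife = Counter(r["midwife"] for r in records)
births_per_witness = Counter(w for r in records for w in r["witnesses"])


def share_of_births_attended(core_group, records):
    """Fraction of births at which at least one member of core_group was a witness."""
    attended = sum(1 for r in records if core_group & set(r["witnesses"]))
    return attended / len(records)


# The single most frequent witness, treated here as a one-woman "core group".
core_group = {name for name, _ in births_per_witness.most_common(1)}
core_share = share_of_births_attended(core_group, records)

print(witness_summary)
print(births_per_midwife.most_common())
print(core_group, core_share)
```

Widening `most_common(1)` to a handful of names is one way to ask how much of the recorded birth support rested on a small group of women.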

After some initial exploration I chose JuliusAI as my tool. This is not a large language model (LLM) like ChatGPT, which is what many people immediately think of when “AI” is mentioned. Instead it is an AI-powered tool which uses Python code to carry out data analysis and deploy community detection algorithms. My process was to manually transcribe approximately 1,300 records (available online) from the archive into spreadsheets. An example of these records is shown above.
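For readers curious about what a community detection step might look like in Python, here is a rough sketch under stated assumptions. The post does not say which algorithm the tool ran, so weighted label propagation is swapped in purely as a simple illustration. The names, the sample births, and the functions `build_network` and `label_propagation` are all invented for this example.

```python
from collections import defaultdict
from itertools import combinations

# Invented placeholder data: each row is one birth and the women recorded at it.
births = [
    ["Ann", "Bridget", "Cicely"],
    ["Ann", "Bridget", "Cicely"],
    ["Cicely", "Dorcas"],
    ["Dorcas", "Esther", "Faith"],
    ["Dorcas", "Esther", "Faith"],
]


def build_network(births):
    """Weighted co-attendance network: weight = number of births two women shared."""
    weights = defaultdict(lambda: defaultdict(int))
    for attendees in births:
        for a, b in combinations(sorted(set(attendees)), 2):
            weights[a][b] += 1
            weights[b][a] += 1
    return weights


def label_propagation(weights, max_rounds=100):
    """Deterministic weighted label propagation (an illustrative stand-in).

    Every woman starts in her own group and repeatedly adopts the label that
    carries the most co-attendance weight among her neighbours.
    """
    labels = {woman: woman for woman in weights}
    for _ in range(max_rounds):
        changed = False
        for woman in sorted(weights):
            totals = defaultdict(int)
            for neighbour, weight in weights[woman].items():
                totals[labels[neighbour]] += weight
            best = max(totals.values())
            candidates = sorted(l for l, t in totals.items() if t == best)
            # Keep the current label on a tie; otherwise take the first candidate.
            if labels[woman] not in candidates:
                labels[woman] = candidates[0]
                changed = True
        if not changed:
            break
    groups = defaultdict(list)
    for woman, label in labels.items():
        groups[label].append(woman)
    return sorted(sorted(members) for members in groups.values())


network = build_network(births)
communities = label_propagation(network)
print(communities)
```

In this toy data the two groups of three are separated because Cicely and Dorcas share only one birth, while the women within each group share two. Real registers are far messier, and different algorithms can partition the same network differently.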

These spreadsheets were uploaded into the tool to be used as the building blocks for analysis. As a historian with no ability in (or desire to learn) coding, I initially found the opportunity to ask questions in plain English and receive real-time answers astonishing. Thinking of these tools as a research assistant, or even a project team, made it clear that individual historians can now rapidly do the kinds of work that would previously have required human research assistants or teams, with all the messy supervision (not to say funding) that this entails. Rather than requiring adherence to a traditional hypothesis-based approach, the rapidity of the tool’s work allows historians to follow interesting new connections and to ask new ‘what about’ questions, which can in themselves lead to new hypotheses.

As a historian I was conscious of my ignorance of network analysis, a vast specialist field of computer science. Nevertheless, I was aware that different kinds of ‘centrality’ measures can reveal different aspects of a social network’s structure and dynamics, such as how easily information spreads, who holds power, and who is most connected. When prompted, the tool can produce visualisations of this kind of data, offering immediate insights and a spur to further research. These network concepts are vital to historians interested in populations where information about individuals is scant, and can also support prosopographical approaches.
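To make the point about centrality concrete, the sketch below compares two common measures on a tiny invented network: degree (how many women someone is directly tied to) and betweenness (how often she sits on the shortest path between two others, computed here with Brandes' algorithm for unweighted networks). The names and ties are placeholders rather than data from the registers, and this is not a record of what the tool computed.

```python
from collections import deque

# Invented placeholder network: two tight groups joined only through "Mary".
ties = [
    ("Ann", "Bridget"), ("Ann", "Cicely"), ("Bridget", "Cicely"),
    ("Cicely", "Mary"), ("Mary", "Esther"),
    ("Esther", "Faith"), ("Esther", "Grace"), ("Faith", "Grace"),
]

neighbours = {}
for a, b in ties:
    neighbours.setdefault(a, set()).add(b)
    neighbours.setdefault(b, set()).add(a)


def degree(neighbours):
    """Number of direct ties each woman has."""
    return {woman: len(others) for woman, others in neighbours.items()}


def betweenness(neighbours):
    """Unnormalised betweenness centrality (Brandes' algorithm, unweighted)."""
    score = {v: 0.0 for v in neighbours}
    for source in neighbours:
        order = []                                  # nodes in order of discovery
        preds = {v: [] for v in neighbours}         # predecessors on shortest paths
        paths = {v: 0 for v in neighbours}          # number of shortest paths
        dist = {v: -1 for v in neighbours}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                                # breadth-first search
            v = queue.popleft()
            order.append(v)
            for w in neighbours[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in neighbours}
        for w in reversed(order):                   # accumulate from the farthest node back
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each pair is counted from both ends in an undirected network.
    return {v: s / 2 for v, s in score.items()}


degree_scores = degree(neighbours)
betweenness_scores = betweenness(neighbours)
for woman in sorted(neighbours):
    print(woman, degree_scores[woman], betweenness_scores[woman])
```

In this toy network "Mary" has only two ties yet the highest betweenness, because every path between the two groups runs through her. That contrast is the reason the choice of centrality measure matters.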

The simple example below shows a snapshot of the women ‘bridging’ between each of the three London Quaker Meetings whose birth records I studied. Each colour represents a different Meeting, and the thickness of each line represents the strength of the connection between individuals. It begins to give an insight into the most influential and connected women (in birth terms) in their local communities.
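One simple way to look for ‘bridging’ women, assuming each transcribed record notes the Meeting whose register it came from, is to list everyone who appears at births in more than one Meeting. The sketch below is illustrative only: the labels "Meeting A", "Meeting B" and "Meeting C", the field names, the personal names, and the functions `meetings_by_woman` and `bridging_women` are all invented.

```python
from collections import defaultdict

# Invented placeholder records; "Meeting A/B/C" stand in for the three Meetings.
records = [
    {"meeting": "Meeting A", "attendees": ["Ann", "Bridget"]},
    {"meeting": "Meeting A", "attendees": ["Ann", "Cicely"]},
    {"meeting": "Meeting B", "attendees": ["Cicely", "Dorcas"]},
    {"meeting": "Meeting B", "attendees": ["Dorcas", "Esther"]},
    {"meeting": "Meeting C", "attendees": ["Esther", "Faith", "Cicely"]},
]


def meetings_by_woman(records):
    """Map each woman to the set of Meetings in whose records she appears."""
    seen = defaultdict(set)
    for record in records:
        for woman in record["attendees"]:
            seen[woman].add(record["meeting"])
    return seen


def bridging_women(records):
    """Women recorded at births in more than one Meeting, most Meetings first."""
    seen = meetings_by_woman(records)
    bridges = {w: sorted(m) for w, m in seen.items() if len(m) > 1}
    return dict(sorted(bridges.items(), key=lambda item: (-len(item[1]), item[0])))


bridges = bridging_women(records)
for woman, meetings in bridges.items():
    print(woman, meetings)
```

A list like this is only a starting point. Whether two entries under the same name really refer to the same woman is a question of historical judgement that the code cannot settle.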

Methodological Considerations and Limitations

It is crucial to acknowledge that these computational approaches are not without their challenges and limitations. The quality of digitised records can vary and, moreover, the algorithms used for community detection make assumptions about network structure that may not always align with historical realities.

Perhaps most importantly, whilst AI can identify patterns at scale, it cannot replace the interpretative work of the historian. It is vital therefore to have a thorough understanding of both the limitations of the original data being analysed and the context of the historical sources used. The significance of a particular network structure, the meaning of a relationship, the context that makes a pattern meaningful—these require historical expertise and critical analysis. The role of computational tools is to reveal what needs to be interpreted, not to provide the interpretation itself.

The Future of AI in Historical Research

The integration of AI and machine learning into historical research is still in its early stages, but the potential is considerable. As these tools become more sophisticated and more widely accessible, they will enable historians to ask questions at scales that were previously thought impossible. They will allow us to test hypotheses against larger datasets, to identify patterns that might challenge our existing assumptions, and to uncover stories that have been hidden in plain sight within the archives. My own research, using these network analysis techniques, has for example challenged existing beliefs about 17th-century midwifery and childbirth.

To sum up, community detection approaches can be used to reconstruct aspects of communities and networks from the scant mentions of individuals found in the archive, and can reveal patterns that would otherwise have remained hidden. The historian’s critical faculties remain essential; algorithms can identify patterns but cannot interpret their meanings or assess their significance within the full complexity of historical contexts.

These tools are therefore not a replacement for close reading and contextual analysis, but a complement to it: a way of seeing the archive from a different angle, of revealing structures and relationships that would otherwise remain hidden. There are almost certainly existing databases and projects that the author is unaware of where these approaches would be useful. It is the work of a few hours for historians to make an initial exploration of whether these tools can offer insights into their own projects. As more explorations are undertaken, new questions, developments, concerns and critiques will inevitably come to the fore. What is certain is that we must engage with these new tools, embracing what they can give us whilst remaining alert to their limitations.


[1] Wilson, Adrian, Ritual and Conflict: The Social Relations of Childbirth in Early Modern England (London: Routledge, 2016)

[2] Evenden, Doreen, The Midwives of Seventeenth-Century London (Cambridge: Cambridge University Press, 2000)

Joanne Peryer graduated from SSEES, University of London with a BA (Hons) History in 1984. Subsequently she gained an MBA whilst holding senior management positions in the Civil Service and NHS. She holds a Post Graduate Diploma (distinction) from the IHR’s History, Place and Community programme (2025).

She has a long-standing interest in family history, which first led her to the London Quaker birth, marriage and death records. These records, together with her other long-standing interests in service improvement and systems thinking, have led Joanne to develop an interest in how AI-powered tools can help the historian identify otherwise unobservable patterns and connections in the archive.

Joanne lives in rural east Leicestershire where she is currently developing an oral community history project on the impact of World War Two on local families and researching local links to slavery.

The post AI in the Archive: Using Data Analytic Tools and Social Network Techniques to Uncover Hidden Histories appeared first on On History.
