
For nearly 200 years, the Baltimore & Ohio (B&O) Railroad forged an innovative path through the American landscape. As the country’s first common carrier commercial railroad and builder of the first American-made steam locomotive, the B&O Railroad was essentially the internet of its time, transforming communication, transportation, and daily life in the US.

This article originally appeared in the July/August 2025 issue of Museum magazine, a benefit of AAM membership.
The first telegraph message was sent by Samuel Morse along B&O lines to a station in downtown Baltimore. Our current time zones were standardized by railroads. Even first responders and Mother’s Day have ties to the railroads. The B&O was chartered just 50 years after the nation’s founding, and 2027 marks its 200th anniversary.
At the B&O Railroad Museum, located at the historic birthplace of American railroading, we not only celebrate the economic, social, and cultural contributions of the railroad to American life, we are also looking ahead, using artificial intelligence technology on archival records to unlock the past. One initiative centers on the B&O Railroad Relief Department, established in 1880, which made the B&O the first railroad, and one of the first American companies, to institute an employer-managed health insurance program.
In partnership with Johns Hopkins University, the museum is using AI to digitize these records and create a searchable database to aid access and analysis. With access to data at a scale not available outside of the federal government, researchers, for the first time ever, will be able to compile statistics and better understand a vast and diverse segment of the working population in the 20th century.
A Treasure Trove of Data
After the Great Railroad Strike of 1877, when railroad employees in multiple states went on strike to protest wage cuts and working conditions, the B&O Relief Department was formed. For the next eight decades, well before our current employer-paid health insurance system was established, B&O employees could opt in to the program, paying a portion of their paycheck into a fund. In exchange, the railroad provided medical care to employees when they became sick or injured on the job, and their families received death benefits should they be killed at work. Open to all employees regardless of gender, ethnicity, or position, the program was a model that inspired similar efforts in other railroads and industries and eventually played a role in the creation of federal workplace benefits that continue to this day.
In the 1990s, when Baltimore’s Camden Station warehouse was sold and redeveloped into Oriole Park at Camden Yards, the B&O Railroad Museum acquired the surviving Relief Department records—over 16 million documents. With such an enormous collection, which included medical records subject to legal access restrictions, how could the museum realistically understand what it had—let alone make the data accessible? What researcher could or would take the time to review and make sense of so many records?
These questions remained unanswered until 2022, when the museum received a two-year initial processing grant from the National Historical Publications and Records Commission (NHPRC). From 2022 to 2024, case files representing 15,000 B&O employees who joined the Relief Department between 1900 and 1960 were surveyed, rehoused, and documented, while archives staff simultaneously worked to understand the legal and ethical implications at play.
As the team neared the end of the project, new questions surfaced. What kind of researcher would be interested in this data? What data would be important to them, and how could it be provided in a meaningful way? How could sensitive information be protected without sacrificing access? And how could deteriorating paper records be preserved?
In late 2023, we began a series of meetings with medical researchers and historians from Johns Hopkins University (JHU) and Johns Hopkins Hospital to better gauge potential interest in the collection and identify information of value in these records. Everyone agreed that it was a one-of-a-kind source of information with real-world potential for new findings and uses, particularly in the public health field. For example, by better understanding workers’ medical and occupational histories, researchers could gain insight into worker safety, improvements in medical practice and treatment over time, the benefits of unemployment insurance, the logistics of private health insurance programs, and the development of government welfare programs.
However, there was a catch. Traditional research methods would most likely require pulling files one at a time to manually find and compile relevant data, a task too onerous for a collection containing 16 million documents. There had to be a better way forward. Given the immense scale of the collection, finding the right institutional partner would be key.
Enter Johns Hopkins University
Johns Hopkins was one of the B&O’s founders, and his legacy in Baltimore lives on through his eponymous university and hospital. Through a series of introductions, the museum became familiar with JHU’s new SNF Agora Institute, which is spearheading the university’s research into artificial intelligence, as well as one of its professors, Dr. Louis Hyman. Hyman, a labor historian, is interested in using AI to bring order and interpretation to historical records. He and his colleagues were already doing similar work with The Samuel Gompers Papers at the Library of Congress, and Hyman was interested in working with a much more extensive collection that could be used to learn more about early 20th-century labor practices.
The museum provided a single B&O employee’s record for Hyman’s team to assess with its AI. In less than 10 minutes, the AI read, analyzed, and concisely summarized key information about the employee: family members, job, dates of claims, and types of illnesses and injuries. Given the ease and speed of this evaluation, we wondered what could be accomplished at greater scale.
Recognizing the importance and complexity of this treasure trove of documents, the B&O Railroad Museum and JHU formed a partnership in the spring of 2024. We secured a grant to digitize 25,000 pages of Relief Records and 150 microfilm rolls from the B&O Railroad Employee Records Collection—approximately 600,000 images. These would pilot a database, with a custom AI interface to aid access and analysis, that could interpret records ranging from handwritten notes to typewritten forms spanning 80 years. We wondered: Could AI not just read these medical forms but also understand them?
Data, Transformed
By fall of 2024, the pilot project was completed, resulting in three major outcomes. First, approximately 625,000 documents and microfilm images, representing the lives of around 150,000 B&O employees, were scanned to create a body of data for AI to analyze. Second, AI converted the text using optical character recognition and generated transcriptions of each page. And third, AI pulled demographic data (names, dates, locations, occupations, etc.) from these transcriptions to populate entries for each employee in a searchable database, which were then linked to both the scanned images and the AI-generated transcriptions.
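To make the third outcome concrete, here is a minimal sketch of how one transcription could become a database entry that stays linked to its scan and its transcription. It is an illustration only: the project used AI for this step, while the sketch swaps in simple pattern matching, and the field labels, file path, and sample employee are all invented for the example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical field labels. The real forms span 80 years and vary widely,
# which is why the project relies on AI rather than fixed patterns like these.
FIELD_PATTERNS = {
    "name": re.compile(r"Name:\s*(.+)"),
    "occupation": re.compile(r"Occupation:\s*(.+)"),
    "location": re.compile(r"Location:\s*(.+)"),
}


@dataclass
class EmployeeEntry:
    """One database entry, linked back to its scan and its transcription."""
    image_path: str
    transcription: str
    fields: dict = field(default_factory=dict)


def build_entry(image_path, transcription):
    """Pull labeled fields out of a transcription and keep the source links."""
    fields = {}
    for key, pattern in FIELD_PATTERNS.items():
        match = pattern.search(transcription)
        if match:
            fields[key] = match.group(1).strip()
    return EmployeeEntry(image_path, transcription, fields)


# Invented sample transcription, not taken from the collection.
sample = "Name: John Doe\nOccupation: Brakeman\nLocation: Cumberland, Md."
entry = build_entry("scans/example_0001.tif", sample)
print(entry.fields)
```

Keeping the image path and the full transcription on every entry is what lets a researcher check an extracted value against the original page.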
With the completed search interface, records can be queried by keyword. Due to the sensitive nature of the personal medical information in many of the records, the search protocol is contained on a small computing device that connects to the secure servers where the data is stored. Users must therefore use the database in person to search and compile results for analysis.
At present, the AI database is simple but allows keyword searching within the demographic records and full-text transcriptions. Entering single words or short phrases generates a list of records containing those words or phrases. It easily compiles a list of employees who share a common piece of information, such as a name or location. The interface is limited to elementary searches; however, it can search and compile results from hundreds of thousands of records within seconds, far faster than any human researcher. It has also proven useful in comparing employees listed in the relief and employment collections and identifying those individuals found in both.
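The two capabilities described here, keyword search and matching across collections, can be sketched in a few lines. This is a simplified illustration and not the museum’s actual interface; the record layout, identifiers, and names are assumptions made up for the example.

```python
def keyword_search(records, phrase):
    """Return ids of records whose fields or transcription contain the phrase."""
    needle = phrase.casefold()
    hits = []
    for rec in records:
        # Search the extracted fields and the full-text transcription together.
        searchable = " ".join([rec["transcription"], *rec["fields"].values()])
        if needle in searchable.casefold():
            hits.append(rec["id"])
    return hits


def normalize(name):
    """Lowercase a name and collapse extra spaces so variants compare equal."""
    return " ".join(name.casefold().split())


def in_both(relief_names, employment_names):
    """Names that appear in both the relief and employment collections."""
    relief = {normalize(n) for n in relief_names}
    employment = {normalize(n) for n in employment_names}
    return sorted(relief & employment)


# Invented sample records, not taken from the collection.
records = [
    {"id": "relief-001", "fields": {"name": "John Doe", "location": "Cumberland"},
     "transcription": "Claim for injured hand."},
    {"id": "relief-002", "fields": {"name": "Jane Roe", "location": "Baltimore"},
     "transcription": "Treated at Cumberland for influenza."},
    {"id": "relief-003", "fields": {"name": "Richard Roe", "location": "Baltimore"},
     "transcription": "Membership application."},
]

print(keyword_search(records, "cumberland"))
print(in_both(["John Doe", "Jane  Roe"], ["JOHN DOE", "Richard Roe"]))
```

Exact matching like this is fast but brittle: a misspelled or mistranscribed name will be missed, which is one reason transcription accuracy matters so much for search results.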
Next Steps
By all accounts, the project has been a clear and groundbreaking success. The digitized documents are now accessible and protected from physical handling. But one key challenge remains: accuracy. Given the mix of handwriting, aging paper, and degraded ink, how faithful are the transcriptions to the original content? And how does that affect search results for specific subsets of records?
Moving forward, answering these and other questions will be key to achieving our three goals: to continue scanning documents to increase the data pool and train the AI; to develop methods for testing and improving the veracity of the AI-generated transcriptions and data entries; and to upgrade the interface to support more dynamic, chatbot-style queries.
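One widely used way to test transcription accuracy is the character error rate: the number of single-character edits needed to turn the AI transcription into a hand-checked reference, divided by the length of the reference. The sketch below shows the idea under that assumption. It is not a method the project has said it will adopt, and the sample strings are invented.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Edits needed per reference character; 0.0 means a perfect match."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# A hand-checked word versus a machine reading that mistook "m" for "rn".
print(character_error_rate("Brakeman", "Brakernan"))
```

Scoring a small, hand-checked sample this way for each document type (handwritten notes, typewritten forms, microfilm) would show where the transcriptions are weakest.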
So far, just 0.15 percent of the Relief Records have been digitized. A second NHPRC preservation grant we received in 2024 will support the processing of 25,000 more case files. Yet, the vast majority of records remain untouched, and much more work lies ahead.
The fact that the Relief Records have survived to the present day is remarkable. Who could have foreseen that 21st-century technology would not only revolutionize the study of this data but also transform how archival research is conducted? With AI, we now have the power to analyze and organize vast amounts of historical data like never before. The implications are far-reaching for research, budgets, staffing, and storage footprints, both physical and digital.
Many important questions remain on this journey. How can we protect sensitive information at this scale from being inadvertently revealed to researchers? How can we create a sophisticated interface that enables complex questions and nuanced results? And how can we ensure that the data interpreted and compiled by AI is free from misrepresentation and bias?
With one of the oldest and largest surviving collections of early occupational health data, the B&O Railroad Relief Records Collection represents an unprecedented opportunity—for AI development; for new, untapped areas of research; and for museum collections and archival exploration.
Learn more and get involved:
Visit borail.online/ReliefRecords or email Anna at akresmer@borail.org for more information on this project.
Visit AmericanRail200.org to find out how you can participate in the 200th anniversary of American rail.
Want to dive deeper? In the AAM Member Resource Library, you can explore even more about AI and digital content in museums.
Anna Kresmer is Archivist and Jonathan Goldman is Chief Curator at the B&O Railroad Museum in Baltimore, Maryland.
RELATED:
Using AI to Work at Scale
by Dr. Louis Hyman, Dorothy Ross Professor of History at the Center for Economy and Society, SNF Agora Institute
As is typical for many archives, a large portion of the B&O collection consists of employees’ bureaucratic forms: readable individually but resistant to large-scale understanding. A single worker’s story might be recovered, but drawing broader insights into workers’ history from those forms remained out of reach.
Using large language models (LLMs), commonly known as AI, we have begun to “read” these forms in a new way. LLMs enable machine transcription that conventional optical character recognition techniques could not achieve. Beyond simple transcription, AI organizes the information from highly varied documents. In our pilot, we’ve arranged nearly 100,000 records in a NoSQL database, which supports flexible, free-form data, turning the records’ heterogeneity into an opportunity rather than a constraint.
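A small sketch may help show why a flexible, document-style store suits such varied forms. Plain Python dictionaries stand in for the NoSQL database here, and the form types, field names, and values are invented for illustration rather than drawn from our actual schema.

```python
# Each record carries only the fields its form actually contained,
# so no single fixed table layout has to fit every document type.
# All values below are invented samples.
collection = [
    {"id": "r1", "form_type": "membership application",
     "name": "John Doe", "occupation": "Brakeman"},
    {"id": "r2", "form_type": "disability claim",
     "name": "John Doe", "injury": "crushed hand", "claim_date": "1912-03-04"},
    {"id": "r3", "form_type": "death benefit",
     "name": "Jane Roe", "beneficiary": "Richard Roe"},
]


def find(docs, **criteria):
    """Return documents whose fields match every given key/value pair."""
    return [doc for doc in docs
            if all(doc.get(key) == value for key, value in criteria.items())]


def field_names(docs):
    """List every field that appears anywhere in the collection."""
    return sorted({key for doc in docs for key in doc})


print([doc["id"] for doc in find(collection, name="John Doe")])
print(field_names(collection))
```

A query on a field that only some forms have, such as `injury`, simply skips the records that lack it instead of failing.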
Unlike traditional database software, our web interface displays the original document alongside the structured data, adapting the format to each document. Researchers can view the source document and extracted data side by side, making interpretation and verification easier.
Throughout 2025, we are revising the system into a retrieval-augmented generation (RAG) system built on an LLM. In addition to performing conventional searches, researchers will be able to ask questions and receive natural language responses through AI.
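In outline, retrieval-augmented generation first retrieves the records most relevant to a question and then passes them to the language model as context for its answer. The sketch below covers only the retrieval and prompt-building steps. It scores passages by simple word overlap as a stand-in (production systems typically use learned text embeddings), it makes no call to any model, and the passages and function names are invented for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, passages, k=2):
    """Return ids of the k passages sharing the most words with the question."""
    question_words = Counter(tokenize(question))
    scored = []
    for pid, text in passages.items():
        overlap = sum((question_words & Counter(tokenize(text))).values())
        if overlap:
            scored.append((overlap, pid))
    # Highest overlap first; ties are broken by passage id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [pid for _, pid in scored[:k]]


def build_prompt(question, passages, k=2):
    """Assemble the text a language model would receive: sources, then question."""
    context = "\n".join(f"[{pid}] {passages[pid]}"
                        for pid in retrieve(question, passages, k))
    return f"Answer using only the records below.\n{context}\nQuestion: {question}"


# Invented sample passages, not taken from the collection.
passages = {
    "p1": "Brakeman injured hand at Cumberland yard",
    "p2": "Clerk treated for influenza in Baltimore",
    "p3": "Fireman burned arm at Cumberland roundhouse",
}
print(build_prompt("Which workers were injured at Cumberland?", passages))
```

Because each passage keeps its identifier in the prompt, an answer can cite the records it drew on, which lets researchers verify it against the source documents.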
As this project has shown, AI opens new possibilities for making sense of bureaucratic archives once considered too vast to fully comprehend.