Chapter 6 - Algorithmic Close Reading: Analyzing Vectors of Agency in Holocaust Testimonies 

This chapter analyzes agency in Holocaust testimonies, that is, expressions of what people say they did or what was done to them, to understand vectors of action ranging from coerced actions to possibilities for resistance and defiance. Using Natural Language Processing (NLP), we developed a rule-based algorithm to extract, index, and visualize mentions of agency in testimonies. These mentions take the form of “semantic triplets,” concise textual units consisting of a subject, a verb relation, and an object. Our algorithm categorizes these triplets according to the type of speech used: active, passive, speculative/modal, or coerced speech, as well as subjective evaluations and contextual orientations. We use these expressions to identify unindexed actions, such as microhistorical acts of resistance, and to explore the contours of shared experiences (for example, how hundreds of witnesses remember and talk about antisemitism in the 1930s and early 1940s). Our “AI and Cultural Heritage” Lab is building on these methodologies and initial prototypes by using the affordances of Large Language Models to create robust knowledge graphs that facilitate finer-grained analyses of agency and context.

1. Semantic triplets: dashboard and data

Developed by Lizhou Fan and Todd Presner

The semantic triplet dashboard lets users explore expressions of agency in four Holocaust testimonies given by Jürgen Bassfreund (later known as Jack Bass), Anna Kovitzka, and Erika Jacoby. The dashboard filters allow you to browse phrases that describe active, passive, or coerced actions, as well as expressions of evaluation (personal opinions), orientation (context information), and speculation (uncertain or imagined actions). These expressions take the form of “semantic triplets,” meaning subject-relation-object structures.
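To make the subject-relation-object format and the dashboard filters more concrete, here is a minimal Python sketch. It is not the project's algorithm: the cue lists, the `Triplet` class, and the `classify` and `filter_by_category` functions are hypothetical names chosen for illustration, the example triplets are invented rather than quoted from any testimony, and the evaluation and orientation categories are left out because they need more context than a single verb phrase.

```python
from dataclasses import dataclass

# Hypothetical cue lists, assumed for illustration only.
COERCION_CUES = ("had to", "forced to", "ordered to", "made to")
MODAL_CUES = {"could", "would", "might", "may", "should"}
PASSIVE_AUXILIARIES = {"was", "were", "been", "being", "got"}


@dataclass(frozen=True)
class Triplet:
    """A subject-relation-object unit."""
    subject: str
    relation: str
    obj: str


def classify(triplet: Triplet) -> str:
    """Assign a coarse agency category based on the relation phrase alone."""
    relation = triplet.relation.lower()
    words = relation.split()
    # Coercion is checked first, so "were forced to march" is not read as plain passive.
    if any(cue in relation for cue in COERCION_CUES):
        return "coerced"
    if any(word in MODAL_CUES for word in words):
        return "speculative"
    if len(words) >= 2 and words[0] in PASSIVE_AUXILIARIES:
        return "passive"
    return "active"


def filter_by_category(triplets, category):
    """Mimic a dashboard filter: keep only triplets of one category."""
    return [t for t in triplets if classify(t) == category]


# Invented examples, not quotations from any testimony.
EXAMPLES = [
    Triplet("I", "hid", "the bread"),
    Triplet("we", "were taken", "to the station"),
    Triplet("we", "had to wear", "the star"),
    Triplet("I", "could have escaped", "that night"),
]

for t in EXAMPLES:
    print(f"{classify(t):12} {t.subject} | {t.relation} | {t.obj}")
```

Surface cues like these misfire often, which is one reason manual review of the results matters.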

The first dashboard shows our original approach using “contextual terms” (all the co-occurring terms near the triplet), while the second dashboard shows the “meticulous approach” (which parses every sentence and tends to return longer object phrases). Our methodology and classification algorithm are built on the spaCy natural language processing library. The dashboard results have been manually reviewed and corrected by our team (the correction rate is about 15% for the “meticulous” approach). The downloadable data document the review and correction process.

You can visit Lizhou Fan’s semantic_triplets GitHub repository for more information about the process, including full documentation, code, and sample data.

Data sources: Voices of the Holocaust (Paul V. Galvin Library, Illinois Institute of Technology); David Boder, Topical Autobiographies (1957), UCLA Young Research Library Special Collections; and the USC Shoah Foundation Visual History Archive.


2. Network Visualization: Shared discrimination experiences

Developed by Jack Schaefer, with data by Lizhou Fan and Todd Presner

The video shows an interactive network visualization that connects 607 experiences of discrimination in the late 1930s to early 1940s as told by 268 Holocaust survivors. Larger nodes represent discrimination experiences mentioned by multiple witnesses, and lines connect the person who is speaking (subject) to the experience they are describing (relation and object).
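The structure described above, witnesses linked to the experiences they mention with larger nodes for experiences shared by more witnesses, can be sketched in a few lines of Python. This is an illustrative assumption about how such a network might be assembled, not the project's pipeline: `build_network` is a hypothetical helper, the mentions are invented placeholders, and the returned structures are not the VOSviewer file format.

```python
from collections import defaultdict


def build_network(mentions):
    """Build node sizes and edges from (witness, experience) pairs.

    The experience string stands in for the triplet's relation and object;
    the witness is the speaking subject.
    """
    witnesses_by_experience = defaultdict(set)
    for witness, experience in mentions:
        # A set counts each witness once per experience, even if repeated.
        witnesses_by_experience[experience].add(witness)

    # Node size = number of distinct witnesses who mention the experience.
    node_sizes = {
        experience: len(witnesses)
        for experience, witnesses in witnesses_by_experience.items()
    }
    # One edge per distinct (witness, experience) pair.
    edges = sorted(
        (witness, experience)
        for experience, witnesses in witnesses_by_experience.items()
        for witness in witnesses
    )
    return node_sizes, edges


# Invented placeholder data, not drawn from the archive.
MENTIONS = [
    ("Witness A", "was expelled from school"),
    ("Witness B", "was expelled from school"),
    ("Witness B", "was expelled from school"),  # repeated mention, counted once
    ("Witness A", "lost the family business"),
]

node_sizes, edges = build_network(MENTIONS)
print(node_sizes)
print(edges)
```

Counting distinct witnesses rather than raw mentions keeps one speaker's repetition from inflating a node.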

The graph was created using VOSviewer, an online network visualization tool. To recreate and explore the interactive network in your own browser, download the VOSviewer file, unzip it, open VOSviewer, click “open” on the folder icon in the top right corner, and upload the JSON file.

Data sources: USC Shoah Foundation Visual History Archive

Triplet examples of discrimination experiences
Triplet examples of discrimination with “we” as subject
Examples of discrimination experiences by Rudi Bamber and Maly Kohn
Example of discrimination experiences by Max Eisen

3. Experimental Interface: Testimonial Ensemble

Developed by Chereen Tam and Todd Presner

This experimental interface gives visual form to a multi-voiced testimony in which survivors describe experiences of discrimination across Europe in the late 1930s and early 1940s. Hundreds of “semantic triplets,” or subject-relation-object structures, were derived from the testimony transcripts and characterized by the experience described and the type of agency the witness expresses. The video shows an interactive montage of interview snippets corresponding to discrimination triplets.

Data sources: USC Shoah Foundation Visual History Archive

Explore two versions of the interactive montage: