LGBTQ+ Audio Archive Mining Project


LGBTQ+ history has often been hidden away. But we can bring that history out into the open–and you can play a part!

The Issue

The UWM Libraries house important archives holding historical and contemporary LGBTQ+ materials. Included are rich records of LGBTQ+ communities in Milwaukee, Wisconsin, and the Midwest generally. Not only do these archives contain textual documents such as community newsletters, advocacy group records, and personal letters–they also contain audiovisual materials. Examples include local television news and radio broadcasts, early LGBTQ+ community cable programming, and videorecorded oral histories.

Working with archives is fascinating because these are primary sources that can be full of surprises. Reading and listening, the researcher makes new discoveries: a handwritten note on a news clipping might yield new insights. A recording of an old news broadcast about high schools opening might make a surprising, quick reference to a newly formed student Gay-Straight Alliance.

But what makes archival research fascinating is also what makes it frustrating: often a user has little idea of what is in the archive. Some textual sources, like newspapers, may have been digitized and processed using optical character recognition (OCR), so you can use search terms to find materials you are interested in. But what about audiovisual materials? The UWM Libraries’ archives contain many digitized video and audio recordings that users can access, but these cannot simply be run through a text-recognition package to make their content searchable. So those using the audiovisual archives have had to rely on the terse description each item was given when it was added to the collection, which necessarily provides only a broad outline of the item’s actual contents.

The Solution

This is where the LGBTQ+ Audio Archive Mining Project comes in. Members of this project team are developing ways to automatically generate searchable text transcripts of the audiovisual materials in the archive. More than this: the team is using open-source software to create tools that will allow users like you to visualize patterns across the texts, such as how often various words appear and how the terms used relate to one another.

The end goal is that users will be able to search for terms that interest them–like “bisexual” or “domestic partner”–in novel places, like recordings of local news broadcasts not labeled as containing LGBTQ+ content. Users can find out how selected terms may have been used differently in the same timeframe in different contexts–were mainstream local news broadcasts mostly using the term “homosexual” at a time when community news sources were using “lesbian and gay”? A researcher could trace the uneven fading from usage of terms like “transvestite” alongside the growing usage of terms like “transgender”. And a user can trace the relationships between words. Are terms for LGBTQ+ people appearing in close proximity to terms identifying people of color? How does this change over time?
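For readers curious about what these kinds of searches might look like in practice, here is a minimal sketch in Python. It is not the project's actual tooling: the sample transcripts, the function names (tokenize, count_phrase, term_counts, near), and the ten-word default window are all assumptions made up for illustration. It counts how often chosen terms appear by year and source type, and how often two words fall close to each other in a transcript.

```python
import re
from collections import Counter

# Hypothetical transcripts, invented for illustration only:
# (year, source type, transcript text).
TRANSCRIPTS = [
    (1985, "mainstream", "The council heard from homosexual residents about the housing ordinance."),
    (1985, "community", "Lesbian and gay organizers met to plan the pride march."),
    (1995, "community", "Bisexual and transgender speakers joined lesbian and gay organizers."),
]


def tokenize(text):
    """Lowercase the text and split it into words (hyphens and apostrophes kept inside words)."""
    return re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower())


def count_phrase(tokens, phrase):
    """Count occurrences of a one-word or multi-word phrase in a token list."""
    words = phrase.lower().split()
    n = len(words)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words)


def term_counts(transcripts, terms):
    """Tally each term by (year, source type, term)."""
    counts = Counter()
    for year, source, text in transcripts:
        tokens = tokenize(text)
        for term in terms:
            hits = count_phrase(tokens, term)
            if hits:
                counts[(year, source, term)] += hits
    return counts


def near(tokens, a, b, window=10):
    """Count pairs where word a and word b occur within `window` words of each other."""
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    return sum(1 for i in pos_a for j in pos_b if abs(i - j) <= window)


if __name__ == "__main__":
    for key, hits in sorted(term_counts(TRANSCRIPTS, ["homosexual", "lesbian and gay"]).items()):
        print(key, hits)
    print(near(tokenize(TRANSCRIPTS[2][2]), "bisexual", "transgender", window=3))
```

A real tool would work on automatically generated transcripts, which contain recognition errors, so exact word matching like this would likely miss some mentions.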

The LGBTQ+ Audio Archive Mining Project aims to allow academic researchers and community members to discover new information about LGBTQ+ history. It also aims to make the tools it is creating–and the processes for developing them–available to all, so that they can be used to mine information about other communities, in other archives.

At this point, as the tools are being developed, you can help! Click here for information on how to help with beta testing. (Link to be developed.)