Digital Media’s Demands and Yields

Written by: Ryan Edge

Primary Source: Digital Scholarship Collaborative Sandbox

In lieu of telling you where this budding Media Preservation program and I are at in our fourth month together, I’m going to share a few basic concepts of the field, before steering slightly toward digital scholarship and a few tools/resources that might excite you. In other words, I’ll keep it light, and will share projects and outcomes at a later date (very soon).

First, some obligatory background: Media Preservation, as a field, is at this time pushing at full steam to migrate vast numbers of analog and physical digital recordings stored on obsolete and endangered formats. Media like these are rapidly deteriorating on the shelves of libraries and archives worldwide, and—like those in our own Special Collections—typically contain rare or unique content of high research value.

The urgency around mass digitization initiatives stems from the very real notion that not all of our AV artifacts can or will be saved. We are fighting against the physical degradation of media objects (e.g. delaminating lacquer discs, shedding magnetic tape), but also factors of technological obsolescence. For obsolete formats, access becomes more difficult and costly as functional playback devices disappear, just as the tools, supplies, and expertise required to sustain these technologies become more obscure and thus more prohibitively expensive. Magnetic tapes (i.e. audio and video tapes), for instance, comprise the majority of AV objects in Special Collections. The consensus among media preservationists is that these formats generally have less than fifteen years left before the two-headed threat of “degralescence” (a term coined by Indiana University’s Mike Casey) renders these media irrecoverable.

Media degradation of lacquer disc, magnetic tape, recordable compact disc (CD-R), and MiniDisc

“Degralescence” waits for no format: (1) palmitic acid deposits on lacquer disc, (2) magnetic tape breakdown, (3) CD-R disc rot, (4) MiniDisc. All images but image 2 (Flickr user windthoek, CC BY-NC-SA 2.0) are my own.

Preservation/conservation of any kind has the relatively odd distinction of being “about” particular formats or materials, with concerns that are largely agnostic of content. You could say that media preservation is found further down this rabbit hole: we fight to save machine-dependent “signal carriers” from catastrophic loss, migrating their encoded information (video/audio signals) to more stable digital files, and then we steward these surrogates through digital preservation environments and technology migrations into an indeterminate future. And it is at this point, I suspect, that our interests intersect.

Time-based media is a different animal than the classical data formats, as you likely know. You have heard rumblings about digital media’s significant data rates and demands on storage? It’s true, probably all of it. As a micro-level illustration, one minute of standard definition preservation-quality video (uncompressed 10-bit, 4:2:2, w/ 2-channel audio) equals 1.7 gigabytes! That single minute is equal in size to a full-color preservation scan of a very large map (uncompressed 24-bit TIFF, 600 ppi). And, on a macro level, a 2013 study from IBM forecasts that media will increasingly represent the most significant wave of data to come in our near future (the next 15–20 years). Video, audio, and images will continue to account for the vast majority of the world’s data, and video, above all, will surge. Few sectors will likely feel the impact as acutely as academic libraries and other similar repository/memory institutions.
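
For the curious, the 1.7 gigabyte figure above can be roughly reproduced with a little arithmetic. The sketch below is only an estimate under assumptions the post does not spell out: an NTSC frame of 720x486 at 29.97 frames per second, "v210"-style packing of 10-bit 4:2:2 video (6 pixels stored in 16 bytes), and two channels of 48 kHz, 24-bit PCM audio. Other frame sizes or packings will give slightly different numbers.

```python
# Rough storage estimate for one minute of uncompressed SD preservation video.
# ASSUMPTIONS (not from the post): 720x486 NTSC frame, 29.97 fps, v210-style
# packing (6 pixels in 16 bytes), 2-channel 48 kHz / 24-bit PCM audio.

WIDTH, HEIGHT = 720, 486
FPS = 30000 / 1001               # NTSC frame rate, about 29.97
BYTES_PER_6_PIXELS = 16          # v210-style packing of 10-bit 4:2:2

AUDIO_CHANNELS = 2
AUDIO_SAMPLE_RATE = 48_000       # samples per second, per channel
AUDIO_BYTES_PER_SAMPLE = 3       # 24-bit PCM


def video_bytes_per_minute() -> float:
    """Bytes needed for 60 seconds of packed 10-bit 4:2:2 video."""
    bytes_per_line = WIDTH // 6 * BYTES_PER_6_PIXELS
    bytes_per_frame = bytes_per_line * HEIGHT
    return bytes_per_frame * FPS * 60


def audio_bytes_per_minute() -> int:
    """Bytes needed for 60 seconds of uncompressed PCM audio."""
    return AUDIO_CHANNELS * AUDIO_SAMPLE_RATE * AUDIO_BYTES_PER_SAMPLE * 60


def total_gigabytes_per_minute() -> float:
    """Combined video and audio size in decimal gigabytes (10**9 bytes)."""
    return (video_bytes_per_minute() + audio_bytes_per_minute()) / 1e9


if __name__ == "__main__":
    print(f"{total_gigabytes_per_minute():.2f} GB per minute")
```

Under these assumptions the audio is a rounding error (about 17 megabytes per minute); nearly all of the weight is in the picture.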

Graph depicting sharp increase of video data in coming years

Fig. – Projected torrent of data exhibiting AV data surge over next 20 years. Source: IBM Market Insights 2013

And while the colossal demands of uncompressed video data will temper any Moore’s Law optimism about digital storage hardware, costs will continue to decrease gradually over time, just as consistent protocols for digital file submission and organization will reduce unnecessary waste. Regardless, digital media are enormously rich resources, in addition to being enormous. We are capturing far more potential research data in these monolithic files than we realize, and this potential will only grow larger in the communities we serve, just as capacities for search, manipulation, and analysis grow.

Whether or not the appropriate tools are ready to meet grand expectations of researchers depends greatly on the nature of the work. Legacy text and statistical data have found a new lease on life through computational analysis in recent years. I realize audiovisual sources will continue to get short shrift next to these classical building block formats in digital scholarship. Yet recorded sound and moving images can contain all these elements and more—it just requires more work upfront to convert the encoded AV information into something researchers can interpret.

Tools/Software

Here are some audiovisual-centric tools that I use and suspect some of you have heard of; this list is by no means comprehensive. All of these are free, and most are open source. These are command line interface (CLI) tools, unless otherwise noted as having a graphical user interface (GUI).

Access | Play

  • VLC Media Player – Audio and video player GUI. This is the most robust media player out there. Leveraging the exhaustive libavcodec library, VLC is capable of handling nearly any format you throw at it. Available for Mac and PC.

Reformat | Edit | Manipulate

  • FFmpeg – Comprehensive suite of AV tools: transcoder, editor, player, analyzer, and validator. Converts, records, and plays audio and video of nearly any format. FFmpeg comes bundled with an unparalleled number of codec libraries, as well as ancillary transcoding and authentication functions. Many other well-known applications employ FFmpeg (albeit with restrictions), including HandBrake, QCTools, ffmpegX (an outdated GUI, don’t mistake the two), and nearly any other open source or web application that touches audio or video. FFmpeg is my favorite tool and I seem to use it every day; it can perform nearly every function detailed in this list. Available for Mac and PC.
  • MPEG Streamclip – Video transcoder, editor, and player GUI. “MPEG” is kind of a misnomer as the tool supports the encoding and export of many other video codecs and formats. Streamclip can help you to quickly trim, divide, and join videos, export audio tracks and individual frames, while also supporting high-quality uncompressed or HD video encoding. Available for Mac and PC.
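
To give a flavor of what a typical FFmpeg job looks like, here is a minimal sketch that builds (and optionally runs) a command to turn a large preservation master into a smaller H.264/AAC access copy. The file names, the helper function names, and the quality setting are all hypothetical choices for illustration, and the command assumes an FFmpeg build that includes the libx264 encoder.

```python
import shutil
import subprocess
from typing import List


def build_access_copy_cmd(src: str, dst: str, crf: int = 18) -> List[str]:
    """Return an ffmpeg argument list that transcodes src to H.264/AAC.

    The crf default of 18 is an illustrative choice, not a recommendation.
    """
    return [
        "ffmpeg",
        "-i", src,               # input file
        "-c:v", "libx264",       # H.264 video (requires libx264 in the build)
        "-crf", str(crf),        # constant-quality setting; lower is better
        "-pix_fmt", "yuv420p",   # 8-bit 4:2:0, widely supported by players
        "-c:a", "aac",           # AAC audio
        dst,                     # output file
    ]


def make_access_copy(src: str, dst: str) -> None:
    """Run the transcode, failing loudly if ffmpeg is missing or errors out."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_access_copy_cmd(src, dst), check=True)


if __name__ == "__main__":
    # Hypothetical file names; print the command rather than running it.
    print(" ".join(build_access_copy_cmd("tape_001.mov", "tape_001_access.mp4")))
```

Keeping the command as a list of arguments (rather than one long string) avoids shell quoting problems with file names that contain spaces.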

Analysis | Processing | Validation

  • MediaInfo – AV format-specific technical metadata extractor and identifier. Unlike more widely used characterization tools like JHOVE and DROID, MediaInfo supports the analysis of components and tags unique to audio and video files. Available for Mac and PC; also available on Mac as a lightweight GUI for a reasonable price (~$2).
  • ExifTool – Metadata extractor, identifier, and editor. ExifTool supports many metadata formats, and has been a part of general digital preservation ingest workflows for some time, but has recently increased support for audiovisual files. Available for Mac and PC; GUI available for Windows only.
  • QCTools or “Quality Control Tools” – Video quality assurance GUI. Enables visual analysis of digital video and detection of corruption or visual artifacts. This has been particularly useful for those scanning for interstitial errors post-digitization, but that sells it short. Available for Mac and PC. More details on QCTools, its applications and updates (current version 0.7), can be found through the Bay Area Video Coalition, which developed the tool along with an excellent team of media preservationists. Another excellent BAVC project is the A/V Artifact Atlas, an online resource used to identify and diagnose artifacts and errors in media and analog-to-digital workflows.
  • BWF MetaEdit – Metadata embedder, validator, and extractor for Broadcast WAVE Format (BWF) audio files. More details about BWF and metadata chunks can be found through FADGI, which developed MetaEdit with AVPreserve. Available for Mac and PC.
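
As a taste of what technical metadata extraction looks like in practice, the sketch below asks the MediaInfo command line tool for a JSON report and pulls a handful of fields from each track. The helper names and the choice of fields are my own assumptions, and the sample report is hand-written with made-up values; it only imitates the general shape of MediaInfo's JSON output, so check a real report before relying on any particular field.

```python
import json
import subprocess
from typing import Dict, List


def mediainfo_json(path: str) -> dict:
    """Run the mediainfo CLI (must be installed) and parse its JSON report."""
    result = subprocess.run(
        ["mediainfo", "--Output=JSON", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def summarize_tracks(report: dict) -> List[Dict[str, str]]:
    """Pick a few technical fields from each track of a MediaInfo-style report."""
    wanted = ("Format", "Duration", "Width", "Height", "BitDepth",
              "SamplingRate", "Channels")
    summary = []
    for track in (report.get("media") or {}).get("track", []):
        row = {"type": track.get("@type", "unknown")}
        row.update({key: track[key] for key in wanted if key in track})
        summary.append(row)
    return summary


# Hand-written, illustrative sample (NOT real tool output; values are made up).
SAMPLE_REPORT = {
    "media": {
        "@ref": "tape_001.mov",
        "track": [
            {"@type": "General", "Format": "MPEG-4", "Duration": "60.027"},
            {"@type": "Video", "Format": "YUV", "Width": "720",
             "Height": "486", "BitDepth": "10"},
            {"@type": "Audio", "Format": "PCM", "SamplingRate": "48000",
             "Channels": "2", "BitDepth": "24"},
        ],
    }
}

if __name__ == "__main__":
    for row in summarize_tracks(SAMPLE_REPORT):
        print(row)
```

To inspect a real file, pass the result of mediainfo_json("your_file.mov") to summarize_tracks instead of the sample.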

Transcribe | Search

  • CMUSphinx – Speech recognition toolkit (developed by Carnegie Mellon University). CMUSphinx’s primary functions include speech transcription, closed captioning, (live) speech translation, and voice search. It also supports keyword spotting, alignment, and pronunciation evaluation. Supports English, French, Mandarin, German, Dutch, Russian, as well as the ability to model others. CMUSphinx is leveraged in many other applications that support voice control. Available for Mac and PC.

Supercuts (for kicks, laffs, yuks)

  • Videogrep / Audiogrep – Twin projects by Sam Lavigne; each is essentially a Python script that can automatically assemble a “supercut” when passed a word, phrase, or grammatical structure (e.g. “[gerund] [determiner] [adjective] [noun]”). In the case of Videogrep, this is achieved through searching an accompanying subtitle track (.srt text file). Audiogrep, on the other hand, must first use a component of CMUSphinx (PocketSphinx) to index the speech. The script then crawls the text files and jumps to the corresponding timecode in the video/audio file and stitches elements together. Available for Mac and PC. More on Videogrep and Audiogrep.
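
The core idea behind subtitle-driven search is simple enough to sketch in a few lines. The code below is not Videogrep's actual implementation, just a minimal illustration of the approach described above: parse an .srt file into timed cues, then return the start and end times of every cue whose text matches a search pattern. The sample subtitles and all function names are invented for the example.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    start: float   # seconds from the beginning of the recording
    end: float
    text: str


# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d+):(\d+):(\d+)[,.](\d+)\s*-->\s*(\d+):(\d+):(\d+)[,.](\d+)"
)


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    """Convert SRT timecode parts (3-digit milliseconds assumed) to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(srt_text: str) -> List[Cue]:
    """Split SRT text into cues; blocks are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                parts = match.groups()
                text = " ".join(lines[i + 1:]).strip()
                cues.append(Cue(_seconds(*parts[:4]), _seconds(*parts[4:]), text))
                break
    return cues


def find_cues(cues: List[Cue], pattern: str) -> List[Cue]:
    """Return the cues whose text matches a case-insensitive regex."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [cue for cue in cues if regex.search(cue.text)]


# Invented sample subtitles, for illustration only.
SAMPLE_SRT = """\
1
00:00:01,000 --> 00:00:03,500
Magnetic tape is shedding.

2
00:00:04,000 --> 00:00:06,250
We migrate the signal to files.

3
00:00:07,000 --> 00:00:09,000
The tape machine is obsolete.
"""

if __name__ == "__main__":
    for cue in find_cues(parse_srt(SAMPLE_SRT), r"\btape\b"):
        print(f"{cue.start:.2f}s to {cue.end:.2f}s: {cue.text}")
```

The list of matching time ranges is exactly what a supercut tool needs next: each range can be cut from the source file and the pieces joined end to end.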

Harnessing information contained in audio, video, and other rich ancillary bitstreams will never be as straightforward as, say, text mining. In fact, audiovisual search is facilitated by textual data, most often as a sidecar metadata file containing transcribed speech anchored by timecodes. So if you want to search spoken word recordings, you must first index the audio, either by hand or with speech-to-text software (or by employing a combination of the two). Speech-to-text and image recognition software is gradually improving, but certainly has not permeated the digital scholarship arena. AV data has not yet scaled up to meet most researchers’ expectations of “distant viewing,” but there are many applications (above) and services that are making strides today. (An aside: it’s perversely beautiful, in a way, that the subtle nuances of recorded sound and human speech continue to elude algorithmic recognition, despite all advances in communication.)

I’d be happy to answer any questions you may have, or to discuss any projects that might come to mind in these fields of media!


Ryan Edge, your Media Preservation Librarian

Ryan Edge is the Media Preservation Librarian at MSU Libraries, where he coordinates audiovisual reformatting and born-digital content migration activities. Prior to this, he served as the Project Manager for the Preservation Self-Assessment Program (PSAP) web application at the University of Illinois Libraries. He is interested in the application of old (legacy) and new media in research and education, as well as the technical challenges involved in facilitating long-term preservation and access to these complex document forms. He received his MLIS from the University of Illinois Graduate School of Library & Information Science in 2013.