Durability vs Reproducibility

In our digital/capitalist age, we have traded a focus on durability for a focus on reproducibility (see: Wendy Chun).

We digitize manuscripts because we want them to last longer, transferring the text from pen and paper (which can last a good 400 years with proper handling) onto hard drives (which, mechanical or otherwise, will inevitably degrade within a decade).

This is not to say that digitizing is not preservation, but only that we now understand preservation very differently. Preservation is about redundancy and continued access to the spirit of the artifact, not continued access to the material artifact.

Think of building preservation in Japan and China and compare it to Western museums: historical Japanese buildings get rebuilt at regular intervals in the original style, with the same materials and skills (the Ise Grand Shrine, for instance, is rebuilt every twenty years); Chinese buildings get rebuilt in the modern style, with modern tools, to resemble the old; and Western building preservation is about trying to halt decay. What is important about these buildings? Is it the form, the function, or their continuity?

Digital humanities is more than using tools to present old data in new ways, more than presenting new data in old ways: where can we go beyond this?

  • DH as it currently stands is mostly about having new tools to let us continue traditional disciplinary research (or at least within traditional disciplinary bounds). Topic modelling is neat, but at its core it’s pretty much just using digital tools to get us a corpus analysis of text.

“Digital” humanities is a redundant term because, if you grow up surrounded by the digital world and spend a good chunk of your life in it, this isn’t a new strain of the humanities or a new way of conducting studies into the digital world; this is just another extension of “how to study the world around us.”

Therefore, ideally, digital humanities should be racing to make itself obsolete as a term: it implies a constant separation between the humanities and the digital humanities, as if the two are even separable now.

  • Potential counterargument: there should be a difference between a digital humanities approach to a topic and humanities done with digital tools.

    • Digital humanities approach: Internet sociolinguistics, for example, is a humanities approach to understanding the language of a digital space.

    • Humanities with digital tools: just using computers to study sociolinguistics.

  • So “digital humanities” as a term would evolve from the sort of broad-tent meaning we may be using to something more narrowly defined by its relationship to the digital world.

  • The difference between the modern conception (which seems to be defined by its being exterior to the contemporary fields of humanities) and a “next generation” of digital humanities would be that digital humanities, as a field and as a lens, would be an integral part of humanities.

Think of this: what if we could write a digital humanities work without ever once touching a computer for it? What if DH was more than just a new methodological lens?

What could some DH ideas be?

Can we develop models that bring us something that existing models can’t do? Can they make useful predictions for us?

For example, is it possible to stop thinking about human beings as bounded individuals, and start thinking about them as porous, non-biologically determined, replicable patterns of data? → Some links to epistemological ideas of the nature of knowledge: replicability of knowledge, consciousness, embodiment of cognition? On Rebekah’s turf, approaching challenges to consciousness from the perspective of “can machines be conscious” (e.g., the China brain example) is an example of our previous thinking about what humans are and how we should study them.

  • Daniel Dennett will say things like: if we say humans are conscious, then machines have to be. But he draws the opposite conclusion and says, hey, we don’t necessarily have sufficient reasons to even say that HUMANS are conscious or that humans can have knowledge.

What if WE could extend what we consider to be our bodies?

Towards an algorithmic criticism: DH methods in application

All criticism is algorithmic.

Ramsay (2011) writes:

“In the classroom one encounters the professor instructing his or her students to turn to page 254, and then to page 16, and finally to page 400. They are told to consider just the male characters, or just the female ones, or to pay attention to the adjectives, the rhyme scheme, images of water, or the moment in which Nora Helmer confronts her husband. The interpreter will set a novel against the background of the Jacobite Rebellion, or a play amid the historical location of the theater. He or she will view the text through the lens of Marxism, or psychoanalysis, or existentialism, or postmodernism. In every case, what is being read is not the ‘original’ text, but a text transformed and transduced into an alternative vision, in which, as Wittgenstein put it, we ‘see an aspect’ that further enables discussion and debate.” (16)

Writing a thesis chapter, I note every reference to Nietzsche, Freud, post-structuralism, or the hermeneutics of suspicion in a Sheila Heti novel, recording the page numbers in pencil on the book’s inside cover, then refer back to that index while writing. Writing a seminar paper, I search various keywords in an 800-page Dickens novel, looking for references to water, drowning, the Thames.

Computers make algorithmic criticism easier to perform, but in doing so they aid an operation readers and interpreters have been performing since the beginning of humanistic study. In Too Much to Know, Ann M. Blair gives a history of note-taking practices spanning centuries, beginning with the advent of print. Eighteenth-century readers would habitually copy favorite passages from their reading into commonplace books, which they might organize by theme. Scholars debated whether copying out relevant passages from books was “mechanical” work they should have their servants, amanuenses, apprentices, wives, or children perform, or an integral part of scholarly work. In the sixteenth century, Conrad Gesner, a bibliographer, recommended indexing books by cutting passages “directly from the printed book,” keeping two copies so that passages could be retrieved from both recto and verso (96). In the seventeenth century, Thomas Harrison designed a “scrinium literatum” or literary closet “designed to store slips of paper on hooks associated with commonplace headings that were inscribed alphabetically on little lead plaques”: a sort of spatialized commonplace book (94). The closet had “3,000 headings and a further 300 slots left blank for additions” and could, at least in theory, allow groups of scholars to easily share passages and notes amongst each other (94, 101).

When we interpret texts, we interpret fragments of texts that are themselves selected through interpretive processes. We “commonplace” passages into our notes. We copy or physically cut portions of texts. Then we further transform these texts or paratexts, narrowing down the relevant data or rearranging it. Since the time of illuminated manuscripts, books have often been designed to facilitate this type of algorithmic reading or criticism: books (in the eighteenth century, even novels) include indexes; they employ other paratext, such as chapter divisions, headings for each page giving brief plot summaries, lists of scenes, footnotes, and printed marginal notes. Readers also produce their own paratext through written marginalia, notes, commonplace books, and so on.

What if we view the computational tools available to contemporary scholars as an extension of these practices? Ramsay suggests a similar conclusion, describing the texts produced by algorithmic transformations as “paratexts” of the original. These paratexts provide us with new maps of the original text, and like geographic maps, they emphasize particular aspects of their “territories” over others, bringing particular aspects of those territories into focus, opening up new interpretive possibilities.

Computing allows us to perform these operations unusually systematically and quickly, even on a lengthy text or large corpus. Algorithms may also produce paratexts that are productively at odds with the ones human interpreters might normally produce: the “enormous liberating power of the computer,” Ramsay writes, lies not in “the immediate apprehension of knowledge, but instead what the Russian Formalists called ostranenie - the estrangement and defamiliarization of textuality” (3). These operations can allow us to see new and unexpected things in the texts we transform. But algorithmically generated paratexts are only a starting point for the work of the human interpreter.
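As a rough illustration of what such an algorithmically generated paratext might look like, here is a minimal keyword-in-context sketch in Python. It is not Ramsay's method or any particular tool's: the function name `keyword_in_context`, the `width` parameter, and the sample sentence (invented for the example, not taken from any novel) are all assumptions made for illustration.

```python
import re

def keyword_in_context(text, keywords, width=40):
    """Return (keyword, position, snippet) for each whole-word keyword hit."""
    # Match any of the keywords as a whole word, ignoring case.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for match in pattern.finditer(text):
        start = max(match.start() - width, 0)
        end = min(match.end() + width, len(text))
        # Collapse whitespace so each snippet prints on a single line.
        snippet = " ".join(text[start:end].split())
        hits.append((match.group(1).lower(), match.start(), snippet))
    return hits

# Invented sample text, standing in for a much longer novel.
sample = (
    "The river ran dark under the bridge. He feared drowning more than "
    "debt, and yet the Thames drew him back each night to the water."
)

for keyword, position, snippet in keyword_in_context(
    sample, ["water", "drowning", "Thames"], width=20
):
    print(f"{keyword:>10} @ {position:>4}: ...{snippet}...")
```

The output is the digital cousin of the pencilled index on a book's inside cover: a list of places to look, which the reader still has to interpret.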

For Ramsay, “[i]f text analysis is to participate in literary critical endeavour in some manner beyond fact-checking,” the role of computational tools in literary criticism should be to “endeavour to assist the critic in the unfolding of interpretative possibilities. We might say that its purpose should be to generate further ‘evidence,’ though we do well to bracket the association that term holds in the context of less methodologically certain pursuits.” (10) Computational tools enable us to identify patterns in texts very efficiently: computers can scan large sets of data, often with high degrees of accuracy, and return evidence that can be used to develop original claims or to support existing ones.


The problem with empiricism and falsifiability in the digital humanities


Ramsay identifies a tendency in text analysis in which “the analogy of science” is “being put forth as the highest aspiration of literary study.” (3) He cites a number of scholars who believe that the propositions put forth by literary critics “‘have the technical status of hypothesis, since they have not been confirmed empirically in terms of the data which they propose to describe - literary texts.’” (4)

Scholars working on post-critical methods that challenge the supposed dominance of critique and the “hermeneutics of suspicion” in literary studies have also picked up on the digital humanities as a field that could lend more objectivity to literary study. For example, in “Surface Reading: an Introduction,” Stephen Best and Sharon Marcus write that digital humanities approaches “resonate” with their work on surface reading, a method aimed at “attend[ing] to the surfaces of texts rather than plumb[ing] their depths” (1-2). According to Best and Marcus, digital humanities work, like surface reading, “seeks to occupy” a “space of minimal critical agency.” Digital humanists can attempt to “correct for” their “critical subjectivity, by using machines to bypass it, in the hopes that doing so will produce more accurate knowledge about texts. […] [C]omputers can help us to find features that texts have in common in ways that our brains alone cannot. Computers are weak interpreters but potent describers, anatomizers, taxonomists.” Because of this, they may “bypass the selectivity and evaluative energy that have been considered the hallmarks of good criticism, in order to attain what has almost become taboo in literary studies: objectivity, validity, truth.” This could help to produce knowledge that is both qualitative and empirical, “expand[ing]” the work that literary critics can do (17).

As Ramsay cautions, however, the aim of computational tools in the digital humanities is not to ascertain the empirical “truth” of a critical claim. This is because “literary arguments … do not stand in the same relationship to facts, claims, and evidence as the more empirical forms of inquiry. There is no experiment that can verify the idea that Woolf’s ‘playful formal style’ reformulates subjectivity or that her ‘elision of corporeal materiality’ exceeds the dominant Western subject.” (7)

Indeed, the goals of scientific investigation and those of literary criticism are entirely at odds: we neither assume nor aspire to show that “there is a singular answer (or a singular set of answers) to the question at hand.” (15) Instead, we seek a number of answers and interpretations that can generate further discussion, “not so that a given matter might be definitely settled, but in order that the matter might become richer, deeper, and ever more complicated.” (16)

The fact that computers cannot piece together scraps of evidence to make critical claims doesn’t mean we shouldn’t use them. Actually, “calling computational tools ‘limited’ because they cannot do this makes it sound as if they might one day evolve this capability, but it is not clear that human intelligence can make this determination objectively or consistently.” Creating computers that can interpret texts in the way humans do might not even be possible, let alone desirable: we are as yet uncertain that our ability to create arguments is reproducible, since we don’t know the mechanisms by which we do so, and even if we did, it’s doubtful that these could somehow be considered objective. And objectivity is certainly not the goal, either: “We read and interpret, and we urge others to accept our readings and interpretations. Were we to strike upon a reading or interpretation so unambiguous as to remove the hermeneutical questions that arise, we would cease to refer to the activity as reading and interpretation” (10).