Topic Modeling as an Archaeological Dig

I actually suspect that the topics identified by [latent Dirichlet allocation] probably always have the character of “discourses.”
—Ted Underwood, “What kinds of ‘topics’ does topic modeling actually produce?”

The tools that enable historians to carry out this work of analysis are partly inherited and partly of their own making […]. These tools have enabled workers in this historical field to distinguish various sedimentary strata; linear successions, which for so long had been the object of research, have given way to discoveries in depth.
—Michel Foucault, introduction to The Archaeology of Knowledge (tr. A.M. Sheridan Smith), p. 3

I’ve been thinking about topic modeling over the last few weeks as I re-read (the A.M. Sheridan Smith English translation of) Foucault’s The Archaeology of Knowledge, and thinking about what topic modeling has to offer those of us who (more or less) count as methodological Foucauldians. What I’d like to suggest is that, in point of fact, topic modeling is not only a useful exploratory tool for those engaged in constructing Foucauldian archaeologies, but already a Foucauldian methodology in itself. I’ll explain why I think this is the case in a bit, but first I want to provide some quick definitions to anchor this discussion.

Topic modeling, as many readers are likely already aware, is a computational technique that aims to algorithmically discover clusters of words — the “topics” of “topic modeling,” though this nomenclature is itself not uncontroversial — in a text or set of texts that “belong together” by virtue of comparatively frequent co-occurrence in the source corpus. It does this by applying one or another variation of an algorithm called “latent Dirichlet allocation” (LDA), which constructs the “topics” without human intervention. (A good not-particularly-technical introduction to topic modeling is David Blei’s Topic Modeling and Digital Humanities; an introduction that talks more about how the algorithm actually works is Matt Burton’s The Joy of Topic Modeling. A more mathematically oriented overview is John Mohr and Petko Bogdanov’s Topic Models: What They Are and Why They Matter.) A commonly used piece of software that performs topic-modeling operations — as well as several other types of operations — is MALLET.
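For readers who like to see the machinery, here is a minimal sketch of what such a run looks like in code. It uses Python and the gensim library rather than MALLET, and the three one-line “documents” are invented stand-ins for real texts, so treat it as an illustration of the workflow rather than a recipe:

    # A minimal topic-modeling sketch using gensim's LDA implementation.
    # The "documents" are toy stand-ins; real work would load full texts and
    # do more careful tokenization, stopword removal, and so on.
    from gensim import corpora, models

    documents = [
        "the ancient house on the hill concealed an ancient horror",
        "the scientist recorded strange sounds in the laboratory",
        "an ancient horror waited beneath the laboratory floor",
    ]

    # Tokenize crudely and build the bag-of-words representation.
    texts = [doc.lower().split() for doc in documents]
    dictionary = corpora.Dictionary(texts)
    corpus = [dictionary.doc2bow(text) for text in texts]

    # Fit an LDA model. Note that num_topics is chosen by the researcher;
    # the algorithm decides what goes in the topics, not how many there are.
    lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary,
                          passes=50, random_state=1)

    # Each "topic" is just a weighted cluster of words.
    for topic_id, words in lda.print_topics(num_words=5):
        print(topic_id, words)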

Now, that was a very brief introduction, and not an adequate one by any means; anyone not familiar with topic modeling should probably take a look at one of the articles linked in the last paragraph. But rather than insisting that everyone unfamiliar with the methodology go off and do background reading before continuing with this blog post (which would be almost unbearably snooty), I will say that probably the best very concise explanation of what topic modeling is can be found in Sharon Block’s Doing More with Digitization:

Topic modeling is based on the idea that individual documents are made up of one or more topics. It uses emerging technologies in computer science to automatically cluster topically similar documents by determining the groups of words that tend to co-occur in them. Most importantly, topic modeling creates topical categories without a priori subject definitions. This may be the hardest concept to understand about topic modeling: unlike traditional classification systems where texts are fit into preexisting schema (such as Library of Congress subject headings), topic modeling determines the comprehensive list of subjects through its analysis of the word occurrences throughout a corpus of texts. The content of the documents—not a human indexer—determines the topics collectively found in those documents.

Perhaps one reason why Block’s definition is so good is that it is a very early explanation: Matthew Jockers, certainly a person more experienced with topic modeling than I am, says that Block’s 2006 description “is, to my knowledge, the earliest example of topic modeling in the humanities” (123). As is so often the case with new methodologies, the earliest adopters came up with (at least some of) the best explanations, perhaps in part because they (unlike me) couldn’t depend on their audiences already understanding anything about the methodology.

Discussion of a previous example

I’ve talked previously, a bit, about one of my own experiments with topic modeling, when I let MALLET run over a group of ten short stories by H.P. Lovecraft. There are several things that I take to be significant in those results. Perhaps most obviously, I take them to be fair and valid because the topic-modeling process identified topics that are, in many cases, similar to the topics that an astute reader of these Lovecraft stories would likely identify, with less numeric data and more effort, as “what these stories are about.” The algorithm passes a basic sanity check: if the program gave us topical clusters like “banking Jesus roses fedora Samsung puppy snickerdoodle polyethylene,” we would know that something was wrong. The fact that the algorithm comes up with (more or less) what we expect suggests that it is doing more or less what we expect it to do.

Of course, this produces a separate problem, and those experienced with topic modeling are nodding their heads in advance here: if it just gives us what we expect, what good is it, even if it is accurate? We might just as well be doing the same thing ourselves. One fair response is “yes, but it does so much faster, because your laptop can ‘read’ those ten stories much more quickly than you can,” and this in itself opens up a lot of possibilities currently being explored by intelligent people (a great example is Allison Chaney and David Blei’s Visualizing Topic Models project, which includes an example working over a large number of documents from Wikipedia). But I’m interested specifically in the method’s exploratory possibilities, its ability to tell us “things about the texts” that we wouldn’t have noticed ourselves.

Because no one ever gets everything out of any given text, no matter how talented a reader she is: our attention is always focused on specific features and overlooks others. No one can notice every feature of a text, though some people clearly notice more (or even much more) than others, and part of the point of a contemporary education in the humanities, and particularly in literature, is to teach students to notice more as they read. Topic modeling software, though, is a kind of “objective reader” in ways that human readers rarely or never are: it’s not previously involved in any scholarly or interpretive project other than counting words and noticing co-occurrences, and so it catches things that we miss.

It also misses things that we catch, and is therefore not a substitute for informed human readings: topic models supplement our existing reading practices by providing data that would be incredibly time-consuming to collect by hand but that are useful as starting places for thinking about a text. A topic modeling application is not an automatic interpreter that grinds out final answers (“THE POEM IS ABOUT DEATH.”), crushing human creativity and enslaving all of us with its uncaring and inhuman efficiencies, as any number of 1970s dystopian sci-fi movies imagined. It produces intermediate data, not final interpretations, and part of the reason for this is that the data can’t be interpreted in any meaningful way without being informed by, well, an informed reading of the text being interpreted. Taking another look at the topic models I produced in my Lovecraft analysis shows that the topics on their own are just, as the phrase has it, a bag of words: the model counts word frequencies and co-occurrences but discards word order and the other grammatical aspects of the text it is analyzing. Which means that topic modeling throws out a whole lot of information that it, unlike us, can’t process: it doesn’t “really understand plot” (or characterization, or reference, or irony, or puns, or jokes, or any of hundreds of other things that we all pick up on when we read); it’s just seeing which words tend to occur near each other.
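To make the “bag of words” point concrete, here is a tiny sketch of my own (not drawn from any of the tools mentioned above) showing what a text becomes once word order is discarded:

    # A "bag of words" keeps counts and discards order: these two sentences,
    # which mean very different things, produce identical bags.
    from collections import Counter

    a = "the dog bit the man".split()
    b = "the man bit the dog".split()

    print(Counter(a))                # Counter({'the': 2, 'dog': 1, 'bit': 1, 'man': 1})
    print(Counter(a) == Counter(b))  # True: the order, and with it the meaning, is gone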

But the intermediate data it provides is incredibly useful on its own. If words aren’t regularly occurring right next to each other, it’s hard for humans to notice them; topic modeling, though, can count words that appear near each other under a much more expansive definition of “near” than humans can manage, and can do so more reliably than we could, and the fact that a group of words tends to appear comparatively close together actually says a lot about theme. Too, topic modeling can pick up on words that we tend to filter out, such as prepositions and conjunctions (a great example is cited by Matthew Jockers on page 26 of Macroanalysis: John Burrows’s — not topic-modeling-based, as it was written nearly two decades before topic modeling was invented, but computationally based and conceptually similar — 1987 study of pronoun usage in Jane Austen).
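As a crude illustration of “near” as something countable and adjustable, here is a toy sliding-window co-occurrence counter of my own devising. This is not how LDA itself works (LDA models co-occurrence at the level of whole documents), but it shows how a machine can apply a wider and more consistent definition of “near” than a human reader can:

    # Count how often pairs of distinct words occur within `window` words of
    # each other. A toy sketch of windowed co-occurrence, not a
    # reimplementation of LDA.
    from collections import Counter

    def cooccurrences(tokens, window=5):
        counts = Counter()
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + window]:
                if other != word:
                    counts[tuple(sorted((word, other)))] += 1
        return counts

    tokens = "the old house held an old and nameless horror".split()
    for pair, n in cooccurrences(tokens, window=4).most_common(5):
        print(pair, n)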

In my own case, with the limited experiments in topic-modeling Lovecraft’s stories, there were a number of things that I noticed from the topic models that I had not previously noticed, even though I had read (some of) those particular stories multiple times across a substantial chunk of my reading life. The extreme prominence of questions of heredity in “Arthur Jermyn” is one of the less insightful examples here (I’d noticed the emphasis in the story, but not the degree to which the story emphasized it; in fact, discovering that the machine-detected topic 2 comprises 39.8% of the story was quite a surprise to me). But there are other, more subtle clues that the topic-modeling process turned up that are quite interesting and deserve further attention. Most notable, I think, are topics 10-12, which seem to be mutually exclusive in the stories where they appear: no more than one of topics 10-12 appears in the top nine topics for any of the stories under consideration. These topics are all variants on a single theme that Lovecraft scholars tend to collapse into a single “Lovecraftian concern”: horror and epistemology, inflected in different ways (topic 10 might be seen as a “pure” form of the topic, while topic 11 transposes these concerns onto rural settings and rural people, and topic 12 transposes them onto what might be considered the era’s version of “science-fiction” concerns). Topic modeling in this case has turned up something that, I think, scholars have overlooked, and that deserves more attention than it has received.
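The 39.8% figure above comes from reading MALLET’s document-topic output; for anyone who wants to replicate that kind of check, here is a hedged sketch of how per-document topic proportions can be read off in gensim instead. It continues from the lda and dictionary objects in the sketch earlier in this post, and “arthur_jermyn.txt” is a hypothetical placeholder for a real story file:

    # Inspect per-document topic proportions: which topics make up how much of
    # a given text. Continues from the `lda` and `dictionary` objects defined
    # in the earlier sketch; "arthur_jermyn.txt" is a hypothetical file name.
    with open("arthur_jermyn.txt") as f:
        story_tokens = f.read().lower().split()

    bow = dictionary.doc2bow(story_tokens)
    for topic_id, proportion in lda.get_document_topics(bow):
        print(f"topic {topic_id}: {proportion:.1%} of the story")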

But for Foucault …

Turning back to Ted Underwood’s comment at the beginning of this post, I’d like to examine the question of whether topic modeling and its underlying algorithm, LDA, do in fact “always have the character of ‘discourses.’” Discourse is a slippery term for Foucault, one he uses in multiple senses; but halfway through the Archaeology, he looks back on his previous uses of the word “discourse” and says that

in the most general, and vaguest way, it denoted a group of verbal performances; and by discourse, then, I meant that which was produced (perhaps all that was produced) by the groups of signs. But I also meant a group of acts of formulation, a series of sentences or propositions. Lastly — and it is this meaning that was finally used (together with the first, which served in a provisional capacity) — discourse is constituted by a group of sequences of signs, in so far as they are statements, that is, in so far as they can be assigned particular modalities of existence. (107)

There are several reasons why I believe that topic modeling fits these overlapping definitions quite nicely:

  • Perhaps most obviously, Foucault specifies his first definition, “a group of verbal performances,” as “that which was produced […] by groups of signs,” and this coincides quite well with the “bag of words” nature of topic modeling. Foucault is of course not (entirely) unconcerned with grammar, word order, or meaning; but neither does he assign them transcendentally revelatory functions; they can be better understood, and more accurately re-inscribed, once their underlying relations are understood. Which brings me to my next point:
  • Foucault writes near the beginning of the Archaeology that “there is a negative work to be carried out first: we must rid ourselves of a whole mass of notions […]. We must question those ready-made syntheses, those groupings that we normally accept before any examination, those links whose validity is recognized from the outset; we must oust those forms and obscure forces by which we usually link the discourse of one man with that of another; they must be driven out from the darkness in which they reign. And instead of according them unqualified, spontaneous value, we must accept, in the name of methodological rigour, that, in the first instance, they concern only a population of dispersed events.” (21-22) This is precisely what topic modeling does: it abandons traditional groupings of, and relations between, ideas in favor of groupings based purely on word frequency. To (tell a machine to) engage in a topic-modeling exercise is to automate part of one’s analysis in a way that is unbiased by pre-existing ideas about how ideas should be grouped together. (The one exception: words are grouped by how frequently they occur together, which is an ideological presupposition of its own, and therefore not neutral; but it has the advantage that the algorithm is known and that open-source implementations are available and can be examined.)
  • Topic modeling is a way of throwing those “groups of sequences of signs” into relief, helping us to drive out “the forms and obscure forces by which we usually link the discourse of one man with that of another” “from the darkness in which they reign.” Seeing words whose frequent co-occurrence we had not noticed gathered into co-occurrent groups tells us something about the connection of signs, about how those signs construct meaning, about how “knowledge” and power are understood, and constructed, by a text. (But what it tells us, exactly, depends on the text, and requires interpretation, just as the text itself requires interpretation. Topic modeling is a tool for interpretation, not an algorithm that interprets for us.) As Foucault puts it: “In fact, the systematic erasure of all given unities enables us first of all to restore to the statement the specificity of its occurrence.” (28)

For all of these reasons, I believe that topic modeling is a technique whose underlying goals are basically compatible with Foucault’s; and I would also like to suggest that there are some parts of Foucault’s archaeological methodology for which topic modeling is a particularly well-suited investigative tool. In particular, I’d like to take a brief look at how Foucault analyzes what he calls “rules of formation,” with an emphasis on the formation of concepts (discussed in chapter five of part II of the Archaeology).

First, though, I want to quote Foucault’s explanation of a methodological problem that arises in the analysis of discourse, and of where discursive analysis found itself later in the twentieth century:

We sought the unity of discourse in the objects themselves, in their distribution, in the interplay of their differences, in their proximity or distance — in short, in what is given to the speaking subject; and, in the end, we are sent back to a setting-up of relations that characterizes discursive practice itself; and what we discover is neither a configuration, nor a form, but a group of rules that are immanent in practice, and define it in its specificity. (46)

This is precisely what topic modeling does: it analyzes and helps to uncover “relations that characterize discursive practice itself,” without regard to the underlying “objects themselves,” and helps to reveal the “group of rules that are immanent in practice,” making them visible even when they are never specified explicitly.

Focusing more closely on “The Formation of Concepts,” though, opens up questions (of course) about what Foucault means by “concept”; Foucault is rather in character here: he uses the word without defining it; it becomes, in part, the object of analysis in this chapter, and answers are suggested without being formulated explicitly. But, for the sake of keeping this blog post from growing any longer than it already threatens to, I’m willing to simply fall back on the Stanford Encyclopedia of Philosophy‘s rather neutral definition, with which Foucault doesn’t explicitly differ: that is, I’ll take “concept” to mean some combination of (1) a mental representation; (2) a mental ability held by cognitive agents; and (3) a Fregean sense (i.e., an abstract object), with Foucault’s usage being more closely involved with senses (1) and (3) than with (2), with which he is not particularly concerned throughout most of the Archaeology.

But what’s interesting about this particular chapter of the Archaeology, I think, is that topic modeling maps closely onto an appropriate investigative methodology for what Foucault describes as the linguistic “fields” constituted by various discursive actions and operations. Foucault’s brief summary:

The configuration of the enunciative field also involves forms of coexistence. These outline first a field of presence […] Distinct from this field one may also describe a field of concomitance. [….] Lastly, the enunciative field involves what might be called a field of memory. (57-58)

The discursive fields described here — especially the first two — are precisely what topic modeling is designed to investigate directly. Of course, there are numerous other tools that can investigate a “field of presence,” particularly concordance tools (AntConc, in particular, deserves a mention), though I think that topic modeling takes this further than simple concordance work: by automating some of the drudgery required and making the collection of numerical data possible, it begins the analytical process and starts to reveal the “forms of coexistence” that Foucault names. The “field of concomitance” that Foucault describes is, of course, the primary target of topic modeling software, which aims to analyze precisely which words co-occur and how often, and to group them together to help show the researcher which words seem to be related to each other in a particular discourse. (The astute reader will notice, of course, that topic modeling analyzes [sets of] texts, not “discourses”; but then, so do literary scholars and historians: at some point, we need to decide where the boundaries of the “discourse” with which we’re working lie, how they’re instantiated in and reflected by individual texts, and to what extent a text constitutes a discourse; these, too, are not problems whose solution topic modeling automates.)
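As a gesture toward what “field of presence” work looks like in code, here is a minimal keyword-in-context (KWIC) sketch of my own, far cruder than what AntConc provides:

    # A minimal keyword-in-context (KWIC) concordance: print each occurrence
    # of a word with a few words of surrounding context. A toy stand-in for
    # what dedicated concordance tools like AntConc do far better.
    def kwic(tokens, keyword, context=4):
        for i, word in enumerate(tokens):
            if word == keyword:
                left = " ".join(tokens[max(0, i - context):i])
                right = " ".join(tokens[i + 1:i + 1 + context])
                print(f"{left:>30} [{word}] {right}")

    text = ("that is not dead which can eternal lie "
            "and with strange aeons even death may die")
    kwic(text.split(), "death")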

Topic modeling is not primarily targeted at analyzing the Foucauldian “field of memory,” as it doesn’t take temporal development within an individual text or across a set of texts into account; it simply groups words. But, as with the other fields discussed here, topic modeling has the potential to be a good preliminary and intermediate step toward investigating what’s still hanging on in conceptual groupings: collecting data on texts that we think of as “on related topics” and seeing how the clusters of words change can provide insight into this area, too, and (again) provides an opportunity to reveal previously unnoticed groupings.
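One hedged sketch of what such a preliminary “field of memory” investigation might look like: fit a model over a corpus and compare average topic proportions between earlier and later groups of texts. The grouping into “early” and “late” below is my own invented toy example, not anything Foucault or any particular tool prescribes, and the code continues from the lda and dictionary objects defined earlier:

    # Compare average topic proportions across chronologically grouped texts.
    # Continues from the `lda` and `dictionary` objects in the first sketch;
    # the "early"/"late" token lists are invented toy data.
    early_texts = ["the ancient house on the hill concealed an ancient horror".split()]
    late_texts = ["the scientist recorded strange sounds in the laboratory".split()]

    def mean_topic_proportions(token_lists):
        totals = {}
        for tokens in token_lists:
            for topic_id, p in lda.get_document_topics(dictionary.doc2bow(tokens)):
                totals[topic_id] = totals.get(topic_id, 0.0) + p
        return {t: total / len(token_lists) for t, total in totals.items()}

    for label, group in [("early", early_texts), ("late", late_texts)]:
        print(label, mean_topic_proportions(group))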

Foucault also comments on the aim of discursive analysis in ways that show how topic modeling can contribute to it:

The description of such a system could not be valid for a direct, immediate description of the concepts themselves. […] One stands back in relation to this manifest set of concepts; and one tries to determine according to what schemata (of series, simultaneous groupings, linear or reciprocal modification) the statements may be linked to one another in a type of discourse; one tries in this way to discover how the recurrent elements of statements can reappear, dissociate, recompose, gain in extension or determination, be taken up into new logical structures, acquire, on the other hand, new semantic contents, and constitute partial organizations among themselves. These schemata make it possible to describe — not the laws of the internal construction of concepts, not their progressive and individual genesis in the mind of man — but their anonymous dispersion through texts, books, and œuvres. A dispersion that characterizes a type of discourse, and which defines, between concepts, forms of deduction, derivation, and coherence, but also of incompatibility, intersection, substitution, exclusion, mutual alteration, displacement, etc. Such an analysis, then, concerns, at a kind of preconceptual level, the field in which concepts can coexist and the rules to which this field is subjected. (60)

Again, what I want to note here is primarily that topic modeling is itself a way of beginning to analyze discourse in a Foucauldian way: one that doesn’t presuppose that the discourse in question was constructed according to an intentional system by which “the laws of the internal construction of concepts” become articulate in a text, or according to which the ideas’ “progressive and individual genesis in the mind of man” is taken in advance to be the determining factor in the construction of the texts. Topic modeling is a way of getting at characteristics of the discourse itself without resorting to the traditional models of intentionality and human creativity that human interpreters tend to re-inscribe as they interpret texts; it is a way of getting outside the presumption of the author’s intentionality, of which Foucault was so critical, as a latent assumption in interpretive method. “[N]ot the laws of the internal construction of concepts, not their progressive and individual genesis in the mind of man — but their anonymous dispersion through texts, books, and œuvres,” Foucault writes, describing the aims of archaeological analysis; this is precisely what word-frequency counting gets us over the set of selected texts (which may also constitute “books, or œuvres,” depending on how they are selected) that are imported into the topic modeling software. The software does not perform genuine interpretive work on behalf of the scholar, but it does help to reveal connections that are hidden, that are difficult to notice, that slip past the attention of even the most determined scholar; it helps to get at “[a] dispersion that characterizes a type of discourse, and which defines, between concepts, forms of deduction, derivation, and coherence, but also of incompatibility, intersection, substitution, exclusion, mutual alteration, displacement, etc.” so that we can notice and interpret it. It brings these relations onto our radar and assists with our hermeneutic and structural work, so that we can see the “preconceptual level, the field in which concepts can coexist and the rules to which this field is subjected.”

Methodologically, Foucault says that, in investigating the linguistic fields he has described,

one does not subject the multiplicity of statements to the coherence of concepts, and this coherence to the silent recollection of a meta-historical ideality; one establishes the inverse series; one replaces the pure aims of non-contradiction in a complex network of conceptual compatibility and incompatibility; and one relates this complexity to the rules that characterize a particular discursive practice. (62)

Again, I’d like to point out that this is precisely what topic modeling does: it starts with a statistical analysis of word co-occurrence, without even a preconception of what the words in question mean; it establishes linguistic networks in a meaning-agnostic way, leaving the hermeneutics of those networks to the scholar running the software; it avoids “subject[ing] the multiplicity of statements to the coherence of concepts, and this coherence to the silent recollection of a meta-historical ideality.” It just counts words, notices how they “hang together” according to the “bag of words” model, and assists in beginning to describe the “complex network” (or networks?) “of conceptual compatibility and incompatibility” as a way of beginning to “[relate] this complexity to the rules that characterize a particular discursive practice.”

It’s worth saying again (and again, and again) that topic modeling is not a replacement for analysis, but a productive tool for assisting in it. It is a place for beginning and a tool for noticing, which may very well lead to beginning again (and again, and again), running further experiments to investigate medium-to-large corpora (there are ways in which topic modeling is also useful on small data sets, but I am primarily concerned here with the large-scale archaeological digs made possible by software tools). Again, Foucault has anticipated me here:

when one speaks of a system of formation, one does not only mean the juxtaposition, coexistence, or interaction of heterogeneous elements (institutions, techniques, social groups, perceptual organizations, relation between various discourses), but also the relation that is established between them — and in a well determined form — by discursive practice. What is to be done with these […] systems or rather those […] groups of relations? How can they all define a single system of formation? (72)

Foucault answers his own question by reminding the reader that “the different levels thus defined are not independent of one another” and that “strategic choices do not emerge directly from a world-view or from a predominance of interests peculiar to this or that speaking subject; but […] their very possibility is determined by points of divergence in the group of concepts” (72).

Which is to say (again) that there are numerous points of contact between the archaeological methodology and the kinds of tasks that topic modeling aims to accomplish; but I don’t want to belabor the point. There’s a book to be written here someday, I think, talking in more detail about how topic modeling accomplishes this, but this blog post has already grown much longer than I intended, so I’ll close with a specific proposal for research that I’m not qualified to conduct myself but that might serve as a test case for how well topic modeling can be used to conduct archaeological research.

Epilogue: a Proposal

At the very end of the Archaeology, in the last three pages of (the 1972 publication of A.M. Sheridan Smith’s English translation of) the book, Foucault discusses the possibility of “other” archaeologies than those he has conducted throughout the book as methodological examples. (Indeed, I largely see my own mostly-yet-unwritten dissertation as one of these “other archaeologies” whose potential is briefly discussed on pp. 192-95 of the Archaeology.) His first example in this section of a potential alternative archaeology, published seven years before the first volume of the History of Sexuality, is an archaeology of sexuality:

Such an archaeology would show, if it succeeded in its task, how the prohibitions, exclusions, limitations, values, freedoms, and transgressions of sexuality, all its manifestations, verbal or otherwise, are linked to a particular discursive practice. It would reveal, not of course as the ultimate truth of sexuality, but as one of the dimensions in accordance with which one can describe it, a certain ‘way of speaking’; and one would show how this way of speaking is invested not in scientific discourses, but in a system of prohibitions and values. (193)

In fact, this is a fair overview of Foucault’s approach to the topic in the first volume of the series — at least in its general outline. What I propose as a test of my thesis that topic modeling is a useful tool for the production of Foucauldian archaeologies is a series of topic modelings based on Foucault’s own data set as discussed in the first volume of The History of Sexuality: what happens when MALLET (or another topic modeling tool) runs over (well-selected, pre-processed) groups of corpora identified by Foucault? What topics does it turn up, how do those topics change over time, and how does this compare to Foucault’s own analysis? What happens when other texts, unexamined by Foucault (or, at least, not explicitly theorized by him), but germane to his arguments and analyses in that volume, are added to these corpora? And what does this tell us about Foucault’s methodology, and about topic modeling?

There is a whole series of potential books ready to be written on this topic. Alas, they should be written by someone who reads the French of Foucault’s source texts far better than I do.

(Print) References

Foucault, Michel. “The Archaeology of Knowledge.” The Archaeology of Knowledge and The Discourse on Language. Trans. A[lan] M. Sheridan Smith. New York: Pantheon Books, 1972. 3–211. Print.

Jockers, Matthew Lee. Macroanalysis: Digital Methods and Literary History. Urbana: University of Illinois Press, 2013. Print.
