Jack Chen, “From Topos to Topic (Modeling): Conceptual Histories in Literary Information Management”
What do we mean when we speak about literary topics? In the history of rhetoric, the notion of the topic referred to sources of argumentation, templates by which an argument might be constructed. As adapted by Western literary scholars, most prominently Ernst Robert Curtius, the term topic came to denote a constitutive element of literary composition, a recurring theme or phrase that organized the development of an argument in a literary text. A parallel case is found in Chinese literary studies, in which “topic” translates the term ti 體, which identifies thematic commonalities within a given genre. From the perspective of literary information management, one might begin by considering the origins of these practices, and how texts are organized and conceived across the histories of their production.
The notion of topic in topic modeling is not unrelated to its usage in literary theory. At a basic level, a topic model examines the documents of a particular corpus by means of a set of probabilistic algorithms that return the hypothetical “topics” that would have generated the documents. These topics are a kind of analytical fiction, since the documents were never constructed by such means, but they provide valuable insight into latent textual structures, which I will argue are not unlike the more intentional structures of rhetorical topoi. This talk will explore the conceptual histories, the possibilities, and the limits of a topic-based approach to literary studies.