HeidelGram: A network of evaluative terms in 19th-century British grammars – Methodological challenges and practical solutions

Beatrix Busse and Ingo Kleiber

Abstract

The HeidelGram project has a twofold aim. Firstly, it makes an essential contribution to historical grammar studies by compiling, analysing, and giving open access to a representative 10-million-word corpus of historical English grammar books from the 16th to the 19th centuries. Secondly, it introduces state-of-the-art network analysis into diachronic corpus linguistics, thus considerably extending the set of concepts and methods applied in historical linguistics. Our overall aim is to examine discourses in English grammar writing by implementing and analysing three exemplary networks – a network of grammars and grammarians, a network of evaluative terms associated with verbal hygiene (Cameron 2012 [1995]), and a network of lexemes referring to grammatical phenomena.

While network analytical methods have been applied to historical textual material (e.g. Bergs 2005; Sairio 2009; Fitzmaurice 2010) and fictional texts (e.g. Agarwal et al. 2012; Moretti 2013), the combination of corpus-based diachronic linguistics and network analysis is largely uncharted territory. This new approach poses significant methodological challenges and requires new ways of extracting, annotating, and analysing historical linguistic data. A series of exploratory studies (Busse et al. 2016a, 2016b; Busse and Gather 2017), based on a systematically compiled and representative corpus of 19th-century British grammar books (40 texts, approx. 2.6 million words), has already shown the potential of this approach for historical grammar studies. In the present paper, we present initial findings regarding the network of evaluative terms. These terms include expressions like “greatly erred” in Crombie’s 1802 grammar: “Priestley, in defending the other phraseology, appears to me to have greatly erred” (Crombie 1802: 302). We also discuss some of the major methodological and technical challenges associated with this approach.

This second network will not only help us to critically reflect upon the concepts of prescriptivism and descriptivism, but also to uncover linguistic practices and patterns that may have led to these discursive turns. Based on an extended and optimized version of our pilot corpus, containing the best-known and most widely distributed grammars of the 19th century (cf. Leitner 1986, 1991; Linn 2006; Michael 1987; Görlach 1998), we will begin to quantitatively investigate terms associated with verbal hygiene (Cameron 2012 [1995]), i.e. active practices of filtering, evaluating, and modifying normative language usage, and the relationships between these terms.

Furthermore, informed by this initial analysis, we will discuss three major challenges associated with historical corpus-based network analysis and potential strategies for mitigating them. First, we will discuss typical issues with optical character recognition (OCR) and state-of-the-art workflows, procedures, and tools, both automatic and manual, for reducing misreadings. Second, we will look at problems and solutions associated with automatically generating meaningful graphs (i.e. networks) out of unstructured and unannotated linguistic data. Finally, we will present an early approach to visualizing such graphs in a way that allows for visual diachronic analysis.
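
To make the idea of generating a graph from unannotated text more concrete, the following minimal Python sketch counts sentence-level co-occurrences of tracked items as weighted edges. It is an illustration under stated assumptions, not the project's actual pipeline: the term list, the list of grammarian names, the function name `build_edges`, and the second sample sentence are all hypothetical. Only the first sample sentence comes from the source (Crombie 1802: 302).

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical lexicon of evaluative terms (assumption, not the project's list).
EVALUATIVE_TERMS = {"erred", "improper", "correct", "vulgar"}
# Hypothetical list of grammarian names to which terms can be linked.
GRAMMARIANS = {"priestley", "lowth", "murray"}


def build_edges(sentences):
    """Count sentence-level co-occurrences of tracked items as weighted edges."""
    tracked = EVALUATIVE_TERMS | GRAMMARIANS
    edges = Counter()
    for sentence in sentences:
        # Lowercase and tokenize crudely; real OCR output would need cleaning first.
        tokens = set(re.findall(r"[a-z]+", sentence.lower()))
        hits = sorted(tokens & tracked)
        # Every pair of tracked items in the same sentence becomes an edge.
        for a, b in combinations(hits, 2):
            edges[(a, b)] += 1
    return edges


sample = [
    # Quoted from Crombie (1802: 302).
    "Priestley, in defending the other phraseology, appears to me to have greatly erred.",
    # Invented sentence, for illustration only.
    "This construction is improper and vulgar.",
]
edges = build_edges(sample)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

Sentence-level co-occurrence is only one possible way of defining an edge. Window-based or syntactically informed criteria would yield different graphs, and adding a publication year to each edge would be one way to support the diachronic view.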

References

Agarwal, A., Corvalan, A., Jensen, J., & Rambow, O. (2012). Social Network Analysis of Alice in Wonderland. In D. Elson, A. Kazantseva, R. Mihalcea, & S. Szpakowicz (Eds.), Proceedings of the NAACL-HLT 2012 Workshop on Computational Linguistics for Literature. Montréal: Association for Computational Linguistics, 88–96. Retrieved from http://www.aclweb.org/anthology/W12-2513.

Bergs, A. (2005). Social Networks and Historical Sociolinguistics: Studies in Morphosyntactic Variation in the Paston Letters (1421–1503). Berlin/New York: Mouton de Gruyter.

Busse, B., & Gather, K. (2017, July). HeidelGram: Network Analysis of Grammarians’ References in 19th-Century British Grammars: A Corpus-Based Study. Corpus Linguistics Conference, Birmingham.

Busse, B., Gather, K., & Kleiber, I. (2016a, August). Paradigm Shifts in 19th-Century British Grammar Writing: A Network of Texts and Authors. Proceedings of the 19th International Conference on English Historical Linguistics, Duisburg/Essen.

Busse, B., Gather, K., & Kleiber, I. (2016b, November). Assessing the Connections between English Grammarians of the 19th Century: A Corpus-Based Network Analysis. Proceedings of the 6th Grammar and Corpora Conference, Mannheim.

Cameron, D. (2012). Verbal Hygiene: The Politics of Language. London: Routledge (Original work published 1995).

Crombie, A. (1802). The Etymology and Syntax of the English Language, Explained and Illustrated. London: J. Johnson.

Fitzmaurice, S. (2010). Coalitions, Networks, and Discourse Communities in Augustan England: The Spectator and the Early Eighteenth-Century Essay. In R. Hickey (Ed.), Studies in English Language. Eighteenth-Century English: Ideology and Change. Cambridge: Cambridge University Press, 106–132.

Görlach, M. (1998). An Annotated Bibliography of Nineteenth-Century Grammars of English. Amsterdam/Philadelphia: John Benjamins.

Leitner, G. (1986). English Traditional Grammars in the Nineteenth Century. In D. Kastovsky & A. Szwedek (Eds.), Trends in Linguistics. Studies and Monographs: Vol. 32. Linguistics Across Historical and Geographical Boundaries: Vol. 2: Descriptive, Contrastive, and Applied Linguistics. In Honour of Jacek Fisiak on the Occasion of His Fiftieth Birthday. Berlin: De Gruyter, 1333–1356.

Leitner, G. (1991). English Traditional Grammars: An International Perspective. Amsterdam Studies in the Theory and History of Linguistic Science: Vol. 62. Amsterdam/Philadelphia: John Benjamins.

Linn, A. (2006). English Grammar Writing. In B. Aarts & A. M. S. McMahon (Eds.), Blackwell Handbooks in Linguistics. The Handbook of English Linguistics. Malden: Blackwell, 72–92.

Michael, I. (1987). The Teaching of English. Cambridge: Cambridge University Press.

Moretti, F. (2013). Distant Reading. London: Verso.

Sairio, A. (2009). Language and Letters of the Bluestocking Network: Sociolinguistic Issues in Eighteenth-Century Epistolary English. Mémoires de la Société Néophilologique de Helsinki: Vol. 75. Helsinki: Société Néophilologique.