A quick post today on how I used some specialist corpora during a workshop with visiting Chinese professors. This post is entitled “Ants on a Blog,” a pun that combines the American snack food (ants on a log) with the fact that I utilized two wonderful tools from Laurence Anthony: AntCorGen and AntConc.
The visiting professors come from different fields and I thought this would be a great opportunity during their orientation week to help them explore research trends in their field, common language used in their field, and pronunciation of discipline-specific vocabulary.
Building the Corpora
Building four different corpora? Yes! It only took about five minutes using AntCorGen. In AntCorGen, you simply select the field or subfield you wish to explore, select the type of information you want (e.g., abstract, methods, full text), choose how many texts you want, and press “Create Corpus”. In my case, I created four corpora that consisted of 300 abstracts each. Here is an example:
Explaining the Purpose
My next step was to demonstrate AntConc and show how a corpus is both used and useful. I showed them how to open the corpus and run basic searches using only the “Clusters/N-Gram” tab. I focused on this tab because you can sort single words by range, whereas in the “Wordlist” tab you can only sort them by frequency. For our purposes, range was more important because it showed how words were distributed across texts. Basically, range shows you which words many different authors are using, while a very frequent word may be concentrated in only a few texts.
The usefulness of a corpus is not always easy to grasp. Any English-language corpus will tell you that the most frequent words are of, the, in, and at. This is not useful stuff. By focusing on range, I explained, they could make guesstimates about trending concepts or research areas. I also explained that a corpus is less useful for answering specific questions than it is for simply exploring how language is used. I told them we would be going on language adventures, and none of us could be sure what we would find. I also asked them for ideas on how a corpus could be useful, and this immediately elicited responses about writing, especially using correct and common phrases.
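To make the difference between frequency and range concrete, here is a minimal Python sketch. It is not how AntConc works internally, just an illustration of the idea; the three tiny "abstracts" are made up, and the function names (tokenize, frequency_and_range) are my own hypothetical helpers.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower())

def frequency_and_range(texts):
    """Count total occurrences (frequency) and number of texts containing each word (range)."""
    frequency = Counter()
    text_range = Counter()
    for text in texts:
        tokens = tokenize(text)
        frequency.update(tokens)        # every occurrence counts
        text_range.update(set(tokens))  # each text counts at most once per word
    return frequency, text_range

# Made-up sample data, standing in for a folder of abstracts.
abstracts = [
    "graphene graphene graphene graphene model",
    "the model predicts soil moisture",
    "a new model for protein folding",
]

frequency, text_range = frequency_and_range(abstracts)

# Sort by range (highest first), breaking ties alphabetically.
by_range = sorted(text_range.items(), key=lambda kv: (-kv[1], kv[0]))

print("graphene:", frequency["graphene"], "occurrences in", text_range["graphene"], "text(s)")
print("model:", frequency["model"], "occurrences in", text_range["model"], "text(s)")
```

In this toy data, "graphene" is the most frequent word but appears in only one text, while "model" appears in all three. Sorting by range puts "model" first, which is the kind of word that hints at what many different authors are writing about.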
Exploring the Corpora
I placed each corpus and a copy of AntConc on separate USB drives and headed to the lab with the professors. We used AntConc to find research trends, frequently used words, hard-to-pronounce words (e.g., utilitarianism, pharmozoocognosy), and “interesting” combinations of two, three, or four words.
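For readers curious about what a search for two-, three-, or four-word combinations involves, here is a rough Python sketch of counting n-grams. It assumes simple whitespace tokenization and uses two made-up sentences; AntConc's own tokenization and options will differ, and the function names are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def cluster_counts(texts, n):
    """Count n-grams across a list of texts, never crossing text boundaries."""
    counts = Counter()
    for text in texts:
        counts.update(ngrams(text.lower().split(), n))
    return counts

# Made-up sample sentences.
sample = [
    "the results suggest that the model works",
    "our results suggest that more data helps",
]

trigrams = cluster_counts(sample, 3)
for phrase, count in trigrams.most_common(3):
    print(count, phrase)
```

Here the phrase "results suggest that" shows up in both sentences, which is the sort of reusable academic chunk the professors were looking for.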
I gave them a short worksheet I had made to complete independently, and I offered individual feedback on their searches. One of the activities was about finding hard-to-pronounce words, and when I saw that they had listed about 8 to 10, I offered one-on-one pronunciation instruction and feedback. What was great about this was not so much the one-off pronunciation practice of infrequent words but the rules these words embodied regarding stress placement, unstressed vowels, phonics and word origin (e.g., “ch” in most academic or scientific words is likely to have a /k/ sound due to their Greek origin), and chunking multisyllable words. Some wrote down acronyms or website names thinking these were words (they lose their capitalization in the “Clusters” tab), so I showed them how to examine the concordances for meaning and how to look at the word in its entire context, too.
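The idea of checking a concordance to see a word in its original form and context can be sketched in a few lines of Python. This is only an illustration of a keyword-in-context (KWIC) view, not AntConc's implementation; the sample sentence, the function name, and the width parameter are my own assumptions.

```python
import re

def concordance(text, word, width=30):
    """Return keyword-in-context lines for a word, matched case-insensitively.

    The matched word is shown exactly as it appears in the text, so an
    acronym that was lowercased in a word list shows its real capitalization.
    """
    pattern = r"\b" + re.escape(word) + r"\b"
    lines = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

# Made-up sample sentence.
text = "Spectra were recorded using NMR at room temperature."

kwic_lines = concordance(text, "nmr")
for line in kwic_lines:
    print(line)
```

Searching for the lowercase "nmr" still finds the capitalized "NMR" in its sentence, which makes it much easier to tell an acronym from an ordinary word.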
The worksheet I used is here. It contains the activities as well as instructions for the different types of analyses.
I think the professors enjoyed exploring their field’s language usage. They found the pronunciation activities very fun and were surprised at some of the words, and the variations of those words, that they found. For example, using the “Regex” option, one professor and I found many different words containing “phono” and explored their meanings. We also enjoyed reviewing the Greek mathematical letters that appeared.
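A search like the “phono” one can be approximated outside AntConc with a regular expression. The sketch below uses Python's re module on a made-up sentence; the exact pattern we typed into AntConc is not recorded here, so this pattern is just one plausible way to write it.

```python
import re

# Made-up sample sentence.
text = ("Phonology and phonological awareness differ from phonetics; "
        "a phonograph records sound.")

# Any whole word that contains "phono", regardless of capitalization.
pattern = re.compile(r"\b\w*phono\w*\b", flags=re.IGNORECASE)

# Lowercase and deduplicate the matches, then sort them alphabetically.
phono_words = sorted({match.lower() for match in pattern.findall(text)})
print(phono_words)
```

Note that "phonetics" is not matched because it contains "phone" rather than "phono", which is itself a nice starting point for a conversation about word families.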
These professors are experts in their fields, and while they do often communicate with each other and other international colleagues in English using discipline-specific language, any common ELF communication patterns could cause minor (probably not major) issues on an American campus. I thought that such independent explorations and feedback could only benefit them and give them the tools to do further exploration on their own, thus allowing them to be in even more control of their expertise. And many of them said they would in fact download and use these tools again to help with their writing.
I was happy to see that I was able to spark genuine interest in not only the corpus tools but how language is being used in their field. I hope to get more opportunities like these in the future.