Research Bites: Take Care of Your Concordancing – Using Corpora for Self-Correction

I have written about corpora, concordancing, and data-driven learning (DDL) on this site before. Last year, my colleague and I completed a semester-long quantitative research project and co-wrote a paper on using DDL in the classroom (which has now been rejected three times!). I used to be a big fan of teaching students how to use these tools as an alternative reference and learning resource. However, due to my lack of patience with computer-illiterate “digital natives”, heaps of incomprehensible input that is difficult for learners to parse, and the paucity of a linguistic sixth sense among students, this kind of practice fell out of favor with me. Then I stumbled upon Cynthia Quinn’s (2014) article in ELT Journal, and my interest has been slightly rekindled. Reading the article had a snowball effect, and I was happy to find a number of new corpus tools and active corpus linguistics websites. I’m not sure what effect this will have on my teaching, but I do present to you the latest Research Bites.


Quinn, C. (2014). Training L2 writers to reference corpora as a self-correction tool. ELT Journal. [$link]

Twitter Summary

New on #researchbites: Quinn shows how to scaffold #corpus use to teach error correction #ddl #corpuslinguistics

Introduction and Findings

Quinn’s article outlines how she introduced the Collins Wordbanks Online corpus to her Japanese EFL university students in order to help them self-correct teacher-coded errors in their essays. She discovered that most students found the corpus useful, especially for easily identifiable preposition, word form, and article errors, but less so for more lexical (as opposed to lexicogrammatical) items like poor word choice. She also found that students enjoyed finding more natural and varied language patterns with which they could express themselves. However, as is typical with DDL, students often found the interfaces, search queries, and data difficult to wade through. Nevertheless, she found that “corpus referencing was a positive experience for the majority of learners who agreed that it could improve their written expression”. Because of this, it remains a worthwhile tool to introduce, if not for its effectiveness, then at the very least for its ability to supplement or supplant dictionaries, thesauruses, and translation tools.


DDL comes with a time investment and a learning curve, and Quinn’s article explains how she scaffolded concordancing to address these issues. Here is what she did (note: my outline below does not necessarily represent the way her introduction was organized in the article):

For the first five 90-minute classes (about half of each class spent on DDL):

  1. Introducing corpora
    1. Introducing students to the concept of a corpus
    2. Showing the types of rich data that can be gleaned from a corpus
  2. Justifying corpus use
    1. Comparing corpora to other resources
    2. Showing students how a corpus may be better than other resources in some situations
    3. This is especially useful, as students often need to be convinced to use such a tool
  3. Paper-based practice
    1. Numerous other researchers have pointed out that it is easier to make sense of concordance data if it is first presented on paper
    2. Students practiced essential DDL skills, learning:
      1. scanning for linguistic features
      2. identifying language patterns
      3. making “pragmatic generalizations” about the patterns
    3. Controlled practice where “question prompts guided learners to notice meaning and usage pattern”
  4. Controlled computer-based practice
    1. Before using the online corpus, students completed exercises to learn important vocabulary such as query, part of speech, lemma, token, etc.
    2. Students did in-class searches on terms from class readings
    3. Students investigated a single word for homework and reported the information they found.
    4. Students discussed these reports with classmates

After the first five class sessions:

  1. Controlled revision practice
    1. Students practiced revising errors by learning how to search for teacher-coded linguistic errors found on example essays.
      1. common error codes such as WW (wrong word), WF (word form), etc. were used
    2. Students completed exercises for each error code. An example sequence is as follows:
  2. Independent practice
    1. After students wrote their essays and the teacher gave them feedback (content and language), students worked to correct their own errors using the corpus.
    2. Students kept a revision log to document what they had found, changed, and their experiences with DDL
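
To make the idea of a concordance search for a coded error a little more concrete, here is a minimal sketch in Python. It is my own illustration, not Quinn’s materials and not the Wordbanks interface: the four-sentence sample corpus, the search word “interested”, and the function names `tokenize`, `kwic`, and `next_word_counts` are all hypothetical. It prints keyword-in-context lines and then counts which word directly follows the keyword, which is roughly the kind of evidence a student might use to check a preposition.

```python
import re
from collections import Counter

# Hypothetical mini-corpus for illustration only; a real corpus holds millions of words.
SAMPLE_CORPUS = [
    "She is interested in corpus linguistics.",
    "Many students are interested in learning new words.",
    "He was never interested in grammar drills.",
    "They seem interested by the idea at first.",
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def kwic(sentences, keyword, width=3):
    """Return keyword-in-context lines with up to `width` tokens on each side."""
    lines = []
    for sentence in sentences:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                # Right-align the left context so the keywords line up in a column.
                lines.append(f"{left:>25}  [{token}]  {right}")
    return lines


def next_word_counts(sentences, keyword):
    """Count which word immediately follows each occurrence of the keyword."""
    counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens[:-1]):
            if token == keyword:
                counts[tokens[i + 1]] += 1
    return counts


for line in kwic(SAMPLE_CORPUS, "interested"):
    print(line)

# Most frequent right-hand neighbours of the keyword, most common first.
print(next_word_counts(SAMPLE_CORPUS, "interested").most_common())
```

With this toy data, “in” follows “interested” three times and “by” once. A real corpus tool adds lemma and part-of-speech searches on top of this basic idea, which is where the learning curve comes in.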


What Quinn offers is a model way to introduce corpora usage to students. She presented it in a logical fashion which naturally led to learner uptake and clearly helped students. If anyone is taken with using DDL in their classrooms, I highly recommend the model Quinn used. But, as she said, there is a certain time investment (not to mention the need for a computer lab) that is involved. What this research report lacks is an empirical aspect which looks at not just learner feelings about using corpora, but actually tracks their effective employment of such a tool.

If full-blown concordancing scares you, as it should if you have ever played with COCA, there are a number of simpler corpus tools out there. Some that I use, either behind the scenes to create materials or in class with students, are:

And don’t forget to check out the Corpus Linguistics community on Google+, as well as EFL Notes, an excellent site with a clear corpus focus.

2 thoughts on “Research Bites: Take Care of Your Concordancing – Using Corpora for Self-Correction”

  1. hi Anthony

    thanks for linking G+ CL comm and shoutout for my blog

    good luck with the paper, hope to be able to read it soon!

    it struck me as strange that the author of the paper u report uses a paid corpus (and not an inexpensive one at that).

    i doubt many teachers will want to spend 5 x90min lessons on prepping students for direct corpus use. if one was to go down that route maybe using google as a corpus may be worth thinking about since there is no new interface to learn

    it’s nice to be able to add the wiki corpus at byu-coca to your list, am hoping to see some neat write-ups of how people are using this new resource


    • Anthony Schmidt says:


      I think she used that particular corpus because it was available at her university, so access was free for her and her students. I do agree with her decision to use something far simpler than the BYU interfaces.

      DDL requires a steep learning curve, and while it was a great time investment, what I liked about her article is the slow, step-by-step process of first convincing them it was worthwhile (because it does take convincing) and then slowly showing them, from paper to web, how to use it.
