Crowdsourcing to Create Better Test Questions
How do you create the best exam questions possible without spending half the semester on trial and error? Researchers at the Harvard-Smithsonian Center for Astrophysics (CfA) suggest relying on others—in a word, crowdsourcing—to improve test content.
“Crowdsourcing opens up a whole new possibility for people creating tests,” says Philip Sadler, who led the research in question. “And instead of taking a semester or a year, you can do it in a weekend.”
In a new study published in Educational Assessment, Sadler and his team evaluated student scores on 110 multiple-choice life science questions that had been crowdsourced through platforms like Amazon’s Mechanical Turk, which farms out small thinking tasks to a global community of workers in exchange for modest payments. They compared these results with scores from students who had answered questions designed by content experts.
“The key to creating good standardised tests isn’t the expert crafting of every test question at the outset, but uncovering the gems hidden in a much larger pile of ordinary rocks,” says co-investigator Gerhard Sonnert. “Crowdsourcing, coupled with using commercially available test-analysis software, can now easily identify promising candidates for those needle-in-a-haystack items.”
The study found that the best test questions identified through crowdsourcing were the same questions that content experts rated as high quality.
Sonnert says test developers, school systems, educators, and students could all benefit from this new approach. “For example, some schools are moving to standardise their exams and share them across the school system,” he says. “In addition, curriculum developers and textbook authors can rapidly test and refine the questions they include in their materials. Educational researchers will be able to produce questions that more effectively measure changes in student knowledge. And professional development programs that now have teachers produce assessment questions for their students can, overnight, measure the performance of those questions.”
Although Sadler acknowledges that the “best” exam questions will vary from one group of students to the next based on background knowledge and other factors, he says crowdsourcing can offer an effective first step, “allowing educators to quickly evaluate questions for deletion, revision, or acceptance.” The best of the batch can then undergo more rigorous testing.
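In classical test theory, the kind of screening Sadler and Sonnert describe usually rests on two per-item statistics: difficulty (the share of students answering correctly) and discrimination (how well an item separates stronger students from weaker ones). The sketch below, in Python with NumPy, is a minimal illustration of that triage, not the study's actual software; the function names, cut-off values, and sample data are all illustrative assumptions.

```python
import numpy as np

def item_statistics(responses):
    """Classical item analysis on a 0/1 scored response matrix.

    responses: (n_students, n_items) array; 1 = correct, 0 = incorrect.
    Returns per-item difficulty (proportion correct) and discrimination
    (point-biserial correlation of each item with the rest of the test).
    """
    responses = np.asarray(responses, dtype=float)
    n_items = responses.shape[1]
    difficulty = responses.mean(axis=0)        # share of students answering correctly
    total = responses.sum(axis=1)              # each student's overall score
    discrimination = np.empty(n_items)
    for j in range(n_items):
        rest = total - responses[:, j]         # score on all *other* items
        discrimination[j] = np.corrcoef(responses[:, j], rest)[0, 1]
    return difficulty, discrimination

def triage(difficulty, discrimination):
    """Sort items into accept / revise / delete buckets.

    The cut-offs here are illustrative assumptions, not figures from the study.
    """
    decisions = []
    for p, r in zip(difficulty, discrimination):
        if 0.30 <= p <= 0.90 and r >= 0.30:
            decisions.append("accept")   # reasonable difficulty, separates strong from weak students
        elif r >= 0.10:
            decisions.append("revise")   # some signal, but too easy/hard or weakly discriminating
        else:
            decisions.append("delete")   # no evidence the item measures anything useful
    return decisions

# Hypothetical data: rows are students, columns are candidate questions.
scores = np.random.default_rng(0).integers(0, 2, size=(200, 110))
p, r = item_statistics(scores)
print(list(zip(p.round(2), r.round(2), triage(p, r)))[:5])
```

Run over a large crowdsourced pool, a screen like this surfaces the small fraction of items worth keeping, which is precisely the "needle-in-a-haystack" search Sonnert describes.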
Some nations are already starting to move toward a more crowdsourced approach to test design.
The UK’s Cambridge Assessment, which manages Cambridge University’s three exam boards and conducts academic and operational research on assessment in education, is now asking teachers to “submit questions that they feel have stretched and challenged their pupils in lessons for consideration in national testing.” This marks a departure from the current method of drawing from a cache of questions designed by exam committees with detailed knowledge of each subject.
“We want to know what questions teachers ask in the classroom and whether they were good for unlocking that bit of thinking or revealed that misconception,” said Cambridge Assessment research director Tim Oates in an interview with TES. “We don’t think we should necessarily just commission those through asking a limited number of people.”
Ideally, all teachers would be able to access the new, crowdsourced questions through an online question bank. Sourcing questions directly from teachers would result, Oates says, in “really interesting questions which—put to children—encourage them to think hard, to integrate things, to understand things and challenge their ideas a bit.”
But some educators have raised concerns over the crowdsourcing method, claiming that “teacher influence could give some pupils an unfair advantage.”
Professor Alan Smithers, an assessment expert from Buckingham University, says it may come down to who’s submitting the most questions: “The risk is that some schools begin to dominate the process [of submitting questions] and they’re probably the schools who do well at the moment and are very aware of the examinations game and will play it to their advantage. Other schools will think they’re too busy and have other things to do than have their brains picked.”
Students might recognise familiar questions, too, says Malcolm Trobe, interim general secretary of the Association of School and College Leaders: “The only potential weakness I can see in the system is, if I’m sending in questions and using the same questions in preparation for the examination, if they then turn up on the exam paper it advantages those youngsters because they’re seeing a question they’re familiar with.”
But Oates says this has always been a problem with exams, as schools have always been able to anticipate and prepare for certain types of questions using exams from previous years. The crowdsourced approach will not create unfairness, he says, because “there would be so many questions in the pool that teachers would have little incentive to drill pupils for any particular question.” Plus, no exam would “be made up entirely of crowdsourced questions.”
At least not yet.