I. Choosing Between Objective and Subjective Test Items
There are two general categories of test items: (1) objective items, which require students to select the correct response from several alternatives or to supply a word or short phrase to answer a question or complete a statement; and (2) subjective or essay items, which permit the student to organize and present an original answer. Objective items include multiple-choice, true-false, matching and completion, while subjective items include short-answer essay, extended-response essay, problem solving and performance test items. For some instructional purposes one or the other item type may prove more efficient and appropriate. To begin our discussion of the relative merits of each type of test item, test your knowledge of these two item types by answering the following questions.
Test Item Quiz (circle the correct answer)
1. Essay exams are easier to construct than are objective exams. T F ?
2. Essay exams require more thorough student preparation and study time than objective exams. T F ?
3. Essay exams require writing skills where objective exams do not. T F ?
4. Essay exams teach a person how to write. T F ?
5. Essay exams are more subjective in nature than are objective exams. T F ?
6. Objective exams encourage guessing more so than essay exams. T F ?
7. Essay exams limit the extent of content covered. T F ?
8. Essay and objective exams can be used to measure the same content or ability. T F ?
9. Essay and objective exams are both good ways to evaluate a student's level of knowledge. T F ?
Quiz Answers
1. TRUE. Essay items are generally easier and less time consuming to construct than are most objective test items. Technically correct and content-appropriate multiple-choice and true-false test items require an extensive amount of time to write and revise. For example, a professional item writer produces only 9-10 good multiple-choice items in a day's time.
2. ? According to research findings, it is still undetermined whether essay tests require or facilitate more thorough (or even different) student study preparation.
3. TRUE. Writing skills do affect a student's ability to communicate the correct "factual" information through an essay response. Consequently, students with good writing skills have an advantage over students who have difficulty expressing themselves through writing.
4. FALSE. Essays do not teach a student how to write, but they can emphasize the importance of being able to communicate through writing. Constant use of essay tests may encourage the knowledgeable student who writes poorly to improve his/her writing in order to improve performance.
5. TRUE. Essays are more subjective in nature due to their susceptibility to scoring influences. Different readers can rate identical responses differently; the same reader can rate the same paper differently over time; handwriting, neatness or punctuation can unintentionally affect a paper's grade; and the lack of anonymity can affect the grading process. While impossible to eliminate, scoring influences or biases can be minimized through procedures discussed later in this booklet.
6. ? Both item types encourage some form of guessing. Multiple-choice, true-false and matching items can be correctly answered through blind guessing, while essay items can be answered satisfactorily through well-written bluffing.
7. TRUE. Because of the time required for a student to respond to an essay question, only a few essay questions can be included on a classroom exam. Consequently, a larger number of objective items can be administered in the same amount of time, enabling the test to cover more content.
8. TRUE. Both item types can measure similar content or learning objectives. Research has shown that students respond almost identically to essay and objective test items covering the same content. Studies¹ by Sax & Collet (1968) and Paterson (1926), conducted forty-two years apart, reached the same conclusion:
"...there seems to be no escape from the conclusions that the two types of exams are measuring identical things." (Paterson, p. 246)
This conclusion should not be surprising; after all, a well written essay item requires that the student (1) have a store of knowledge, (2) be able to relate facts and principles, and (3) be able to organize such information into a coherent and logical written expression, whereas an objective test item requires that the student (1) have a store of knowledge, (2) be able to relate facts and principles, and (3) be able to organize such information into a coherent and logical choice among several alternatives.
9. TRUE. Both objective and essay test items are good devices for measuring student achievement. However, as seen in the previous quiz answers, there are particular measurement situations where one item type is more appropriate than the other. Following is a set of recommendations for using either objective or essay test items (adapted from Robert L. Ebel, Essentials of Educational Measurement, 1972, p. 144).
¹Gilbert Sax and LeVerne S. Collet, "An Empirical Comparison of the Effects of Recall and Multiple-Choice Tests on Student Achievement," Journal of Educational Measurement, vol. 5 (1968), 169-73.
Donald G. Paterson, "Do New and Old Type Examinations Measure Different Mental Functions?" School and Society, vol. 24 (August 21, 1926), 246-48.
When to Use Essay or Objective Tests
Essay tests are especially appropriate when:
- the group to be tested is small and the test is not to be reused.
- you wish to encourage and reward the development of student skill in writing.
- you are more interested in exploring the student's attitudes than in measuring his/her achievement.
- you are more confident of your ability as a critical and fair reader than as an imaginative writer of good objective test items.
Objective tests are especially appropriate when:
- the group to be tested is large and the test may be reused.
- highly reliable test scores must be obtained as efficiently as possible.
- impartiality of evaluation, absolute fairness, and freedom from possible test scoring influences (e.g., fatigue, lack of anonymity) are essential.
- you are more confident of your ability to express objective test items clearly than of your ability to judge essay test answers correctly.
- there is more pressure for speedy reporting of scores than for speedy test preparation.
Either essay or objective tests can be used to:
- measure almost any important educational achievement a written test can measure.
- test understanding and ability to apply principles.
- test ability to think critically.
- test ability to solve problems.
- test ability to select relevant facts and principles and to integrate them toward the solution of complex problems.
In addition to the preceding suggestions, it is important to realize that certain item types are better suited than others for measuring particular learning objectives. For example, learning objectives requiring the student to demonstrate or to show may be better measured by performance test items, whereas objectives requiring the student to explain or to describe may be better measured by essay test items. Matching learning objective expectations with certain item types can help you select an appropriate kind of test item for your classroom exam as well as provide a higher degree of test validity (i.e., testing what is supposed to be tested). To further illustrate, several sample learning objectives and appropriate test items are provided below.
Learning Objective: The student will be able to categorize and name the parts of the human skeletal system.
Most Suitable Test Item: Objective Test Item (M-C, T-F, Matching)

Learning Objective: The student will be able to critique and appraise another student's English composition on the basis of its organization.
Most Suitable Test Item: Essay Test Item (Extended-Response)

Learning Objective: The student will demonstrate safe laboratory skills.
Most Suitable Test Item: Performance Test Item

Learning Objective: The student will be able to cite four examples of satire that Twain uses in Huckleberry Finn.
Most Suitable Test Item: Essay Test Item (Short-Answer)
After you have decided to use an objective exam, an essay exam, or both, the next step is to select the kind(s) of objective or essay item that you wish to include on the exam. To help you make such a choice, the different kinds of objective and essay items are presented in the following section of this booklet. The various kinds of items are briefly described and compared to one another in terms of their advantages and limitations for use. Also presented is a set of general suggestions for the construction of each item variation.
III. TWO METHODS FOR ASSESSING TEST ITEM QUALITY
This section of the booklet presents two methods for collecting feedback on the quality of your test items: self-review checklists and student evaluation of test item quality. You can use the information gathered from either method to identify strengths and weaknesses in your item writing.
CHECKLIST FOR EVALUATING TEST ITEMS
EVALUATE YOUR TEST ITEMS BY CHECKING THE SUGGESTIONS WHICH YOU FEEL YOU HAVE FOLLOWED.
Multiple-Choice Test Items
____ When possible, stated the stem as a direct question rather than as an incomplete statement.
____ Presented a definite, explicit and singular question or problem in the stem.
____ Eliminated excessive verbiage or irrelevant information from the stem.
____ Included in the stem any word(s) that might have otherwise been repeated in each alternative.
____ Used negatively stated stems sparingly. When used, underlined and/or capitalized the negative word(s).
____ Made all alternatives plausible and attractive to the less knowledgeable or skillful student.
____ Made the alternatives grammatically parallel with each other and consistent with the stem.
____ Made the alternatives mutually exclusive.
____ When possible, presented alternatives in some logical order (e.g., chronologically, most to least).
____ Made sure there was only one correct or best response per item.
____ Made alternatives approximately equal in length.
____ Avoided irrelevant clues such as grammatical structure, well-known verbal associations or connections between stem and answer.
____ Used at least four alternatives for each item.
____ Randomly distributed the correct response among the alternative positions throughout the test, so that alternatives a, b, c, d, and e each served as the correct response in approximately the same proportion (see the sketch following this checklist).
____ Used the alternatives "none of the above" and "all of the above" sparingly. When used, such alternatives were occasionally the correct response.
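For the answer-position suggestion above, a short script can both randomize key positions and check their balance. The following is a minimal sketch in Python, assuming a made-up item format in which each item lists its alternatives with the correct answer first; the data and function name are illustrative only and are not tied to any particular testing software.

import random
from collections import Counter

# Hypothetical item format: each item lists its alternatives with the
# correct answer in position 0 before any shuffling takes place.
items = [
    ["Paris", "Lyon", "Marseille", "Nice", "Toulouse"],
    ["4", "3", "5", "6", "8"],
    ["mitochondrion", "nucleus", "ribosome", "vacuole", "lysosome"],
]

def shuffle_key_positions(items, seed=None):
    # Shuffle each item's alternatives and record which lettered position
    # (a-e) ends up holding the correct answer.
    rng = random.Random(seed)
    shuffled_items, key_positions = [], []
    for alternatives in items:
        correct = alternatives[0]
        order = alternatives[:]  # copy so the original list is untouched
        rng.shuffle(order)
        shuffled_items.append(order)
        key_positions.append("abcde"[order.index(correct)])
    return shuffled_items, key_positions

shuffled, keys = shuffle_key_positions(items, seed=42)
print(keys)           # the answer key, e.g. ['d', 'a', 'c']
print(Counter(keys))  # how often each position holds the correct answer

Printing the Counter shows how often each letter serves as the key, so a badly skewed answer key is easy to spot before the test is duplicated.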
True-False Test Items
____ Based true-false items upon statements that are absolutely true or false, without qualifications or exceptions.
____ Expressed the item statement as simply and as clearly as possible.
____ Expressed a single idea in each test item.
____ Included enough background information and qualifications so that the ability to respond correctly did not depend on some special, uncommon knowledge.
____ Avoided lifting statements from the text, lecture or other materials.
____ Avoided using negatively stated item statements.
____ Avoided the use of unfamiliar language.
____ Avoided the use of specific determiners such as "all," "always," "none," "never," etc., and qualifying determiners such as "usually," "sometimes," "often," etc.
____ Used more false items than true items (but not more than 15% additional false items).
Matching Test Items
____ Included directions which clearly stated the basis for matching the stimuli with the responses.
____ Explained whether or not a response could be used more than once and indicated where to write the answer.
____ Used only homogeneous material.
____ When possible, arranged the list of responses in some systematic order (e.g., chronologically, alphabetically).
____ Avoided grammatical or other clues to the correct response.
____ Kept items brief (limited the list of stimuli to under 10).
____ Included more responses than stimuli.
____ When possible, reduced the amount of reading time by including only short phrases or single words in the response list.
Completion Test Items
____ Omitted only significant words from the statement.
____ Did not omit so many words from the statement that the intended meaning was lost.
____ Avoided grammatical or other clues to the correct response.
____ Included only one correct response per item.
____ Made the blanks of equal length.
____ When possible, deleted the words at the end of the statement after the student was presented with a clearly defined problem.
____ Avoided lifting statements directly from the text, lecture or other sources.
____ Limited the required response to a single word or phrase.
Essay Test Items
____ Prepared items that elicited the type of behavior you wanted to measure.
____ Phrased each item so that the student's task was clearly indicated.
____ Indicated for each item a point value or weight and an estimated time limit for answering.
____ Asked questions that elicited responses on which experts could agree that one answer is better than others.
____ Avoided giving the student a choice among optional items.
____ Administered several short-answer items rather than one or two extended-response items.
Grading Essay Test Items
____ Selected an appropriate grading model.
____ Tried not to allow factors which were irrelevant to the learning outcomes being measured to affect your grading (e.g., handwriting, spelling, neatness).
____ Read and graded all class answers to one item before going on to the next item (see the sketch following this checklist).
____ Read and graded the answers without looking at the student's name to avoid possible preferential treatment.
____ Occasionally shuffled papers during the reading of answers.
____ When possible, asked another instructor to read and grade your students' responses.
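Several of the grading suggestions above can be combined into a single grading order: read all answers to one item before moving to the next, hide names behind anonymous IDs, and reshuffle the papers between items. The following is a minimal sketch under assumed data structures (a dictionary of anonymized papers keyed by ID, one response per item); none of the names reflect an actual gradebook tool.

import random

# Assumed structure: {anonymous_id: {item_number: response_text}}
papers = {
    "A17": {1: "Response to item 1 ...", 2: "Response to item 2 ..."},
    "B42": {1: "Another response ...", 2: "Another response ..."},
    "C09": {1: "A third response ...", 2: "A third response ..."},
}

def grading_sequence(papers, rng=None):
    # Yield (item_number, anonymous_id, response) so that every answer to one
    # item is read before the next item, with paper order reshuffled per item.
    rng = rng or random.Random()
    item_numbers = sorted({n for responses in papers.values() for n in responses})
    for item in item_numbers:
        ids = list(papers)
        rng.shuffle(ids)  # a fresh random order for every item
        for anon_id in ids:
            yield item, anon_id, papers[anon_id][item]

for item, anon_id, response in grading_sequence(papers):
    print(f"Item {item}, paper {anon_id}: {response}")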
Problem Solving Test Items
____ Clearly identified and explained the problem to the student.
____ Provided directions which clearly informed the student of the type of response called for.
____ Stated in the directions whether or not the student must show work procedures for full or partial credit.
____ Clearly separated item parts and indicated their point values.
____ Used figures, conditions and situations which created a realistic problem.
____ Asked questions that elicited responses on which experts could agree that one solution and one or more work procedures are better than others.
____ Worked through each problem before classroom administration.
Performance Test Items
____ Prepared items that elicited the type of behavior you wanted to measure.
____ Clearly identified and explained the simulated situation to the student.
____ Made the simulated situation as "life-like" as possible.
____ Provided directions which clearly informed the students of the type of response called for.
____ When appropriate, clearly stated time and activity limitations in the directions.
____ Adequately trained the observer(s)/scorer(s) to ensure that they were fair in scoring the appropriate behaviors.
STUDENT EVALUATION OF TEST ITEM QUALITY
USING ICES QUESTIONNAIRE ITEMS TO ASSESS YOUR TEST ITEM QUALITY
The following set of ICES (Instructor and Course Evaluation System) questionnaire items can be used to assess the quality of your test items. The items are presented with their original ICES catalogue number. You are encouraged to include one or more of the items on the ICES evaluation form in order to collect student opinion of your item writing quality.
102--How would you rate the instructor's examination questions? (Excellent ... Poor)
103--How well did examination questions reflect content and emphasis of the course? (Well related ... Poorly related)
109--Were exams, papers, reports returned with errors explained or personal comments? (Almost always ... Almost never)
114--The exams reflected important points in the reading assignments. (Strongly agree ... Strongly disagree)
115--Were the instructor's test questions thought provoking? (Definitely yes ... Definitely no)
116--Did the exams challenge you to do original thinking? (Yes, very challenging ... No, not challenging)
117--Examinations mainly tested trivia. (Strongly agree ... Strongly disagree)
118--Were there "trick" or trite questions on tests? (Lots of them ... Few if any)
119--Were exam questions worded clearly? (Yes, very clear ... No, very unclear)
121--How was the length of exams for the time allotted? (Too long ... Too short)
122--How difficult were the examinations? (Too difficult ... Too easy)
123--I found I could score reasonably well on exams by just cramming. (Strongly agree ... Strongly disagree)
125--Were exams adequately discussed upon return? (Yes, adequately ... No, not enough)
IV. ASSISTANCE OFFERED BY THE CENTER FOR INNOVATION IN TEACHING AND LEARNING (CITL)
The information in this booklet is intended for self-instruction. However, CITL staff members will consult with faculty who wish to analyze and improve their test item writing. The staff can also consult with faculty about other instructional problems. The Measurement and Evaluation Division of CITL also publishes a semi-annual newsletter, Measurement and Evaluation Q & A, which discusses various classroom testing and measurement issues. Instructors wishing to receive the newsletter or to obtain CITL assistance can call the Measurement and Evaluation Division at 333-3490.
V. REFERENCES FOR FURTHER READING
Ebel, Robert L. Measuring educational achievement. Englewood Cliffs, New Jersey: Prentice-Hall, 1965, Chapters 4-6.
Ebel, Robert L. Essentials of educational measurement. Englewood Cliffs, New Jersey: Prentice-Hall, 1972, Chapters 5-8.
Gronlund, N. E. Measurement and evaluation in teaching. New York: Macmillan Publishing Co., 1976, Chapters 6-9.
Mehrens, W. A. & Lehmann, I. J. Measurement and evaluation in education and psychology. New York: Holt, Rinehart & Winston, Inc., 1973, Chapters 7-10.
Nelson, C. H. Measurement and evaluation in the classroom. New York: Macmillan Publishing Co., 1970, Chapters 5-8. Especially useful for science instruction.
Payne, David A. The assessment of learning. Lexington, Mass.: D.C. Heath and Co., 1974, Chapters 4-7.
Scannell, D. P. & Tracy, D. B. Testing and measurement in the classroom. New York: Houghton-Mifflin Co., 1975, Chapters 4-6.
Thorndike, R. L. (Ed.). Educational measurement (2nd ed.). Washington, D.C.: American Council on Education, 1971, Chapter 9 (Performance testing) and Chapter 10 (Essay exams).