Robert L. Ebel and David A. Frisbie (1991) write in their book that “teachers are often as concerned with measuring the ability of students to think about and use knowledge as they are with measuring the knowledge their students possess. In these instances, tests are needed that permit students some degree of latitude in their responses. Essay tests are adapted to this purpose.” In an essay test, the student writes a response to a question, and that response may run from several paragraphs to several pages. Essays can be used for higher-level learning outcomes such as synthesis or evaluation as well as for lower-level outcomes. They provide items in which students supply rather than select the appropriate answer; usually the students compose a response in one or more sentences. Essay tests allow students to demonstrate their ability to recall, organize, synthesize, relate, analyze and evaluate ideas.
Types of Essay Tests
Essay tests may be divided into many types. Monree and Cater (1993) divide essay tests into categories such as: selective recall (basis given); evaluation recall (basis given); comparison of two things on a single designated basis; comparison of two things in general; decisions (for or against); cause and effect; explanation of the use or exact meaning of some word, phrase or statement; summary of some unit of the textbook or article; analysis; statement of relationships; illustration or examples; classification; application of rules, laws, or principles to a new situation; discussion; statement of an author’s purpose in the selection or organization of material; criticism (as to the adequacy, correctness or relevance of a printed statement or of a classmate’s answer to a question on the lesson); reorganization of facts; formulation of new questions (problems and questions raised); and new methods of procedure.
Types of Constructed Response Items
Essay items can vary from very lengthy, open-ended, end-of-semester term papers or take-home tests that have flexible page limits (e.g. 10-12 pages, or no more than 30 pages) to essays with responses restricted to one page or less. Thus essay type items are of two types:
Restricted Response Essay Items
Extended Response Essay Items
I. Restricted Response Essay Items
An essay item that poses a specific problem for which a student must recall the proper information, organize it in a suitable manner, derive a defensible conclusion, and express it within the limits of the posed problem, or within a page or time limit, is called a restricted response essay item. The statement of the problem specifies response limitations that guide the student in responding and provide evaluation criteria for scoring.
Example 1:
List the major similarities and differences in the lives of people living in Islamabad and Faisalabad.
Example 2:
Compare the advantages and disadvantages of the lecture method and the demonstration method of teaching.
When Should Restricted Response Essay Items Be Used?
Restricted response essay items are usually used to:
Analyze relationships
Compare and contrast positions
State necessary assumptions
Identify appropriate conclusions
Explain cause-effect relationships
Organize data to support a viewpoint
Evaluate the quality and worth of an item or action
Integrate data from several sources
II. Extended Response Essay Items
An essay item that allows the student to determine the length and complexity of the response is called an extended response essay item. This type of essay is most useful at the synthesis or evaluation levels of the cognitive domain. Extended response items are used when we are interested in determining whether students can organize, integrate, express, and evaluate information, ideas, or pieces of knowledge.
Example:
Identify as many different ways to generate electricity in Pakistan as you can. Give the advantages and disadvantages of each. Your response will be graded on its accuracy, comprehensiveness and practicality. Your response should be 8-10 pages in length, and it will be evaluated according to the rubric (scoring criteria) already provided.
Scoring Essay Type Items
A rubric, or set of scoring criteria, is developed to score an essay type item. A rubric is a scoring guide for subjective assessments. It is a set of criteria and standards, linked to learning objectives, that is used to assess a student's performance on papers, projects, essays, and other assignments. Rubrics allow for standardized evaluation according to specified criteria, making grading simpler and more transparent. A rubric may vary from a simple checklist to an elaborate combination of checklists and rating scales. How elaborate your rubric is depends on what you are trying to measure. If your essay item is a restricted response item that simply assesses mastery of factual content, a fairly simple listing of essential points is sufficient. An example of a rubric for a restricted response item is given below.
Test Item:
Name and describe five of the most important factors of unemployment in Pakistan. (10 points)
Rubric/Scoring Criteria:
(i) One point for each of the factors named, to a maximum of 5 points.
(ii) One point for each appropriate description of the factors named, to a maximum of 5 points.
(iii) No penalty for spelling, punctuation, or grammatical errors.
(iv) No extra credit for more than five factors named or described.
(v) Extraneous information will be ignored.
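The rubric above amounts to a small counting rule, which can be sketched in Python. This is only an illustrative sketch, not part of the original rubric: the names FactorJudgment and score_response and the placeholder factor labels are assumptions made here, and the sketch takes for granted that a human grader has already decided which relevant factors a response names and which of them it describes appropriately. Spelling errors and extraneous material never enter the calculation, which mirrors criteria (iii) and (v).

```python
from dataclasses import dataclass

MAX_FACTORS = 5  # the rubric gives no credit beyond five factors


@dataclass
class FactorJudgment:
    """A grader's judgment about one relevant factor found in a response.

    Hypothetical structure for illustration only.
    """
    name: str
    described_appropriately: bool


def score_response(judgments):
    """Return the score out of 10 for the restricted response item."""
    # Criterion (i): one point per factor named, capped at five.
    naming_points = min(len(judgments), MAX_FACTORS)
    # Criterion (ii): one point per appropriate description, capped at five.
    description_points = min(
        sum(1 for j in judgments if j.described_appropriately),
        MAX_FACTORS,
    )
    return naming_points + description_points


# Placeholder labels; a real grader would record the factors actually named.
example = [
    FactorJudgment("factor A", True),
    FactorJudgment("factor B", True),
    FactorJudgment("factor C", False),
]
print(score_response(example))  # 3 named + 2 described = 5
```

Capping each component separately means that a student who names seven factors and describes all of them still receives 10 points, as criterion (iv) requires.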
However, when essay items measure higher-order thinking skills of the cognitive domain, more complex rubrics are necessary. An example of a rubric for a writing test in language is given below.
Advantages of Essay Type Items
The main advantages of essay type tests are as follows:
(i) They can measure complex learning outcomes which cannot be measured by other means.
(ii) They emphasize integration and application of thinking and problem solving skills.
(iii) They can be easily constructed.
(iv) They give examinees freedom to respond within broad limits.
(v) The students cannot guess the answer because they have to supply it rather than select it.
(vi) In practice, it is more economical to use essay type tests if the number of students is small.
(vii) They require less time for typing, duplicating or printing. They can also be written on the blackboard if the number of students is not large.
(viii) They can measure divergent thinking.
(ix) They can be used as a device for measuring and improving the language and expression skills of examinees.
(x) They are more helpful in evaluating the quality of the teaching process.
(xi) Studies have shown that when students know that essay type questions will be asked, they focus on learning broad concepts and on articulating relationships, contrasting and comparing.
(xii) They set higher standards of professional ethics for teachers because they demand more of the teachers' time for assessing and scoring.
Limitations of Essay Type Items
Essay type tests have the following serious limitations as measuring instruments:
(i) A major problem is the lack of consistency in judgments, even among competent examiners.
(ii) They are subject to halo effects. If the examiner is measuring one characteristic, he can be influenced in scoring by another characteristic. For example, a well-behaved student may score more marks partly on account of his good behaviour.
(iii) They have a question-to-question carry-over effect. An examinee who answers satisfactorily at the beginning of a question, or on the first questions, is likely to score more than one who did not do well at the beginning but did well later on.
(iv) They have an examinee-to-examinee carry-over effect. A particular examinee gets marks not only on the basis of what he has written but also on the basis of whether the previous answer book examined by the examiner was good or bad.
(v) They have limited content validity because only a small sample of questions can be asked in an essay type test.
(vi) They are difficult to score objectively because the examinee has wide freedom of expression and writes long answers.
(vii) They are time consuming both for the examiner and the examinee.
(viii) They generally emphasize the lengthy enumeration of memorized facts.
Suggestions for Writing Essay Type Items
I. Ask questions or establish tasks that will require the student to demonstrate command of essential knowledge. This means that students should not be asked merely to reproduce material heard in a lecture or read in a textbook. To "demonstrate command" requires that the question be somewhat novel or new. The substance of the question should be essential knowledge rather than trivia that might be a good board game question.
II. Ask questions that are determinate, in the sense that experts (colleagues in the field) could agree that one answer is better than another. Questions that contain phrases such as "What do you think..." or "What is your opinion about..." are indeterminate. They can be used as a medium for assessing skill in written expression, but because they have no clearly right or wrong answer, they are useless for measuring other aspects of achievement.
III. Define the examinee's task as completely and specifically as possible without interfering with the measurement process itself. It is possible to word an essay item so precisely that there is one and only one very brief answer to it. The imposition of such rigid bounds on the response is more limiting than it is helpful. Examinees do need guidance, however, to judge how extensive their response must be to be considered complete and accurate.
IV. Generally give preference to specific questions that can be answered briefly. The more questions used, the better the test constructor can sample the domain of knowledge covered by the test. And the more responses available for scoring, the more accurate the total test scores are likely to be. In addition, brief responses can be scored more quickly and more accurately than long, extended responses, even when there are fewer of the latter type.
V. Use enough items to sample the relevant content domain adequately, but not so many that students do not have sufficient time to plan, develop, and review their responses. Some instructors use essay tests rather than one of the objective types because they want to encourage and provide practice in written expression. However, when time pressures become great, the essay test is one of the most unrealistic and negative writing experiences to which students can be exposed. Often there is no time for editing, for rereading, or for checking spelling. Planning time is shortchanged so that writing time will not be. There are few, if any, real writing tasks that require such conditions. And there are few writing experiences that discourage the use of good writing habits as much as essay testing does.
VI. Avoid giving examinees a choice among optional questions unless special circumstances make such options necessary. The use of optional items destroys the strict comparability between student scores because not all students actually take the same test. Student A may have answered items 1-3 and Student B may have answered items 3-5. In these circumstances the variability of scores is likely to be quite small because students were able to respond to items they knew more about and ignore items with which they were unfamiliar. This reduced variability contributes to reduced test score reliability. That is, we are less able to identify individual differences in achievement when the test scores form a very homogeneous distribution. In sum, optional items restrict score comparability between students and contribute to low score reliability due to reduced test score variability.
VII. Test the question by writing an ideal answer to it. An ideal response is needed eventually to score the responses. If it is prepared early, it permits a check on the wording of the item, the level of completeness required for an ideal response, and the amount of time required to furnish a suitable response. It even allows the item writer to determine if there is any "correct" response to the question.
VIII. Specify the time allotment for each item and/or specify the maximum number of points to be awarded for the "best" answer to the question. Both pieces of information provide guidance to the examinee about the depth of response expected by the item writer. They also represent legitimate pieces of information a student can use to decide which of several items should be omitted when time begins to run out. Often the number of points attached to the item reflects the number of essential parts to the ideal response. Of course if a definite number of essential parts can be determined, that number should be indicated as part of the question.
IX. Divide a question into separate components when there are obvious multiple questions or pieces to the intended responses. The use of parts helps examinees organizationally and, hence, makes the process more efficient. It also makes the grading process easier because it encourages organization in the responses. Finally, if multiple questions are not identified, some examinees may inadvertently omit some parts, especially when time constraints are great.
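Suggestion IV above states that the more responses are available for scoring, the more accurate the total test scores are likely to be. One standard way to put a number on this, which the text above does not itself mention, is the Spearman-Brown prophecy formula. The sketch below assumes that the added questions are comparable in quality to the existing ones; the function name spearman_brown and the example values are illustrative assumptions, not figures from the source.

```python
def spearman_brown(reliability, length_factor):
    """Predict the reliability of a test lengthened by `length_factor`.

    reliability   -- reliability of the current test (between 0 and 1)
    length_factor -- new length divided by current length (2 = twice as long)
    Assumes the added items behave like the existing ones.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Illustrative values only: a short test with reliability 0.50,
# doubled in length with comparable questions.
print(round(spearman_brown(0.50, 2), 3))  # 0.667
```

Under these assumptions, doubling the number of comparable questions raises the predicted reliability from 0.50 to about 0.67, which is one reason to prefer several brief questions over a few extended ones.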