I. INTRODUCTION
Today, many institutions and individuals are required to construct tests themselves. Teachers in particular need to know the steps of test construction. After planning, writing, and reviewing the items, the test should be pre-tested and the results analyzed. Here is a summary of the steps in constructing a test.
II. SUMMARY OF CONTENT
Pre-testing
Pretesting is a process for determining a target group's reaction to and understanding of health messages or behavior-change information before materials are produced in final form.
Pretesting is not research to help you understand the audience; that is called formative research. Conduct formative research on your target audience before developing a BCC (behavior change communication) project.
During pretesting, members of the target group are asked to react to draft BCC materials. Their responses are analyzed, and then the materials are revised. Pretesting may be conducted several times before final materials are produced.
Pretesting Methods
The two most common pretesting methods are individual interviews and focus group discussions. Readability testing and expert review are also used.
1) Individual Interviews are one-on-one interviews in which discussion between one interviewer and one participant takes place in a private, confidential setting.
2) Focus Group Discussions (FGDs) are small group gatherings (8-10 people per session) in which the materials and messages are discussed in a group setting.
3) Readability Assessments help determine the level of reading difficulty of a written material. This is done during the materials development process, before pretesting with the target audience.
4) Expert Review involves asking experts to review the draft materials and give comments and suggestions for improvement.
Item Facility (IF)
Item facility (sometimes called item difficulty) refers to the proportion of correct responses given to a certain item. For example, if sixty out of one hundred examinees give a correct response to a particular item, the item facility is calculated to be .60 according to the following formula:

IF = number of correct responses / total number of responses = 60 / 100 = .60
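The calculation above can be sketched in a few lines of code. This is a minimal illustration, not from the original text; the function name and sample data are my own.

```python
def item_facility(responses):
    """IF = number of correct responses / total number of responses.

    `responses` marks each examinee's answer to one item as
    correct (1) or incorrect (0).
    """
    return sum(responses) / len(responses)

# 60 correct answers out of 100, matching the worked example:
responses = [1] * 60 + [0] * 40
print(item_facility(responses))  # 0.6
```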
If we consider the proportion of incorrect responses given to an item, the calculated value will be the item difficulty for that item. In the example above, the item difficulty will be .40, as calculated from the following formula:

Item Difficulty = number of wrong responses / total number of responses = 40 / 100 = .40
Both item facility and item difficulty range from 0 to 1, and subtracting the item facility from 1 gives the item difficulty. Consequently, the higher the item facility, the lower the item difficulty, and thus the easier the test item. For example, an item facility of zero corresponds to an item difficulty of 1: nobody has given a correct response to that item, so the item is too difficult. Conversely, an item facility of 1 means that everybody has given a correct response, so the item is very easy. The criterion for an acceptable level of item facility depends on the function of the test. However, as a generally agreed-upon convention, items with facility indexes below .37 or above .63 should be either modified or discarded. The ideal values of item facility for true-false, three-choice, four-choice, and five-choice items are .75, .67, .63, and .60, respectively.
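The two rules just stated (difficulty = 1 - IF, and the .37/.63 acceptability band) can be sketched as follows. The helper names are illustrative assumptions, not from the text.

```python
def item_difficulty(if_value):
    """Item difficulty is the complement of item facility."""
    return 1 - if_value

def needs_revision(if_value, low=0.37, high=0.63):
    """Flag items outside the conventional acceptable facility range."""
    return if_value < low or if_value > high

print(item_difficulty(0.60))  # 0.4
print(needs_revision(0.60))   # False: within the .37-.63 band
print(needs_revision(0.0))    # True: nobody answered correctly, too difficult
print(needs_revision(1.0))    # True: everybody answered correctly, too easy
```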
Item Discrimination (ID)
One of the many purposes of testing is to distinguish knowledgeable examinees from less knowledgeable ones. Each item of the test, therefore, should contribute to this aim; that is, each item should have a certain degree of power to discriminate among examinees on the basis of their knowledge. Item discrimination refers to this power. To calculate item discrimination (ID), the total scores on the test are first ranked, that is, listed from the highest to the lowest. The scores are then divided into two halves: high and low. Finally, the number of examinees who gave correct responses to a particular item in each group is counted, and these numbers are used in the following formula:

ID = (number of correct responses in the high group - number of correct responses in the low group) / (half of the total number of responses)
Item discrimination, like item difficulty, usually ranges from 0 to 1, and the acceptable range for ID is .40 and above. However, it is possible to obtain a negative ID in some cases. This means that more examinees in the high group than in the low group missed the item, indicating that the item is inappropriate. It is also possible for an item to have a good IF index but a weak ID, or vice versa. In either case, the item should be modified or discarded.
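The ID formula and the negative-ID case can be sketched as below. The function name and the sample counts are illustrative assumptions, not from the text.

```python
def item_discrimination(high_correct, low_correct, total_responses):
    """ID = (correct in high group - correct in low group) / (total / 2)."""
    return (high_correct - low_correct) / (total_responses / 2)

# 40 of the high half and 20 of the low half of 100 examinees answered correctly:
print(item_discrimination(40, 20, 100))  # 0.4, at the acceptable threshold

# More correct answers in the low group than in the high group yields a
# negative ID, signalling an inappropriate item:
print(item_discrimination(15, 30, 100))  # -0.3
```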
Based on the results of item analysis, the modifications needed to improve the test items should be made. These may include changing a distractor, a stem, or a complete item, or discarding an item altogether. At this point, the pre-final draft of the test is ready. This test, whose items now have reasonably acceptable characteristics, should go through one more step, referred to as validation.
Test Item Analysis
Item analysis demands that the course planner clearly understand the information derivable from a test when it is administered. Diagnostic information, such as the difficulty of the test items, their discriminating power, or the effectiveness of distractors in the case of objective tests, is an example of such information.
Group evaluation methods in a virtual environment, as listed in module 3, also present such an opportunity, but they require carefully selected approaches that address each method as unique in its own right.
Abounding (1999) defined test analysis as "…the process of examining and analyzing responses to each item on a test with the basic intent of judging the quality of the item, specifically the difficulty and discriminating ability of the item as well as the effectiveness of each alternative." He suggested the following as the information that test analysis can provide:
- To provide quantitative evidence to reveal and/or support the difficulty and discrimination indices of test items.
- To judge the worth or quality of a test.
- To reveal to the test constructor how his/her tests behave, so as to build a test file that is constantly being improved upon.
- To make known and determine what to do regarding subsequent revisions of tests.
- To provide interesting and useful information on the achievement of individual test-takers, which can be used as valuable data for diagnosing individual difficulties and prescribing remedial measures or planning future learning activities.
- To impress on the teacher the need for improvement based on the resulting data (improvements in teaching and teaching resources will often be made obvious by the analysis).
- To provide a basis for discussing test results.
- To provide learning experiences for students, if students assist in or are told the results of item analysis.
III. CONCLUSION
After the first steps, we need to carry out the following steps before producing the final test that will be given to the test-takers: pre-testing the language test and analyzing the results. These steps are really important because they influence the final draft of the test. This concludes the summary of the topic.