I. INTRODUCTION
Today, many institutions and individuals are required to construct tests themselves. Teachers in particular need to know the steps of test construction. After planning, writing, and reviewing the items, the test should be pre-tested and the results analyzed. Here is a summary of the steps in constructing a test.
II. SUMMARY OF CONTENT
Pre-testing
Pretesting is a process for determining a target group's reaction to and understanding of health messages or behavior-change information before materials are produced in final form.
Pretesting is not research to help you understand the audience; that is called formative research. Conduct formative research on your target audience before developing a BCC (behavior change communication) project.
During pretesting, members of the target group are asked to react to draft BCC materials. Their responses are analyzed, and then the materials are revised. Pretesting may be conducted several times before final materials are produced.
Pretesting Methods
The two most common pretesting methods are individual interviews and focus group discussions. Readability testing and expert review are also used.
1) Individual Interviews are one-on-one interviews in which discussion between one interviewer and one participant takes place in a private, confidential setting.
2) Focus Group Discussions (FGDs) are small group gatherings (8-10 people per session) in which the materials and messages are discussed in a group setting.
3) Readability Assessments help determine the level of reading difficulty of a written material. This is done during the materials development process, before pretesting with the target audience.
4) Expert Review involves asking experts to review the draft materials and give comments and suggestions for improvement.
Item Facility (IF)
Item facility (sometimes called item difficulty) refers to the proportion of correct responses given to a certain item. For example, if sixty out of one hundred examinees give a correct response to a particular item, the item facility is calculated to be .60 according to the following formula:

IF = number of correct responses / total number of responses = 60 / 100 = .60
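The calculation above can be sketched in a few lines of code. This is a minimal illustration, not from the original text; the function name and sample data are my own.

```python
def item_facility(responses):
    """IF = number of correct responses / total number of responses.

    `responses` marks each examinee's answer to one item as
    correct (1) or incorrect (0).
    """
    return sum(responses) / len(responses)

# 60 correct answers out of 100, matching the worked example:
responses = [1] * 60 + [0] * 40
print(item_facility(responses))  # 0.6
```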
If we consider the proportion of incorrect responses given to an item, the calculated value will be the item difficulty for that item. In the example above, the item difficulty will be .40, as calculated from the following formula:

Item Difficulty = number of wrong responses / total number of responses = 40 / 100 = .40
Both item facility and item difficulty range from 0 to 1, and subtracting the item facility from 1 gives the item difficulty. Consequently, the higher the item facility, the lower the item difficulty, and thus the easier the test item. For example, an item facility of zero corresponds to an item difficulty of 1: nobody has given a correct response to that item, so the item is too difficult. Conversely, an item facility of 1 means that everybody has given a correct response, so the item is very easy. The criterion for an acceptable level of item facility depends on the function of the test. However, as a generally agreed-upon convention, items with facility indexes below .37 or above .63 should be either modified or discarded. The ideal values of item facility for true-false, three-choice, four-choice, and five-choice items are .75, .67, .63, and .60, respectively.
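The two rules just stated (difficulty = 1 - IF, and the .37/.63 acceptability band) can be sketched as follows. The helper names are illustrative assumptions, not from the text.

```python
def item_difficulty(if_value):
    """Item difficulty is the complement of item facility."""
    return 1 - if_value

def needs_revision(if_value, low=0.37, high=0.63):
    """Flag items outside the conventional acceptable facility range."""
    return if_value < low or if_value > high

print(item_difficulty(0.60))  # 0.4
print(needs_revision(0.60))   # False: within the .37-.63 band
print(needs_revision(0.0))    # True: nobody answered correctly, too difficult
print(needs_revision(1.0))    # True: everybody answered correctly, too easy
```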
Item Discrimination (ID)
One of the many purposes of testing is to distinguish knowledgeable examinees from less knowledgeable ones. Each item of the test, therefore, should contribute to this aim; that is, each item should have a certain degree of power to discriminate among examinees on the basis of their knowledge. Item discrimination refers to this power. To calculate item discrimination (ID), the total scores on the test are first ranked, that is, listed from the highest to the lowest. The scores are then divided into two halves: high and low. Finally, the number of examinees who gave correct responses to a particular item in each group is counted, and these numbers are used in the following formula:

ID = (number of correct responses in the high group - number of correct responses in the low group) / (half of the total number of responses)
Item discrimination, like item difficulty, usually ranges from 0 to 1, and the acceptable range for ID is .40 and above. However, it is possible to obtain a negative ID in some cases. This means that more examinees in the high group than in the low group missed the item, indicating that the item is inappropriate. It is also possible for an item to have a good IF index but a weak ID, or vice versa. In either case, the item should be modified or discarded.
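The ID formula and the negative-ID case can be sketched as below. The function name and the sample counts are illustrative assumptions, not from the text.

```python
def item_discrimination(high_correct, low_correct, total_responses):
    """ID = (correct in high group - correct in low group) / (total / 2)."""
    return (high_correct - low_correct) / (total_responses / 2)

# 40 of the high half and 20 of the low half of 100 examinees answered correctly:
print(item_discrimination(40, 20, 100))  # 0.4, at the acceptable threshold

# More correct answers in the low group than in the high group yields a
# negative ID, signalling an inappropriate item:
print(item_discrimination(15, 30, 100))  # -0.3
```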
Based on the results of item analysis, the modifications needed to improve the test items should be made. These may include changing a distractor, a stem, or a complete item, or discarding an item altogether. At this point, the pre-final draft of the test is ready. This test, whose items now have reasonably acceptable characteristics, should go through one more step, referred to as validation.
Test Item Analysis
Item analysis demands that the course planner clearly understand the information derivable from a test when it is administered. Diagnostic information, such as the difficulty of the test items, their discriminating power, or the effectiveness of distractors in the case of objective tests, is an example of such information.
Group evaluation methods in a virtual environment, as listed in module 3, also present such an opportunity, but they require carefully selected approaches that address each method as unique in its own right.
Abounding (1999) defined test analysis as "…the process of examining and analyzing responses to each item on a test with the basic intent of judging the quality of the item, specifically the difficulty and discriminating ability of the item as well as the effectiveness of each alternative." He suggested the following as the information that test analysis can provide:
- To provide quantitative evidence to reveal and/or support the difficulty and discrimination indices of test items.
- To judge the worth or quality of a test.
- To reveal to the test constructor how his/her tests behave, so as to build a test file that is constantly being improved upon.
- To make known and determine what to do regarding subsequent revisions of tests.
- To provide interesting and useful information on the achievement of individual test-takers, which can be used as valuable data for diagnosing individual difficulties and prescribing remedial measures or planning future learning activities.
- To impress on the teacher the need for improvement based on the resulting data (improvements in teaching and teaching resources will often be made obvious by the analysis).
- To provide a basis for discussing test results.
- To provide learning experiences for students, if students assist in or are told the results of item analysis.
III. CONCLUSION
After the first steps, we need to carry out the following steps before producing the final test that will be given to the test-takers: pre-testing the language test and analyzing the results. These steps are really important because they influence the final draft of the test. This concludes the summary of the topic.