I. INTRODUCTION
1.1. The Importance of Test Analysis
Item analysis demands that the course planner clearly understand the information that can be derived from a test once it has been administered. Diagnostic information such as the difficulty of the test for the test-takers, the discriminating power of its items, and, in the case of objective tests, the effectiveness of the distracters are examples of such information. Analyzing a test is very important because it provides several advantages:
- To provide quantitative evidence to reveal and/or support the difficulty and discrimination indices of test items.
- To judge the worth or quality of a test.
- To reveal to the test constructor how his or her tests behave, so as to build a test file that is constantly being improved.
- To make known and determine what to do when making subsequent revisions of tests.
- To provide interesting and useful information on the achievement of individual test-takers, which can serve as valuable data for diagnosing individual difficulties and for prescribing remedial measures or planning future learning activities.
- To impress on the teacher the need for improvement based on the resulting data (improvements in teaching and teaching resources will often be made obvious by the analysis).
- To provide a basis for discussing test results.
- To provide a learning experience for students, if students assist in the item analysis or are told its results.
1.2. What Is to Be Analyzed? (General)
The analysis of a test typically focuses on information such as validity, reliability, difficulty level, discriminating power, and the effectiveness of distracters.
1.3. What Test and What Is to Be Analyzed? (Specific)
In this paper, I will analyze the Grammar II test. The analysis covers the content validity of the test, the difficulty level of its items, their discriminating power, and the effectiveness of their distracters.
II. CONTENT VALIDITY
2.1. Definition
Validity is concerned with the extent to which test results serve their intended use. Several points should be kept in mind when determining validity:
1. Validity refers to the interpretation of test results (not to the test itself).
2. Validity is inferred from available evidence (not measured directly).
3. Validity is specific to a particular use (selection, placement, evaluation of learning, and so forth).
4. Validity is expressed by degree (for example, high, moderate, or low).
Content validity is especially important in achievement testing. We can build a test with high content validity by (1) identifying the subject-matter topics and learning outcomes to be measured, (2) preparing a set of specifications that defines the sample of items to be used, and (3) constructing a test that closely fits that set of specifications.
2.2. Table of Test Specification
| No | Topic | Sub Topic | Test Type | Item Numbers | Number of Items | Percentage |
|----|-------|-----------|-----------|--------------|-----------------|------------|
| 1 | Tenses | Past continuous | Multiple Choice | 15 | 1 | 22.5% |
|   |        | Simple present  |                 | 10 | 1 |  |
|   |        | Simple past     |                 | 24 | 1 |  |
|   |        | Present perfect |                 | 32, 36, 38 | 3 |  |
|   |        | Future perfect  |                 | 34, 35 | 2 |  |
|   |        | Past perfect    |                 | 37 | 1 |  |
| 2 | Modal Aux and Similar Expressions | Would/should | Multiple Choice | 1, 2, 6, 8, 12, 23, 27, 29 | 8 | 67.5% |
|   |        | Could           |                 | 11, 13, 19, 21, 28 | 5 |  |
|   |        | Have/has/had    |                 | 3, 4, 25 | 3 |  |
|   |        | Must            |                 | 5, 7, 14, 16, 17, 18, 20, 22, 26 | 9 |  |
|   |        | Supposed to     |                 | 9 | 1 |  |
|   |        | Used to         |                 | 30 | 1 |  |
| 3 | Passive Voice | Modal passive | Multiple Choice | 22 | 1 | 10% |
|   |        | Present perfect |                 | 31, 33, 39 | 3 |  |
| Total | 3 topics | 14 sub topics |            |    | 40 | 100% |
III. DIFFICULTY LEVEL
3.1. Definition
The difficulty of an item is understood as the proportion of persons who answer the test item correctly. The higher this proportion, the lower the difficulty; conversely, the higher the difficulty of an item, the lower its index. The formula for item difficulty (p) is shown below:

p = A / N

where:
p = difficulty index of the item
A = number of correct answers to the item
N = number of correct answers plus number of incorrect answers to the item
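The difficulty index p = A / N defined above can be computed directly. The following is a minimal sketch; the function name and the list of 0/1 item scores are illustrative assumptions, not part of any particular testing package.

```python
def difficulty_index(responses):
    """Return p = A / N, the proportion of examinees answering the item correctly.

    `responses` is an assumed list of 0/1 item scores, one per examinee
    (1 = correct, 0 = incorrect).
    """
    a = sum(responses)   # A: number of correct answers to the item
    n = len(responses)   # N: correct plus incorrect answers to the item
    return a / n

# Example: 6 of 8 examinees answer the item correctly, so p = 0.75
print(difficulty_index([1, 1, 0, 1, 1, 0, 1, 1]))
```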
3.2. Classification of Difficulty Level
| Percentage of Difficulty Level | Item Meaning   |
|--------------------------------|----------------|
| 100%                           | Too easy       |
| 80% – 99%                      | Very easy      |
| 71% – 79%                      | Easy           |
| 40% – 70%                      | Moderate       |
| 20% – 39%                      | Difficult      |
| 1% – 19%                       | Very difficult |
| 0%                             | Too difficult  |
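The difficulty bands can be applied mechanically to a computed p value. This is an illustrative sketch (the function name is an assumption); percentages falling into the unlabeled gaps between bands, such as between 70% and 71%, are assigned to the adjacent lower band.

```python
def difficulty_level(p):
    """Map a difficulty index p (a proportion between 0 and 1) to its label."""
    pct = p * 100
    if pct == 100:
        return "Too easy"
    if pct >= 80:
        return "Very easy"
    if pct >= 71:
        return "Easy"
    if pct >= 40:
        return "Moderate"
    if pct >= 20:
        return "Difficult"
    if pct >= 1:
        return "Very difficult"
    return "Too difficult"

print(difficulty_level(0.575))  # → Moderate
```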
IV. DISCRIMINATING POWER
4.1. Definition
One of the many purposes of testing is to
distinguish knowledgeable examinees from less knowledgeable ones. Each item of
the test, therefore, should contribute to accomplishing this aim. That is, each
item in the test should have a certain degree of power to discriminate
examinees on the basis of their knowledge. Item discrimination refers to this
power in each item.
The higher the discrimination index, the better the item, because a high value indicates that the item discriminates in favor of the upper group, which should get more items correct; such an item has a positive discrimination index (ID). When more students in the lower group than in the upper group select the right answer to an item, the item actually works against validity; such an item has a negative ID.
4.2. Formula
The discrimination index (D) is computed as:

D = (GA - GB) / (½ N)

where:
D = discrimination index of the item
GA = number of correct answers to the item in the upper group
GB = number of correct answers to the item in the lower group
½ N = half of the total number of responses to the item (that is, the size of one group)

In computing the discrimination index D, first score each student's test and rank-order the test scores. Next, separate the 27% of students at the top and the 27% at the bottom for the analysis.
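The steps just described can be sketched as follows. The data layout (a list of pairs of total score and per-item 0/1 answers) and the function name are assumptions made for illustration only.

```python
def discrimination_index(students, item):
    """Compute D = (GA - GB) / (N/2) for one item.

    `students` is an assumed list of (total_score, answers) pairs, where
    answers[item] is 1 if that student answered the item correctly.
    """
    # Step 1: rank-order the students by total test score.
    ranked = sorted(students, key=lambda s: s[0], reverse=True)
    # Step 2: take the top 27% and the bottom 27% of the ranked students.
    k = max(1, round(0.27 * len(ranked)))
    upper, lower = ranked[:k], ranked[-k:]
    # Step 3: count correct answers to this item in each group.
    ga = sum(s[1][item] for s in upper)   # GA: correct answers in upper group
    gb = sum(s[1][item] for s in lower)   # GB: correct answers in lower group
    # N is the total number of responses in both groups, so ½N equals k.
    return (ga - gb) / k
```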
4.3. Criteria of Discriminating Power
A widely used rule of thumb (Ebel's guidelines) classifies items as follows: a D of 0.40 and above is excellent; 0.30 – 0.39 is reasonably good; 0.20 – 0.29 is marginal and usually calls for improvement; below 0.20 is poor, so the item should be rejected or revised; and a negative D marks a defective item that should be discarded.
V. EFFECTIVENESS OF DISTRACTERS
5.1. Definition
Acceptable p and D values are two important requirements for a single item. However, these values are based only on the numbers of correct and wrong responses to an item; they say nothing about how the distracters have operated. There are cases in which an item shows acceptable p and D values but does not have challenging distracters. Part of the function of test analysis, therefore, is to examine the quality of the distracters.
A discrimination index or discrimination coefficient should be obtained for each option in order to determine each distracter's usefulness (Millman & Greene, 1993). Whereas the discrimination value of the correct answer should be positive, the discrimination values for the distracters should be lower and, preferably, negative. Distracters should be carefully examined when items show large positive D values. When one or more of the distracters looks extremely plausible to the informed reader, and when recognition of the correct response depends on some extremely subtle point, it is possible that examinees will be penalized for partial knowledge.
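As an illustration of per-option analysis, one can tabulate how often each option is chosen in the upper and lower groups: a working distracter should attract more lower-group than upper-group students. The function and data layout below are illustrative assumptions, not a standard API.

```python
from collections import Counter

def effective_distracters(upper_choices, lower_choices, key):
    """Return the distracters chosen by more lower-group than upper-group
    students (a simple sign check on each option's discrimination).

    `upper_choices` / `lower_choices` are assumed lists of the option
    letters chosen by each group; `key` is the correct option.
    """
    up, low = Counter(upper_choices), Counter(lower_choices)
    options = set(up) | set(low)
    return sorted(o for o in options if o != key and low[o] > up[o])

# Options B and C each attract more lower-group students, so both
# count as effective distracters here.
print(effective_distracters(["A", "A", "B"], ["C", "B", "B"], key="A"))
```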
VI. CONCLUSION
The analysis of the Grammar II test presented in this paper leads to the conclusions below:
- The topics of this grammar test consist of 22.5% tenses, 67.5% modal auxiliaries and similar expressions, and 10% passive voice. These topics are covered by 40 items, and the test was administered to 31 test-takers.
- From the ranking of students' scores, the highest score is 35 and the lowest score is 8.
- After analyzing the sample of 20 students, this grammar test proved to be of moderate difficulty, since 57.5% of the 40 items show a moderate difficulty level. Only 5% of the 40 items, represented by items number 21 and 31, fall at the very difficult level.
- As for discriminating power, the test generally showed positive discrimination indices; items with excellent discriminating power make up 67.5% of the 40 items. On the other hand, 5% of the items, numbers 10 and 13, fall at the worst level, showing negative discrimination indices.
- Finally, 66% of the 120 distracters proved effective, in that they successfully distracted more students in the lower group than in the upper group.
REFERENCES
Farhady, Hossein. (1986). Fundamental concepts in language testing (3): Characteristics of language tests: Item characteristics. Roshd Foreign Language Teaching Journal, 2(2 & 3), 15-17.
Alderson, J. C., Clapham, C. M., & Wall, D. (1995). Language Test Construction and Evaluation. Cambridge: Cambridge University Press.
Backhoff, E., Larrazolo, N., & Rosas, M. (2000). The level of difficulty and discrimination power of the Basic Knowledge and Skills Examination (EXHCOBA). Revista Electrónica de Investigación Educativa, 2(1). Retrieved March 8, 2012, from http://redie.uabc.mx/vol2no1/contents-backhoff.html
Matlock-Hetzel, Susan. (1997). Basic Concepts in Item and Test Analysis. Texas A&M University.