ELT News, March 2010
Monitoring and Evaluating the KPG Speaking Test
By Xenia Delieza*
Introduction
Oral assessment of language proficiency is a complex and largely subjective process in which many variables, or facets, have been found to affect the quality and quantity of candidate language output and the rating of candidate performance. Ultimately, this threatens the validity, reliability and fairness of the oral test procedure. The role and linguistic behaviour of the interlocutor during the oral exam have been highlighted by many researchers as a major variable which can potentially affect candidate output and examiner rating. With this in mind, the KPG English team has undertaken not only systematic examiner training but also examiner-conduct quality assessment, through the English Speaking Test Observation Project (ESTOP).
This project was launched in November 2005, aiming to identify whether and to what extent examiners follow oral test conduct rules, adhere to the test guidelines and carry out the oral test as instructed. In other words, by having specially trained professionals observe examiners while they test candidates, with the help of specially constructed observation tools, the English team wanted to obtain information about the efficiency of the oral test administration, about examiner conduct, about the applicability of the oral assessment criteria and about the usefulness of the marking grids. The information obtained has been essential for the development and refinement of the oral test and for the training and evaluation of oral examiners. The results of this first phase (November 2005) fed into Observation Phase 2 in May 2006, which in turn led to four more observation phases: May 2007, November 2007, May 2008 and November 2008.
To date, six
observation phases have been carried
out and for each one, a new, refined
observation form has been produced,
based on the findings of the
previous phase.
As one can see in the
table below, during these six
observation phases 1,948 oral
examiners were observed examining
6,755 candidates.
Phase                        Observers   Level   Examiners   Candidates
PHASE 1: November 2005       25          B2      138         470
(Levels B2 & C1)                         C1      98          288
PHASE 2: May 2006            33          B2      155         540
(Levels B2 & C1)                         C1      118         418
PHASE 3: May 2007            32          B1      35          132
(Levels B1, B2 & C1)                     B2      156         588
                                         C1      105         342
PHASE 4: November 2007       42          B1      50          201
(Levels B1, B2 & C1)                     B2      177         753
                                         C1      100         339
PHASE 5: May 2008            48          A1-2    45          184
(Levels A1-2, B1, B2 & C1)               B1      60          193
                                         B2      182         612
                                         C1      136         440
PHASE 6: November 2008       41          A1-2    51          113
(Levels A1-2, B1, B2 & C1)               B1      55          154
                                         B2      187         659
                                         C1      100         329

Table 1: The KPG observation project in numbers
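As a quick sanity check, the figures in Table 1 can be re-tallied to reproduce the totals cited above. The following Python snippet is purely illustrative; the dictionary layout and labels are an assumption for this example, not an ESTOP data format.

    # Illustrative only: re-tallying Table 1 to reproduce the totals
    # cited in the text (1,948 examiners, 6,755 candidates).
    # Each level maps to a (examiners, candidates) pair.
    table1 = {
        "Phase 1": {"B2": (138, 470), "C1": (98, 288)},
        "Phase 2": {"B2": (155, 540), "C1": (118, 418)},
        "Phase 3": {"B1": (35, 132), "B2": (156, 588), "C1": (105, 342)},
        "Phase 4": {"B1": (50, 201), "B2": (177, 753), "C1": (100, 339)},
        "Phase 5": {"A1-2": (45, 184), "B1": (60, 193),
                    "B2": (182, 612), "C1": (136, 440)},
        "Phase 6": {"A1-2": (51, 113), "B1": (55, 154),
                    "B2": (187, 659), "C1": (100, 329)},
    }

    examiners = sum(e for levels in table1.values() for e, _ in levels.values())
    candidates = sum(c for levels in table1.values() for _, c in levels.values())
    print(examiners, candidates)  # -> 1948 6755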
How is observation conducted?
During observation, selected trained professionals are assigned to different examination centres to monitor the oral test without interfering with the procedure in any way. Watching as third parties, observers fill in their forms before, during and after each oral test. The observation forms are designed so that each one covers a single test session, and observers are instructed to monitor each examiner twice, i.e. with two pairs of candidates.
So far, the project has been conducted at a random selection of examination centres around Greece,[1] where observers move from one examination room to another for as long as the examination sessions last, from morning to afternoon.
When their observation work is complete, observers send their completed observation forms to the English Team, which processes the information and analyses the data. The qualitative and quantitative results are compiled in a report that is taken into account by the speaking test development team, by those responsible for designing the next phase of observation and by those responsible for the examiner training programme.
The observation forms
The tools prepared for this project, i.e. the observation forms, are structured as checklists with specific categories and subcategories. Respondents circle YES/NO or tick each item, and there is also space for open-ended remarks next to certain items. The content of these forms helps the English Team elicit information about the candidates (age, sex, literacy level and how well they did on which tasks). More importantly, however, the forms are designed to elicit information about the examiners and their conduct: their choice of tasks, whether or not they used time effectively, how they applied the marking criteria, and so on. Finally, the forms elicit information about examiners' language use and whether or not they alter task rubrics and thus interfere with candidates' language output.
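To make the structure of such a form concrete, here is a minimal sketch in Python of what one observation record might look like, assuming a simple checklist of YES/NO items grouped into categories with optional free-text remarks. All field names and the sample item are hypothetical; they do not reproduce the actual ESTOP form.

    # Hypothetical sketch of one observation record; the real ESTOP
    # form categories and items are not reproduced here.
    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class ChecklistItem:
        text: str                      # the statement the observer responds to
        answer: Optional[bool] = None  # YES/NO tick; None if not applicable
        remark: str = ""               # optional open-ended comment

    @dataclass
    class ObservationForm:
        phase: int                     # observation phase (1-6)
        level: str                     # e.g. "B2"
        # categories (e.g. "examiner conduct", "task rubrics",
        # "use of time", "marking criteria") mapped to their items
        categories: dict = field(default_factory=dict)

    form = ObservationForm(phase=6, level="B2")
    form.categories["task rubrics"] = [
        ChecklistItem("Examiner read the task rubric as printed",
                      answer=False, remark="added an introductory question"),
    ]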
A summary of some
of the results[2]
The findings of the observation project proved valuable in many respects. First, they verified what the English Team suspected regarding the frequency of examiner interventions and their potential effects on the validity, reliability and fairness of the test as a whole. Secondly, they highlighted the need to introduce changes in the examiner training programme, so as to limit examiner intervention. Thirdly, the findings revealed that it is not the case that some examiners systematically intervene while others never do; rather, their interference depends on a number of factors, such as the candidate's level of competence and quality of performance, the stage of the test, etc. More specifically, the findings reveal that examiners most frequently change task rubrics (by using an introductory question, adding their own question or expanding the original question with added information) in the first activity of the lower-level exams. The interpretation is that examiners tend to do this to reduce candidates' anxiety and to facilitate language output. In the other two activities of the B1 and B2 level exams, examiners also tamper with task rubrics, but less frequently than at the lower levels. These interventions mainly take the form of expanding the original task rubric or simplifying it through the use of examples, in order to help candidates understand the task requirements and to ensure that they respond to the demands of the task. A general conclusion is that the higher the level of the oral test, the lower the intensity of examiner interference; during the C1 level speaking test, intervention is only sporadic.
The importance of the
observation project for the KPG oral
test
The information elicited through the ESTOP has proven extremely valuable to the KPG test developers in many ways, especially because the results have contributed to the improvement of test content: the speaking tasks now take into consideration, among other things, the results of the observation project. Furthermore, the guidelines for conducting the speaking test have been shaped by the observation results. One important outcome was the introduction of an Interlocutor Frame, to tackle the problem of examiner performance variation.
The ESTOP has been constructive on a variety of other levels too. First, it has allowed the English Team to evaluate examiners' performance, which is very important since the ultimate aim of the system is to establish and maintain a certified body of trained examiners. Secondly, insights from the project have been crucial for the preparation of examiner training material.
For all the reasons above, and for others that will be discussed in future publications, it has become obvious that structured observation is a practical and effective way to monitor and assess the speaking test and its examiners.
[2] For a more detailed presentation of the results, see Karavas, E. & Delieza, X. (2009). On-site observation of KPG oral examiners: Implications for oral examiner training and evaluation. Apples (Journal of Applied Language Studies), 3(1), 51-77.