
calibration and monitoring requires standardized approaches to data collection, data management, processing, and results generation.
Types of Biometric Performance Testing Standards
Biometric performance tests are typically categorized as technology tests, scenario tests, or operational tests. These test types share commonalities – addressed in framework performance testing standards – but also differ in important ways.
▶ Technology tests are those in which biometric algorithms enroll and compare archived (i.e., previously-collected) data. An essential characteristic of technology testing is that the test subject is not ‘‘in the loop’’ – the test subject provides data in advance, and biometric algorithms are implemented to process large quantities of test data. Technology tests often involve cross-comparison of hundreds of thousands of biometric samples over the course of days or weeks. Methods of executing and handling the outputs of such cross-comparisons are a major component of technology-based performance testing standards. Technology tests are suitable for evaluation of both verification- and identification-based systems, although most technology tests are verification-based. Technology testing standards accommodate evaluations based on biometric data collected in an operational system as well as evaluations based on biometric data collected for the specific purpose of testing. Technology tests based on operational data are often designed to validate or project the performance of a fielded system, whereas technology tests based on specially-collected data are typically more exploratory or experimental.
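The offline cross-comparison at the heart of a technology test can be illustrated with a small sketch. All of the samples, the similarity function, and the decision threshold below are hypothetical placeholders for illustration only, not drawn from any standard:

```python
# Minimal sketch of an "offline" technology test: every comparison score
# is computed from archived samples, with no test subject in the loop.
from itertools import combinations

# Archived samples as (subject_id, feature) pairs; the scalar features
# stand in for real biometric templates (hypothetical data).
samples = [("A", 0.90), ("A", 0.88), ("B", 0.20), ("B", 0.25), ("C", 0.55)]

def similarity(f1, f2):
    # Placeholder matcher: higher score means more similar.
    return 1.0 - abs(f1 - f2)

# Cross-compare all sample pairs, separating genuine (same subject)
# from impostor (different subject) scores.
genuine, impostor = [], []
for (id1, f1), (id2, f2) in combinations(samples, 2):
    score = similarity(f1, f2)
    (genuine if id1 == id2 else impostor).append(score)

threshold = 0.96  # hypothetical operating point
# False match rate: fraction of impostor pairs scoring at/above threshold.
fmr = sum(s >= threshold for s in impostor) / len(impostor)
# False non-match rate: fraction of genuine pairs scoring below threshold.
fnmr = sum(s < threshold for s in genuine) / len(genuine)
print(f"genuine pairs={len(genuine)}, impostor pairs={len(impostor)}")
print(f"FMR={fmr:.2f}, FNMR={fnmr:.2f}")
```

Because the subjects are not in the loop, the same archived data can be re-scored at many thresholds, which is what makes the large cross-comparisons described above tractable.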
▶ Scenario tests are those in which biometric systems collect and process data from test subjects in a specified application. An essential characteristic of scenario testing is that the test subject is ‘‘in the loop,’’ interacting with capture devices in a fashion representative of a target application. Scenario tests evaluate end-to-end systems, inclusive of capture device, quality validation software, enrollment software, and matching software. Scenario tests are based on smaller sample sizes than technology tests due to the costs of recruiting and managing interactions with test subjects (even large scenario tests rarely exceed several hundred test subjects). Scenario tests are also limited in that there is no practical way to standardize the time between enrollment- and recognition-phase data collection; this duration may be days or weeks, depending on the accessibility of test subjects. Scenario-based performance testing standards have defined the taxonomy for interaction between the test subject and the sensor; this taxonomy addresses presentations, attempts, and transactions, each of which describes a type of interaction between a test subject and a biometric system. This is particularly important in that scenario testing is uniquely able to quantify ‘‘level of effort’’ in biometric system usage; level of effort directly impacts both accuracy and capture rates.
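A rough sketch of how the presentation/attempt/transaction taxonomy might be applied when scoring a scenario-test log follows. The log format, the attempt limit, and the helper function are illustrative assumptions, not taken from the standards themselves:

```python
# Sketch of the presentation/attempt/transaction taxonomy from
# scenario-testing standards. Hypothetical policy: a transaction allows
# up to 3 attempts, and an attempt succeeds if any of its presentations
# yields a usable capture.
# Each inner list is one attempt; booleans mark usable presentations.
transactions = [
    [[False, True]],                    # succeeds on attempt 1
    [[False], [False], [True]],         # succeeds on attempt 3
    [[False], [False, False], [False]], # all 3 attempts fail
]

MAX_ATTEMPTS = 3  # hypothetical transaction limit

def transaction_succeeds(attempts):
    # A transaction succeeds if any attempt within the limit contains
    # at least one usable presentation.
    return any(any(p for p in attempt) for attempt in attempts[:MAX_ATTEMPTS])

successes = sum(transaction_succeeds(t) for t in transactions)
failure_to_acquire = 1 - successes / len(transactions)
print(f"{successes}/{len(transactions)} transactions succeeded; "
      f"FTA={failure_to_acquire:.2f}")
```

Counting at all three levels is what lets a scenario test quantify level of effort: the same success rate looks very different if most transactions needed three attempts rather than one.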
▶ Operational tests are those in which a biometric system collects and processes data from actual system users in a field application. Operational tests differ fundamentally from technology and scenario tests in that the experimenter has limited control over data collection and processing. Because operational tests should not interfere with or alter the operational usage being evaluated, it may be difficult to establish ground truth at the subject or sample level. As a result, operational tests may or may not be able to evaluate false accept rates (FAR), false reject rates (FRR), or failure-to-enroll rates (FTE); instead they may only be able to evaluate acceptance rates (without distinction between genuine and impostor transactions) and operational throughput. One of the many challenges facing developers of operational testing standards is that each operational system differs in some way from all others, making it difficult to define commonalities across all such tests. It is therefore essential that operational performance test reports specify which elements were measurable and which were not. Operational tests may also evaluate performance over time, such as with a system in operation for a number of months or years.
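The distinction between what an operational test can and cannot measure can be sketched briefly. The event-log format below is a hypothetical example; the point is that without genuine/impostor ground truth, only aggregate acceptance rate and throughput are recoverable:

```python
# Sketch of operational-test measurement without ground truth.
# Each record is (timestamp_seconds, accepted_flag); no genuine/impostor
# labels exist, so FAR and FRR cannot be separated out.
log = [(0, True), (20, True), (45, False), (70, True), (110, True),
       (150, False), (180, True)]

# Overall acceptance rate, with no distinction between genuine users
# correctly accepted and impostors falsely accepted.
accepted = sum(ok for _, ok in log)
acceptance_rate = accepted / len(log)

# Operational throughput: transactions per minute over the logged window.
elapsed_minutes = (log[-1][0] - log[0][0]) / 60
throughput = len(log) / elapsed_minutes

print(f"acceptance rate={acceptance_rate:.2f}, "
      f"throughput={throughput:.1f}/min")
```

A test report built on such a log should state explicitly, as the text above recommends, that FAR, FRR, and FTE were not measurable.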
In a general sense, as a given biometric technology matures, it passes through the cycle of technology, scenario, and then operational testing. Biometric tests may combine aspects of technology, scenario, and operational testing. For example, a test might combine controlled, ‘‘online’’ data collection from test subjects (an element of scenario testing) with full, ‘‘offline’’ comparison of this data (an element of technology testing). This methodology was implemented in iris recognition testing sponsored by the US Department of Homeland Security in 2005 [1].