Towards an experimental timing standards laboratory

R.R. Plant, N. Hammond and T. Whitehouse

Department of Psychology, University of York, York, United Kingdom


The timing of events in studies of human performance increasingly relies on software tools running within complex hardware and software environments. With the advent of graphical multitasking operating systems, and of the associated software for designing and running studies, it is hard for the researcher to be confident in the accuracy with which stimulus and response events are timed: the reliability of timing is subject to a wide range of influences.
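
As an illustration of one such influence (a hypothetical sketch, not one of the benchmarks described below): under a multitasking operating system, even a request for a fixed delay is honoured only when the scheduler returns control, so the realised delay varies from call to call. A minimal Python check of this, assuming a nominal 10 ms request:

    import time

    REQUESTED_MS = 10  # nominal delay an experiment generator might request

    errors = []
    for _ in range(1000):
        start = time.perf_counter()
        time.sleep(REQUESTED_MS / 1000.0)  # control passes to the OS scheduler
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        errors.append(elapsed_ms - REQUESTED_MS)

    errors.sort()
    print("median overshoot: %.3f ms" % errors[len(errors) // 2])
    print("worst overshoot:  %.3f ms" % errors[-1])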

It is common for chronometric studies by psychologists or human factors engineers to treat differences between conditions of ten milliseconds or less as behaviourally significant, yet the error introduced by hardware and software into the timing of individual events can be considerably larger than this. Whilst repeated sampling can overcome random variation, it cannot counter systematic bias, which may be correlated with the experimental conditions themselves.
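
The distinction can be made concrete with a simple simulation (the figures below are assumed for illustration only, not measurements from the laboratory): averaging many trials shrinks random jitter towards zero, but a constant bias, such as a fixed display lag, passes through the mean untouched.

    import random

    TRUE_RT_MS = 300.0    # hypothetical true reaction time
    BIAS_MS = 15.0        # assumed systematic bias, e.g. a constant display lag
    JITTER_SD_MS = 20.0   # assumed random timing jitter (standard deviation)
    N_TRIALS = 10000

    random.seed(1)
    measured = [TRUE_RT_MS + BIAS_MS + random.gauss(0.0, JITTER_SD_MS)
                for _ in range(N_TRIALS)]

    mean = sum(measured) / N_TRIALS
    # The random component shrinks roughly as 1/sqrt(N); the bias does not.
    print("mean measured RT: %.1f ms (true value %.1f ms)" % (mean, TRUE_RT_MS))
    print("residual error:   %.1f ms (the systematic bias)" % (mean - TRUE_RT_MS))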

Accuracy claims made by commercial software developers are seldom supported by hard evidence and, in any case, may refer to specific, ideal operating conditions that users cannot always reproduce. Moreover, rapid changes in hardware, operating systems and associated software mean that the same product may not perform in the same way in different laboratories, or following an upgrade.

Careful and technologically sophisticated users may well take steps to ensure the validity and reliability of their measuring instruments. However, the widespread penetration of experiment generators into research and teaching laboratories raises serious concerns about flawed data collection and about inappropriate training of students in research methodology.

With national research council backing, we have established the Experimental Timing Standards Laboratory and formulated recognised benchmarks for testing the timing characteristics of most tools used by behavioural scientists for chronometric studies. This has required the development of appropriate testing criteria, benchmark standards and testing methodologies, the collection of timing data from commonly used tools, the dissemination of information concerning good practice, and the establishment of a continuing testing service. In this paper, we outline these benchmarks and discuss our findings in relation to some commonly used packages. See also http://www.psychology.ltsn.ac.uk/ETSL


Paper presented at Measuring Behavior 2002, 4th International Conference on Methods and Techniques in Behavioral Research, 27-30 August 2002, Amsterdam, The Netherlands

© 2002 Noldus Information Technology bv