Reliability analysis on continuously measured behavioral data

R.G. Jansen, R.G.M. Elbers, E.S. Meyer and L.F. Wiertz

Noldus Information Technology bv, Wageningen, The Netherlands


Background
Reliability analysis theory has been developed for the situation in which two independent raters each assess a series of cases on a nominal scale. The assessments of all cases are tallied in a confusion matrix: a contingency table in which agreements and disagreements between the classifications of the two raters are presented (Figure 1). Reliability statistics, such as percentage of agreement, Cohen's kappa (agreement corrected for chance agreement) and Pearson's correlation coefficient rho, can be computed from the confusion matrix.

Nominal value          Rater 2
                    A     B     C
Rater 1     A       2     3     0
            B       0     4     1
            C       1     1     3

Figure 1. Example of a contingency table in which agreements and disagreements have been tallied.
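
As an illustration, two of the statistics named above can be computed from the table in Figure 1 with a few lines of Python. This is a minimal sketch, not the implementation used in The Observer: the function names are hypothetical, and the table is assumed to have Rater 1 in the rows and Rater 2 in the columns (both statistics are unaffected if the orientation is reversed).

```python
# Confusion matrix from Figure 1.
# Assumed orientation: rows = Rater 1, columns = Rater 2, categories A, B, C.
matrix = [
    [2, 3, 0],
    [0, 4, 1],
    [1, 1, 3],
]


def percentage_agreement(m):
    """Proportion of cases on the diagonal, i.e. classified identically by both raters."""
    total = sum(sum(row) for row in m)
    agreed = sum(m[i][i] for i in range(len(m)))
    return agreed / total


def cohens_kappa(m):
    """Observed agreement corrected for the agreement expected by chance."""
    total = sum(sum(row) for row in m)
    p_observed = percentage_agreement(m)
    row_totals = [sum(row) for row in m]
    col_totals = [sum(col) for col in zip(*m)]
    # Chance agreement: sum over categories of the product of the marginal proportions.
    p_chance = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_chance) / (1 - p_chance)


agreement = percentage_agreement(matrix)
kappa = cohens_kappa(matrix)
print(f"Percentage of agreement: {agreement:.0%}")
print(f"Cohen's kappa: {kappa:.2f}")
```

For the 15 cases in Figure 1, 9 lie on the diagonal, giving 60% agreement; the chance agreement is 1/3, so kappa comes out at 0.40.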

Reliability analysis and continuously recorded behavioral data
In behavioral research, reliability analysis must deal with two sets of time-structured data ('observations'). This is not a problem if the two observations being compared involve behaviors sampled at fixed intervals, since each sample then corresponds to a case. But if they involve behaviors that have been recorded continuously, such that the start and end of each case are determined subjectively, the following problems arise in finding a basis for comparing the data sets:

        Observation 1        Observation 2
Case    Behavior    Time     Behavior    Time
1       Walk        0        Walk        0
2       Jog         5        -           -
3       -           -        Run         7
4       Hold        12       Hold        11

        Observation 1        Observation 2
Case    Behavior    Time     Behavior    Time
1       Walk        0        Walk        0
2       Jog         5        Run         7
3       Hold        12       Hold        11

Figure 2. Example showing the difficulty of objectively determining the number and type of cases.
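
The effect of this ambiguity on the outcome can be made concrete with the two pairings shown in Figure 2. The sketch below is illustrative only: the variable and function names are hypothetical, and counting a record without a counterpart as a disagreement is an assumption made here for the example, not a rule taken from The Observer.

```python
# The same two observations, paired into cases in the two ways shown in Figure 2.
# Each tuple is (behavior in Observation 1, behavior in Observation 2);
# None marks a record that has no counterpart in the other observation.
alignment_four_cases = [
    ("Walk", "Walk"),
    ("Jog", None),
    (None, "Run"),
    ("Hold", "Hold"),
]
alignment_three_cases = [
    ("Walk", "Walk"),
    ("Jog", "Run"),
    ("Hold", "Hold"),
]


def percentage_agreement(cases):
    """Share of cases in which both observations record the same behavior.

    Assumption for this sketch: an unmatched record (None) counts as a disagreement.
    """
    agreements = sum(1 for first, second in cases if first is not None and first == second)
    return agreements / len(cases)


agreement_four = percentage_agreement(alignment_four_cases)
agreement_three = percentage_agreement(alignment_three_cases)
print(f"Four cases:  {agreement_four:.0%} agreement")
print(f"Three cases: {agreement_three:.0%} agreement")
```

With four cases the two observations agree on 2 of 4 cases (50%); with three cases they agree on 2 of 3 (67%). The underlying records are identical, so the difference comes entirely from how the cases were delimited.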

Reliability analysis using The Observer software
The Observer 4.1 offers a new set of reliability functions capable of taking the time-structured nature of the data into account. Reliability analyses are based on four different methods of comparing continuously recorded data sets, which differ in the way the three problems mentioned earlier are dealt with:

Furthermore, The Observer presents outcomes at different levels:

Figure 3. Example of a case-by-case comparison in The Observer 4.1.

Figure 4. Example of a confusion matrix in The Observer 4.1.

Figure 5. Example of reliability measures in The Observer 4.1.


Paper presented at Measuring Behavior 2002, 4th International Conference on Methods and Techniques in Behavioral Research, 27-30 August 2002, Amsterdam, The Netherlands

© 2002 Noldus Information Technology bv