independent observers using a one-second window. The program also creates a confusion matrix that indicates on which codes the coders agree and disagree. The kappa estimates are affected by the number of other categories in the system and the number of behaviors included in the confusion matrix. To avoid inflating the kappa estimates by including additional, unrelated categories, the present study divided the categories into four classes of behaviors: verbalizations, vocalizations, responses to commands and questions, and physical behaviors. These classes contain the categories that are likely to be confused with one another. For instance, Direct Commands could be confused with Indirect Commands, questions, and descriptions, but not with Laugh or Physical Positive. Each class of behaviors was analyzed separately to reduce any artificial inflationary effect. Kappa estimates for parent and child behaviors were also computed in separate confusion matrices to reduce the likelihood of overestimating reliability.

Fleiss (1981) indicated that kappa values greater than .75 can be considered to represent excellent agreement beyond chance. Kappa values ranging from .60 to .75 indicate good agreement beyond chance, values between .40 and .60 indicate fair agreement, and values below .40 indicate poor agreement. These benchmarks were used to evaluate the kappas obtained in this study.

Intraclass correlations were included in the study to provide an alternative method of evaluating reliability. Intraclass correlations are based on examining the amount of variance attributable to between-subjects differences and the variance attributable to within-subject differences, or, in this case, coder error. The correlation coefficient (ρ) is interpretable as "...the proportion of variance of an observation due to subject-to-subject