variability in error-free scores" (Fleiss, 1986, p. 3). To achieve a high intraclass correlation, the variability in frequency of occurrence across participants must exceed the variability between the two observers' values for a given behavioral category on a particular subject. Intraclass correlations were computed with a formula derived by Fleiss (1981), which is based on analysis of variance procedures, and were evaluated as standard correlation coefficients (Suen & Ary, 1989).

Tables 3 and 4 summarize the reliability estimates for the parent and child categories. The DPICS I categories were summed across the three five-minute coding intervals (i.e., CDI, PDI, CU) and across the groups (clinic-referred and comparison). The DPICS II categories are ranked within the tables, with the highest kappa estimates appearing first. In addition, following the convention Fleiss (1981) proposed for rating kappa estimates, the estimates are divided into groups considered to have "excellent," "good," and "fair" reliability.

To address the problem of poor reliability estimates in a prior study of DPICS II (Bessmer, 1996) and to simplify the coding system, several DPICS II categories were coded differently in the comparison sample. First, the codes of Unlabeled, Labeled, and Contingent Praise were combined into one category and coded as Praise (total). Second, the Criticism and Smart Talk codes were collapsed into one category, Negative Talk. Third, the parent vocalization codes of Whine and Yell were not coded. Finally, the physical behavior codes of Physical Negative and Destructive were not coded for either fathers or children. Because the father-child dyads in the clinic-referred sample had been