Document 3A

DESIGN OF ON-FARM EXPERIMENTS

1. Experiments at Different Stages of OF Research

Like any other form of research project, an OF research project passes through different stages, from the exploratory, through the detailed examination of components, to the verification. These stages are rarely completely separable. For example, at verification there will sometimes be additional treatments of possible promise. It is, of course, crucial that the objectives at each stage of the experimentation are clearly identified and relevant to that stage.

The choice of treatments is discussed in general in section 3. Here it is worth noting that there will be a general tendency for the number of experimental treatments to decline through the programme. At the earliest stages there should be many treatment factors, each with few levels. In subsequent development the number of factors in a single experiment will tend to decline, with the number of levels tending to increase. At no stage should there be many factors each with many levels, since such an experiment would appear to be asking whether interactions were important and simultaneously assuming that they were while trying to identify best levels of each factor. It is not just the impossibility of managing such a trial but the contradiction in objectives that makes it inappropriate.

Many of the concepts discussed later in this paper are relevant to most stages, but there will be differences, at least of emphasis. There will be differences in the number and distribution of farms, with larger numbers, certainly, for the verification stage. I assume that the questions of choice of sample farms and the definition of recommendation domains are not covered in this paper, being discussed in CIMMYT Training Working Documents Nos. 4 and 5.

2. General Statistical Principles of Precision, Replication and Resource

2.1 Precision of Results

It is extremely important to assess, before the experiment, the precision of the information to be obtained from the experimental results. This involves thinking about three quantities. The first is the likely background variability, as measured by the Coefficient of Variation, CV, or the plot Standard Deviation, s. The second is the difference in yield (or other performance variable) which is important, d, or A if the difference is expressed as a percentage of the experimental mean yield. The third is the number of replications, n, for each treatment.

Assume that we are interested in the comparison between two treatments, each having a total of n observations across the whole experiment. The crucial statistical result is that the standard error of the difference between two means is:

$$\mathrm{SE}(\bar{x}_1 - \bar{x}_2) = \sqrt{2s^2/n} = s\sqrt{2/n}$$

To have a realistic chance of identifying whether the true difference between the treatments $(\mu_1 - \mu_2)$ is as large as our critical difference d, we must make the SE a good deal smaller than d. How much smaller depends on the significance level we propose to use and the risk we are prepared to accept of missing a true difference as big as d. A useful rule of thumb is to try to make the SE no bigger than d/3. This means that d is 3 standard errors, which allows 2 standard errors for achieving a 5% significance level and an extra standard error for bad luck in getting $(\bar{x}_1 - \bar{x}_2)$ smaller than d (the risk of missing a true difference of d is one-sixth).
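
The rule of thumb can be turned into a rough planning calculation. Requiring s√(2/n) to be no bigger than d/3 and rearranging gives n ≥ 18(s/d)². The Python sketch below is only an illustration of that arithmetic: the function names `se_difference` and `replications_needed` and the numerical values are assumptions made for the example, not part of the original paper. Here s and d must be in the same units (both in yield units, or both as percentages of the mean).

```python
import math


def se_difference(s, n):
    """Standard error of the difference between two treatment means,
    each based on n observations, with plot standard deviation s."""
    return s * math.sqrt(2.0 / n)


def replications_needed(s, d, ratio=3.0):
    """Smallest n for which the SE of the difference is at most d / ratio.

    ratio=3 is the rule of thumb from the text (SE no bigger than d/3).
    Rearranging s * sqrt(2/n) <= d / ratio gives n >= 2 * (ratio * s / d)**2.
    """
    return math.ceil(2.0 * (ratio * s / d) ** 2)


if __name__ == "__main__":
    # Hypothetical planning values (assumed for illustration only):
    # a CV of 10% and an important difference of 15% of the mean yield.
    s, d = 10.0, 15.0
    n = replications_needed(s, d)
    print("replications per treatment:", n)
    print("SE of difference:", se_difference(s, n), "target d/3:", d / 3)
```

With these assumed values the calculation gives 8 observations per treatment, at which point the SE of the difference equals d/3. Halving the important difference d quadruples the replication needed, which is the main practical message of the formula.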