One Way ANOVA: Concepts and Application in Agricultural System

Hussaini Abubakar1, Haruna Danyaya Abubakar2, and Aminu Salisu3
1 Department of Mathematics and Statistics, Hussaini Adamu Federal Polytechnic Kazaure, Jigawa, Nigeria
2 Department of Science Laboratory Technology, Hussaini Adamu Federal Polytechnic Kazaure, Jigawa, Nigeria
3 Department of Science Laboratory Technology, Hussaini Adamu Federal Polytechnic Kazaure, Jigawa, Nigeria
yankwanee85y@gmail.com

Abstract - Agriculturalists seek general explanations for the variation in agricultural yields in response to a treatment. An increasingly popular tool is the powerful statistical technique of one-way analysis of variance (ANOVA), which analyzes the variability in data in order to infer inequality among population means. After exploring the concepts of the technique, we examine the response of the chlorophyll content of the leaves of 160 maize seedlings to nitrogen-phosphorus-potassium (NPK) fertilizer applied at 0 g, 5 g, 10 g and 20 g (the control, treatment 1, treatment 2 and treatment 3, respectively). The analysis revealed a significant effect of the amount of NPK on the chlorophyll content of the maize seedlings at P < 0.05, F(3, 141) = 51.190, P = 0.000. Post hoc comparison using the Tukey HSD test indicated that the mean score for treatment 1 (M = 18.89, SD = 11.58) was significantly different from treatment 3 (M = 1.61, SD = 7.01) and from the control (M = 4.59, SD = 5.49); likewise, the mean score for treatment 2 (M = 21.57, SD = 9.80) was significantly different from treatment 3 and from the control. However, the differences between treatments 1 and 2, and between treatment 3 and the control, were not significant. Altogether the results show that the amount of NPK does have an effect on the chlorophyll content of maize seedlings. The data were analyzed using the computer program SPSS.

Keywords - one-way ANOVA test, multiple comparison tests, NPK, chlorophyll, SPSS.
1. Introduction

The concept of analysis of variance (ANOVA) was established by the British geneticist and statistician Sir R. A. Fisher in 1918 and formally published in his book "Statistical Methods for Research Workers" in 1925. The technique was developed to provide statistical procedures for tests of significance for several group means. ANOVA can be viewed conceptually as an extension of the two independent samples t-test to more than two samples, but it results in a smaller type I error rate and is therefore suited to a wide range of practical problems. Originally the idea was applied mainly to agricultural experiments, but it is presently among the most commonly used research methods in the business, economic, medical and social science disciplines.

Like many other parametric statistical techniques, ANOVA is based on the following statistical assumptions:
a) Homoscedasticity (homogeneity) of variance.
b) Normality of data.
c) Independence of observations.

2. Basic concepts of the one-way ANOVA test

A one-way analysis of variance is used when the data are divided into groups according to only one factor. Assume that the data $y_{11}, y_{12}, \dots, y_{1n_1}$ are a sample from population 1, $y_{21}, y_{22}, \dots, y_{2n_2}$ are a sample from population 2, ..., and $y_{k1}, y_{k2}, \dots, y_{kn_k}$ are a sample from population k. Let $y_{ij}$ denote the observation from the ith group (level) and jth position within that group.

We assume the $Y_{ij}$, $i = 1, 2, \dots, k$ and $j = 1, 2, \dots, n_i$, are values of independent normal random variables with mean $\mu_i$ and constant standard deviation $\sigma$, i.e. $Y_{ij} \sim N(\mu_i, \sigma)$. Equivalently, each $Y_{ij} = \mu_i + \varepsilon_{ij}$, where the $\varepsilon_{ij}$ are normally distributed independent random errors, $\varepsilon_{ij} \sim N(0, \sigma)$. Let $N = n_1 + n_2 + \dots + n_k$ be the total number of observations (the total sample size across all groups), where $n_i$ is the sample size for the ith group. The parameters of this model are the population means $\mu_1, \mu_2, \dots, \mu_k$ and the common standard deviation $\sigma$.

Using many separate two-sample t-tests to compare many pairs of means is a bad idea, because we do not get a p-value or a confidence level for the complete set of comparisons together.

We will be interested in testing the null hypothesis

H_0: \mu_1 = \mu_2 = \dots = \mu_k   (1)

against the alternative hypothesis

H_1: \exists\, 1 \le i, l \le k : \mu_i \ne \mu_l   (2)

(there is at least one pair with unequal means).

Let $\bar{y}_i$ represent the mean of sample i (i = 1, 2, ..., k):

\bar{y}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij},   (3)

$\bar{y}$ represent the grand mean, the mean of all the data points:

\bar{y} = \frac{1}{N} \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij},   (4)

and $S_i^2$ represent the sample variance of group i:

S_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2.   (5)

Then $S^2 = MSE$ is an estimate of the variance $\sigma^2$ common to all k populations:

S^2 = \frac{1}{N - k} \sum_{i=1}^{k} (n_i - 1) S_i^2.   (6)

ANOVA is centered around the idea of comparing the variation between groups (levels) with the variation within samples by analyzing their variances. Define the total sum of squares SST, the sum of squares for error (within groups) SSE, and the sum of squares for treatments (between groups) SSC:

SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2,   (7)

SSE = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 = \sum_{i=1}^{k} (n_i - 1) S_i^2,   (8)

SSC = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (\bar{y}_i - \bar{y})^2 = \sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y})^2.   (9)

Consider the deviation of an observation from the grand mean written in the following way:

y_{ij} - \bar{y} = (y_{ij} - \bar{y}_i) + (\bar{y}_i - \bar{y}).   (10)

Notice that the left side is at the heart of SST, and the right side has the analogous pieces of SSE and SSC. It works out that

SST = SSE + SSC.   (11)

The total mean sum of squares MST, the mean sum of squares for error MSE, and the mean sum of squares for treatments MSC are:

MST = \frac{SST}{df(SST)} = \frac{SST}{N - 1},   (12)

MSE = \frac{SSE}{df(SSE)} = \frac{SSE}{N - k},   (13)

MSC = \frac{SSC}{df(SSC)} = \frac{SSC}{k - 1}.   (14)

The one-way ANOVA, assuming the test conditions are satisfied, uses the following test statistic:

F = \frac{MSC}{MSE}.   (15)

Under H_0 this statistic has Fisher's F distribution with k - 1 and N - k degrees of freedom, F(k - 1, N - k). If

F > F_{1-\alpha, k-1, N-k},   (16)

where $F_{1-\alpha, k-1, N-k}$ is the $(1-\alpha)$ quantile of the F distribution with k - 1 and N - k degrees of freedom, then the hypothesis H_0 is rejected at significance level \alpha.

The results of the computations that lead to the F statistic are presented in an ANOVA table, the form of which is shown in Table 1.

Table 1. Basic one-way ANOVA table.

Source of Variation   Sum of Squares SS   Degrees of freedom df   Mean square   F-Statistic   Tail area above F
Between groups        SSC                 k - 1                   MSC           MSC/MSE       P-value
Within groups         SSE                 N - k                   MSE
Total                 SST                 N - 1

The P-value is the probability, under the null hypothesis, of obtaining an F statistic at least as large as the one observed. If P < \alpha, where \alpha is the chosen significance level, the null hypothesis is rejected.
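To make the computations in equations (3)-(16) concrete, the following is a minimal Python sketch using NumPy and SciPy. The paper's own analysis was done in SPSS, so this is only an illustration; the three groups below are made-up data, not the experimental measurements.

```python
# One-way ANOVA computed "by hand" (equations (3)-(16)) and cross-checked with SciPy.
import numpy as np
from scipy import stats

groups = [
    np.array([4.1, 5.3, 4.8, 6.0, 5.5]),   # hypothetical group 1
    np.array([6.2, 7.1, 6.8, 7.4, 6.5]),   # hypothetical group 2
    np.array([5.0, 4.4, 5.9, 5.2, 4.7]),   # hypothetical group 3
]

k = len(groups)                              # number of groups
n_i = np.array([len(g) for g in groups])     # group sample sizes
N = n_i.sum()                                # total sample size
group_means = np.array([g.mean() for g in groups])   # equation (3)
grand_mean = np.concatenate(groups).mean()           # equation (4)

# Sums of squares, equations (8) and (9); SST = SSE + SSC by equation (11)
SSE = sum(((g - m) ** 2).sum() for g, m in zip(groups, group_means))
SSC = (n_i * (group_means - grand_mean) ** 2).sum()
SST = SSE + SSC

# Mean squares and F statistic, equations (13)-(15)
MSE = SSE / (N - k)
MSC = SSC / (k - 1)
F = MSC / MSE
p_value = stats.f.sf(F, k - 1, N - k)        # tail area above F, cf. equation (16)

print(f"SSC = {SSC:.3f}, SSE = {SSE:.3f}, F = {F:.3f}, p = {p_value:.4f}")

# Cross-check against SciPy's built-in one-way ANOVA
F_check, p_check = stats.f_oneway(*groups)
print(f"scipy.stats.f_oneway: F = {F_check:.3f}, p = {p_check:.4f}")
```

The two printed F and p values agree, which confirms that the table computations above and the library routine implement the same test.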
3. Post hoc comparison procedures

One possible approach to the multiple comparison problem is to make each comparison independently using a suitable statistical procedure. For example, a statistical hypothesis test could be used to compare each pair of means $\mu_I$ and $\mu_J$, I, J = 1, 2, ..., k, I \ne J, where the null and alternative hypotheses are of the form

H_0: \mu_I = \mu_J, \qquad H_1: \mu_I \ne \mu_J.   (17)

An alternative way to test for a difference between $\mu_I$ and $\mu_J$ is to calculate a confidence interval for $\mu_I - \mu_J$. A confidence interval is formed from a point estimate and a margin of error:

(Point estimate) \pm (Margin of error).   (18)

The point estimate is the best guess for the value of $\mu_I - \mu_J$ based on the sample data. The margin of error reflects the accuracy of this guess, given the variability in the data; it also depends on a confidence coefficient, often denoted by $1 - \alpha$. The interval is calculated by subtracting the margin of error from the point estimate to get the lower limit and adding the margin of error to the point estimate to get the upper limit.

If the confidence interval for $\mu_I - \mu_J$ does not contain zero (thereby ruling out $\mu_I = \mu_J$), then the null hypothesis is rejected and $\mu_I$ and $\mu_J$ are declared different at significance level $\alpha$. The multiple comparison tests for population means rest on the same assumptions as the F-test.

There are many different multiple comparison procedures that deal with this problem, among them Fisher's method, Tukey's method, Scheffé's method, Bonferroni's adjustment method, and the Dunn-Šidák method. Some require equal sample sizes while others do not, and the choice of procedure to use with an ANOVA depends on the type of experimental design and the comparisons of interest to the analyst.

The Fisher (LSD) method essentially does not correct the type I error rate for multiple comparisons and is generally not recommended relative to the other options.

The Tukey (HSD) method controls the type I error very well and is generally considered an acceptable technique. There is also a modification of the test for situations where the number of subjects is unequal across cells, called the Tukey-Kramer test.

The Scheffé test can be used for the family of all pairwise comparisons but will always give longer confidence intervals than the other tests. Scheffé's procedure is perhaps the most popular of the post hoc procedures, the most flexible, and the most conservative.

There are several different ways to control the experiment-wise error rate. One of the easiest is the Bonferroni correction: if we plan on making m comparisons or conducting m significance tests, the Bonferroni correction is simply to use \alpha / m as the significance level rather than \alpha. This simple correction guarantees that the experiment-wise error rate will be no larger than \alpha; the resulting comparisons are more conservative than with no adjustment. The Bonferroni correction is probably the most commonly used post hoc adjustment, because it is highly flexible, very simple to compute, and can be used with any type of statistical test (e.g., correlations), not just post hoc tests with ANOVA.

The Šidák method has a bit more power than the Bonferroni method, so from a purely conceptual point of view the Šidák method is always preferred.
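As a quick illustration of the two adjustments just described, the short sketch below computes the per-comparison significance levels for all pairwise comparisons among k = 4 groups (the number of groups in the maize experiment) at the paper's overall level of \alpha = 0.05; the values are purely illustrative.

```python
# Bonferroni vs. Dunn-Šidák per-comparison significance levels for all pairwise comparisons.
from math import comb

alpha = 0.05
k = 4                       # number of groups (control plus three treatments)
m = comb(k, 2)              # number of pairwise comparisons, C = k(k-1)/2 = 6

alpha_bonferroni = alpha / m               # Bonferroni: alpha* = alpha / m
alpha_sidak = 1 - (1 - alpha) ** (1 / m)   # Dunn-Šidák: alpha* = 1 - (1 - alpha)^(1/m)

print(f"m = {m} comparisons")
print(f"Bonferroni per-comparison level: {alpha_bonferroni:.5f}")
print(f"Dunn-Šidák per-comparison level: {alpha_sidak:.5f}")   # slightly larger, hence slightly more power
```

Because the Šidák level is always at least as large as \alpha / m, each individual comparison is slightly easier to declare significant while the family-wise error rate is still held at \alpha.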
The confidence interval for $\mu_I - \mu_J$ is calculated using one of the following formulas:

\bar{Y}_I - \bar{Y}_J \pm t_{1-\alpha/2,\, N-k} \sqrt{S^2 \left( \frac{1}{n_I} + \frac{1}{n_J} \right)},   (19)

where $t_{1-\alpha/2,\, N-k}$ is the quantile of Student's t distribution, by the Fisher method (LSD, Least Significant Difference);

\bar{Y}_I - \bar{Y}_J \pm q_{\alpha,\, k,\, N-k} \sqrt{\frac{S^2}{2} \left( \frac{1}{n_I} + \frac{1}{n_J} \right)},   (20)

where $q_{\alpha,\, k,\, N-k}$ is the quantile of the Studentized range distribution, by the Tukey-Kramer method (HSD, Honestly Significant Difference);

\bar{Y}_I - \bar{Y}_J \pm \sqrt{(k - 1)\, S^2 \left( \frac{1}{n_I} + \frac{1}{n_J} \right) F_{1-\alpha,\, k-1,\, N-k}},   (21)

by the Scheffé method;

\bar{Y}_I - \bar{Y}_J \pm t_{1-\alpha^*/2,\, N-k} \sqrt{S^2 \left( \frac{1}{n_I} + \frac{1}{n_J} \right)},   (22)

where $\alpha^* = \alpha / C$ and $C = \binom{k}{2}$ is the number of pairwise comparisons in the family, by the Bonferroni method; and

\bar{Y}_I - \bar{Y}_J \pm t_{1-\alpha^*/2,\, N-k} \sqrt{S^2 \left( \frac{1}{n_I} + \frac{1}{n_J} \right)},   (23)

where $\alpha^* = 1 - (1 - \alpha)^{1/C}$ and $C = \binom{k}{2}$, by the Dunn-Šidák method.

Test for homogeneity of variance

Many statistical procedures, including analysis of variance, assume that the different populations have the same variance. The test for equality of variances is used to determine whether the assumption of equal variances is valid.

We will be interested in testing the null hypothesis

H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2   (24)

against the alternative hypothesis

H_1: \exists\, 1 \le i, l \le k : \sigma_i^2 \ne \sigma_l^2.   (25)

There are many tests of the assumption of homogeneity of variances. Commonly used tests are the Bartlett (1937), Hartley (1940, 1950), Cochran (1941), Levene (1960), and Brown and Forsythe (1974) tests. The Bartlett, Hartley and Cochran tests are technically tests of homogeneity of variances; the Levene and Brown-Forsythe methods actually transform the data and then test for equality of means. Note that Cochran's and Hartley's tests assume that there are equal numbers of observations in each group.

The tests of Bartlett, Cochran, Hartley and Levene may be applied for a number of samples k > 2. In such situations the power of these tests differs. When the assumption of normality holds for k > 2, these tests can be ranked by decreasing power as follows: Cochran, Bartlett, Hartley, Levene. This preference order also holds when the normality assumption is violated. An exception concerns situations where the samples come from distributions with heavier tails than the normal law: for example, when the samples come from the Laplace distribution, the Levene test turns out to be slightly more powerful than the other three.

Bartlett's test has the following test statistic:

B = C^{-1} \left[ (N - k) \ln S^2 - \sum_{i=1}^{k} (n_i - 1) \ln S_i^2 \right],   (26)

where the constant $C = 1 + \frac{1}{3(k-1)} \left( \sum_{i=1}^{k} \frac{1}{n_i - 1} - \frac{1}{N - k} \right)$ and the meaning of all other symbols is as in Section 2. The hypothesis H_0 is rejected at significance level \alpha when

B > \chi^2_{1-\alpha,\, k-1},   (27)

where $\chi^2_{1-\alpha,\, k-1}$ is the critical value of the chi-square distribution with k - 1 degrees of freedom.

Cochran's test is one of the best methods for detecting cases where the variance of one of the groups is much larger than that of the other groups. It uses the following test statistic:

C = \frac{\max_i S_i^2}{\sum_{i=1}^{k} S_i^2}.   (28)

The hypothesis H_0 is rejected at significance level \alpha when

C > C_{\alpha,\, k,\, n-1},   (29)

where the critical value $C_{\alpha,\, k,\, n-1}$ is found in special statistical tables.

Hartley's test uses the following test statistic:

H = \frac{\max_i S_i^2}{\min_i S_i^2}.   (30)

The hypothesis H_0 is rejected at significance level \alpha when

H > H_{\alpha,\, k,\, n-1},   (31)

where the critical value $H_{\alpha,\, k,\, n-1}$ is found in special statistical tables.

Originally, Levene's test was defined as the one-way analysis of variance on $Z_{ij} = |y_{ij} - \bar{y}_i|$, the absolute residuals, i = 1, 2, ..., k and j = 1, 2, ..., n_i, where k is the number of groups and n_i the sample size of the ith group. The test statistic has Fisher's distribution F(k - 1, N - k) and is given by

F = \frac{(N - k) \sum_{i=1}^{k} n_i (\bar{Z}_i - \bar{Z})^2}{(k - 1) \sum_{i=1}^{k} \sum_{j=1}^{n_i} (Z_{ij} - \bar{Z}_i)^2},   (32)

where $N = \sum_{i=1}^{k} n_i$, $\bar{Z}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} Z_{ij}$ and $\bar{Z} = \frac{1}{N} \sum_{i=1}^{k} \sum_{j=1}^{n_i} Z_{ij}$.

To apply the ANOVA test, several assumptions must be verified, including normal populations, homoscedasticity, and independent observations. The absolute residuals do not meet these assumptions, so Levene's test is an approximate test of homoscedasticity. Brown and Forsythe subsequently proposed using the absolute deviations from the median $\tilde{y}_i$ of the ith group, that is, $Z_{ij} = |y_{ij} - \tilde{y}_i|$.
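The sketch below shows how the Bartlett, Levene and Brown-Forsythe checks described above could be run in Python with SciPy. It is only an illustration: the groups are the same made-up data used earlier, not the maize measurements, and the paper's own tests were run in SPSS.

```python
# Variance-homogeneity checks with SciPy: Bartlett, Levene (mean-centered), Brown-Forsythe.
import numpy as np
from scipy import stats

groups = [
    np.array([4.1, 5.3, 4.8, 6.0, 5.5]),   # hypothetical group 1
    np.array([6.2, 7.1, 6.8, 7.4, 6.5]),   # hypothetical group 2
    np.array([5.0, 4.4, 5.9, 5.2, 4.7]),   # hypothetical group 3
]

# Bartlett's test, equations (26)-(27); sensitive to departures from normality
b_stat, b_p = stats.bartlett(*groups)

# Levene's original test: one-way ANOVA on absolute deviations from the group mean, equation (32)
l_stat, l_p = stats.levene(*groups, center='mean')

# Brown-Forsythe variant: absolute deviations from the group median
bf_stat, bf_p = stats.levene(*groups, center='median')

print(f"Bartlett:        stat = {b_stat:.3f}, p = {b_p:.4f}")
print(f"Levene (mean):   stat = {l_stat:.3f}, p = {l_p:.4f}")
print(f"Brown-Forsythe:  stat = {bf_stat:.3f}, p = {bf_p:.4f}")
```

A small p-value from any of these tests indicates evidence against the equal-variance assumption (24), in which case the standard one-way ANOVA should be interpreted with caution.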
4. Methodology

The study was undertaken in Kazaure, Jigawa State, northern Nigeria. The population for this study was one hundred and sixty (160) maize seedlings grown and studied over a three-week period. Information was collected from the target population (maize seedlings) with the aid of a chlorophyll meter (SPAD-502 Plus) used to measure the chlorophyll content of the leaves of each seedling. Data analysis was carried out with the aid of inferential statistics (one-way ANOVA). The independent variable for the study was the amount of NPK, measured in grams. The significance test for the between-treatment effect provided the statistical evidence of the effect of the treatment on the chlorophyll content of the leaves of the maize seedlings.

4.1. Test for normality and homogeneity of the data

Before the ANOVA test, one must verify the validity of the normality and homogeneity assumptions for the data under study. These checks were based on the Kolmogorov-Smirnov and Levene statistics, respectively. The normality tests were found tenable (P > 0.05 at the 0.05 level for all four treatment levels), while Levene's test gave P < 0.05 at the 0.05 level. The results are presented in Tables 2 and 3 below.

Table 2. Kolmogorov-Smirnov test of normality.

TREATMENT     Statistic   df   Sig.
TREATMENT 1   0.117       27   0.200*
TREATMENT 2   0.103       43   0.200*
TREATMENT 3   0.539       39   0.120*
CONTROL       0.298       36   0.130*
*. This is a lower bound of the true significance.

Table 3. Levene's test for homogeneity of variance.

Levene Statistic   df1   df2   Sig.
11.159             3     141   0.000

4.2. Test of significance for the treatment effect

After the tests of the assumptions of normality and equality of variance (homoscedasticity), the next step is to determine whether the effect of the independent variable, in this case the amount of NPK, is significant. The significance of the treatment is based on the F distribution; the test revealed that the probability for the Fisher distribution F(3, 141) was 0.000, which is less than the significance level of 0.05 (i.e., P < 0.05). The null hypothesis that there was no significant difference between the mean chlorophyll contents was therefore rejected, as presented in Table 4.

Table 4. One-way ANOVA table for the experimental data.

Source of Variation   Sum of Squares SS   df    Mean square   F       P
Between groups        11373.06            3     3791.02       51.19   0.000
Within groups         10442.14            141   74.058
Total                 21815.2             144

4.3. Post hoc comparison

When the null hypothesis is rejected using the F-test in ANOVA, we want to know where the differences among the means lie. To determine which pairs of means are significantly different, and which are not, we can use a multiple comparison test, in this case the Tukey HSD. The results are presented in Table 5.

Table 5. Post hoc comparison (Tukey HSD).

Pairs I, J   Mean Difference   Lower Bound   Upper Bound
C, 1*        -14.30648         -20.0027      -8.6108
C, 2*        -16.98366         -22.0381      -11.9292
C, 3           2.97842          -2.1928       8.1497
1, 2          -2.67717          -8.1711       2.8167
1, 3*         17.28490          11.6834      22.8864
2, 3*         19.96208          15.0146      24.9096
* The mean difference is significant at the 0.05 level.
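For readers who want to reproduce this kind of post hoc analysis outside SPSS, the following is a minimal Python sketch using scipy.stats.tukey_hsd (available in SciPy 1.8 and later). The four samples are randomly generated stand-ins whose means, standard deviations and sizes are loosely modelled on the summary values reported above; they are not the experimental data, so the printed numbers will not match Table 5.

```python
# Tukey HSD pairwise comparisons of four groups, in the spirit of Table 5.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
control = rng.normal(4.6, 5.5, 36)      # hypothetical chlorophyll readings (SPAD units)
treat1  = rng.normal(18.9, 11.6, 27)
treat2  = rng.normal(21.6, 9.8, 43)
treat3  = rng.normal(1.6, 7.0, 39)

res = stats.tukey_hsd(control, treat1, treat2, treat3)
ci = res.confidence_interval(confidence_level=0.95)

labels = ["control", "treat1", "treat2", "treat3"]
for i in range(len(labels)):
    for j in range(i + 1, len(labels)):
        # res.statistic[i, j] is the difference of group i's mean minus group j's mean
        print(f"{labels[i]} vs {labels[j]}: diff = {res.statistic[i, j]:8.3f}, "
              f"95% CI = [{ci.low[i, j]:8.3f}, {ci.high[i, j]:8.3f}], "
              f"p = {res.pvalue[i, j]:.4f}")
```

A pair is declared significantly different at the 0.05 level exactly when its confidence interval excludes zero, mirroring the asterisk convention used in Table 5.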
7. Conclusion

In many statistical applications in agriculture, business administration, psychology, social science, and the natural sciences we need to compare more than two groups. For testing hypotheses about more than two population means, statisticians have developed the ANOVA method. The ANOVA test procedure compares the variation between samples (the sum of squares for groups, SSC) to the variation within samples (the sum of squares for error, SSE). The ANOVA F-test rejects the null hypothesis that the mean responses are equal in all groups when SSC is large relative to SSE. The analysis of variance assumes that the observations are normally and independently distributed with the same variance for each treatment or factor level.

In this study, the ANOVA F-test revealed a significant effect of the amount of NPK on the chlorophyll content of maize seedlings at P < 0.05, F(3, 141) = 51.190, P = 0.000, while the Tukey HSD test indicated non-significant differences between treatments 1 and 2 and between treatment 3 and the control. Altogether, the results show that the amount of NPK really does have an effect on the chlorophyll content of maize seedlings.

References

1. Aczel, A.D., Complete Business Statistics, Irwin, 1989.
2. Brown, M., Forsythe, A., "Robust tests for the equality of variances," Journal of the American Statistical Association, 365-367 (1974).
3. Montgomery, D.C., Runger, G.C., Applied Statistics and Probability for Engineers, John Wiley & Sons, 2003.
4. Ostertagova, E., Applied Statistics (in Slovak), Elfa, Kosice, 2011.
5. Parra-Frutos, I., "The behaviour of the modified Levene's test when data are not normally distributed," Computational Statistics, Springer, 671-693 (2009).
6. Rafter, J.A., Abell, M.L., Braselton, J.P., "Multiple Comparison Methods for Means," SIAM Review, 44(2), 259-278 (2002).
7. Rykov, V.V., Balakrishnan, N., Nikulin, M.S., Mathematical and Statistical Models and Methods in Reliability, Springer, 2010.
8. Stephens, L.J., Advanced Statistics Demystified, McGraw-Hill, 2004.
9. Aylor, S., Business Statistics, www.palgrave.com.