The terms ‘setting’ and ‘streaming’ are used to describe a variety of approaches by which pupils with similar levels of current attainment are consistently grouped together for lessons.
‘Setting’ usually involves grouping pupils in a given year group into classes for specific subjects, such as mathematics and English, but not across the whole curriculum.
‘Streaming’ (also known as ‘tracking’ in some countries) usually involves grouping pupils into classes for all or most of their lessons, so that a pupil is in the same group regardless of the subject being taught.
Pupils in different sets or streams sometimes follow a different curriculum, particularly when different national tests, different examination levels or different types of academic and vocational qualifications are available.
In some studies, such as Slavin 1987, ability grouping in classes is described as ‘re-grouping’.
Setting and streaming are combined in this Toolkit entry because these practices are usually combined in the evidence reviews on attainment grouping. Both involve regular and consistent grouping of pupils into classes based on attainment.
There are other forms of grouping, not included in this Toolkit entry, which also use current academic performance to organise pupils for teaching.
Within-class attainment grouping: pupils with similar levels of current attainment are grouped together, for example, on specific tables, but all pupils are taught by their usual teacher and support staff, and they usually all follow the same curriculum.
Cross-age grouping: pupils from different year groups are formed into classes of similar current attainment for specific lessons (usually reading and mathematics), but then return to their same-age classes for other lessons.
Gifted and talented provision: high attaining pupils are taught in separate groups or classes.
Acceleration: pupils considered to be of exceptional ability receive separate lessons with a different curriculum (or the same curriculum at a faster pace) or join older learners for more advanced study.
The Toolkit has a separate entry on Within-class attainment grouping. Cross-age grouping, gifted and talented provision, and acceleration are not currently covered in the Toolkit.
Although these practices are sometimes described as ‘ability grouping’, we refer here to ‘attainment’ rather than ‘ability’, as schools generally use measures of current performance, rather than measures of ability, to group pupils.
Search terms: ability grouping, setting, streaming, tracking, homogeneous/heterogeneous grouping, regrouping, (gifted and talented, within-class ability grouping)
Note: a combined search was run for the ‘setting or streaming’ and ‘within-class attainment grouping’ strands. Different inclusion criteria were then applied to identify the information relevant to each strand, as some reviews contained data for setting or streaming, for within-class attainment grouping, or for both. Brackets indicate search terms which are less relevant to this strand. See the Toolkit manual for more detail on search and inclusion procedures.
Evidence rating
There are six meta-analyses of setting or streaming studies included in the analysis for this strand, once duplication has been taken into account. These suggest that setting or streaming has a small negative effect overall. Although there is some variation depending on methods and research design, conclusions on the impact of ability grouping are relatively consistent.
The majority of the meta-analyses do not report risk of bias or tests of heterogeneity, and do not assess the impact of methodological features such as research design effects or sample size. Only one meta-analysis has been conducted in the last ten years. The designs of many of the included studies support only limited causal inference.
The majority of the experimental evidence comes from the USA, and there are few rigorous experimental studies from other countries.
Overall the evidence is rated as limited.
Additional cost information
Setting and streaming are organisational strategies that have few associated financial costs. Additional expenditure may be needed if setting or streaming results in greater numbers of classes or requires additional resources for different groups. Overall the costs are estimated as very low.
Additional impact information
A number of challenges make it difficult to evaluate the impact of attainment grouping interventions and to synthesise this literature.
One is who should be included in the analysis. For interventions in which all pupils are grouped, it is possible to consider both the overall impact and the impact for each group. However, for interventions such as acceleration, which involve only some pupils from a year group, it is necessary to consider whether the impact should be measured just for those who receive the intervention (for whom the evidence suggests we will observe a positive impact) or for the whole cohort. The latter is often necessary, because there is evidence that the attainment of the ‘non-accelerated’ pupils may be detrimentally affected if higher achieving pupils are removed from some classes. This can happen in two ways: in the straightforward sense that the overall average for the class will come down if the results for higher attainers are not included, and in terms of the subsequent progress made once any positive peer effect of having the higher attainers in the class is removed.
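The purely arithmetical part of this argument, that the class average falls when the results of higher attainers are excluded, can be illustrated with a short sketch. The scores below are invented for illustration only and are not taken from any study. The assumption that exactly two pupils are accelerated is also hypothetical.

```python
from statistics import mean

# Hypothetical test scores for one class (illustrative values only).
class_scores = [42, 48, 51, 55, 58, 60, 63, 67, 82, 88]

# Assumption: the two highest attainers are moved to an accelerated group.
ordered = sorted(class_scores)
accelerated = ordered[-2:]
remaining = ordered[:-2]

whole_class_mean = mean(class_scores)
remaining_mean = mean(remaining)
accelerated_mean = mean(accelerated)

print(f"Whole cohort mean:      {whole_class_mean:.1f}")
print(f"Accelerated group mean: {accelerated_mean:.1f}")
print(f"Remaining class mean:   {remaining_mean:.1f}")
```

No individual score changes in this sketch, yet the mean of the remaining class is lower than the whole-cohort mean. This is why reporting only the accelerated group, or only the remaining class, can give a misleading picture. Any peer effect on subsequent progress would be additional to this arithmetic effect and is not modelled here.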
Another is how to take account of the curriculum that pupils follow. If a grouping intervention involves different pupils studying different curriculum content, finding a test which can be used to assess the progress of all the pupils can be difficult. This is particularly the case for subjects like mathematics, where questions on a test may relate to different mathematical content. If the test covers content taught only to some groups, the progress of pupils who have been taught this additional content is compared with that of pupils who have not been taught the material, and this comparison may overstate the impact. This issue is often referred to as bias resulting from ‘treatment inherent’ measures (Slavin & Madden, 2011).
Differential impacts of setting and streaming on low, middle and high attaining pupils
Five of the six meta-analyses included in the analysis for this Toolkit entry also provided separate estimates of the impact of setting or streaming on low, middle and high attaining pupils. These results are presented below. These results should be treated with caution: the evidence for the overall impact estimate is rated as limited, and these estimates are likely to be more limited, as they are based on sub-group estimates.
Two other important points to consider, in the context of impact, are misallocation to attainment groups, and impacts on non-attainment outcomes. In line with every other Toolkit entry, the meta-analysis we present here considers impact on attainment outcomes only. However, these two issues arise repeatedly in the literature around setting and streaming and so warrant brief further discussion.
Misallocation: There is some evidence from the UK that misallocation to ‘ability groups’ is a particular problem for pupils from disadvantaged backgrounds. These pupils appear to be at greater risk of misallocation to lower attaining groups, and the impact of setting or streaming on pupils in lower attaining groups is negative on average. See, for example, Wiliam and Bartholomew (2004), Tereshchenko et al. (2017) and Archer et al. (2018).
Impact on other outcomes: It is possible that setting or streaming also has an impact on wider outcomes such as confidence. A number of studies conclude that grouping pupils on the basis of attainment may have longer term negative effects on the attitudes and engagement of low attaining pupils, and discourage the belief that their attainment can be improved through effort. See, for example, Hallam and Ireson (2007) and Tereshchenko et al. (2017).
Notes about 2018 update
We have more narrowly defined ‘setting or streaming’ for this update. Previously, the definition covered forms of attainment grouping that would not usually be described by teachers as setting or streaming. Because of the narrower definition, some of the studies which were previously included in the setting and streaming strand are no longer included. The list below provides a brief explanation of why studies have been excluded:
Gutierrez, R., & Slavin, R. E. (1992) Achievement Effects of the Nongraded Elementary School: A Retrospective Review. This study looks specifically at cross-age grouping.
Kulik C-L.C & Kulik J.A. (1982) Effects of Ability Grouping on Secondary School Students: A Meta-Analysis of Evaluation Findings, American Educational Research Journal, 19 (3), 415-428. The 2018 update revealed that this study was superseded by Kulik and Kulik 1992.
Kulik C-L.C & Kulik J.A. (1984) Effects of Ability Grouping on Elementary School Pupils: A Meta-Analysis. Annual Meeting of the American Psychological Association. The 2018 update revealed that this study was superseded by Kulik and Kulik 1992.
Lou, Y., Abrami, P. C., Spence, J. C., Poulsen, C., Chambers, B., & d’Apollonia, S. (1996) Within-class grouping: A meta-analysis. Review of Educational Research, 66(4), 423-458. This study considers only within-class grouping, and so is not relevant to setting and streaming as defined for the 2018 update.
Puzio, K., & Colby, G. (2010) The Effects of within Class Grouping on Reading Achievement: A Meta-Analytic Synthesis. Society for Research on Educational Effectiveness. This study only considers within-class grouping, and so is not relevant to setting and streaming as defined for the 2018 update.
Steenbergen-Hu and colleagues (2016) undertook a meta-analytic review of the evidence over the past 100 years on grouping by attainment and acceleration. This was a tertiary review where they identified studies from existing meta-analyses and then recalculated overall effects. As described in the Toolkit manual, our analysis includes the relevant meta-analyses included in this tertiary review, but not the meta-meta-analysis itself. In their review they estimated standard errors from some studies in order to weight studies more appropriately. We have used their review to check the effect sizes from the meta-analyses included in the Toolkit and we have adopted their weighting approach, as this is consistent with the Toolkit methodology. We differ from Steenbergen-Hu et al. (2016) in that we recalculated the figures from Slavin (1987) Table 1 (see Steenbergen-Hu et al. (2016) page 859 and their Table 2), but were unable to replicate their average of -0.56: our estimates were a mean of -0.02 and median of 0.00 as Slavin himself reports.
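The difference between a weighted estimate and an unweighted mean or median can be sketched as follows. This is a minimal illustration that assumes inverse-variance (fixed-effect) weighting, a common way of using standard errors to weight studies. It is not a reproduction of the Toolkit's or Steenbergen-Hu et al.'s exact procedure. The function name `pooled_effect` and all effect sizes and standard errors below are hypothetical and are not the values from Slavin (1987) Table 1.

```python
from math import sqrt
from statistics import mean, median


def pooled_effect(effects, std_errors):
    """Return the inverse-variance weighted mean effect and its standard error.

    Assumption: a fixed-effect model, with each study weighted by 1 / SE^2.
    """
    if not effects or len(effects) != len(std_errors):
        raise ValueError("effects and std_errors must be non-empty and equal in length")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effects)) / total_weight
    return pooled, sqrt(1.0 / total_weight)


# Hypothetical effect sizes and standard errors (illustrative values only).
effects = [0.10, -0.20, 0.00]
std_errors = [0.10, 0.20, 0.05]

weighted, weighted_se = pooled_effect(effects, std_errors)
unweighted_mean = mean(effects)
unweighted_median = median(effects)

print(f"Weighted mean:   {weighted:+.3f} (SE {weighted_se:.3f})")
print(f"Unweighted mean: {unweighted_mean:+.3f}")
print(f"Median:          {unweighted_median:+.3f}")
```

Because more precise studies receive more weight, the weighted estimate can differ from the simple mean and median of the same effect sizes. This is one reason why a recalculated average may not match a previously reported one.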
1 - Archer, L., Francis, B., Miller, S., Taylor, B., Tereshchenko, A., Mazenod, A., Pepper, D., & Travers, M. C.
British Educational Research Journal, 44(1), 119-140.
2 - Boaler, J.
British Educational Research Journal 34.2 pp 167-194.
3 - Collins, C. A., & Gan, L.
(No. w18848). Cambridge, MA: National Bureau of Economic Research.
4 - Duflo, E., Dupas, P., Kremer, M.
American Economic Review 101 (5): pp 1739-1774.
5 - Dunne, M., Humphreys, S., Dyson, A., Sebba, J., Gallannaugh, F., & Muijs, D.
Curriculum Journal, 22(4), 485-513.
6 - Goldring, E. B.
Journal of Educational Research, 83, 313–326.
7 - Hallam, S., & Ireson, J.
British Educational Research Journal, 33(1), 27-45.
8 - Hanushek, E. A. & Woessmann, L.
CESifo working papers, No. 1415.
9 - Henderson, N. D. (Abstract)
Doctoral dissertation, Mississippi State University, Mississippi: Department of Curriculum and Instruction
10 - Ireson, J., Hallam, S. & Plewis, I.
11 - Ireson, J., Hallam, S., Mortimore, P., Hack, S., Clark, H. & Plewis, I.
Paper presented at the British Educational Research Association Annual Conference, University of Sussex at Brighton, September 2-5 1999.
12 - Kulik C-L.C & Kulik J.A.
Annual Meeting of the American Psychological Association.
13 - Kulik C-L.C & Kulik J.A.
American Educational Research Journal, 19 (3), 415-428.
14 - Kulik, J. A., & Kulik, C. C. (Abstract)
Gifted Child Quarterly, 36, 73–77.
15 - Kulik, J.A.
The National Research Center On The Gifted And Talented.
16 - Kulik, J.A., & Kulik, C.L.C.
Equity and Excellence in Education, 23(1-2), 22-30.
17 - Marks, R.
FORUM, 55(1), 31-44.
18 - Mosteller, F., Light, R. J., & Sachs, J. A.
Harvard Educational Review, 66, 797–842.
19 - Rui, N. (Abstract)
Journal of Evidence Based Medicine Aug;2(3):164-83.
20 - Slavin, R. E. (Abstract)
Review of Educational Research, 57, 293–336.
21 - Slavin, R. E. (Abstract)
Elementary School Journal, 93, 535–552.
22 - Slavin, R. E. (Abstract)
Review of Educational Research, 60, 471–499.
23 - Slavin, R., & Madden, N. A.
Journal of Research on Educational Effectiveness, 4.4, 370-380.
24 - Steenbergen-Hu, S., Makel, M. C., & Olszewski-Kubilius, P.
Review of Educational Research, 86(4), 849-899.
25 - Tereshchenko, A., Francis, B., Archer, L., Hodgen, J., Mazenod, A., Taylor, B., Pepper, D., & Travers, M. C.
Research Papers in Education, 1-20
26 - Wiliam, D., & Bartholomew, H.
British Educational Research Journal, 30(2), 279-293.
Summary of effectiveness
9 - Henderson, N. D. (1989)
A meta-analysis of research studies found in ERIC was used to determine if there are significant differences in achievement and attitudes in homogeneously versus heterogeneously grouped pupils in the elementary grades. Effect size and F-test formulas were used to compare grouped and ungrouped pupils in six areas: overall achievement, high ability student achievement, combined average and low ability student achievement, overall attitudes, high ability student attitudes and combined average and low ability student attitudes. Six hypotheses were tested. A comparison of overall student achievement was investigated for homogeneously and heterogeneously grouped pupils. The effect of heterogeneous grouping was to raise student achievement scores more than homogeneous grouping. A comparison of high ability student achievement was investigated for homogeneously and heterogeneously grouped elementary pupils. The effect of homogeneous grouping was to raise student achievement scores slightly more than heterogeneous grouping. A comparison of combined average and low ability student achievement was investigated for homogeneously and heterogeneously grouped elementary pupils. The effect of grouping slightly favoured heterogeneous grouping. There was no significant difference in overall, high ability, or combined average and low ability student achievement at the .05 level. A comparison of overall student attitude was investigated for homogeneously and heterogeneously grouped elementary pupils. The size of the effect was large and favoured homogeneous grouping. A comparison of high ability student attitudes was investigated for homogeneously and heterogeneously grouped elementary pupils. The effect size strongly favoured homogeneous grouping. A comparison of combined average and low ability student attitude was investigated for homogeneously and heterogeneously grouped elementary pupils. The effect size strongly favoured homogeneous grouping.
Overall and high ability student attitudes were significantly higher for homogeneous groups than for heterogeneous groups at the .05 level. Combined average and low ability student attitudes were not significant at the .05 level.
14 - Kulik, J. A., & Kulik, C. C. (1992)
Meta-analytic reviews have focused on five distinct instructional programs that separate students by ability: multilevel classes, cross-grade programs, within-class grouping, enriched classes for the gifted and talented, and accelerated classes. The reviews show that effects are a function of program type. Multilevel classes, which entail only minor adjustment of course content for ability groups, usually have little or no effect on student achievement. Programs that entail more substantial adjustment of curriculum to ability, such as cross-grade and within-class programs, produce clear positive effects. Programs of enrichment and acceleration, which usually involve the greatest amount of curricular adjustment, have the largest effects on student learning. These results do not support recent claims that no one benefits from grouping or that students in the lower groups are harmed academically and emotionally by grouping.
19 - Rui, N. (2009)
Objective: To review and synthesize evidence about academic and non-academic effects of detracking reform. Methods: Fifteen studies conducted from 1972 to 2006 were located and reviewed, including 4 experimental studies, 2 quasi-experimental studies, 7 observational studies, and 2 qualitative studies. Meta-analyses using fixed effects and random effects models were conducted for all and subsets of selected studies (by the academic ability of students and research design), followed by extensive discussion of individual studies. Results: Generally speaking, students in detracked groups performed slightly better academically than their equivalent-ability peers in tracked groups (d = 0.087, k = 22, N = 15,577, p < 0.0001), using a fixed effects model. A random effects model also indicated the overall positive effects of detracking (d = 0.202, k = 22, N = 15,577, p < 0.01). However, the effect sizes of individual studies are generally heterogeneous with I²(21) = 94.033. Using a random effects model, the study shows that average or high ability students in detracked groups performed no differently than their equivalent-ability peers in tracked groups with a 95% confidence interval of (−0.047, 0.388). For low-achieving students, both the fixed effects model [d = 0.113, k = 8, p < 0.0001, 95% CI (0.056, 0.169)] and random effects model [d = 0.283, k = 8, p < 0.005, 95% CI (0.087, 0.479)] revealed positive effects of detracking on student achievement for the 8 low-ability subgroups in 6 studies. The evidence with respect to the non-academic impact of detracking is mixed. Conclusion: The findings suggest that the detracking reform had appreciable effects on low-ability student achievement and no effects on average and high-ability student achievement. Therefore, detracking should be encouraged, especially in schools where the lower-track classes have been traditionally assigned fewer resources.
20 - Slavin, R. E. (1987)
This article reviews research on the effects of between- and within-class ability grouping on the achievement of elementary school students. The review technique, best-evidence synthesis, combines features of meta-analytic and narrative reviews. Overall, evidence does not support assignment of students to self-contained classes according to ability (median effect size [ES] = .00), but grouping plans involving cross-grade assignment for selected subjects can increase student achievement. Research particularly supports the Joplin Plan, cross-grade ability grouping for reading only (median ES = +.45). Within-class ability grouping in mathematics is also found to be instructionally effective (median ES = +.34). Analysis of effects of alternative grouping methods suggests that ability grouping is maximally effective when done for only one or two subjects, with students remaining in heterogeneous classes most of the day; when it greatly reduces student heterogeneity in a specific skill; when group assignments are frequently reassessed; and when teachers vary the level and pace of instruction according to students' needs.
21 - Slavin, R. E. (1993)
This article reviews research on the effects of ability grouping on the achievement of middle school students and discusses alternatives to traditional grouping practices. 6 randomized experiments, 7 matched experiments, and 14 correlational studies compared ability grouping to heterogeneous plans over periods of from 1 semester to 5 years. Overall achievement effects were found to be essentially 0 in middle and junior high school grades (6-9). Results were close to 0 for students of all levels of prior performance: high, average, and low. Alternatives to between-class ability grouping, including co-operative learning and within-class grouping, are also discussed. Finally, fruitful areas of future research are outlined.
22 - Slavin, R. E. (1990)
This article reviews research on the effect of ability grouping on the achievement of secondary students. Six randomized experiments, 9 matched experiments and 14 correlational studies compared ability grouping to heterogeneous plans over periods of from one semester to 5 years. Overall, achievement effects were found to be essentially zero at all grade levels although there is much more evidence regarding Grades 7-9 and 10-12. Results were similar for all subjects except social studies, for which there was a trend favouring heterogeneous placement. Results were close to zero for students of all levels of prior performance. This finding contrasts with those of studies comparing the achievement of students in different tracks, which generally find positive effects of ability grouping for high achievers and negative effects for low achievers, and these contrasting findings are reconciled.