Background
The methodology for estimating earnings returns in FE was established and refined by a series of research projects led by the University of Westminster between 2012 and 2015. The estimates used in previous editions of the Skills Index were taken from:
These earnings returns have now been updated internally, applying the modelling methodology defined by Bibby et al. (2014) to a more recent set of learners. The latest data covers learners who completed courses between academic years 2008/09 and 2013/14, and their earnings up to the financial year 2016/17.
Overview of returns estimation methodology
The methodology established by Bibby et al. (2014) to estimate earnings returns uses a multiple regression model to isolate the effect on earnings for those who achieve a qualification, compared with those who start but do not achieve. The regression model controls for other observable characteristics and is run to estimate effects on (logged) earnings in each of years 3, 4 and 5 after completion, which are then averaged to provide a final value.
The controls used are: sex; age; region; ethnicity; Index of Multiple Deprivation (IMD); prior attainment; duration of study; number of previous FE learning spells; sector subject area; the number of days an individual was on active benefits in the year before learning; whether an individual had an inactive benefit spell in the year before learning; and the number of days in sustained (6 months) employment an individual had just before learning. OLASS learning and academic qualifications are excluded from the dataset.
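The achiever/non-achiever comparison described above can be sketched roughly as follows. This is an illustrative sketch on synthetic data, not the model specification used for the Skills Index: the function names, the two stand-in controls and the simulated effect size are all assumptions.

```python
import numpy as np

def achievement_effect(log_earnings, achieved, controls):
    """OLS coefficient on the achievement dummy, holding the controls fixed."""
    n = len(log_earnings)
    # Design matrix: intercept, achiever dummy, then the control variables.
    X = np.column_stack([np.ones(n), achieved, controls])
    beta, *_ = np.linalg.lstsq(X, log_earnings, rcond=None)
    return beta[1]

def averaged_effect(log_earnings_by_year, achieved, controls):
    """Run one regression per outcome year and average the coefficients."""
    effects = [
        achievement_effect(y, achieved, controls)
        for y in log_earnings_by_year.values()
    ]
    return float(np.mean(effects))

# Synthetic learners (assumption): 1 = achieved, 0 = started but did not achieve.
rng = np.random.default_rng(0)
n = 5000
achieved = rng.integers(0, 2, size=n).astype(float)
controls = rng.normal(size=(n, 2))  # stand-ins for the real control variables
true_effect = 0.10                  # simulated effect, in log points

log_earnings_by_year = {
    year: 9.5
    + true_effect * achieved
    + controls @ np.array([0.3, -0.2])
    + rng.normal(scale=0.5, size=n)
    for year in (3, 4, 5)           # years after completion
}

estimate = averaged_effect(log_earnings_by_year, achieved, controls)
print(round(estimate, 3))
```

With real data, the controls matrix would hold the encoded characteristics listed in this section rather than random draws.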
The estimates are presented as the percentage increase in earnings that occurs as a result of achieving a qualification. This is an average effect across those achieving a qualification at that level for the first time, and those who may have retrained or extended their knowledge at the same level.
This returns estimation approach has been validated through both the detailed work in the original research programme and in subsequent research. Chapter 6 of Bibby et al. (2014) provides an assessment of the robustness of the achiever/non-achiever comparison to derive the returns. In “Settling the counterfactual debate”, the Centre for Vocational Education Research (CVER) has also compared this counterfactual against a method looking at learners in possession of qualifications at the level below, including the impact on earnings differentials.
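The source does not state how a coefficient on logged earnings is turned into a percentage. As an assumption, the sketch below uses the standard transformation for log-linear models, 100 × (exp(β) − 1); the function name is hypothetical.

```python
import math

def log_points_to_percent(beta):
    """Convert a coefficient on logged earnings into a percentage change.

    Assumes the standard log-linear transformation; the source does not
    confirm that this is the conversion used for the published estimates.
    """
    return 100.0 * (math.exp(beta) - 1.0)

# A coefficient of 0.10 log points is slightly more than a 10% increase.
print(round(log_points_to_percent(0.10), 2))
```

For small coefficients the result is close to 100 × β, so the choice of conversion matters more for the larger estimates.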
Updated administrative dataset
The multiple regression modelling uses a dataset constructed by joining a longitudinal view of the Individualised Learner Record (ILR) to relevant data from the Longitudinal Educational Outcomes (LEO) study.
- Learning activity recorded on the ILR is combined into “spells”, so that a learner’s full activity can be connected (including across summer holidays) to allow identification of the highest and latest learning aim for use in the regression.
- Data on earnings, employment and benefits is then taken from LEO data to derive a range of outcome measures at yearly intervals after the completion of that highest and latest learning aim. Within this, earnings are converted to constant prices (2017 in the current dataset).
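The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the 92-day gap tolerance used to bridge summer holidays, the price index values and both function names are hypothetical, and the actual spell rules and deflator are not given in this section.

```python
from datetime import date

def merge_into_spells(aims, max_gap_days=92):
    """Group (start, end) learning aims into spells.

    Aims separated by no more than max_gap_days (a hypothetical tolerance)
    are treated as one continuous spell.
    """
    spells = []
    for start, end in sorted(aims):
        if spells and (start - spells[-1][1]).days <= max_gap_days:
            # Close enough to the previous aim: extend the current spell.
            spells[-1] = (spells[-1][0], max(spells[-1][1], end))
        else:
            spells.append((start, end))
    return spells

def to_constant_prices(earnings, year, price_index, base_year=2017):
    """Re-express earnings from `year` in base-year prices."""
    return earnings * price_index[base_year] / price_index[year]

aims = [
    (date(2012, 9, 1), date(2013, 6, 30)),
    (date(2013, 9, 2), date(2014, 6, 27)),   # resumes after a summer break
    (date(2016, 1, 11), date(2016, 7, 1)),   # a separate, later spell
]
spells = merge_into_spells(aims)

price_index = {2015: 96.0, 2017: 100.0}      # illustrative values only
real_earnings = to_constant_prices(19200, 2015, price_index)

print(spells)
print(real_earnings)
```

Here the first two aims are joined into a single spell running from September 2012 to June 2014, while the 2016 aim forms its own spell.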
The original research by Bibby et al. (2014) pre-dated the creation of the full LEO methodology and a significant part of the research was focussed on the construction of a prototype database. This linked ILR with the same benefit information (from Department for Work and Pensions data) and PAYE employment and earnings histories (from HMRC data) that are now used in LEO. This dataset spanned learners completing during 2004/05 to 2008/09, with earnings up to the financial year 2011/12.
To produce updated estimates, we have refreshed the dataset construction process to make use of the LEO datasets. We have also used more recent data, spanning learners who completed courses between academic years 2008/09 and 2013/14 and their earnings up to the financial year 2016/17.
Other changes implemented as part of the refreshed dataset construction approach include: extending the age coverage of the apprenticeships cohort to include 16 to 18 year olds; including qualifications at Level 4 and 5 for the first time; refining the method for defining learning spells; and (via the now established LEO process) improved linkage between education records and benefits and tax records, with improved data cleansing.
Changes in the earnings returns
The Index uses returns estimates associated with different levels of learning and, where available, different Sector Subject Areas (SSA).
In order to produce statistically significant returns estimates, the regression methodology requires a sufficiently large cohort of non-achievers and these are not available at detailed SSA level for all types and levels of courses. As a result, we only produce more detailed SSA Tier 1 estimates for Full Level 2 and Full Level 3 learning (for both Classroom-based and Apprenticeship courses). Estimates for Other Level 2 and 3 Classroom-based learning have not yet been updated, as further work is required on refining the data structures to properly isolate these qualifications within learning spells. For these types of courses the Skills Index still uses the 2014 estimates.
The table below shows how the updated estimates compare with the 2014 estimates, although not all values in the table are used to calculate the Skills Index. The 2014 estimates for apprenticeships were based on adults aged 19+, whereas the updated estimates cover the whole 16+ apprenticeship cohort. No estimates are available where cells are marked [x].
Provision type | Level | 2014 estimate | Updated estimate | Difference |
Classroom-based (19+) | Below Level 2 | 2% | 5% | +3ppt |
Classroom-based (19+) | Other Level 2 | 1% | [x] | [x] |
Classroom-based (19+) | Full Level 2 | 11% | 9% | -2ppt |
Classroom-based (19+) | Other Level 3 | 3% | [x] | [x] |
Classroom-based (19+) | Full Level 3 | 9% | 16% | +7ppt |
Classroom-based (19+) | Level 4/5 | [x] | 11% | [x] |
Apprenticeships (16+) | Intermediate | 11% | 14% | +3ppt |
Apprenticeships (16+) | Advanced | 16% | 17% | +1ppt |
Apprenticeships (16+) | Higher | [x] | 23% | [x] |
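The Difference column is the updated estimate minus the 2014 estimate, in percentage points, and is left as [x] wherever either estimate is unavailable. A short sketch of that arithmetic, using the values from the table (the function name and the use of `None` for [x] are illustrative choices):

```python
def ppt_difference(old, new):
    """Percentage-point change between two estimates; None where either is [x]."""
    if old is None or new is None:
        return None
    return new - old

# (2014 estimate, updated estimate) in percent; None stands for [x].
estimates = {
    "Classroom-based Below Level 2": (2, 5),
    "Classroom-based Other Level 2": (1, None),
    "Classroom-based Full Level 2": (11, 9),
    "Classroom-based Other Level 3": (3, None),
    "Classroom-based Full Level 3": (9, 16),
    "Classroom-based Level 4/5": (None, 11),
    "Intermediate Apprenticeship": (11, 14),
    "Advanced Apprenticeship": (16, 17),
    "Higher Apprenticeship": (None, 23),
}

differences = {name: ppt_difference(*pair) for name, pair in estimates.items()}

for name, diff in differences.items():
    print(name, "[x]" if diff is None else f"{diff:+d}ppt")
```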
The subject-level estimates from Cerqua et al. (2015) were based on a subject classification different to Sector Subject Area, so the table below only shows estimates from the updated process. No estimates are available where cells are marked [x].
Sector Subject Area (Tier 1) | Intermediate Apprenticeship | Advanced Apprenticeship | Classroom-based Full Level 2 | Classroom-based Full Level 3 |
Health, Public Services and Care | 18% | 10% | 14% | 13% |
Science and Mathematics | [x] | [x] | [x] | 11% |
Agriculture, Horticulture and Animal Care | 13% | 15% | 14% | 18% |
Engineering and Manufacturing Technologies | 22% | 37% | 6% | 12% |
Construction, Planning and the Built Environment | 20% | 19% | 16% | 19% |
Information and Communication Technology (ICT) | 27% | 28% | 4% | 14% |
Retail and Commercial Enterprise | 12% | 9% | 13% | 15% |
Leisure, Travel and Tourism | 5% | 7% | 5% | 8% |
Arts, Media and Publishing | [x] | [x] | 11% | 12% |
History, Philosophy and Theology | [x] | [x] | [x] | 12% |
Education and Training | [x] | [x] | 7% | 13% |
Preparation for Life and Work | [x] | [x] | 10% | [x] |
Business, Administration, Finance and Law | 11% | 11% | 8% | 13% |