Employment and earnings data
The employment data covers those with P45 and P14 records submitted through the Pay As You Earn (PAYE) system. These figures have been derived from administrative IT systems that, as with any large-scale recording system, are subject to possible errors with data entry and processing. While some data cleaning was necessary, the resulting data looks to provide a good reflection of an individual’s employment and earnings for the year.
For the purposes of collecting taxes, only the tax year of employment is needed; accurate start and end dates within the tax year are not required. For this reason, issues encountered with the employment data included records with duplicate dates and records with dates that were invalid for our intended use (for example, where an employment start date occurred after the end date).
Additionally, a number of returns were found to have missing start dates due to, for example, the employer not forwarding a timely P45. The default dates recorded in the dataset are either 6 April (the first day of the tax year) or, where only an end date is known, the day before that end date. Similarly, for records where the employment is known to have come to an end within a tax year but the end date is not known, the record is given a default 5 April end date, the last day of the tax year.
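As a rough illustration of how the default dates described above might be spotted in a record, the sketch below flags start and end dates that match the stated defaults. The function names are hypothetical and are not taken from the LEO processing. A genuine date can coincide with a default, so the flags indicate only that a date may be a default.

```python
from datetime import date, timedelta
from typing import Optional


def may_be_default_start(start: date, end: Optional[date] = None) -> bool:
    """Return True if the start date matches one of the stated defaults."""
    # Default 1: 6 April, the first day of the tax year.
    if (start.month, start.day) == (4, 6):
        return True
    # Default 2: the day before the end date (used where only an end date is known).
    return end is not None and start == end - timedelta(days=1)


def may_be_default_end(end: date) -> bool:
    """Return True if the end date is 5 April, the last day of the tax year."""
    return (end.month, end.day) == (4, 5)
```

Records flagged in this way would be candidates for the merging and imputation steps described in this section, rather than being treated as known dates.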
Individuals can also have overlapping spells of employment. Before carrying out analysis, the P45 and P14 records for each individual were cleaned and then merged into a single record to give a longitudinal picture of their employment and a total sum of their earnings in each tax year. Where uncertain dates appeared, other employments or benefits records for that individual were used to create a merged employment spell with a known start and end date.
Example 1: Two employment spells
Spell A Start |---------| End
Spell B Unknown start |-----------------|--------------| End
Merged result Start |----------------------| End
In example 1, the start date of spell B is uncertain; its possible range is shown by the first segment of spell B. In this instance, we can merge the two records, resulting in an employment spell with the start date of spell A and the end date of spell B.
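The merge in example 1 can be sketched as follows. This is a minimal illustration, not the production logic: the class and function names are hypothetical, and it assumes that two spells are merged only when the known spell covers the whole range of possible start dates of the uncertain spell.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Spell:
    """An employment spell with known start and end dates."""
    start: date
    end: date


@dataclass(frozen=True)
class UncertainStartSpell:
    """A spell whose start lies somewhere between earliest_start and latest_start."""
    earliest_start: date
    latest_start: date
    end: date


def merge_spells(known: Spell, uncertain: UncertainStartSpell) -> Optional[Spell]:
    """Merge the two spells if the known spell covers the uncertain start range."""
    # Assumption: the merge is safe only when every possible start date of the
    # uncertain spell falls inside the known spell.
    covers_range = (
        known.start <= uncertain.earliest_start
        and uncertain.latest_start <= known.end
    )
    if not covers_range:
        return None  # left for another rule, or for imputation
    # The merged spell takes the known start and the later of the two end dates.
    return Spell(start=known.start, end=max(known.end, uncertain.end))


spell_a = Spell(date(2015, 5, 1), date(2015, 9, 30))
spell_b = UncertainStartSpell(date(2015, 6, 1), date(2015, 9, 1), date(2016, 1, 31))
merged = merge_spells(spell_a, spell_b)
```

Here `merged` starts on the start date of spell A and ends on the end date of spell B, matching the merged result in the example.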
Any remaining uncertain dates were imputed through random sampling of gap lengths from a frequency distribution that was constructed from gaps with a known length.
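A minimal sketch of this kind of imputation is given below. It assumes gap lengths are measured in whole days and that sampling is weighted by how often each length was observed. The function names and example values are hypothetical.

```python
import random
from collections import Counter
from typing import Iterable


def build_gap_distribution(known_gap_lengths: Iterable[int]) -> Counter:
    """Count how often each gap length (in days) occurs among gaps of known length."""
    return Counter(known_gap_lengths)


def impute_gap_length(distribution: Counter, rng: random.Random) -> int:
    """Draw one gap length at random, weighted by its observed frequency."""
    lengths = list(distribution.keys())
    weights = list(distribution.values())
    return rng.choices(lengths, weights=weights, k=1)[0]


# Hypothetical known gaps, in days.
distribution = build_gap_distribution([7, 7, 14, 30])
rng = random.Random(2024)  # seeded so that a run can be reproduced
imputed_gap = impute_gap_length(distribution, rng)
```

Sampling from the observed frequencies means that imputed gaps follow the same distribution as the known gaps, rather than all taking a single typical value.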
DWP/HMRC coverage
Beginning in April 2013, the P45 reporting system was phased out in favour of the Real Time Information (RTI) system, which requires employers to submit information to HMRC each time an employee is paid. This system has now reached full deployment. RTI offers substantial improvements over the P45 system in terms of data coverage, since employers must now provide information on all their employees if even one employee of the company is paid above the Lower Earnings Limit. The move to RTI means that data coverage is high for the 2014/15 to 2019/20 tax years used in this publication.
We cannot currently distinguish between part-time and full-time work in the LEO data. This is further discussed in “Methodology - Annualised earnings”.
In addition to those who pay tax through PAYE, the employment data includes those who pay tax through self-assessment.
Self-assessment forms are completed by a range of people, for example those who are self-employed, those who have received income from investments, savings or shares, and those who have complicated tax affairs. A list of people who are required to complete a self-assessment return can be found at www.gov.uk/self-assessment-tax-returns/who-must-send-a-tax-return. We have recently obtained a new self-assessment earnings dataset from HMRC, which contains variables on:
- Earnings received through employment (PAYE)
- Income from partnership enterprises
- Income from sole-trader enterprises
- Total earnings for the tax year from the self-assessment form.
We have used the income from partnership enterprises and the income from sole-trader enterprises to identify graduates who are self-employed and to calculate their earnings from self-employment. We sum these two variables and, where the sum is greater than £0, classify the graduate as self-employed. Where self-employment earnings are used, the earnings amount is the sum of these two variables.
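The classification rule above can be sketched as follows. The function and parameter names are hypothetical, and the figures in the usage lines are illustrative only.

```python
def self_employment_earnings(partnership_income: float, sole_trader_income: float) -> float:
    """Self-employment earnings: partnership income plus sole-trader income."""
    return partnership_income + sole_trader_income


def is_self_employed(partnership_income: float, sole_trader_income: float) -> bool:
    """A graduate is classified as self-employed when the summed income exceeds zero."""
    return self_employment_earnings(partnership_income, sole_trader_income) > 0


# Illustrative values in pounds.
earnings = self_employment_earnings(0.0, 12500.0)
flag = is_self_employed(0.0, 12500.0)
no_income_flag = is_self_employed(0.0, 0.0)
```

Because the rule is applied to the sum, a graduate with income from only one of the two sources is still classified as self-employed.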
The data received from DWP includes an overseas flag that identifies individuals who are known to be living overseas. Details on when an individual informs HMRC can be found at Tax if you leave the UK to live abroad - GOV.UK (www.gov.uk). For this analysis, individuals who are known to be overseas are excluded, as their earnings and outcomes data is likely to be incomplete.
Graduate coverage
In this publication, we include first degree graduates and postgraduates attending Higher Education Providers (Higher Education Institutes (HEIs), Further Education Colleges (FECs) and Alternative Providers (APs)) in Great Britain.
For postgraduates, we only include those from Higher Education Institutes (HEIs) due to the small numbers studying postgraduate degrees in Alternative Providers and Further Education Colleges. It should be noted that not many postgraduate courses are undertaken in these providers, with the most common courses at level 7 (master’s degree) in PGCE / Education and Teaching, Business or Law.
In the 2014/15 academic year, some other specialist providers in England were mandated to submit data to the Higher Education Statistics Agency (HESA). In the 2015/16 academic year, the coverage was extended to include all other specialist providers in England with undergraduate designated courses. For this reason, this publication only includes information for graduates from other specialist providers one and three years after graduation. (Note that, in line with HESA statistics, the University of Buckingham, a specialist provider, is reported with universities.)
Young graduates (under 21 at the start of their course)
Some of the breakdowns in this release only cover young graduates (under 21 at the start of their course). This is due to low data coverage for graduates who were mature students (21 or over at the start of the course) or because including mature students would provide an unreliable comparison against trends within the young graduates group. For example, the free school meals (FSM) breakdown has been calculated using school records data, and for many of the mature graduates this data is not readily available because they left school before this information was collected. As another example, ‘Home region’ has been calculated on young graduates alone using information about where they lived prior to study. For mature graduates this information is less likely to reflect their home region, because they are more likely to have relocated between leaving school and starting their course. The breakdowns that only cover young graduates are POLAR quintile, prior attainment, FSM, home region and residence.
Degree level (postgraduates)
The level of qualification is grouped into three categories (level 7 taught, level 7 research and level 8). These are defined using the HESA "QUAL" variable (Student 2019/20 - Qualification awarded | HESA).
Graduates were broadly grouped into Level 7 and Level 8, more commonly known as Master’s degrees and doctoral degrees respectively. Enhanced undergraduate courses (e.g. MMath, MEng) that give you a postgraduate-level qualification are not included in our Level 7 population. These degree courses are included in our first degree population as you do not need to have completed a Level 6 qualification to apply for these courses.
Level 7 data was also broken down into Level 7 (taught) for taught master’s degrees and Level 7 (research) for research master’s degrees. In addition, Postgraduate Certificate in Education (PGCE) and Master of Business Administration (MBA) courses were split from the overall Level 7 (taught) numbers. For subject level breakdowns, these two courses were also split from the other 35 subject categories.
Comparison between first degree and postgraduates
Some factors to consider when making comparisons between first degree graduates and postgraduates:
- Firstly, students who go on to postgraduate study are typically expected to have achieved a higher level of attainment in their first degree. In the May 2018 publication (SFR_Template_NatStats (publishing.service.gov.uk)), analysis was published to show the progression to further study and income distribution by degree classification.
- Secondly, the distribution of graduates by subject studied is likely to be different for postgraduates compared to first degree graduates (more information on population distributions by subject studied can be found at What do HE students study? | HESA). It could be that postgraduate degrees tend to be in higher earnings subjects and this leads to the overall postgraduate average being higher.
The IFS published a report in September 2020 to look into these questions in more detail as well as other factors - Earnings returns to postgraduate degrees in the UK (ifs.org.uk).
Higher Education provider coverage
In this publication, we include graduates from Higher Education Institutes (HEIs), Alternative Providers (APs) and Further Education Colleges (FECs). HEIs are mainly universities. Former APs are HE providers that did not receive recurrent funding from the Office for Students (OfS) or other public bodies and that are not further education colleges. Eligible students can access loans and grants from the Student Loans Company (SLC) on specific courses, referred to as designated courses.
HESA coverage (first degree graduates and postgraduates)
In this publication, we include graduates from all higher education providers. Most tables of HESA data count students at providers in England registered with the OfS in the Approved (fee cap) or Approved categories plus publicly funded higher education providers in Wales, Scotland and Northern Ireland.
Specialist providers that are not universities did not need to return data on postgraduate students before the 16/17 academic year; from the 17/18 academic year they were required to return data on all postgraduate students (see Figure 15a - Alternative provider qualifications obtained by level of qualification obtained 2015/16 to 2018/19 | HESA).
Postgraduates are identified using the “XPQUAL01” (see XPQUAL01_2.20.1 | HESA) and “XQLEV501” (see XQLEV501_1.14.1 | HESA) HESA variables. For this publication, postgraduates are identified as records where XPQUAL01 is ‘1’ and XQLEV501 is ‘1’ (Postgraduate research) or ‘2’ (Postgraduate taught). It should be noted that integrated Masters degrees are not included in this.
First degree graduates are identified using the same two variables from HESA. Here, we again filter for XPQUAL01 to be ‘1’, and XQLEV501 to be ‘3’ (first degree). Integrated Masters degrees are included here.
More details on the qualifications that are in each XQLEV501 grouping can be found from HESA's “QUAL” variable, from which XQLEV501 is derived (Student 2020/21 - Qualification awarded | HESA).
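The identification rules for postgraduates and first degree graduates can be sketched as a single lookup. The function name and the returned labels are hypothetical, and the sketch assumes the HESA codes are held as strings, as they are quoted in the text.

```python
from typing import Optional

# XQLEV501 codes used in this publication, as described in the text.
LEVEL_LABELS = {
    "1": "Postgraduate research",
    "2": "Postgraduate taught",
    "3": "First degree",
}


def classify_graduate(xpqual01: str, xqlev501: str) -> Optional[str]:
    """Return the graduate group for a HESA record, or None if it is out of scope."""
    # Both populations require XPQUAL01 to be '1'.
    if xpqual01 != "1":
        return None
    # XQLEV501 then separates postgraduates ('1', '2') from first degrees ('3').
    return LEVEL_LABELS.get(xqlev501)


group = classify_graduate("1", "2")
```

Integrated Masters degrees are handled by the HESA coding itself: they fall under XQLEV501 ‘3’ and so are returned as first degree records.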
NPD data coverage
For both free school meals status (FSM) and prior attainment, LEO data is linked to the National Pupil Database (NPD). More information on the NPD can be found at Find and explore data in the National Pupil Database - GOV.UK (education.gov.uk).
For FSM, the school census data is linked to LEO data. The school census covers a variety of schools, which are listed at Which schools and pupils to include - Complete the school census - Guidance - GOV.UK (www.gov.uk). Not every individual in the LEO data can be matched to a school census record; those who cannot are represented as “Not known” in the publication. This could be because the pupil did not attend a school that completes the school census (e.g. they were at a registered independent school), because they went to school outside England, because we are unable to match their LEO record to an NPD record from the variables given, or for other less frequent reasons.
For Prior Attainment, we link to the KS5 attainment datasets. This covers a wider range of pupils and schools which is why we see significantly smaller numbers in the “Not known” section. However, we still have some “Not knowns” in this group from individuals who are Scottish and Welsh domiciled but attended an English HE provider or who we could not match to an NPD record.
Industry data coverage
The industry groups provided use the ONS Standard Industrial Classification (SIC) codes agreed in 2007 (SIC2007). SIC codes provide information on the type of economic activity the graduate’s employer is engaged in, not the occupation of the graduate. This has been linked to LEO using the employer enterprise reference from the Inter-Departmental Business Register (IDBR).
The IDBR covers over 2.6 million businesses in all sectors of the UK economy; however, it does not include very small businesses. To be on the IDBR, businesses must be registered either for VAT or PAYE. The Business Population Estimates publication provides figures for the number of UK businesses, 5.9 million, including the small businesses excluded from the IDBR. The IDBR covers approximately 44% of the total UK business population.
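The coverage figure follows directly from the two business counts quoted above. A quick check, using the rounded figures from the text (the variable names are illustrative):

```python
# Rounded figures quoted in the text, in millions of businesses.
idbr_businesses = 2.6      # businesses on the IDBR ("over 2.6 million")
all_uk_businesses = 5.9    # Business Population Estimates total

coverage = idbr_businesses / all_uk_businesses
coverage_percent = round(coverage * 100)
print(f"IDBR coverage: about {coverage_percent}%")  # prints "IDBR coverage: about 44%"
```

Because both inputs are rounded, the result is an approximation of the share of businesses covered, not of the share of employees or graduates covered.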
The IDBR data used in this dashboard is from datasets owned by the Office for National Statistics (ONS). The ONS does not accept responsibility for any inferences or conclusions derived from the IDBR data by third parties.
Graduates who do not have a PAYE record (e.g. self-employed) cannot be linked and will therefore be classified as ‘unknown’. A graduate’s SIC code is the industry in which they earnt the most in the tax year; where a graduate earnt an equal amount in two industries, we have classified them as ‘unknown’ since one cannot be chosen. The majority of the analysis uses the 21 industry sections; however, in the dashboard, the industry by subject table is expandable to the 3-digit-code level (see this ONS interactive SIC hierarchy).
Graduates whose only income comes from self-employment cannot be linked to a SIC code, and therefore will be classified as ‘Unknown’.
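The assignment rule can be sketched as below. This is an illustration under stated assumptions: the function name is hypothetical, earnings are held in whole pounds so that ties can be compared exactly, and earnings from several employers in the same industry are summed before comparison, which the text does not state explicitly.

```python
from collections import defaultdict
from typing import Iterable, Tuple

UNKNOWN = "unknown"


def assign_industry(paye_records: Iterable[Tuple[str, int]]) -> str:
    """Assign one industry to a graduate for a tax year.

    paye_records holds (industry, earnings in whole pounds) pairs taken from
    the graduate's PAYE employments in that tax year.
    """
    totals = defaultdict(int)
    for industry, earnings in paye_records:
        totals[industry] += earnings  # assumption: sum within each industry
    if not totals:
        return UNKNOWN  # no PAYE record, for example self-employment only
    top = max(totals.values())
    top_industries = [name for name, total in totals.items() if total == top]
    # A tie between industries cannot be resolved, so it is also 'unknown'.
    return top_industries[0] if len(top_industries) == 1 else UNKNOWN


industry = assign_industry([("Education", 18000), ("Manufacturing", 4000)])
tied = assign_industry([("Education", 9000), ("Manufacturing", 9000)])
```

In this example `industry` is the higher-earning industry, while `tied` falls back to ‘unknown’ because neither industry can be chosen.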
Please note there are a small number of cases (0.03%) where there are conflicting section names and group names. These records are present in the data for transparency since we cannot say with certainty which is correct. The industries affected are:
- Agriculture, forestry and fishing
- Mining and quarrying
- Activities of extraterritorial organisations and bodies
- Administrative and support service activities
- Manufacturing
Domicile Categories (UK, EU and overseas)
Domicile categories have been based upon graduates’ domicile prior to the start of their course, as recorded in the HESA student record for graduates from HEIs and APs and as recorded in the ILR for graduates from FECs. Graduates have been categorised into three top-level categories: UK, EU and overseas. Due to data quality issues with the domicile variable on the ILR in the 2003/04 and 2004/05 academic years, we have not included non-UK domiciled graduates from FECs in the tables for these years.
UK domiciled refers to graduates domiciled in England, Scotland, Wales or Northern Ireland prior to the start of their course.
EU domiciled refers to graduates domiciled in the EU (with their country’s EU membership being determined at the start of their graduation year) other than in England, Scotland, Wales or Northern Ireland. As such, graduates domiciled in Gibraltar have been classed as EU domiciled. Over the period covered by this publication the membership of the EU has changed, and hence different graduating cohorts consist of different sets of countries. Graduates have been classed as EU domiciled if their recorded country of domicile was a member of the EU at the start of their year of graduation. The table below details the cohort(s) for which each country has been designated as part of the EU domiciled category. Countries listed include all of their European Union territories; for instance, Finland includes the territory of the Åland islands.
Countries and territories included in the European Union category by graduating cohort
Country/Territory | Graduating cohorts in which domicile is counted as EU domiciled |
---|---|
Austria, Belgium, Denmark, Finland, France, Germany, Gibraltar, Greece, Ireland, Italy, Luxembourg, Netherlands, Portugal, Spain, Sweden | All graduating cohorts (2003/04, 2004/05, 2008/09, 2009/10, 2010/11, 2011/12, 2012/13, 2013/14, 2014/15, 2015/16, 2016/17, 2017/18, 2018/19) |
Cyprus, Czech Republic, Estonia, Hungary, Latvia, Lithuania, Malta, Poland, Slovakia, Slovenia | 2004/05, 2008/09, 2009/10, 2010/11, 2011/12, 2012/13, 2013/14, 2014/15, 2015/16, 2016/17, 2017/18, 2018/19 |
Bulgaria, Romania | 2008/09, 2009/10, 2010/11, 2011/12, 2012/13, 2013/14, 2014/15, 2015/16, 2016/17, 2017/18, 2018/19 |
Croatia | 2013/14, 2014/15, 2015/16, 2016/17, 2017/18, 2018/19 |
Overseas domiciled refers to graduates domiciled in countries/territories not belonging to the European Union. The Crown Dependencies of Jersey, Guernsey and the Isle of Man are not part of the UK or of the European Union and thus they have been included in this category.
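The three domicile categories can be sketched as a lookup against the cohort table above. The function and constant names are hypothetical, and the sketch assumes that country names and cohort labels are held exactly as written in the table.

```python
# Graduating cohorts covered by the table, in chronological order.
COHORTS = [
    "2003/04", "2004/05", "2008/09", "2009/10", "2010/11", "2011/12", "2012/13",
    "2013/14", "2014/15", "2015/16", "2016/17", "2017/18", "2018/19",
]

UK_COUNTRIES = {"England", "Scotland", "Wales", "Northern Ireland"}

# First cohort in which each group of countries is counted as EU domiciled.
_EU_GROUPS = {
    "2003/04": ["Austria", "Belgium", "Denmark", "Finland", "France", "Germany",
                "Gibraltar", "Greece", "Ireland", "Italy", "Luxembourg",
                "Netherlands", "Portugal", "Spain", "Sweden"],
    "2004/05": ["Cyprus", "Czech Republic", "Estonia", "Hungary", "Latvia",
                "Lithuania", "Malta", "Poland", "Slovakia", "Slovenia"],
    "2008/09": ["Bulgaria", "Romania"],
    "2013/14": ["Croatia"],
}
EU_FIRST_COHORT = {
    country: first for first, countries in _EU_GROUPS.items() for country in countries
}


def domicile_category(country: str, cohort: str) -> str:
    """Return 'UK', 'EU' or 'Overseas' for a country of domicile and cohort."""
    if cohort not in COHORTS:
        raise ValueError(f"Cohort not covered by the table: {cohort}")
    if country in UK_COUNTRIES:
        return "UK"
    first = EU_FIRST_COHORT.get(country)
    if first is not None and COHORTS.index(cohort) >= COHORTS.index(first):
        return "EU"
    # Everything else, including the Crown Dependencies, is overseas.
    return "Overseas"


before_accession = domicile_category("Croatia", "2012/13")
after_accession = domicile_category("Croatia", "2013/14")
```

Keying the lookup on the first qualifying cohort keeps the rule in one place: the same country can be overseas for an earlier cohort and EU for a later one.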
Table 19 in the accompanying tables gives employment and earnings outcomes for the 20 largest countries of domicile within our data. We have followed methodology used by HESA in defining country of domicile, for instance aggregating together the various territories of France in the France total but keeping China and Hong Kong separate.
Note that country of domicile is not the same as nationality (as recorded on the HESA student record). For instance, in 2012/13, 91% of UK domiciled graduates were UK nationals, while 7% of EU domiciled graduates and about 4% of overseas domiciled graduates were UK nationals.
Coronavirus employment support schemes
Two employment support schemes were set up to support employments affected by lockdown closures. The Coronavirus Job Retention Scheme (CJRS) was for employees who were on furlough, and the Self-Employment Income Support Scheme (SEISS) was for self-employed workers. All CJRS payments were paid via PAYE, so CJRS earnings are part of the RAPID data. RAPID (Registration and Population Interactions Database) is a DWP dataset. In theory, people who received a SEISS grant should complete self-assessment tax returns so the SEISS amounts will also be reflected in RAPID.
The CJRS and SEISS datasets were matched to the HE LEO publication data. A flag was created for graduates who had been in receipt of either of these HMRC Covid-19 employment support schemes for at least one week in the 2020/21 tax year.
The proportion of graduate cohorts who were in receipt of an employment support grant, either CJRS or SEISS, should provide some context to earnings figures. This only includes graduates who were on furlough and paid with the aid of the CJRS, or in receipt of a SEISS grant. Employers were able to furlough employees and pay them without the aid of the CJRS, and not all self-employed people applied for or were eligible for the SEISS grant. These graduates are not counted in the proportions in receipt of an employment support grant.
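The flag and the resulting proportion can be sketched as follows. The record layout is an assumption made for illustration: each graduate is represented by the number of weeks matched to each scheme in the 2020/21 tax year, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SupportRecord:
    """Weeks of support matched to one graduate in the 2020/21 tax year."""
    weeks_on_cjrs: int
    weeks_on_seiss: int


def received_support(record: SupportRecord) -> bool:
    """Flag graduates with at least one week on either scheme."""
    return record.weeks_on_cjrs >= 1 or record.weeks_on_seiss >= 1


def proportion_supported(cohort: Sequence[SupportRecord]) -> float:
    """Share of a graduate cohort flagged as receiving employment support."""
    if not cohort:
        raise ValueError("Cohort is empty")
    return sum(received_support(record) for record in cohort) / len(cohort)


# Illustrative cohort of four graduates.
cohort = [
    SupportRecord(weeks_on_cjrs=12, weeks_on_seiss=0),
    SupportRecord(weeks_on_cjrs=0, weeks_on_seiss=4),
    SupportRecord(weeks_on_cjrs=0, weeks_on_seiss=0),
    SupportRecord(weeks_on_cjrs=0, weeks_on_seiss=0),
]
share = proportion_supported(cohort)
```

Graduates furloughed without the aid of the CJRS never appear in the matched scheme data, so in this sketch they would have zero weeks on both schemes and would not be flagged.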