Methodology

Graduate outcomes (LEO)

Update notes:
  1. Updated Graduate industry sections.
  2. New information on Graduate industry included in data quality and definitions sections.

Introduction

Background to the Longitudinal Educational Outcomes (LEO) dataset 

The Small Business, Enterprise and Employment Act 2015 enabled government, for the first time, to link higher education and tax data together to chart the transition of graduates from higher education into the workplace (for more information on the legal powers governing the dataset please see section 78 of the Small Business, Enterprise and Employment Act 2015 and sections 87-91 of the Education and Skills Act 2008). 

One of the advantages of linking data from existing administrative sources is that it provides a unique insight into the destinations of graduates without imposing any additional data collection burdens on universities, employers or members of the public. Compared to existing sources of graduate outcomes data, it is also based on a considerably larger sample, does not rely on survey methodology, and can track outcomes across time to a greater extent than was previously possible. 

The LEO dataset links information about students, including:

  • personal characteristics such as sex, ethnic group and age
  • education, including schools, colleges and higher education institution attended, courses taken, and qualifications achieved
  • employment and income
  • benefits claimed

 

It is created by combining data from the following sources:

  • the National Pupil Database (NPD), held by the Department for Education (DfE)
  • Higher Education Statistics Agency (HESA) data on students at UK publicly funded higher education institutions and some alternative providers, held by DfE
  • Individualised Learner Record data (ILR) on students at further education institutions, held by DfE
  • employment data from the Real Time Information System (RTI). RTI contains information formerly collected on the P45 and P14 forms, held by Her Majesty’s Revenue and Customs (HMRC)
  • data from the Self-Assessment tax return, held by HMRC
  • the National Benefit Database, Labour Market System and Juvos data, held by the Department for Work and Pensions (DWP)

By combining these sources, we can look at the progress of higher education leavers into the labour market.  

The privacy notice explaining how personal data in this project is shared and used can be found at Longitudinal education outcomes study: how we use and share data - GOV.UK (www.gov.uk)

Data quality and coverage

Employment and earnings data 

The employment data covers those with P45 and P14 records submitted through the Pay As You Earn (PAYE) system. These figures have been derived from administrative IT systems that, as with any large-scale recording system, are subject to possible errors with data entry and processing. While some data cleaning was necessary, the resulting data appears to provide a good reflection of an individual’s employment and earnings for the year. 

For the purposes of collecting taxes, only the tax year of employment is needed; accurate start and end dates within the tax year are not required. For this reason, issues encountered with the employment data included records with duplicate dates and records with dates which were invalid for our intended use (for example, where an employment start date occurred after the end date). 

Additionally, a number of returns are found to have missing start dates due to the employer not forwarding a timely P45. The default dates recorded in the dataset are either 6 April (the first day of the tax year) or, where only an end date is known, the day before that end date. Similarly, for records where the employment is known to have come to an end within a tax year but the end date is not known, the record is given a default 5 April end date, the last day of the tax year. 

Individuals can also have overlapping spells of employment. Before carrying out analysis, the P45 and P14 records for each individual were cleaned and then merged into a single record to give a longitudinal picture of their employment and a total sum of their earnings in each tax year. Where uncertain dates appeared, other employments or benefits records for that individual were used to create a merged employment spell with a known start and end date.

Example 1: Two employment spells 

Spell A                               |---------| 

Spell B                  |-----------------|--------------|  

Merged result               |----------------------| 

In example 1, the start date of spell B is uncertain; its possible range is shown as the first segment of spell B. In this instance we can merge the two records, resulting in an employment spell with the start date of spell A and the end date of spell B. 
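The merging step in example 1 can be sketched as a standard interval merge. This is only an illustration, not the production cleaning code: it assumes each spell is a (start, end) pair of dates, and that a spell with an uncertain start is represented by the portion that is known for certain. The function name `merge_spells` and the dates are hypothetical.

```python
from datetime import date

def merge_spells(spells):
    """Merge overlapping (start, end) employment spells into continuous spells."""
    merged = []
    for start, end in sorted(spells):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous spell: keep its start, extend its end if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Spell A has known dates; spell B is reduced to its certain portion (assumption).
spell_a = (date(2017, 6, 1), date(2017, 10, 31))
spell_b_certain = (date(2017, 9, 1), date(2018, 2, 28))

merged = merge_spells([spell_b_certain, spell_a])
# The merged spell takes the start date of spell A and the end date of spell B.
```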

Any remaining uncertain dates were imputed through random sampling of gap lengths from a frequency distribution that was constructed from gaps with a known length. 
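As a rough sketch of this kind of imputation, the snippet below draws a gap length from the empirical frequency distribution of gaps whose length is known. The function name `impute_gap_length` and the example gap lengths are illustrative assumptions; the exact sampling procedure used for LEO is not described beyond the sentence above.

```python
import random
from collections import Counter

def impute_gap_length(known_gap_lengths, rng):
    """Sample one gap length (in days) in proportion to how often it was observed."""
    frequency = Counter(known_gap_lengths)
    lengths = list(frequency)
    weights = [frequency[length] for length in lengths]
    return rng.choices(lengths, weights=weights, k=1)[0]

# Hypothetical observed gaps (days) between spells with known dates.
observed_gaps = [3, 3, 7, 30]
rng = random.Random(0)  # seeded so the sketch is repeatable
imputed = impute_gap_length(observed_gaps, rng)
```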

DWP/HMRC coverage 

Beginning in April 2013, the P45 reporting system was phased out in favour of the Real Time Information (RTI) system, which requires employers to submit information to HMRC each time an employee is paid. This system has now reached full deployment. RTI offers substantial improvements over the P45 system in terms of data coverage, since employers must now provide information on all their employees if even one employee of the company is paid above the Lower Earnings Limit. The move to RTI means that data coverage is high for the 2014/15 to 2018/19 tax years used in this publication. 

We cannot currently distinguish between part-time and full-time work in the LEO data. This is further discussed in “Methodology - Annualised earnings”.  

As well as employment data for those who pay tax through PAYE, the employment data additionally includes those who pay tax through self-assessment. 

Self-assessment forms are completed by a range of people who for example are self-employed, have received income from investments, savings or shares and by people who have complicated tax affairs. A list of people who are required to complete a self-assessment return can be found at www.gov.uk/self-assessment-tax-returns/who-must-send-a-tax-return. We have recently obtained a new self-assessment earnings dataset from HMRC, which contains variables on: 

  • Earnings received through employment (PAYE)
  • Income from partnership enterprises
  • Income from sole-trader enterprises
  • Total earnings for the tax year from the self-assessment form.

We have used the income from partnership enterprises and income from sole-trader enterprises to ascertain graduates who are self-employed and their earnings from self-employment enterprises. We have taken a sum of these two variables, and where the sum of these is greater than £0, graduates are classified as self-employed. Where self-employment earnings are used, the earnings amount is the sum of these two variables. 
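The rule above can be written as a short sketch. The function names are hypothetical, and the inputs are assumed to be the two self-assessment variables expressed in pounds.

```python
def self_employment_earnings(partnership_income, sole_trader_income):
    """Self-employment earnings are the sum of the two self-assessment variables."""
    return partnership_income + sole_trader_income

def is_self_employed(partnership_income, sole_trader_income):
    """A graduate is classified as self-employed when the sum is greater than £0."""
    return self_employment_earnings(partnership_income, sole_trader_income) > 0
```

Under this rule a graduate whose combined figure is zero or negative (for example, a loss) is not classified as self-employed.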

In the data received from DWP, an overseas flag is received to identify individuals who are known to be living overseas. Details on when an individual informs HMRC can be found at Tax if you leave the UK to live abroad - GOV.UK (www.gov.uk). For this analysis, individuals who are known to be overseas are excluded as their earnings and outcomes data is likely to be incomplete.  

Graduate coverage 

In the 2014/15 academic year, some alternative providers (APs) in England were mandated to submit data to the Higher Education Statistics Agency (HESA). In the 2015/16 academic year, the coverage was extended to include all APs in England with undergraduate designated courses. For this reason, this publication only includes information for AP graduates one and three years after graduation. (Note that, in line with HESA statistics, the University of Buckingham, an alternative provider, is reported with HEIs.)  

Some of the breakdowns in this release only cover young graduates (under 21 at the start of their course). This is due to low data coverage for graduates who were mature students (21 or over at the start of the course) or because including mature students would provide an unreliable comparison against trends within the young graduates group. For example, the free school meals (FSM) breakdown has been calculated using school records data, and for many of the mature graduates this data is not readily available because they left school before this information was collected. As another example, ‘Home region’ has been calculated on young graduates alone using information about where they lived prior to study. For mature graduates this information is less likely to be their home region, because they are more likely to have relocated between leaving school and starting their course. The breakdowns that only cover young graduates are POLAR quintile, prior attainment, FSM, home region and residence. 

In this publication, we include graduates from Higher Education Institutions (HEIs), Alternative Providers (APs) and Further Education Colleges (FECs). APs are higher education (HE) providers who do not receive recurrent funding from the Office for Students (OfS) or other public bodies and who are not FECs. Eligible students can access loans and grants from the Student Loans Company (SLC) on specific courses, referred to as designated courses. In previous publications (e.g. Graduate outcomes (LEO): Employment and earnings outcomes of higher education graduates by subject studied and graduate characteristics in 2017 to 2018 (publishing.service.gov.uk)) we provided a breakdown of earnings and outcomes by provider type. However, we no longer compare these, as the provider types serve students with different characteristics. 

The table below shows the employment outcomes by the three provider types and the number of graduates from each type of provision for the 2014/15 graduating cohort (three years after graduation in 2018/19 tax year). 

Provider type | Number of graduates | Number of matched graduates | Percentage of matched graduates in sustained employment or further study (%)
HEI | 271,905 | 266,550 | 87.8
AP | 3,165 | 3,030 | 86.3
FEC | 7,560 | 7,440 | 82.4

For more data split by provider type, see table 7 in the ‘UK domiciled excel tables’ document - this can be found in the ‘Download associated files’ section at the beginning of this release.

NPD data coverage

For both free school meals status (FSM) and prior attainment, LEO data is linked to the National Pupil Database (NPD).  More information on the NPD can be found at Find and explore data in the National Pupil Database - GOV.UK (education.gov.uk).

For FSM, the school census data is linked to LEO data. The school census covers a variety of schools which are listed at Which schools and pupils to include - Complete the school census - Guidance - GOV.UK (www.gov.uk). Not every individual in the LEO data can be matched to a school census record; those who cannot are represented as “Not known” in the publication. This could be because a pupil did not attend a school that completes the school census (e.g. they were at a registered independent school), because they went to school outside of England, because we are unable to match their LEO record to an NPD record from the variables given, or for other less frequent reasons.

For Prior Attainment, we link to the KS5 attainment datasets. This covers a wider range of pupils and schools which is why we see significantly smaller numbers in the “Not known” section. However, we still have some “Not knowns” in this group from individuals who are Scottish and Welsh domiciled but attended an English HE provider or who we could not match to an NPD record. 

Industry data

Graduates whose only income comes from self-employment cannot be linked to a SIC code, and therefore will be classified as ‘Unknown’. 

Please note there are a small number of cases (0.03%) where there are conflicting section names and group names. These records are present in the data for transparency since we cannot say with certainty which is correct. The industries affected are: 

  • Agriculture, forestry and fishing
  • Mining and quarrying
  • Activities of extraterritorial organisations and bodies
  • Administrative and support service activities
  • Manufacturing

Methodology

Time period 

The earliest time period for which employment and earnings data is reported is one year after graduation (YAG). This refers to the first full tax year after graduation. Hence, for the 2016/17 graduation cohort, the figures one year after graduation refer to employment and earnings outcomes in the 2018/19 tax year. This time period was chosen because using the tax year that overlaps with the graduation date would mean that graduates are unlikely to have been engaged in economic activity for the whole tax year. 

Academic year 2016/17   |------------------|    

Tax year 2017/18                                    |------------------|

Tax year 2018/19                                                            |------------------|  

In this publication, we look at one, three, five and ten years after graduation, focussing on the 2018/19 tax year with some comparative analysis with the 2014/15 to 2017/18 tax years. Thus, we look at employment and earnings outcomes in the 2018/19 tax year for graduates from the 2007/08, 2012/13, 2014/15 and 2016/17 academic years. For the 2014/15 tax year, we look at graduates from the 2003/04, 2008/09, 2010/11 and 2012/13 academic years. The cohorts for the other tax years are determined in the same way. 

The table below shows this for all tax years and academic years. The cells represent years after graduation (YAG). The cohorts available in this publication are those at 1, 3, 5 and 10 YAG in each tax year.

Academic year of graduation | Tax year 2014/15 | Tax year 2015/16 | Tax year 2016/17 | Tax year 2017/18 | Tax year 2018/19
2003/04 | 10 YAG | 11 YAG | 12 YAG | 13 YAG | 14 YAG
2004/05 | 9 YAG | 10 YAG | 11 YAG | 12 YAG | 13 YAG
2005/06 | 8 YAG | 9 YAG | 10 YAG | 11 YAG | 12 YAG
2006/07 | 7 YAG | 8 YAG | 9 YAG | 10 YAG | 11 YAG
2007/08 | 6 YAG | 7 YAG | 8 YAG | 9 YAG | 10 YAG
2008/09 | 5 YAG | 6 YAG | 7 YAG | 8 YAG | 9 YAG
2009/10 | 4 YAG | 5 YAG | 6 YAG | 7 YAG | 8 YAG
2010/11 | 3 YAG | 4 YAG | 5 YAG | 6 YAG | 7 YAG
2011/12 | 2 YAG | 3 YAG | 4 YAG | 5 YAG | 6 YAG
2012/13 | 1 YAG | 2 YAG | 3 YAG | 4 YAG | 5 YAG
2013/14 |  | 1 YAG | 2 YAG | 3 YAG | 4 YAG
2014/15 |  |  | 1 YAG | 2 YAG | 3 YAG
2015/16 |  |  |  | 1 YAG | 2 YAG
2016/17 |  |  |  |  | 1 YAG
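The relationship between academic year of graduation, tax year and YAG can be sketched as follows. The function name is hypothetical, and years are assumed to be written in the "2016/17" form used in this document; the rule itself follows from one YAG being the first full tax year after graduation.

```python
def years_after_graduation(academic_year, tax_year):
    """Return the YAG for a graduation cohort in a given tax year, or None if the
    tax year is earlier than the first full tax year after graduation."""
    graduation_start = int(academic_year.split("/")[0])
    tax_start = int(tax_year.split("/")[0])
    # The tax year overlapping the graduation date is skipped, hence the "- 1".
    yag = tax_start - graduation_start - 1
    return yag if yag >= 1 else None
```

For example, the 2016/17 cohort in the 2018/19 tax year gives 1 YAG, and the 2003/04 cohort in the 2014/15 tax year gives 10 YAG.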

Employment outcomes 

We refer to a graduate as matched if they have been successfully matched to the Department for Work and Pensions’ Customer Information System (CIS) or if they have been matched to a further study instance on the HESA Student Record. Graduates who have not been matched to CIS or a further study record are referred to as unmatched. These graduates were not found on CIS, either because they had never been issued with a National Insurance number or because the personal details provided from the HESA data did not fulfil the matching criteria. Unmatched graduates are excluded from calculations performed for UK domiciled populations, as are graduates who were matched but are known to be overseas. Neither group is included in the outcomes categories in Tables 1 to 14 and 20 to 32. 

UK domiciled graduates who have been matched and are not known to be overseas are then placed in one of five outcomes categories. These are: 

  • Activity not captured
  • No sustained destination
  • Sustained employment only
  • Sustained employment with or without further study
  • Sustained employment, further study or both.

Unmatched graduates are included in the denominator when calculating employment outcomes for non-UK domiciled graduates (Tables 16, 17 and 33) and are placed in a separate ‘unmatched’ outcome category. For these populations the match rates are much lower and non-UK graduates are much more likely to leave the UK after graduation. Including these graduates in the calculations means we get a better indication of the proportion of graduates who have stayed in the UK to work or study after graduation, making it easier to compare countries with vastly different match rates.  

For non-UK domiciled graduates, the employment outcome categories should not be used as an indication of success in finding employment after graduation. It is likely that the majority of these graduates who are ‘unmatched’ or in ‘activity not captured’ are employed outside of the UK.  

More information on match rates is given in section: Data matching and match rates. If a graduate is unmatched on the CIS but has a further study record for the tax year in question, then they are counted as being in further study, and hence are not in the unmatched category. 

Activity not captured 

Graduates in this category have been successfully matched to CIS but do not have any employment, out-of-work benefits or further study records in the tax year of interest. Reasons for appearing in this category include: moving out of the UK after graduation for either work or study, voluntarily leaving the labour force or death. 

No sustained destination 

Graduates who have an employment or out-of-work benefits record in the tax year in question but were not classified as being in ‘sustained employment’ and do not have a further study record. 

Sustained employment defined by P45 data 

The ‘sustained employment’ measure aims to count the proportion of graduates in sustained employment in the UK following the completion of their course. The definition of sustained employment is consistent with the definition used for 16-19 accountability and the outcome-based success measures published for adult further education (see Further education: outcome-based success measures, Academic Year 2017/18 – Explore education statistics – GOV.UK (explore-education-statistics.service.gov.uk)). This definition looks at employment activity in the six-month October to March period of each tax year. A graduate needs to be in paid employment for at least one day in five out of the six months between October and March of a given tax year to be classified as being in ‘sustained employment’ in that tax year. If they are not employed in March, they must additionally have at least one day in employment in the April of the same calendar year to be counted as being in sustained employment.  

For instance, a graduate employed from 1st October 2017 to 5th January 2018 and then again from 30th March 2018 onwards would be classed as being in sustained employment in 2017/18 as although they are not employed in February 2018, they are employed in the other five months in the period from October 2017 to March 2018.  

However, a graduate employed from 1st October 2017 to 28th February 2018 but not employed in March 2018 would not be considered as being in sustained employment unless they had a day in employment in April 2018. 
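The rule and the two worked examples above can be expressed as a minimal sketch. It assumes employment spells are (start, end) date pairs, with `None` as the end date of an ongoing spell; the function names are hypothetical and this is not the production implementation.

```python
from datetime import date, timedelta

def employed_in_month(spells, year, month):
    """True if any spell covers at least one day of the given calendar month."""
    first_day = date(year, month, 1)
    # First day of the following month, minus one day, is the last day of this month.
    next_month = date(year + (month == 12), month % 12 + 1, 1)
    last_day = next_month - timedelta(days=1)
    return any(
        start <= last_day and (end is None or end >= first_day)
        for start, end in spells
    )

def in_sustained_employment(spells, tax_year_start):
    """Apply the five-out-of-six-months rule for the tax year starting in April
    of `tax_year_start` (e.g. 2017 for the 2017/18 tax year)."""
    y = tax_year_start
    window = [(y, 10), (y, 11), (y, 12), (y + 1, 1), (y + 1, 2), (y + 1, 3)]
    employed = [employed_in_month(spells, yr, m) for yr, m in window]
    if sum(employed) < 5:
        return False
    if employed[-1]:  # employed in March
        return True
    # Not employed in March: require at least one day in the following April.
    return employed_in_month(spells, y + 1, 4)

first_example = [(date(2017, 10, 1), date(2018, 1, 5)), (date(2018, 3, 30), None)]
second_example = [(date(2017, 10, 1), date(2018, 2, 28))]
```

Here `first_example` meets the definition for 2017/18, while `second_example` does not unless a spell covering a day in April 2018 is added.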

Sustained employment defined by self-assessment data 

This publication incorporates self-assessment data into measures of sustained employment. Self-assessment data captures the activity of individuals with income that is not taxed through PAYE, such as income from self-employment, savings and investments, property rental, and shares. Currently, only data from the 2013/14 tax year onwards is available for inclusion in LEO. For this reason, we have only published employment and earnings outcomes for these tax years in this publication. 

For the purposes of this publication, individuals are classed as being in sustained employment in the tax year if they meet our definition of sustained employment based on PAYE or have returned a self-assessment form stating that they have received income from self-employment and their earnings from a Partnership or Sole-Trader enterprise are more than £0 (profit from self-employment). These individuals may or may not have an additional PAYE record. Individuals who have received income through self-assessed means other than self-employment, such as through rental of property, and do not have a PAYE record, are not classed as being in employment (either sustained or unsustained). Those who have made a loss from self-employment are currently excluded from sustained employment as we are unable to distinguish between those who made a loss and those who submitted self-assessment returns for other reasons at this moment in time. 

Further study 

A graduate is defined as being in further study if they have a valid higher education study record at any UK HEI on the HESA Student Record or designated English Alternative Provider (AP) on the AP HESA Student Record that overlaps the relevant tax year. Further study undertaken at further education colleges is not currently reflected in these figures but we will review this in future publications. The further study does not have to be at postgraduate level to be counted. The purpose of this category is to identify how students spent their time in the relevant tax year and as such cannot be used to calculate the proportion of graduates who go on to postgraduate study. We have not counted instances lasting 14 days or less, a change from previous publications. Additionally, students enrolled on further education courses, on some initial teacher training enhancement, booster and extension courses, whose study status is dormant or who were on sabbatical are excluded from this indicator in line with our previous methodology. 

As a tax year overlaps with two academic years, some students would be coming to the end of their further study in the tax year in question and some would be starting their further study. For example, those who graduated in the 2015/16 academic year and went straight on to a one-year masters course would not be counted as being in further study in the 2017/18 tax year (one year after graduation) as their course would finish in July 2017. If a graduate from 2015/16 waited a year before starting their one-year masters course then they would typically be counted as being in further study in the 2017/18 tax year (one year after graduation) if their course started in September 2017 for instance. 

Sustained employment only 

Graduates are considered to be in sustained employment only if they have a record of sustained employment (as defined either via the P45 or self-assessment data) but no record of further study (as defined above).  

Sustained employment with or without further study 

Sustained employment with or without further study includes all graduates with a record of sustained employment (defined either via the P45 or self-assessment data), regardless of whether they also have a record of further study (as defined above). 

Sustained employment, further study or both 

Sustained employment, further study or both includes all graduates with a record of sustained employment or further study. This category includes all graduates in the ‘sustained employment with or without further study’ category as well as those with a further study record only.

It is important to note that our definition of sustained employment does not distinguish between the different types of work that graduates are engaged in and so cannot provide an indication of the proportion of graduates who are employed in graduate occupations. Furthermore, we cannot distinguish between full-time and part-time employment. 

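The table below summarises the type of activity people may have to be unmatched or to fall into one of the outcomes categories. 

Table: Classification of graduate outcomes (Y indicates that the column is true for that outcome, - indicates that it is not, and ‘Y or -’ indicates that either is possible) 

LEO category | Further study | Sustained employment | Any employment | Out-of-work benefits
Unmatched | - | Unmatched to CIS | Unmatched to CIS | Unmatched to CIS
Activity not captured | - | - | - | -
No sustained destination | - | - | Y | -
No sustained destination | - | - | - | Y
No sustained destination | - | - | Y | Y
Sustained employment only | - | Y | Y or - | Y or -
Sustained employment with or without further study | Y or - | Y | Y or - | Y or -
Sustained employment, further study or both | Y | Unmatched to CIS | Unmatched to CIS | Unmatched to CIS
Sustained employment, further study or both | - | Y | Y or - | Y or -
Sustained employment, further study or both | Y | Y or - | Y or - | Y or -
Further study, with or without sustained employment | Y | Unmatched to CIS | Unmatched to CIS | Unmatched to CIS
Further study, with or without sustained employment | Y | Y or - | Y or - | Y or -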
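The category definitions in this section can be sketched as a small classification function. This is an illustration based only on the definitions given in the text; the function name and the boolean inputs are hypothetical. Because the sustained employment categories are nested, a graduate can fall into more than one of them, so the sketch returns a list.

```python
def classify_outcome(matched_to_cis, further_study, sustained_employment,
                     any_employment, out_of_work_benefits):
    """Return the outcome categories that apply to one graduate in one tax year."""
    if not matched_to_cis:
        # No employment or benefits information exists for unmatched graduates.
        sustained_employment = any_employment = out_of_work_benefits = False
        if not further_study:
            return ["Unmatched"]
    if not (further_study or sustained_employment):
        if any_employment or out_of_work_benefits:
            return ["No sustained destination"]
        return ["Activity not captured"]
    categories = []
    if sustained_employment and not further_study:
        categories.append("Sustained employment only")
    if sustained_employment:
        categories.append("Sustained employment with or without further study")
    categories.append("Sustained employment, further study or both")
    return categories
```

For instance, a graduate who is unmatched to CIS but has a further study record is counted only in ‘Sustained employment, further study or both’.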

 Annualised earnings 

Earnings figures are only reported for those classified as being in sustained employment via PAYE and where we have a valid earnings record from the P14, or where they are self-employed and have reported income of over £0 for that tax year. Those in further study are excluded, as their earnings would be more likely to relate to part-time jobs. Note that our publications prior to December 2017 did not include earnings from self-assessment. Under the new methodology, some graduates will have increased earnings if they have PAYE earnings as well as self-employment earnings. However, there are also more graduates included in the earnings calculations: those who have self-employment earnings but do not have qualifying PAYE earnings. This group typically has lower earnings than graduates with PAYE earnings. Thus, the reported median earnings under the new methodology are not necessarily higher than under the old methodology. See our December 2017 publication for more details on the effect of this methodology change. 

Under our new methodology, PAYE and earnings from self-employment are treated differently. 

For each graduate who has been paid through the PAYE system, the earnings reported for them for a given tax year are divided by the number of days recorded in the employment spell in that same tax year. This provides an average daily wage, which is then multiplied by the number of days in the tax year to create their annualised earnings. 
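The annualisation step can be illustrated with a short sketch. The function name and the example figures are hypothetical; the number of days in the tax year is passed in explicitly because it differs in leap years.

```python
def annualise_paye_earnings(earnings, days_in_spell, days_in_tax_year=365):
    """Scale PAYE earnings for part of a tax year up to a full-year figure."""
    if days_in_spell <= 0:
        raise ValueError("days_in_spell must be positive")
    average_daily_wage = earnings / days_in_spell
    return average_daily_wage * days_in_tax_year

# A graduate paid £7,300 over a 146-day spell has an average daily wage of £50.
annualised = annualise_paye_earnings(7300, 146)
```

In this example the annualised figure is £18,250, which is higher than the raw £7,300 because the graduate did not work for the whole tax year.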

This calculation has been used to maintain consistency with figures reported for further education learners after study. It provides students with an indication of the earnings they might receive once in stable and sustained employment. 

For earnings from self-employment, raw earnings are used. Due to the nature of the Self-Assessment tax return, dates of self-employment are not required and therefore are not available to annualise the self-employment earnings in the same way that PAYE earnings are annualised. We are therefore assuming that the Self-Assessment tax return relates to activity that took place over the full tax year. 

Where a graduate has income from both sustained employment paid through PAYE and through self-employment, the earnings used for this graduate are the sum of their annualised PAYE earnings and their raw earnings from self-employment. It should be noted that a graduate with a PAYE record that does not reach the ‘sustained’ criteria and a self-employment earnings record will be counted as being in ‘sustained employment’, but we do not include their earnings in the earnings calculation. This is to avoid the risk of annualising PAYE data that could be based on a very short earnings spell. 
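The way PAYE and self-employment earnings are combined can be sketched as below. This is a simplified reading of the rules in this section, not the production logic: the function name, the `paye_status` values ("none", "unsustained", "sustained") and the treatment of a self-employment loss as zero are assumptions made for the illustration.

```python
def earnings_for_reporting(paye_status, annualised_paye, self_employment_earnings,
                           in_further_study=False):
    """Return the earnings figure used in the earnings calculation, or None if
    the graduate is excluded from it."""
    if in_further_study:
        return None  # those in further study are excluded from earnings figures
    if paye_status == "unsustained":
        return None  # avoids annualising a potentially very short PAYE spell
    # Only self-employment earnings above £0 count (assumption: a loss counts as 0).
    self_employed_part = self_employment_earnings if self_employment_earnings > 0 else 0
    if paye_status == "sustained":
        return annualised_paye + self_employed_part
    # No PAYE record: raw self-employment earnings are used without annualising.
    return self_employed_part if self_employed_part > 0 else None
```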

The annualised earnings calculated are slightly higher than the raw earnings reported in the tax year. This is because the earnings of those who did not work for the entire tax year will be higher when annualised. The difference between the annualised and raw figures decreases as time elapses after graduation. Overall median annualised earnings one year after graduation are around £650 higher than the overall median raw earnings reported in the data. Five years after graduation, the overall median annualised earnings are less than £300 higher than the overall median raw earnings. The trend follows for both graduates who are in PAYE employment only and graduates who earnt income from both PAYE employment and self-employment.  

Information provided on the Self-Assessment tax return includes a field on earnings through PAYE employment, which we have used only where P14 earnings are not present. 

All earnings presented are nominal. They represent the cash amount an individual was paid and are not adjusted for inflation (the general increase in the price of goods and services). The exception to this is the figure and table showing the nominal earnings compared to real-term earnings using the Consumer Prices Index Including Owner Occupiers’ Housing Costs (CPIH) to account for inflation. 

It should be noted that LEO does not currently include data on the average number of hours worked per week. Therefore, we cannot distinguish between part-time and full-time employment or earnings. We appreciate that this is likely to impact some demographics more than others and are working towards having this data in future iterations of LEO so that it can be accounted for. 

Calculating earnings difference between sexes

Previously, the percentage used to compare male and female earnings was calculated as the difference between the medians divided by the female median earnings. This year, we have altered the calculation and use male median earnings as the denominator. This is in line with the calculation used by the ONS in their gender pay gap publication - Gender pay gap in the UK - Office for National Statistics (ons.gov.uk).
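The change of denominator can be illustrated with a small sketch. The function names and the median figures are hypothetical and are not taken from the publication.

```python
def earnings_difference_percent(male_median, female_median):
    """Current method: difference between the medians as a percentage of the male median."""
    return (male_median - female_median) / male_median * 100

def earnings_difference_percent_previous(male_median, female_median):
    """Previous method: the same difference as a percentage of the female median."""
    return (male_median - female_median) / female_median * 100

# Hypothetical medians: £25,000 for males and £22,500 for females.
current = earnings_difference_percent(25000, 22500)
previous = earnings_difference_percent_previous(25000, 22500)
```

With these illustrative figures the current method gives 10%, while the previous method gives roughly 11.1%, so the two are not directly comparable.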

Rounding and suppression rules 

We apply rounding and suppression rules to help minimise the risk of someone being identifiable from our data (also known as Statistical Disclosure Control). All calculations in this publication are carried out on the rounded figures. 

The following rounding rules have been applied to this publication:

  • All monetary values have been rounded to the nearest £100
  • All population counts have been rounded to the nearest 5.
  • All percentages have been rounded to 1 decimal place.

The following suppression rules have been applied to this publication:

  • Employment outcomes based on less than 2 full person equivalent (FPE) have been suppressed.
  • Earnings outcomes based on less than 11 FPE have been suppressed.
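A minimal sketch of how these rules might be applied is shown below. The function names are hypothetical, and rounding exact halves upwards is an assumption, since the tie-breaking rule is not stated here.

```python
import math

def round_to_nearest(value, base):
    """Round to the nearest multiple of `base` (halves rounded up, by assumption)."""
    return base * math.floor(value / base + 0.5)

def publishable_count(count):
    """Population counts are rounded to the nearest 5."""
    return round_to_nearest(count, 5)

def publishable_earnings(median_earnings, fpe):
    """Earnings are suppressed below 11 FPE, otherwise rounded to the nearest £100."""
    if fpe < 11:
        return None  # suppressed
    return round_to_nearest(median_earnings, 100)

def publishable_employment_rate(percentage, fpe):
    """Employment outcomes are suppressed below 2 FPE, otherwise rounded to 1 decimal place."""
    if fpe < 2:
        return None  # suppressed
    return round(percentage, 1)
```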

Definitions

Sex

For graduates from HEIs and APs, this field is collected by HESA and more detail can be found on Student 2020/21 - Sex identifier | HESA. We filter our data to only include individuals who are recorded as ‘Male’ or ‘Female’ to avoid the risk of disclosure for individuals who are recorded as ‘Other’. 

For graduates from FECs, the field is collected in the ILR and more detail can be found on ILR Specification: Field: Sex (fasst.org.uk). For these individuals, ‘Male’ and ‘Female’ are the only possible entries in the field. 

Ethnicity

The ethnicity breakdowns provided use groupings in line with HESA and ONS published data. Detailed ethnicity breakdowns are provided in the publication and breakdowns of broader ethnic groups can be viewed in the downloadable main tables document.

Age

Age breakdowns use the age at the start of the course. This is calculated as their age on the 30th September of the academic year e.g. for individuals starting in the 2012/13 academic year, their age on the 30th September 2012.
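The age calculation can be sketched as follows. The function names are hypothetical, and academic years are assumed to be written in the "2012/13" form used in this document.

```python
from datetime import date

def age_at_course_start(date_of_birth, academic_year):
    """Age in whole years on 30 September of the academic year the course started."""
    reference = date(int(academic_year.split("/")[0]), 9, 30)
    # Subtract one if the birthday has not yet occurred by the reference date.
    had_birthday = (reference.month, reference.day) >= (date_of_birth.month, date_of_birth.day)
    return reference.year - date_of_birth.year - (0 if had_birthday else 1)

def is_young_graduate(date_of_birth, academic_year):
    """Young graduates are those under 21 at the start of their course."""
    return age_at_course_start(date_of_birth, academic_year) < 21
```

For a course starting in 2012/13, someone born on 30 September 1991 is 21 on the reference date and is therefore treated as a mature student, whereas someone born on 1 October 1991 is 20 and counts as a young graduate.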

Some of the breakdowns in this release only cover young graduates (under 21 at the start of their course). Details on the reason for this can be found in ‘Data quality and coverage - Graduate coverage’.

Subject areas 

The Higher Education Statistics Agency (HESA) are changing the way they report subjects from the 2019/20 academic year; the current Joint Academic Coding System (JACS) is being replaced by the Higher Education Classification of Subjects (HECoS). HESA have produced the Common Aggregation Hierarchy (CAH) which bridges between the two systems, and to maintain consistency across years we are using level 2 of the CAH to report breakdowns by subject area.  

The number of subject categories increases to 35, compared with 23 using the previous JACS groupings. In many cases the CAH categories map exactly to a JACS category (e.g. Medicine and dentistry, Mathematical sciences, Creative arts and design); in the remainder of cases, the CAH categories provide a more detailed split compared with JACS groups (e.g. the JACS group ‘Engineering & Technology’ is now split into ‘Engineering’ and ‘Materials and technology’; similarly, ‘Historical and Philosophical Studies’ is split into ‘History and archaeology’ and ‘Philosophy and religious studies’). More information on HECoS and CAH can be found here: https://www.hesa.ac.uk/innovation/hecos 

CAH Code Subject
CAH01-01 Medicine and dentistry
CAH02-02 Pharmacology, toxicology and pharmacy
CAH02-04 Nursing and midwifery
CAH02-05 Medical sciences
CAH02-06 Allied health
CAH03-01 Biosciences
CAH03-02 Sport and exercise sciences
CAH04-01 Psychology
CAH05-01 Veterinary sciences
CAH06-01 Agriculture, food and related studies
CAH07-01 Physics and astronomy
CAH07-02 Chemistry
CAH07-04 General, applied and forensic sciences
CAH09-01 Mathematical sciences
CAH10-01 Engineering
CAH10-03 Materials and technology
CAH11-01 Computing
CAH13-01 Architecture, building and planning
CAH15-01 Sociology, social policy and anthropology
CAH15-02 Economics
CAH15-03 Politics
CAH15-04 Health and social care
CAH16-01 Law
CAH17-01 Business and management
CAH19-01 English studies
CAH19-02 Celtic studies
CAH19-04 Languages and area studies
CAH20-01 History and archaeology
CAH20-02 Philosophy and religious studies
CAH22-01 Education and teaching
CAH23-01 Combined and general studies
CAH24-01 Media, journalism and communications
CAH25-01 Creative arts and design
CAH25-02 Performing arts
CAH26-01 Geography, earth and environmental studies

It is important to note that, even with these additional splits, each CAH subject area can still include a diverse range of subjects, some of which will lead to significantly different employment and earnings outcomes. For example, ‘subjects allied to medicine not otherwise specified’ contains courses ranging from nutrition and dietetics to biomedical sciences. 

In this version of the publication, we have provided an additional table that shows the breakdowns by JACS 4-digit subjects (see JACS 3.0: Detailed (four digit) subject codes | HESA). This was originally requested by BEIS Office of Manpower Economics and includes more granular breakdowns for all subjects. The suppression rules mean that not all subjects will have employment and earnings outcomes available.

Mode of study

The mode of study breakdown is derived from two HESA fields (XMODE01_3.3.1 | HESA and XQMODE01_3.4.1 | HESA). ‘Full-time’ graduates have XQMODE01 recorded as full-time and XMODE01 recorded as not sandwich, ‘Sandwich’ have XMODE01 recorded as sandwich and ‘Part-time’ have XQMODE01 recorded as part-time. 

For graduates from FECs, the mode variable from the ILR is used (see ILR Specification: Field: Mode of Study (fasst.org.uk)).

There are other modes of study which are not included in this analysis as they do not apply or would likely lead to small counts which would be suppressed.
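As a rough illustration of the HESA derivation described above, the sketch below maps the two fields to the three published modes. The string values ("sandwich", "full-time", "part-time") are hypothetical stand-ins for the coded values in XMODE01 and XQMODE01, and checking the sandwich flag first is an assumption, since the text does not state an order of precedence.

```python
from typing import Optional


def mode_of_study(xmode01: str, xqmode01: str) -> Optional[str]:
    """Derive the published mode of study from two HESA-style fields.

    The field values used here are illustrative labels, not real HESA codes.
    Returns None for modes that are not included in the analysis.
    """
    if xmode01 == "sandwich":
        return "Sandwich"
    if xqmode01 == "full-time":
        # XMODE01 is known not to be sandwich at this point.
        return "Full-time"
    if xqmode01 == "part-time":
        return "Part-time"
    return None
```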

Free School Meals (FSM)

For FSM, we use the FSM6 variable as used in the Pupil Premium calculations. This looks at school census records for the individuals to see if they have ever been eligible for FSM in the last six years from the date of the census.  

We use data from the Spring census when the individual was in Year 11. The Spring census is used in finalising Pupil Premium funding, meaning it is more likely to be accurate. We use the Year 11 census because it has better coverage than Sixth Form, and FE colleges do not have to return the school census.

FSM6 is used as it ensures we pick up all individuals who have seen some disadvantage during secondary school (e.g., someone could be eligible for FSM for Year 6 to Year 10 but not Year 11; using FSM6 picks this up whereas FSM would not).

In this publication, an individual is counted in the “FSM” category according to the FSM6 value on their most recent census record in KS4.
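The difference between FSM and FSM6 can be illustrated with a short sketch. It assumes each pupil has a list of (census date, eligible) records, which is a simplification of the school census, and it treats the six-year window as running up to and including the reference census date; both are assumptions made for the example.

```python
from datetime import date


def years_before(d: date, n: int) -> date:
    """Return the date n years before d (29 February falls back to 28 February)."""
    try:
        return d.replace(year=d.year - n)
    except ValueError:
        return d.replace(year=d.year - n, day=28)


def fsm6(records, census_date: date) -> bool:
    """True if the pupil was FSM-eligible at any census in the six years
    up to and including census_date. records holds (date, eligible) pairs."""
    window_start = years_before(census_date, 6)
    return any(
        eligible and window_start < recorded <= census_date
        for recorded, eligible in records
    )


# Eligible from Year 6 to Year 10 but not in Year 11 (dates are illustrative).
pupil = [
    (date(2005, 1, 20), True),
    (date(2006, 1, 19), True),
    (date(2007, 1, 18), True),
    (date(2008, 1, 17), True),
    (date(2009, 1, 15), True),
    (date(2010, 1, 21), False),
]
year_11_census = date(2010, 1, 21)
fsm_in_year_11 = dict(pupil)[year_11_census]
```

Here the Year 11 record alone would show the pupil as not eligible, while the FSM6 flag at the same census is true.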

POLAR (Participation Of Local Area) 

The participation of local areas (POLAR) classification groups areas across the UK based on the proportion of young people who participate in higher education. It looks at how likely young people are to participate in higher education across the UK and shows how this varies by area. POLAR classifies local areas into five groups (or quintiles) based on the proportion of young people who enter higher education aged 18 or 19 years old. 

In this publication, we use POLAR3. POLAR3 uses Census Area Statistics ward as the geographical area. More details can be found at Get the area-based measures data - Office for Students 

Prior Attainment

Prior attainment is the attainment of students prior to commencing their higher education course. We have calculated prior attainment based on key stage 5 qualifications recorded in the National Pupil Database (NPD), which contains data about pupils in schools and colleges in England. Due to the coverage of the NPD, we are unable to provide prior attainment breakdowns for our earliest cohorts (graduates in 2003/04 and 2004/05) or for mature students. Note also that coverage for graduates domiciled in Scotland, Wales and Northern Ireland is significantly lower than for those domiciled in England, since only those who took their KS5 qualifications in England are included. 

The majority of categories are based on point scores in A-levels. For these categories we have included Applied A levels and Vocational A levels alongside traditional A levels. Prior to the 2009/10 academic year, the available grades from an A level were A, B, C, D, E, N and U, with A being the highest, E being the lowest passing grade and N and U being considered fails. From 2009/10 onwards, an A* grade, higher than an A, was also available. Among our graduate cohorts, only graduates from 2012/13 onwards would typically have had A* available to them. To keep our categories comparable across years, our categorisation does not distinguish between A and A* grades.  

We use the following categories, listed in order of preference (i.e., if an individual satisfies the criteria of two or more categories, they are included only in the first of those categories): 

  • 4 As or more
  • 360 points
  • 300-359 points
  • 240-299 points
  • 180-239 points
  • Below 180 points
  • 1 or 2 A level passes (and no other qualifications other than AS levels)
  • BTEC (regardless of grade)
  • Other (this includes mixtures of A levels and other qualifications)

For category ‘4 As or more’, it is important to note that 3 A levels is usually enough for entry to most universities. Hence many students who might be capable of attaining four A grades would only take 3 A levels. Indeed, some schools only offer 3 A levels to their students.

For categories ‘360 points’ to ‘Below 180 points’, we use the conversion between A level grades and points listed in Table E and only consider graduates with at least 3 A level passes. Hence ‘360 points’ requires three grades of A or A*, while the threshold for ‘300-359 points’ is equivalent to three Bs and the threshold for ‘240-299 points’ is equivalent to three Cs.

Note also that graduates with one or two A level grades as well as a BTEC National Diploma would be included in the ‘BTEC’ category. This was chosen as a BTEC is considered a “full” set of qualifications while one or two A levels is usually not enough to be considered for entry to most first degree courses. 

Conversion between A level grades and point scores

A level grade Point Score
A or A* 120
B 100
C 80
D 60
E 40
N/U Not counted as one of top 3 A levels
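A minimal sketch of the points-based part of this categorisation is shown below. It assumes that the point values 100, 80, 60 and 40 in the conversion table correspond to grades B, C, D and E (consistent with three Bs giving 300 and three Cs giving 240), that the score is the sum of the best three A level passes, and that the input is simply a list of A level grades. The BTEC, ‘1 or 2 A level passes’ and ‘Other’ categories depend on other qualifications and are not modelled here.

```python
# Assumed grade-to-points mapping; N and U are fails and earn no points.
POINTS = {"A*": 120, "A": 120, "B": 100, "C": 80, "D": 60, "E": 40}


def points_category(grades):
    """Return the points-based category for a list of A level grades,
    or None when there are fewer than three A level passes."""
    passes = sorted((POINTS[g] for g in grades if g in POINTS), reverse=True)
    if len(passes) < 3:
        return None
    # Categories are applied in order of preference, so this check comes first.
    if sum(1 for p in passes if p == 120) >= 4:
        return "4 As or more"
    total = sum(passes[:3])  # best three passes only
    if total == 360:
        return "360 points"
    if total >= 300:
        return "300-359 points"
    if total >= 240:
        return "240-299 points"
    if total >= 180:
        return "180-239 points"
    return "Below 180 points"
```

Because A and A* share the same score, a graduate with A*, A, A lands in ‘360 points’ regardless of cohort, which keeps the categories comparable across years.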

Current Region 

The current region geographical location data is based on the latest address that DWP has recorded for each individual on their Customer Information System (CIS). The LEO dataset does not contain the actual address or postcode for each individual; instead, we currently have data on the Government Office Region (GOR), Local Authority District and Lower Layer Super Output Area (LSOA) where the individual lives at the end of each tax year.

The CIS is primarily updated when an individual notifies DWP or HMRC of a change of address or through the individual interacting with a tax or benefit system. Individuals who have not been matched to the CIS will not have geographical information. This does not have an adverse effect on the data analysis as ‘unmatched’ graduates are excluded from employment and earnings outcomes.  

For those matched to CIS, address data is available in nearly all cases (over 99.8%); however, for those who are not in receipt of benefits or contributing to the tax system, this information could be out of date. Even when an individual is contributing to the tax system, employee address is not a mandatory field in the data submitted to HMRC via employers’ HR systems. It is also possible that in the years soon after leaving university graduates may still use their parents’ address if they are moving frequently between rented accommodation. More work is needed to understand how big an impact this has on the address data held on CIS.

Home Region

A graduate's home region is found by using their permanent or home postcode prior to starting the course, as recorded by HESA (see Student 2016/17 - Postcode | HESA).

Residence (Living at Home or Elsewhere)

Residence information is based on term time accommodation recorded in the HESA student record/the ILR. Note that a student’s residence status may change during their studies; we use their status during their graduating year, which may be different to their residence status in earlier years. Collection of this variable is mandatory for full-time students and those on sandwich courses; coverage is lower for part-time and other modes of study (see the table below).

Coverage of residence data by Mode of Study 

Coverage: Young (under 21 at start of course) UK domiciled male and female first degree graduates from English HEIs and FECs 

Cohorts: 2003/04, 2004/05, 2008/09, 2009/10, 2010/11, 2011/12, 2012/13, 2013/14, 2014/15, 2015/16, 2016/17

Mode of Study Coverage of Residence data (%) 
Full-time 96.7
Sandwich Degree >99.9
Part-time 81.0
Other 55.2

We have presented residence information in three categories: living at parental/guardian home, living elsewhere and not known. The ‘living elsewhere’ category consists of a variety of different living arrangements; the table below breaks residence down into term time accommodation in 2012/13 and gives the proportion in each of these finer categories. In this table, the ‘not in attendance at the institution’ category includes students on an industrial placement or language year abroad, ‘own residence’ includes a student’s permanent residence, which may be either owned or rented by them, and ‘other rented accommodation’ refers to a more temporary arrangement, including renting in a flat share on a yearly basis. Note that the proportions in each category vary slightly each year, as does the categorisation used by HESA.

Breakdown of term time accommodation in 2012/13

Coverage: Young (under 21 at start of course) UK domiciled male and female first degree graduates from English HEIs and FECs - 2012/13 graduating cohort (5YAG)

Residence Term time accommodation Proportion (%)
Living at parental/guardian home Living at parental/guardian home 22.2
Living elsewhere Institution maintained property 6.6
Living elsewhere Private-sector halls 2.9
Living elsewhere Not in attendance at the institution 0.3
Living elsewhere Own residence 17.8
Living elsewhere Other rented accommodation 35.9
Living elsewhere Other 3.5
Not known Not known 8.6

Full Cycle Movement 

Full cycle graduate movement uses three variables (home region – study/provider region – current region) to indicate the migration trend for a student (e.g., “studied in their home region, but currently living elsewhere” or “left their home region to study and currently living in their study region”). 

Due to the way ‘provider region’ is defined it is possible that although studying in a different region to their ‘home region’ some of these graduates were still living in their home region and then commuting to a different region to attend university (e.g. living at home in Sheffield but commuting to a provider in the East Midlands). 

The provider region and home region geographical location variables are both from the HESA student record, and the current region geographical location data is from DWP as is explained in more detail in the above ‘current region’ section.  

If a graduate has an unknown home or current region, they were filtered out of this analysis, meaning that the cohort numbers are smaller than in other breakdowns. An individual may have an unknown home region if their home postcode is not provided by their HE provider; however, this only affects a very small proportion of graduates. Reasons for an individual having an unknown current region are explained in the previous ‘current region’ section. We also filter out mature graduates from this analysis because the home region data is unreliable for mature students. This is because the region they lived in prior to starting their course is less likely to be their ‘true’ home region, as they are more likely to have relocated in the years between school and higher education.

We have presented the full cycle movement breakdowns using five categories in the following format:

Left home region to study | Stayed in home region to study
Currently in home region, Currently in study region, Currently elsewhere | Currently in home/study region, Currently elsewhere

‘Left home region to study’ means that the graduate attended an HE provider in a region that was not their home region. ‘Stayed in home region’ means that the graduate’s HE provider was in their home region. These do not define where the graduate lived during their study period (which instead can be seen in the ‘term time residence’ section), because a graduate could move cities for HE but still be within the same region as their home region, or could commute to a different region for HE while still living at home.

The second row of variables represents the movement between their study region and their current region. If they left their home region to study, the options are that they currently live in the pre-study home region, that they currently live in the region in which they attended HE, or that they are living in a different region which is neither their home nor their study region. If the graduate studied in their home region, they can either currently be in the home/study region (meaning that their home region, study region and current region are all the same), or they could have moved elsewhere and currently be in a different region.
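The five categories can be expressed as a small decision function. The sketch below assumes regions are compared as plain strings and that an unknown home or current region is represented by None; the function name and return format are illustrative only.

```python
from typing import Optional, Tuple


def full_cycle_movement(
    home: Optional[str], study: str, current: Optional[str]
) -> Optional[Tuple[str, str]]:
    """Classify a graduate by home, study (provider) and current region.

    Returns None when home or current region is unknown, mirroring the
    filtering described above.
    """
    if home is None or current is None:
        return None
    if study != home:
        first = "Left home region to study"
        if current == home:
            second = "Currently in home region"
        elif current == study:
            second = "Currently in study region"
        else:
            second = "Currently elsewhere"
    else:
        first = "Stayed in home region to study"
        if current == home:
            second = "Currently in home/study region"
        else:
            second = "Currently elsewhere"
    return first, second
```

For instance, a graduate whose home region is Yorkshire and The Humber, who attended a provider in the East Midlands and now lives in London, would be classified as ‘Left home region to study’ and ‘Currently elsewhere’.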

Domicile Categories 

Domicile categories have been based upon graduates’ domicile prior to the start of their course, as recorded in the HESA student record for graduates from HEIs and as recorded in the ILR for graduates from FECs. Graduates have been categorised into three top-level categories – UK, EU and Non-EU. Due to data quality issues with the domicile variable on the ILR in the 2003/04 and 2004/05 academic years, we have not included non-UK domiciled graduates from FECs in the tables for these years. 

UK domiciled refers to graduates domiciled in England, Scotland, Wales or Northern Ireland prior to the start of their course. Tables 1 to 14 and 20 to 32 refer only to UK domiciled graduates.  

EU domiciled refers to graduates domiciled in the EU other than in England, Scotland, Wales or Northern Ireland. As such, graduates domiciled in Gibraltar have been classed as EU domiciled. Over the period covered by this publication, the membership of the EU has expanded and hence different graduating cohorts consist of different sets of countries. Graduates have been classed as EU domiciled if their recorded country of domicile was a member of the EU at the start of their year of graduation. The table below details for which cohort(s) each country has been designated as part of the EU domiciled category. Countries listed include all of their European Union territories; for instance, Finland includes the territory of the Åland islands.

Countries and territories included in the European Union category by graduating cohort 

Country/Territory: Graduating cohorts in which domicile is counted as EU domiciled

Austria, Belgium, Denmark, Finland, France, Germany, Gibraltar, Greece, Ireland, Italy, Luxembourg, Netherlands, Portugal, Spain, Sweden: All graduating cohorts (2003/04, 2004/05, 2008/09, 2009/10, 2010/11, 2011/12, 2012/13, 2013/14, 2014/15, 2015/16, 2016/17)

Cyprus, Czech Republic, Estonia, Hungary, Latvia, Lithuania, Malta, Poland, Slovakia, Slovenia: 2004/05, 2008/09, 2009/10, 2010/11, 2011/12, 2012/13, 2013/14, 2014/15, 2015/16, 2016/17

Bulgaria, Romania: 2008/09, 2009/10, 2010/11, 2011/12, 2012/13, 2013/14, 2014/15, 2015/16, 2016/17

Croatia: 2013/14, 2014/15, 2015/16, 2016/17
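The cohort-dependent EU categorisation in the table above can be sketched as a lookup of the first graduating cohort in which each country is listed. The dictionary below simply encodes the table; the names `FIRST_EU_COHORT` and `is_eu_domiciled`, and the use of the cohort's starting year as an integer, are assumptions made for this example.

```python
# First graduating cohort (by starting year) in which each country or
# territory is listed as EU domiciled in the table above.
FIRST_EU_COHORT = {
    **dict.fromkeys(
        ["Austria", "Belgium", "Denmark", "Finland", "France", "Germany",
         "Gibraltar", "Greece", "Ireland", "Italy", "Luxembourg",
         "Netherlands", "Portugal", "Spain", "Sweden"],
        2003,
    ),
    **dict.fromkeys(
        ["Cyprus", "Czech Republic", "Estonia", "Hungary", "Latvia",
         "Lithuania", "Malta", "Poland", "Slovakia", "Slovenia"],
        2004,
    ),
    **dict.fromkeys(["Bulgaria", "Romania"], 2008),
    "Croatia": 2013,
}


def is_eu_domiciled(country: str, cohort: str) -> bool:
    """True if the country is listed as EU domiciled for a cohort such as "2004/05"."""
    first = FIRST_EU_COHORT.get(country)
    if first is None:
        return False
    return int(cohort.split("/")[0]) >= first
```

A graduate domiciled in Poland is therefore treated as EU domiciled in the 2004/05 cohort but not in 2003/04.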

Overseas domiciled refers to graduates domiciled in countries/territories not belonging to the European Union. The Crown Dependencies of Jersey, Guernsey and the Isle of Man are not part of the UK or of the European Union and thus they have been included in this category.

Table 19 in the accompanying tables gives employment and earnings outcomes for the 20 largest countries of domicile within our data. We have followed methodology used by HESA in defining country of domicile, for instance aggregating together the various territories of France in the France total but keeping China and Hong Kong separate. 

Note that country of domicile is not the same as nationality (as recorded on the HESA student record). For instance, in 2012/13, 91% of UK domiciled graduates were UK nationals, while 7% of EU domiciled graduates and about 4% of overseas domiciled graduates were UK nationals. 

Industry data

The industry groups provided use the ONS Standard Industrial Classification (SIC) codes agreed in 2007 (SIC2007). SIC codes provide information on the type of economic activity the graduates’ employer is engaged in, not the occupation of the graduate. This has been linked to LEO using the employer enterprise reference from the IDBR.  

The IDBR covers over 2.6 million businesses in all sectors of the UK economy; however, it does not include very small businesses. To be on the IDBR, businesses must be registered for either VAT or PAYE. The Business Population Estimates publication provides figures for the number of UK businesses, 5.9 million, including the small businesses excluded from the IDBR. The IDBR therefore covers approximately 44% of the total UK business population (2.6 million out of 5.9 million). The IDBR data used in this dashboard is from datasets owned by the Office for National Statistics (ONS). The ONS does not accept responsibility for any inferences or conclusions derived from the IDBR data by third parties.

Graduates who do not have a PAYE record (e.g. self-employed) cannot be linked and will therefore be classified as ‘unknown’. A graduate’s SIC code is the industry in which they earnt the most in the tax year, and in the case where there were two industries in which the graduate earnt an equal amount, we have classified these as ‘unknown’ since one cannot be chosen. The majority of the analysis uses the 21 industry sections, however in the dashboard, the industry by subject table is expandable to the 3-digit-code level (see this ONS interactive SIC hierarchy). 
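The rule for assigning a single industry to each graduate can be illustrated as follows. The sketch assumes that each PAYE employment in the tax year is available as a (SIC section, earnings) pair and that earnings are summed within an industry before comparing; the text above does not say whether earnings are aggregated in this way, so treat that step as an assumption.

```python
from collections import defaultdict


def main_industry(paye_records) -> str:
    """Return the SIC section with the highest earnings in the tax year.

    paye_records is an iterable of (sic_section, earnings) pairs. Graduates
    with no PAYE record, or with an exact tie for the top industry, are
    classified as 'unknown'.
    """
    totals = defaultdict(int)
    for sic_section, earnings in paye_records:
        totals[sic_section] += earnings
    if not totals:
        return "unknown"
    top = max(totals.values())
    leaders = [section for section, total in totals.items() if total == top]
    return leaders[0] if len(leaders) == 1 else "unknown"
```

With records of 12,000 in section P and 3,000 in section G the graduate is assigned to P, whereas equal earnings of 5,000 in each would give ‘unknown’.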

Data matching and match rates

The HESA student records are matched to DWP’s Customer Information System (CIS; see Customer Information System – an explanation of the information held about you - GOV.UK (www.gov.uk)) using an established matching algorithm based on the following personal characteristics: National Insurance Number (NINO), forename, surname, date of birth, postcode, and sex. Some of these characteristics are simplified to make the matching process less time-intensive and allow more matches. For instance, only the first initial of the forename is used; the surname is encoded using an English sound-based algorithm called SOUNDEX (a function that turns a surname into a code representing what it sounds like, which allows some flexibility for different spellings, for example if a surname is misspelt in one of the datasets: Wilson = Willson); and for most matches only the sector of the postcode is used.
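To make the simplification step concrete, the sketch below implements the widely used American Soundex encoding together with helpers for the forename initial and postcode sector. The exact Soundex variant and key construction used in the real matching process are not described here, so this is an illustration under those assumptions rather than the production algorithm; the function names are hypothetical.

```python
# Letter-to-digit groups used by American Soundex.
_CODES = {}
for _letters, _digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")]:
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(surname: str) -> str:
    """Encode a surname as a letter followed by three digits."""
    letters = [c for c in surname.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    previous = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != previous:
            encoded += code
        previous = code  # vowels reset the run, so repeats are coded again
    return (encoded + "000")[:4]


def postcode_sector(postcode: str) -> str:
    """Outward code plus the first character of the inward code."""
    compact = postcode.replace(" ", "").upper()
    return f"{compact[:-3]} {compact[-3]}"


def simplified_key(forename: str, surname: str, postcode: str):
    """Simplified name and postcode fields of the kind described above."""
    return forename[:1].upper(), soundex(surname), postcode_sector(postcode)
```

Under this encoding ‘Wilson’ and ‘Willson’ both become W425, so a record with either spelling produces the same simplified surname.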

The NINO is not present on the HESA student record itself and has been matched on where possible by fuzzy matching with personal data from the Student Loans Company. This process increases the likelihood of finding a match with CIS. Accordingly, groups less likely to take a student loan, for example international students who are not eligible for one, are likely to have lower match rates. 

All records accessed for analysis are anonymous so that individuals cannot be identified. The personal identifying records used in the actual matching process are accessed under strict security controls. 

There are five match processes carried out, ranging from the highest quality and most likely to be accurate (Green) to the lowest quality and most likely to be a false match (Red-Amber). The table below shows the criteria for each match type.  

Once the HESA records have been matched to the CIS the corresponding tax and benefits records for that individual can then be linked to their HESA record. 

All match rate analysis in this chapter is restricted to the HESA population covered in this publication, that is, UK domiciled, first degree graduates from UK Higher Education Providers. 

Table: Criteria for each match strength (Y indicates a match, - indicates no match) 

Match quality NINO (National Insurance number) Forename (initial) Surname (soundex) Date of birth Sex Postcode (sector)
1. Green Y At least four of forename, surname, DOB, sex and postcode.
2. Amber Any three of forename, surname, DOB, sex and postcode.
3. Green-Amber Y Y Y
4. Amber-Red Y Y One of sex or postcode.
5. Red-Amber Y Y (full postcode)

 

Overall match rates 

In this section we consider match rates to the CIS spine. This differs slightly from the match rates displayed in the main tables of this publication, which also include those without a CIS match but with a record of further study in the given year. 

The table below shows the overall CIS match rates for graduates who studied full-time as well as the proportion with a tax or benefit record. Potential reasons for not being able to find a P14 record, despite having a match to the CIS spine, include: earning below the Lower Earnings Limit (LEL), self-employment, moving abroad and death. 

Table: Match rates for UK domiciled first degree graduates at English HEIs, by year of graduation 
Coverage: UK domiciled male and female first-degree graduates from English HEIs. 
Cohorts: 2003/04, 2004/05, 2005/06, 2006/07, 2007/08, 2008/09, 2009/10, 2010/11, 2011/12, 2012/13, 2013/14, 2014/15, 2015/16, 2016/17 

Academic year Matched to CIS spine (%) Matched to tax/benefit record (%) 
2003/04 95 94 
2004/05 95 95 
2005/06 96 95 
2006/07 96 96 
2007/08 97 97 
2008/09 97 97 
2009/10 97 97 
2010/11 97 97 
2011/12 97 97 
2012/13 98 97 
2013/14 97 97 
2014/15 97 96 
2015/16 95 95 
2016/17 95 94 

The table above shows that the match rate was high for the most recent cohorts: 95% of full-time graduates in 2016/17 were matched using the CIS and 98% were matched in 2012/13 (our 5YAG breakdown for the 18/19 Tax Year), and almost all of these had at least one tax record or out-of-work benefit record. This compares to a match rate of 95% of graduates in 2003/04. The higher match rates for more recent cohorts are at least partly explained by the fact that the CIS holds the most recent names and addresses for individuals, so if the details change after someone graduates there is less chance that they will be matched. There is a dip for the most recent cohorts, which is explained below.

These rates are slightly lower than in previous publications, particularly in recent academic years, where they have dropped by 3 or 4 percentage points. Older academic years have been considerably less affected by this change in match rates. The individuals affected were matched to multiple NINOs where there was an equal likelihood of each NINO being correct. As a result, neither match was used, to avoid potential incorrect matches. The cause of the issue has now been understood and will be corrected in the next LEO dataset.

Match rate by graduate characteristic 

The table below shows match rates by sex. The match rate for females is slightly lower in the earlier years than for males, but this difference is negligible or non-existent in recent cohorts. As the CIS holds the latest information about an individual, anyone that has changed their name since graduation will have a different name on the CIS compared to their HESA record. This particularly affects females, due to a higher likelihood than males of changing their name upon marriage. 

Table: CIS match rate by sex 
Coverage: UK domiciled male and female first degree graduates from English HEIs. 

Academic year Female (%) Male (%) 
2003/04 93 98 
2004/05 93 98 
2005/06 94 98 
2006/07 95 98 
2007/08 96 98 
2008/09 96 98 
2009/10 97 98 
2010/11 97 98 
2011/12 97 98 
2012/13 98 98 
2013/14 97 97 
2014/15 97 97 
2015/16 95 95 
2016/17 95 95 

The match rates were also compared for different ethnic groups out of the UK-domiciled students. There was little consistent difference between the groups, the only exception being graduates whose self-declared ethnicity was Chinese, where the match rate was 92% in 2016/17. Further investigation showed that this was most likely due to the ethnically Chinese forenames and surnames being switched on one of the databases. This is more common for Chinese names because the family name traditionally comes before the individual name. This hypothesis is further corroborated by the fact that ethnically Chinese students with common English names have match rates that are similar to graduates from other ethnic groups.  

The number of forenames or surnames an individual has can affect the match rate, because with multiple names it is more likely that they will not all be recorded, or there may be forenames recorded as surnames or vice versa. Analysis of the match rates showed that those with at least two surnames had a slightly lower match rate than those with only one. 

Match rates are noticeably lower for non-UK domiciled graduates. The main reason for this is that LEO relies on graduates having been issued with a National Insurance number to match them to an employment record. However, international students who have no intention of working or claiming benefits in this country are less likely to apply for a National Insurance number and so would not appear in the LEO data. 

It may be that international graduates remain in the UK but not in work or receiving any type of benefit, and so do not require a National Insurance number. However, our expectation is that international graduates are likely to have moved abroad, with the majority returning to their home country. Recent Home Office reports confirm that the vast majority of non-EU international students who were granted a visa to study in the UK left in time (97.4% - Fourth report on statistics being collected under the exit checks programme - GOV.UK (www.gov.uk)).  

Some international students may have been issued with a National Insurance number but will not appear in the UK tax or benefit system for the tax years included in this release. These graduates are recorded as ‘activity not captured’, even if they are in employment in another country.  

This publication uses the current region of residence data supplied by the Department for Work and Pensions (DWP) to identify graduates who were not living in the UK for the majority of the tax year. These graduates are removed from the denominator to help improve the accuracy of the employment outcomes calculations.  

Other reasons for lower match rates among non-UK domiciled graduates include a higher likelihood of names being misspelt and lower take-up of, or eligibility for, student loans, meaning we would not be able to attach a NINO to the HESA data to aid the matching process.

Get in touch

Media enquiries 

Press Office News Desk, Department for Education, Sanctuary Buildings, Great Smith Street, London SW1P 3BT. 

Tel: 020 7783 8300 

Other enquiries/feedback 

Email: HE.LEO@education.gov.uk 

 

Help and support

Contact us

If you have a specific enquiry about Graduate outcomes (LEO) statistics and data:

LEO

Email: HE.LEO@education.gov.uk
Contact name: Simon Childs
Telephone: 07920594501

Press office

If you have a media enquiry:

Telephone: 020 7783 8300

Public enquiries

If you have a general enquiry about the Department for Education (DfE) or education:

Telephone: 037 0000 2288

Opening times:
Monday to Friday from 9.30am to 5pm (excluding bank holidays)