Methodology

Graduate outcomes (LEO): postgraduate outcomes

Published

Introduction

Background to the Longitudinal Educational Outcomes (LEO) dataset 

The Small Business, Enterprise and Employment Act 2015 enabled government, for the first time, to link higher education and tax data together to chart the transition of graduates from higher education into the workplace (for more information on the legal powers governing the dataset, please see section 78 of the Small Business, Enterprise and Employment Act 2015 and sections 87-91 of the Education and Skills Act 2008).

One of the advantages of linking data from existing administrative sources is that it provides a unique insight into the destinations of graduates without imposing any additional data collection burdens on universities, employers or members of the public. Compared to existing sources of graduate outcomes data, it is also based on a considerably larger sample, does not rely on survey methodology, and can track outcomes across time to a greater extent than was previously possible. 

The LEO dataset links information about students, including:

  • personal characteristics such as sex, ethnic group and age
  • education, including schools, colleges and higher education institution attended, courses taken, and qualifications achieved
  • employment and income
  • benefits claimed

 

It is created by combining data from the following sources:

  • the National Pupil Database (NPD), held by the Department for Education (DfE)
  • Higher Education Statistics Agency (HESA) data on students at UK publicly funded higher education institutions and some alternative providers, held by DfE
  • Individualised Learner Record data (ILR) on students at further education institutions, held by DfE
  • employment data from the Real Time Information System (RTI). RTI contains information formerly collected on the P45 and P14 forms, held by Her Majesty’s Revenue and Customs (HMRC)
  • data from the Self-Assessment tax return, held by HMRC
  • the National Benefit Database, Labour Market System and Juvos data, held by the Department for Work and Pensions (DWP)

By combining these sources, we can look at the progress of higher education leavers into the labour market.  

The privacy notice explaining how personal data in this project is shared and used can be found at Longitudinal education outcomes study: how we use and share data - GOV.UK (www.gov.uk) (opens in a new tab)

Data quality and coverage

Employment and earnings data 

The employment data covers those with P45 and P14 records submitted through the Pay As You Earn (PAYE) system. These figures have been derived from administrative IT systems that, as with any large-scale recording system, are subject to possible errors with data entry and processing. While some data cleaning was necessary, the resulting data appears to provide a good reflection of an individual’s employment and earnings for the year.

For the purposes of collecting taxes, only the tax year of employment is needed; accurate start and end dates within the tax year are not required. For this reason, issues encountered with the employment data included records with duplicate dates and records with dates which were invalid for our intended use (for example, where an employment start date occurred after the end date).

Additionally, a number of returns are found to have missing start dates due to the employer not forwarding a timely P45. The default dates recorded in the dataset are either 6 April (the first day of the tax year) or, where only an end date is known, the day before that end date. Similarly, for records where the employment is known to have come to an end within a tax year but the end date is not known, the record is given a default 5 April end date, the last day of the tax year. 

Individuals can also have overlapping spells of employment. Before carrying out analysis, the P45 and P14 records for each individual were cleaned and then merged into a single record to give a longitudinal picture of their employment and a total sum of their earnings in each tax year. Where uncertain dates appeared, other employments or benefits records for that individual were used to create a merged employment spell with a known start and end date.

Example 1: Two employment spells 

Spell A                               |---------| 

Spell B                  |-----------------|--------------|  

Merged result               |----------------------| 

In example 1, the start date of spell B is uncertain with its possible range shown in bold. In this instance we can merge the two records resulting in an employment spell with the start date of spell A and an end date from spell B. 
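The merging of overlapping spells can be sketched as below. This is a minimal illustration only, assuming every spell already has a known start and end date; the function name `merge_spells` and the tuple layout are hypothetical, and the handling of uncertain dates described above is not reproduced.

```python
from datetime import date


def merge_spells(spells):
    """Merge overlapping (start, end) date pairs into continuous spells.

    Assumes all dates are known; uncertain dates are out of scope here.
    """
    merged = []
    for start, end in sorted(spells):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous spell: keep its start, extend its end if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


spell_a = (date(2018, 6, 1), date(2018, 9, 30))
spell_b = (date(2018, 8, 15), date(2018, 12, 31))
single_spell = merge_spells([spell_a, spell_b])
```

Here the two overlapping spells collapse into one spell running from the start of spell A to the end of spell B.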

Any remaining uncertain dates were imputed through random sampling of gap lengths from a frequency distribution that was constructed from gaps with a known length. 
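As an illustration of this kind of imputation, the sketch below draws gap lengths at random in proportion to how often each length occurs among gaps of known length. The function name, the use of whole days and the seed parameter are assumptions for the example, not details taken from the LEO processing code.

```python
import random
from collections import Counter


def impute_gap_lengths(known_gaps, n_missing, seed=None):
    """Sample n_missing gap lengths from the empirical distribution of known gaps."""
    rng = random.Random(seed)
    frequency = Counter(known_gaps)          # gap length -> number of occurrences
    lengths = list(frequency)
    weights = [frequency[length] for length in lengths]
    # Lengths that occur more often among known gaps are drawn more often.
    return rng.choices(lengths, weights=weights, k=n_missing)


imputed = impute_gap_lengths([3, 3, 10, 30], n_missing=5, seed=1)
```

Every imputed value is one of the observed gap lengths, so the imputed gaps follow the same distribution as the known ones on average.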

DWP/HMRC Coverage

Beginning in April 2013, the P45 reporting system was phased out in favour of the Real Time Information (RTI) system, which requires employers to submit information to HMRC each time an employee is paid. This system has now reached full deployment. RTI offers substantial improvements to the P45 system in terms of data coverage, since employers must now provide information on all their employees if even one employee of the company is paid above the Lower Earnings Limit. The move to RTI will mean that data coverage is high for the 2014/15 to 2018/19 tax years used in this publication. 

We cannot currently distinguish between part-time and full-time work in the LEO data. This is further discussed in “Methodology - Annualised earnings”.

As well as employment data for those who pay tax through PAYE, the employment data additionally includes those who pay tax through self-assessment. 

Self-assessment forms are completed by a range of people who for example are self-employed, have received income from investments, savings or shares and by people who have complicated tax affairs. A list of people who are required to complete a self-assessment return can be found at www.gov.uk/self-assessment-tax-returns/who-must-send-a-tax-return (opens in a new tab). We have recently obtained a new self-assessment earnings dataset from HMRC, which contains variables on: 

  • Earnings received through employment (PAYE)
  • Income from partnership enterprises
  • Income from sole-trader enterprises
  • Total earnings for the tax year from the self-assessment form.

We have used the income from partnership enterprises and income from sole-trader enterprises to ascertain graduates who are self-employed and their earnings from self-employment enterprises. We have taken a sum of these two variables, and where the sum of these is greater than £0, graduates are classified as self-employed. Where self-employment earnings are used, the earnings amount is the sum of these two variables. 
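The classification rule above can be expressed as a short sketch. The function names are hypothetical, and treating the two income variables as plain numbers (with no missing values) is an assumption made for the example.

```python
def self_employment_earnings(partnership_income, sole_trader_income):
    """Self-employment earnings are the sum of the two income variables."""
    return partnership_income + sole_trader_income


def is_self_employed(partnership_income, sole_trader_income):
    """A graduate is classified as self-employed when the sum is greater than 0."""
    return self_employment_earnings(partnership_income, sole_trader_income) > 0
```

For instance, a graduate with £0 partnership income and £1,200 sole-trader income is classified as self-employed with £1,200 of self-employment earnings, whereas a combined total of £0 or less is not classified as self-employed.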

The data received from DWP includes an overseas flag that identifies individuals who are known to be living overseas. Details on when an individual informs HMRC can be found at Tax if you leave the UK to live abroad - GOV.UK (www.gov.uk) (opens in a new tab). For this analysis, individuals who are known to be overseas are excluded as their earnings and outcomes data is likely to be incomplete.

HESA Coverage (“Postgraduate” and “first degree” graduates)

In this publication, we include graduates from Higher Education Institutions (HEIs) only, due to small numbers studying postgraduate degrees in Alternative Providers and Further Education Colleges. Alternative providers did not need to return data on postgraduate students before the 16/17 academic year and from the 17/18 academic year they were required to return data on all postgraduate students (see Figure 15a - Alternative provider qualifications obtained by level of qualification obtained 2015/16 to 2018/19 | HESA (opens in a new tab)). However, the 17/18 academic year is not covered in this publication so they are excluded here.

Postgraduates are identified using the “XPQUAL01”  (see XPQUAL01_2.20.1 | HESA (opens in a new tab)) and “XQLEV501” (see XQLEV501_1.14.1 | HESA (opens in a new tab)) HESA variables.  For this publication, postgraduates are identified as records where XPQUAL01 is ‘1’ and XQLEV501 is ‘1’ (Postgraduate research) or ‘2’ (Postgraduate taught). It should be noted that integrated Masters degrees are not included in this. 

The text also draws comparisons to “first degree” graduates. “First degree” graduates are identified using the same two HESA variables. Here, we again filter for XPQUAL01 to be ‘1’, and XQLEV501 to be ‘3’ (first degree).
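The two filters can be summarised in a small sketch. It assumes, for illustration only, that both HESA variables are held as single-character strings; the function name and the returned labels are hypothetical.

```python
# XQLEV501 codes used in this publication, as described in the text above.
LEVEL_LABELS = {
    "1": "postgraduate research",
    "2": "postgraduate taught",
    "3": "first degree",
}


def graduate_group(xpqual01, xqlev501):
    """Return the population a record falls into, or None if it is out of scope."""
    if xpqual01 != "1":
        return None  # records where XPQUAL01 is not '1' are not selected
    return LEVEL_LABELS.get(xqlev501)
```

A record with XPQUAL01 of ‘1’ and XQLEV501 of ‘2’ would be labelled as postgraduate taught; any other XQLEV501 value, or an XPQUAL01 value other than ‘1’, falls outside both populations.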

More details on the qualifications that are in each XQLEV501 grouping can be found from HESA's “QUAL” variable  from which XQLEV501 is derived (Student 2019/20 - Qualification awarded | HESA (opens in a new tab)).

Methodology

Time period

The earliest time period for which employment and earnings data is reported is one year after graduation (YAG). This refers to the first full tax year after graduation. Hence, for the 2016/17 graduation cohort, the figures one year after graduation refer to employment and earnings outcomes in the 2018/19 tax year. This time period was chosen because using the tax year that overlaps with the graduation date would mean that graduates are unlikely to have been engaged in economic activity for the whole tax year.

Academic year 2016/17   |------------------|    

Tax year 2017/18                                    |------------------|

Tax year 2018/19                                                            |------------------|  

In this publication, we look at one, three, five and ten years after graduation, focussing on the 2018/19 tax year with some comparative analysis to the 2014/15 to 2017/18 tax years. Thus, we look at employment and earnings outcomes in the 2018/19 tax year for graduates from the 2007/08, 2012/13, 2014/15 and 2016/17 academic years. For the 2014/15 tax year, the corresponding cohorts are graduates from the 2003/04, 2008/09, 2010/11 and 2012/13 academic years; the cohorts for the other tax years are determined in the same way.
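The relationship between graduation cohort, years after graduation and tax year can be written as a one-line calculation. The sketch below is illustrative; the function name and the "YYYY/YY" string format are assumptions for the example.

```python
def tax_year_for(academic_year, years_after_graduation):
    """Return the tax year in which outcomes are measured for a graduation cohort.

    Graduation falls in the calendar year after the academic year starts, the
    tax year overlapping graduation is skipped, and the first full tax year
    after graduation counts as one year after graduation.
    """
    academic_start = int(academic_year[:4])
    tax_start = academic_start + 1 + years_after_graduation
    return f"{tax_start}/{(tax_start + 1) % 100:02d}"


one_yag = tax_year_for("2016/17", 1)
ten_yag = tax_year_for("2007/08", 10)
```

This reproduces the cohorts listed above: the 2016/17 cohort one year after graduation and the 2007/08 cohort ten years after graduation both map to the 2018/19 tax year.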

The table below shows this for all tax years (columns) and academic years of graduation (rows). The cells represent years after graduation (YAG). An asterisk indicates a cohort available in this publication (one, three, five and ten YAG):

Academic year of graduation | 2014/15 | 2015/16 | 2016/17 | 2017/18 | 2018/19
2003/04                     | 10 YAG* | 11 YAG  | 12 YAG  | 13 YAG  | 14 YAG
2004/05                     | 9 YAG   | 10 YAG* | 11 YAG  | 12 YAG  | 13 YAG
2005/06                     | 8 YAG   | 9 YAG   | 10 YAG* | 11 YAG  | 12 YAG
2006/07                     | 7 YAG   | 8 YAG   | 9 YAG   | 10 YAG* | 11 YAG
2007/08                     | 6 YAG   | 7 YAG   | 8 YAG   | 9 YAG   | 10 YAG*
2008/09                     | 5 YAG*  | 6 YAG   | 7 YAG   | 8 YAG   | 9 YAG
2009/10                     | 4 YAG   | 5 YAG*  | 6 YAG   | 7 YAG   | 8 YAG
2010/11                     | 3 YAG*  | 4 YAG   | 5 YAG*  | 6 YAG   | 7 YAG
2011/12                     | 2 YAG   | 3 YAG*  | 4 YAG   | 5 YAG*  | 6 YAG
2012/13                     | 1 YAG*  | 2 YAG   | 3 YAG*  | 4 YAG   | 5 YAG*
2013/14                     |         | 1 YAG*  | 2 YAG   | 3 YAG*  | 4 YAG
2014/15                     |         |         | 1 YAG*  | 2 YAG   | 3 YAG*
2015/16                     |         |         |         | 1 YAG*  | 2 YAG
2016/17                     |         |         |         |         | 1 YAG*

Employment outcomes

We refer to a graduate as matched if they have been successfully matched to the Department for Work and Pensions’ Customer Information System (CIS) or if they have been matched to a further study instance on the HESA Student Record. Graduates who have not been matched to CIS or a further study record are referred to as unmatched. These graduates were not found on CIS, either because they had never been issued with a National Insurance number or because the personal details provided from the HESA data did not fulfil the matching criteria. Unmatched graduates, along with graduates who were matched but are known to be overseas, are excluded from calculations performed for UK domiciled populations. They are not included in the outcomes categories in Tables 1 to 14 and 20 to 32.

UK domiciled graduates who have been matched and are not known to be overseas are then placed in one of five outcomes categories. These are: 

  • Activity not captured
  • No sustained destination
  • Sustained employment only
  • Sustained employment with or without further study
  • Sustained employment, further study or both.

Unmatched graduates are included in the denominator when calculating employment outcomes for non-UK domiciled graduates (Tables 16, 17 and 33) and are placed in a separate ‘unmatched’ outcome category. For these populations the match rates are much lower and non-UK graduates are much more likely to leave the UK after graduation. Including these graduates in the calculations means we get a better indication of the proportion of graduates who have stayed in the UK to work or study after graduation, making it easier to compare countries with vastly different match rates.

For non-UK domiciled graduates, the employment outcome categories should not be used as an indication of success in finding employment after graduation, as it is likely that the majority of these graduates who are ‘unmatched’ or in ‘activity not captured’ are employed outside of the UK.

More information on match rates is given in section: Data matching and match rates. If a graduate is unmatched on the CIS but has a further study record for the tax year in question, then they are counted as being in further study, and hence are not in the unmatched category. 

Activity not captured 

Graduates in this category have been successfully matched to CIS but do not have any employment, out-of-work benefits or further study records in the tax year of interest. Reasons for appearing in this category include: moving out of the UK after graduation for either work or study, voluntarily leaving the labour force or death. 

No sustained destination 

Graduates who have an employment or out-of-work benefits record in the tax year in question but were not classified as being in ‘sustained employment’ and do not have a further study record. 

Sustained employment defined by P45 data 

The ‘sustained employment’ measure aims to count the proportion of graduates in sustained employment in the UK following the completion of their course. The definition of sustained employment is consistent with the definition used for 16-19 accountability and the outcome-based success measures published for adult further education (see Further education: outcome-based success measures, Academic Year 2017/18 – Explore education statistics – GOV.UK (explore-education-statistics.service.gov.uk) (opens in a new tab)). This definition looks at employment activity in the six-month October to March period of each tax year. A graduate needs to be in paid employment for at least one day in each of five out of the six months between October and March of a given tax year to be classified as being in ‘sustained employment’ in that tax year. If they are not employed in March, they must additionally have at least one day in employment in the April of the same calendar year to be counted as being in sustained employment.

For instance, a graduate employed from 1st October 2017 to 5th January 2018 and then again from 30th March 2018 onwards would be classed as being in sustained employment in 2017/18 as although they are not employed in February 2018, they are employed in the other five months in the period from October 2017 to March 2018.  

However, a graduate employed from 1st October 2017 to 28th February 2018 but not employed in March 2018 would not be considered as being in sustained employment unless they had a day in employment in April 2018.
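The PAYE-based rule and the two examples above can be checked with a short sketch. It assumes cleaned, merged spells held as (start, end) date pairs, with an end of None meaning the employment is ongoing; the function names and this data layout are hypothetical.

```python
from datetime import date


def employed_in_month(spells, year, month):
    """True if any spell covers at least one day of the given calendar month."""
    first = date(year, month, 1)
    next_first = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return any(
        start < next_first and (end is None or end >= first)
        for start, end in spells
    )


def in_sustained_employment(spells, tax_year_start):
    """Apply the five-out-of-six-months rule for October to March.

    tax_year_start is the calendar year in which the tax year begins,
    for example 2017 for the 2017/18 tax year.
    """
    y = tax_year_start
    months = [(y, 10), (y, 11), (y, 12), (y + 1, 1), (y + 1, 2), (y + 1, 3)]
    worked = [employed_in_month(spells, yr, m) for yr, m in months]
    if sum(worked) < 5:
        return False
    if worked[-1]:
        return True
    # Not employed in March: a day of employment in April is also required.
    return employed_in_month(spells, y + 1, 4)


first_example = [(date(2017, 10, 1), date(2018, 1, 5)), (date(2018, 3, 30), None)]
second_example = [(date(2017, 10, 1), date(2018, 2, 28))]
```

Under these assumptions the first example is classed as sustained employment in 2017/18, while the second is not unless a spell covering a day in April 2018 is added.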

Sustained employment defined by self-assessment data 

This publication incorporates self-assessment data into measures of sustained employment. Self-assessment data captures the activity of individuals with income that is not taxed through PAYE, such as income from self-employment, savings and investments, property rental, and shares. Currently, only data from the 2013/14 tax year onwards is available for inclusion in LEO. For this reason, we have only published employment and earnings outcomes for these tax years in this publication.

For the purposes of this publication, individuals are classed as being in sustained employment in the tax year if they meet our definition of sustained employment based on PAYE or have returned a self-assessment form stating that they have received income from self-employment and their earnings from a Partnership or Sole-Trader enterprise are more than £0 (profit from self-employment). These individuals may or may not have an additional PAYE record. Individuals who have received income through self-assessed means other than self-employment, such as through rental of property, and do not have a PAYE record, are not classed as being in employment (either sustained or unsustained). Those who have made a loss from self-employment are currently excluded from sustained employment as we are unable to distinguish between those who made a loss and those who submitted self-assessment returns for other reasons at this moment in time. 

Further study 

A graduate is defined as being in further study if they have a valid higher education study record at any UK HEI on the HESA Student Record or designated English Alternative Provider (AP) on the AP HESA Student Record that overlaps the relevant tax year. Further study undertaken at further education colleges is not currently reflected in these figures but we will review this in future publications. The further study does not have to be at postgraduate level to be counted. The purpose of this category is to identify how students spent their time in the relevant tax year and as such cannot be used to calculate the proportion of graduates who go on to postgraduate study. We have not counted instances lasting 14 days or less, a change from previous publications. Additionally, students enrolled on further education courses, on some initial teacher training enhancement, booster and extension courses, whose study status is dormant or who were on sabbatical are excluded from this indicator in line with our previous methodology. 

As a tax year overlaps with two academic years, some students would be coming to the end of their further study in the tax year in question and some would be starting their further study. For example, those who graduated in the 2015/16 academic year and went straight on to a one-year masters course would be counted as being in further study in the 2017/18 tax year (one year after graduation) as their course would finish in July 2017, after the start of that tax year. If a graduate from 2015/16 waited a year before starting their one-year masters course then they would typically also be counted as being in further study in the 2017/18 tax year (one year after graduation), for instance if their course started in September 2017.

Sustained employment only 

Graduates are considered to be in sustained employment only if they have a record of sustained employment (as defined either via the P45 or self-assessment data) but no record of further study (as defined above).

Sustained employment with or without further study 

Sustained employment with or without further study includes all graduates with a record of sustained employment (defined either via the P45 or self-assessment data), regardless of whether they also have a record of further study (as defined above).

Sustained employment, further study or both 

Sustained employment, further study or both includes all graduates with a record of sustained employment or further study. This category includes all graduates in the ‘sustained employment with or without further study’ category as well as those with a further study record only.

It is important to note that our definition of sustained employment does not distinguish between the different types of work that graduates are engaged in and so cannot provide an indication of the proportion of graduates who are employed in graduate occupations. Furthermore, we cannot distinguish between full-time and part-time employment. 
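Because the last three categories are cumulative rather than mutually exclusive, a graduate can fall into more than one of them. The sketch below illustrates this under simplifying assumptions: each input is a pre-computed true/false flag for the tax year in question, `other_activity` stands for any employment or out-of-work benefits record that does not amount to sustained employment, and the function name is hypothetical.

```python
def outcome_categories(matched, further_study, sustained_employment, other_activity):
    """Return every outcome category a graduate falls into for one tax year."""
    if not matched and not further_study:
        # Not found on CIS and no further study record.
        return ["Unmatched"]
    categories = []
    if sustained_employment and not further_study:
        categories.append("Sustained employment only")
    if sustained_employment:
        categories.append("Sustained employment with or without further study")
    if sustained_employment or further_study:
        categories.append("Sustained employment, further study or both")
    if categories:
        return categories
    if other_activity:
        return ["No sustained destination"]
    return ["Activity not captured"]
```

For example, a matched graduate with sustained employment and a further study record is counted in both ‘sustained employment with or without further study’ and ‘sustained employment, further study or both’, but not in ‘sustained employment only’.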

The table below summarises the combinations of activity that lead to a graduate being unmatched or falling into one of the five outcomes categories.

Table: Classification of graduate outcomes (Y indicates that the column is true for that outcome) 

LEO category | Further study | Sustained employment | Any employment | Out-of-work benefits
Unmatched | - | Unmatched to CIS | Unmatched to CIS | Unmatched to CIS
Activity not captured  -
No sustained destination --Y
--
-YY
Sustained employment only Y-
Y
Sustained employment, with or without further studyYY-
YY
Y-
Sustained employment, further study or both Unmatched to CIS Unmatched to CIS Unmatched to CIS 
Y
Y
Y
Y
Y
YY
Further study, with or without sustained employment | Y | Unmatched to CIS | Unmatched to CIS | Unmatched to CIS
 | Y | - | - | -
 | Y | - | Y | -
 | Y | - | - | Y
 | Y | - | Y | Y
 | Y | Y | - | -
 | Y | Y | Y | -
 | Y | Y | - | Y
 | Y | Y | Y | Y

 

Annualised earnings

Earnings figures are only reported for those classified as being in sustained employment via PAYE and where we have a valid earnings record from the P14 or where they are self-employed and have reported income of over £0 for that tax year. Those in further study are excluded, as their earnings would be more likely to relate to part-time jobs. Note that our publications prior to December 2017 did not include earnings from self-assessment. Under the new methodology, some graduates will have increased earnings if they have PAYE earnings as well as self-employment earnings. However, there are also more graduates included in the earnings calculations – those who have self-employment earnings but do not have qualifying PAYE earnings. This group typically has lower earnings than graduates with PAYE earnings. Thus, the reported median earnings under the new methodology are not necessarily higher than under the old methodology. See our December 2017 publication for more details on the effect of this methodology change.

Under our new methodology, PAYE and earnings from self-employment are treated differently. 

For each graduate who has been paid through the PAYE system, the earnings reported for them for a given tax year are divided by the number of days recorded in the employment spell in that same tax year. This provides an average daily wage, which is then multiplied by the number of days in the tax year to create their annualised earnings. 

This calculation has been used to maintain consistency with figures reported for further education learners after study. It provides students with an indication of the earnings they might receive once in stable and sustained employment. 

For earnings from self-employment, raw earnings are used. Due to the nature of the Self-Assessment tax return, dates of self-employment are not required and therefore are not available to annualise the self-employment earnings in the same way that PAYE earnings are annualised. We are therefore assuming that the Self-Assessment tax return relates to activity that took place over the full tax year. 

Where a graduate has income from both sustained employment paid through PAYE and through self-employment, the earnings used for this graduate are the sum of their annualised PAYE earnings and their raw earnings from self-employment. It should be noted that a graduate with a PAYE record (that does not reach the ‘sustained’ criteria) and a self-employment earnings record will be counted as being in ‘sustained employment’ but we do not include their earnings in the earnings calculation. This is to avoid the risk of annualising PAYE data that could be based on a very short earnings spell.
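The earnings rules described in this section can be sketched as follows. This is an illustration under stated assumptions rather than the production calculation: the function names and parameters are hypothetical, the tax year is taken to have 365 days by default, and None is used to mean that earnings are not included in the calculation.

```python
def annualise_paye(paye_earnings, days_employed, days_in_tax_year=365):
    """Average daily wage multiplied by the number of days in the tax year."""
    if days_employed <= 0:
        raise ValueError("days_employed must be positive")
    return paye_earnings / days_employed * days_in_tax_year


def reported_earnings(paye=None, days_employed=0, paye_sustained=False,
                      self_employment=0.0, days_in_tax_year=365):
    """Combine annualised PAYE earnings with raw self-employment earnings."""
    if paye is None:
        # Self-employment only: raw earnings are used, provided they exceed 0.
        return self_employment if self_employment > 0 else None
    if not paye_sustained:
        # A PAYE record that is not 'sustained' is never annualised, so the
        # graduate is left out of the earnings calculation.
        return None
    total = annualise_paye(paye, days_employed, days_in_tax_year)
    if self_employment > 0:
        total += self_employment
    return total


paye_only = reported_earnings(paye=10000, days_employed=200, paye_sustained=True)
paye_and_self = reported_earnings(paye=10000, days_employed=200,
                                  paye_sustained=True, self_employment=2000)
```

In this example, £10,000 earned over 200 days gives a daily wage of £50 and annualised earnings of £18,250; adding £2,000 of raw self-employment earnings gives £20,250.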

The annualised earnings calculated are slightly higher than the raw earnings reported in the tax year. This is because the earnings of those who did not work for the entire tax year will be higher when annualised. The difference between the annualised and raw figures decreases as time elapses after graduation. Overall median annualised earnings one year after graduation are around £650 higher than the overall median raw earnings reported in the data. Five years after graduation, the overall median annualised earnings are less than £300 higher than the overall median raw earnings. The trend follows for both graduates who are in PAYE employment only and graduates who earnt income from both PAYE employment and self-employment.  

Information provided on the Self-Assessment tax return includes a field on earnings through PAYE employment, which we have used only where P14 earnings are not present.

All earnings presented are nominal. They represent the cash amount an individual was paid and are not adjusted for inflation (the general increase in the price of goods and services). The exception to this is the figure and table showing the nominal earnings compared to real-term earnings using Consumer Prices Index Including Owner Occupiers’ Housing Costs  (CPIH) to account for inflation. 

It should be noted that LEO does not currently contain data on the average number of hours worked per week. Therefore, we cannot distinguish between part-time and full-time employment/earnings. We appreciate that this is likely to impact some demographics more than others and are working towards having this data in future iterations of LEO so that it can be accounted for.

Calculating earnings difference between sexes

Previously, the percentage used to compare male and female earnings was calculated as the difference between the medians divided by the female median earnings. This year, we have altered the calculation and use male median earnings as the denominator. This is in line with the calculation used by the ONS in their gender pay gap publication - Gender pay gap in the UK - Office for National Statistics (ons.gov.uk) (opens in a new tab).
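The effect of changing the denominator can be seen in a small sketch. The function names are hypothetical, and taking the difference as male median minus female median is an assumption about the sign convention.

```python
def earnings_gap_percent(male_median, female_median):
    """Current calculation: difference in medians as a percentage of the male median."""
    return (male_median - female_median) / male_median * 100


def earnings_gap_percent_previous(male_median, female_median):
    """Previous calculation: the same difference as a percentage of the female median."""
    return (male_median - female_median) / female_median * 100
```

With illustrative medians of £30,000 for males and £27,000 for females, the current calculation gives a difference of 10%, whereas the previous calculation would have given roughly 11.1% for the same figures.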

Rounding and suppression rules 

We apply rounding and suppression rules to help minimise the risk of someone being identifiable from our data (also known as Statistical Disclosure Control). All calculations in this publication are performed on the rounded figures.

The following rounding rules have been applied to this publication:

  • All monetary values have been rounded to the nearest £100
  • All population counts have been rounded to the nearest 5.
  • All percentages have been rounded to 1 decimal place.

The following suppression rules have been applied to this publication:

  • Employment outcomes based on fewer than 2 full person equivalents (FPE) have been suppressed.
  • Earnings outcomes based on fewer than 11 FPE have been suppressed.
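The rounding and suppression rules can be illustrated with a short sketch. The function names are hypothetical, None is used here to represent a suppressed value, and rounding exact halves upwards is an assumption, since the treatment of ties is not stated.

```python
import math


def round_to(value, base):
    """Round to the nearest multiple of base (halves rounded up, by assumption)."""
    return math.floor(value / base + 0.5) * base


def publish_count(count):
    """Population counts are rounded to the nearest 5."""
    return round_to(count, 5)


def publish_earnings(median_earnings, n_fpe):
    """Earnings are suppressed below 11 FPE, otherwise rounded to the nearest 100."""
    if n_fpe < 11:
        return None
    return round_to(median_earnings, 100)


def publish_employment_rate(percentage, n_fpe):
    """Employment outcomes are suppressed below 2 FPE, otherwise rounded to 1 decimal place."""
    if n_fpe < 2:
        return None
    return round(percentage, 1)
```

For example, median earnings of £25,449 based on 11 FPE would be shown as £25,400, while the same figure based on 10 FPE would be suppressed.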

Definitions

Degree level (postgraduates)

The level of qualification is grouped into three categories (level 7 taught, level 7 research and level 8). These are defined using the HESA “QUAL” variable (Student 2019/20 - Qualification awarded | HESA (opens in a new tab)).

Graduates were broadly grouped into Level 7 and Level 8, more commonly known as Master’s degrees and doctoral degrees respectively. Enhanced undergraduate courses (e.g. MMath, MEng) that give you a postgraduate-level qualification are not included in our Level 7 population. These degree courses are included in our first degree population as you do not need to have completed a Level 6 qualification to apply for these courses. 

Level 7 data was also broken down into Level 7 (taught) for taught master’s degrees and Level 7 (research) for research master’s degrees. In addition, Postgraduate Certificate in Education (PGCE) and Masters in Business Administration (MBA) were split from the overall Level 7 (taught) numbers. For subject level breakdowns, these two courses were also split from the other 35 subject categories.

Sex

This field is collected by HESA and more detail can be found on Student 2020/21 - Sex identifier | HESA (opens in a new tab). We filter our data to only include individuals who are recorded as ‘Male’ or ‘Female’ to avoid the risk of disclosure for individuals who are recorded as ‘Other’.

Subject areas

The Higher Education Statistics Agency (HESA) are changing the way they report subjects from the 2019/20 academic year; the current Joint Academic Coding System (JACS) is being replaced by the Higher Education Classification of Subjects (HECoS). HESA have produced the Common Aggregation Hierarchy (CAH), which bridges the two systems, and to maintain consistency across years we are using level 2 of the CAH to report breakdowns by subject area. In this publication we use version 1.3 of CAH, in line with HESA recommendations. Version 1.3 is also consistent with our previous publications, aiding comparisons between them.

The number of subject categories increases to 35, compared with 23 using the previous JACS groupings. In many cases the CAH categories map exactly to a JACS category (e.g. Medicine and dentistry, Mathematical sciences, Creative arts and design); in the remainder of cases, the CAH categories just provide a more detailed split compared with JACS groups (e.g. the JACS group ‘Engineering & Technology’ is now split into ‘Engineering’ and ‘Materials and technology’ separately; similarly for ‘Historical and Philosophical Studies’ split into ‘History and archaeology’ and ‘Philosophy and religious studies’). More information on HECoS and CAH can be found here: https://www.hesa.ac.uk/innovation/hecos (opens in a new tab)

CAH Code | Subject
CAH01-01 | Medicine and dentistry
CAH02-02 | Pharmacology, toxicology and pharmacy
CAH02-04 | Nursing and midwifery
CAH02-05 | Medical sciences
CAH02-06 | Allied health
CAH03-01 | Biosciences
CAH03-02 | Sport and exercise sciences
CAH04-01 | Psychology
CAH05-01 | Veterinary sciences
CAH06-01 | Agriculture, food and related studies
CAH07-01 | Physics and astronomy
CAH07-02 | Chemistry
CAH07-04 | General, applied and forensic sciences
CAH09-01 | Mathematical sciences
CAH10-01 | Engineering
CAH10-03 | Materials and technology
CAH11-01 | Computing
CAH13-01 | Architecture, building and planning
CAH15-01 | Sociology, social policy and anthropology
CAH15-02 | Economics
CAH15-03 | Politics
CAH15-04 | Health and social care
CAH16-01 | Law
CAH17-01 | Business and management
CAH19-01 | English studies
CAH19-02 | Celtic studies
CAH19-04 | Languages and area studies
CAH20-01 | History and archaeology
CAH20-02 | Philosophy and religious studies
CAH22-01 | Education and teaching
CAH23-01 | Combined and general studies
CAH24-01 | Media, journalism and communications
CAH25-01 | Creative arts and design
CAH25-02 | Performing arts
CAH26-01 | Geography, earth and environmental studies

It is important to note that, even with these additional splits, each CAH subject area can still include a diverse range of subjects, some of which will lead to significantly different employment and earnings outcomes. For example, ‘subjects allied to medicine not otherwise specified’ contains courses ranging from nutrition and dietetics to biomedical sciences. 

In this version of the publication, we have provided an additional table that shows the breakdowns by JACS 4-digit subjects (see JACS 3.0: Detailed (four digit) subject codes | HESA (opens in a new tab)). The suppression rules mean that not all subjects will have employment and earnings outcomes available.

Current Region

The current region data is based on the latest address that DWP has recorded for each individual on their Customer Information System (CIS). The LEO dataset does not contain the actual address or postcode for each individual; we currently have data on the Government Office Region (GOR), Local Authority District and Lower Layer Super Output Area (LSOA) where the individual lives at the end of each tax year.

The CIS is primarily updated when an individual notifies DWP or HMRC of a change of address or through the individual interacting with a tax or benefit system. Individuals who have not been matched to the CIS will not have geographical information. This does not have an adverse effect on the data analysis as ‘unmatched’ graduates are excluded from employment and earnings outcomes.  

For those matched to CIS, address data is available in nearly all cases (over 99.8%). However, for those who are not in receipt of benefits or contributing to the tax system, this information could be out of date. Even when contributing to the tax system, employee address is not a mandatory field in the data submitted to HMRC via employers’ HR systems. It is also possible that, in the years soon after leaving university, graduates may still use their parents’ address if they are moving frequently between rented accommodation. More work is needed to understand how big an impact this has on the address data held on CIS.

Comparison with first degree graduates

The definition of “first degree” graduates is explained above in the “Data quality and coverage” section. Here we highlight some other factors to consider when making these comparisons. 

Firstly, students who go on to postgraduate study are typically expected to have achieved a higher level of attainment in their first degree. The May 2018 publication (SFR_Template_NatStats, publishing.service.gov.uk) included analysis of progression to further study and of the income distribution by degree classification.    

Secondly, the distribution of graduates by subject studied is likely to be different for postgraduates compared to first degree graduates (more information on population distributions by subject studied can be found at What do HE students study? | HESA). It could be that postgraduate degrees tend to be in higher-earning subjects, and that this leads to the overall postgraduate average being higher.  

The IFS published a report in September 2020 that looks into these questions, as well as other factors, in more detail: Earnings returns to postgraduate degrees in the UK (ifs.org.uk).

Domicile categories

Domicile categories have been based upon graduates’ domicile prior to the start of their course, as recorded in the HESA student record for graduates from HEIs. Graduates have been categorised into three top-level categories – UK, EU and Non-EU. 

UK domiciled refers to graduates domiciled in England, Scotland, Wales or Northern Ireland prior to the start of their course.

EU domiciled refers to graduates domiciled in the EU other than in England, Scotland, Wales or Northern Ireland. As such, graduates domiciled in Gibraltar have been classed as EU domiciled. Over the period covered by this publication, the membership of the EU has expanded and hence different graduating cohorts consist of different sets of countries. Graduates have been classed as EU domiciled if their recorded country of domicile was a member of the EU at the start of their year of graduation. The table below details the cohort(s) for which each country has been designated as part of the EU domiciled category. Countries listed include all of their European Union territories; for instance, Finland includes the territory of the Åland islands. 

Countries and territories included in the European Union category by graduating cohort 

Country/Territory | Graduating cohorts in which domicile is counted as EU domiciled
Austria, Belgium, Denmark, Finland, France, Germany, Gibraltar, Greece, Ireland, Italy, Luxembourg, Netherlands, Portugal, Spain, Sweden | All graduating cohorts (2003/04, 2004/05, 2008/09, 2009/10, 2010/11, 2011/12, 2012/13, 2013/14, 2014/15, 2015/16, 2016/17)
Cyprus, Czech Republic, Estonia, Hungary, Latvia, Lithuania, Malta, Poland, Slovakia, Slovenia | 2004/05, 2008/09, 2009/10, 2010/11, 2011/12, 2012/13, 2013/14, 2014/15, 2015/16, 2016/17
Bulgaria, Romania | 2008/09, 2009/10, 2010/11, 2011/12, 2012/13, 2013/14, 2014/15, 2015/16, 2016/17
Croatia | 2013/14, 2014/15, 2015/16, 2016/17

Overseas domiciled refers to graduates domiciled in countries/territories not belonging to the European Union. The Crown Dependencies of Jersey, Guernsey and the Isle of Man are not part of the UK or of the European Union, and thus they have been included in this category.  
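
The categorisation rules above can be illustrated with a short sketch. This is not the code used to produce the publication; it is a minimal example that assumes country of domicile is held as a plain country name and graduating cohort as a string such as "2013/14". The names COHORTS, EU_FIRST_COHORT and domicile_category are hypothetical and chosen for illustration only.

```python
# Graduating cohorts listed in the table above, in chronological order.
COHORTS = ["2003/04", "2004/05", "2008/09", "2009/10", "2010/11", "2011/12",
           "2012/13", "2013/14", "2014/15", "2015/16", "2016/17"]

UK_COUNTRIES = {"England", "Scotland", "Wales", "Northern Ireland"}

# First listed cohort in which each country/territory is counted as EU domiciled
# (hypothetical lookup built from the table above).
EU_FIRST_COHORT = {}
EU_FIRST_COHORT.update(dict.fromkeys(
    ["Austria", "Belgium", "Denmark", "Finland", "France", "Germany",
     "Gibraltar", "Greece", "Ireland", "Italy", "Luxembourg", "Netherlands",
     "Portugal", "Spain", "Sweden"], "2003/04"))
EU_FIRST_COHORT.update(dict.fromkeys(
    ["Cyprus", "Czech Republic", "Estonia", "Hungary", "Latvia", "Lithuania",
     "Malta", "Poland", "Slovakia", "Slovenia"], "2004/05"))
EU_FIRST_COHORT.update(dict.fromkeys(["Bulgaria", "Romania"], "2008/09"))
EU_FIRST_COHORT["Croatia"] = "2013/14"


def domicile_category(country: str, cohort: str) -> str:
    """Return 'UK', 'EU' or 'Non-EU' for a country of domicile and cohort."""
    if cohort not in COHORTS:
        raise ValueError(f"Cohort not covered by the table: {cohort}")
    if country in UK_COUNTRIES:
        return "UK"
    first = EU_FIRST_COHORT.get(country)
    if first is not None and COHORTS.index(cohort) >= COHORTS.index(first):
        return "EU"
    # Everything else, including Jersey, Guernsey and the Isle of Man.
    return "Non-EU"


if __name__ == "__main__":
    for country, cohort in [("Croatia", "2012/13"), ("Croatia", "2013/14"),
                            ("Gibraltar", "2003/04"), ("Jersey", "2016/17")]:
        print(country, cohort, domicile_category(country, cohort))
```

Under this sketch, a graduate domiciled in Croatia is placed in the Non-EU (overseas) category for the 2012/13 cohort and in the EU category from 2013/14 onwards, matching the table.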

The accompanying tables give employment and earnings outcomes for the 20 largest countries of domicile within our data. We have followed the methodology used by HESA in defining country of domicile, for instance aggregating the various territories of France into the France total but keeping China and Hong Kong separate. 

Note that country of domicile is not the same as nationality (as recorded on the HESA student record). For instance, in 2012/13, 91% of UK domiciled graduates were UK nationals, while 7% of EU domiciled graduates and about 4% of overseas domiciled graduates were UK nationals. 

Data matching and match rates

The HESA student records are matched to DWP's Customer Information System (CIS; see Customer Information System: an explanation of the information held about you, GOV.UK) using an established matching algorithm based on the following personal characteristics: National Insurance Number (NINO), forename, surname, date of birth, postcode, and sex. Some of these characteristics are simplified to make the matching process less time-intensive and to allow more matches. For instance, only the first initial of the forename is used; the surname is encoded using an English sound-based algorithm called SOUNDEX, a function that turns a surname into a code representing what it sounds like, so that a match can still be made if the surname is misspelt in one of the datasets (for example, Wilson = Willson); and for most matches only the sector of the postcode is used. 
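
As an illustration of the simplifications described above, the sketch below implements the standard American Soundex rules and builds a simplified matching key. It is not the matching code used for LEO: the exact Soundex variant used in the matching process is not specified here, and the function names soundex and simplified_key are hypothetical.

```python
def soundex(name: str) -> str:
    """Encode a surname with standard American Soundex (letter + 3 digits)."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result += digit
        prev = digit  # a vowel resets the previous code to ""
    return (result + "000")[:4]


def simplified_key(forename: str, surname: str, postcode: str) -> tuple:
    """Forename initial, surname Soundex code and postcode sector."""
    pc = "".join(postcode.upper().split())
    # Postcode sector: outward code plus the first character of the inward code.
    sector = pc[:-3] + " " + pc[-3]
    return (forename.strip()[0].upper(), soundex(surname), sector)


if __name__ == "__main__":
    print(soundex("Wilson"), soundex("Willson"))
    print(simplified_key("William", "Willson", "SW1P 3BT"))
```

Both spellings, Wilson and Willson, produce the same code (W425), which is why a misspelt surname can still be matched.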

The NINO is not present on the HESA student record itself and has been added, where possible, by fuzzy matching with personal data from the Student Loans Company. This process increases the likelihood of finding a match with CIS. Accordingly, groups less likely to take a student loan, for example international students who are not eligible for one, are likely to have lower match rates. 

All records accessed for analysis are anonymous so that individuals cannot be identified. The personal identifying records used in the actual matching process are accessed under strict security controls. 

There are five match processes carried out, ranging from the highest quality and most likely to be accurate (Green) to the lowest quality and most likely to be a false match (Red-Amber). The table below shows the criteria for each match type.  

Once the HESA records have been matched to the CIS the corresponding tax and benefits records for that individual can then be linked to their HESA record. 

All match rate analysis in this chapter is restricted to the HESA population covered in this publication, that is, UK domiciled, first degree graduates from UK Higher Education Providers. 

Table: Criteria for each match strength (Y indicates a match, - indicates no match) 

Match quality | NINO (National Insurance number) | Forename (initial) | Surname (soundex) | Date of birth | Sex | Postcode (sector)
1. Green Y At least four of forename, surname, DOB, sex and postcode.
2. Amber Any three of forename, surname, DOB, sex and postcode.
3. Green-Amber Y Y Y
4. Amber-Red Y Y One of sex or postcode.
5. Red-Amber Y Y (full postcode) 

Overall match rates

In this section we consider match rates to the CIS spine. This differs slightly from the match rates displayed in the main tables of this publication, which also include those without a CIS match but with a record of further study in the given year. 

The table below shows the overall CIS match rates for graduates who studied full-time as well as the proportion with a tax or benefit record. Potential reasons for not being able to find a P14 record, despite having a match to the CIS spine, include: earning below the Lower Earnings Limit (LEL), self-employment, moving abroad and death. 

Table: Match rates for UK domiciled postgraduates at English HEIs, by year of graduation
Coverage: UK domiciled male and female postgraduates from English HEIs.
Cohorts: 2003/04, 2004/05, 2005/06, 2006/07, 2007/08, 2008/09, 2009/10, 2010/11, 2011/12, 2012/13, 2013/14, 2014/15, 2015/16, 2016/17 

Academic year | Matched to tax/benefit record (%) | Matched to CIS spine (%)
2003/04 | 87 | 88
2004/05 | 89 | 89
2005/06 | 90 | 91
2006/07 | 91 | 92
2007/08 | 92 | 93
2008/09 | 92 | 93
2009/10 | 93 | 93
2010/11 | 93 | 93
2011/12 | 93 | 93
2012/13 | 95 | 96
2013/14 | 95 | 96
2014/15 | 95 | 96
2015/16 | 95 | 96
2016/17 | 96 | 96

The table above shows that the match rate was high for the most recent cohorts: 96% of postgraduates graduating in 2016/17 were matched to the CIS, as were 96% of those graduating in 2012/13 (the cohort used for our five years after graduation breakdown for the 2018/19 tax year), and almost all of these had at least one tax record or out-of-work benefit record. This compares to a match rate of 88% for graduates in 2003/04. The higher match rates for more recent cohorts are at least partly explained by the fact that the CIS holds the most recent names and addresses for individuals, so if these details change after someone graduates there is less chance that they will be matched. 

These rates are slightly lower than in previous publications, particularly for recent academic years, where they have dropped by 3 or 4 percentage points. Older academic years have been considerably less affected by this change. The individuals affected were matched to multiple NINOs, each with an equal likelihood of being correct. As a result, neither match was used, in order to avoid potentially incorrect matches. The cause of the issue is now understood and will be corrected in the next LEO dataset.

Match rate by graduate characteristic

The table below shows match rates by sex. The match rate for females is lower than for males in the earlier years, but this difference is negligible or non-existent in recent cohorts. As the CIS holds the latest information about an individual, anyone who has changed their name since graduation will have a different name on the CIS compared to their HESA record. This particularly affects females, who are more likely than males to change their name upon marriage. 

Table: CIS match rate by sex
Coverage: UK domiciled male and female postgraduates from English HEIs. 

Academic year | Female (%) | Male (%)
2003/04 | 84 | 94
2004/05 | 86 | 94
2005/06 | 88 | 95
2006/07 | 90 | 95
2007/08 | 91 | 95
2008/09 | 92 | 95
2009/10 | 92 | 95
2010/11 | 93 | 95
2011/12 | 93 | 94
2012/13 | 95 | 96
2013/14 | 96 | 96
2014/15 | 96 | 96
2015/16 | 96 | 96

Match rates were also compared across ethnic groups among UK-domiciled students. There was little consistent difference between the groups; the only exception was graduates whose self-declared ethnicity was Chinese, for whom the match rate was 81% in 2016/17. Further investigation showed that this was most likely due to the forenames and surnames of ethnically Chinese graduates being switched on one of the databases. This is more common for Chinese names because the family name traditionally comes before the individual name. This hypothesis is further corroborated by the fact that ethnically Chinese students with common English names have match rates similar to those of graduates from other ethnic groups.  

The number of forenames or surnames an individual has can affect the match rate, because with multiple names it is more likely that they will not all be recorded, or there may be forenames recorded as surnames or vice versa. Analysis of the match rates showed that those with at least two surnames had a slightly lower match rate than those with only one. 

Match rates are noticeably lower for non-UK domiciled graduates. The main reason for this is that LEO relies on graduates having been issued with a National Insurance number to match them to an employment record. However, international students who have no intention of working or claiming benefits in this country are less likely to apply for a National Insurance number and so would not appear in the LEO data. 

It may be that international graduates remain in the UK but are not in work or receiving any type of benefit, and so do not require a National Insurance number. However, our expectation is that international graduates are likely to have moved abroad, with the majority returning to their home country. Recent Home Office reports confirm that the vast majority of non-EU international students who were granted a visa to study in the UK left in time (97.4%; see the Fourth report on statistics being collected under the exit checks programme, GOV.UK).  

Some international students may have been issued with a National Insurance number but will not appear in the UK tax or benefit system for the tax years included in this release. These graduates are recorded as ‘activity not captured’, even if they are in employment in another country.  

This publication uses the current region of residence data supplied by the Department for Work and Pensions (DWP) to identify graduates who were not living in the UK for the majority of the tax year. These graduates are removed from the denominator to help improve the accuracy of the employment outcomes calculations.  

Other reasons for lower match rates among non-UK domiciled graduates include a higher likelihood of names being misspelt and lower take-up of, or eligibility for, student loans, meaning we would not be able to attach a NINO to the HESA data to aid the matching process. 

Get in touch

Media enquiries

Press Office News Desk, Department for Education, Sanctuary Buildings, Great Smith Street, London SW1P 3BT.

Tel: 020 7783 8300

Other enquiries/feedback

Email: HE.LEO@education.gov.uk

Help and support

Contact us

If you have a specific enquiry about Graduate outcomes (LEO): postgraduate outcomes statistics and data:

Higher education statistics team (LEO)

Email: he.leo@education.gov.uk
Contact name: Simon Childs
Telephone: 07920 594501

Public enquiries

If you have a general enquiry about the Department for Education (DfE) or education:

Telephone: 037 0000 2288

Opening times:
Monday to Friday from 9.30am to 5pm (excluding bank holidays)