LSC

WBL Success Rates Methodology - Master File Production

1 This definition covers the process of combining the Fuzzy Match Files. After some final processing and de-duplication of cases, the output of this methodology is the new measures WBL Success Rates Master File.

Purpose

2 Learner outcome data used by the LSC as part of Performance Review, and the data used by ALI during inspections, focus on success rates extracted from the individualised student record (ISR) and individualised learner record (ILR).

3 This methodology shows how to use the outputs of the Fuzzy Data Matching methodology to create the WBL Success Rates Master File. The output of this methodology is the basis for all WBL Success Rates calculations based around the new measures.

Relevant Collections

4 The method is run from data collected in the most recent freeze for each year, allowing for success rates to be calculated up until the most recent month.

  • ILR (WBL) 2005/06
  • ILR (WBL) 2004/05
  • ILR (WBL) 2003/04
  • ILR (WBL) 2002/03
  • Interim ILR (WBL) 2001/02

Source Data

5 The method uses various other files to aid in producing the new measures master file.

Derived Variables and Output Datasets

6 The methodology produces the following derived variable(s)

Field Name | Label | Dataset
trans | Flag – learner has transferred to a different course at the same provider | WBL Success Rates Master File
trans_provider | Flag – learner has transferred to a course at another provider | WBL Success Rates Master File
leavers | Flag – learner has an actual end date on this aim | WBL Success Rates Master File
starters | Flag – case counts as a start in the success rates, as it is not a transfer to another provider | WBL Success Rates Master File
continuing | Flag – learner is continuing in study at the most recent date we have records for | WBL Success Rates Master File
frm_ach | Flag – learner has achieved the framework | WBL Success Rates Master File
frm_ach_timely | Flag – learner has achieved the framework either before the planned end date or within 31 days of that date | WBL Success Rates Master File
nvq_ach | Flag – learner has achieved the NVQ | WBL Success Rates Master File
nvq_ach_timely | Flag – learner has achieved the NVQ either before the planned end date or within 31 days of that date | WBL Success Rates Master File
miss | Flag – the learner went missing between years before an achievement or end date had been recorded | WBL Success Rates Master File
sfl_0102 to sfl_0506 | Flag – which year’s ILR the case came from | WBL Success Rates Master File
plan_break | Flag – learner is on a planned break from learning | WBL Success Rates Master File
livel01_0102 to livel01_0506 | Flag – provider was active in this year | WBL Success Rates Master File
startyr | The academic year in which learning began | WBL Success Rates Master File
expendyr | The academic year in which learning was expected to end | WBL Success Rates Master File
actendyr | The academic year in which learning actually ended | WBL Success Rates Master File
hybridendyr | The academic year which is the later of expendyr and actendyr | WBL Success Rates Master File
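
The "timely" flags above depend on a 31-day window after the planned end date. As a rough illustration only, the check might be sketched as below. The function name and the two date arguments are hypothetical, and the sketch assumes that "within 31 days" includes the 31st day.

```python
from datetime import date, timedelta

def is_timely(achieved_on, planned_end, grace_days=31):
    """Return True if the achievement falls on or before the planned end
    date plus the grace period (assumed to be inclusive of day 31)."""
    if achieved_on is None or planned_end is None:
        return False  # no achievement, or no planned end date to compare with
    return achieved_on <= planned_end + timedelta(days=grace_days)

# Planned end 31 July 2005, so the window is assumed to close on 31 August 2005.
print(is_timely(date(2005, 8, 31), date(2005, 7, 31)))
print(is_timely(date(2005, 9, 1), date(2005, 7, 31)))
```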

7 The method produces the following datasets

Detailed definition

8 This methodology follows on from preceding methodologies, which are carried out in the following order:

9 The methodology is followed by these methodologies:

10 As an output of the fuzzy matching process we have a set of files that can be combined and will produce a consistently matched dataset across all the years involved. This is because L03s have been made consistent across years.

11 The first step is to combine all the files using the provider number (L01), learner number (L03), programme type (A15) and the intermediate matching variable (matchv). Matchv denotes either the area of learning, in the case of NVQ-only programmes, or the sector framework code, in the case of apprenticeships.
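
A minimal sketch of this combining step is given below, in Python rather than the software used to produce the file. It assumes each year's Fuzzy Match File has been read into a list of dictionaries. The function name, the input layout and the rule that a later year's non-empty value replaces an earlier one are all assumptions made for illustration; only the four key fields and the sfl_ flags come from this definition.

```python
YEARS = ["0102", "0203", "0304", "0405", "0506"]
KEY = ("L01", "L03", "A15", "matchv")

def combine_files(files_by_year):
    """files_by_year maps a year label such as "0405" to a list of records.
    Returns one combined case per (L01, L03, A15, matchv) key."""
    cases = {}
    for year in YEARS:  # process the oldest year first
        for record in files_by_year.get(year, []):
            key = tuple(record[name] for name in KEY)
            # A new case starts with every sfl_ flag set to 0.
            case = cases.setdefault(key, {f"sfl_{y}": 0 for y in YEARS})
            # Assumption: a later year's non-empty value replaces an earlier one.
            case.update({k: v for k, v in record.items() if v is not None})
            case[f"sfl_{year}"] = 1  # this case drew data from this year
    return list(cases.values())
```

Keying the dictionary on the four matching fields means a learner on the same programme at the same provider collapses into a single case however many years they appear in.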

12 The derived variables relating to the years of learning are then calculated. This is done by iteratively checking each year from 1993 through to 2025 to determine whether the planned end, actual end and start dates occur within that year. The extreme date range is used in order to catch all the data, as there are some very unlikely dates recorded in the ILR.
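
The year check might be sketched as follows. This is an illustration under stated assumptions, not the production code: it assumes an academic year runs from 1 August to 31 July and is labelled by the calendar year in which it starts, and the function names are hypothetical. The hybridendyr rule (the later of expendyr and actendyr) follows the derived variable table.

```python
from datetime import date

def academic_year(d, first=1993, last=2025):
    """Check each year in turn and return the one whose academic year
    (assumed 1 August to 31 July) contains the date d, else None."""
    if d is None:
        return None
    for year in range(first, last + 1):
        if date(year, 8, 1) <= d <= date(year + 1, 7, 31):
            return year
    return None  # date falls outside the range checked

def hybrid_end_year(expendyr, actendyr):
    """Later of the expected and actual end years, ignoring missing values."""
    years = [y for y in (expendyr, actendyr) if y is not None]
    return max(years) if years else None

startyr = academic_year(date(2005, 9, 1))     # falls in the year starting 2005
expendyr = academic_year(date(2006, 7, 31))   # last day of the same year
```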

13 Learners are then checked to see which cases went missing between years. Each case has a set of variables called sfl_0102 through to sfl_0506; these flag which years the case has drawn data from. Therefore, if a case is in one year but not the next and has no recorded end date then that case is flagged as having gone missing in the derived variable ‘miss’.
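
As an illustrative sketch, the missing-between-years check could look like the following. The sfl_ flag names come from this definition; the function name and the actual_end field name are hypothetical.

```python
YEARS = ["0102", "0203", "0304", "0405", "0506"]

def went_missing(case):
    """Return True if the case appears in one year's data but not the next
    and has no recorded actual end date."""
    if case.get("actual_end") is not None:
        return False  # an end date was recorded, so the case is not missing
    for this_year, next_year in zip(YEARS, YEARS[1:]):
        if case.get(f"sfl_{this_year}") and not case.get(f"sfl_{next_year}"):
            return True
    return False
```

A case seen only in the final year cannot be flagged by this check, because there is no following year to compare against.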

14 Each case is then assigned an age band based upon the age of the learner at the start of the aim. Every case is assigned to one of two age bands: either “16-18” or “19+”.
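
A small sketch of the banding is shown below. It assumes the learner's date of birth is available and that any learner under 19 at the start of the aim falls in the “16-18” band; the function and argument names are hypothetical.

```python
from datetime import date

def age_on(dob, on):
    """Age in whole years on a given date."""
    before_birthday = (on.month, on.day) < (dob.month, dob.day)
    return on.year - dob.year - before_birthday

def age_band(dob, start):
    """Band by age at the start of the aim (assumption: under 19 is "16-18")."""
    return "19+" if age_on(dob, start) >= 19 else "16-18"

# A learner who turns 19 the day after the aim starts is still in "16-18".
print(age_band(date(1986, 9, 2), date(2005, 9, 1)))
```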

15 Next, where several remaining cases share the same values of the matching variables used at the start of this process, the duplicates are removed so that only the case with the earliest start date remains, provided that this does not mean removing any achievements from the file. This ensures that the same information is not counted twice.
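
One possible reading of this de-duplication rule is sketched below. It assumes that "not removing any achievements" means a duplicate carrying frm_ach or nvq_ach is always kept; that interpretation, the function name and the start field name are assumptions for illustration.

```python
KEY = ("L01", "L03", "A15", "matchv")

def deduplicate(cases):
    """Within each group sharing the matching variables, keep the case with
    the earliest start date, plus any case that records an achievement."""
    groups = {}
    for case in cases:
        key = tuple(case[name] for name in KEY)
        groups.setdefault(key, []).append(case)
    kept = []
    for group in groups.values():
        earliest = min(group, key=lambda c: c["start"])
        for case in group:
            achieved = case.get("frm_ach") or case.get("nvq_ach")
            if case is earliest or achieved:
                kept.append(case)
    return kept
```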

16 Cases that share the same UPIN and L03 and are leavers who have not achieved are then examined. Among the cases that meet these criteria, any case with an earlier end date than the others will be marked as a transfer and will therefore not feature in success rate calculations. Transfers are flagged in the derived variable ‘trans’.

17 The derived variable ‘starters’ is calculated as being the opposite of trans, so that the case counts as a start if and only if it is not a transfer.
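
The transfer and starter flags might be derived along the following lines. This is a sketch, assuming that a leaver is a case with an actual end date and that "earlier than the others" means earlier than the latest end date in the group; the function name and the actual_end field name are hypothetical.

```python
def flag_transfers(cases):
    """Set trans and starters on each case (dictionaries are updated in place)."""
    groups = {}
    for case in cases:
        case["trans"] = 0
        is_leaver = case.get("actual_end") is not None
        achieved = case.get("frm_ach") or case.get("nvq_ach")
        if is_leaver and not achieved:
            # Only non-achieving leavers are compared with one another.
            groups.setdefault((case["UPIN"], case["L03"]), []).append(case)
    for group in groups.values():
        latest_end = max(c["actual_end"] for c in group)
        for case in group:
            if case["actual_end"] < latest_end:
                case["trans"] = 1  # an earlier exit, treated as a transfer
    for case in cases:
        case["starters"] = 1 - case["trans"]  # a start if and only if not a transfer
    return cases
```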

18 The WBL Success Rates Live UPIN Files are then matched into the remaining data. These files flag which providers returned data in a particular year and allow for success rates to be produced only in years for which the provider was returning data to the LSC.
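
The layout of the Live UPIN Files is not described here, so the sketch below simply assumes they can be reduced to a set of active provider numbers per year and that cases are matched on L01. The function name and input structure are hypothetical; the livel01_ flag names come from the derived variable table.

```python
YEARS = ["0102", "0203", "0304", "0405", "0506"]

def add_live_flags(cases, live_providers_by_year):
    """live_providers_by_year maps a year label to the set of provider
    numbers that returned data in that year (assumed layout)."""
    for case in cases:
        for year in YEARS:
            active = case["L01"] in live_providers_by_year.get(year, set())
            case[f"livel01_{year}"] = int(active)
    return cases
```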

19 Any cases that are flagged with variable ‘miss’ are then assigned an actual end year of the last year they were seen in.

20 Any learner who is not flagged as a transfer in variable ‘trans’ but who has gone missing is marked as a leaver so that they count as a non-achievement in the success rates.

21 Any case without an actual end date that has not gone missing between years is flagged as ‘continuing’.
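
The last three rules can be sketched together as below. It assumes actendyr is stored as the calendar year in which the academic year starts, so that the 2004/05 collection maps to 2004; that convention, the function name and the actual_end field name are assumptions for illustration.

```python
# Assumed mapping from collection label to the starting calendar year.
YEAR_START = {"0102": 2001, "0203": 2002, "0304": 2003, "0405": 2004, "0506": 2005}

def finalise_status(case):
    """Apply the missing, leaver and continuing rules to one case in place."""
    has_end = case.get("actual_end") is not None
    missing = bool(case.get("miss"))
    if missing:
        seen = [y for y in YEAR_START if case.get(f"sfl_{y}")]
        if seen:
            case["actendyr"] = YEAR_START[seen[-1]]  # last year the case was seen
        if not case.get("trans"):
            case["leavers"] = 1  # counts as a non-achievement
    # Continuing: no actual end date and not missing between years.
    case["continuing"] = int(not has_end and not missing)
    return case
```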

22 Variables involved in the matching process that are not needed in success rate calculations are removed from the file. The remaining data is then saved as the WBL Success Rates Master File.

Sample Code

23 The following sample code is available

Creator: Analysis and MI Team

Date issued: 31 March 2006

Date created: 14 February 2006

Document ref.: \\records.lsc.local\NAT\23 LrngSkillsPolicyInfrastr\23-07 DataCollectAlysis\23-07-03 LrnrDataAlysisDiss\nat-wblsuccessratesmethodology-masterfileproduction-report-31mar2006.doc

LSC office

Learning and Skills Council
Cheylesmore House Quinton Road Coventry CV1 2WT
T 0845 019 4170 F 024 7682 3675 www.lsc.gov.uk/

Last Modified: 31 Mar 06