A CSO Frontier Series Output - What is this?
Before using personal administrative data for statistical purposes, the CSO removes all identifying personal information. This includes the Personal Public Service Number (PPSN), a unique number used by people in Ireland to access social welfare benefits, personal taxation and other public services. A pseudonymised Protected Identifier Key (PIK) is created by the CSO when the PPSN is removed. This PIK is unique and non-identifiable and is only used by the CSO.
Using the PIK enables the CSO to link and analyse data for statistical purposes, while protecting the security and confidentiality of the individual data. All records in the matched datasets are pseudonymised and the results are in the form of statistical aggregates which do not identify any individuals.
The CSO is committed to broadening the range of high-quality information it provides on societal and economic change. The large increase in the volume and nature of secondary data in recent years poses a variety of challenges and opportunities for institutes of national statistics. Joining secondary data sources in a safe manner across public service bodies, while adhering to statistical and data protection legislation, can provide new analysis and outputs to support decision-making and accountability in a way that is not possible using discrete datasets. Furthermore, a coordinated approach to data integration can lead to cost savings, greater efficiency and a reduction in duplication.
The CSO has a formal role in coordinating the integration of statistical and administrative data across public service bodies that together make up the Irish Statistical System (ISS). Underpinning this integration is the development of a National Data Infrastructure (NDI) – a platform for linking data across the administrative system using unique identifiers for individuals, businesses and locations. The data linking for statistical purposes is carried out by the CSO on pseudonymised datasets using only those variables which are relevant to the research being undertaken. A strong focus on data integration, which requires the collection and storage of identifiers such as PPSN and Eircodes, is a priority of the ISS in its goal of improving the analytical capacity of the system.
Data protection is a core principle of the CSO and is central to the development of the NDI. As well as the strict legal protections set out in the Statistics Act, 1993, and other existing regulations, we are committed to ensuring compliance with all data protection requirements. These include the Data Sharing and Governance Act (2019) and the General Data Protection Regulation (GDPR, EU 2016/679).
Enumerators for the 2016 Census of Population were told that a staff member of an institution (e.g. a hospital or prison) who worked a night shift on Census night and returned to their own home the following morning should be enumerated in their own home and not in the institution. Therefore, it can be assumed that all those enumerated in a prison on Census night were offenders.
There were 3,791 offenders enumerated in prisons on Census night 2016. A valid PIK was matched for 2,850 (75.2%) of the offenders while no match was possible for 941 (24.8%). The following section compares the matched and unmatched records by sex, age group, and country of birth, to illustrate how representative the 2,850 matched records are of the total offender population.
Table 4.1 Offender population on Census Night classified by sex and data matching rate¹

| Sex | Offenders matched to PIK | Offenders not matched to PIK | % of matched offenders | % of unmatched offenders |
| --- | --- | --- | --- | --- |
| Male | 2,750 | 890 | 96.5 | 94.7 |
| Female | 100 | 50 | 3.5 | 5.3 |
| Total | 2,850 | 940 | 100.0 | 100.0 |

¹ Numbers are rounded to the nearest ten.

The proportion of males in the matched population was 96.5%, very similar to 94.7% in the unmatched population (see Table 4.1). This suggests that the distribution of offenders by sex was similar across the matched and unmatched populations.
Table 4.2 Offender population on Census Night classified by age and data matching rate¹

| Age category | Offenders matched to PIK | Offenders not matched to PIK | All offenders | % matched to PIK |
| --- | --- | --- | --- | --- |
| Under 25 years | 540 | 280 | 820 | 65.9 |
| 25 - 29 years | 550 | 170 | 720 | 76.4 |
| 30 - 34 years | 530 | 160 | 690 | 76.8 |
| 35 - 39 years | 430 | 110 | 540 | 79.6 |
| 40 - 44 years | 300 | 70 | 370 | 81.1 |
| 45 - 49 years | 190 | 50 | 240 | 79.2 |
| 50 - 54 years | 120 | 40 | 160 | 75.0 |
| 55 - 59 years | 80 | 20 | 100 | 80.0 |
| 60 - 64 years | 50 | 20 | 70 | 71.4 |
| 65 years and over | 70 | 30 | 100 | 70.0 |
| Total | 2,850 | 950 | 3,790 | 75.2 |

¹ Numbers are rounded to the nearest ten.
Three in four (75.2%) offenders were matched to a PIK, but younger offenders were less likely to be matched. Only 65.9% of offenders aged under 25 were matched, compared with 81.1% of those aged 40-44. These differences in matching rates should be considered when drawing any conclusions about the total offender population from the matched data only.
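As a rough check, the matching rates in Table 4.2 can be recomputed from the rounded counts. The Python sketch below is illustrative only: the variable and function names are hypothetical, and it works from the published rounded figures rather than from the underlying records.

```python
# Rounded counts from Table 4.2: (matched to PIK, not matched to PIK).
counts = {
    "Under 25 years": (540, 280),
    "25 - 29 years": (550, 170),
    "30 - 34 years": (530, 160),
    "35 - 39 years": (430, 110),
    "40 - 44 years": (300, 70),
    "45 - 49 years": (190, 50),
    "50 - 54 years": (120, 40),
    "55 - 59 years": (80, 20),
    "60 - 64 years": (50, 20),
    "65 years and over": (70, 30),
}


def match_rate(matched, unmatched):
    """Percentage of offenders in a group that were matched to a PIK."""
    return round(100 * matched / (matched + unmatched), 1)


# Matching rate per age category, in the same order as the table.
rates = {age: match_rate(m, u) for age, (m, u) in counts.items()}

# Age categories with the lowest and highest matching rates.
lowest = min(rates, key=rates.get)
highest = max(rates, key=rates.get)

for age, rate in rates.items():
    print(f"{age}: {rate}%")
```

Comparing each group's rate with the overall rate is a quick way to see where the matched records under-represent the full offender population.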
Table 4.3 Male offenders and all males aged 15 and over on Census Night by age group

| Age group | All male offenders (number) | All males aged 15+ (number) | All male offenders (%) | All males aged 15+ (%) |
| --- | --- | --- | --- | --- |
| Under 25 years | 820 | 287,080 | 21.1 | 15.9 |
| 25 - 29 years | 720 | 141,760 | 19.0 | 7.8 |
| 30 - 34 years | 690 | 170,140 | 18.2 | 9.4 |
| 35 - 39 years | 540 | 188,560 | 14.2 | 10.4 |
| 40 - 44 years | 370 | 175,290 | 9.8 | 9.7 |
| 45 - 49 years | 240 | 160,520 | 6.3 | 8.9 |
| 50 - 54 years | 160 | 145,660 | 4.2 | 8.1 |
| 55 - 59 years | 100 | 131,330 | 2.6 | 7.3 |
| 60 - 64 years | 70 | 116,280 | 1.8 | 6.4 |
| 65 years and over | 100 | 291,370 | 2.6 | 16.1 |
| Total | 3,790 | 1,807,990 | 100.0 | 100.0 |
Male offenders were younger than the overall male population. About four in ten (40.1%) male offenders were aged under 30, while just over two in ten (23.7%) of all males in Ireland aged 15 and over were aged 15-29. Just 39.1% of male offenders were aged 35-64 compared with 50.8% of all males aged 15 and over.
Table 4.4 Offender population on Census Night classified by place of birth and data matching rate¹

| Place of birth | Offenders matched to PIK | Offenders not matched to PIK | All offenders | % of offenders matched to PIK |
| --- | --- | --- | --- | --- |
| Ireland | 2,460 | 720 | 3,180 | 77.3 |
| Elsewhere | 390 | 220 | 610 | 64.1 |
| Total | 2,850 | 940 | 3,790 | 75.2 |

¹ Numbers are rounded to the nearest ten.
Offenders born in Ireland were more likely to be matched. Of the 3,180 offenders born in Ireland, 77.3% were matched to a PIK compared with 64.1% of the 610 offenders born elsewhere.
Table 4.5 Male offenders and all males aged 15 and over on Census Night by place of birth¹

| Place of birth | All male offenders (number) | All males aged 15 and over (number) | All male offenders (%) | All males aged 15 and over (%) |
| --- | --- | --- | --- | --- |
| Ireland | 3,060 | 1,448,280 | 84.1 | 80.1 |
| Elsewhere | 580 | 359,720 | 15.9 | 19.9 |
| Total | 3,640 | 1,808,000 | 100.0 | 100.0 |

¹ Numbers are rounded to the nearest ten.
The proportion of male offenders born in Ireland was 84.1%, very close to the proportion of 80.1% for all males aged 15 and over.
Figures 4.1 - 4.4 provide a summary of the unmatched records by age and place of birth in absolute terms and as a percentage of total offenders, respectively. The following analysis does not include a breakdown by sex as only 50 (or 1.8%) of the 2,850 matched offenders were female.
| Age group | Unmatched offenders (number) |
| --- | --- |
| Under 25 | 280 |
| 25-29 | 170 |
| 30-34 | 160 |
| 35-39 | 110 |
| 40-44 | 70 |
| 45-49 | 50 |
| 50-54 | 40 |
| 55-59 | 20 |
| 60-64 | 20 |
| 65 and over | 30 |

| Age group | Unmatched offenders (% of all offenders in age group) |
| --- | --- |
| Under 25 | 33.9 |
| 25-29 | 26.6 |
| 30-34 | 23.4 |
| 35-39 | 21.9 |
| 40-44 | 19.2 |
| 45-49 | 19.6 |
| 50-54 | 23.8 |
| 55-59 | 24.3 |
| 60-64 | 22.2 |
| 65 and over | 23.4 |
As a proportion of each age category, the under-25 age group had the highest unmatched rate at 33.9%, while the lowest rate was in the 40-44 age group at 19.2%. Unmatched rates in most age groups were below or close to the overall unmatched rate of 24.8%.
| Place of birth | Unmatched offenders (number) |
| --- | --- |
| Ireland | 720 |
| Elsewhere | 220 |

| Place of birth | Unmatched offenders (% of all offenders by place of birth) |
| --- | --- |
| Ireland | 22.7 |
| Elsewhere | 35.9 |
Offenders born elsewhere were more likely to be unmatched. Of the 610 born elsewhere, 35.9% were not matched to a PIK compared with 22.7% of the 3,180 offenders born in Ireland.
The rates of matching to a PIK vary by age and place of birth. This should be borne in mind when drawing conclusions about the whole population of offenders from the matched records.
The reasons for being unable to allocate a PIK could include:
There are five classifications which are used to describe economic status throughout the report:
‘Education & training only’ refers to offenders who are enrolled in an education and/or training programme but are not classified as being in substantial employment.
‘Education & training and substantial employment’ corresponds to offenders who meet the criteria for substantial employment or self-employment and are enrolled in education at some point within the same calendar year.
‘Substantial employment only’: An individual is regarded as being ‘in substantial employment’ within a given calendar year if they fulfil either of the criteria A or B below.
A. Substantial P35 Employment - They fulfil the following two requirements:
B. Substantial Self-Employment - Their total turnover across all self-employment activities is at least €1,000 within the calendar year.
‘Neither employment nor education’ comprises offenders who are neither enrolled in education nor involved in substantial employment within the year but who appear somewhere in the administrative data for that year. These offenders may have some record of (non-substantial) employment, may have claimed a benefit from the Department of Employment Affairs & Social Protection (DEASP) in the year, or may have engaged with a public body regarding property-related payments such as rent or property tax, among other minor administrative activities.
‘Not identified’ are offenders who were not identified in any of the administrative data between 2012 and 2017. Some of this group may have emigrated, but this number is hard to quantify, as there is no definitive indicator of emigration available through administrative data sources.
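The five classifications above can be read as a simple decision rule. The Python sketch below is one possible reading and not the CSO's implementation. The `YearRecord` type and its field names are hypothetical, criterion A is passed in as a ready-made flag because its two requirements are not reproduced here, and `appears_in_admin_data` is an assumed flag standing in for whether the person was found in the administrative data.

```python
from dataclasses import dataclass

# Criterion B threshold from the text: total self-employment turnover of
# at least EUR 1,000 within the calendar year.
SELF_EMPLOYMENT_TURNOVER_MIN = 1_000.0


@dataclass
class YearRecord:
    """Hypothetical per-person, per-year summary of administrative data."""
    enrolled_in_education: bool
    meets_p35_criterion: bool        # criterion A, supplied as a flag (assumption)
    self_employment_turnover: float  # total turnover across all activities, in EUR
    appears_in_admin_data: bool      # assumed flag for presence in the data


def in_substantial_employment(rec: YearRecord) -> bool:
    """True if either criterion A or criterion B is met."""
    return (rec.meets_p35_criterion
            or rec.self_employment_turnover >= SELF_EMPLOYMENT_TURNOVER_MIN)


def economic_status(rec: YearRecord) -> str:
    """Assign one of the five economic status classifications."""
    employed = in_substantial_employment(rec)
    if rec.enrolled_in_education and employed:
        return "Education & training and substantial employment"
    if rec.enrolled_in_education:
        return "Education & training only"
    if employed:
        return "Substantial employment only"
    if rec.appears_in_admin_data:
        return "Neither employment nor education"
    return "Not identified"


example = YearRecord(
    enrolled_in_education=False,
    meets_p35_criterion=False,
    self_employment_turnover=1_500.0,
    appears_in_admin_data=True,
)
print(economic_status(example))  # Substantial employment only
```

Checking the combined education-and-employment case first keeps the five categories mutually exclusive, so each record falls into exactly one of them.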
Income through the PAYE system is included in the earnings analysis - income from self-employment activities registered through the self-assessment system is excluded. Median values for earnings are presented in each case. All earnings relate to gross pay.
Census of Population Analysis (COPA)
The COPA is a pseudonymised copy of the Census of Population 2016 dataset held internally within the CSO for analysis purposes. It contains Census attribute information for individuals and households, excluding records of persons registered as guests. Approximately 5% of Census records could not be assigned a PIK and were excluded from the analysis.
Person Income Register (PIR)
The PIR is a pseudonymised income register held internally within the CSO. It contains information on income received by individuals relating to employment, self-employment and social transfers. It is derived from administrative data held by the Revenue Commissioners and the Department of Employment Affairs and Social Protection. The PIR therefore provides a near-complete picture of income for individuals for a calendar year. All linkage is carried out using a PIK assigned on each contributing data source. The PIK is then used to link the pseudonymised data sources together to create the PIR. The PIK protects a person’s identity but also enables linking across data sources and over time. The PIK enables high-quality deterministic matching, thus significantly reducing or eliminating linkage error.
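To illustrate what deterministic matching on a shared key looks like, here is a minimal Python sketch. It is not the CSO's implementation: the PIK values, field names and amounts are invented for the example, and the only assumption is that each source carries the same pseudonymised key.

```python
def link_on_pik(*sources):
    """Deterministic (exact-key) linkage: merge records that share a PIK.

    Each source is a list of dicts with a "pik" field. Records are joined
    only when their PIK values are identical; no fuzzy matching is done.
    """
    linked = {}
    for source in sources:
        for record in source:
            pik = record["pik"]
            merged = linked.setdefault(pik, {"pik": pik})
            merged.update({k: v for k, v in record.items() if k != "pik"})
    return linked


# Invented example sources, each already pseudonymised.
employment = [
    {"pik": "A1", "employment_income": 21000},
    {"pik": "B2", "employment_income": 34000},
]
social_transfers = [
    {"pik": "A1", "social_transfers": 1500},
    {"pik": "C3", "social_transfers": 9800},
]

register = link_on_pik(employment, social_transfers)
for pik in sorted(register):
    print(register[pik])
```

Because records are joined only on exact equality of the key, a person present in one source but not another simply keeps the fields from the source where they appear.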
Primary Care Reimbursement Service (PCRS)
PCRS is responsible for making payments to healthcare professionals – doctors, dentists, pharmacists and optometrists/ophthalmologists – for the free or reduced costs services they provide to the public across a range of community health schemes. The schemes form the infrastructure through which the HSE delivers a significant proportion of Primary Care to the public. PCRS also manages the National Medical Card Unit (NMCU) which was established in 2011 to process all Medical Card and GP Visit Card applications at a national level.
Residential Tenancies Board (RTB) Register
The Residential Tenancies Board (RTB) register contains information on all tenancies registered by landlords, both private and Approved Housing Bodies (AHB).
Central Records System
The Central Records System (CRS) is a legacy system within the Department of Employment Affairs and Social Protection (DEASP) which holds data on their customers. This includes data on identity, address, relationships, claims, PPS contributions, earnings, employments and employers. It is a central repository of basic personal data on individuals held on different systems within DSP, together with income and social insurance contributions data (P35 data) which are supplied by the Revenue Commissioners.
Driver Details from the National Vehicle and Driver File (NVDF)
This file contains a listing of driver licence holders in the country.
DSP Payments
The Central Records System (CRS) includes data on the clients of the DEASP, including employment, unemployment, illness payments, pensions and so on. Each quarter the CSO receives a copy of the full database.
The CRS contains a high-level record for each engagement, so a four-week period of unemployment would be one record (see Table 1, item 1). It does not contain data about how much the payment was, merely the type of scheme, and the duration. The operational information is contained in the Integrated Short-Term Payments System (ISTS).
ISTS contains one record for each payment due, so four records in this example, one in each weekly file (Table 1, item 2). It includes a breakdown of the payment and any extra amounts for dependants, as well as all relevant dates, a sample of which is included in the table. The ISTS records the full entitlement amount, but this may not be the amount actually paid out, which may be reduced to repay a previous overpayment or because of means from a spouse or a part-time job.
This system contains data on all short-term schemes, such as unemployment, illness, rent, maternity, etc. It does not cover long term pension schemes.
The actual amounts paid out are found in yet another system, the Business Object Model (BOMI). This is the DEASP’s newest database and is gradually replacing many of their older operational data systems, such as ISTS and PenLive (Pensions data).
FÁS/SOLAS
FÁS organised courses, further education and training for many years.
In October 2013, SOLAS and the new Education & Training Boards replaced FÁS and FÁS courses. SOLAS (An tSeirbhís Oideachais Leanúnaigh agus Scileanna) is now Ireland’s Further Education and Training Authority. It aims to build a clear, integrated pathway to work for learners seeking to enrol in Further Education and Training.
FETAC/Quality and Qualifications Ireland
The Further Education and Training Awards Council (FETAC) is the former statutory awarding body for further education in Ireland. It was established on 11 June 2001 under the Qualifications (Education and Training) Act 1999.
FETAC was dissolved and its functions were passed to Quality and Qualifications Ireland (QQI) on 6 November 2012. QQI is an amalgamation of four previously operational bodies: the Further Education and Training Awards Council (FETAC), the Higher Education and Training Awards Council (HETAC), the Irish Universities Quality Board (IUQB) and the National Qualifications Authority of Ireland (NQAI).
Housing Assistance Payments (HAP)
The HAP data delivery relates to details of all tenancies, tenants, properties and landlords that have been involved in the HAP scheme since its introduction in 2014. More information on the HAP scheme can be found on the HAP website.
Higher Education Authority
The Higher Education Authority data provides details on annual enrolments and graduations from the publicly funded universities and institutes of technology in Ireland.
Springboard
HEA Springboard and ICT provides information on students who have undertaken HEA springboard or ICT courses. This data includes course details and basic demographic information for enrolled students, with identifying data removed or pseudonymised.
Housing Agency Social Housing Waiting Lists
The data from the Housing Agency includes the Summary of Social Housing Assessments (SSHA) annual reports supplied by the Local Government Management Agency (LGMA) for years 2016, 2017, 2018 and 2019.
Help to Buy Scheme
The Help to Buy (HTB) incentive is a scheme for first-time property buyers. It helps with the deposit needed to buy or build a new house or apartment. The data flow is supplemented by CSO-created value-added tables.
Integrated Short Term Payments System (ISTS)
The weekly ISTS data is supplied by the DEASP and represents current and closed claims on their system.
ITForm11
Annual Income Tax returns of the self-employed.
Local Property Tax (LPT)
The LPT file contains one record - the most recent LPT return - for each of the 1.9 million properties in the State.
PAYE Modernisation (PMOD)
The PMOD dataset combines data from Revenue with CRS data on sex, month of birth and nationality.
Post-Primary Pupils Database
The Post-Primary Pupil Database is currently the only national archive of student enrolment at post-primary schools. Individual and personal data on each student enrolled in each recognised post-primary school are collected by the Department of Education and Skills.
SPP35
The P35L dataset (annual employee-level return by employers), as received from the Revenue Commissioners, combined with CRS (Client Record System) data on gender and month of birth and with CBR (Central Business Register) data on CBR, NACE (Statistical classification of economic activities in the European Community) and legal form.