Background Notes

CSO Frontier Series Research Paper

CSO research publication, , 11am
Frontier Series Output

CSO Frontier Series outputs may use new methods which are under development and/or data sources which may be incomplete, for example new administrative data sources. Particular care must be taken when interpreting the statistics in this release.
Learn more about CSO Frontier Series outputs.

About TII Data

Transport Infrastructure Ireland (TII) have over 300 active Traffic Monitoring Units (TMUs) around the country that record the volume of traffic by hour of day and vehicle class. Vehicles are counted when they pass over loops embedded in the road surface. Currently the CSO publishes data from a small number of TMUs to provide indicators of traffic patterns in Dublin and Regional areas for its monthly Transport Bulletin.

The availability of hourly aggregated TII traffic count data allows for a more comprehensive overview of nationwide traffic patterns and the possibility of producing real-time statistics. It also opens the possibility of using traffic data to inform broader economic indicators.

Statistical organisations around the world have started to publish experimental statistics using larger scale traffic counts to provide various indicators. A proof of concept was carried out by the UK Office for National Statistics (ONS). In their paper, “Faster indicators of UK economic activity: road traffic data for England” (Rowland et al., 2019), the ONS carried out detailed studies on traffic flow around ports to identify the uses of traffic counts in determining economic activity. Additionally, the work of Statistics Finland on nowcasting, “Nowcasting Finnish economic activity: a machine learning approach” (Fornaro and Luomaranta, 2019), highlights the application of traffic data, again as an economic indicator. The Northern Ireland Statistics and Research Agency (NISRA) have also completed studies in this area by analysing “Traffic Counts of Vehicles at the Fifteen Main Northern Ireland-Ireland Border Crossing Locations”.

Big Data

Given the velocity and volume of the traffic count data, it falls under the classification of Big Data. The use of Big Data in official statistics is still very new and involves some challenges. Internationally agreed standards are still being developed and high-frequency statistics often come with a trade-off between timeliness and reliability.

While the TII traffic count data can help to improve the periodicity and timeliness of traffic count statistics, it is not simply a matter of combining daily or weekly aggregates. Among the challenges faced are the quality and completeness of the data sets. For example, gaps can occur during the recording of traffic counts when TMUs are inactive as a result of road works or technical difficulties. A major component of this work was the preparation of the data and the development of methodologies to deal with missing data.

Footnotes

Rowland, R., Eidukas, A., Campbell, S., Nolan, L., Elliot, D., Del-Chowdhury, S. (2019). Faster indicators of UK economic activity: road traffic data for England. Data Science Campus - Office for National Statistics.

Available on the Data Science Campus UK website - (Accessed: 28-09-22)

Fornaro, P. & Luomaranta, H. (2019). Nowcasting Finnish economic activity: a machine learning approach. ETLA Statistics Finland.

Available on the Eurostat website - (Accessed: 15-02-23)

Northern Ireland Statistics and Research Agency (2023). Traffic Counts of Vehicles at the Fifteen Main Northern Ireland-Ireland Border Crossing Locations.

Available on the NISRA website - (Accessed: 28-09-22)

The TII Source Dataset

The hourly aggregated counts were provided daily to the CSO by Transport Infrastructure Ireland via an API and were uploaded and stored in the CSO’s data hub. The daily data files consisted of seven variables. In addition to the hour, day, month and year, the data includes the hourly vehicle count, the vehicle classification, and a unique identifier for each TMU. The unique identifier can be used to identify the TMU location and description.
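As a rough illustration of the seven variables described above, one row of the daily file could be modelled as follows. This is a minimal sketch only: the field names, the class labels and the example identifier are assumptions made for illustration and are not taken from the TII feed.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class HourlyCount:
    """One hourly record; all field names are hypothetical."""
    year: int
    month: int
    day: int
    hour: int            # hour of day, 0-23
    vehicle_class: str   # e.g. "CAR", "HGV", "BUS" (labels assumed)
    count: int           # vehicles recorded in that hour
    tmu_id: str          # unique identifier of the Traffic Monitoring Unit

    def timestamp(self) -> datetime:
        """Combine the four date/time variables into a single datetime."""
        return datetime(self.year, self.month, self.day, self.hour)


# Illustrative values only, not real TII data.
row = HourlyCount(2022, 9, 28, 8, "CAR", 412, "TMU_0001")
```

Keeping the date parts as separate fields mirrors the description of the source file; the `timestamp` helper is a convenience for later aggregation.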

Traffic Monitoring Units

TII data is enhanced by its collection process. The data from the TII is recorded via Traffic Monitoring Units (TMUs). Each TMU has associated geospatial data, allowing for the clear identification of each TMU location.

National level statistics rely less on the geospatial aspect of each TMU and more on the counts it records.

For refined analyses, such as tourist or port locations, the geospatial features of each TMU allow for a clear and unbiased selection of TMU data.

Data Quality Overview

Dealing with Big Data brings complications. The TMUs recording data for the TII lack the storage capacity typically found in Motorway Incident Detection and Automatic Signalling (MIDAS) sensors. As a result, when a TMU is temporarily out of operation, the traffic counts for that period are lost. The leading causes of temporary inactivity are typically software updates, roadworks or a sensor failing validation checks. When any of these faults or outages occurs, there will be gaps in the data until the issue is resolved.

A major aspect of dealing with the TII traffic count data is identifying whether any rise or fall in recorded traffic is valid and not the result of a TMU failure. The importance of this aspect is closely tied to spatial resolution: it matters most when only a few TMUs are analysed. Conversely, as the geographical area investigated grows and the spatial resolution decreases, values are aggregated over several counters, which allows for an increased margin of error.

Although the TII TMUs are quite comprehensive in their coverage, it should be noted that there may be location bias in the areas in which they are operational. TMUs tend to be clustered in more densely populated areas where greater traffic volumes are expected.

Tourist Locations

TMUs within a 10km radius of selected tourist sites were identified for the bus traffic analysis. This aimed to reduce bias or subjectivity in the TMU selection process. A similar method was initially employed by the UK’s Office for National Statistics (ONS), where it proved to be successful, which is why their methodology was replicated here.

The tourist locations were selected based on visitor counts published by Fáilte Ireland as part of their Annual Visitor Attractions Survey which included 2021 attendances at a large variety of tourist attractions.

Three sample tourist locations were selected:

  • Cliffs of Moher
  • Rock of Cashel
  • Newgrange

Methodological Notes

This research paper contains four methodologies developed to generate different traffic count analytical outputs. The first method generates timelier national traffic counts. It complements existing traffic count analyses currently in production and expands upon this work by analysing total weekly traffic volumes at an increased level of TMU coverage.

The second method involves the generation of daily traffic patterns based on average hourly traffic volumes per TMU across Ireland. It allows for the assessment of peak traffic volumes per time of day and includes a breakdown of weekend, weekdays, and full week analyses.

The third method identifies traffic volumes at fifteen selected border locations to provide indicators of cross-border trip trends. These fifteen border locations were selected to coincide with existing work carried out by the Northern Ireland Statistics and Research Agency (NISRA).

The fourth and final method involves vehicle-specific analyses around tourist locations. It is a simple method in which the analysis used in method one is filtered to buses for the TMUs that fall within the designated buffer zones (see “Tourist Locations”).

All methodologies aimed to develop robust code that can be implemented and replicated by others. As part of the validation, the TII traffic count analysis involved the replication of the current Transport Bulletin as a benchmark of accuracy and reliability (see “Testing and Validation”).

Method 1: National Statistics

The National Statistics aspect of this publication aimed to expand upon the current work carried out within the Transport Bulletin. To date the Transport Bulletin provides weekly average traffic volumes for Cars and Heavy Goods Vehicles (HGVs) for a small selection of Dublin and Regional sites. Using the TII Traffic Count data, the outputs were expanded to compute National Figures using all TMUs where possible.

Hourly aggregated data was merged into daily data sets, which were compiled into monthly data frames in the CSO’s internal data hub. The data was then available for analysis using a dedicated R server and was later combined to form a yearly data frame.

Some data exhibited misclassification errors, and records with a missing vehicle class were present in sufficient volumes to require re-classification. Any unclassified records were re-classified based on the associated hourly vehicle count: the average hourly count for each known vehicle type was calculated and the re-classification of the unknown vehicles was based on these average values.
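The re-classification step can be sketched as follows. The source does not state the exact assignment rule, so this sketch assumes that an unclassified record is given the known class whose average hourly count is closest to its own count. The function names, class labels and sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def class_means(rows):
    """Average hourly count per known vehicle class.

    rows: list of (vehicle_class or None, hourly_count) pairs.
    """
    by_class = defaultdict(list)
    for vclass, count in rows:
        if vclass is not None:
            by_class[vclass].append(count)
    return {vclass: mean(counts) for vclass, counts in by_class.items()}


def reclassify(rows):
    """Assign each unclassified row to the class with the nearest mean (assumed rule)."""
    means = class_means(rows)
    cleaned = []
    for vclass, count in rows:
        if vclass is None:
            vclass = min(means, key=lambda c: abs(means[c] - count))
        cleaned.append((vclass, count))
    return cleaned


# Illustrative values only, not real TII data.
rows = [("CAR", 400), ("CAR", 420), ("HGV", 40), ("HGV", 60), (None, 55), (None, 390)]
cleaned = reclassify(rows)
```

With these sample values the class averages are 410 for cars and 50 for HGVs, so the record with a count of 55 is assigned to HGV and the record with a count of 390 to CAR.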

Date and week number variables were appended to the cleaned yearly data frames and the traffic counts were aggregated to show totals per day by vehicle class.

Missing days were then appended to the data frame and the missing daily value was imputed based on the weekly average of the existing data.
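A minimal sketch of this imputation is shown below. It assumes that "weekly average" means the mean of the days actually observed in the same ISO week; the source does not specify the week definition, and the function name and sample values are hypothetical.

```python
from datetime import date, timedelta
from statistics import mean


def impute_missing_days(daily, start, end):
    """Return a dict with an entry for every day from start to end inclusive.

    daily: dict mapping date -> total count for that day.
    Missing days receive the mean of the observed days in the same ISO week,
    or None if no day in that week was observed.
    """
    by_week = {}
    for d, total in daily.items():
        by_week.setdefault(tuple(d.isocalendar()[:2]), []).append(total)

    filled = {}
    d = start
    while d <= end:
        if d in daily:
            filled[d] = daily[d]
        else:
            observed = by_week.get(tuple(d.isocalendar()[:2]))
            filled[d] = mean(observed) if observed else None
        d += timedelta(days=1)
    return filled


# Illustrative values only: 5 January 2022 is missing.
daily = {date(2022, 1, 3): 100, date(2022, 1, 4): 200, date(2022, 1, 6): 300}
filled = impute_missing_days(daily, date(2022, 1, 3), date(2022, 1, 6))
```

Here the missing day is filled with 200, the mean of the three observed days in that week.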

Once cleaned, and with all necessary imputation completed, the data was further aggregated to show weekly sum values for all known vehicle classes.
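The weekly aggregation could look like the sketch below. It assumes ISO week numbering, which the source does not confirm; the function name and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date


def weekly_totals(daily_rows):
    """Sum daily counts into weekly totals per vehicle class.

    daily_rows: iterable of (date, vehicle_class, count).
    Returns {(iso_year, iso_week, vehicle_class): total}.
    """
    totals = defaultdict(int)
    for d, vclass, count in daily_rows:
        iso_year, iso_week, _ = d.isocalendar()
        totals[(iso_year, iso_week, vclass)] += count
    return dict(totals)


# Illustrative values only, not real TII data.
daily_rows = [
    (date(2022, 1, 3), "CAR", 1000),   # Monday, ISO week 1
    (date(2022, 1, 9), "CAR", 500),    # Sunday, ISO week 1
    (date(2022, 1, 10), "CAR", 700),   # Monday, ISO week 2
    (date(2022, 1, 3), "HGV", 80),
]
totals = weekly_totals(daily_rows)
```

Keying on the ISO year as well as the week number avoids mixing the first days of January with the previous year's final week.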

Method 2: Average Hourly Traffic Volumes per TMU

This method explores the peak traffic times on a national level. The methodology used follows directly from above (see Method 1: National Statistics).

For this method, the aggregation is done hourly rather than by week and merges all vehicle classes together. The mean hourly counts for all vehicles are generated. Appended to the data frame is a ‘Day of Week’ variable which allows for a more refined analysis of the variation in peak traffic times for the weekend, weekdays, and the full week.
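The hourly profile described above might be computed along these lines. This is a sketch under stated assumptions: each input row is taken to be the all-vehicle count for one TMU in one hour, Saturday and Sunday are treated as the weekend, and the function name and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_profile(rows, day_type="full"):
    """Mean count per hour of day.

    rows: iterable of (datetime, count), all vehicle classes already merged.
    day_type: "weekday", "weekend" or "full".
    """
    by_hour = defaultdict(list)
    for ts, count in rows:
        is_weekend = ts.weekday() >= 5  # Saturday = 5, Sunday = 6
        if day_type == "weekday" and is_weekend:
            continue
        if day_type == "weekend" and not is_weekend:
            continue
        by_hour[ts.hour].append(count)
    return {hour: mean(counts) for hour, counts in sorted(by_hour.items())}


# Illustrative values only, not real TII data.
rows = [
    (datetime(2022, 1, 3, 8), 500),   # Monday 08:00
    (datetime(2022, 1, 4, 8), 700),   # Tuesday 08:00
    (datetime(2022, 1, 8, 8), 200),   # Saturday 08:00
]
weekday_profile = hourly_profile(rows, "weekday")
weekend_profile = hourly_profile(rows, "weekend")
```

In this example the 08:00 weekday mean is 600 and the 08:00 weekend mean is 200, which is the kind of contrast the ‘Day of Week’ variable is intended to expose.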

Method 3: Cross-Border Traffic Trends

This analysis restricts the TII data to information generated from fifteen selected TMUs on Northern Ireland border crossing roads.

The cleaned and prepared data (see Method 1: National Statistics) is limited to these sites only, and monthly totals per vehicle class are calculated. This calculation sums the monthly totals from each site into a single monthly value.

For this analysis each monthly sum was indexed using the following formula:

Index_Month = (Total_Month × 100) / Base

where Base is the January 2019 monthly sum. January 2019 therefore has an index of 100.
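As a worked example of the index calculation, the sketch below applies the formula to a few monthly sums. The totals are invented for illustration and the function name and month keys are hypothetical.

```python
def index_series(monthly_totals, base_month="2019-01"):
    """Index each monthly sum against the base month (base month = 100)."""
    base = monthly_totals[base_month]
    return {month: total * 100 / base for month, total in monthly_totals.items()}


# Illustrative values only, not real cross-border counts.
monthly_totals = {"2019-01": 250_000, "2019-02": 240_000, "2020-04": 100_000}
index = index_series(monthly_totals)
```

With these invented totals, February 2019 has an index of 96 and April 2020 an index of 40, meaning traffic at the selected sites was 4% and 60% below the January 2019 level respectively.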

Method 4: Geographical Localisation - Tourist Sites

To analyse the traffic flow patterns at a more granular level, for example around selected tourist attractions, it was important to first define a methodology which could be applied throughout the project. This methodology aimed to remove subjectivity and reduce any bias in the TMU selection process.

The geographical localisation process involved the following steps:

Geometry Creation

The first step is to define tourist regions of interest. For information on the tourist locations see “Tourist Locations” above.

Data Reduction

The next step is to restrict the analysis to TII data generated from TMUs within a buffer zone. Once the sites selected for analysis are identified, they are marked on the map. A geographical area of radius 10km is set up around each location and any TMUs within the mapped zone are included.
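The buffer selection can be approximated without GIS software by computing great-circle distances. The sketch below uses the haversine formula as a stand-in for whatever buffering tool was actually used, which the source does not name. The coordinates and TMU identifiers are invented for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def tmus_in_buffer(site, tmus, radius_km=10.0):
    """Return the IDs of TMUs within radius_km of the site.

    site: (lat, lon); tmus: dict mapping tmu_id -> (lat, lon).
    """
    lat, lon = site
    return [
        tmu_id
        for tmu_id, (tlat, tlon) in tmus.items()
        if haversine_km(lat, lon, tlat, tlon) <= radius_km
    ]


# Hypothetical coordinates, not real site or TMU locations.
site = (53.0, -9.0)
tmus = {"TMU_A": (53.05, -9.0), "TMU_B": (53.2, -9.0)}
selected = tmus_in_buffer(site, tmus)
```

In this example TMU_A lies roughly 5.6km from the site and is selected, while TMU_B lies roughly 22km away and is excluded. Applying a fixed radius to every site keeps the selection rule independent of the analyst.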

Calculation of Buses

Now the data is limited to the smaller selection of TMUs within the created mapped zone. The analysis around tourist site activity is carried out by assessing the volumes of buses in this mapped zone. The volume of bus traffic directly attributable to tourist activity is difficult to isolate because of factors such as the placement of the TMU relative to the tourist site and the impact of scheduled bus services, but the month-on-month and year-on-year trends can be indicative of increased or decreased tourist activity. The total volume of bus traffic around each tourist location is calculated monthly.
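The monthly bus totals for one tourist location could be computed as in the sketch below. The row layout, the "BUS" class label and the TMU identifiers are assumptions made for illustration.

```python
from collections import defaultdict


def monthly_bus_totals(rows, selected_tmus, bus_class="BUS"):
    """Total bus counts per (year, month) for the TMUs inside the buffer zone.

    rows: iterable of (year, month, tmu_id, vehicle_class, count).
    selected_tmus: set of TMU IDs within the buffer zone.
    """
    totals = defaultdict(int)
    for year, month, tmu_id, vclass, count in rows:
        if tmu_id in selected_tmus and vclass == bus_class:
            totals[(year, month)] += count
    return dict(totals)


# Illustrative values only, not real TII data.
rows = [
    (2022, 7, "TMU_A", "BUS", 120),
    (2022, 7, "TMU_A", "CAR", 9000),   # not a bus, ignored
    (2022, 7, "TMU_B", "BUS", 300),    # outside the buffer, ignored
    (2022, 8, "TMU_A", "BUS", 150),
]
bus_totals = monthly_bus_totals(rows, {"TMU_A"})
```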

Testing and Validation

New statistics based on Big Data must be subject to extensive checks before being deployed by National Statistical Institutes. The CSO tested the TII data to validate the above-described methods. This is to assist the CSO’s objective of producing high quality, unbiased estimates of traffic counts from TII data.

A – The TII data is tested on its completeness to assess and identify the type and volume of missing data.

B – The datasets are also tested to check for any unwanted vehicle classifications.

C – The accuracy of the TII data is examined by comparing the results with existing official traffic count statistics produced by the CSO.
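Checks A and B can be sketched as follows. This is an illustration under stated assumptions rather than the CSO's actual test code: completeness is measured as the share of expected hours for which a TMU reported, and the set of accepted vehicle classes is hypothetical.

```python
from datetime import datetime, timedelta


def missing_hours(observed, start, end):
    """Hours in [start, end) for which a TMU has no record.

    observed: set of datetimes (on the hour) with a record for one TMU.
    """
    missing = []
    ts = start
    while ts < end:
        if ts not in observed:
            missing.append(ts)
        ts += timedelta(hours=1)
    return missing


def unexpected_classes(classes_seen, allowed):
    """Vehicle classes present in the data that are not in the accepted set."""
    return sorted(set(classes_seen) - set(allowed))


# Illustrative values only: one day of data with 03:00 and 04:00 missing.
day = datetime(2022, 1, 3)
observed = {day + timedelta(hours=h) for h in range(24) if h not in (3, 4)}
gaps = missing_hours(observed, day, day + timedelta(days=1))
completeness = 1 - len(gaps) / 24

unwanted = unexpected_classes(["CAR", "HGV", "UNKNOWN"], {"CAR", "HGV", "BUS"})
```

Listing the missing hours, rather than only counting them, makes it easier to tell a short outage from a TMU that was inactive for an extended period.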
