Data Collection
Introduction
Welcome to the latest Modelling Group blog post.
For this post, we are going to focus on an important project requirement that needs to be assessed and defined before any traffic model is developed: the data collection exercise needed to inform microsimulation models.
We’ll go into detail on the typical data elements specified in order to ensure that the models developed are realistic and can be empirically proven to be representative of on-site conditions.
Data for Traffic Flows
Automatic Traffic Counts (ATCs)
For collecting screen-line counts or link flows, ATCs are a common technique. These have historically been simple pneumatic tubes (although some traffic survey companies are now replacing them with radar) placed across the road in pairs. The axles detected as vehicles pass over the tubes are used to determine the vehicle type and speed.
ATCs are useful for counting over full 24-hour periods and across several days, weeks or even months, to monitor longer-term trends on roads.
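To give a feel for how a pair of tubes can yield a speed, here is a minimal Python sketch. It assumes the logger records the time at which the same axle strikes each tube and that the tubes are a known distance apart. The function name and the 1.0 m default spacing are illustrative assumptions, not taken from any particular survey company's equipment.

```python
def axle_speed_kph(t_first_tube_s, t_second_tube_s, tube_spacing_m=1.0):
    """Speed implied by one axle striking two tubes a known distance apart.

    tube_spacing_m is an assumed value; use the spacing actually installed.
    """
    dt = t_second_tube_s - t_first_tube_s
    if dt <= 0:
        raise ValueError("the second tube must be struck after the first")
    # distance / time gives m/s; multiply by 3.6 to convert to km/h
    return tube_spacing_m / dt * 3.6


# Example: the same axle hits the second tube 0.09 s after the first
print(round(axle_speed_kph(0.0, 0.09), 1))
```

Real ATC loggers also use the gaps between successive axles to classify vehicles, which is not covered in this sketch.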
Classified/Junction Turning Counts (CTCs/JTCs)
For collecting junction flows where more detail on movements is required, CTCs are a common option. The majority of these are now collected using video surveys, which are then analysed to establish the movements between each arm of a junction. The data is classified into different vehicle types, which can then be used to develop static routes or origin-destination (O-D) matrices in our models.
Automatic Number Plate Recognition (ANPR) Surveys
A more comprehensive (and expensive) option for collecting data over a wider network area is an ANPR survey. From the cordon that you specify, an O-D matrix can be created by matching the vehicles recorded by the cameras as they enter and exit the cordon during the surveyed period. These surveys can be useful in creating more accurate O-D matrices for your models, although effort is still required during the analysis to develop the matrices accurately and to account for ‘unmatched’ vehicles from the survey.
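The matching step can be sketched in a few lines of Python. This is a simplified illustration rather than how any survey company actually processes its data: it assumes each record is a (plate, site, time in seconds) tuple, that each plate makes at most one trip through the cordon, and it simply reports unmatched plates rather than redistributing them. The function name and record layout are hypothetical.

```python
from collections import Counter


def build_od_matrix(entries, exits):
    """Match entry and exit plate reads into O-D counts.

    entries, exits: lists of (plate, site, time_s) tuples (assumed layout).
    Returns (Counter keyed by (origin, destination), set of unmatched plates).
    """
    # Keep the earliest entry record for each plate
    first_entry = {}
    for plate, site, t in sorted(entries, key=lambda r: r[2]):
        first_entry.setdefault(plate, (site, t))

    od = Counter()
    matched = set()
    for plate, site, t in sorted(exits, key=lambda r: r[2]):
        # A match needs an earlier entry read and no previous match
        if plate in first_entry and plate not in matched:
            origin, t_in = first_entry[plate]
            if t > t_in:
                od[(origin, site)] += 1
                matched.add(plate)

    seen = {p for p, _, _ in entries} | {p for p, _, _ in exits}
    return od, seen - matched


entries = [("AB12CDE", "A", 100), ("XY34ZZZ", "B", 120), ("LM56NOP", "A", 130)]
exits = [("AB12CDE", "C", 400), ("XY34ZZZ", "C", 380), ("QQ11QQQ", "D", 390)]
od, unmatched = build_od_matrix(entries, exits)
print(dict(od))
print(sorted(unmatched))
```

The unmatched set is where the extra analysis effort goes: those vehicles may have parked inside the cordon, been misread, or entered before the survey started.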
Data for Journey Times
ANPR Surveys
An added benefit of ANPR surveys is that you can also obtain journey time information from vehicles travelling between the various O-D pairs. This can also be split into light and heavy vehicles, which can be useful if there is a need to distinguish between the two. As with all journey time data, there needs to be a good sample size between each O-D pair for a suitable average journey time to be obtained – we would always advise a minimum of 10 and preferably more.
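As a rough illustration of the sample-size check, the Python sketch below averages journey times per O-D pair and flags any pair with fewer than 10 observations. The data layout (a dictionary of journey times in seconds keyed by O-D pair) and the function name are assumptions made for the example.

```python
from statistics import mean

MIN_SAMPLE = 10  # minimum sample size advised in this post


def summarise_journey_times(times_by_pair, min_sample=MIN_SAMPLE):
    """Return sample size, mean journey time and a usability flag per O-D pair."""
    summary = {}
    for pair, times in times_by_pair.items():
        n = len(times)
        summary[pair] = {
            "n": n,
            "mean_s": mean(times) if n else None,
            "usable": n >= min_sample,
        }
    return summary


observed = {
    ("A", "B"): list(range(100, 110)),  # 10 matched journeys
    ("A", "C"): [90, 95],               # too few to rely on
}
for pair, stats in summarise_journey_times(observed).items():
    print(pair, stats)
```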
TomTom Portal
One of the ‘big data’ options is the TomTom Move portal. Although access is limited to license-holders, you can contact the company and request data for certain routes and your chosen time periods (day/week/month) for a fee. As it’s a big data source, the journey time information is often based on very large samples (often in the hundreds, or even thousands), so the average journey times provided are generally representative. It is also a useful option if you need to retrieve historic journey times to tie in with old survey data (for example), which can be beneficial if the data collection budget does not stretch to a full, comprehensive new set of data.
Floating Car Method
This is not an option that we often consider or would generally recommend, but if you have a very small network, the ‘floating car’ method for collecting journey times can be used. This essentially involves people driving around the site and recording journey times along various routes. If the routes you are modelling are short and you can guarantee at least 10 readings for each route during the peak periods, then this is a viable option. However, for bigger networks or routes where you are not likely to achieve at least 10 runs in the peak period, other options should be used.
Data for Queue Lengths
Video Surveys
Queue length surveys are generally specified as being collected at the same time as junction turning counts as they can often make use of some/all of the same video footage from the installed cameras. These can be specified to be collated as average and maximum queue lengths per specified time period (we usually specify 5 minutes) and can also be specified with different criteria to account for the operational distinction between priority or signal controlled arms.
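The collation into average and maximum queues per period can be sketched as follows. This assumes the raw survey output is a list of (time in seconds, queue length in vehicles) observations for a single arm; the function name and data layout are hypothetical, and the 300-second default reflects the 5-minute period mentioned above.

```python
from collections import defaultdict


def collate_queues(observations, period_s=300):
    """Group queue observations into fixed periods and report avg and max.

    observations: iterable of (time_s, queue_length_vehicles) for one arm.
    Returns {period_start_s: {"avg": float, "max": int}}.
    """
    bins = defaultdict(list)
    for t, queue in observations:
        period_start = int(t // period_s) * period_s
        bins[period_start].append(queue)
    return {
        start: {"avg": sum(values) / len(values), "max": max(values)}
        for start, values in sorted(bins.items())
    }


obs = [(0, 2), (60, 4), (299, 6), (300, 1), (450, 3)]
print(collate_queues(obs))
```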
It should be noted that collecting and collating queue length data can be very subjective. Exactly what counts as a ‘queue’ is difficult to define, and the definition often differs between people and survey companies. As such, queue lengths should only be used as a calibration aid and not as part of the validation criteria of the model. Although not advised, if queues are to be used for validating a model, we would recommend queue surveys collected over several days so that daily variance can be taken into account, allowing a more robust validation.
Data for Saturation Flows
Video Surveys
If undertaking modelling work within London (and occasionally for individual projects elsewhere, as specified), saturation flows are a requirement for Transport for London (TfL) through their Model Audit Process (MAP). Saturation flow surveys generally make use of video cameras, with the footage being assessed to provide the saturation flows per arm/lane and for the specified timeframe. The data can also be analysed to include degree of saturation, effective green and cycle time data, which can feed into a range of modelling software.
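To show how these quantities relate, the Python sketch below uses the standard textbook relationship for a signal-controlled lane: capacity is the saturation flow scaled by the share of the cycle that is effective green, and degree of saturation is demand flow divided by that capacity. This is a general illustration, not a statement of how TfL or any particular software package calculates it, and the function name and example values are made up.

```python
def degree_of_saturation(flow_pcu_hr, sat_flow_pcu_hr, effective_green_s, cycle_time_s):
    """Demand flow divided by capacity for a signal-controlled lane.

    capacity = saturation flow * (effective green / cycle time)
    """
    if cycle_time_s <= 0 or effective_green_s <= 0 or sat_flow_pcu_hr <= 0:
        raise ValueError("saturation flow, green and cycle time must be positive")
    capacity_pcu_hr = sat_flow_pcu_hr * effective_green_s / cycle_time_s
    return flow_pcu_hr / capacity_pcu_hr


# Hypothetical lane: 450 PCU/hr demand, 1800 PCU/hr saturation flow,
# 30 s effective green in a 90 s cycle
print(degree_of_saturation(450, 1800, 30, 90))
```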
Data for Car Parking
Video Surveys
It may be necessary to take account of on-street parking, layby parking or more formal car parking facilities in your models. If this is required, then setting up static cameras and collecting video footage that is then analysed is a common method. When specifying what to analyse, it’s good to understand the vehicle arrival times, parking durations and occupancy of all parking spaces. This allows your models to accurately reflect the parking conditions and identify any busy periods that could affect the network performance (assisting with calibration and validation).
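As an illustration of how arrival times and parking durations turn into an occupancy profile, here is a short Python sketch. It assumes each parked vehicle is recorded as an (arrival time, duration) pair in seconds; the function name, the record layout and the 15-minute step are assumptions for the example.

```python
def occupancy_profile(stays, start_s, end_s, step_s=900):
    """Count vehicles parked at each time step.

    stays: list of (arrival_s, duration_s) records (assumed layout).
    A vehicle counts as parked from its arrival up to, but not including,
    its departure time.
    """
    profile = {}
    for t in range(start_s, end_s, step_s):
        profile[t] = sum(
            1 for arrival, duration in stays if arrival <= t < arrival + duration
        )
    return profile


stays = [(0, 1800), (600, 600), (900, 3600)]
print(occupancy_profile(stays, 0, 3600))
```

Comparing the peak of this profile against the number of available spaces is a quick way to spot the busy periods that may affect network performance.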
Data for Pedestrian Flows
Video Surveys
To understand more complex pedestrian flows (e.g. routing through a public transport interchange), it is best to contact survey companies directly to understand their capabilities and gauge what they consider the best approach for the data that you require.
However, if you are looking to understand pedestrian flows at a pedestrian crossing, then in a similar way to parking surveys, setting up static cameras and analysing the footage is a common technique. This data can help to ensure that any crossings in your models are called at the same frequency as on-site. It is also useful to ask the survey company to split the pedestrian counts into people that wait for the pedestrian crossing and those that don’t. This helps to stop the crossing being called too many times in your model, which could affect the calibration and validation of the model.
Other Data Collection Elements & Sources
As well as the elements listed here, there are a number of other data collection elements and sources which can be considered as part of the data collection exercise. These include:
· Survey Video Footage – it is highly recommended to ask the survey company for a copy of any video footage collected. This can be viewed, analysed and watched multiple times for different time periods and can assist in model calibration for inputs such as driver behaviour, lane usage, bus stop dwell times etc.
· Drone Surveys – drone surveys are becoming more popular, and being able to analyse junction or network performance from a higher vantage point can be beneficial. However, you will need to ensure that suitable permissions are in place for flying the drones and that they can be positioned somewhere that gives a useful view. The cost of the survey, which is likely to vary depending on location and the timescales required, will also need to be factored into the data collection budget.
· Public Transport Data – for projects in London, TfL will often provide iBus data to assist with the modelling of bus routes and stops. Outside of London, there are websites where public transport data is available (https://bustimes.org/ is one example) and these should be used to ensure that the bus routes, stops and frequencies in your models are accurate and representative.
· National Speed Distributions – datasets on the Department for Transport (DfT) website (https://www.gov.uk/government/statistics/vehicle-speed-compliance-statistics-for-great-britain-july-to-september-2021/vehicle-speed-compliance-statistics-for-great-britain-july-to-september-2021, for example) can be used to collect information on typical speeds under different speed limits. These can be used to create various speed distribution profiles for your models. It is important to consider factors which may have affected speeds before deciding on which dataset to use – for instance, the Covid-19 pandemic.
· Google Maps – Traffic Layer – to gain a high-level overview of the traffic conditions, the traffic layer can be a useful tool to check that congestion and queuing levels within your model are broadly consistent with site conditions.
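For the National Speed Distributions item above, the step from published speed bands to a speed distribution profile can be sketched in Python as below. Microsimulation packages typically take a cumulative distribution, so the sketch converts counts per speed band into cumulative proportions. The band values are invented for illustration and are not DfT figures; the function name is also hypothetical.

```python
def cumulative_profile(binned_counts):
    """Convert (upper_speed, count) bands into (upper_speed, cumulative share).

    binned_counts: list of (upper_speed_mph, vehicle_count) tuples.
    """
    total = sum(count for _, count in binned_counts)
    if total == 0:
        raise ValueError("no observations supplied")
    running = 0
    profile = []
    for upper_speed, count in sorted(binned_counts):
        running += count
        profile.append((upper_speed, running / total))
    return profile


# Made-up counts for a 30mph road, NOT taken from any DfT dataset
bands = [(20, 10), (30, 60), (40, 30)]
print(cumulative_profile(bands))
```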
Summary
We hope this post has provided some insight into the different data types and what to consider when specifying data to be collected for your models. Although not an exhaustive list, this post covers the data collection types most commonly specified on projects.
Thanks for reading!