
Delays to Britain’s trains cost the rail industry over £100 million a year, while the additional cost to the economy in lost time is over £1 billion [1]. On-time performance is directly affected by a multitude of factors, including train reliability, poor track quality, and operation under degraded conditions. To better understand the major contributors to delays, the rail industry has embarked on a comprehensive data collection and analysis program [2]. Although it is recognized that train delays have many potential causes and that data need to be acquired from a wide range of sources, the current approach to analysis is to use each data source essentially in isolation. This has arisen primarily from a desire to solve one targeted problem at a time. However, if all the data sources were examined in unison, using the power of modern big data analytics, it is highly probable that the industry would not only identify the major contributors to delays more effectively and efficiently, but would also discover problem areas that would otherwise remain hidden. Such “knowledge discovery from data” (KDD) could potentially save the rail industry millions of pounds.
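As a toy illustration of what examining sources in unison can reveal, the minimal Python sketch below averages delay minutes first by one source at a time and then by two sources joined on a shared service ID. Everything in it is an assumption made for illustration: the service IDs, the field names (`track_quality`, `fleet_age`), the values, and the helper `mean_delay_by` are hypothetical and do not come from any real rail data set or from the analysis methods discussed in this article.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: three separate "sources", each keyed by service ID.
# All names and values are invented for illustration only.
delay_minutes = {"T1": 20, "T2": 2, "T3": 2, "T4": 0,
                 "T5": 16, "T6": 2, "T7": 4, "T8": 2}
track_quality = {"T1": "poor", "T2": "poor", "T3": "good", "T4": "good",
                 "T5": "poor", "T6": "poor", "T7": "good", "T8": "good"}
fleet_age = {"T1": "old", "T2": "new", "T3": "old", "T4": "new",
             "T5": "old", "T6": "new", "T7": "old", "T8": "new"}


def mean_delay_by(*sources):
    """Group services by their values in the given sources; average the delay."""
    groups = defaultdict(list)
    for service_id, minutes in delay_minutes.items():
        # The group key is the tuple of values this service has in each source.
        key = tuple(source[service_id] for source in sources)
        groups[key].append(minutes)
    return {key: mean(values) for key, values in groups.items()}


# Each source analysed in isolation.
by_track = mean_delay_by(track_quality)
by_fleet = mean_delay_by(fleet_age)

# Both sources analysed together, joined on service ID.
joint = mean_delay_by(track_quality, fleet_age)

print("By track quality:", by_track)
print("By fleet age:", by_fleet)
print("Joint view:", joint)
```

In this invented data set, each single-source view suggests a moderate effect, whereas the joint view shows that almost all of the delay sits in one combination (poor track with older trains). A real analysis would involve far more sources and records, but the principle of joining on a common key is the same.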

Shedding Light on Train Delays

Over the last few years there has been considerable anticipation about the use of big data techniques to analyze rail-related data. However, the major expectations for the approach have yet to be fully realized.

We are currently looking to apply techniques used in the prediction of solar flares to the ‘multi-variable’ analysis of rail problems. We are working closely with a team from Georgia State University (GSU), one of the leaders in the emerging field of data science. The team has recently made significant advances in big data analytics for solar flare prediction, which we believe can be applied directly to complex rail problems. The GSU techniques combine decision trees and deep neural networks that draw on multiple data streams.



This multi-data-stream approach to prediction fits well with our ELBowTie risk analysis methodology. Over the coming months we will be setting up analysis of real-time train data to prove the technique and demonstrate ELBowTie working live.


[1] “Reducing passenger rail delays by better management of incidents”, National Audit Office, 2008, management-of-incidents/

[2] “A Digital Railway for a Modern Britain”, Network Rail, 2017,

