Do we know why the number of traffic fatalities is declining? If not, can we find out?

The number of traffic fatalities has declined by 80-90 % from the all-time peak number in many highly motorized countries. It would be of great interest to identify factors that explain this decline. Unfortunately, this is difficult, and an ideal method does not exist. This paper discusses some less-than-ideal methods. Multivariate statistical analyses are unlikely to be informative because they are likely to be affected by both multicollinearity and omitted variable bias. This implies that they will always include both too many and too few predictor variables, a problem that is impossible to solve. Historical reconstruction is another possible method. It explains fatality reductions over time by known factors that are related to fatalities and for which sufficient information about their presence or uptake is available. Studies based on historical reconstructions show consistently that safer road user behaviour is a major contributor to reducing fatalities, followed by infrastructure and vehicle related safety measures. However, there is no way of establishing causality. The relative importance of different types of factors is highly dependent on the amount of information available. In a strict sense, there is therefore no prospect of providing a scientific explanation of the decline in traffic fatalities. In a less strict sense, historical reconstructions still may give an impression of relative contributions of some known factors. For example, the most recent Norwegian study identified factors that may explain more than half of the reduction of road traffic fatalities after 2000.


Introduction
The number of traffic fatalities reached an all-time peak level in many highly motorized countries around 1970. Since then, the number of fatalities has been greatly reduced in many countries. In Norway, the peak number was 560 in 1970. In 2021, there were 87 fatalities, a decline of 84.5%. Sweden recorded the highest number of fatalities in 1965 and 1966. In 2021 the number had been reduced by 85.4%. Denmark has recorded an 88.9% reduction of the number of traffic fatalities from the all-time high. By contrast, the number of traffic fatalities has stayed high in the United States. The lowest number in recent years (2012) was just 40.5% lower than the peak number (1972).
Why has the number of traffic fatalities been reduced so much? Was it brought about by successful road safety policies, or did other societal changes exert a greater influence? Why has the number of fatalities declined more in some countries than in others? Finding good answers to these questions can teach important lessons about what influences the number of traffic fatalities and what a country should do to reduce them. Unfortunately, as this paper will argue, it is very difficult to find good scientific explanations of the decline in the number of traffic fatalities. It is in the nature of historical data that they make well-supported causal inferences difficult.

Causal explanations: the basic requirements
A cause can be defined as any action, event or process that produces a change that would not otherwise have occurred (Elvik, 2011). A cause is a necessary condition for the change: without it, the change would not have occurred. The relationship between a cause and the changes it produces may, however, be probabilistic. Thus, a road safety measure may bring about a change in the probability of accident occurrence.
The key element in finding a causal relationship is to establish the counterfactual, i.e. what would have happened if the cause had been absent. It is widely agreed that the best way of establishing the counterfactual is to conduct a randomized controlled trial. The control group then serves as the counterfactual condition. If implemented successfully, a randomized controlled trial will reveal causality.
When studying historical changes, like the long-term decline in the number of traffic fatalities in a country, randomized controlled trials are ruled out. Even quasi-experimental designs are ruled out: history does not produce a comparable comparison group or multiple data series that differ in controlled ways with respect to the factors influencing them.
Sometimes, a large event occurs whose impacts can be reasonably estimated. The effect of the blood alcohol limit introduced in Great Britain in 1967 is clearly discernible in the data (Elvik, 2021). Usually, however, changes are more gradual, and it is difficult to relate them to specific events. Is it then possible to develop a causal explanation of a long-term trend, i.e. identify the factors that brought about the trend and without which it would not have occurred?
Given the fact that the number of traffic fatalities is influenced by very many factors, one possible approach would be to collect data on as many factors as possible and estimate their effects by means of a multivariate statistical analysis. However, a causal interpretation of coefficients estimated in statistical analyses is, in general, not justified (Hauer, 2010). A statistical analysis does not establish the counterfactual.
Another approach might be termed historical reconstruction. When trying to establish causality, historians often ask what might have happened if a certain event had not occurred (Førland, 2013; Macintyre, 1976). This kind of thinking involves reconstructing a hypothetical (i.e. non-observed) counterfactual. Can it ever become more than a hypothetical counterfactual? Hempel (1965) argues that historians often rely on scientific laws, or universal hypotheses, when proposing explanations. He also argues that explanations and predictions have the same logical structure. This provides a clue about how both statistical analyses and historical reconstruction can be made empirically testable, although a counterfactual in a strict sense may not be established.
As far as statistical analysis is concerned, splitting the data randomly into two equal parts creates a possibility to test the analysis. Regression coefficients are estimated using half the data. They are then applied to predict the other half of the data. If the predictions are successful, i.e. predicted values are close to observed values, one may provisionally conclude that the statistical relationships are reasonably stable and reflect real associations. This brings the analysis one step closer to establishing causality, although one would like to see replications of such split-half analyses in several data sets, with close to identical results, before discussing causality.
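The split-half idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an analysis from any of the studies discussed: the data are synthetic, the function name `split_half_check` is hypothetical, and an ordinary least squares fit to the logarithm of the counts stands in for a proper count regression model.

```python
import numpy as np

def split_half_check(X, y, seed=0):
    """Fit a log-linear model on a random half of the data and predict the other half."""
    rng = np.random.default_rng(seed)
    n = len(y)
    order = rng.permutation(n)
    fit_idx, test_idx = order[: n // 2], order[n // 2:]

    # Add an intercept column and fit log(y) by ordinary least squares
    # (a simple stand-in for a count regression model; requires y > 0).
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design[fit_idx], np.log(y[fit_idx]), rcond=None)

    # Apply the coefficients estimated on one half to the held-out half.
    predicted = np.exp(design[test_idx] @ coef)
    observed = y[test_idx]
    # Mean absolute percentage error on the held-out half.
    mape = float(np.mean(np.abs(predicted - observed) / observed))
    return coef, mape

# Synthetic illustration only: 60 observations, one predictor, a stable relationship.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=60)
y = rng.poisson(np.exp(6.0 - 0.1 * x)).astype(float)
coef, mape = split_half_check(x.reshape(-1, 1), y)
print(coef, mape)
```

A small held-out error is only evidence of a stable association in this data set; as noted above, it would need to be replicated in other data sets before causality is discussed.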
With respect to historical reconstruction, it would not make sense to use a split-half technique. However, prediction may sometimes be possible. If, for example, there was a period of increasing market penetration of a certain safety system, like electronic stability control, one can predict how the number of fatalities is likely to change in the future as market penetration continues towards one hundred percent. The accuracy of such predictions can then be assessed at some future point in time. If reasonably accurate, it has been established that the statistical association is stable over time, which is a first step towards establishing causality. Nevertheless, there are so many factors influencing fatalities that evaluating the accuracy of predictions would be very difficult. Moreover, one might explain all results (finding and not finding the expected relationships in the future) with effects of other factors that are not controlled for.
The following sections of the paper discuss what has been learnt, both from statistical studies and historical reconstruction.

Some previous statistical analyses
There have been many statistical analyses of factors influencing the number of traffic fatalities and changes over time in the number of traffic fatalities. This paper does not aim to review all these studies but will discuss a few of the most quoted studies. Fridstrøm & Ingebrigtsen (1991) present a panel data study for Norway based on data for 1974-1986. The study included more explanatory variables than any of the studies discussed below. Some of the explanatory variables are road safety measures, like seat belt wearing, vehicle inspections and law enforcement. A replication of the study using more recent data would have been useful, but there is too much missing data to replicate an analysis using monthly data for counties as the unit of observation. There was no consistent decline in the number of traffic fatalities during the period covered by the study: a decline in the first half was followed by an increase in the second half of the period.
Partyka (1991; 1984) developed a very simple model that fitted fatality counts in the United States from 1960 to 1982 almost perfectly. The model included only the number of unemployed workers, the number of employed workers, the non-labour force, and dummies for the year 1974 and the years after 1974. The good fit of this model shows that one does not have to include a large number of variables to model year-to-year changes in the number of fatalities, despite the fact that it is widely agreed that the number of fatalities is influenced by very many factors. However, when the model was used to predict fatalities for the years 1983-1989, it failed and predicted far too many fatalities. Thus, models that accurately describe past trends may not accurately predict future trends. Viewed as an example of the 'split-half' method mentioned above, Partyka's model failed to demonstrate a causal relationship, at least if such a relationship is assumed to be stable over time.
Page (2001) developed a model designed to identify the contribution of road safety policies to the decline in the number of traffic fatalities in OECD countries from 1980 to 1994. He modelled the number of fatalities by country and year (21 countries and 15 years = 315 observations) as a function of exogenous variables, i.e. variables that are not influenced by road safety policies, like the size of the population and its distribution by age. The idea was that if the exogenous variables fail to explain all of the decline from 1980 to 1994 in the number of fatalities, the unexplained part can be attributed to road safety measures. This is an intriguing idea, but there are three limitations to it if one seeks to apply it to a single country, rather than the 21 countries Page included: (1) During a period of, say, 20 years, all the exogenous variables Page included change gradually at almost constant annual rates and are, for that reason, likely to be highly correlated; (2) Page included seven variables. This is fine in a sample of 315, but could be a problem in a sample of 20 (20 years in one country); (3) The road safety measures that may have contributed to reducing the number of fatalities are not identified, but make up a nameless residual term. Policy makers often seek more precise knowledge about which road safety measures made a small or large contribution to reducing fatalities.
Lassarre (2001) presented a study with a similar objective to the study of Page. In some of the ten European countries included in his study, he was able to estimate the change in the number of fatalities associated with a certain road safety measure. However, this included only major changes in legislation, like laws making seat belt wearing mandatory. The effects of road safety measures that are introduced more gradually, like road improvements, could not be identified by the model but ended up in the trend term.
Finally, Antoniou et al. (2016) showed that the number of fatalities is positively associated with changes in Gross Domestic Product. In other words, the number of fatalities tends to grow in periods of strong economic growth. This shows the importance of controlling for economic growth in any model attempting to estimate the contribution of road safety measures to the changes in the number of fatalities. Good economic times may stimulate growth of traffic and fatalities, but also imply that countries can afford to do more to improve road safety than in harder economic times.

A statistical model for Norway 1997-2018
None of the studies discussed above were able to identify the contribution made by specific road safety measures to the decline in the number of traffic fatalities. The study by Page (2001) came closest, but the road safety measures were only identified as a residual term, not as explanatory variables in the model. To further assess the prospects of explaining the decline in traffic fatalities by means of a multivariate statistical analysis, a model for Norway based on data for 1997-2018 was developed. During this period, the number of traffic fatalities declined from an average of 327 in the first two years of the period, to an average of 107 in the last two years of the period.
A negative binomial regression model was developed, using the annual number of fatalities as dependent variable. Negative binomial regression is by far the most common type of count regression model used in accident research (Mannering & Bhat, 2014). Independent variables for which data were available for each year in the period were included. The independent variables included:
1. Year count (counting 1997 as 1, 2018 as 22)
2. Million vehicle kilometers of travel
3. Length of motorways in kilometers
4. Length of 2+1 roads with median barriers in kilometers
5. Number of fixed speed cameras in use
6. Seat belt wearing among car drivers (percent)
7. Share of vehicle kilometers performed by heavy goods vehicles (percent)
8. Share of vehicle kilometers performed by motorcycles (percent)
9. Share of vehicle kilometers performed by cars with electronic stability control (percent)
10. Share of vehicle kilometers performed by cars with a 5-star rating in the Euro NCAP program (percent)
11. Number of traffic tickets issued per million vehicle kilometers of travel
Most of these variables were expected to contribute to fewer fatalities. The only variables not expected to do so were vehicle kilometers of travel and share of vehicle kilometers performed by motorcycles. Increases in these variables were expected to increase the number of fatalities, all else being equal.
It can be argued that it makes no sense to include 11 independent variables in a data set consisting of 22 observations. This objection is certainly correct. Yet, if anything, not all relevant variables that may have contributed to reducing the number of fatalities were included in the model. It is known, for example, that the mean speed of traffic has declined after 2006, but data for earlier years are incomplete (Elvik & Høye, 2021). It is known that bicycle helmet wearing has increased, but again data are not available for the whole period. As far as drinking-and-driving is concerned, there are only data for 2006 and 2017 (Furuhaugen et al., 2018; Gjerde et al., 2013; Gjerde et al., 2008). Data on driver distraction, like the use of mobile phones while driving, are completely absent.
In view of what we have reason to believe influences the number of traffic fatalities, the model includes too few variables, not too many. But from a statistical point of view, one may reasonably argue that there are too many variables in the model. Nevertheless, coefficients were estimated for all variables, and most of them had the expected sign. The number of fatalities predicted by the model closely tracks the recorded number of fatalities, see Figure 1. The first impression is therefore that the model not only describes the decline in fatalities well; it also captures year-to-year fluctuations remarkably well. Alas, on closer inspection, first impressions turn out to be deceptive and the model falls completely apart.
First, the model fits the data too closely. It is over-fitted, in the sense that it not only explains all of the systematic variation in the number of fatalities, but also part of the random variation. This can be determined by comparing residual variance to the mean number of fatalities. Residual variance is 134.19. The mean number of fatalities is 223.09. If one accepts the commonly made assumption that random variation in the count of fatalities follows a Poisson distribution, variance equals the mean. Hence, the smallest residual variance that would be consistent with not "explaining" part of the random variation is 223.09.
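The over-fitting check described above amounts to a single ratio. The sketch below uses the two figures reported in the text; the function name `dispersion_ratio` is hypothetical and the Poisson benchmark (variance equal to the mean) is the assumption stated above.

```python
def dispersion_ratio(residual_variance, mean_count):
    """Ratio of residual variance to the Poisson benchmark (variance = mean)."""
    return residual_variance / mean_count

# Figures reported in the text.
ratio = dispersion_ratio(134.19, 223.09)
# A ratio below 1 means the residuals vary less than pure Poisson noise would,
# i.e. the model has absorbed part of the random variation.
overfitted = ratio < 1.0
print(round(ratio, 3), overfitted)
```

With these figures the residual variance is roughly 60% of what purely random Poisson variation alone would produce.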
Second, although the model appears to track the recorded number of fatalities closely, there is autocorrelation of the residual terms. The autocorrelation at lag one is -0.532. This means that year-to-year changes are not actually described as well as Figure 1 may lead one to believe. More specifically, there are strings of positive and negative residuals, rather than random shifts between positive and negative residuals.
Third, some coefficients have implausible values or the wrong sign. The coefficient for 2+1 roads with a median barrier implies that the growth of these roads from 1997 to 2018 has reduced the number of fatalities by 90%. This cannot be correct. New 2+1 roads with a median barrier influence only the accidents that occur on these roads, which account for less than 10% of all fatalities. The coefficient for the share of vehicle kilometers performed by motorcycles is negative, suggesting that a higher share of motorcycles reduces the number of fatalities. This is very implausible, considering the fact that motorcycles have a higher fatality rate per vehicle kilometer than other motor vehicles. Moreover, the share of fatalities represented by motorcyclists grew from around 12.5% early in the period to around 17.5% late in the period.
The share of vehicle kilometers driven by cars with electronic stability control has a positive coefficient, suggesting that it increases the number of fatalities. The coefficient is very precisely estimated and highly statistically significant. Yet inserting it into the predictive equation gives absurd results. It suggests that the increasing use of electronic stability control from 1997 to 2018 has increased the number of fatalities by a factor of 296499.
The root of these meaningless results is found in the very high correlations between the independent variables in the data set. Of 66 pairwise correlations, including correlations with the number of fatalities as dependent variable, 36 exceed ±0.9 and a further 10 exceed ±0.8. These extremely high correlations are generated by the fact that all independent variables grow over time. But the correlations are artefactual. For example, the length of motorways correlates 0.992 with the share of vehicle kilometers performed by cars with electronic stability control. This does not mean that these two variables measure the same thing, and that by dropping one of them from the model, the other would capture the effects of both. On the contrary, one may assume that new motorways and an increasing share of electronic stability control have made independent and additive contributions to reducing the number of fatalities. They just happen to have occurred at the same time and roughly the same rate.
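The point that shared growth over time is enough to produce very high correlations can be illustrated with synthetic data. In the sketch below the two series are generated independently of each other; their values and names are hypothetical and are not the Norwegian data. The count of 66 follows from 12 series (11 independent variables plus the number of fatalities).

```python
import numpy as np
from math import comb

# 11 independent variables plus the dependent variable give 12 series.
n_pairs = comb(12, 2)

# Synthetic illustration only: two quantities generated independently,
# each growing steadily over 22 years with its own random noise.
rng = np.random.default_rng(0)
years = np.arange(1, 23)
motorway_km = 100.0 + 20.0 * years + rng.normal(0.0, 15.0, size=22)  # hypothetical values
esc_share = 4.0 * years + rng.normal(0.0, 4.0, size=22)              # hypothetical values

# The correlation is driven by the common time trend, not by any link between the series.
r_levels = np.corrcoef(motorway_km, esc_share)[0, 1]
print(n_pairs, round(r_levels, 3))
```

Neither series depends on the other, yet a regression model given both has almost no information with which to separate their effects.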
A remedy sometimes suggested for co-linear data is to omit variables from models. In this case, this is not a solution (Fridstrøm, 2015). It will merely introduce another source of error: omitted variable bias. Thus, in the model including all 11 independent variables, the coefficient for seat belt wearing was -0.020. When included as the only independent variable, it had a coefficient of -0.118, implying that increased seat belt wearing by itself accounted for almost all of the decline in the number of traffic fatalities from 1997 to 2018.
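The size of this difference can be made concrete with a short calculation. The sketch assumes a log link, which is the usual form of a negative binomial regression, and assumes that seat belt wearing enters the model in percentage points; neither detail is stated explicitly above, and the function names are hypothetical. Under these assumptions a coefficient b implies that a change d in the variable multiplies expected fatalities by exp(b * d).

```python
import math

def implied_factor(coefficient, change):
    """Multiplicative change in expected fatalities under a log link."""
    return math.exp(coefficient * change)

def change_needed(coefficient, start, end):
    """Change in the predictor that alone would move the expected count from start to end."""
    return math.log(end / start) / coefficient

# Figures from the text: fatalities fell from about 327 to about 107 per year.
alone = change_needed(-0.118, 327, 107)       # seat belt wearing as the only predictor
full_model = change_needed(-0.020, 327, 107)  # coefficient from the 11-variable model
print(round(alone, 1), round(full_model, 1))
```

With the single-variable coefficient, an increase in seat belt wearing of roughly 9 to 10 percentage points would by itself reproduce the whole decline; with the coefficient from the full model, an increase of more than 50 percentage points would be needed.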
The truth is that even with 11 variables included, there is likely to be omitted variable bias. The problem is unavoidable for the simple reason that data about many variables is missing. This, of course, does not mean that the variables do not influence the number of fatalities nor that they are uncorrelated with the variables included in a model. Missing data does not mean no effect.

Historical reconstruction: some examples
More than 30 years ago Leonard Evans tried to identify the main determinants of traffic safety (Evans, 1990). He identified six main factors. These were: (1) Social norms influencing human behavior; (2) Driver behavioral adaptation to perceived changes in risk; (3) Legislative interventions; (4) Roadway infrastructure; (5) Traffic control; (6) Vehicle safety features. He regarded changes in social norms as the most important factor explaining the long-term decline in fatality rate per vehicle mile driven in the United States. He did not quantify the factors, remarking that (page 114): "I believe the problem is too multidimensional and complex to be susceptible to any tidy analytical solution using such techniques as multivariate analysis." Evans was clairvoyant in foreseeing the difficulties of trying to explain long-term changes by means of statistical analysis. However, a different research tradition does exist, which may be termed historical reconstruction. It proceeds by first identifying factors that may have contributed to the decline in the number of fatalities. Next, the changes over time in each factor are reconstructed. For example, estimates are developed of the share of vehicle kilometers each year driven by cars that have electronic stability control. The effects on fatalities of an increase in a certain factor, for example electronic stability control, are then estimated by relying on evaluation studies in the research literature. An estimate is then made of what the number of fatalities in a given year would have been if a certain factor, like electronic stability control, had not been present that year. This is referred to as a counterfactual number of fatalities. It will be higher than the actual number of fatalities, because if a factor increasing safety had not been present, the number of fatalities would have been higher than it actually was.
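The arithmetic behind a counterfactual number of fatalities can be sketched as follows. This is a simplified illustration, not the formula used in the studies cited: it assumes that the measure reduces fatalities by a fixed proportion wherever it is present and that all fatalities are equally exposed to it. The function name and all input values are hypothetical.

```python
def counterfactual_fatalities(actual, uptake, effect):
    """Fatalities expected had the measure been absent.

    actual: recorded fatalities in a given year
    uptake: share of exposure covered by the measure (0 to 1)
    effect: proportional fatality reduction where the measure is present (0 to 1)
    """
    # The recorded number equals the counterfactual number times (1 - uptake * effect).
    return actual / (1.0 - uptake * effect)

# Hypothetical inputs, chosen only to illustrate the arithmetic.
actual = 120
uptake = 0.80   # e.g. share of vehicle kilometers driven by cars with the system
effect = 0.25   # e.g. a reduction taken from evaluation studies
counterfactual = counterfactual_fatalities(actual, uptake, effect)
prevented = counterfactual - actual
print(round(counterfactual, 1), round(prevented, 1))
```

Repeating the calculation for each factor and each year gives the factor-by-factor, year-by-year counterfactual series on which a reconstruction is built.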
By estimating a counterfactual number of fatalities factor-by-factor, and year-by-year, the aim is to explain the decline by reconstructing a hypothetical situation in which the factors believed to produce the decline did not exist. Figure 2 gives an example of the output from this kind of study (Elvik & Høye, 2021). (C) smaller variation in risk between drivers belonging to different age groups; and (D) a potential decline in the reporting of serious injuries in official accident statistics. It is seen that these factors explain the greater part of the decline in the number of killed or seriously injured road users. Disregarding potential changes in reporting, 59 % of the decline could be explained. The three most important factors are decreased speed, infrastructure measures (such as median guardrails), and vehicle measures (such as increased uptake of electronic stability control and improved crashworthiness). Together, these factors explain about half of the explained reduction.
There have been a number of studies employing this approach. Elvik et al. (1984) identified factors contributing to the decline in traffic fatalities in Norway from 1977 to 1981. Broughton et al. (2000) estimated effects of three factors contributing to reducing the number of killed or seriously injured road users in Great Britain from 1983 to 1998. Elvik et al. (2009) Table 1 summarises key findings of these studies. The factors contributing to a decline in fatalities or serious injuries have been sorted into six main categories:
1. Improved infrastructure and traffic control
2. Vehicle safety features
3. Legislation and enforcement
4. Safer road user behavior
5. High-risk group attenuation
6. Reduced traffic exposure

The first group includes, for example, new motorways and changes in speed limits. The second group includes safety systems on motor vehicles. An example of a measure in the third group is increased use of speed cameras. Safer road user behavior includes, for example, increased seat belt wearing and lower mean speed of traffic. Attenuation of high risk means that variation in risk, for example between different age groups of car drivers, has become smaller over time.
Reduced traffic exposure is a reduction in the amount of travel performed by a group of road users.
None of the studies includes factors in all these categories. Furthermore, the contribution to explaining the decline in killed or seriously injured road users from the categories included varies considerably between studies. The only category making a major contribution in all studies is safer road user behavior. When looking only at those factors that are represented in all studies (infrastructure, vehicles, and road user behavior), road user behavior is still among the two most important factors (of three). Which of infrastructure and vehicle related measures is more important, varies between the studies. In addition to the inconsistency of the results, the value of such comparisons is limited because the relative contributions of each factor will be highly dependent on which factors are included. Road user behavior may be more important than other factors, but it may also be the factor for which most information is available. In addition, it may be influenced by the other factors, especially infrastructure related measures and enforcement.
The sum of first-order impacts, listed at the bottom of Table 1, is the sum of the counterfactual numbers estimated for each factor. There are two main problems with these studies:
1. They cannot establish causality: the estimated contributions only represent a hypothetical counterfactual.
2. They cannot be tested empirically: there is no way of running history a second time with one or more of the factors absent to see what would then have happened. Prediction of future effects of factors that have not "run their course" is, however, in principle possible.
In other words: these estimates cannot be treated as more than educated guesses, at best. Like the multivariate statistical models, the studies are certainly incomplete and do not include all factors that may have contributed to the decline in fatalities or serious injuries. The factors that were included were just those for which sufficient data on changes over time were available and their likely impact on fatalities could be estimated on the basis of evaluation studies. Potentially important factors like changes in the collective experience of the population of drivers, changes in driver distraction or in mean headway between vehicles could not be included in any of the studies. Had these factors been included, it is highly likely that the percentage contributions assigned to the currently included factors would have changed. None of the studies listed in Table 1 was able to explain more than about 70% of the decline it aimed to explain.

Discussion
An ideal method for identifying the contributions of factors to the decline in road traffic fatalities does not exist. Instead, we have described several suboptimal methods, applied two of them to Norwegian accident data, and discussed how results from such studies may be interpreted.
In a multivariate analysis, 11 variables were included in a data set consisting of 22 observations. Still, the list of variables was incomplete, and the estimated coefficients are undoubtedly influenced by omitted variable bias. Some of them were meaningless as a result of co-linearity among the variables. Splitting the data set in half in order to test the predictive performance of coefficients was not possible. A split-half analysis would require at least 50 observations, i.e. 50 years of data. The further back in time an analysis goes, the less data are available on explanatory factors. It is likely that in most highly motorized countries, good data on variables influencing the number of traffic fatalities will not be available when going as far back as fifty years.
Historical reconstructions have been made in several studies to explain fatality decreases in the Netherlands, Norway, Sweden and the UK. All studies identify factors that are likely to have contributed to the decreases. The choice of factors depends on the availability of information about their presence and their effect on fatalities from other sources. The studies have in common that they identify changed road user behavior as one of the most important factors, followed by infrastructure and vehicle measures. A critical drawback with such studies is that the relative importance of the different types of factors is highly dependent on the available information. Factors for which much information is available, will seem more important than those for which information is lacking.
Moreover, historical reconstruction faces the same problem as multivariate analyses when trying to extend the period backwards in time.
What about testing a historical reconstruction by means of prediction? In principle, this is possible for factors which have not run their course, i.e. for which a further increase or reduction is still possible. For example, close to 100 % of cars in Norway use daytime running lights. Any effects of a further increase in the use of daytime running lights are ruled out. Likewise, some vehicle safety systems, like airbags, are very close to 100 % market penetration. Other systems are still far from full market penetration and may continue to contribute to reducing the number of fatalities for some future years. This contribution can be predicted.
Nevertheless, any such prediction is conditional only. It essentially assumes that nothing else changes much. If it does, it becomes difficult to identify the contributions of various factors. The tendency for the mean speed of traffic to go down in Norway appears to have stopped in 2020 and 2021. Yet, the number of fatalities was reduced in both years. In 2020, a reduction of vehicle kilometers due to lockdowns during the Covid-19 pandemic is a plausible candidate for explaining the decline in the number of fatalities that year. In 2021, the decline in traffic volume did not continue, nor was the mean speed of traffic reduced. The decline from 2020 to 2021 was well within the bounds of random variation; hence any explanation may be entirely spurious. Yet, the declines both in 2020 and 2021 are consistent with a long-term trend.
It should by now be clear that the main reasons why it is so difficult to explain the decline in the number of traffic fatalities are:
1. The number of traffic fatalities is influenced by a very large number of factors.
2. High-quality data are available for only a few of these factors.
3. Many of the factors influencing traffic fatalities are highly correlated.
4. As a result of these characteristics, any multivariate statistical analysis will be influenced by omitted variable bias and problems of co-linearity.
5. Historical reconstruction faces the same problems: a lack of data about important variables and a limited possibility of empirically testing the effects assigned to the various factors that are believed to influence the number of fatalities.

Conclusions
It is difficult to develop good scientific explanations of the decline in the number of traffic fatalities in many highly motorized countries. Improvements over time in roads and vehicles have no doubt contributed, but so has safer road user behavior. Quantifying the contributions of the various factors is difficult, as reliable data are often missing. Any quantification runs a high risk of being misleading due to omitted factors whose contribution cannot be quantified.

Declaration of conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.