Background: from Morgenstern to Manski

General aspects

The existence of errors, and consequently of uncertainty related to the measurement of given phenomena, has been a key concern of inferential statistics since the early 19th century, when Gauss developed his probability distribution of measurement errors. Nevertheless, the importance of measurement errors and of the related uncertainty differs considerably across scientific domains.

In experimental sciences, where phenomena are fully observed and experiments can be replicated under the very same conditions, uncertainty can be relatively easily controlled and reduced (e.g. by increasing the number of replications). At the same time, the true value of a phenomenon can be determined almost perfectly, since all of its determinants are fully observed. In non-experimental sciences, such as economics and the social sciences, the situation is rather different. In this context, phenomena are only partially observed and there is no possibility of replication. In such a situation, controlling the uncertainty about estimates of socio-economic phenomena is much more complex. Furthermore, the idea that we can obtain the true value of a given phenomenon (e.g. the true value of GDP) is something of a contradiction in terms.

Despite the fact that official statistics deal with non-experimental phenomena, statistical offices have traditionally concentrated their efforts on calculating and publishing a single value for each observation. Consequently, the estimates regularly published by statistical offices are essentially “point estimates” (i.e. the central value, under the assumption of a symmetric probability distribution). In this way, they implicitly ignore the uncertainty unavoidably associated with the estimates they produce. Admittedly, in recent years many statistical offices have made efforts to accompany their estimates with reliability and precision measures, such as revision analyses and standard errors, but these provide only partial and incomplete information about the uncertainty associated with statistical estimates.

The choice of statistical offices to focus on publishing point estimates can be attributed to the need to privilege simple computations and to facilitate the communication and dissemination of information, while minimizing the risk of creating confusion among users.

Historical considerations

The idea that the uncertainty associated with statistical estimates should not be disregarded has long been present among economists and statisticians. Probably the first to raise this problem was Oskar Morgenstern in his book “On the Accuracy of Economic Observations”, published in 1950 and further revised and extended in its second edition of 1963.

Morgenstern’s book generated quite an interesting debate, in which two of the most eminent economists and statisticians of the period, Simon Kuznets and Raymond T. Bowman, took part, among others. Kuznets’s contributions to the topic are available in two papers published in 1950: “Conditions of Statistical Research”, Journal of the American Statistical Association 45 (249): 1–14; and “Review on the Accuracy of Economic Observations”, Journal of the American Statistical Association 45: 576–579. Bowman’s contributions appeared in 1964, in his presidential address “The American Statistical Association and Federal Statistics”, Journal of the American Statistical Association 59 (305): 1–17, and in his comments on “Qui Numerare Incipit Errare Incipit” by Oskar Morgenstern, American Statistician 18 (3): 10–20. Kuznets, Bowman and Morgenstern agreed on the importance of uncertainty in economic statistics, but they disagreed somewhat on the prospects for measuring it, Morgenstern’s position being more pessimistic than those expressed by Kuznets and Bowman.

After the early 1960s, with few exceptions, interest in the topic was quite low among both researchers and official statisticians. More recently, thanks to the attention placed on accuracy by the European Statistics Code of Practice (CoP), and even more to the paper by Manski (2016), the measurement and communication of uncertainty has gained new and significant impulse. In his paper, Manski, starting from the well-known classification of errors into sampling and non-sampling ones, identified three main sources of uncertainty for economic statistics: transitory, permanent and conceptual (definitional).

Transitory causes are those related to the incomplete information set, or preliminary data, on which first estimates are based. This source of uncertainty progressively disappears, tending to zero as the information set stabilizes. It is nevertheless very important, since measuring it allows a better interpretation of preliminary estimates, which is crucial for policy making.
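
As a concrete illustration, transitory uncertainty can be approximated from the historical record of revisions: the spread between preliminary and final estimates of past periods gives an empirical interval to attach to the latest preliminary figure. The sketch below, in Python with invented data, computes such an interval; the figures and the 10th–90th percentile band are assumptions made purely for illustration, not part of any official methodology.

```python
import numpy as np

# Hypothetical quarterly growth rates (percent): preliminary estimates
# and the corresponding final estimates once the information set stabilised.
preliminary = np.array([0.4, 0.6, 0.2, 0.5, 0.3, 0.7, 0.1, 0.4])
final       = np.array([0.5, 0.5, 0.3, 0.7, 0.2, 0.8, 0.3, 0.5])

# Historical revisions: how much each first estimate was later changed.
revisions = final - preliminary

# Empirical 10th-90th percentile band of past revisions.
lo, hi = np.quantile(revisions, [0.1, 0.9])

# Attach that band to the newest preliminary estimate (also invented here).
latest_preliminary = 0.6
print(f"Latest preliminary estimate: {latest_preliminary:.1f}%")
print(f"Plausible range once the data stabilise: "
      f"[{latest_preliminary + lo:.2f}%, {latest_preliminary + hi:.2f}%]")
```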

Permanent causes are related to biases in the results, generated by a certain rate of non-response or by a sampling scheme that ignores some groups of individuals or businesses. For example, ignoring enterprises with fewer than 10 employees when calculating the industrial production index can bias the results.
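
A small simulation can make this concrete. In the hypothetical example below, small enterprises grow at a different rate from larger ones, so an index computed only on firms with at least 10 employees stays biased no matter how much data is collected. The population sizes and growth rates are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population of enterprises: size class and output growth (%).
n_small, n_large = 8000, 2000                   # small firms: < 10 employees
growth_small = rng.normal(1.5, 2.0, n_small)    # small firms grow faster here
growth_large = rng.normal(0.5, 2.0, n_large)

employees = np.concatenate([rng.integers(1, 10, n_small),
                            rng.integers(10, 500, n_large)])
growth = np.concatenate([growth_small, growth_large])

true_index = growth.mean()                      # target: all enterprises
biased_index = growth[employees >= 10].mean()   # frame excludes small firms

print(f"True average growth:   {true_index:.2f}%")
print(f"Estimate (>=10 empl.): {biased_index:.2f}%")
print(f"Permanent bias:        {biased_index - true_index:.2f} pp")
```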

The conceptual (definitional) causes of uncertainty can be considered from different points of view. For example, changing the definition of an aggregate yields different values; this is the case when considering, for example, ILO unemployment instead of register-based unemployment. Another way to look at the conceptual causes of uncertainty is through the unobserved components of a series, such as seasonality or calendar effects. Making different hypotheses about their generating process and, consequently, adopting different methods for filtering such components out can produce quite significantly different results.
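
For instance, two standard seasonal-adjustment filters applied to the same series generally do not return the same adjusted values. The sketch below compares a classical moving-average decomposition with the STL filter on a simulated monthly series; the series and the choice of these two particular filters are assumptions made for illustration, not the methods used by any specific statistical office.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL, seasonal_decompose

# Simulated monthly series: trend + seasonal pattern + noise.
rng = np.random.default_rng(1)
idx = pd.date_range("2015-01-01", periods=96, freq="MS")
trend = np.linspace(100, 120, len(idx))
seasonal = 5 * np.sin(2 * np.pi * idx.month / 12)
series = pd.Series(trend + seasonal + rng.normal(0, 1, len(idx)), index=idx)

# Two different hypotheses about the unobserved seasonal component.
classical = series - seasonal_decompose(series, model="additive", period=12).seasonal
stl = series - STL(series, period=12).fit().seasonal

# The two seasonally adjusted series disagree month by month.
diff = (classical - stl).abs()
print(f"Mean absolute difference:    {diff.mean():.3f}")
print(f"Largest monthly discrepancy: {diff.max():.3f}")
```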

State of the art

Nowadays, several statistical offices are investing in identifying the best way to measure and communicate uncertainty. Among them, it is worth noting the project undertaken by the Bank of England on the use of fan charts to communicate uncertainty, the project more recently undertaken by the Office for National Statistics (ONS) in the UK, and the project by Statistics Netherlands (CBS) on taking stock of the sources of uncertainty in its statistics.
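
As an illustration of the kind of visual communication involved, the sketch below draws a simple fan chart with matplotlib: nested quantile bands around simulated future paths. The data, quantile levels and colours are invented; this is a generic sketch, not the charting code of the Bank of England, the ONS or CBS.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)

# Observed history (invented) followed by a fan of simulated future paths.
history = np.array([1.2, 1.4, 1.1, 1.5, 1.6, 1.3, 1.7, 1.8])
horizon = 8
paths = history[-1] + np.cumsum(rng.normal(0.05, 0.3, (5000, horizon)), axis=1)

# Quantile bands: darker shading for the more likely central ranges.
t_hist = np.arange(len(history))
t_fore = np.arange(len(history) - 1, len(history) + horizon)
bands = [(5, 95), (20, 80), (35, 65)]

fig, ax = plt.subplots()
ax.plot(t_hist, history, color="black", label="observed")
for i, (lo, hi) in enumerate(bands):
    lower = np.quantile(paths, lo / 100, axis=0)
    upper = np.quantile(paths, hi / 100, axis=0)
    # Prepend the last observed value so the fan opens from the data.
    ax.fill_between(t_fore,
                    np.r_[history[-1], lower],
                    np.r_[history[-1], upper],
                    color="tab:red", alpha=0.2 + 0.15 * i, linewidth=0)
ax.set_title("Illustrative fan chart (simulated data)")
ax.set_ylabel("growth, %")
ax.legend()
plt.show()
```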

Measuring uncertainty is a complex and challenging task, which can involve the use of sophisticated statistical and econometric techniques (either classical or Bayesian), resulting in a quantitative evaluation of the data. No less challenging is identifying the best way to transform the results into easy-to-read, attractive and non-misleading dissemination products.

For these reasons, an in-depth methodological and empirical evaluation of a number of approaches has to be carried out before concrete proposals are made. At the same time, a detailed investigation of the pros and cons of communicating uncertainty needs to be carried out with the active participation of users.

The risk of misleading users is high, which is why particular care has to be taken when choosing the appropriate tools for measuring and disseminating uncertainty. On the other hand, providing accurate and clear uncertainty measures will considerably enhance the relevance and credibility of official statistics. Policy makers will benefit from this additional information, especially in difficult periods, when the uncertainty of the figures tends to increase.