This is a write-up of the session “Statistics in Regulation and Policy”, given at the RPI Annual Competition and Regulation Conference 2024.
Not every x is a y. This may seem an abstruse question of algebra. But it was also a theme lurking throughout the recent RPI conference in Oxford.
The overt focus of the conference was the regulatory and policy challenges facing UK decision makers under a new Government. How to stimulate more building of the right kind of houses? How to ensure an effective transition to green energy? How to reform the overarching governance of our policymaking institutions (this last of course highly pertinent in the context of a Government with an ambitious new focus on five missions for change)?
But underneath this overt focus lay a question of statistics and data: really, none of the key questions can be understood or addressed without a proper grasp of the numbers that describe them – where those numbers are from, who created them, what they mean, and what they can and can’t be used for.
Fortunately for attendees, this substratum question of statistics and data did not lie in the shadows, but was addressed head-on in a brilliant opening session led by Alex Plant, CEO of Scottish Water, Georgina Sturge of the House of Commons Library, and Elise Rohan of the Office for Statistics Regulation.
Alex began by setting out the central role of data for a regulated business like Scottish Water. Data serve an external purpose – for scrutiny and by extension trust – and an internal purpose – to support sound decisions on asset management and operations. But beneath these generalities lies a world of challenge. It is in the nature of a water company’s business that its main assets – pipes in the ground – are very long-lived, indeed often performing well beyond their expected useful life. But how are water companies to interpret data on useful lives where a) the assets can often outlive these estimates by a number of years but b) failure can have a significant, perhaps catastrophic, impact on communities and the environment? The answer lies in finer-grained analysis of the drivers of performance and failure – and a recognition that investment in data is as significant as investment in physical infrastructure.
And Alex added a further point. Public discourse around water issues can be problematic for a water company. He highlighted a case in which the media reported sewage being seen on a beach in Scotland – yet closer study revealed it to be merely seaweed. Not every claim of pollution is sound; not every x is a y.
Georgina Sturge widened the discussion. She drew on her experience as a House of Commons researcher, as well as her book Bad Data and her forthcoming book Sum of Us, which will offer a history of the UK through data. She highlighted why good data matters, not just for water regulation but for a wide range of areas of policy.
Politicians want to use data – to make good decisions using evidence, to understand impacts and value for money, and because statistics can be persuasive. But the quality of data matters: the mere existence of data does not make them reliable for policy uses. The long history of Government policies that run into implementation difficulties illustrates this – Georgina gave one example of the roll-out of a payment system for farmers that was stymied by limitations in the data.
Why, then, is it so hard to avoid bad data? For one thing, anything involving people is likely to be messy. There can be multiple definitions – of poverty, well-being, migration and so on. And there can be multiple data sources, and they can conflict – for example, crimes reported by the police might be falling while crimes reported by the public in surveys of victims might be rising. As society adapts, data need to adapt too – otherwise the picture they present will be misleading.
It turns out, then, that not every piece of data is completely reliable. Not every x is a y. And it may not even really be an x. What may look like a pattern or established fact can easily fall apart under scrutiny.
To guard against this, everyone needs to be aware of the sources and biases in data – whether it is the politician as user; the analyst or statistician as producer; or the citizen as beneficiary.
This call for scepticism was echoed by Elise Rohan. Elise led the work of the Office for Statistics Regulation during the recent UK General Election. She outlined the importance of the OSR’s role in clarifying what can and can’t be inferred from claims made using data. She noted that the key OSR test is not whether a claim is right or sound, but whether it is intelligently transparent – that is, whether it is possible for a reasonable person to access the evidence underpinning the claim and make sense of it. She showed how OSR has advocated these principles – and intervened in some high-profile cases during the Election when the principles were not followed (not every claim was consistent with intelligent transparency – again, not every x was a y).
The session did not, I’m afraid, resolve the big policy questions set out for the remainder of the conference. For example, it did not tell us how to decarbonise home heating, nor how to ensure the UK’s planning regime could be reformed to enable more homes to be built. The other sessions took on these tricky issues with admirable verve, wit and insight.
But this session did set the scene very effectively. It reminded us to bring scepticism to all claims involving data – to always ask who produced the data, what the data mean and when they can be used. The session reminded us, in short, that not every x is a y.