
Is Policy Evaluation Fit For Purpose?

Sep 26, 2016 | Blog

By Roger Highfield, Director of External Affairs of the Science Museum, member of the Royal Society’s Science Policy Advisory Group.

To tackle climate change, ecosystem destruction and the many daunting issues facing humanity we need not only to draw on science and engineering but also develop policies that can change the behaviour of 7.5 billion people.

That means we need ways to evaluate which policies work and which don’t, and figure out how to hone them. However, there is still some way to go to make our existing institutional machinery fit for purpose, according to a far-ranging discussion on Policy Evaluation for a Complex World that I chaired this month at St Martin-in-the-Fields, Trafalgar Square, London, for the Centre for the Evaluation of Complexity Across the Nexus (CECAN).

Prof. Nigel Gilbert of the University of Surrey, CECAN Director, opened the meeting with Jane Elliott, Chief Executive of the Economic and Social Research Council. Before them sat an audience of about 120 government policy analysts, academics, social researchers and business people, together with Sir Mark Walport, Chief Scientific Adviser to HM Government, and the members of a distinguished panel: David Halpern, Chief Executive of the Behavioural Insights Team; Dame Margaret Hodge MP, former Chair of the Public Accounts Committee; Dr Ulrike Hotopp, Chief Economist at Simetrica; and Michael Kell, the National Audit Office’s Chief Economist.

I was keen to find out how much had changed in the decade since I covered a fascinating report by the Commons Science and Technology Committee for The Daily Telegraph, entitled Scientific Advice, Risk and Evidence Based Policy Making.

In preparing that report, MPs discovered how some research commissioned by ministries was quietly buried if it did not back Government policy. One minister had underlined her commitment to her policy by adding that she had even commissioned research to back her case. Another called for a ban when there was no scientific definition of what was being banned, no cost-benefit analysis and no public engagement, let alone any evidence that it would even work.

Opening the discussion about the current state of evaluation was Sir Mark, who has to wrestle with issues ranging from badgers and bovine TB to the Haldane Principle as Head of the Government Office for Science. He described three ways that science slots into policymaking:

1. Sifting evidence, most importantly providing ‘good evidence reviews’ (Sir Mark challenged the audience, ‘Why don’t we write a single review and bring it up to date, so that you have version 1.0, 2.0?’ rather than simply study the latest evidence);

2. Good and efficient communication, which he said was relatively neglected (‘too often scientists live in their own arcane worlds’) and confused by badly-framed discussions in which values are conflated with the science;

3. Policymaking itself, when scientists have to give advice on the basis of incomplete evidence and unknowns. They do this in spite of the fractured nature of academia and the siloed nature of Government departments – the latter often focusing more on dealing with problems than preventing them (‘We are extremely good at paying for the treatment of disease, very bad at paying for the maintenance of health’). He added: ‘On the whole, most effort goes into evaluation before policymaking.’

Sir Mark emphasized the role of social scientists, for example in understanding the disruptive effects of the internet. ‘You really do need all of the sciences,’ he stressed. In dealing with the recent Ebola epidemic, he added, it was important to consult anthropologists who understood local burial rituals in which the higher the status of an individual who had died, the more people touched the corpse.

I asked him if the policy evaluation machinery in this country is fit for purpose. ‘There is no simple policy evaluation machinery,’ he replied. ‘We are better at evaluation before we make policy and we could do a lot more to evaluate the policies that we have made. But we are better than most countries.’

But, on what was one of the hottest days of the year, Michael Kell gave a much frostier answer to my question, based on a National Audit Office report Evaluation in Government, published at the end of 2013: ‘We don’t have a system that is fit for purpose.’

More sophisticated methodologies often gave more equivocal findings, reported the NAO evaluation of evaluation, and there were ‘question marks about the credibility of Government commissioned evaluation’. It was, he said, as if the Government was marking its own homework.

Delegation of evaluation to an arms-length body means that conclusions are more trustworthy, and he told the meeting that he welcomed the establishment of CECAN as a way to help produce more high quality, robust evidence. In this way, CECAN can improve the dialogue between academia and Government.

Like Sir Mark, Michael Kell wanted to know what the totality of the evidence base means for a given policy, not just the latest evidence. ‘I’d like an overview, and I’d like it quickly please.’

In the second response to Sir Mark, David Halpern said it was important for Government to ‘say out loud what it doesn’t know’ and cited Mark Twain to emphasise that its evidence has to be sound: ‘It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so.’

He added that he was heartened by how ministers now talk about randomized controlled trials, RCTs. Even complex ‘wicked’ problems can be tackled by many smaller simpler interventions (nudges) that can be tested in RCTs. And he extolled the benefits of evaluating variations of policies in ‘multi-armed trials’, rather than waiting years for much larger studies to reach their conclusions.

But when it comes to making evaluation part and parcel of policymaking in Government, he thought there was more to be done: ensuring that the whole policy profession is familiar with methods that build our understanding of ‘What Works’, such as RCTs and ‘step-wedge’ designs, and having the Treasury routinely insist that such methods be built into programs and, ultimately, spending reviews.

Dame Margaret Hodge, who has recently published the book Called To Account, told the meeting that she felt it is ‘particularly interesting’ that in the UK, academia has less influence on policymakers than other institutions such as think tanks. She speculated that one reason for academics’ lack of impact was a love of jargon (when it came to tax, for example, experts ‘hid behind complexity to avoid there being a public debate about fairness’) and another was the long gestation of academic evidence. ‘Timeliness matters.’

Myopia was another issue highlighted by Dame Margaret. Policymakers do not take a sufficiently long-term view, being ‘in and out of jobs too quickly’, and they only paid lip service to the importance of evidence. They were too wary of innovation, being fearful of failure. Like others, she highlighted the lack of coordination across Government departments. ‘I think it is far far worse in Government than in academia.’

During the Q and A we discussed policy innovation and evaluation in the context of the ‘B word’ (Brexit, not badgers), such as reforming farming to provide more benefits for ecosystem services. Sir Mark said there were opportunities for deep policy reassessment but Dame Margaret added that, while she agreed that Brexit provides an impetus to think afresh, it will mean there will be less money available for research.

When it comes to the rise of ‘post-truth’ politics and the rejection of experts witnessed in the Brexit vote, Michael Kell said that the answer was to maintain a focus on those who really need evaluation information, ideas and tools.

In her response to Sir Mark, Dr Ulrike Hotopp emphasized once again the importance of evaluation and how it also rested on having precise policy objectives, which are not always clear at the outset. She added that many studies, for instance in the field of economics, are not formally thought of as evaluation but are useful when it comes to assessing the impact of policies.

She told the packed meeting that she was also delighted that CECAN has got off the ground, describing it as a vital resource for policy evaluators.

This chimed with Jane Elliott’s introduction, in which she said that it was important to draw on the breadth and depth of insights from the social sciences, such as the use of behavioural psychology to influence people, adding that they were still under-utilised by policymakers.

And it also complemented the introduction by Prof. Nigel Gilbert, who stressed that, if evaluation is to really make a difference, it needs to recognise the messiness of policy development and the non-linearity of real-world phenomena, an area in which his own research has shown agent-based modelling to be a useful tool for exploring future what-if scenarios.

Evaluation needs to be continuous and not just grafted on to a supposedly linear policymaking process, he said, pledging that ‘CECAN will develop methods for policy evaluation in complex areas.’
