A new model for public policy?
Examining Nicholas Gruen’s proposal for an Evaluator General to ensure policy is genuinely evidence-based
Glyn Davis
Good day, I'm Glyn Davis, and welcome to The Policy Shop, a place where we think about policy choices.
Male 1
It's not my job to care. That's what politicians are for. My job is to carry out government policy.
Male 2
Even if you think it's wrong?
Male 1
Well, almost all government policy is wrong. But frightfully well carried out.
Glyn Davis
It's an eternal struggle for governments to ensure the taxpayer gets value for money and that services are delivered effectively and fairly. In this episode we're going to dive into the detail about how we might improve public policy in Australia. In particular we're going to have a look at a proposal for a new model of evidence-based policy.
My guests today are Nicholas Gruen, the CEO of Lateral Economics. Nicholas, welcome to the podcast.
Nicholas Gruen
Hi, Glyn.
Glyn Davis
Joining us on the phone from New Zealand is Patricia Rogers, Professor of Public Sector Evaluation at the Australia and New Zealand School of Government. Patricia, welcome to the podcast.
Patricia Rogers
Hello, Glyn.
Glyn Davis
I'm keen to think about how we evaluate public policy. Nicholas, before we go to your proposal, can you tell us a bit about the current state of play? What goes on in government to evaluate policy?
Nicholas Gruen
What goes on is a great many things and they range from serious attempts to evaluate policy, very often after the event to what I would argue is more common, which is evaluation which partakes of the form of evaluation but in many ways is done with an eye to the institutional imperatives of the organisation itself.
Glyn Davis
But you're saying that evaluation is not really part of the ordinary operating culture?
Nicholas Gruen
Yes, and it's extremely difficult to mandate that it be so, that's the point. It's very easy to say, as the Prime Minister says, as the Secretary of the Department of Prime Minister and Cabinet says, we have to be evidence-based in our policy, everyone says that. There are two problems with that, one is watch the hands, not the gestures if you like. One is that that's easily said.
The other problem is that being evidence-based is really, really hard. What tends to displace being evidence-based, being really evidence-based is either pretty much obviously pretend evidence-based or people who with great seriousness will say we did a randomised control trial, as if that makes something evidence-based.
What it does is it puts an evidence-based protocol in the middle of something but there's so much more to being evidence-based. If you think how does a teacher know how to teach well, that's a big story behind that. Some randomised control trial in Kenya or Alphington or anywhere else is going to be a small part of the evidence that will be needed to be evidence-based in your teaching or anything else.
Glyn Davis
So Patricia, staying with the current way that we do evaluation, can you say something about your experience in both New Zealand and Australia, given that evidence-based policy has been a mantra for so very long?
Patricia Rogers
In some departments and agencies you have a range of very good evidence that's really embedded in the way that they make decisions, and they do connect the pieces. But far too often evaluation is something that you do at the end, it's something that's primarily about ticking the box and compliance, it's about accountability in a very narrow sense and it's often done too little too late and not really linked into the decision-making. Or conversely it's seen as if there's going to be one answer that will stand for all time to inform the evidence.
Over the last 10, 20, 30 years what you see is these waves of governments, particularly incoming governments saying we want to be more evidence-based, we want to find out what works and what doesn't, we want to stop doing what doesn't work, we want to do more of what works and learn more. But somehow either the appetite for really confronting when things don't work fades, or people realise that it's much harder than you might think just to find out really what works.
Because fundamentally that's the wrong question, you really need to find out what's working for whom in what ways and what's not working for whom in what ways and what situations.
That's a much harder question and not one that's going to be answered in one study.
Glyn Davis
Let's explore Nicholas's big idea in this area. Nicholas, you've proposed the establishment of an Evaluator General to ensure that Australian public policy is effective. Can you tell us a bit about this idea?
Nicholas Gruen
Sure, so the first big idea in that idea is the idea that evaluation, like the Australian Bureau of Statistics or Meteorology, is something which is trying to elicit objective information and therefore it should be independent. It shouldn't be at the whim of organisations that are being evaluated. I mean that's a pretty obvious kind of point, but it's really very important for me to stress that although I've called this the Evaluator General because you have to use the resources of the English language, and I am certainly appealing to the same set of ideas as is behind the Auditor General, I'm talking about far more than what the Auditor General does.
Glyn Davis
Tell us the difference.
Nicholas Gruen
The Auditor General sits above a system and it ensures that things are done in a way that is compliant with what it's on about, which is integrity and decision-making, and it comes from a financial control perspective. That's no longer entirely the case but comes from, I mean historically that's pretty much where it comes from.
With evaluation, one of the critical things is that the voice of people at the bottom of these systems is never really heard. In fact what happens is that senior people, they might be politicians, they might be senior bureaucrats trying to keep the politicians reasonably happy and reasonably safe, will say things which we require them to say.
So we go through a public theatre of accountability and the politician might say we'll get more police on the beat, we'll get more teachers into schools, we'll improve, we'll close the gap, we'll do this, we'll do that. Then pressures get placed on people further down the system, but if we are to do these kinds of things, if we're to close the gap with Aboriginal people in Australia for instance, we need to have a system which is learning on the ground in all the places where policy is going on. And a top-down system of accountability won't do that, it will disrupt it, it will disguise, it will be able to produce all kinds of spreadsheets and tables that look like accountability.
But we'll be back where we are right now which is doing the same thing or something slightly different, over and over again.
Glyn Davis
So is this the cat that disappears when we try and look at it? Will the effect of constant evaluation be that nothing happens?
Nicholas Gruen
Because being evidence-based is a hard thing to do, it's certainly possible to implement the letter of this law, the letter of what I'm talking about, and achieve very little. But it's designed to try to support people at the bottom, people in the field, people all the way through an organisation trying to learn.
Let me give you an example, I'm not sure exactly the dates but let's say in 2003 or 2004, a particular state government department administering child protection decided that it would try to reunify children with parents that they'd previously removed because of abuse and neglect.
Well that's easy to talk about in a university paper. You can show that if you could make this happen it'd be a very good thing. Well the success rate of that policy was 30 per cent, 30 per cent of the families reunified were successfully reunified. In fact while the policy was current, one particular office got that success rate to 85 per cent. Nobody knew at central office, and when the policy was swept away for the next fad from the top, nobody knew about this and that group of people who had made the policy work, done the learning, provided the support for the families, identified the right families, that learning was scattered to the winds as various people were promoted and moved.
So think of that as what's wrong and my Evaluator General is trying to make that right.
Glyn Davis
So it's a ground-up program.
Nicholas Gruen
Very much so.
Glyn Davis
With independence.
Nicholas Gruen
Yeah.
Glyn Davis
Where a team works with people who've actually been in service delivery, to try and evaluate against…
Nicholas Gruen
Correct, so in my model, the Evaluator General ultimately is responsible for the monitoring and evaluation system of an agency. If the agency says we're protecting children, then the Evaluator General will provide some resources to help build a monitoring and evaluation system to help those people deliver that service and to be accountable to themselves to start with.
We have to have self-accountability, which is the best sort of accountability, to drive all the other kinds of accountability. So the idea is that they collaborate closely, that they provide expertise, which is very thin on the ground in this sort of area, and if things work well that close collaboration turns into a seamless chain of accountability all the way to the top.
When there is a disagreement between the agency and the Evaluator General about evaluation, what the Evaluator General says goes.
Glyn Davis
Patricia, you've had a lifetime of expertise in evaluation, does this sound to you like the sort of change that could make a difference?
Patricia Rogers
It has some appeal and some cautions. I think part of the issue is about the label and what's meant by that. Because the notion of Evaluator General sounds like an Auditor General who comes in completely separately from outside. That has both some appeal and some worries because certainly there's a problem in evaluation at the moment being totally under the control of the program management. Very often within government there are a lot of things that never get evaluated, it's the little welfare projects, often a lot of Indigenous programs get evaluated immensely and the mainstream where the money is spent, particularly where the tax system forgoes income, isn't evaluated at all.
So I think there's something very important there about holding the government as a whole to account. But there is a risk that that would drive out the actual internal systems that need to be in place. So Nicholas is trying to frame that as a way of building those internal systems but would include this learning from experience. His example of the family reunification comes out time and time again, there are classic stories in evaluation from the '60s and '70s of similar issues where across a whole program on average it didn't work but there were pockets of excellence that did, and no effort made to learn from them. This terrible waste of actually being able to improve things by learning from it.
Nicholas Gruen
We can surmise that that has been going on for the last 30 odd years in Aboriginal policy. I know some of these things, and are we learning from them, are we generalising from them, are we expanding what works? No, not really. When I talk about this publicly I often talk about the Royal Society in 1660 and its famous motto, do you remember what it is, Glyn? In Latin Nullius in verba, a fantastic idea, which is accept nothing from anyone's word, put nature to the test.
Forget about Aristotle, if Aristotle said that small stones fall more slowly than big stones, check it out.
Glyn Davis
As he did.
Nicholas Gruen
Check it out.
Glyn Davis
He also said that men had more teeth than women.
Nicholas Gruen
So maybe he was speaking metaphorically, who knows? My idea is to say about incumbent systems, the way we do things with their unions and senior managers and all the rest of it, they're all essentially vested interests in this system. To say to them we will put you to the test alongside these cute little innovative programs, and we'll know publicly which ones are working better and why.
Patricia Rogers
What I would like to connect to that is again about having a framework to look at the gamut of the important things that government agencies are doing or ought to be doing. An alternative suggestion that might be part of this Evaluator General, or might be an alternative way of achieving the same ends, would be to have very explicit learning agendas, and each department should be able to identify what are some of the enduring issues that they have to address. So for schools it could be about attendance; for justice, it could be about differential incarceration rates between groups; for family programs, it could be the issue about how do you achieve and support family unification where that's possible, how do you get to that?
So it never comes down to a single evaluation, it never comes down to a single metric but you actually have a process where you can put different evidence and these different vested interests can present evidence. It can be checked and discussed and so you not only have a system for bringing together different types of evidence, but you have a process of dialogue around it so that you actually get to interrogate it and draw some interim conclusions about what the state of the situation is, what's causing it, how things are going, what might be done differently and how those efforts are going. It becomes an ongoing discussion around a limited number of key priority areas.
That's quite a different view from these very scattered efforts that we have, where we have evaluations going on and performance indicators and customer satisfaction and all sorts of international literature not actually being brought together and the different groups involved not being brought together.
Nicholas Gruen
As you described that I thought to myself well if that happens within a department, the department's senior managers are thinking all the time what if the minister gets a question in parliament on this, what am I going to – they cannot any longer – and this is not a moralistic comment, this is simply a comment about the way we are now. They cannot avoid looking at all of these things with a news management lens, and that is just catastrophic for knowing anything.
I remember I took real note when Anna Bligh replaced Peter Beattie, a very consensual handover as I recall. I went up to Queensland and I was doing some work in innovation and they said we don't use the expression smart state anymore, because Anna Bligh's minders tell her that that's not her brand. Well that's just catastrophic, these are things – calling it the smart state was never really, I wasn't a big fan of it but that's fine, that's a marketing thing. But these are the PR imperatives of the way government is done at the moment. Now we can have a long and serious talk about how to address that, but that's the reality that we are having to cope with.
I'm trying to build an institutional house, an institutional place. I'm thinking of it a little bit like monasteries in the dark ages where knowledge can be protected and expanded.
Glyn Davis
Patricia, you mentioned Indigenous programs being heavily evaluated, why has that evaluation had so little impact on policy formulation?
Patricia Rogers
I'm thinking about an evaluation I was involved in a number of years ago of the Stronger Families and Communities Strategy that had over 600 projects across Australia. A large number of those were Indigenous projects and one of the things that we did as part of that was exactly the sort of thing Nicholas was talking about, we actually identified which were the projects that seemed to be working particularly well and what could we learn from them. We went and did a number of case studies of those and were really trying to see whether there were common features or whether they were all different in different ways. We actually did produce a report looking at these factors and that was used by what was then FaHCSIA to look at designing some of the follow-on projects.
Nicholas Gruen
Patricia, I would call a success not just producing a report and having some of those recommendations accepted but starting a whole culture of having people who have some expertise in evaluation, helping programs become evidence-based. This is kind of how Toyota revolutionised production on the line, they gave their workers literally, this isn't a figurative statement, 10 times as much training. They trained them in statistical control and they instead of imposing KPIs on them and running the line faster and faster and trying to get them to keep up, they basically managed to harness the intrinsic motivation of these teams of workers on the line to do as good a job as they could. As good an evidence-based job as they could.
Within about 10 or 15 years they had doubled labour productivity, not by spending more capital, well actually more human capital if you like. So that's an evidence-based culture and the difference is massive. Before I chaired the Australian Centre for Social Innovation I was a bit like Tony Abbott, well what I mean by that…
Glyn Davis
In so many ways.
Nicholas Gruen
In so many ways, I was like Tony Abbott in that I was a sort of a tragic Liberal. Now Tony Abbott isn't even a Liberal, but what I mean by that is Tony Abbott said the poor will always be with us. I think that's true but that we can do much better in these programs.
I thought basically we couldn't, that these were intractable social problems and working at the Centre for Social Innovation showed me most people, 99 point something per cent of people want to get a better life and there's something stopping them.
We can unpick some of this stuff but we have to be really empirical and we have to get them to help us and we have to empower people who work with them, not the powerful people at the top of these systems, that's a major part of the problem.
Glyn Davis
So Patricia, you're in New Zealand at the moment doing some work in education there, but interestingly earlier this year the New Zealand Government turned away from the traditional spend versus cut approach to policy and argued that it would be taking an investment approach. Which uses evidence to quantify a public policy and then fund the interventions most likely to improve outcomes and therefore reduce in a sense future cost to government. It's quite a different way of thinking about it, it does put evaluation into the policy process and asks that it's part of the test. Are you seeing any early evidence of how this is operating in New Zealand?
Patricia Rogers
I think it's, from what I can see so far, early days. I think it's also something we need to be very careful about, it's a great idea. I think this notion that you actually invest in the primary care, the prevention services, to avoid a whole list of costs in all sorts of factors later on makes a lot of sense and I'm really for it.
One of the challenges will be obviously the quality of the evidence about the costs of those and the costs that are being avoided. While there's a lot we can do again around the supply of that, I think we also have to be very thoughtful about the demand end, about how we're going to use it. Because you're not going to get a single figure to be able to say this is definitely worth spending money on. Because it won't be that it will work in every case.
Glyn Davis
So there's an argument for a much more nuanced view of what evaluation can achieve and doing it at a very programmatic level or even lower so that you get a sense of difference?
Patricia Rogers
Yes, and I have this real visceral reaction when people talk about nuances, because it sounds a bit like it's a trivial thing. Maybe that's not what's intended but some of the examples where things work differently for different people are really huge.
There's a classic case around one of the early intervention programs called Early Head Start, that on average worked to actually support toddlers and children in disadvantaged families. On some lists of evidence-based programs you see it listed as something that works and people are encouraged to put money into it.
But the evaluation found that for the most disadvantaged families it was actually harmful, it actually reduced all sorts of indicators of wellbeing and interactions between parents and children. So that's a case where certainly it's going beyond the average but it's something really fundamental that you need to pay attention to.
So it's knowing that each time you slice and dice that you might see a slightly different pattern. So yes you should take an investment approach and you should base it on the best evidence you've got and you should do a really good risk management approach. But you should always commit to ongoing learning to find the cases where it perhaps doesn't work as well or it might even be harmful or much less cost effective.
Glyn Davis
Nicholas, in a recent article in The Mandarin you were quite strident in dismissing the present ways…
Nicholas Gruen
Strident? That doesn't sound like me.
Glyn Davis
Not at all but… [Laughs]
Nicholas Gruen
Go on, go on.
Glyn Davis
You expressed strong views critical of the current ways we do evaluation and the current ways we think about policy more generally. You dismissed calls for big data as the way to solve policy problems which you described memorably as TED talk. What are the ways you're looking to see an Evaluator General and the findings of this process then feed back into public policy?
Nicholas Gruen
Now Friedrich Hayek had a word for this and he called it scientism. Scientism comes from the idea that we want to be scientific the way Isaac Newton was scientific and Albert Einstein was scientific. So we take some things that look scientific, they might be randomised control trials, in economics it's cost benefit analysis, and big data is another example of that. Now not only is there nothing wrong with those things, those things have a very important place in evidence-based policy. But because you can show somebody your flash spreadsheet and say minister here's the cost benefit analysis on this, when you've glossed over all the ways in which we don't know what's working and how it's working, that's the problem. That's why we've spent 20 or 30 years spending I don't know what it is but when I last looked it was $50,000 or $60,000 per person in Aboriginal policy, getting absolutely nowhere.
But of course all of those programs would have been able to present tables and there would have been KPIs and they would have gone through all the evidence-based schtick but they actually miss, somehow they miss out because they're role playing. They're taking hold of trinkets and so on rather than carefully using evidence in the way that we need to to understand what we're doing.
Glyn Davis
Nicholas, how could we be sure of achieving transparency in a program like this?
Nicholas Gruen
Now transparency is a critical part of this because if we're generating information that one small innovative program works better than the incumbent system, we want there to be pressure on the incumbent system to start absorbing the lessons and improving itself, or to be displaced by expanding that innovative program. Remember the incumbent system has got unions on its side, it's got senior managers on its side, it's got credentials on its side. So the idea is to try and level the playing field in the public arena.
Right now a little innovative program can be as good as it likes and no one really learns much about it. The opposition doesn't know much about it, they don't know or they might know that it's kind of quite good but they don't know whether it's better than the existing system. So transparency is a critical part of this: independence, transparency and collaboration, so that when we have learning, when we have a system which the Evaluator General can put their hand on their heart and say we're pretty sure this works better than this other system, pressure comes on the system to do something with that information.
At the moment, as Patricia has been telling us, that hasn't been happening for 20 or 30 years, so that's a critical part.
Patricia Rogers
I think that it's very easy to have transparency when things are working, it's easier. But the harder thing is when you have transparency about evidence that something's not working, that there have been mistakes.
Nicholas Gruen
That's right.
Glyn Davis
Then that's politics. So you're hoping for a different policy paradigm arising from the discipline that an Evaluator General would introduce?
Nicholas Gruen
Yes, I'm hoping for a policy paradigm which is based on strengthening and resourcing professional craft. If you think about what we did say in the 19th century we had professionals in – well in early 20th century, we had professionals in healthcare and we had professionals in education. We sort of gave them money and we trusted them. Now that's far from perfect as we know but we want those people to be more accountable than they are. We want them to be as evidence-based as possible. But what we've got at the moment with governments making public statements about how they're holding everybody to account in ways that they can't possibly hold to account, we have a chain of legitimation, a chain of governance if you like which is dodgy.
So what we're doing is we're sending messages out to people who know very little about this, because we've lost faith in saying just trust the professionals. Now I don't have a lot of faith in doing that but I'm trying to get to some happy medium where if we want to put effort in, we can be confident that the professionals are pursuing evidence-based solutions but you have to put a fair bit of effort in to know that. So I want to strengthen their ability to do that, resource their ability to do that and get them to solve the problems as best they can because almost all of them want to do that.
At the moment we've got this sort of schizophrenic system where we have people who know nothing about the subject, holding other people to account in a kind of a Tower of Babel, which is getting us absolutely nowhere if experience in child protection, Aboriginal welfare and a number of other areas is to be taken as evidence.
Patricia Rogers
I guess one of the big challenges that I see is expanding who should be involved in using evidence. There are two others that I think we need to also have on the table. They're both about the public but in different ways. One is the public as consumers, as customers of particular services and actually how they engage with their individual doctor or the individual service provider…
Nicholas Gruen
Yeah, agreed.
Patricia Rogers
Being engaged in decisions about what to do. Something that might be on average okay isn't going to work for you or whatever. The other one is the public as citizens, and I think that's really difficult in a whole populist environment with fake news and silos of information and a lack of democratic discourse and dialogue where people are not accustomed to listening to each other and asking questions to understand. But I think that's where we really need to get to…
Nicholas Gruen
Totally agree.
Patricia Rogers
…so that we can have people involved in it. That's a much harder task and I think we have to have that as part of what we're trying to achieve as well.
Glyn Davis
So we've heard about a realist school for testing policy outcomes and we've got before us a new proposal to create an Evaluator General and a culture of evaluation. It's been a great pleasure to speak today with Patricia Rogers, Professor of Public Sector Evaluation at the Australia and New Zealand School of Government. Thank you, Patricia, for joining us on the podcast.
Patricia Rogers
You're welcome.
Glyn Davis
Nicholas Gruen, the CEO of Lateral Economics, Nicholas, thank you for taking time to join us on The Policy Shop.
Nicholas Gruen
Thanks very much, Glyn.
Glyn Davis
Thank you for listening.
Voiceover
The Policy Shop is produced by Eoin Hahessy with audio engineering by Gavin Nebauer and is recorded at the Horwood Studio at the University of Melbourne.
Thanks to the BBC for the use of their clip from Yes Minister.
The Policy Shop is produced under Creative Commons copyright, the University of Melbourne 2018.
It is the eternal struggle of governments to ensure the taxpayer gets value for money and that services are delivered effectively and fairly.

Does government get the right advice?
In this episode, The Policy Shop considers how we can improve public policy in Australia, examining a proposal for a new model for evidence-based policy.
Patricia Rogers, Professor of Public Sector Evaluation at the Australia and New Zealand School of Government (ANZSOG) and Nicholas Gruen, CEO of Lateral Economics join host, Professor Glyn Davis, Vice-Chancellor of the University of Melbourne.
Episode recorded: 5 June 2018
Series producer: Eoin Hahessy
Audio engineer: Gavin Nebauer