Remarks by Shelley Metzenbaum, Office of Management and Budget

U.S. Department of State Third Annual Conference on Program Evaluation
Washington, DC
June 8, 2010

MS. ABENDROTH: Good afternoon, everyone. Welcome back. I hope you're all having an enjoyable and informative day. My name is Claudia Magdalena Abendroth, and I'm the director for strategic and performance planning at the State Department.

It's my pleasure to introduce our good friend, Dr. Shelley Metzenbaum, as the next speaker. Shelley works in the Office of Management and Budget as the associate director for performance and personnel management. She works directly with Jeffrey Zients, our nation's first chief performance officer, to advance accountability and transparency across the federal government.

Her previous positions include founding director for the Collins Center for Public Management at the University of Massachusetts; associate administrator for regional operations and state and local relations at the Environmental Protection Agency; undersecretary of Massachusetts Executive Office of Environmental Affairs; and executive director for public sector performance management at Harvard University's Kennedy School of Government.

Please give Dr. Shelley Metzenbaum a warm welcome.


DR. METZENBAUM: Thank you, Claudia. Did you notice how she memorized that whole thing? Really.

So I'm very pleased to be here with you today to talk about how the Obama Administration is thinking about performance measurement, performance management, and evaluation. And when I talk about performance management, I'm talking about the use of goals and measures, and then your analysis of those measures, to improve outcomes but also to communicate to the public. So let me sort of chat with you a little bit about that, and I'll also happily answer any questions you have.

Let me start by talking about what we're trying to accomplish. And the President, I'm happy to say, says it best. What we're trying to figure out, and what we do with our measurement and what we do with our evaluation, is not whether or not the government is too big or too small, but rather whether it works.

And what do we mean by whether or not it works? We really mean how it affects people, places, communities, whether it helps families find jobs at a decent wage, or the care that they can afford, or a retirement that's dignified. We're really talking about trying to make a difference in the world.

And that's a hard thing to do in organizations as large as the ones in which we work. And yet if we don't maintain that line of sight from what we do every day to what we're trying to accomplish, we can get caught in the trap of just doing without ever knowing if that action is having an impact.

And the President goes on to say success should be judged by results, and data is a powerful tool to determine results. So it's not the only tool, but it's a powerful tool that we can use and should use on a regular basis to help guide our actions. We can't ignore the facts. We can't ignore the data.

It's not often you get a President who talks about data that way. But this one actually talks about it, thinks about it, and actually uses it, and takes in the different discussions and interpretations, and I think many of you know, to consider the issues, both from the point of view, but also let's look at the facts.

And so what we're trying to do with goals and measurement in government, with evaluation, is do work that is useful. So you see my useful button up there. Right? Useful, useful, useful. If you're setting goals that you're not using, if you're measuring and not using the data, then you ought to ask yourself why.

Is it because I don't have the analytic capacity to actually dig into this? Is it because I don't think this is actually useful data? The whole objective here is to get useful data, not data that shows as we've done in the past, not data that in the past we've said, okay, you have the Government Performance and Results Act. We are going to set goals, and we're going to put it in reports. That goal-setting was really helpful and important if people used it.

But it's not very important if the goals just go into a piece of paper that then gets sent up to Congress that nobody reads. So we really need to think about how we can use performance information to lead.

So what do I mean by lead? Well, in government we often have people who expect us to do far more than we can actually try and accomplish. We just don't have the resources to do it all.

So goal-setting is a really powerful way of communicating what our priorities are and what they aren't. And the ambitiousness of those goals, that's also very powerful.

But then we want to learn. We want to learn what's working so we can do more of it and what's not working so we can do less of it. And we want to not only use that information to lead and learn, but once we've figured out what's working and do more of it, we want to improve outcomes. And once we figure out what's not working, we want to fix that. Or if we can't fix it, we want to stop doing it.

There may be things that we want to figure out how to fix, but we can't do it so we really take a completely different approach, experimenting. So we want to use that performance information to improve outcomes.

We also want to communicate it, clearly and concisely and candidly. And I throw in that word "candidly" because I think I've gotten some questions about, okay, do you really want to see performance data that shows that things aren't going as you'd hoped? Do you really want to have evaluations that actually don't show a very pretty picture about the impact of the program?

And the answer is yes. We're committed to an open government and an honest government that communicates candidly, but that doesn't make it too complicated, either, that really thinks about the audience for that information, so clearly and concisely.

And this third bullet, in terms of what our performance management strategies are, use performance information -- it's strength in problem-solving networks. Well, what do we mean by that?

Well, here's one thing. There's a lot of people across the federal government who share objectives and who work on the same kinds of challenges, one of which is: How do we stop bad things from happening or prevent bad things from happening that are hard to measure?

Turns out that there are an awful lot of federal agencies that try to do that. How can we get smarter about managing in that area, whether looking at near-misses or precursor indicators or whatever -- how do we get smarter about that? So we need to bring people together who are already thinking about this so we can learn together and not be inventing it in all the different parts of government.

So the approach we're taking in this administration to performance management is not having everybody just write reports -- although those reports are useful; they can be really valuable -- not just writing plans -- although those plans can be useful; they can be really valuable -- but to actually bring this alive, create a dynamic within the organization that is focused on those goals and constantly measuring to see, how are we doing in reaching those goals?

So we're here now. We want to be here. What does it look like to get from here to there, and how are we doing along the way, and when do we want to make a mid-course correction?

And having those data-driven reviews to discuss it, to really be diagnostic about it, to brainstorm, to get good ideas going, so a system that is alive, that is not just a compliance practice but actually helps you come back and go, okay, wait a minute. Let's try this because that didn't work.

And not just once every year in the budget process, but on an ongoing basis, and that makes that a transparent process to the public. So that's how we're approaching performance management. And we've done that in part by asking agency leaders to set high performance goals for the near term. And let me get back to that in a second.

Because what I want to also -- since this is an evaluation conference, I thought it would be helpful if I talked briefly about how we see performance information, the performance management work, the performance measurement work, the goal-setting. How does that link to evaluation?

And a lot of people sort of put evaluation over here, and they put performance measurement over here, and they don't see that the two of them are inextricably linked, and that the managers in an organization need to be thinking about them in that way.

And we're trying to get a dynamic going, working with you that gets that dynamic going, so we set some clear goals, some near-term high priority performance goals, as State has set -- Afghanistan and Pakistan, obviously; global health. You've got near-term high priority performance goals. What are you going to try and accomplish within two years?

Now, why, with such big issues, did we ask you to think about what do you want to accomplish within two years when, in fact, some of those are long-term goals, too?

And the answer is, we tend to get so far out there that we don't always force ourselves to say, what are we going to try and get accomplished within current budget, with current legislation, to really focus our senior managers, the Secretary of State, to have -- she articulated these high priority goals, these near-term high priority goals to get focused on, what are we going to try and get accomplished in the near term, and how are we going to know if we're on or off track along the way?

And let's measure that performance, but also let's do the same thing in the longer term. Let's know whether or not we're on or off track. And when we find things that seem to be working, that seem to be building better relations, whether it's a Fulbright program -- and I should say I'm the mother of a Fulbright Scholar -- whether it's a global health program, whether it is a food security program, if we are dealing with those kinds of programs and we're beginning to see, wow, we are actually seeing changes in farmer behavior locally that we hoped we would see happen along the way, well, then all of a sudden we may say, wow, that's a promising practice.

What did they do? What happened in that country that actually moved the needle in terms of changing food security practices, where you have a theory of change about here are the five practices we need to change, and here's how we're going to know if we're getting more food security. What's working and what's not working?

And when you start to see some of those promising practices, you can say, hmm. We've got it happening in this community; now let's try it over here. And if it started here and you got the changes you expected, and you move it over here, you do a replication demonstration, you begin to build some confidence.

And then you try it in another country and say, wow, it even works across cultures. Or maybe it doesn't work across cultures, but we can spread it within the country.

So we're starting to use the performance measurement to help us see changes that we think are promising practices. And when we can replicate them, we get more and more confidence, and they begin to feel like proven practices.

So the measured performance, we searched for promising practices. If we find them, we try and validate them and demonstrate their replicability. If we can't find any promising practices, if we're in there trying to deal with food security and we're not seeing any changes either in the short-term sort of behavioral changes we expected to see or in perhaps changes in sort of economic indicators we hope to see -- if we're not seeing those, then we need to go back to the drawing table and experiment.

And that's what those dynamic meetings are hopefully helping us have, both here in Washington, but also out in the field. But at the same time, we may find promising practices, and we need to do evaluations.

So basically, after we found promising practices, we may feel very confident about them. And then we need to figure out how to promote them, promote their adoption. And there we need to get more and more successful and more effective, too. And we may find evaluations done in other places that help us say, hey, that's a proven practice. I want to promote that.

So this is not a separate set of activities that are going on, but performance measurement to actually help us manage on a daily basis, and occasional evaluations that help us learn more about specific aspects of problems, that help us compare different kinds of interventions, and that help us isolate the impact of our work in government or the work of others.

So I'll give you my favorite example here, and I know you'll think it's a little bit more -- not in your world, but it's such a robust example that we're all intimately familiar with, which is how many of you know about the "Click It or Ticket" campaign? Any of you heard of that?

(Show of hands.)

DR. METZENBAUM: Now, that's interesting, because a lot of you travel overseas a lot. But over half the hands in the room went up.

So the "Click It or Ticket" campaign is something the National Highway Traffic Safety Administration does as an effort to get everybody to buckle up their safety belt. Why? Because we have incredibly compelling evidence that buckling your safety belt makes a huge difference in saving lives. It doesn't prevent the accidents, but -- and I'll quote a Coast Guard person here, a Coast Guard guy, who says, "My job is to prevent bad things from happening, and when I can't prevent them, to reduce their costs."

So some accidents we try and prevent, and we focus on the equipment. But in other cases -- and sometimes we focus on the operator and try and reduce the number of operators who are drunk, or we try and increase the number of operators who are young and who have training and aren't driving with distractions so we don't -- a number of states are saying, no more kids and the other kids in the car when you're 17, 16 and 17. You've got to be a little bit older.

So the National Highway Traffic Safety Administration is scanning to look for successful practices. And then when they find them, they promote their adoption. They use the tools at their disposal -- sometimes they have grants, sometimes they have penalties, and sometimes they just have persuasion -- to try and promote those practices.

And they track whether or not they're being promoted and whether or not the outcomes have changed. But at the same time, they do occasional evaluations to figure out if, in fact, it was their actions that made a difference.

So that's the dynamic that we're trying to have here. It's not about reporting -- reporting's useful -- but it's really about performance improvement. It's not about complying with requirements, but it's about performance improvement.

So I throw this slide up here because in your business -- and, in fact, I think in all of our businesses -- we use measurement a lot. But we use it for a lot of different purposes. And I just throw this up here because sometimes we get some of those purposes confused.

So we do description, we do prediction, and we do prescription. And a lot of times I've heard people talk about models that predict what's going to happen, and use those as evaluation.

Some of you who live here may remember a few years back there was a Washington Post story about the Chesapeake Bay, where they supposedly were measuring performance of success on the Chesapeake Bay and whether or not the water was getting cleaner. And it turned out it was all modeled data.

Modeled data is incredibly important for informing our actions, but we need our models to be updated with descriptive data, with evaluations that help us know what has happened in the past as well as other intelligence that helps us really think about, okay, what's likely to change in the future? So that helps us forecast.

And then, once we've got that, you can do the scenario planning -- and I'm guessing in your world you do that a lot -- to sort of say, what's going on? But to the extent this can be informed by analysis of past experience, both through performance measurement and through evaluation, you're much more likely to be accurate.

Which of course means we need to build the capacity to learn from our experience, to bring in that information. So the President, when you had the -- was it the Christmas -- the underwear bomber? Is that what we call him? And he had a great line about, we had the intelligence; we didn't apply it.

And I think that's a big challenge in your world. I think you're smack in the center of how do we take the intelligence we have and figure out how to filter it? And you're part of that, not by yourselves but with lots of other agencies.

How do we think about, if we're trying to build relationships with different groups? The story in the Washington Post the other day about the military putting women out to talk with other women in Muslim countries, is that working better? What can we learn from that, and how do we apply that, then, in other things we're trying to do? So really trying to figure out how do we get smarter over time? How do we build learning organizations?

So I throw this slide up here just because one of the other things I've heard a lot in the past questions/complaints is, okay, you want outcome measurement? It was all one size fits all. But it's not one size fits all. That's why we need these problem-solving networks.

We have different kinds of outcome measurements. Some have long lag times. Some have unpredictable lag times. It's harder to measure the impact of R&D because it's both unpredictable and it's a long lag time in most cases.

Does that mean we shouldn't try to actually assess the impact of R&D? The fact is, if we're going to run our programs smarter, we have to say, okay -- and in your world, this is very true -- how are we going to learn if the actions we've been taking are working?

Because if we don't try and figure that out in a fairly objective way and build our capacity to learn more about that over time -- and I'm not suggesting this is easy -- but if we don't do that, then we're just going to keep doing it over and over again whether it worked or not. And we can get on autopilot an awful lot.

So the measurement, what do we need the measurement for? Why are we actually trying to get this information? What is the purpose? Yes, it's to improve outcomes. I would also argue to you it's to inform decisions. So we want to have that data.

We want to do the analysis, both the evaluations and the analysis of the performance data and other data we collect, to inform goal-setting. We cannot focus on everything at once. We have to be able to set priorities.

And that's not easy, and politically that's incredibly hard. But that's okay. That's good political debate. The Department of Transportation has a new strategic plan on the street, and it is very different than the overhead strategic plan in terms of the goals it's laid out.

Is that a bad thing? No. That's a good, healthy democratic debate. That is leaders coming in and saying, we're not sure if the focus we've had in the past is exactly the same focus we want in the future. We think we ought to go in this direction instead.

And that's what we need to use our goals for, is to stimulate this healthy debate, especially when it's backed up by information about the characteristics and size of the problems we're trying to tackle, the characteristics and size of the opportunities. It's really all we're thinking. It's how do we get information from across large organizations and beyond the organization and apply it? And that's what we're trying to do with the goal-setting.

And then, as I've been saying, we want to use those measurements to find the promising practices we want to prove, and then to take those proven practices and promote them. And then we also want to find problems to fix, and a lot of times that means drilling down to find the causal factors.

But then it's about improvement, but also, as we all know, this is all very values driven. The choices of the priorities we set are informed partly by data, but also partly by values. And so we need to get information out there to inform policy debate about what we should be focusing on.

But we can also sometimes get information out to inform individual and organizational decisions. So all of us would kind of like beach data right now; it's always been a problem that the beach -- there are a number of beaches that are marked with flags about whether or not the water quality is safe.

And of course, we're -- it's fascinating. We have a controlled experiment going on right now with the BP thing, and we have a lot of really dynamic, quick feedback about, is this working? It's not working. Let's try something else.

But it's interesting. Alabama has closed its beaches and made it illegal, whereas Mississippi and Florida have not yet closed their beaches. So a little ways down the road, if somebody looks at the public health data, they're going to be able to see the impact of people swimming in oily water.

And we in government actually have to look for those natural experiments and take advantage of them. We need to think about how to run our programs to take advantage of them. But we also want to think about how we communicate information to inform individual or organizational choice.

So we try and get data out on the cleanliness of beaches -- certainly, a lot of states do that -- but, in fact, it turns out that data is dated. You usually can't get a good reading till two or three days later.

So where should government be investing? Well, one of the things is in getting more real-time data and figuring out how to get it to the point of use. So when I say useful, useful, useful, some of it is to inform our decisions in government, but a lot of it is to inform other people's decisions.

And the open government efforts of the Obama administration are also to get the data out so others can do some analysis and inform our decisions, so others can do some analysis and inform other people's decisions.

There's some really interesting stuff going on in terms of traffic that way, so we can get real time data that tells us where traffic is, which we already can do with Google. But now there's an effort that's coming up that I expect we'll see some announcements about it pretty soon, which is actually giving us like an hour in advance predictions of traffic data so that when you're ready to leave the office, you can go look at it.

So these are various uses of goals and measurement. And the challenge is for us to start to figure out how to collect this data and disseminate this data and analyze this data to make it useful.

I throw this up because this is just a sense of, as we talk about how we look at data, there's just lots of different ways to look at it. So this is -- SAS did this. Standard reports, ad hoc reports, query drill-downs, alerts.

So now this is after you figured out what you're going to start to analyze and what your goal is. It's missing the whole continuum, which is how do we choose our goals? How do we pick our strategic priorities?

Then you get to statistical analysis, forecasting, predictive modeling, and optimization. This is just one company. And then they laid out, I'm guessing, kinds of questions that you might want to ask with your analysis.

So those are sort of the things -- I'll leave those questions up there -- because I think these are things we, as we do analysis and as we do evaluations, we need to ask ourselves, what are we trying to answer? What questions are we trying to answer?

Are we trying to answer the question of, does this program work or not? Think about that. Is that the question we're trying to answer with evaluations? And if we're trying to ask that question, why are we trying to ask that question?

So I think a lot of times -- and I'm going to put this in a different realm because, if you think about it in your own realm, you sort of feel a little threatened sometimes.

Do you want to know if the water quality programs that EPA runs work? Okay? Let me tell you, the answer to that question is yes. What are you going to do with that answer? Invest more in the water programs? Okay. Just give it all to all water programs?

Or do you need to have a much more granular answer to that question, a much more refined answer to the question: Where is water quality getting cleaner, and where is it getting dirtier?

In the places that it's getting cleaner, is it because there's been a significant change in residential composition, the residential patterns in the place? In the industrial patterns in the place? Or is it actually getting cleaner because of something that's not explained by those changes?

So evaluation can help you look at that. At the same time, EPA, running its programs, could actually be doing what the National Highway Traffic Safety Administration did, which was it saw a big decline in the fatality rate in California, and it said, why is that going on?

So you start to look for variations that you didn't expect. You look at the data to say, huh, this one is unexpected. It's an anomaly. Why did that happen? And you can then set up an evaluation, which can be very useful, and you can also do a drill-down to say, why did that happen? What are the causes of that happening?

So in the case of the National Highway Traffic Safety Administration, with California, what kind of analysis do you think they did? So what they really did was very sophisticated. They picked up the phone and they called California, and they said, wow, your fatality rate is way down. Did you do something?

And California said, in fact, we did a primary enforcement law. We passed a primary enforcement law that lets you stop and check for safety belt use, that lets the police stop and check.

So it is looking at our data on a regular basis to see unexpected changes in speed, unexpected changes in direction. It's looking for relationships. It's looking for patterns where we may have opportunities to move on those patterns. And there may be scale economies.

It's really taking data analysis, integrating that into the way we run our programs, and also looking for opportunities like this natural experiment that's going on right now in terms of Alabama restricting access to the water and Mississippi not, and having someone in an agency say, we really ought to be looking at that and at the public health data now to see what's making a difference and what's not, or at least get that out and hope others will look at it.

So there are questions we have to think about as we move forward in this world. One of them is the dimension for analysis and for action. Are we going to look at the government-wide level, the program level? Are we going to look at a project level? Are we going to look at the local level? How do we do that?

Are we going to -- as you try and build relations or improve food security or improve health, public health -- in the public health world, this has been done for years; I mean, it really is -- much of this is applying epidemiology across all of government. Right?

It's taking epidemiology methods and the research methods, where you are integrating data and you're integrating evaluation, and applying it to the way we run our programs, and so saying, okay.

Are we going to look at this at the scale of six locations -- we're going to do not a full scale out randomized controlled trial, but in fact just try three things here and three things there, different? I mean, one thing in these three communities and one thing in these three communities, see if we get a difference.

Will that prove it to us? No. But it will get us smarter than before, and we can take the next step, and really running our programs in a much more nimble way.

I think another challenge is thinking about the audience for information. So as we do evaluations and analysis, as we collect data, who needs to use that information and are we getting it to them?

So if we're conducting evaluations and they're not getting to the people who might apply the lessons of them, then we're getting a whole lot less value out of both the evaluation and the program than we would have otherwise gotten.

So really thinking, not just, it's my job to do evaluations, but rather, what questions am I trying to answer and who needs to know the answers to that? If I've run a program in one place to work with the youth in that area and it doesn't seem to be working, then should I let others know that so they don't try the same thing, or at least get someone in a central office -- Louis Brandeis, Justice Louis Brandeis, once talked about the states as the laboratories of democracies. You guys might think about some of the countries that we're working with as we're trying different things.

Who's the scientist in the laboratory? And I think that's a big challenge we need to think through. Who's going to ask these questions? And once the scientists have found the answers, in our traditional sciences, our scientists then have to get this into peer-reviewed journals. And how many practitioners do you know who read peer-reviewed journals? Right? We have to think about disseminating that information, too.

And then we need to integrate, as I said, the experiments and the evaluations into our program operations so we can compare different kinds of practices and their impact. And we carry out these ongoing diagnostics.

So let me just end with two additional thoughts, and then if you have some questions, I'm happy to answer them.

One thing is, I've been talking a little bit about getting the data out, open government, getting the analysis out, getting the evaluations out. I think, as we think about open government, we need to think about why are we trying to be transparent? Because then we're likely to be more effective in our transparency.

So I throw this out to you to think about. I'm not saying this is right. What we're trying to do here is get all of us, all of us in this learning network and all of us in this performance-improving network.

One aspect of getting information out, I think, is to improve democratic decision-making. Another, as we put goals and measures online and want you to report progress on it because we're trying to do that, is to motivate so that if you actually know who the goal leader is for a particular goal, which is what we're doing with the high priority goals -- we've identified the goal leaders, and they're really on the line to deliver on this.

And so we want them to feel like, whoa. This is going online to the public. I'd better stay focused on that goal, and I'd better be constantly asking: Is there a way to improve?

And the other thing is to strengthen the public trust, is to get that information out there honestly, coherently, so it's understandable, in plain language. That's one of the things we're trying to do here.

But it's also, as I said before, to inform decisions, and think about delivery partners that we depend on. How do we get them the information that helps them know? So the "Click It or Ticket" campaign I mentioned, that's actually a packaged campaign the National Highway Traffic Safety Administration does for local partners.

So they actually set it up with a script so the local policeman can do an ad on the local TV. They do PTA announcements. And the federal government actually -- you can go online and actually see this package.

So they're thinking about their delivery partners and how to help them, giving them the evidence about why this makes a difference; also giving them evidence so that if they need to persuade policy-makers, they have that evidence.

And then the other thing that's really important, I think, is to get that data out to the folks who are supplying the data, returning it, though, with value added through our analysis so that they want to give you more so we all start to learn more.

I think another reason to get the information out is to stimulate ideas and innovation, and then, of course, to enlist assistance. People in this country, people around the world, are really ready to help if we figure out how to get them the information they need. So I think that's a big challenge for us.

And I want to end with a discussion about accountability expectations. And I put this out here because as we move forward to try and have a goal-focused, data-driven, evidence-based government, there's oftentimes a lot of fear.

What if I don't meet the goals? What if performance improves? What if I've done an evaluation and it looks like the program didn't work? Am I going to look bad? Am I going to -- what is -- you know, there's a big problem here.

Well, I think what I want to say is what we're trying to do -- and I cannot promise you Congress will do the same, though we are happy to talk to them; I especially would love to go up there with you -- but what we think about in terms of what we want to hold you accountable for, as folks in the federal agencies?

One is setting clear, outcome-focused, outcome-aligned goals, a few of which are priorities, and for those priorities, with ambitious targets that stimulate innovation.

The second is measuring progress toward those goals.

Third is analyzing them, is really looking at the data. Don't just report it, analyze it. Look for patterns. Look for anomalies. Look for relationships. Try and find the causal factors so you can find the ones you want to prevent and the ones you want to promote.

And then take quick action, based on the evidence. Make the adjustments. And don't get caught in an "I've got to wait for next year" kind of thing.

And then finally, I would say to you I'll apply the Bill Bratton accountability principle. Any of you know who Bill Bratton is? New York City Police Commissioner, then Los Angeles Police Department. Drove the crime rate down, New York, 40 percent in the three years. That was the goal he set.

Needless to say, when he tried to set that goal, everybody in the police department said -- actually, Rudolph Giuliani, the mayor, his boss, said, don't make that public. Bratton pushed for making it public. He made it public. He didn't last with Giuliani that long because Giuliani and he were both strong personalities. But they systemically brought the crime rate down.

How did they do that? Well, for one thing, they would have precinct captains -- excuse me -- they made precinct captains clearly accountable for bringing the crime rate down in their precinct, which wasn't such a clear line of accountability before. You had borough commanders and precinct captains, and they were matrixed, and nobody knew who was really in charge.

Bratton said, precinct captains, you need to bring the crime rate down in your precinct. And then he said, you've got to start to report the data on crime regularly.

And here's what's really interesting: They did report the data on crime regularly, but they reported it every quarter, and the data wasn't due to the federal government until six months after the quarter had ended. So it was nine-month-old data.

So do you think that the New York City Police Department could use nine-month-old data? But that was the only crime data they were looking at. What were they looking at?

They were looking at data like, the customer -- the response time. You know, you called in to the police, and there was response time. There was one actually in New York City, Bratton says -- they were looking at things like how many meetings they'd had in the community because they were into community policing.

But nobody was asking, is the crime rate going down? And Bratton asked, is the crime rate going down? Now, this was really very complicated because it's hard to do that. No? Right? It's really not hard to do that. You're counting it in your office.

So he made them send that in to a central office. And here's the thing. We say today, oh, we can't do it. It's just too much -- the data costs are too complicated. We'll have to have a big system. They faxed it in initially, and then they drove in floppy notes on Lotus Notes -- Lotus 123. It was before Lotus Notes. Right?

And they took the data, and they analyzed it, and they looked for those patterns. They looked for those anomalies. And they would make the precinct captain stand up here with two screens behind his or her head and have to explain what was going on and what they were planning to do with it.

So what Bratton said is, nobody ever got in trouble if the crime rate went up. They got in trouble if they didn't know why it had gone up and they didn't have a plan to deal with it.

So it is our hope that in government, every agency will have goals with a line of sight to people on the ground, to places, to communities. Every person in every agency will have a clear line of sight to the goals they're trying to advance.

And everybody will be pushing to get smarter and smarter and act on that information, to use -- we want actionable information. We want people to act, and to feel pushed to act, and regardless of whether it's within their own unit or they have to reach out to another unit, to act to improve those outcomes; but that if you don't get there, you're not going to get punished for that if you had that clear line of sight, if you know why you didn't make it, and if you have a plan to make progress in the future.

So that's our hope. And every one of you sitting here who has chosen to come to an evaluation conference, we hope you are front and center in that process, helping decision-makers that you work with, including yourselves, make smarter decisions so we can have better outcomes.

So thank you very much.


MS. ABENDROTH: We'd like to open it up for some questions. We have two mikes on either side of the room, so if you have a question, please walk up and speak into the mike. And while you're gathering your thoughts, I'll ask the first question.

You laid out the administration's approach, with the focus on the use of information and data. Can you talk a little bit about how you and your office are engaging Congress, and what you see as their role?

DR. METZENBAUM: Yes. That's a great question, how we're engaging Congress. One of the ways we're engaging Congress right now is Congress has actually got a new bill going through to sort of take what we're doing and put it into law.

And we've been trying to be responsive to Congress -- this was not initiated by us; there was, in fact, a different bill that we felt wasn't going to work as well, based on experience.

So some of you know the program assessment rating tool. And actually, before I came here, I did what could best be described as an in-depth survey of users of PART, both agencies and appropriations committees.

And it just -- it had all the right intents. I think it had a whole lot of good questions. But it ended up not telling you -- motivating the performance improvement in a lot of places. It did in some and it didn't in others.

And so there is a bill in Congress which was going to codify PART. And so we started talking to the congressional committees about can we actually get something based on what our experience says has worked well in other places.

And the approach we're taking now, I should say, is informed by the experience of New York City, is informed by the experience of Maryland, in the United Kingdom, in Australia. We're looking at things that have worked, and it's not all over the place in Australia, but parts of their education and labor and employment and training world.

So we're having conversations with Congress about these things, sometimes around specific bills with specific members of Congress who are interested. We also want to have conversations with Congress with the agencies, so that we're not going up to Congress and saying, okay, here's what we're hoping you'll do, Congress, but rather go up with agencies to Congress -- to your appropriations committees, to your authorizing committees -- and have those conversations as well.

And the other thing I guess I would say is we feel very strongly that agencies should be having a conversation with Congress about their goals and about their progress and their problems.

So we see Congress as a key customer of the performance website. For now, it's called That may be its long-term name, but I'm not sure of that yet because we haven't gone live.

But the performance website, the performance portal, we actually see Congress as a key customer for that. And it's this website where we're taking the high priority goals -- there are about 130 of them -- and it does identify who the goal owner is.

And we've asked everybody to say, where are you now? Where do you want to go? What are your quarterly measures and milestones? Why did you choose the strategy you've chosen? Why did you choose the goal you've chosen?

Going to a website, and it's really interesting because I've already twice since -- it's up now; you guys are loading it right now -- I've gone there twice to see if there was enough data in it to answer a question that was coming to me from the press.

So I think that we're hoping Congress will start to use that on a regular basis. But we also plan to engage them. But we'd rather do that working with agencies than do this on our own.

Did I see another question over there?

QUESTION: Thanks very much. A whole range of things, but let me start with two, I guess.

Going back to the notion of accountability and thinking about the State Department's high priority performance goals, the reality is that all the HPPGs for the Department of State, none of them are realistically logical to think of in the short term. We've come up with ways to define them that way, but the reality is that they don't fit terribly well.

So if you're thinking about food security or democracy or Afghanistan, Pakistan, Iraq, all of those have long-term processes. And I'm kind of curious, I guess, one, why you let us do that. Why did you let us go down that road as opposed to looking for -- what we end up with is milestones for 2011 but no real outcomes for 2011.

And I don't really yet see, in your description of accountability, any real sense of what's the real implication. And how do you even think about the long-term goal and whether or not we're making progress towards it and whether or not, in administrations where leaderships change in 18 to 24 months, why would any leader care about accountability?

You look at the group in this room. There's only one person in this room at the deputy assistant secretary level, and I would be willing to bet there's probably only a handful, maybe even not, at the office director level.

So you're talking to the people who believe. You're not talking to the people who make the decisions. And I'm not sure how you get to those people.

And to finish that, I guess I'll go with: Do you really believe -- and I really am honest -- do you really believe that Secretary Clinton selected the high priority performance goals for the Department of State, as you suggested that you did?

DR. METZENBAUM: Okay. So let me go through all of those. I'm not sure I'll remember them all, so someone may need to give me a cheat sheet since I didn't write it down.

On the first one, which is, how did we let you do it? Well, in the last administration, we didn't let you do it. They didn't let you do it. So they set goals for a lot of agencies in their part. And they were OMB goals.

And we know that some agencies have taken it far more -- are far more -- what I'd really say is far more familiar with this way of managing than other agencies, and set outcome-focused goals or goals with a clear line of sight.

And understand how you break a goal like global health down into, here's what we are actually going to try and get done in two years, which could be there are a few locations where we're going to see health improvements, significantly, or we're going to have immunization rates go way up.

So I guess what I would say to you is on this, we opted to make these your goals, even as we pushed to get you more and more outcome-focused and more and more specific so that they would be actionable goals. And we realized that we had to sort of -- at some point you stop, and you go forward, and you hope that it will improve over time.

We also realized, in the case of the State Department, you have a number of areas where there are other groups that are working on very clear goals, where you're having these kinds of meetings on a regular basis, and that we didn't want to create a new structure. We wanted to complement that. And so we're still trying to figure out how to make that work.

In terms of do I think Secretary Clinton was engaged in these particular goals, we actually focused on the deputy secretary, and we know that the deputy secretary was knowledgeable about the goals. We also know some secretaries scrubbed in intensely.

So we continue to try and make sure these are in fact aligned with secretaries' goals. And we hope and believe these are aligned with, but if they're not, we're going to hope to find that out and make adjustments over time.

And then finally, the institutional question, the politicos not in the room, et cetera. When this works really well -- and I've seen it -- so let's take New York City.

Bill Bratton got fired by Rudy Giuliani after three years because he had too much of an attitude for Giuliani to be able to stand. Right? CompStat is still going strong. They're down like 80 percent now in terms of their crime rate.

Not only that, police departments across the country have picked up this way of managing. And you have had an evaluation done of it; the book "Freakonomics" -- you may all be familiar with -- it concluded that it actually wasn't Bill Bratton's CompStat that did this; it was the abortion rate that had the real impact on crime reduction.

The problem is, then, Bratton went to Los Angeles and did it again. He did a replication demonstration, and we've seen crime rate go down. But there's another evaluation going on right now in the United Kingdom, which is saying, okay, let's look at these crime and disorder reduction units to see which ones the crime went down in the most and what the common factors in those are that weren't available in the others.

So I guess what I would say to you is when this works well -- National Highway Traffic Safety Administration, I do not care who the administrator is, it got started well with a data-driven approach and it continues there; it doesn't matter.

So the political values driven choices are things like, should we follow up when we see violations? So the whole Toyota problem, and now it's Ford, do we follow up when we see significant rise in a pattern of problem? That is a political decision about how aggressive are we on enforcement.

But the fact is, the data are there to be able to make the decisions. And the career staff continually push this and tee it up so that you actually get very good data-driven decision-making in that organization. And you see it in other parts of the government.

I'll mention one other one, which is Healthy People. They're now going on to Healthy People 2020, and it's just a goal-setting exercise driven by data. But guess what? Congress actually picked it up at one point and put it into a law.

And you see -- I actually first learned about it when I was working with County of San Diego. And what was striking to me was it was so cogent in the way it framed the goals and the way it sort of thought about the actions. It was so cogent that the County of San Diego used it almost as -- it nominated a set of goals for debate locally.

So I am passionate about this, and I do believe it can work. And I do believe that the people in this room have to help us make it work so that when you get new policy leaders in, they look and they make smarter choices than they make. And I think it can work.

Do I think it's going to happen in every place overnight? No. We're looking for pockets of success. We're looking for performance pathbreakers so that we can find those promising practices, see if they can be replicated, and then promote them. We're trying to manage this high priority goal effort the same way. And we will learn from our experience and adjust.

Another question?

QUESTION: I'm just wondering if you could comment briefly about the direction you see the U.S. government going in developing staff competencies.

I had a job interview about a year ago for a position that was entitled "Economist." I mentioned regression analysis during the interview, to which the interviewer said, "What's regression analysis?"

So my question to you is, first, what do you see as the role of OMB in promoting and recruiting people who are trained in sound methodologies for getting answers to the kinds of questions we're interested in getting answers to?

And secondly, what steps can OMB take to promote and encourage, for lack of a better term, methods geeks in getting into positions of political responsibility so that we don't all, everyone in this room, continue to be the lone voice in the room that says, "No, wait, wait, wait. Don't push that money out the door quite so fast before we have a couple of answers to these questions"? Thanks.

DR. METZENBAUM: Great questions. So one thing that's fun is OMB is now filled with methods geeks, but we all come from different methods backgrounds. And so there's a real nice complementarity there.

But that doesn't answer your question, the question of, how are we trying to build this capacity across the federal government? And how are we trying to build this capacity to support the federal government?

So I'd say a few things. One is the strengthened problem-solving network as one of our three priority performance management goals is really about trying to address that.

And it's about two things. One is problem-solving networks for outcome problems, like climate change, like weatherization, like science, technology, engineering, and math education, so really getting cross agency -- it's like the First Lady's obesity work. Right? Is really getting cross agency groups working and making sure that they're not just issuing reports, but actually implementing the recommendations.

So we're trying to strengthen those and figure out how to do that. I won't say we've got a perfect way of dealing with it, but we're learning and we're trying to do that.

And the second is trying to strengthen our performance management and evaluation capacity. And on that, I guess what I would say is we're launching working groups and trying to create expert networks.

So I'd love to get your card, if you're interested in that. There's an evaluation working group. A steering committee is already up and running. There used to be one; it's getting resurrected, and it's very important.

The director of OMB is passionate about this, is getting an evaluation capacity that's strengthened. So are the deputy directors of OMB. So we're really trying to get the evaluation capacity moving forward.

There is another working group on -- it's the working group with the worst working title, which is, "Preventing Undesirable Incidents." Right? But it's basically -- whether you're doing epidemiology, or Total Quality Management, or you're doing risk management, there's like seven different disciplines that are all trying to prevent bad things from happening, and reduce their costs when they do, and figure out how to budget for that, especially those bad things you can't measure like drug running.

So we're trying to get that. There's a group that's come together to try and help each other figure out even what the methodologies are.

And I'll be meeting later this month with a group called "INFORMS" on operations research, and so really trying to build -- I don't know what the INF is, but OR is Operations Research and Management Sciences.

And they've got a public sector division. It's mostly private sector. But we're trying to help them beef up their capacity so that they can help us in government.

And then I think we're also trying to think about actual job descriptions, job categories, et cetera. But I also welcome your ideas because we are looking for resources we can find for other people, where you can tap each other but also strengthen each other, and really bring people together who've got that capacity so they can help each other grow and help others grow. But this is an area where we're just beginning to work.

And then the final thing I'd say is we're working with the Office of Personnel Management a lot, and they are setting up registries to speed the hiring process for people with certain expertise. And performance management and evaluation are two areas where we're really working on that.

So great question, and we know we need to work on it. We're trying to, but we welcome advice and suggestions, and will enlist your assistance if you're interested.

QUESTION: I appreciated your emphasis on problem-solving networks as being key to innovation. And there's a lot of evidence from the nonprofit sector and businesses that that's key for pushing innovation.

So my question is: What best practices have you uncovered for how federal agencies can build communities of practice that cross different departments but also bridge the gap that sometimes exists between the headquarters and field agencies?

DR. METZENBAUM: So I don't have one on the tip of my tongue, except that I believe in the drug area, there was -- I just seem to remember reading a study about that.

But it is a great question. I have a guy here with me named Michael Messinger who's on my team now. I'm borrowing him from Voice of America. And thank you, Voice of America, those of you who are there.

But we are building this website which will have the goals and have the measures. But we're also trying to have tips, tools, and templates. And there is a group of -- they're called -- they're working with the Partnership for Public Service.

It's a group of federal employees who have a fellowship assignment. And they're working with us on communities of practice to help us find the best practices and communities of practice so we can actually set that up and work on it.

And again, we'll be glad to enlist your assistance if you want to help us on that one. So thank you. So Michael and I'll be right down here, or we'll be at the cocktail party or something. Okay? Please do find us.