Posts Tagged ‘performance measurement’

Cash for Clunkers: Measuring Performance

May 5, 2010

The GAO recently released a report reviewing the impact of Cash for Clunkers. Beyond good information about that program, this report impressively discusses general questions regarding stimulus implementation and performance measurement. Here’s the conclusion of the GAO’s report: (Note: NHTSA refers to the National Highway Traffic Safety Administration; the emphasis is ours.)

The implementation and results of the CARS program offer potential lessons learned for future vehicle retirement or similar incentive programs. First, the program produced economic and environmental benefits, achieving its broad objectives. However, the extent of the program’s effects is uncertain.

Second, before a program is underway, steps must be taken to determine what impacts are going to be measured and what data will be required to measure them. Moreover, steps must be taken to ensure that the data are reliable. NHTSA relied heavily on the consumer survey for data on the economic and environmental benefits of the CARS program. However, there is a potential risk to the reliability of estimates based on this survey data, because NHTSA did not follow some generally accepted survey design and implementation practices, largely because it had limited time to establish and administer the program.

Finally, given the number of stakeholders that are financially affected by the auto industry, it would be important to collect and consider information on how a future program would affect these stakeholders and take mitigating actions, as appropriate.

It’s hard not to get the sense that the fly-by-the-seat-of-your-pants nature of Cash for Clunkers made it, at best, difficult to ascertain the real benefits of the program. Given that some facets of the stimulus are designed specifically to spur data collection and use (as in the education and health care fields), we’re often surprised by the data shortcomings in other stimulus spending as well as the lack of strong performance management thinking.

We appreciate that officials wanted to execute the program quickly, but if you can’t evaluate how beneficial a program is, how can you know whether or not it was wise to do it in the first place, much less do it quickly? How can you learn how to do it better next time?

We should also note that this second takeaway (“before a program is underway, steps must be taken to determine what impacts are going to be measured and what data will be required to measure them”) sounds to us like some of the issues we’re hearing with respect to other stimulus projects. In several of the stimulus program measurement conversations we’ve had, jobs and spending seem to be the only real measures of success for certain programs.

Finally, anyone interested in designing and conducting valid surveys should definitely read Appendix II of this report, in which the GAO analyzes the NHTSA’s consumer survey of Cash for Clunkers, pointing out its weaknesses.


Long-term thinking in New Mexico

April 23, 2010

Part of our incentive for starting this blog was to shed more light on the actual performance benefits of stimulus dollars, as opposed to just “jobs, jobs, jobs,” in the words of the GAO’s Stan Czerwinski.

With that in mind, we were delighted to see a press release from New Mexico that quotes Arturo L. Jaramillo, Secretary of the state’s General Services Department. Jaramillo was talking about energy efficiency projects in state buildings: “These projects must realize energy savings per dollar spent. Reducing utility costs through high efficiency upgrades such as these will result in millions of tax dollars saved in the long term.”

That’s just the ticket, as far as we’re concerned. The key, of course, is doing the follow-up to see whether those dollars are actually saved, after the fact. We’ll be checking with New Mexico and other states, on that front, as time goes on.

In any case, performance benefits are not inconsistent with the desire to help the state’s economy in the short term. New Mexico’s Governor Bill Richardson recently cited the Council of Economic Advisers report, “that says the Recovery Act is responsible for creating 16,000 jobs in New Mexico to date.”

Cash for Clunkers: Digging Deeper

April 15, 2010

The debate over the economic benefits of last summer’s “Cash for Clunkers” program continues. Recently, the White House blog and exchanged salvos in the argument.  But the economic benefits are only one part of the question. Examining performance metrics other than economic impact — such as the environmental benefits — sheds additional light on what exactly Cash for Clunkers accomplished.

Happily, fedgazette editor Ronald A. Wirtz has a piece on the website of the Federal Reserve Bank of Minneapolis that digs into data regarding some of the fuel efficiency savings that the country could gain from Cash for Clunkers.

Wirtz looked at the program in Minnesota, Montana, North Dakota, South Dakota, Wisconsin and the Upper Peninsula of Michigan. He found that gas efficiency gains were genuine, but the devil was in the details. Explained Wirtz:

“Average fuel efficiency between trade-ins and newly purchased vehicles rose about 50 percent, from roughly 15.5 miles per gallon to 24.

“But that covers up a lot of variation, part of which suggests a sop to owners of older, typically larger vehicles who used the opportunity to upgrade to something with only marginally better gas mileage.

“For example, about one quarter of the new vehicles purchased through the clunker program in the Upper Peninsula and the Dakotas had fuel efficiency gains of four miles per gallon or less. In many cases, older trucks and SUVs were simply traded in for newer but only marginally more fuel-efficient versions. Only 10 percent of all vehicles bought in the district under the program got 30 mpg or better.”  (see Chart 1).

The gains shouldn’t be brushed aside, of course. As Wirtz is careful to note, even seemingly small fuel efficiency upgrades can save enormous amounts of fuel, which becomes more evident when using “Gallons per Thousand Miles” rather than “Miles per Gallon.”
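To see why the unit matters, here is a minimal sketch of the conversion. The 15.5 and 24 mpg figures are the district averages quoted above; the 14-to-18 and 30-to-34 mpg pairs are hypothetical vehicles we made up purely for illustration, and the function name is ours.

```python
def gallons_per_thousand_miles(mpg: float) -> float:
    """Convert miles per gallon into gallons burned per 1,000 miles driven."""
    if mpg <= 0:
        raise ValueError("mpg must be positive")
    return 1000.0 / mpg


# District averages quoted by Wirtz: roughly 15.5 mpg traded in, 24 mpg purchased.
avg_saving = gallons_per_thousand_miles(15.5) - gallons_per_thousand_miles(24.0)

# Hypothetical trade-ins (assumed numbers, not from the fedgazette article).
# Both gain the same 4 mpg, but from very different starting points.
truck_saving = gallons_per_thousand_miles(14.0) - gallons_per_thousand_miles(18.0)
sedan_saving = gallons_per_thousand_miles(30.0) - gallons_per_thousand_miles(34.0)

print(f"Average trade: {avg_saving:.1f} gallons saved per 1,000 miles")
print(f"Truck, 14 to 18 mpg: {truck_saving:.1f} gallons saved per 1,000 miles")
print(f"Sedan, 30 to 34 mpg: {sedan_saving:.1f} gallons saved per 1,000 miles")
```

Under these assumptions, the same 4 mpg improvement saves roughly four times as much fuel on the low-mileage truck as on the already efficient sedan, which is the effect that “Miles per Gallon” tends to hide.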

Is Faster Better?

April 13, 2010

One of the most significant measures currently being applied to stimulus spending is the speed with which dollars are put to work. We get that. It doesn’t stimulate the economy very much today to spend money next year. But there’s a delicate balance here. The Recovery Act also suggests that states should be looking toward the long-term benefits of the dollars spent. And that’s not easy to do if the primary goal is to crank the dollars out as quickly as possible.

Former Governor Tim Kaine

Perhaps the best example of this tension is the October back and forth between Congressman James Oberstar, Chairman of the U.S. House Committee on Transportation and Infrastructure, and former Governor of Virginia Tim Kaine. Oberstar complained that Virginia was the slowest of the states in getting its transportation funding out into the field. Kaine responded that his state was different from others, in that it was using its ARRA money to create new projects, not just to keep old ones going. What’s more, he pointed to the careful, thoughtful process Virginia uses to make sure its infrastructure dollars are well spent.

We asked some transportation officials in Virginia about the exchange, and were struck by something that chief financial officer Reta Busher said regarding the challenges of transportation management: “The short term nature of what we are dealing with is troublesome in a business where a project can take anywhere from 18 months to 5 years.”

We think there’s an important point in having a range of different measures. If you concentrate too much on speed, then quality may be sacrificed, as we wrote in a column in Governing magazine. With a range of measures, no one element gets an inordinate or inappropriate amount of focus. It’s fine to measure speed, but that information also has to be weighed against how well the services are working, or what the dollars are accomplishing, in the long run.

A new report on measuring energy savings

April 7, 2010

With roughly $25 billion flowing out of the stimulus package to various energy efficiency programs, there’s a compelling need to measure and evaluate energy savings.

A new report from the Alliance to Save Energy asks:  “Are taxpayers, ratepayers, shareholders and property owners getting their money’s worth?” “Are energy savings and other benefits, such as air pollution and greenhouse gas reductions and enhanced reliability of electric grids, being delivered?”

Good questions. Hard answers. The conversation about how to measure energy efficiency has been going on for decades and there’s plenty of disagreement about methods of evaluation and the assumptions to be used. The report makes a good case that discussions on these topics need to accelerate and that a wide variety of stakeholders should be included.

As usual in topics involving government evaluation and measurement, the perfect is the enemy of the good. The acceptance of some uncertainty is unavoidable and the allure of more precise measurement and evaluation has to be balanced against cost. The problems of measurement, which the report delineates in detail, should not stand in the way of tackling it. As the Urban Institute’s distinguished fellow Harry Hatry said to us several years ago, “You just have to accept the fact that it’s better to be roughly right than precisely ignorant.”

The report is loaded with sources for more information — like the California Measurement Advisory Council database of evaluation studies from that state and the Consortium for Energy Efficiency database, which contains studies from other states.

Conflicting goals, weatherization and a little about soccer

April 6, 2010

One of our very favorite Governing columns that we’ve written over the years was about performance measurement and girls’ soccer. As we watched our daughter play, we noticed we were seeing some of the same performance issues come up as we’d seen in government. One of the chief problems was that of conflicting goals. Coaches said they wanted to develop players and win games, but doing both those things simultaneously was tricky. If you played your best kids all the time, you might win, but you wouldn’t do as much as you could to develop the skills of the bottom half of the team. If you played your weaker players – thus developing them – it was likelier that you’d lose.

Sandy Greene, daughter and performance measure

We can’t help but notice that the Recovery Act is also struggling with a passel of conflicting goals. The area that has been most significantly paralyzed by this problem has been weatherization. The Recovery Act sought to spur economic growth, create jobs and lower energy bills by providing insulation, caulking, weather stripping, etc., to low-income families. But the goal was also to make sure that the jobs paid the prevailing wage. Since weatherization work had not been covered by this requirement before, the arduous and detailed task of calculating and setting proper wage rates fell to the Department of Labor, and then the Department of Energy had to help states figure out how to certify that these payroll requirements were met.

Hence the delay. On March 5, California auditor Elaine Howle testified before the House Committee on Oversight and Government Reform that when the auditor’s office finished its fieldwork in December, no houses had yet been weatherized in California even though $93 million had been available since the end of July. By February, the Department of Community Services told the audit office that 210 homes had been weatherized. Putting aside the fact that Howle seemed a bit dubious of this number in her testimony, those 210 houses still fell far short of the state goal of weatherizing 1,433 houses per month.

There are lessons to be learned here, and we think one of the primary ones is for government decision-makers to assess early on where worthwhile, but conflicting goals — like getting work done speedily and setting up a complex new payroll structure — may end up causing problems.

We’re going to cover more about the technical side of this issue in a couple of upcoming posts.

Youth shall be served. . .

March 22, 2010

The Recovery Act included about $1.2 billion to fund jobs for disadvantaged young people, with a healthy portion going to the 2009 Summer Employment Program. The numbers are impressive. About 250,000 youths received job assistance through the pre-existing Workforce Investment Act in 2008. In 2009, with the stimulus dollars, that number grew to 355,000, of which about 88 percent participated during the summer.

A new evaluation of the summer job program by Mathematica Policy Research paints a generally positive picture of the program and includes a number of lessons learned. Under a contract with the Department of Labor, Mathematica interviewed young people, program staff and employers  and sought information about a variety of topics including the two performance measures required by the Recovery Act and the Department of Labor: an assessment of whether young people achieved a “work readiness skill goal”  at the end of their experience and data on whether they completed their summer employment.  Results for both measures are provided for the fifty states in two appendices at the end of the 148-page report.

The stimulus-dollar-funded Summer Youth Initiative operated by the Wise Workforce Center in Virginia

Some of the most interesting information can be found in the variation of completion rates among the states. While 82 percent of young people completed their summer employment nationally, 11 states had completion rates higher than 90 percent:  Alaska, Connecticut, Florida, Georgia, Kentucky, Maryland, Massachusetts, Oregon, Vermont, Washington and West Virginia. At the low end, seven states had completion rates of less than 70 percent: Delaware, Maine, Montana, New Jersey, New Mexico, North Dakota and Utah. As far as we can see, this data, alone, is fodder for some really interesting explorations as to why young people in some states seemed to stick with their jobs better than others.

The information on work readiness, unfortunately, must be regarded with some caution because a lot of freedom was given to entities in determining how to make this assessment.  In fact, one of the recommendations from Mathematica was that the Employment and Training Administration should provide more guidance on how to measure this in a way that ensures “the use of a valid measure across all local areas.” We’re grateful to Mathematica for pointing this out. It’s long been a frustration of ours to see how much national data is skewed by the fact that it is self-reported, without consistency or sufficient guidance.

Still, even though the comparative state information needs to be viewed with caution, the results bear consideration:

Nationally, 75 percent of the young people who participated in the program over the summer attained a work readiness skill goal, but the variation among states was also great here. Six states reported that this goal was accomplished by more than 90 percent of participants: Arkansas, Georgia, Maryland, New Hampshire, Rhode Island and Wisconsin. Twelve states reported results under 70 percent: Delaware, Kansas, Michigan, Montana, New Jersey (with a startling 21 percent), North Carolina, North Dakota, Oklahoma, Oregon, Utah, Vermont and Wyoming.

For what it’s worth, we note that only two states made it to the very top tier of both lists: Georgia and Maryland.
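For readers who want to check the overlap themselves, here is a minimal sketch that compares the lists using set intersections. The state lists are copied from the figures cited above; the variable names are ours, and the cautions about self-reported work readiness data still apply to anything the comparison turns up.

```python
# States with summer employment completion rates above 90 percent.
completion_over_90 = {
    "Alaska", "Connecticut", "Florida", "Georgia", "Kentucky", "Maryland",
    "Massachusetts", "Oregon", "Vermont", "Washington", "West Virginia",
}
# States with completion rates below 70 percent.
completion_under_70 = {
    "Delaware", "Maine", "Montana", "New Jersey", "New Mexico",
    "North Dakota", "Utah",
}
# States reporting work readiness attainment above 90 percent.
readiness_over_90 = {
    "Arkansas", "Georgia", "Maryland", "New Hampshire", "Rhode Island",
    "Wisconsin",
}
# States reporting work readiness attainment below 70 percent.
readiness_under_70 = {
    "Delaware", "Kansas", "Michigan", "Montana", "New Jersey",
    "North Carolina", "North Dakota", "Oklahoma", "Oregon", "Utah",
    "Vermont", "Wyoming",
}

# Top tier on both measures, bottom tier on both, and the mixed case.
top_both = sorted(completion_over_90 & readiness_over_90)
bottom_both = sorted(completion_under_70 & readiness_under_70)
high_completion_low_readiness = sorted(completion_over_90 & readiness_under_70)

print("Top tier on both:", top_both)
print("Bottom tier on both:", bottom_both)
print("High completion, low readiness:", high_completion_low_readiness)
```

The same comparison also surfaces the states that sit in the bottom tier of both lists, and those that score high on completion but low on reported work readiness, which may be a useful starting point for asking how differently the readiness goal was assessed from place to place.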

INTERVIEW WITH: Beth Blauer, director of Maryland’s StateStat

March 16, 2010

Blauer: "The Governor lives and breathes StateStat"

Beth Blauer is director of the Maryland Executive Department’s StateStat office. In that role, she oversees Maryland’s much-praised efforts to manage the state through widely disseminated performance measures. This was modeled, to some extent, on ground-breaking efforts in Baltimore, when Governor Martin O’Malley was mayor there.

As in Washington and a few other states, Maryland has connected its tracking of stimulus dollars to other performance reporting mechanisms.  In a move that makes good sense to us, it launched its stimulus website on the back of the work it had already done with StateStat – and Blauer was put in charge. She also became, in her words, the “de facto stimulus czar.”  Largely as a result of the work done in the past, the online stimulus material was ranked as best in the country in Good Jobs First’s evaluation of state stimulus websites.

We thought it might be interesting to hear what Blauer had to say about this work, and she was kind enough to spend a chunk of time chatting. Following are some excerpts from our conversation.

Q. Maryland’s stimulus tracking was top of the pack in the Good Jobs First evaluation.  What separates Maryland from other states, in your view?

BB: A lot of states are struggling to bridge the gap between accountability with the financial data and the need for accountability and transparency on a management level. We had this highly scrutinized accountability and performance measurement program in place already.  So, we were able to immediately gear up.

Q. In what ways have you linked the spending of stimulus dollars in Maryland to program results?

BB: As much as possible, we’ve connected the tracking of the stimulus dollars with the performance information we have through StateStat. I’m director of StateStat and the de facto stimulus czar. It’s useful to have the same person have both these responsibilities because they’re both concerned with performance and reporting. We want to use this information so we can make the right choices of where to spend the money and we want to use the data on a management level.

The Governor lives and breathes StateStat and he’s looking at the map that shows stimulus activity all the time. I think that’s the most important message – if you don’t have your leaders buying in to it and heavily relying on the tools, you’re not going to have a tool that’s fully used.

The Governor was very interested in the creation of the map. He was sitting next to me and involved down to the choices of the icons. He wanted the map to show more than just the distribution of dollars, he wanted it tied to performance measurement. That’s the principle. He gets it.

Q. States are required to track the jobs that are being funded with the stimulus dollars. What are some of the other ways you’re tracking the use of the stimulus dollars on your website?

BB: We’ve been measuring how quickly our contracts are going out to bid and we’re looking at the percentage of minority business enterprises that are getting our Recovery Act contracts. We’re not relenting on that. We have a goal of 25 percent.

On weatherization, we wanted to connect the weatherization program with the state’s energy assistance program. We’re making it a goal to prioritize the people who get weatherization money, by looking for those who are also receiving cash assistance from the state to help pay energy bills. That way the state’s costs will go down when individual utility costs drop. So, we’ll track who gets the service and we’ll track the impact that has on energy bills. I’m working out how to depict this on the map without having to go down to the individual house level.

We’re also tracking the number of people who are going through weatherization training, so they can work in those jobs.

Q.  A lot of states are now using the same map that you’re using. How did that happen?

BB: We worked with ESRI, a company that specializes in geographic information systems, to create the tools under the condition that they’d share it with other states. We worked very hard to develop it. Now there are over 20 other states with the exact same map.

But that doesn’t mean they’ve necessarily linked to performance. Washington State has some performance linkages and Massachusetts is working on it. New York City is doing a great job.

Q. Are there areas in which you think Maryland has gone beyond other states?

BB: We’re the only state right now that’s extensively using needs data. For example, we had to make decisions about where transit money was going.  If you look at our map under transportation, you can see we’ve color coded the areas in the state where there are high percentages of individuals who don’t have motor vehicles. Those are the spots that may have more need of transit dollars. We knew that Baltimore would be one of those areas, but the map also shows us that there was a justification for Garrett County, in the western part of the state, getting some new buses.

Q. What do you have planned going forward?

BB: For every single funding area, I have a wish list of performance metrics. I’d like to continue to draw connections, as we did with weatherization and energy assistance. I’d like to think about how we can speed up some of the milestones and goals that we’ve had. For example, with an infusion of dollars going to water projects, can we create a faster-paced change in the health of the Chesapeake Bay?

Q. What’s the biggest challenge for you with tying Recovery Act dollars to performance?

BB: This is hard, because it’s a fast-paced program and the majority of the dollars that are coming to the state are for programs that are very well developed, like the increase in the Medicaid match.

The biggest struggle I have in mapping the recovery data is trying to isolate the direct impact that the federal investment has had.

Q. Do you have any disappointments with what you’ve done so far?

BB: I’ve been surprised that there hasn’t been more public engagement. Every place on our map, we have a way that a user can directly communicate with us. I was looking forward to that public engagement as part of the transparency. But we’re not getting the response we were expecting. That’s an area in which we can really strengthen our program.

Starting off on the right foot. . .

March 15, 2010

In recent months, as we’ve talked with evaluation professionals in the states and local governments, we have worried a bit that they weren’t more involved in the early days after the Recovery Act went into action. An old friend, John Turcotte, who is director of the program evaluation division for the North Carolina General Assembly, wasn’t complaining – but he did point out that he hasn’t been asked, thus far, to evaluate any of the aspects of the Recovery Act. A little north, in Virginia, David Von Moll, who is the state comptroller, told us that “We haven’t specifically related ARRA activity to our performance management process. That’s a goal of what we might want to do at some point.”

It certainly seems like a worthwhile goal. Of course, there’s no shortage of data associated with the Recovery Act, and the reporting requirements are keeping local officials just about as busy as a florist on Mother’s Day. The funds are scrutinized, of course, by legions of individuals who are focused on accountability. But attention to program results is significantly less pronounced. Perhaps the emphasis – and public dialogue – would already have shifted in that direction had there been more early involvement from the performance auditors, legislative evaluators, and performance budgeters who spend their lives measuring the impact of programs.

“I felt that the goals of the stimulus office were the same as the performance improvement office, but they were clearly two distinct operations,” says Sharon Daboin, the former Deputy Secretary for Performance Improvement with the Budget Office in Pennsylvania. “I tried to share the tools that were being used to track objectives and accomplishments within each program office, but a key person in the stimulus office said ‘Your focus and mine are completely different.’”

There are some states, like Maryland, in which the stimulus reporting is directly connected with the performance reporting generally. But, that appears to be more the exception than the rule.

A comment from Gary VanLandingham, director of Florida’s Office of Program Policy Analysis and Government Accountability, sums up our feelings nicely (although he wasn’t talking about the Recovery Act when he said this): “I think, ideally, if you have something that is going to have to be evaluated or managed, then it would be good if folks who were designing the programs would bring in some of the evaluation folks at the beginning to know what kind of data we’d need to collect. It’s difficult to retrofit management systems to go back and collect data which ideally you should have been collecting to start with.”

How deep do you want to dive?

March 8, 2010

After years of looking at performance measurement in a whole variety of fields, we’ve come up with two basic principles that we think apply in most cases:

1) You can’t have enough performance measurement and evaluation.

2) Too many performance measures are a bad thing.

These may not seem entirely consistent, we know. But, at heart, they just represent the conflict between an ideal world and the real world. In the ideal world, there would be sufficient time and resources to measure the effectiveness and the efficiency of every government program, and to report those findings to as many people as possible. In the real world, on the other hand, executive managers, auditors, legislators and their staffs will probably never have enough time to look at a fraction of all the metrics potentially available. What’s more, just creating all that information can wind up being an impossible burden on the men and women who are trying to keep up.

Truth is, creating a measure that nobody looks at is a lot like building an energy-efficient car that can only move on gold-paved roads. It might be a great car, but it’s never going to go anyplace.

What does this have to do with the stimulus? Simple. Even as states move forward in the direction of measuring the actual outcomes of these federally funded activities, they’re not going to have the time and resources to measure outcomes across the board in a timely way. We started thinking about this after a terrific conversation with Jeffrey A. Simon, the director of the Massachusetts Reinvestment and Recovery Office. Although his office has been largely concerned with the basics of the Stimulus Act thus far (making sure the money is properly spent and so on), he’s enthusiastic about understanding what’s actually been done with those dollars, beyond job preservation and creation.

Still, he’s convinced that the states are going to have to carefully pick and choose what things to measure or be overloaded by an impossible task. “We’ve just begun to spend some serious time in my office talking about what it means moving into the implementation phase,” he told us, “as opposed to project selection or spending so much time on federal reporting requirements. All of this is important . . . But then you need to step back and see, from an implementation point of view, what’s being accomplished.

Jeffrey Simon

“You’re going to have to take a close look at the skill sets that you need to do that kind of thing, and the sheer volume of projects. You’re going to have to do a sampling and you’re going to have to be pretty choosy as to what you go and look at, and then make some fundamental decisions about how deep you want the dive to be. How far in the weeds do we want to get?

“I’d tend to take fewer projects and go farther down in the expenditure to the last dollar and understand what it did, and look at the product it produced, whether it’s a weatherized home or a child care voucher.

“One of the distinguishing characteristics of the recovery program is it goes all across state government. It’s not like I can have two or three different people and say, ‘today you’re going to look at housing vouchers and tomorrow you’ll look at road paving.’ Those are fundamentally different. So, I’d need an army of people to do a full examination. And that’s why we’re going to have to make choices.”