Posts Tagged ‘data quality’

Cash for Clunkers: Measuring Performance

May 5, 2010

The GAO recently released a report reviewing the impact of Cash for Clunkers. Beyond good information about that program, this report impressively discusses general questions regarding stimulus implementation and performance measurement. Here’s the conclusion of the GAO’s report: (Note: NHTSA refers to the National Highway Traffic Safety Administration; the emphasis is ours.)

The implementation and results of the CARS program offer potential lessons learned for future vehicle retirement or similar incentive programs. First, the program produced economic and environmental benefits, achieving its broad objectives. However, the extent of the program’s effects is uncertain.

Second, before a program is underway, steps must be taken to determine what impacts are going to be measured and what data will be required to measure them. Moreover, steps must be taken to ensure that the data are reliable. NHTSA relied heavily on the consumer survey for data on the economic and environmental benefits of the CARS program. However, there is a potential risk to the reliability of estimates based on this survey data, because NHTSA did not follow some generally accepted survey design and implementation practices, largely because it had limited time to establish and administer the program.

Finally, given the number of stakeholders that are financially affected by the auto industry, it would be important to collect and consider information on how a future program would affect these stakeholders and take mitigating actions, as appropriate.

It’s hard not to get the sense that the fly-by-the-seat-of-your-pants nature of Cash for Clunkers made it, at best, difficult to ascertain the real benefits of the program. Given that some facets of the stimulus are designed specifically to spur data collection and use (as in the education and health care fields), we’re often surprised by the data shortcomings in other stimulus spending as well as the lack of strong performance management thinking.

We appreciate that officials wanted to execute the program quickly, but if you can’t evaluate how beneficial a program is, how can you know whether or not it was wise to do it in the first place, much less do it quickly? How can you learn how to do it better next time?

We should also note that this second takeaway (“before a program is underway, steps must be taken to determine what impacts are going to be measured and what data will be required to measure them”) sounds to us like some of the issues we’re hearing about with respect to other stimulus projects. In several of the conversations we’ve had about measuring stimulus programs, jobs and spending seem to be the only real measures of success for certain programs.

Finally, anyone interested in designing and conducting valid surveys should definitely read Appendix II of this report, in which the GAO analyzes the NHTSA’s consumer survey of Cash for Clunkers, pointing out its weaknesses.


“You can’t do the math that way”

April 12, 2010

One of the best sources for news reporting about the stimulus is ProPublica. Its staff is particularly savvy about how to dig into the data and seems happy to share its knowledge with journalists elsewhere.

Jennifer LaFleur

ProPublica, which draws its funding from the Sandler Foundation and other contributors, is an independent, non-profit newsroom, created with the purpose of covering “truly important stories, stories with moral force.” It has aggressively covered the stimulus from its passage in February 2009. We recently talked with Jennifer LaFleur, who is director of computer assisted reporting there.

How did ProPublica get started covering the Recovery Act?

In the beginning, there was this huge amount of money and very little coverage. Initiated by our stimulus reporter Michael Grabell, we started pulling together as much data as we possibly could. Before the data was on, we had to go to individual states to get it. In some states, if you’re not a resident, they won’t provide public records – even though this was federal money.

Early on, we spent a lot of time with the state websites and the state department of transportation sites. Initially, a lot of states looked like they were giving you a lot of information, but they really weren’t. Texas had a pretty page and lots of graphics and flashy things, but it didn’t provide a list of actual recipients. I remember that Washington was a state that did a really good job on its state DOT site.

With all the data on, is the stimulus a lot easier to cover now?

It is easier, though there is a problem that not all the data is in one place. There’s and and there’s duplication in those sites. The information in the latter covers the parts of the stimulus that weren’t intended to create jobs – Section 8 housing money, Small Business Administration loans.

We’ve collected all the data in one place, so we have double the records in our data set that you’d find in

What do you think of the website?

For the speed with which they had to set this up, is one of the better sites as far as providing information. But it was a very big promise to say we’d be able to follow every dollar. That’s not the case.

In the beginning, there were definitely data quality issues.  There were zip codes that didn’t exist and data entry errors on Congressional districts. They’ve done a pretty good job of fixing the data issues.

Are there improvements you’d make in

It’s been responsive and fixed a lot of the big problems that came up early on.

But there is information you’d like to see. They don’t include counties in the data and that’s kind of a standard thing. They have other location information, but people want to look at what money is going to their county. There are some identifier issues I’d fix.  You can connect a project to a prime recipient, who doles out the money to sub-recipients, but you can’t really hook up vendors to sub-recipients.

For example, a reporter called yesterday.  He saw that a sub-vendor in his area was a car dealership and  wanted to know who had hired that car dealer. But the data doesn’t show that.

There’s also no information on anything below sub-recipient. In weatherization, for example, the money goes from the Department of Energy to states and states disburse the money to community action agencies and then they hire contractors to do the work. The data doesn’t show those relationships.

I know that at some point you have to draw the line on how many levels you go down.

What do you think of the quality of local reporting on the stimulus?

It’s also gotten a lot better. The questions we get from reporters are more sophisticated than they used to be, but understanding the data is still complicated.

I’ve noticed that a few organizations have made the mistake of taking the total project money, dividing by the number of jobs and then writing a story about the $500,000 job. You can’t do the math that way. A lot of the money goes to other things – like buying asphalt or trucks.

Also, when I do training sessions for reporters, I ask how many people have read the recovery bill and nobody raises their hands.
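To make LaFleur’s point about the “$500,000 job” concrete, here is a minimal sketch of the arithmetic. The dollar figures and the split between labor and non-labor spending are invented for illustration; they do not come from any actual award, and the function names are our own.

```python
def naive_cost_per_job(total_award, jobs):
    """The mistaken calculation: treat the whole award as wages."""
    return total_award / jobs


def labor_cost_per_job(total_award, non_labor_costs, jobs):
    """Subtract spending on materials, equipment and so on before dividing."""
    return (total_award - non_labor_costs) / jobs


# Hypothetical road project (illustrative numbers only).
award = 5_000_000
asphalt_and_trucks = 4_400_000  # assumed non-labor spending
jobs_reported = 10

naive = naive_cost_per_job(award, jobs_reported)
labor_only = labor_cost_per_job(award, asphalt_and_trucks, jobs_reported)

print(f"Naive figure: ${naive:,.0f} per job")
print(f"Labor-only figure: ${labor_only:,.0f} per job")
```

With these made-up inputs the naive figure is ten times the labor-only figure, which is the kind of gap that produces a misleading headline. The recipient reports described in the interview do not necessarily break out non-labor spending, so the second calculation may not be possible from the published data alone.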

Do you have any other issues with what you read?

There’s a tendency to jump on a problem as if it were a big conspiracy. Early on, there was a perception that a bad zip code was a signal that there was a big conspiracy of money being dumped in bad places. But it was just bad data.

What aspects of the ProPublica coverage are you particularly proud of?

Michael Grabell has done a great deal of very good work. He dug in very quickly. In one example, he found that stimulus money was going to teeny tiny airports in the middle of nowhere and then the DOT  inspector general came out with an advisory that said the same thing.

We’ve put a lot of work on some of the tools on our website. Our Recovery Tracker lets people track down to the county level and Stimulus Speed Chart looks at how fast money goes out the door.

What are your plans for future coverage?

We’ll continue to improve our Recovery Tracker and we likely will develop new interactive tools for covering the stimulus. We’ll also continue to dive into the data to do investigations.

Note:  A late afternoon congratulations to ProPublica, which became the first online non-profit news organization to win a Pulitzer. The award, announced today, was in the investigative journalism category and was shared with The New York Times for an article published in its Sunday magazine last August. The article, by Sheri Fink, dealt with choices faced by doctors at a New Orleans hospital immediately following Hurricane Katrina.

A civilized followup

April 8, 2010

Veronique de Rugy

Last week, we noted the excellent back-and-forth between Veronique de Rugy and Nate Silver regarding some of de Rugy’s research into stimulus spending. The best part of the debate, at least for us, was that two smart people put aside differences to prod each other to better answers and sharper thinking.

Less than a week later, de Rugy has already posted a revised version of her study, using methodological improvements that came out of her discussion with Silver. In the spirit of transparency, she has posted her revision, the original, and all her data here. Her own brief summary of the revised results is posted on The Corner. And she also sums up the questions that many researchers, journalists and citizens (including us) are asking:

There is still much more to learn on the question “How are stimulus funds being spent and why?”

The more I dig into this, the more important the question seems. After all, we’re talking about hundreds of billions of dollars. This is a question I will continue to try to shed some light on. And I’ll continue to refine my model to do so. For instance, I would like to try to understand better what we could learn from the design of the stimulus bill itself — what can we say about the fact that so much money is in fact spent or going to state capitals? (It’s not just that it’s being allocated there; the data I use tells where the money is actually spent.) What explains the fact that so much more of the money is going to the Department of Education than to the Department of Transportation? And what are we getting for our money?

We hope that de Rugy (and Silver and many others) continue asking these questions, especially that last one, and keep prodding each other toward better answers–and do so as transparently and cooperatively as de Rugy and Silver.

Trying to meet “a gold standard”

April 5, 2010

We recently interviewed Chris Patton, Recovery Act director in Wisconsin, to find out more about the improvements to the Wisconsin stimulus website.

Stimulus-supported school construction bonds in Wisconsin

Q. Wisconsin did very well on the Good Jobs First evaluation of state Recovery Act websites. What motivated you to make more improvements?

CP: We wanted to have as robust a tool as we could to explain to the public where the Recovery Act money was going and the impact it had on our economy. It’s a monumental effort. We’re really emphasizing that this has to be presented in a way that is easily understood by the average citizen.

Q. Will this experience have an impact on how the state reports on other programs?

CP:  I think the emphasis on customer service, the emphasis on listening to the public will really pay dividends going forward.

Historically, this kind of information has been available, but not in a user friendly manner.

Q. What have you learned in doing this?

CP:  We really tried to respond to what the public wants and to respond quickly. People are asking for information in a real time manner and typically, things can move slowly in state government.

We needed to learn how to be much more responsive in real time with the information we’re making available.

Initially, the federal Recovery website and ours were going to be updated quarterly. We realized and the public realized that there’s a lot going on and that they want to know what’s happening up-to-the-minute. Our website is now refreshed on a much more frequent basis, sometimes daily.

The public also had a much greater desire for details as to where the money is going, not only who is receiving it, but who the vendors and contractors are. We retooled the website to have that level of detail. We go down to basically any vendor payment that goes out the door. The federal payment standard is above $25,000, but we track spending down to a $10 hammer at Home Depot.

Q. What would you point out on your website as the key improvements?

CP: We give people the ability to download data – so you can take the information we present and get at the raw information and sort and compare. We had requests that made us consider different ways to access the information – for example, through local government identifiers or commercial districts.

There’s also an expanded search. You can input your zip code or look at it graphically on the map and you can zoom right in and see where recipients are, right down to your own street.

There’s a public policy side to having the level of detail that we do. With maps that overlay the unemployment rates in various communities, you can start to see where you are having impact and where you need to redouble effort. You can look at the per capita distribution.

Q. Wisconsin’s website seems to move faster than others. Do you know why?

CP: We charged our staff with meeting a gold standard. We have a set of products and tools that give us cutting edge technology on the back end of our website.

Q. Very few states have included information on where the bond money is going. Why do you feel this is important?

CP: The public doesn’t always understand that there’s a whole part of the Recovery Act that includes the bond provisions and tax credits. These are very important to stimulating our economy. We took it upon ourselves to make that information available to the public. This is a portion of the Recovery Act that doesn’t get a lot of attention.

With the bond program, we noticed that certain communities had expended their bonds and desired additional bonding allocations, but others hadn’t. We worked with the legislature to pool the unused bonds and target the communities that had the greatest need with products that were ready to go.

The geographic element, combined with the different data elements from the Commerce Department, was very helpful.

Q. What would you like to improve in the future?

CP: So much attention has gone to tracking the money, but there are a lot of other performance measures that the programs are undertaking. For example, the number of meals on wheels served. We hope to have additional details – not just what jobs are being funded, but truly understanding how programs are performing and the other benefits of the Recovery Act funding.

Q. What are the biggest challenges?

CP: Bringing the data together centrally is the challenge.

As we ramp up our performance monitoring efforts and collect other data elements and performance measures, we want to show unique program details. Are we meeting program goals in training dislocated workers? Are they not only being trained, but becoming employed?

The biggest barrier is the overall complexity of all the programs and the volume of data.

A Civilized Debate

April 2, 2010

Anyone who’s interested in how, and how well, stimulus dollars are being spent should read the exchange between Veronique de Rugy of George Mason University’s Mercatus Center and Nate Silver of about some pivotal data and quality issues. Of course, we could summarize their points, but we think you’ll find it much more illuminating to read through the exchange itself and let the authors speak for themselves.

We suggest that you read the four pieces in reverse order (starting with number four and moving back to number one). Trust us, it’ll make more sense that way. But here they are in the original order:

  1. de Rugy’s paper, which got the ball rolling
  2. Silver’s first response
  3. de Rugy’s reply
  4. Silver’s second response

The exchange is worthy of attention because it highlights two great points:

1) We really like seeing smart people getting past their initial suspicions of each other in order to begin hammering out some fundamental questions about the quality of data and potential issues in the stimulus package itself.

We’ll never get very far in understanding what works and how well it works if we can’t get to points of basic agreement across partisan or ideological aisles about the ground rules of analysis. This civilized debate underscores how partisan and toxic much of the discussion over the stimulus (and other recent government plans) has become.

It has also been suggested that this is a model for the quick, effective peer-review that the internet facilitates.

2) In the course of their discussion, de Rugy and Silver raise some very good and sobering questions about the quality of the data for analytical purposes provided by

Note these two comments, for example:

“I worked within the confines of $18 million website, a website that we were promised would allow us to track the money to the last cent. Obviously, that is not the case. The money trail ends at the level reported, and from the website one cannot tell where the money went next.”–de Rugy’s reply


“I share de Rugy’s disappointment with the quality of the data available at Frankly, I am not sure that testing her hypothesis to a peer-reviewable level of robustness is possible given the middling quality of data and the inherent ambiguity with how particular projects must be assigned to particular congressional districts.”–Silver’s 2nd response

We  hope that the officials working on are paying attention. We were excited by the recent appointment of data presentation guru Edward Tufte to the ARRA Recovery Independent Advisory Panel, and hope that that might be a good step in making more useful to citizens and researchers alike.

Youth shall be served. . .

March 22, 2010

The Recovery Act included about $1.2 billion to fund jobs for disadvantaged young people, with a healthy portion going to the 2009 Summer Employment Program. The numbers are impressive. About 250,000 youths received job assistance through the pre-existing Workforce Investment Act in 2008. In 2009, with the stimulus dollars, that number grew to 355,000, of which about 88 percent participated during the summer.

A new evaluation of the summer job program by Mathematica Policy Research paints a generally positive picture of the program and includes a number of lessons learned. Under a contract with the Department of Labor, Mathematica interviewed young people, program staff and employers  and sought information about a variety of topics including the two performance measures required by the Recovery Act and the Department of Labor: an assessment of whether young people achieved a “work readiness skill goal”  at the end of their experience and data on whether they completed their summer employment.  Results for both measures are provided for the fifty states in two appendices at the end of the 148-page report.

The stimulus-dollar-funded Summer Youth Initiative operated by the Wise Workforce Center in Virginia

Some of the most interesting information can be found in the variation of completion rates among the states. While 82 percent of young people completed their summer employment nationally, 11 states had completion rates higher than 90 percent:  Alaska, Connecticut, Florida, Georgia, Kentucky, Maryland, Massachusetts, Oregon, Vermont, Washington and West Virginia. At the low end, seven states had completion rates of less than 70 percent: Delaware, Maine, Montana, New Jersey, New Mexico, North Dakota and Utah. As far as we can see, this data, alone, is fodder for some really interesting explorations as to why young people in some states seemed to stick with their jobs better than others.

The information on work readiness, unfortunately, must be regarded with some caution because a lot of freedom was given to entities in determining how to make this assessment.  In fact, one of the recommendations from Mathematica was that the Employment and Training Administration should provide more guidance on how to measure this in a way that ensures “the use of a valid measure across all local areas.” We’re grateful to Mathematica for pointing this out. It’s long been a frustration of ours to see how much national data is skewed by the fact that it is self-reported, without consistency or sufficient guidance.

Still, even though the comparative state information needs to be viewed with caution, the results bear consideration:

Nationally, 75 percent of the young people who participated in the program over the summer attained a work readiness skill goal, but the variation among states was also great here. Six states reported that this goal was accomplished by more than 90 percent of participants: Arkansas, Georgia, Maryland, New Hampshire, Rhode Island and Wisconsin. Twelve states reported results under 70 percent: Delaware, Kansas, Michigan, Montana, New Jersey (with a startling 21 percent), North Carolina, North Dakota, Oklahoma, Oregon, Utah, Vermont and Wyoming.

For what it’s worth, we note that only two states made it to the very top tier of both lists: Georgia and Maryland.
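The cross-check between the two lists is easy to reproduce. The sketch below uses only the state names listed above; the variable names are ours, and the same caution applies as before, since the work readiness figures were self-reported with little consistency.

```python
# States as listed in the two Mathematica appendices summarized above.
completion_over_90 = {
    "Alaska", "Connecticut", "Florida", "Georgia", "Kentucky", "Maryland",
    "Massachusetts", "Oregon", "Vermont", "Washington", "West Virginia",
}
completion_under_70 = {
    "Delaware", "Maine", "Montana", "New Jersey", "New Mexico",
    "North Dakota", "Utah",
}
readiness_over_90 = {
    "Arkansas", "Georgia", "Maryland", "New Hampshire", "Rhode Island",
    "Wisconsin",
}
readiness_under_70 = {
    "Delaware", "Kansas", "Michigan", "Montana", "New Jersey",
    "North Carolina", "North Dakota", "Oklahoma", "Oregon", "Utah",
    "Vermont", "Wyoming",
}

# Set intersection (&) finds states that appear on both lists.
top_on_both = sorted(completion_over_90 & readiness_over_90)
bottom_on_both = sorted(completion_under_70 & readiness_under_70)
high_completion_low_readiness = sorted(completion_over_90 & readiness_under_70)

print("Top tier on both:", top_on_both)
print("Bottom tier on both:", bottom_on_both)
print("High completion, low readiness:", high_completion_low_readiness)
```

Beyond Georgia and Maryland at the top, the same comparison shows five states in the bottom tier of both lists, and two states (Oregon and Vermont) that had high completion rates but low reported work readiness. That last pairing may say more about how differently states measured readiness than about the young people themselves.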

How much does it cost?

March 18, 2010

We don’t want to get into the politics of the debate about road signs that advertise the use of stimulus dollars, but it does point out a significant shortcoming in government generally. That’s the difficulty in coming up with good reliable figures about what things cost.

Last month,  PolitiFact looked at the controversy over road signs, following up on comments by former Gov. Sarah Palin that one state spent one million dollars to post signs that advertised where stimulus money was being spent. PolitiFact did some research and concluded that the state was Ohio, but that the one million dollar figure was suspect.

It was based on the probably erroneous idea that each sign was about $3000.  But the Ohio Department of Transportation told PolitiFact that there was no way to come up with the cost of the signs.

Why? Because the costs of all construction signs in a construction zone were rolled together. PolitiFact pointed to inquiries in other states by the Associated Press, which  uncovered figures that were substantially lower than the $3000 estimate — Michigan estimated a sign costs $400 to $500. Illinois put the amount at $1000 ($300 for the sign and another $700 for labor). Colorado officials put the labor plus materials cost at between $750 and $1,200.

Of course this is small change, but wouldn’t it be useful for a state like Ohio to be able to clearly answer questions about what things cost — even if they’re small things? That would not only lead to useful cost comparisons, but help ensure that debates on spending were based on accurate information.
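For readers who want to see how much the per-sign assumption matters, here is a rough sketch of the arithmetic. It assumes, purely for illustration, that the one million dollar figure was reached by multiplying a sign count by $3,000; the per-sign ranges are the ones the Associated Press reported for Michigan, Illinois and Colorado. Ohio’s actual sign count and costs are unknown, which is the whole point.

```python
CLAIMED_TOTAL = 1_000_000
ASSUMED_COST_PER_SIGN = 3_000

# Number of signs implied by the claim, under the assumption stated above.
implied_signs = round(CLAIMED_TOTAL / ASSUMED_COST_PER_SIGN)

# (low, high) per-sign estimates reported by other states, in dollars.
per_sign_estimates = {
    "Michigan": (400, 500),
    "Illinois": (300 + 700, 300 + 700),  # sign plus labor
    "Colorado": (750, 1_200),
}

# What the same number of signs would cost at each state's estimate.
totals = {
    state: (low * implied_signs, high * implied_signs)
    for state, (low, high) in per_sign_estimates.items()
}

print(f"Implied number of signs: {implied_signs}")
for state, (low, high) in totals.items():
    print(f"{state}: ${low:,} to ${high:,}")
```

Under these assumptions, every one of the other states’ estimates puts the total well under half of the claimed one million dollars.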

Cleaning up the data

March 17, 2010

Some good news emerges from the Recovery Accountability and Transparency Board’s just-released quarterly report to the President and Congress. Many of the data quality problems that emerged in the first reporting period have been corrected, as illustrated in the report’s table one, on page five.

Late reports, for example, are down 30 percent. At the same time, technological improvements have reduced the number of incorrect congressional districts shown to zero. Mismatched zip codes — a source of consternation in the first reporting period — are similarly nonexistent now.

Good news from the latest Recovery Board report

This report underscores the stories of improvement we’ve been hearing from officials in the states.  All this makes us wonder whether the data hassles of the first quarter reports will end up teaching the federal government and the states how to share, compile and report data much more effectively, across the board and (maybe) lead to improvements in the way governments work together. As Chris Masingill, the stimulus czar in Arkansas said:  “this is a federal-state partnership that historians will be talking about and writing about.”

That said, we are also curious to see how much this data quality improvement is reported in the press. Yesterday’s coverage in the Washington Post’s “Federal Eye”  — with the headline “Government probing thousands of stimulus complaints” — concentrates on statistics about complaints and investigations of potential fraud. (The actual figures reported on complaints, as outlined in a table on page 8, were 1,777 received through January 31, 2010, with 147 active investigations, and 43 prosecutions initiated.)

Keeping up with the GAO

March 8, 2010

Keeping up with the Government Accountability Office’s reports about the Recovery Act is a pretty demanding undertaking. But we’re going to try to do so on your behalf, and offer you summaries of the most important reports as they come out (as well as links to others).

The most comprehensive new report was the GAO’s major 174-page look at the Recovery Act “One Year Later”. It gives a very clear picture of the pace at which dollars are being distributed, the vast array of oversight activities and a variety of agency specific topics (like the wage issues that have stalled the delivery of weatherization funds).

A few of its important points:

* Data quality. For individuals who have been following the fracas in the press over the last six months about data issues — like jobs reported for Congressional Districts that don’t exist, double counted jobs, incorrect dollar figures, etc. — the GAO report is surprisingly comforting. Lots of improvements were put in place for the second reporting period, which ended on December 31, 2009. New simplified job data have helped increase accuracy, along with ways to flag potential errors as data is input. (For example, the system will now stop people from proceeding if they input a Congressional District that doesn’t match their zip code.)

* Maintenance of Effort. Among the trickier issues pertaining to the Recovery Act are the “maintenance of effort” requirements, particularly in transportation. A continuing budget crisis, coupled with legislative cuts and changes, has made it difficult for many states to certify to DOT that the stimulus money will actually be adding to previously planned spending. The Federal Highway Administration is expecting lots of states to send in revised certifications this week, according to the GAO. Whether or not states meet maintenance of effort requirements is likely to be an ongoing question.

* Capacity. As states and local governments grapple with all their responsibilities for reporting Recovery Act information, capacity questions continue to loom large. This problem will become increasingly acute as beleaguered budgets necessitate layoffs and temporary furloughs. For example: The GAO reports that capacity has been a barrier in housing agencies, causing some to bypass applying for competitive grants. This is particularly a problem for smaller agencies. Capacity issues also create obstacles, for small agencies in particular, in dealing with the distribution of weatherization dollars and the administrative requirements.
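The input check mentioned under “Data quality” above (stopping a filer whose Congressional District doesn’t match the zip code) can be sketched in a few lines. This is our own illustration, not the federal reporting system’s code: the lookup table, its placeholder zip codes and district labels, and the function name are all hypothetical.

```python
# Hypothetical lookup: zip code -> districts that overlap it.
# A real system would load this from an authoritative source.
ZIP_TO_DISTRICTS = {
    "00001": {"XX-01"},
    "00002": {"XX-01", "XX-02"},  # a zip code can span more than one district
}


def validate_location(zip_code, district):
    """Return a list of error messages; an empty list means the entry passes."""
    errors = []
    districts = ZIP_TO_DISTRICTS.get(zip_code)
    if districts is None:
        errors.append(f"Zip code {zip_code} does not exist in the lookup table.")
    elif district not in districts:
        errors.append(
            f"District {district} does not match zip code {zip_code}; "
            f"expected one of {sorted(districts)}."
        )
    return errors


# A filer would be stopped whenever the list is non-empty.
for entry in [("00002", "XX-02"), ("00001", "XX-02"), ("99999", "XX-01")]:
    problems = validate_location(*entry)
    print(entry, "OK" if not problems else problems)
```

Catching the mismatch at entry time is what separates this from the first reporting period, when nonexistent districts and zip codes were only discovered after the data had been published.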

Here are the two other GAO reports that came out last week.

Recovery Act: California’s Use of Funds and Efforts to Ensure Accountability
Recovery Act: Factors Affecting the Department of Energy’s Program Implementation