There are many discussions going on about the OMB’s recently issued guidance, “Social Media, Web-Based Interactive Technologies, and the Paperwork Reduction Act.” Basically, this guidance makes it easier for Federal agencies to use a variety of social media and “web 2.0” tools for interacting with the public without having to go through the expensive and time-consuming clearance process required by the Paperwork Reduction Act.
Setting aside for the moment whether this guidance goes too far or not far enough, one issue it does not address is a concern I have about the statistical validity of the data government agencies collect and interpret to help guide their programs.
Years ago, when I was in the statistical research and survey business, I always dreaded the OMB survey clearance process. I knew it would add months and many dollars to the surveys being contracted for by client agencies such as the National Cancer Institute and the US Copyright Office. But if you had a contract to do a survey for the US government, that’s what you had to do.
That situation hasn’t changed. Federal agencies still do workarounds to avoid having to get OMB clearance for surveys and survey-like data collection efforts. The impacts of this on real-world decision making are hard to assess, but they may include decisions being made without factually based data on how target citizen groups feel about and use expensive government programs.
Enter social media and web 2.0 tools that rest, fundamentally, on engaging and communicating with and among target population groups. Now it’s possible to extract data on conversation frequency, topics being discussed, and even “sentiment” algorithmically classified from keyword, semantic, and grammatical analysis. Making these tools more widely available as a way to engage with — and track — public involvement is a Good Thing, right?
My answer is a definite “Yes,” but there’s a catch. The catch is that we may tend to attribute statistical validity to tools that don’t necessarily provide consistent or complete data. For example, how accurate is a tool that classifies the sex of blog posters using various text cues in surrounding data? What is the impact of constant changes in a social media monitoring tool’s coverage on any kind of trend analysis? How “accurate” is a classification of a blog post’s “sentiment” as “positive” or “negative”?
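To make that concern concrete, here is a minimal, hypothetical sketch in Python of the kind of naive keyword scoring that “sentiment” classification can reduce to. The word lists and sample posts are invented for illustration, and real monitoring tools use far more sophisticated models, but the failure modes are of the same kind.

```python
import re

# A toy keyword-based sentiment scorer. The word lists and sample posts below
# are invented for illustration only; real tools are more sophisticated but
# still face the same kinds of ambiguity.

POSITIVE_WORDS = {"great", "helpful", "love", "improved", "easy"}
NEGATIVE_WORDS = {"broken", "waste", "hate", "confusing", "slow"}

def classify_sentiment(text: str) -> str:
    """Label text 'positive', 'negative', or 'neutral' by counting keywords."""
    words = re.findall(r"[a-z]+", text.lower())
    score = (sum(w in POSITIVE_WORDS for w in words)
             - sum(w in NEGATIVE_WORDS for w in words))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

if __name__ == "__main__":
    posts = [
        "The new portal is easy to use and very helpful.",        # scored positive; correct
        "Not easy to use at all.",                                 # scored positive; negation missed
        "I love how this 'improved' site loses my application.",   # scored positive; sarcasm missed
    ]
    for post in posts:
        print(classify_sentiment(post), "|", post)
```

Even much better models than this toy one leave the basic point intact: a label that looks precise on a dashboard may rest on a classification that would not survive much scrutiny.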
This is not meant as a criticism of such tools; they can be very useful when the populations impacted by a government program are very large, diverse, and/or volatile, and I’ve advised clients on their use. But because the tools exist, because we can subscribe to them easily, and because we can create easy-to-read “dashboards” to track a variety of online events and conversations, there is a danger that we will try to squeeze a bit more out of them than is warranted. A comparable problem exists with the use of online and in-person focus groups: no matter how well we understand their limitations with respect to statistical projectability, we still find situations where we are forced to make decisions about population responses based on focus groups because that’s the only data we have.
I sympathize with this. I know that, no matter what the situation, we rarely have perfect data. Having to “make do” with what we have is a management fact of life.
So my take on the new OMB guidance on social media and the Paperwork Reduction Act is that it’s a Good Thing, but it doesn’t go far enough. Ultimate resolution of its ambiguities will probably require legislative action. Meanwhile, I’m hoping that people responsible for planning, managing, and interpreting data on government program performance derived from social media and “web 2.0” tools make an attempt to align and map such data to accepted and statistically valid population data.
At minimum, program managers and policy makers need to understand the context in which social-media-derived data are being generated and collected. That may require building and maintaining data repositories that combine data from a variety of sources and support reporting to management in ways that objectively provide the facts, context, and insight that social-media-based metrics can now provide.
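As a rough illustration of what such “mapping” might look like, here is a hypothetical Python sketch that compares the demographic mix of online commenters, as a monitoring tool might report it, against a benchmark for the program’s actual target population. Every number in it is invented.

```python
# Hypothetical sketch of mapping social-media-derived figures to an accepted
# population benchmark. All numbers are invented; in practice the benchmark
# would come from a statistically valid source such as a designed survey or
# a census tabulation.

# Share of online commenters about a program, by age group (as a monitoring tool might report it)
commenter_share = {"18-34": 0.55, "35-54": 0.35, "55+": 0.10}

# Share of the program's actual target population, by age group (benchmark)
population_share = {"18-34": 0.25, "35-54": 0.40, "55+": 0.35}

def coverage_ratios(observed: dict, benchmark: dict) -> dict:
    """Ratio of observed share to benchmark share; values far from 1.0 flag skew."""
    return {group: observed[group] / benchmark[group] for group in benchmark}

for group, ratio in coverage_ratios(commenter_share, population_share).items():
    # Ratios well above 1.0 mean a group is overrepresented in the online
    # conversation; well below 1.0, underrepresented.
    print(f"{group}: {ratio:.2f}")
```

The point is not the arithmetic but the habit: putting social-media-derived numbers next to statistically valid population data before treating them as evidence about a program’s audience.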
Copyright (c) 2010 by Dennis D. McDonald. Contact Dennis in Alexandria Virginia at [email protected].
This article was originally published April 8, 2010, on Dennis McDonald’s Web Site.
Reminds me of the early days of government websites when we used to measure “hits” and “communication contacts.” Back then it was very easy to misinterpret server logs to come up with whatever stats you wanted.
Also, we seem to have concentrated a great deal on the quantitative side of data gathering and data analysis. What impact will this have on qualitative data analysis? For example, can we derive useful conclusions from content analysis of tweets without that being treated as a survey under the PRA?
Bill, regarding Tweets, there are a number of commercial services you can use to track different aspects of Tweets with respect to a particular program or issue that won’t require any survey-related clearance. They provide a variety of ways to profile the qualitative “content” and that can be extremely useful.
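As a rough, hypothetical illustration of what that profiling can amount to, the Python sketch below simply counts recurring terms and hashtags across a handful of invented sample tweets; it does not call any real Twitter or vendor API.

```python
from collections import Counter
import re

# Hypothetical sketch of simple term-frequency profiling over collected tweet
# text. The sample tweets and stop-word list are invented for illustration.

STOP_WORDS = {"the", "a", "is", "to", "and", "of", "for", "my", "on", "i", "it"}

def profile_terms(tweets, top_n=5):
    """Count the most frequent non-trivial words and hashtags across tweets."""
    counts = Counter()
    for tweet in tweets:
        for word in re.findall(r"[a-z#]+", tweet.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(top_n)

sample_tweets = [
    "Renewed my passport online today and the new form was painless #passport",
    "Third week waiting for my passport renewal #passport",
    "Anyone else stuck waiting on a passport renewal? #passport",
]

# Note: counts like these weight every tweet equally, so a few prolific
# posters or organized campaigns can dominate the profile.
print(profile_terms(sample_tweets))
```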
You’re still left wondering, though, how such qualitative reporting represents or refers back to the individual people or organizations you’re trying to serve. Focusing too exclusively on those who engage in online conversations can, if you’re not careful, provide a skewed perspective on your entire target population.