Data analytics is often described as a puzzle in which you try to find how different pieces fit together to form a picture. But it’s harder than that. It’s more like being given, say, 1,000 puzzle pieces and being told that most of them won’t be in the final picture — and you have no idea what that picture will look like.
Feasible? Technically, yes, but practically, no.
That is why automation is so critical to government data initiatives. Specialized software makes it possible to gain insights from large volumes of data more quickly and effectively than human analysts could. And relieved of the arduous task of sorting through all that data, those analysts can focus on putting those insights to good use.
In her article “The Innovative State,” Beth Simone Noveck highlighted an initiative by Chicago’s Department of Public Health. When the initiative launched in 2015, the city had more than 15,000 food establishments but only three dozen health inspectors. They needed to figure out where to put their energies. They turned to data.
Working with a team at Carnegie Mellon University, they came up with a program that identified which restaurants were likelier to violate health codes. The data included information on previous inspections and permit requests, sanitation complaints and requests made to the city’s 311 program for services. Noveck noted that the project increased inspectors’ effectiveness by 25%.
Automation by Learning
Various tools automate data processes, but one of the most promising is machine learning (ML). A type of artificial intelligence, ML is software that improves its results as it processes more data.
Here, in simplified terms, is how machine learning works (courtesy of Scientific American):
- A programmer writes an initial algorithm (that is, step-by-step instructions) that a system can use to look for patterns in data.
- The programmer feeds the system a massive trove of data to analyze — what’s known as training data.
- Based on the patterns the system finds, it creates a model for identifying patterns in future data sets.
- That model evolves over time, as the system learns more effective analytic methods.
So, given 1,000 puzzle pieces, not only would a system be able to sort them faster than a human ever could, but it would get better at it over time.
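The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by any agency or described by Scientific American: the two features (past violations and recent complaints), the "pass" and "fail" labels, the data, and the function names `train` and `predict` are all hypothetical. The "model" here is simply the average feature values for each label, and "learning over time" means refining those averages with new examples.

```python
# Hypothetical features per establishment: (past_violations, recent_complaints).
# All data and names below are invented for illustration.

def train(examples, model=None):
    """Scan labeled examples and keep running per-label totals.

    The "model" is the average feature vector (centroid) for each label.
    Passing an existing model lets it keep learning from new data.
    """
    if model is None:
        model = {"sums": {}, "counts": {}}
    for features, label in examples:
        sums = model["sums"].setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            sums[i] += value
        model["counts"][label] = model["counts"].get(label, 0) + 1
    return model


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(label):
        count = model["counts"][label]
        centroid = [total / count for total in model["sums"][label]]
        return sum((f - c) ** 2 for f, c in zip(features, centroid))
    return min(model["counts"], key=squared_distance)


# Step 2: training data (invented).
training_data = [
    ((0, 0), "pass"), ((1, 0), "pass"), ((0, 1), "pass"),
    ((4, 3), "fail"), ((5, 2), "fail"), ((3, 4), "fail"),
]

# Step 3: build the model and apply it to a borderline case.
model = train(training_data)
before = predict(model, (2, 2))

# Step 4: the model is refined as new labeled results arrive.
new_inspections = [((2, 2), "pass"), ((2, 1), "pass"), ((1, 2), "pass")]
model = train(new_inspections, model)
after = predict(model, (2, 2))

print(before, after)
```

With this invented data, the borderline establishment is first predicted to fail and, once the model has seen the additional inspections, is predicted to pass. Real systems use far richer models, but the loop of training, predicting and retraining is the same idea.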
Learning Bias
The problem is that a system learns by taking an initial data set and generalizing rules that can be applied to future sets. If that training data is skewed, the resulting model will be skewed, too.
Noveck noted that many companies now use machine learning-based tools to interview and assess first-round job applicants, often by comparing their responses to those of current employees. “If current employees are mostly White and American-born, applicants who are Black or foreign-born will score poorly,” she wrote.
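A toy example can make the mechanism concrete. The sketch below is not how any real hiring tool works. It assumes a hypothetical scorer that rates applicants by how closely they resemble current employees, using two invented attributes: `skill_test` and a stand-in demographic attribute, `group`. The function names and data are likewise assumptions made for illustration.

```python
from collections import Counter


def fit_similarity_model(current_employees):
    """Learn, for each attribute, how common each value is among current staff."""
    total = len(current_employees)
    model = {}
    for attribute in current_employees[0]:
        counts = Counter(person[attribute] for person in current_employees)
        model[attribute] = {value: n / total for value, n in counts.items()}
    return model


def score(model, applicant):
    """Average share of current employees matching the applicant per attribute."""
    shares = [model[attr].get(applicant[attr], 0.0) for attr in model]
    return sum(shares) / len(shares)


# Skewed training data (invented): 9 of 10 current employees are in group "A".
employees = (
    [{"skill_test": "passed", "group": "A"} for _ in range(9)]
    + [{"skill_test": "passed", "group": "B"}]
)
model = fit_similarity_model(employees)

# Two applicants with identical skills who differ only in group membership.
score_a = score(model, {"skill_test": "passed", "group": "A"})
score_b = score(model, {"skill_test": "passed", "group": "B"})

print(score_a, score_b)
```

Both applicants passed the same skills test, yet the applicant from group "B" scores lower only because group "B" is rare in the training data. Nothing in the code mentions a preference for either group; the skew comes entirely from the data.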
Overcoming Obstacles
In a paper looking at the potential role of AI and machine learning in reducing government fraud, the Brookings Institution, a nonprofit public policy organization, noted that agencies can find it difficult to incorporate the technology into their operations. With that in mind, Brookings offered several recommendations to agencies:
- Get the technology out of the lab and into operational settings where employees can see the difference it makes. “AI needs to be normalized as part of agency operations and not be a technical gadget that is separated from crucial missions.”
- Get people trained on the ins and outs of AI and machine learning. They don’t need to become software experts, but they need to understand how algorithms work, the risks involved and how to mitigate those risks.
- Establish clear standards for compiling, coding, analyzing and interpreting data to ensure it is useful and unbiased.
- Develop a process for assessing the performance of AI and machine learning programs. Are they meeting program objectives? “Having clear-cut data analysis and policy assessment will inform AI design and deployment, and lead to products that are safer, fairer, and more effective in achieving their objectives,” the paper states.
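As one illustration of what a performance assessment might measure, the sketch below compares a system's flags against what investigators later confirmed. It is an assumption about how such a check could look, not something taken from the Brookings paper: the function name `precision_recall`, the choice of these two measures, and the data are all hypothetical.

```python
def precision_recall(predicted_flags, actual_outcomes):
    """Compare the system's flags with confirmed outcomes.

    Precision: of the cases the system flagged, what share were real problems?
    Recall: of the real problems, what share did the system flag?
    """
    pairs = list(zip(predicted_flags, actual_outcomes))
    true_pos = sum(1 for flagged, real in pairs if flagged and real)
    false_pos = sum(1 for flagged, real in pairs if flagged and not real)
    false_neg = sum(1 for flagged, real in pairs if not flagged and real)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Invented results for eight cases: what the system flagged vs. what was confirmed.
flagged = [True, True, True, False, False, False, True, False]
confirmed = [True, True, False, False, True, False, True, False]

precision, recall = precision_recall(flagged, confirmed)
print(precision, recall)
```

In this invented sample, three of the four flagged cases were confirmed (precision of 0.75) and three of the four real problems were caught (recall of 0.75). An agency could run the same calculation separately for different groups or regions to see whether the system performs unevenly.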
Although machine learning is complex and the pitfalls real, the potential payoff that comes with such advanced automation is compelling. There is a lot of work to be done, but it will be worth it, experts say.
“By making it possible to sort the extraneous chaff from the informational wheat, machine learning could enable agencies to deliver both new and better services to the public,” Noveck wrote.
This article appears in “Your Data Literacy Guide to Improve Everyday Collaboration.”