Imagine you have a massive cache of digital family photos, and you’re looking for images of your child’s kindergarten graduation. Sure, it’s great having all those photos on the computer, but unless you tagged them in some way, there’s no quick way to find what you need.
The same dilemma arises when valuable information comes in the form of “unstructured data”: the information is there, but it is difficult to find. Unstructured data includes images, video, audio and other types of information that do not fit neatly into traditional databases and cannot readily be analyzed with traditional data tools.
Structured data appears in rows and columns that are clearly labeled, making it easy to sort and analyze. Unfortunately, that’s not the case with unstructured data.
“How do you capture that information so that you can start to make sense of it from a more analytical point of view,” said Patrick Johnson, Senior Solution Architect for AI/ML/Data Privacy/Data Governance at MFGS, Inc., “as opposed to having to sit there and stream through all of your content, one file after the next?”
Smarter Technology
Although tools for managing and analyzing unstructured data have been around for years, AI and ML have accelerated the ability of agencies to identify and extract information from a wide variety of content types.
Key capabilities include:
- Text analysis, e.g., sentiment analysis and natural language processing
- Image analytics, e.g., object detection, facial recognition and image classification
- Video analytics, e.g., scene analysis, person detection and object detection
- Audio analytics, e.g., speech-to-text conversion, speaker identification and language detection
MFGS, Inc. provides an AI/ML-driven platform for unstructured data analytics called IDOL, or Intelligent Data Operating Layer. IDOL works with more than 1,500 data formats and comes with built-in connectors that can access files from over 150 data repositories, such as SharePoint, Dropbox and Microsoft Exchange.
“We are giving the end user an extremely efficient way to get the right data at the right time,” Johnson said.
Key Use Cases
One of the most common use cases for unstructured data analysis is responding to Freedom of Information Act (FOIA) requests. Here, an AI-driven platform can do more than find the necessary documents: “the platform can identify sensitive information that needs to be redacted,” Johnson said.
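To make the redaction idea concrete, here is a minimal Python sketch of rule-based redaction. It is an illustration only, not how IDOL or any MFGS, Inc. product works: the `PATTERNS` table and the `redact` function are hypothetical names, and the two patterns (U.S. Social Security numbers and email addresses) are simplified assumptions. A production system would likely combine many more rules with trained models.

```python
import re

# Hypothetical, simplified patterns chosen for illustration only.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "Contact jane.doe@example.gov, SSN 123-45-6789, about the permit."
print(redact(sample))
```

Labeling each redaction by type, rather than simply blacking it out, lets a reviewer see what kind of information was removed without seeing the information itself.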
Another interesting use case pertains to law enforcement investigations. MFGS, Inc. provides a platform called Law Enforcement Media Analysis (LEMA), which enables investigators to extract and analyze data from sources such as closed-circuit TV, social media and other digital content. The platform can identify vehicles, license plates, objects and faces from video or images and match them against a watch list or database of trained entities.
Another common yet often overlooked use case for an AI-driven platform is helping agencies manage and clean up their massive collections of redundant, obsolete and trivial (ROT) information. The result is reduced storage costs and more efficient backups.
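The “redundant” part of that cleanup can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the vendor's method: it treats two files as redundant only when their contents are byte-for-byte identical, and `find_duplicates` is a hypothetical helper name. Judging whether content is obsolete or trivial requires understanding what it says, which is where AI-driven analysis would come in.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group files under root whose contents hash to the same SHA-256 digest."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]


# Small demonstration using throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "report.txt").write_text("quarterly report")
    (root / "report_copy.txt").write_text("quarterly report")
    (root / "notes.txt").write_text("meeting notes")
    groups = find_duplicates(root)
    duplicate_names = [[p.name for p in group] for group in groups]

print(duplicate_names)
```

Reading each file fully into memory keeps the sketch short; for large repositories, hashing in chunks would be the safer choice.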
It’s all about the “speed to understanding,” Johnson said. “With human eyes, you will never be able to match the level of performance that a machine can.”
This article is an excerpt from GovLoop’s guide titled “Your Data Literacy Guide to Improve Everyday Collaboration,” available here.