Hadoop and Business Intelligence

Like my colleague Alex Olesker, I too attended Cloudera Day 2012. While there were many panels of interest, perhaps one of the most important was Amr Awadallah’s talk on applying big data to business intelligence. Many CTOVision readers with backgrounds in the intelligence community may think of corporate espionage when they hear the phrase “business intelligence,” but I assure you that is not what it means. Business intelligence is different from competitive intelligence, which is primarily based on open-source analysis of competitors and markets. Rather, business intelligence is quantitative analysis of an organization’s internal data using advanced analytics techniques.

As we’ve noted before on CTOLabs, business intelligence is a changing field that is increasingly awash in information. Analysts face three core problems:

  • More information – Critical information now comes from new sources, such as online customer reviews and Facebook updates.
  • More change – The pace of competition, changes in customer preferences and organizational changes are all accelerating.
  • More questions – More people are asking new questions, such as: Why did that happen? How do these trends relate?

Adding to the analytical difficulties is the fact that complex and unstructured data, as opposed to relational data, is exploding. The question increasingly becomes: how do you manage the data and categorize it?

Awadallah talked about the unfavorable use/price tradeoff involved in archiving data. Once you’ve archived it, you’ve more or less lost it, because it becomes too expensive to retrieve on demand. The solution? A combined compute/storage layer with Hadoop applications. Awadallah discussed opportunities to employ HDFS and MapReduce in business analytics. Hadoop offers three “core values”: scalability by adding nodes, flexibility in data storage and analysis, and data longevity. No transformation of the kind implied under the previous model is necessary, as data can start flowing at any time with Hadoop applications. Awadallah contrasted what he called the slow “Data Council” model with a new forward-thinking approach he dubbed “Data Scientist,” built on Hadoop products. Hadoop can also grow without requiring developers to re-architect their applications and algorithms. Both structured and raw data can be logged from multiple applications, and Hadoop offers centralized logging across all execution platforms.
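To make the idea of analyzing raw, untransformed logs a little more concrete, here is a minimal sketch of the MapReduce pattern in plain Python. It is not Hadoop code and not anything Awadallah presented; the log format, the sample lines, and the function names (map_line, shuffle, reduce_counts, run_job) are all assumptions invented for illustration. The point is only that the structure is imposed when the data is read, not before it is stored.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical raw log lines from two applications (format invented for this sketch).
RAW_LOGS = [
    "2012-01-10 web checkout user=17 status=ok",
    "2012-01-10 web checkout user=22 status=error",
    "2012-01-11 mobile search user=17 status=ok",
    "malformed line without fields",
]


def map_line(line):
    """Map step: parse one raw line at read time and emit ((source, status), 1)."""
    parts = line.split()
    if len(parts) < 5 or not parts[4].startswith("status="):
        return []  # lines that do not fit the assumed format are skipped, not rejected at load
    source = parts[1]
    status = parts[4].split("=", 1)[1]
    return [((source, status), 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def run_job(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    mapped = chain.from_iterable(map_line(line) for line in lines)
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    for key, count in sorted(run_job(RAW_LOGS).items()):
        print(key, count)
```

In a real cluster the map and reduce steps would run in parallel across many nodes over files stored in HDFS; this single-process version only shows the shape of the computation.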

All of these approaches, Awadallah argued, offered a much better RoB (Return on Byte) for business intelligence analysts looking to use big data to optimize their enterprise.


