Unless you have spent the last couple of years living under a rock, you must have heard about the big data hype. It goes back to 2009, when Google published its Flu Trends paper. Google Flu Trends, in Google’s own words, “uses aggregated Google search data to estimate current flu activity around the world in near real-time.” The hype took off because of the scale involved: they took 50 million search queries from their search engine database and used 450 million different models to test the candidate queries and find the best combination of 45 search terms, one that helped the Centers for Disease Control and Prevention (CDC) in the US predict flu epidemics three weeks ahead of doctor-reported stats (see more details here). Since then, Google’s 2009 Flu Trends has become a poster child for the power of big-data analysis.
Scientists have long known that data can create knowledge and bring new value to the table. But only recently has the rest of the world (governments, businesses, management) really understood the value of that data. Suddenly, it makes absolute sense to extract (or try to extract) the value from all the data. The next question is how.
What Is Big Data, Really?
A quick trip to Wikipedia provides us with a general definition of big data: it “is an all-encompassing term for any collection of data sets so large or complex that it becomes difficult to process using traditional data processing applications.” While ‘large’ is self-explanatory, the key to understanding big data lies in these three words: difficult to process. There is no definite amount or size at which data becomes big (data). Instead, our growing difficulty in processing our ever-expanding data is the alarm that tells us it is time to consider jumping on the big data bandwagon.
While the definition above mainly talks about the volume of the data, there are three other V’s that characterize Big Data: Variety, Velocity, and Veracity. Variety arises when we have (and want to process) various forms of data: pictures, videos, sensor outputs, that kind of variety. Velocity comes into play when we have to handle faster streams of data. Veracity, the last one added to the list, is about the varying levels of noise and processing errors in the data.
Being able to capture, store, process, analyze, and report on that kind of data promises lucrative rewards. Google-predicting-trends-before-they-happen kind of rewards. This changes Big Data, or Big Data Analytics as we like to call it, from a mere hype into a fashion. And from a fashion, one everybody tries to keep up with, Big Data Analytics has become what Western civilization would call teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.
What Next?
So now you realize that your data can offer precious information and knowledge. You also realize that your current infrastructure can only support the data growth for so long, and you really want to know how to manage it, process it, store it, analyze it, and report it. You have now demystified the Big Data hype. Welcome to the club. Let’s work out the how-to’s in the next session.
For more information about big data analytics, please attend our events: Big Data Week.