Apache Hadoop Archives - Fractional Futurist

The problem with Big Data is not the Data

Sep 25, 2012

There is a seemingly irrational obsession with how BIG your Big Data has to be before a magical unicorn appears and delivers the answers your business needs. Not a day goes by when I don’t see some swanky infographic reminding me that Facebook collects several Yottabytes of data every day. OK, so I may have embellished that a…

It’s not how big your data is, it’s how you use it!

Aug 9, 2012, by MC in Big Data

Over the past couple of months I have met and talked to a lot of new and interesting people. Everywhere I go I encounter the same questions about Big Data; it’s like some sort of mass hysteria around what is, on the face of it, a simple concept: “volumes of data”. Example questions: “How much…

Hadoop: Processing ZIP files in Map/Reduce

Jul 6, 2012, by MC in Hadoop

By popular request, I’ve updated my simple framework for processing ZIP files in Hadoop Map/Reduce jobs. Previously, the only easy solution was to unzip files locally and then upload them to the Hadoop Distributed File System (HDFS) for processing. This adds a lot of unnecessary complexity when you are dealing with thousands of ZIP files; Java…
