Oct 17 2010

Hadoop World

Published at 5:22 pm under Miscellaneous


I just came back from the Hadoop World conference in New York, and I have to say it was quite exciting. Processing huge amounts of data used to be a problem for only a handful of companies such as Google, Yahoo and Facebook, but it has now become a problem for many. The conference topics were interesting and the training held by Cloudera was really good. My personal recommendation is to get up to speed quickly on Hadoop and related technologies (e.g. HBase, Hive and Pig), since I think ever-growing data sizes will soon make these tools commonplace. It takes time to learn how to think at scale and to use these tools properly. I’ve now seen “big data” grow to sizes where not even big clustered databases like Oracle RAC can process the data and extract information quickly enough for our needs. Hadoop is not a universal tool for big data problems, but for a certain set of problems it’s quite powerful and provides performance that scales almost linearly as you scale up your compute cluster. Cloudera has excellent Hadoop videos to get you started (http://www.cloudera.com/resources/?media=Video). Tom White’s “Hadoop: The Definitive Guide (2nd Edition)” is also excellent and I can highly recommend it.
