I am currently working to set up an OLAP data warehouse using Hive on top of Hadoop. We have a considerable amount of data coming from the ad servers, and we need to perform various kinds of analysis on it.
Writing a map-reduce job is not difficult in principle, but it is time-consuming and requires the skills of a trained Java engineer, which wouldn't be needed were we using SQL. That's where Hive comes in: it allows us to query a Hadoop data store using a flavor of SQL.
With the constant increase in the quantity of data that companies collect and need to process, data warehousing is a job sector that's expanding even in the recession. It is also enjoying a second youth, thanks to a number of open source projects that have been slowly but surely gaining popularity in a manner similar to Linux 10 years ago. One of these technologies is Hadoop, a distributed filesystem and data processing framework based on Google's Map/Reduce paper. Hadoop powers Yahoo! Search, Facebook and many other sites' data warehouses.
One of the first questions that a 'traditional' ETL engineer asks when learning Hadoop is, "How do I do a join?"
For instance, how can we express in Hadoop a query for the names of all employees who are in a California city:
SELECT e.name, c.name FROM employees e INNER JOIN cities c
ON e.city_id = c.id AND c.state = 'CA'
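One common way to answer this is a reduce-side join: the mappers key every record by the join column and tag it with its source table, and the reducer pairs up the records that share a key. The sketch below only simulates that flow in memory with plain Python. It is not Hadoop code, and the sample rows and function names (map_employee, map_city, shuffle, reduce_join, run_join) are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows; the values are illustrative only.
employees = [
    {"name": "Alice", "city_id": 1},
    {"name": "Bob", "city_id": 2},
    {"name": "Carol", "city_id": 3},
]
cities = [
    {"id": 1, "name": "San Francisco", "state": "CA"},
    {"id": 2, "name": "Austin", "state": "TX"},
    {"id": 3, "name": "Los Angeles", "state": "CA"},
]


def map_employee(rec):
    # Key by the join column and tag the value with its source table.
    yield rec["city_id"], ("E", rec["name"])


def map_city(rec):
    # Filter on the map side so non-CA cities never reach the reducer.
    if rec["state"] == "CA":
        yield rec["id"], ("C", rec["name"])


def shuffle(pairs):
    # Stand-in for the framework's shuffle phase: group values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_join(key, values):
    # Split the tagged values by source, then emit every matching pair.
    emp_names = [v for tag, v in values if tag == "E"]
    city_names = [v for tag, v in values if tag == "C"]
    for emp in emp_names:
        for city in city_names:
            yield emp, city


def run_join(employees, cities):
    pairs = []
    for rec in employees:
        pairs.extend(map_employee(rec))
    for rec in cities:
        pairs.extend(map_city(rec))
    joined = []
    # Sort keys to mimic the ordered key delivery a reducer would see.
    for key, values in sorted(shuffle(pairs).items()):
        joined.extend(reduce_join(key, values))
    return joined


result = run_join(employees, cities)
for emp, city in result:
    print(emp, city)
```

An employee whose city was filtered out (or is missing) produces no output, which matches INNER JOIN semantics. In a real job the mapper and reducer would run on separate nodes and the grouping would be done by the framework.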
Today's rich IDEs make a lot of tasks easier... usually. With Java and its IDEs you often end up spending more time than you anticipated just setting up a project, especially when dealing with the complexities of J2EE: there are multiple versions of the specifications (1.3, 1.4, 5.0), each one with multiple implementations by different vendors, plus extensions (RichFaces, Struts, Seam, Spring...). You also have to choose the container (Tomcat, GlassFish, JBoss...). Last but not least, you can also pick different build tools (Ant, Maven...).
I am switching my personal blog/site to Drupal. Open source CMS platforms are now mature and powerful enough to build a fast, powerful site without having to code every single feature (that is, reinventing the wheel for authentication, registration and so on).
I was using WordPress, but I never found it either particularly user-friendly or particularly powerful. After doing some research on web forums, I decided to commit to Drupal, which is regarded as having the best architecture.