Joins in Hadoop using CompositeInputFormat

June 7, 2009


One of the first questions that a ‘traditional’ ETL engineer asks when learning Hadoop is, “How do I do a join?” For instance, how would we express in Hadoop a query for the names of all employees who are in a California city: SELECT, from employees e INNER JOIN cities c […]