Browsing all posts tagged »big data«

Streaming Data to HTTP using Akka Streams with Exponential Backoff on 429 Too Many Requests

March 12, 2019


HTTP/REST is probably the most widely used way to exchange data between services, especially in today's microservice world...

Creating Nested data (Parquet) in Spark SQL/Hive from non-nested data

April 4, 2015


Sometimes you need to create denormalized data from normalized data. For instance, you might have data that looks like

  CREATE TABLE flat ( propertyId string, propertyName string, roomname1 string, roomsize1 string, roomname2 string, roomsize2 int, .. )

but want something like

  CREATE TABLE nested ( propertyId string, propertyName string, rooms array<struct<roomname:string,roomsize:int>> )

[…]
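To make the target shape concrete, here is a minimal plain-Python sketch of the reshaping for a single row. It is not the Spark SQL/Hive approach the post describes, only a model of the data transformation. The function name `nest_rooms`, the sample values, and the decision to cast every room size to an integer are assumptions made for illustration.

```python
from typing import Any, Dict, List


def nest_rooms(flat_row: Dict[str, Any]) -> Dict[str, Any]:
    """Fold numbered roomnameN/roomsizeN columns into a list of room records.

    Hypothetical helper: assumes room columns are numbered consecutively
    starting at 1, as in the `flat` table definition.
    """
    rooms: List[Dict[str, Any]] = []
    n = 1
    while f"roomname{n}" in flat_row:
        name = flat_row[f"roomname{n}"]
        size = flat_row.get(f"roomsize{n}")
        if name is not None:  # skip unused room slots
            rooms.append({
                "roomname": name,
                # Assumption: sizes may arrive as strings or ints; store ints.
                "roomsize": int(size) if size is not None else None,
            })
        n += 1
    return {
        "propertyId": flat_row["propertyId"],
        "propertyName": flat_row["propertyName"],
        "rooms": rooms,
    }


# Made-up sample row matching the column names of the `flat` table.
row = {
    "propertyId": "p1",
    "propertyName": "Seaside Villa",
    "roomname1": "kitchen", "roomsize1": "12",
    "roomname2": "bedroom", "roomsize2": 20,
}
nested = nest_rooms(row)
```

Each element of `nested["rooms"]` corresponds to one `struct<roomname:string,roomsize:int>` entry in the `rooms` array of the `nested` table.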