Scala and Apache Spark in Tandem as a Next-Generation ETL Framework

Scala and Apache Spark might seem an unlikely medium for implementing an ETL process, but there are reasons for considering it as an alternative. After all, many Big Data solutions are ideally suited to the preparation of data for input into a relational database, and Scala is a well thought-out and expressive language. Krzysztof Stanaszek describes some of the advantages and disadvantages of a scala-based approach to implementing and testing an ETL solution. … Read more

How to Start Big Data with Apache Spark

It is worth getting familiar with Apache Spark because it a fast and general engine for large-scale data processing and you can use you existing SQL skills to get going with analysis of the type and volume of semi-structured data that would be awkward for a relational database. With an IDE such as Databricks you can very quickly get hands-on experience with an interesting technology.… Read more