Big Data Digest: How many Hadoops do we really need?
Say hello to Flink, the newest distributed data analysis engine on the scene.
This week, the Apache Software Foundation announced Apache Flink as its newest Top-Level Project (TLP). Apache also provides a home for Hadoop, Cassandra, Lucene and many other widely used open source data processing tools, so Flink's entry into the group speaks well for its technical chops.
Don't worry if you hadn't heard of Flink before -- it came as a surprise to us as well. Like Spark, another emerging data processing platform, Flink can ingest both batch data and streaming data. Apache Flink got its start as a research project at the Technical University of Berlin in 2009. Why would someone choose Flink over Hadoop? Performance and ease of use, say the creators of the software.