The table below compares three popular Big Data frameworks: Hadoop, Spark, and Flink. Although Spark has some limitations, it remains one of the most widely used Big Data solutions.

| Features | Apache Hadoop | Apache Spark | Apache Flink |
|---|---|---|---|
| Data processing engine | Batch | Batch (streaming via micro-batches) | Stream (batch as a special case) |
| Processing speed | Slower than Spark and Flink | Up to 100x faster than Hadoop for in-memory workloads | Faster than Spark |
| Programming languages | Java, C++, Python | Java, Scala, Python, R | Java, Scala, Python |
| Programming model | MapReduce | RDD | Cyclic dataflows |
| Data transfer | Batch | Batch | Pipelined and Batch |
| Latency | High | Lower than Hadoop | Lower than Spark |
| Streaming support | NA | Spark Streaming | Flink Streaming (DataStream API) |
| SQL Support | Hive, Impala | Spark SQL | Table API and SQL |
| Graph support | NA | GraphX | Gelly |
| Machine Learning support | NA | Spark MLlib | FlinkML |
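
To make the "Programming model" row a little more concrete, here is a minimal sketch of the MapReduce idea as a word count in plain Python. It does not use Hadoop, Spark, or Flink; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are made up for illustration, and a real framework would run each phase in parallel across a cluster rather than in a single process.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word, as a mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the counts collected for each word, as a reducer would."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines):
    """Chain the three phases (hypothetical helper, single process only)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["spark and flink", "hadoop and spark"]
    print(word_count(sample))
```

In Spark's RDD API the same pipeline is usually written as a chain of `flatMap`, `map`, and `reduceByKey` calls, with intermediate data kept in memory between steps, which is a large part of the speed difference noted in the table.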