Transforming Data With Apache Spark
CIO, June 3rd, 2019
June 3, 2019
Volume 255, Issue 1

Spark is the ideal big data tool for data-driven enterprises because of its speed, ease of use, and versatility. It helps you understand your data quickly and make informed decisions faster.

"Apache Spark is a fast data processing framework dedicated to big data. It allows the processing of big data in a distributed manner (cluster computing). Very popular for a few years now, this framework is about to replace Hadoop. Its main advantages are its speed, ease of use, and versatility.

Apache Spark is an open source big data processing framework that enables large-scale analysis through clustered machines. Coded in Scala, Spark makes it possible to process data from data sources such as Hadoop Distributed File System, NoSQL databases, or relational data stores like Apache Hive. This framework also supports in-memory processing, which increases the performance of analytical applications of big data. It can also be used for conventional disk processing if the data sets are too large for system memory..."
