Apache Spark has rapidly become one of the most exciting technologies for big data analytics and machine learning. Spark is a general data processing engine created for use in clustered computing environments. Its heart is the Resilient Distributed Dataset (RDD) which represents a distributed, fault tolerant, collection of data that can be operated on in parallel across the nodes of a cluster. Spark is implemented using a combination of Java and Scala and so comes as a library that can run on any JVM.
InterSystems Developer Community is a community of
25,216 amazing developers
We're a place where InterSystems IRIS programmers learn and share, stay up-to-date, grow together and have fun!


