BigData SQL: Spark SQL with Tachyon (Alluxio)

Tachyon, now called Alluxio, is a memory-centric distributed file system that enables reliable data sharing across data-processing frameworks such as Spark and MapReduce. It achieves high performance in two ways: it caches a working set of files in memory, so frequently read data is served without going to disk, and it uses lineage information internally, so lost data can be recomputed rather than replicated.
Tachyon with Spark SQL has been used successfully at Baidu to give business analysts low-latency, ad hoc SQL queries against data warehouses. Adding Tachyon to Spark SQL speeds up analytic query processing by a factor of 10 to 20.
The initial computations are done in the Spark engine, and the results are then cached in the Tachyon file system. Within Spark, SQL queries can be issued through the DataFrame API against the data stored in Tachyon.
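As a rough illustration of the DataFrame path, the sketch below builds a Tachyon/Alluxio URI and runs a SQL query over the data behind it. It assumes the data is stored as Parquet, that `spark` is a `pyspark.sql.SparkSession` created elsewhere with the Alluxio client on its classpath, and that the master listens on port 19998; the helper names `alluxio_uri` and `query_on_alluxio` are hypothetical and not part of either project.

```python
def alluxio_uri(path, host="localhost", port=19998, scheme="alluxio"):
    """Build a URI for a file held in Alluxio.

    Assumption: 19998 is the master port in use; pass scheme="tachyon"
    for releases that predate the rename.
    """
    return f"{scheme}://{host}:{port}/{path.lstrip('/')}"


def query_on_alluxio(spark, uri, view_name, sql):
    """Load a dataset from Alluxio as a DataFrame and run SQL against it.

    `spark` is expected to be a pyspark.sql.SparkSession (not created here).
    Assumption: the dataset at `uri` is stored in Parquet format.
    """
    df = spark.read.parquet(uri)           # read the cached data as a DataFrame
    df.createOrReplaceTempView(view_name)  # expose it to Spark SQL by name
    return spark.sql(sql)                  # the result is itself a DataFrame


# Hypothetical host and path, for illustration only.
uri = alluxio_uri("/warehouse/clicks.parquet", host="alluxio-master")
print(uri)
```

With a live session, a call such as `query_on_alluxio(spark, uri, "clicks", "SELECT COUNT(*) FROM clicks")` would run the query against the in-memory copy instead of the backing store.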
After query parsing and optimization are done within the Spark SQL engine, the query executor checks whether the requested data is already cached in the Tachyon file system. If so, it reads the data from Tachyon. Otherwise, a new Spark job is initiated to read from the underlying data store; the computations are done within the Spark engine, and the results are then cached in the Tachyon file system.
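The check described here is a read-through cache. The toy model below shows only that control flow: a plain dictionary stands in for the Tachyon file system, and `slow_store_scan` is a hypothetical stand-in for the Spark job that reads the backing data store. It is a sketch of the pattern, not the executor's actual code.

```python
class ReadThroughCache:
    """Toy model of the executor's cache check; a dict stands in for Tachyon."""

    def __init__(self, compute_from_store):
        self._cache = {}                    # stand-in for the Tachyon file system
        self._compute = compute_from_store  # stand-in for launching a Spark job
        self.jobs_run = 0                   # counts how often the store was hit

    def execute(self, query_key):
        # Cache hit: serve the result without touching the data store.
        if query_key in self._cache:
            return self._cache[query_key]
        # Cache miss: compute from the data store, then cache the result.
        self.jobs_run += 1
        result = self._compute(query_key)
        self._cache[query_key] = result
        return result


def slow_store_scan(query_key):
    # Hypothetical placeholder for a Spark job over the backing data store.
    return f"rows for {query_key}"


cache = ReadThroughCache(slow_store_scan)
first = cache.execute("daily_clicks")   # miss: runs the job and caches the result
second = cache.execute("daily_clicks")  # hit: served from the cache
print(cache.jobs_run)  # prints 1
```

Only the first request pays the cost of the data-store read; repeated queries over the same data are answered from memory, which is where the speedup comes from.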