Bigdata SQL: Apache Hive Vectorization of Queries

Vectorized query execution is a Hive feature that greatly reduces CPU usage for typical query operations such as table scans, filters, aggregates, and joins. A conventional query processor scans and processes one row at a time, which results in poor CPU utilization and a large number of CPU instructions per row. Vectorized query execution instead processes a block of rows at a time. Each batch of rows is represented as an array of column vectors: within a block, each column is stored as a vector (an array) of a primitive type, in either compressed or uncompressed form.
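To make the "array of column vectors" layout concrete, here is a minimal sketch in Java. It is not Hive's actual implementation: the class names `ColumnBatchSketch`, `LongColumn`, and `Batch`, the helper `fromRows`, and the restriction to `long` columns are all assumptions made for illustration. The batch capacity of 1024 mirrors the default batch size Hive uses.

```java
import java.util.Arrays;

/** Hypothetical, simplified stand-in for a batch of rows stored column by column. */
public class ColumnBatchSketch {
    // Capacity of one batch; 1024 mirrors Hive's default batch size.
    static final int BATCH_SIZE = 1024;

    /** One column of the batch: a plain primitive array, no per-value objects. */
    static final class LongColumn {
        final long[] values = new long[BATCH_SIZE];
    }

    /** A block of rows represented as an array of column vectors. */
    static final class Batch {
        final LongColumn[] columns;
        int size; // number of valid rows currently in the batch

        Batch(int numColumns) {
            columns = new LongColumn[numColumns];
            for (int c = 0; c < numColumns; c++) {
                columns[c] = new LongColumn();
            }
        }
    }

    /** Pivots row-oriented data (rows[r][c]) into a column-oriented batch. */
    static Batch fromRows(long[][] rows) {
        if (rows.length > BATCH_SIZE) {
            throw new IllegalArgumentException("too many rows for one batch");
        }
        int numColumns = rows.length == 0 ? 0 : rows[0].length;
        Batch batch = new Batch(numColumns);
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < numColumns; c++) {
                batch.columns[c].values[r] = rows[r][c];
            }
        }
        batch.size = rows.length;
        return batch;
    }

    public static void main(String[] args) {
        long[][] rows = {{1, 10}, {2, 20}, {3, 30}};
        Batch batch = fromRows(rows);
        // All values of the second column sit next to each other in memory.
        System.out.println(Arrays.toString(
                Arrays.copyOf(batch.columns[1].values, batch.size))); // prints [10, 20, 30]
    }
}
```

The point of the layout is that an operator working on one column touches a single contiguous primitive array rather than hopping between row objects.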
The row-at-a-time execution model is inherently slow, for a couple of primary reasons. Hive internally uses object inspectors, which provide a good level of abstraction but carry a significant cost for every row. That cost is made worse by the lazy SerDe implementation inside Hive. Each iteration of the row-processing loop creates Java objects, which adds object-creation overhead, and executes many if-then branches and many CPU instructions. The CPU also stalls frequently, because it has to wait for data to be fetched before it can start working.
Operations on a block of data, by contrast, can be performed quickly by iterating over the block in a tight loop with very few branches, which is the core concept behind the speedup. Such loops compile down to a small number of streamlined instructions that complete in fewer clock cycles and make good use of the CPU cache. They also make effective use of instruction pipelining, which modern CPU architectures are designed to exploit. Most primitive data types are supported by vectorized query execution.
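The contrast between the two models can be sketched in Java as follows. This is an illustration under stated assumptions, not Hive code: the class `VectorizedLoopSketch` and both method names are hypothetical, rows are modeled as `Object[]` with boxed `Long` values to imitate per-row object and type-check overhead, and the query is assumed to be a simple "sum one column where another column exceeds a threshold".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical comparison of row-at-a-time and block-at-a-time processing. */
public class VectorizedLoopSketch {

    /** Row-at-a-time: every row is a separate object holding boxed values. */
    static long sumWhereRowAtATime(List<Object[]> rows, int filterCol,
                                   long threshold, int sumCol) {
        long total = 0;
        for (Object[] row : rows) {
            Object filterValue = row[filterCol];
            Object sumValue = row[sumCol];
            // Per-row type inspection and unboxing, repeated for every row.
            if (filterValue instanceof Long && sumValue instanceof Long) {
                if ((Long) filterValue > threshold) {
                    total += (Long) sumValue;
                }
            }
        }
        return total;
    }

    /** Block-at-a-time: one tight loop over primitive column arrays. */
    static long sumWhereVectorized(long[] filterCol, long[] valueCol,
                                   int size, long threshold) {
        long total = 0;
        for (int i = 0; i < size; i++) {
            // No type checks or object creation; the condition becomes a 0/1 factor.
            long keep = filterCol[i] > threshold ? 1L : 0L;
            total += keep * valueCol[i];
        }
        return total;
    }

    public static void main(String[] args) {
        long[] filter = {1, 5, 7, 2};
        long[] values = {10, 20, 30, 40};

        List<Object[]> rows = new ArrayList<>();
        for (int i = 0; i < filter.length; i++) {
            rows.add(new Object[] {filter[i], values[i]}); // autoboxes to Long
        }

        System.out.println(sumWhereRowAtATime(rows, 0, 3L, 1));                  // prints 50
        System.out.println(sumWhereVectorized(filter, values, filter.length, 3L)); // prints 50
    }
}
```

Both methods return the same result. The second one simply gives the JIT compiler and the CPU a much easier loop to optimize, since it reads contiguous primitive arrays and does the same small amount of work on every iteration.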