Shard-Query 2.0 performance on the SSB with InnoDB on Tokutek’s MariaDB distribution
May 26, 2013
Scaling up a workload to many cores on a single host
Here are results for Shard-Query 2.0 Beta 1* on the Star Schema Benchmark (SSB) at scale factor 10. In the comparison below, the “single threaded” response times for InnoDB are the ones reported in my previous test, which did not use Shard-Query.
Shard-Query has been configured to use a single host. The Shard-Query configuration repository is stored on the host. Gearman is also running on the host, as are the Gearman workers. In short, only one host is involved in the testing.
The Shard-Query response times are for six Gearman workers. There are six physical cores in my test machine. In my testing I’ve found that Shard-Query works best when the number of workers is equal to the number of physical cores.
As in the previous test, the lineorder table is partitioned. This allows Shard-Query to automatically take advantage of multiple cores without changing any of the queries.
How? Shard-Query transforms a single expensive query into smaller “tasks”. Each task is a query that examines a small amount of data. Shard-Query takes a “divide and conquer” approach: the data is divided into small chunks, and the chunks are operated on in parallel. Shard-Query treats each partition as a chunk. Future versions of Shard-Query will support subpartitions and hash partitions in MySQL 5.6.
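To make the divide-and-conquer idea concrete, here is a minimal sketch in Python. It is not Shard-Query's actual code (Shard-Query dispatches its tasks through Gearman workers). The in-memory PARTITIONS dictionary, the run_task and parallel_sum_and_count functions, and the use of a thread pool are all assumptions made for illustration. Each task aggregates one partition, and the partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a partitioned lineorder table:
# partition name -> list of (order_year, revenue) rows.
PARTITIONS = {
    "p1992": [(1992, 100), (1992, 250)],
    "p1993": [(1993, 300)],
    "p1994": [(1994, 50), (1994, 75), (1994, 25)],
}


def run_task(partition_name):
    """One task: aggregate only the rows in a single partition."""
    rows = PARTITIONS[partition_name]
    return sum(revenue for _year, revenue in rows), len(rows)


def parallel_sum_and_count(workers=6):
    """Fan out one task per partition, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Iterating a dict yields its keys, so this maps over partition names.
        partials = list(pool.map(run_task, PARTITIONS))
    total_revenue = sum(partial[0] for partial in partials)
    total_rows = sum(partial[1] for partial in partials)
    return total_revenue, total_rows


total_revenue, total_rows = parallel_sum_and_count()
average_revenue = total_revenue / total_rows
```

Note that each task returns a sum and a row count rather than an average. Partial sums and counts can be added together safely, while averaging the per-partition averages would give the wrong answer when partitions differ in size.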
In general, Shard-Query is faster than MySQL both cold and hot. There are a few cases where the speed is about the same, or where Shard-Query is slower. I believe this is due to MySQL bug #68079: queries may not scale linearly on MySQL. Star schema optimization is not turned on for these queries, so each of the sub-tasks is joining many rows.
Shard-Query 2.0 Beta 1 (patched for I_S.partitions)
Tokutek MariaDB 5.5.30-7.0.1