Shard-Query blog

The only open source MPP database engine for MySQL

Category Archives: Benchmarks

Shard-Query 2.0 performance on the SSB with InnoDB on Tokutek’s MariaDB distribution

Scaling up a workload to many cores on a single host

Here are results for Shard-Query 2.0 Beta 1* on the Star Schema Benchmark at scale factor 10.  In the comparison below the “single threaded” response times for InnoDB are the response times reported in my previous test which did not use Shard-Query.

Shard-Query configuration

Shard-Query has been configured to use a single host.  The Shard-Query configuration repository is stored on the host.  Gearman is also running on the host, as are the Gearman workers.  In short, only one host is involved in the testing.

The Shard-Query response times are for 6 gearman workers.  There are six physical cores in my test machine.  In my testing I’ve found that Shard-Query works best when the number of  workers is equal to the number of physical cores.

Why partitions?

As in the previous test the lineorder table is partitioned.  This allows Shard-Query to automatically take advantage of multiple cores without changing any of the queries.

How?  Well Shard-Query transforms a single expensive query into smaller “tasks”.  Each task is a query which examines a small amount of data.  Shard-Query takes a “divide and conquer” approach, where the data is divided into small chunks, and the chunks are operated on in parallel.  Shard-Query treats each partition as a chunk.  Future versions of Shard-Query will support subpartitions and hash partitions in MySQL 5.6.

Results

In general, Shard-Query is faster than MySQL both cold and hot.  There are a few cases where the speed is about the same, or where Shard-Query is slower.  I believe this is due to MySQL bug #68079: queries may not scale linearly on MySQL.  Star schema optimization is not turned on for these queries, so each of the sub-tasks is joining many rows.

1_cold

1_hot

2_cold

2_hot

3_cold

3_hot

4_cold

4_hot

Software:
Shard-Query 2.0 Beta 1 (patched for I_S.partitions)
Tokutek MariaDB 5.5.30-7.0.1

TokuDB vs Percona XtraDB using Tokutek’s MariaDB distribution

Following are benchmark results comparing Tokutek TokuDB and Percona XtraDB at scale factor 10 on the Star Schema benchmark. I’m posting this on the Shard-Query blog because I am going to compare the performance of Shard-Query on the benchmark on these two engines. First, however, I think it is important to see how they perform in isolation without concurrency.

Because I am going to be testing Shard-Query, I have chosen to partition the “fact” table (lineorder) by month. I’ve attached the full DDL at the end of the post as well as the queries again for reference.

I want to note a few things about the results:
First and foremost, TokuDB was configured to use quicklz compression (the default) and InnoDB compression was not used. No tuning of TokuDB was performed, which means it will use up to 50% of memory by default. Various InnoDB tuning options were set (see the end of the post) but the most important is that the innodb_buffer_pool_size which was set to 20G (all data fits in buffer pool for hot test). Read more of this post