Shard-Query blog

The only open source MPP database engine for MySQL

Shard-Query 2.0 performance on the SSB with InnoDB on Tokutek’s MariaDB distribution

Scaling up a workload to many cores on a single host

Here are results for Shard-Query 2.0 Beta 1* on the Star Schema Benchmark (SSB) at scale factor 10. In the comparison below, the “single threaded” response times for InnoDB are those reported in my previous test, which did not use Shard-Query.

Shard-Query configuration

Shard-Query has been configured to use a single host.  The Shard-Query configuration repository is stored on the host.  Gearman is also running on the host, as are the Gearman workers.  In short, only one host is involved in the testing.

The Shard-Query response times are for six Gearman workers. There are six physical cores in my test machine. In my testing I’ve found that Shard-Query works best when the number of workers is equal to the number of physical cores.

Why partitions?

As in the previous test, the lineorder table is partitioned. This allows Shard-Query to automatically take advantage of multiple cores without changing any of the queries.

How? Shard-Query transforms a single expensive query into smaller “tasks”. Each task is a query that examines a small amount of data. Shard-Query takes a “divide and conquer” approach: the data is divided into small chunks, and the chunks are operated on in parallel. Shard-Query treats each partition as a chunk. Future versions of Shard-Query will support subpartitions and hash partitions in MySQL 5.6.
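To make the “divide and conquer” idea concrete, here is a minimal Python sketch. It is not Shard-Query’s actual code (Shard-Query dispatches its tasks through Gearman workers); the in-memory `PARTITIONS` list, `run_task`, and `parallel_sum` are hypothetical names chosen for illustration, and a thread pool stands in for the workers. It only shows why an aggregate such as SUM can be computed per chunk and then combined.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a partitioned fact table: each inner list is
# one partition ("chunk") of (lo_extendedprice, lo_discount) rows.
PARTITIONS = [
    [(100, 4), (200, 5)],
    [(300, 6)],
    [],  # an empty partition contributes nothing to the total
    [(50, 4), (75, 5), (10, 6)],
]

def run_task(rows):
    """One task: aggregate a single chunk, like a query against one partition."""
    return sum(price * discount for price, discount in rows)

def parallel_sum(partitions, workers=6):
    """Run one task per partition in parallel, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(run_task, partitions))
    # SUM is decomposable: the grand total is the sum of the per-chunk sums.
    return sum(partials)

if __name__ == "__main__":
    print(parallel_sum(PARTITIONS))
```

The combined result matches what a single pass over all rows would produce; only the work is split. Aggregates that do not decompose this simply (AVG, for example) need to be rewritten into parts that do, such as a SUM and a COUNT, before the partial results can be merged.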

Results

In general, Shard-Query is faster than MySQL both cold and hot.  There are a few cases where the speed is about the same, or where Shard-Query is slower.  I believe this is due to MySQL bug #68079: queries may not scale linearly on MySQL.  Star schema optimization is not turned on for these queries, so each of the sub-tasks is joining many rows.

[Charts: response times for SSB query flights 1 through 4, cold and hot (1_cold, 1_hot, 2_cold, 2_hot, 3_cold, 3_hot, 4_cold, 4_hot)]

Software:
* Shard-Query 2.0 Beta 1 (patched for INFORMATION_SCHEMA.PARTITIONS)
Tokutek MariaDB 5.5.30-7.0.1

One response to “Shard-Query 2.0 performance on the SSB with InnoDB on Tokutek’s MariaDB distribution”

  1. Justin May 26, 2013 at 5:50 PM

    Query flight #1 is slowed down by a bad plan. If I force the join order with STRAIGHT_JOIN the query is much faster. I’m thinking extended keys in 5.6 might fix this:


    mysql> SELECT SUM(lo_extendedprice * lo_discount) AS expr_3590594011 FROM dim_date JOIN lineorder ON (lo_orderdatekey = d_datekey) WHERE d_yearmonth = 'Jan1994' AND lo_discount BETWEEN 4 AND 6 AND lo_quantity BETWEEN 26 AND 35 AND LO_OrderDateKey >= (19930531) AND LO_OrderDateKey < (19930631) ORDER BY NULL;
    +-----------------+
    | expr_3590594011 |
    +-----------------+
    |            NULL |
    +-----------------+
    1 row in set (0.48 sec)

    mysql> SELECT straight_join ... ORDER BY NULL;
    +-----------------+
    | expr_3590594011 |
    +-----------------+
    |            NULL |
    +-----------------+
    1 row in set (0.00 sec)
