...
Like other database engines, MySQL provides execution plans for queries, so lets go ahead and ask MySQL to give us the execution plan - by adding EXPLAIN to the start of the query.
Doing so, gives the following output:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | extra |
---|---|---|---|---|---|---|---|---|---|
1 | PRIMARY | <derived2> | ALL | 852 | |||||
1 | PRIMARY | R_1 | eq_ref | PRIMARY | PRIMARY | 8 | S_1.X_4 | 1 | |
1 | PRIMARY | R_2 | eq_ref | PRIMARY | PRIMARY | 8 | S_1.X_3 | 1 | |
1 | PRIMARY | R_3 | eq_ref | PRIMARY | PRIMARY | 8 | S_1.X_2 | 1 | |
1 | PRIMARY | R_4 | eq_ref | PRIMARY | PRIMARY | 8 | S_1.X_6 | 1 | |
1 | PRIMARY | R_5 | eq_ref | PRIMARY | PRIMARY | 8 | S_1.X_1 | 1 | |
1 | PRIMARY | R_6 | eq_ref | PRIMARY | PRIMARY | 8 | S_1.X_5 | 1 | |
2 | DERIVED | Q_1 | ref | SubjPredObj,PredObjSubj,ObjSubjPred | SubjPredObj | 24 | const,const,const | 1 | Using where; Using index; Using temporary |
2 | DERIVED | Q_2 | ref | SubjPredObj,PredObjSubj | SubjPredObj | 16 | const,const | 1 | Using where; Using index |
2 | DERIVED | Q_3 | ref | SubjPredObj,PredObjSubj,ObjSubjPred | SubjPredObj | 16 | const,const | 852 | Using where; Using index |
2 | DERIVED | Q_4 | ref | SubjPredObj,PredObjSubj,ObjSubjPred | SubjPredObj | 24 | Q_3.o,const,const | 1 | Using where; Using index |
2 | DERIVED | Q_5 | ref | SubjPredObj,PredObjSubj,ObjSubjPred | SubjPredObj | 16 | Q_3.o,const | 1 | Using where; Using index |
2 | DERIVED | Q_6 | ref | SubjPredObj,PredObjSubj,ObjSubjPred | SubjPredObj | 24 | Q_5.o,const,const | 1 | Using where; Using index |
2 | DERIVED | Q_7 | ref | SubjPredObj,PredObjSubj,ObjSubjPred | SubjPredObj | 16 | Q_5.o,const | 1 | Using index |
2 | DERIVED | Q_8 | ref | SubjPredObj,PredObjSubj,ObjSubjPred | SubjPredObj | 24 | Q_7.o,const,const | 1 | Using where; Using index; Distinct |
2 | DERIVED | Q_9 | ref | SubjPredObj,PredObjSubj,ObjSubjPred | SubjPredObj | 16 | Q_7.o,const | 1 | Using where; Using index; Distinct |
2 | DERIVED | Q_10 | ref | SubjPredObj,PredObjSubj,ObjSubjPred | SubjPredObj | 24 | Q_9.o,const,const | 1 | Using where; Using index; Distinct |
2 | DERIVED | Q_11 | ref | SubjPredObj,PredObjSubj | SubjPredObj | 16 | Q_9.o,const | 1 | Using index; Distinct |
So, what does this tell us about the query? Well, first of all it looks like there is very good index use, so the most obvious thing you look for - index usage - is already covered.
But look carefully at the 8 row - the entry for table Q_1 - what is that it says at the end? "Using temporary".
Under certain circumstances, MySQL can use a temporary table to answer DISTINCT queries, and when we look back at the original SQL, there is a SELECT DISTINCT on Q_1. Now, as an experiment, if we remove the DISTINCT, we actually get the same results from the query - but it returns in a little over 6 seconds. SIX SECONDS! This query originally took 20...
But we can't just remove the DISTINCT - because that's part of the SQL generated from SDB, and we would need to go in and modify SDB. Hmmm. So what can we do about the temporary table?
It's not clear what is going into the temporary table, and therefore how big it is. But knowing that it is select 4 hashes per row, each hash is 8 bytes, and there are 200,000 rows, that's at least 9M. Possibly up to 67M (11 tables, 4 hashes per table, 200,000 rows).
The amount of memory allocated by default to temporary tables is 16M, after which it creates temporary tables on disk. The temporary table above being written to disk is certainly going to kill performance!
There are two settings in my.cnf that need to be changed to alter the available temporary table size:
Code Block | ||
---|---|---|
| ||
max_heap_table_size=256M
tmp_table_size=256M |
It's important to note that this is not a recommendation for a temporary table size, we're just testing the hypothesis - we may be able to get away with smaller. Alternatively, your system may need larger.
Info | ||
---|---|---|
| ||
Reading the documentation on MySQL, there is in 5.6 a feature called "batched key access", which on the surface should help with large joins. However, when tested on this query, it appeared to make no difference |
...