Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Note: The hashes in the query above are not necessarily the correct hashes for the actual predicates / objects, but the overall statement is correct.

SQL Execution time: As noted at above, this query was taking - for a particular large profile - 20-ish seconds, and accounting for almost all of the time taken to generate the co-author visualisation.

Tweaking MySQL

Knowing the exact query that is being executed against the database allows us to actually focus on the database, see how the query is being executed and make configuration changes that help it to support the actual data structures and workload being demanded of it.

The first thing to note is that (executing the SQL directly) the performance of the query is very consistent. Knowing that it returns quite a large amount of data (approx 200,000 rows), and has a significant amount of work to do, this suggests that the caches aren't necessarily being used effectively.

So, lets increase the join, read and innodb buffers. If you've installed MySQL on a Mac as described above, you'll need to create a my.cnf in /etc, otherwise edit the my.cnf that is being used:

Code Block
languagetext
join_buffer_size = 32M
read_rd_buffer_size = 32M
innodb_buffer_pool_size = 512M
innodb_file_per_table = 1
innodb_file_format = barracuda

NB: There are also some tweaks to the InnoDB file layout - these are good practice things to have.

Whilst this does not affect the initial execution of the SQL query (20 seconds), it does make subsequent executions for the same profile faster - approximately 14 seconds.

It's not known what size buffers should really be used for any given database size, or how well this translates into overall performance, e.g. how much can be cached to the benefit of the entire application. Of course, each individual site would need to look at the resources available on their own systems, and what they can dedicate to a given purpose. 

Even Deeper Explanation

So far, we've just scratched the surface of MySQL, and put in a basic tweak to the configuration, but we haven't really looked in depth at what the query is actually doing.

Like other database engines, MySQL provides execution plans for queries, so lets go ahead and ask MySQL to give us the execution plan - by adding EXPLAIN to the start of the query:

 

idselect_typetabletypepossible_keyskeykey_lenrefrowsextra
1PRIMARY<derived2>ALL    852 
1PRIMARYR_1eq_refPRIMARYPRIMARY8S_1.X_41 
1PRIMARYR_2eq_refPRIMARYPRIMARY8S_1.X_31 
1PRIMARYR_3eq_refPRIMARYPRIMARY8S_1.X_21 
1PRIMARYR_4eq_refPRIMARYPRIMARY8S_1.X_61 
1PRIMARYR_5eq_refPRIMARYPRIMARY8S_1.X_11 
1PRIMARYR_6eq_refPRIMARYPRIMARY8S_1.X_51 
2DERIVEDQ_1refSubjPredObj,PredObjSubj,ObjSubjPredSubjPredObj24const,const,const1Using where; Using index; Using temporary
2DERIVEDQ_2refSubjPredObj,PredObjSubjSubjPredObj16const,const1Using where; Using index
2DERIVEDQ_3refSubjPredObj,PredObjSubj,ObjSubjPredSubjPredObj16const,const852Using where; Using index
2DERIVEDQ_4refSubjPredObj,PredObjSubj,ObjSubjPredSubjPredObj24Q_3.o,const,const1Using where; Using index
2DERIVEDQ_5refSubjPredObj,PredObjSubj,ObjSubjPredSubjPredObj16Q_3.o,const1Using where; Using index
2DERIVEDQ_6refSubjPredObj,PredObjSubj,ObjSubjPredSubjPredObj24Q_5.o,const,const1Using where; Using index
2DERIVEDQ_7refSubjPredObj,PredObjSubj,ObjSubjPredSubjPredObj16Q_5.o,const1Using index
2DERIVEDQ_8refSubjPredObj,PredObjSubj,ObjSubjPredSubjPredObj24Q_7.o,const,const1Using where; Using index; Distinct
2DERIVEDQ_9refSubjPredObj,PredObjSubj,ObjSubjPredSubjPredObj16Q_7.o,const1Using where; Using index; Distinct
2DERIVEDQ_10refSubjPredObj,PredObjSubj,ObjSubjPredSubjPredObj24Q_9.o,const,const1Using where; Using index; Distinct
2DERIVEDQ_11refSubjPredObj,PredObjSubjSubjPredObj16Q_9.o,const1Using index; Distinct

 

 

Info
titleUnused Tweaks

Reading the documentation on MySQL, there is in 5.6 a feature called "batched key access", which on the surface should help with large joins. However, when tested on this query, it appeared to make no difference