Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Table of Contents

Essentially all of the data that you store in VIVO will be given to MySQL for storage. The actual location of this data depends on what system you have, and on how you install MySQL but you won’t need to know the location. You will access the data through VIVO, or occasionally through the MySQL client application.

Create an empty MySQL database

Decide on a database name, username, and password. Log into your MySQL server and create a new database in MySQL that uses UTF-8 encoding . You will need these values for Step IV when you configure the deployment properties. At the MySQL command line you can create the database and user with these commands substituting your values for dbname , username , and password . Most of the time, the hostname will equal localhost .

CREATE DATABASE dbname CHARACTER SET utf8;

Grant access to a database user. For example:

GRANT ALL ON dbname.* TO 'username'@'hostname' IDENTIFIED BY 'password';

Keep track of the database name, username, and password.

Specify the JDBC URL of your database. Change the end of the URL to reflect your database name (if it is not "vivo").

Specify deployment properties

Instruction

Property

Value

Specify the JDBC URL of your database. Change the end of the URL to reflect your database name (if it is not "vivo").

VitroConnection.DataSource.url

jdbc:mysql://localhost/vivo

Change the username to match the authorized user you created in MySQL.

VitroConnection.DataSource.username

username set above

Change the password to match the password you created in MySQL.

VitroConnection.DataSource.password

password set above

Specify the maximum number of active connections in the database connection pool to support the anticipated number of concurrent page requests.

VitroConnection.DataSource.pool.maxActive

40

Specify the maximum number of database connections that will be allowed to remain idle in the connection pool. Default is 25% of the maximum number of active connections.

VitroConnection.DataSource.pool.maxIdle

10

Change the dbtype setting to use a database other than MySQL. Otherwise, leave this value unchanged. Possible values are DB2, derby, HSQLDB, H2, MySQL, Oracle, PostgreSQL, and SQLServer. Refer to http://openjena.org/wiki/SDB/Databases_Supported for additional information.

VitroConnection.DataSource.dbtype

MySQL

 

Issues to be aware of

As indicated but not directly commented on above, a new installation of VIVO initially expects to be connected to an empty VIVO database that has been created with the character set UTF-8 so that the VIVO can read, store, and display characters encoded as Unicode

see alsoTomcat server.xml configuration to support non-Roman characters (UTF-8)

Tuning MySQL

Panel

From Stony Brook – 

By popular request, I've been asked to re-send information about the MySQLTuner tool.  It helped give us feedback on several key mysql tuning parameters.  And it gives suggestions on settings that may help your system run more efficiently, and thus your VIVO run a little bit faster.
The mysqltuner.pl script can be found at:

https://github.com/rackerhacker/MySQLTuner-perl

Panel
From Mark at Griffith Uni -
We use an enterprise hosted MySQL ie. remote to our vivo server via gigabit ethernet.  In this configuration we have found MySQL to be a real performance bottleneck.  Here are some parameters that we have found it worthwhile experimenting with:
innodb_flush_log_at_trx_commit=2
  • this resulted in about a 3x speedup (especially for big ingests)

tmp_table_size

max_heap_table_size

key_buffer_size (needed because many of our queries include a group or sort)

Writing the MySQL transaction log

MySQL allows you to control its logging behavior, using the the innodb_flush_log_at_trx_commit parameter. On some systems, changing the value of this parameter can dramatically improve performance.

...

This page provides full details regarding innodb_flush_log_at_trx_commit: http://dev.mysql.com/doc/refman/5.1/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit

Setting the MySQL query cache size

Increasing the MySQL query cache size will likely translate into improved VIVO performance in that once large pages have been fetched once, they're typically quite a bit faster to load on later fetches.

Tracing back from SQL to SPARQL

If we identify particularly slow SQL queries, we can try to trace them back to SPARQL queries in the code and look for optimizations to those queries or attempt to solve the problem in a different way.

...

One approach is to watch the status of the MySQL query process during slow queries or page rendering to see what it's doing and/or do an EXPLAIN SELECT on the generated SQL.

Regenerating MySQL indexes

If performance is abysmal on a simple query, check for missing or corrupted MySQL indexes that may cause the query engine to do full table scans.

 

...

TCMalloc and MySQL

...

Interesting GitHub blog post (https://github.com/blog/1422-tcmalloc-and-mysql) describing debugging MySQL performance issues, and using tools like the open source Percona Toolkit and the Google-contributed TCMalloc from gperftools.