Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Corrected Pg host-based-authentication config file name


 

Many organisational DSpace users run reports on DSpace not from within DSpace itself, but from within an external reporting system (Business Intelligence, JasperReports, etc) which connect to the DSpace database via ODBC. Reports may be run on an external reporting system rather than DSpace to allow integrated reporting across a number of systems, to allow integration with other organisational systems, to exploit existing reporting skills, or for other reasons. Typically the connection is read-only, this allows both better security (since it limits the damage that an incompetent or malicious user can do) and better performance (since read-only accesses to tables typically require less locking or resource-use in most databases).

...

Code Block
languagesql
titlecode to create a new user
CREATE ROLE reporting LOGIN PASSWORD 'SECRET';
GRANT CONNECT ON DATABASE dspace to reporting;
GRANT USAGE ON SCHEMA public TO reporting;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO reporting

Oracle

TODO

(this may help: DSpace with Oracle DB test instance)

Enabling remote database connections

Postgres

The trick is to edit pdpg_hdahba.conf to access from the smallest possible range of IP addresses:

Code Block
titleline to add to pg_hba.conf
host # TYPE  DATABASE    USER        CIDR-ADDRESS   dspace       METHOD
host    dspace reporting     reporting   10.123.0.0/64         md5

You also need to make sure the Postgres daemon is listening on an IP address on the Postgres server that can be connected to from wherever you want to run your SQL client. Make sure that your "localhost" address is also included if you run postgres on the same machine as DSpace. The '*' value will make Postgres listen on all local IP addresses:

Code Block
titleedit listen_addresses in postgresql.conf
listen_addresses        md5= '*'

Oracle

TODO

Opening firewall ports

...

Creating the connection string

...

Most reporting systems use URL-like "connection strings" to connect to databases.

Postgres

Code Block
titleconnection string for postgres
jdbc:postgresql://HOSTNAME:5432/dspace

Oracle

Code Block
titleconnection string for oracle
jdbc:oracle:thin:@//HOSTNAME:5432/dspace


Creating reports

ODBC-based reporting tools work by running one or more SQL query against the underlying DSpace database. You can find some examples in

Jira
serverDuraSpace JIRA
serverIdc815ca92-fd23-34c2-8fe3-956808caf8c5
keyDS-1645
which illustrate the kinds of SQL joins that you'll be looking at using. Note the use of item.in_archive = 't' and item.withdrawn = 'f' to check whether the item has been deleted or withdrawn respectively. A diagram of the DSpace database schema is included in the Storage Layer section of the DSpace documentation. The SQL itself is stored in database-specific subdirectories of https://github.com/DSpace/DSpace/tree/master/dspace/etc

Note that some data (such as access stats) aren't held in the SQL database but in solr. Access details for these can be found at Solr#AccessingSolr.

Reusing reports

Do you have some reports that you might be willing to share? if so please log into upload them.

 


This is based on on

Jira
serverDuraSpace JIRA
serverIdc815ca92-fd23-34c2-8fe3-956808caf8c5
keyDS-1645
"support for connecting to institutional reporting tools"