Comment: Updated GeoLite2 section

...

Property:

solr-statistics.server

Example Values:

solr-statistics.server = http://127.0.0.1/solr/statistics
solr-statistics.server = ${solr.server}/statistics

Informational Note:

Used by the SolrLogger client class to connect to the Solr server over HTTP and perform updates and queries. In most cases, this can (and should) be set to localhost (or 127.0.0.1).

To determine the correct path, you can use a tool like wget to see where Solr is responding on your server. For example, you'd want to send a query to Solr like the following:

Code Block
wget 'http://127.0.0.1/solr/statistics/select?q=*:*'

If you get an HTTP 200 OK response, set solr-statistics.server to the '/statistics' URL, 'http://127.0.0.1/solr/statistics' (that is, remove the "/select?q=*:*" query from the end of the responding URL).
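The trimming step described above can be sketched in shell. This is only an illustration: the function name solr_base_url is hypothetical, and it assumes the responding URL contains a single "/select" path segment.

```shell
# Hypothetical helper: derive the value for solr-statistics.server
# from a query URL that Solr answered with HTTP 200.
solr_base_url() {
  # Remove "/select" and everything after it from the URL in $1.
  printf '%s\n' "${1%%/select*}"
}

# Example: pass the URL that was used with wget.
solr_base_url 'http://127.0.0.1/solr/statistics/select?q=*:*'
```

The printed value is what would go on the right-hand side of solr-statistics.server.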

  


Property:

solr-statistics.query.filter.bundles

Example Value:

solr-statistics.query.filter.bundles=ORIGINAL

Informational Note:

A comma-separated list of the bundles for which file statistics will be displayed.

 

 


Property:

solr-statistics.query.filter.spiderIp

Example Value:

solr-statistics.query.filter.spiderIp = false

Informational Note:

If true, statistics queries will filter out spider IPs -- use with caution, as this often results in extremely long query strings.

  


Property:

solr-statistics.query.filter.isBot

Example Value:

solr-statistics.query.filter.isBot = true

Informational Note:

If true, statistics queries will filter out events flagged with the "isBot" field. This is the recommended method of filtering spiders from statistics.

  


Property:

solr-statistics.spiderips.urls

Example Value:

solr-statistics.spiderips.urls =

Code Block
http://iplists.com/google.txt, \
http://iplists.com/inktomi.txt, \
http://iplists.com/lycos.txt, \
http://iplists.com/infoseek.txt, \
http://iplists.com/altavista.txt, \
http://iplists.com/excite.txt, \
http://iplists.com/misc.txt


Informational Note:

List of URLs from which spider files are downloaded into [dspace]/config/spiders. These files contain lists of known spider IPs and are used by the SolrLogger to flag usage events with an "isBot" field, or to ignore them entirely.

The "stats-util" command can be used to force an update of spider files, regenerate "isBot" fields on indexed events, and delete spiders from the index. For usage, run:

Code Block
dspace stats-util -h

from your [dspace]/bin directory

...


In the [dspace]/config/modules/usage-statistics.cfg file, review the following fields. These fields can be edited in place, or overridden in your own local.cfg config file (see Configuration Reference).

  

Property:

usage-statistics.dbfile

Example Value:

usage-statistics.dbfile = ${dspace.dir}/config/GeoLiteCity.dat

Informational Note:

This refers to the GeoLiteCity database file used by LocationUtils to calculate the location of client requests based on IP address. During the Ant build process (both fresh_install and update), this file will be downloaded from http://www.maxmind.com/app/geolitecity if a new version has been published or if it is absent from your [dspace]/config directory.

 


Property:

usage-statistics.resolver.timeout

Example Value:

usage-statistics.resolver.timeout = 200

Informational Note:

Timeout in milliseconds for DNS resolution of origin hosts/IPs. Setting this value too high may result in Solr exhausting your connection pool.

 



Property:

useProxies  (Set in dspace.cfg)

Example Value:

useProxies = true

Informational Note:

Causes statistics logging to look for the X-Forwarded-For header to detect the IP address of clients that access DSpace through a proxy service (e.g. Apache mod_proxy). [Note: This setting is found in the DSpace Logging section of dspace.cfg]

  


Property:

usage-statistics.authorization.admin.usage

Example Value:

usage-statistics.authorization.admin.usage = true

Informational Note:

When set to true, only general administrators and collection and community administrators are able to access the pageview and download statistics from the web user interface. As a result, the links to access statistics are hidden from users who are not logged in as administrators. Setting this property to "false" will display the links to access statistics to anyone, making them publicly available.

  


Property:

usage-statistics.authorization.admin.search

Example Value:

usage-statistics.authorization.admin.search = true

Informational Note:

When set to true, only system, collection or community administrators are able to access statistics on search queries. 
  


Property:

usage-statistics.authorization.admin.workflow

Example Value:

usage-statistics.authorization.admin.workflow = true

Informational Note:

 When set to true, only system, collection or community administrators are able to access statistics on workflow events.
  


Property:

usage-statistics.logBots

Example Value:

usage-statistics.logBots = true

Informational Note:

When this property is set to false and an IP is detected as a spider, the event is not logged.
When this property is set to true, the event will be logged with the "isBot" field set to true.
(see solr-statistics.query.filter.* for query filter options)
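To review which value is in effect for one of the properties above, a small lookup can help. The following is a minimal sketch, not part of DSpace: the function names cfg_get and effective_value are hypothetical, and it assumes simple "key = value" lines without line continuations, with local.cfg taking precedence over the module file.

```shell
# Hypothetical helper: print the value of a property from a .cfg file.
# Assumes plain "key = value" lines; comment lines are skipped.
cfg_get() {
  # $1 = cfg file, $2 = property name
  awk -v key="$2" '
    /^[ \t]*#/ { next }              # skip comments
    index($0, "=") == 0 { next }     # skip lines without "="
    {
      k = substr($0, 1, index($0, "=") - 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      if (k == key) {
        v = substr($0, index($0, "=") + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", v)
        val = v; found = 1           # last definition wins
      }
    }
    END { if (found) print val; else exit 1 }
  ' "$1"
}

# Look in local.cfg first, then fall back to the module file.
effective_value() {
  # $1 = DSpace installation directory, $2 = property name
  cfg_get "$1/config/local.cfg" "$2" 2>/dev/null ||
    cfg_get "$1/config/modules/usage-statistics.cfg" "$2"
}
```

For example, effective_value /path/to/dspace usage-statistics.logBots prints the override from local.cfg if one exists, and the value from usage-statistics.cfg otherwise.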

...

The GeoLite Database file (at [dspace]/config/GeoLite2-City.mmdb) is used by the Statistics engine to generate location/country based reports. (Note: If you are not using DSpace Statistics, this file is not needed.) It is not provided with DSpace. It can be installed automatically when you run ant fresh_install; if the file cannot be downloaded and installed automatically, you will need to install it manually. See the provider's web site for more information about this file, and to get a free account for downloading it.

Alternatively, DSpace can be configured to use a GeoLite City database file that you already have and maintain by other means.  You can edit [dspace]/config/local.cfg (or [dspace]/config/modules/usage-statistics.cfg), changing the path usage-statistics.dbfile to point to a shared copy of the database.

This file is frequently updated by MaxMind.com, so you will need to refresh it regularly. As this is written, the database is updated monthly.

You have three options to install/update this file:

  1. Attempt to re-run the automatic installer from your DSpace Source Directory ([dspace-source]). This will attempt to automatically download the database file, unzip it and install it into the proper location:

    Code Block
    ant update_geolite
    • NOTE: If the location of the GeoLite Database file is known to have changed, you can also run this auto-installer by passing it the new URL of the GeoLite Database File: ant -Dgeolite=[full-URL-of-geolite] update_geolite
  2. OR, you can manually install the file by performing these steps yourself:
  3. OR, you can combine the two alternatives above by first downloading the GeoLiteCity.dat.gz file to a location accessible to you, and then configuring a .dspace.properties file in your home folder. For example, create a .dspace.properties file in the home folder of the user who runs ant to deploy DSpace, and add the following line to it:
Code Block
geolite=file:///path/to/your/downloaded/GeoLiteCity.dat.gz

...

, and to be allowed to obtain it you need to agree to keep your copy updated.

The file is obtained using MaxMind's geoipupdate tool, which is available from MaxMind. Many Linux distributions also re-package geoipupdate, so you may prefer to get it using your package manager.
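A regular refresh with geoipupdate might look like the sketch below. It rests on several assumptions: the function names are hypothetical, ACCOUNT_ID and LICENSE_KEY stand in for the credentials issued with your MaxMind account, and the -f (configuration file) and -d (database directory) options should be checked against the geoipupdate version you have installed.

```shell
# Sketch only: write a geoipupdate configuration file.
# ACCOUNT_ID and LICENSE_KEY are placeholders for your MaxMind credentials.
write_geoip_conf() {
  # $1 = path of the configuration file to create
  cat > "$1" <<EOF
AccountID ${ACCOUNT_ID:-YOUR_ACCOUNT_ID}
LicenseKey ${LICENSE_KEY:-YOUR_LICENSE_KEY}
EditionIDs GeoLite2-City
EOF
}

# Sketch only: download or refresh GeoLite2-City.mmdb.
refresh_geolite() {
  # $1 = configuration file, $2 = directory that receives the database
  geoipupdate -f "$1" -d "$2"
}
```

Calling refresh_geolite with your configuration file and the [dspace]/config directory (or with a shared directory that usage-statistics.dbfile points to) from a scheduled job would keep the copy current.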