...
The Log Converter program converts log files from dspace.log into an intermediate format that can be inserted into SOLR.
<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="b91eb92d-89c8-48fa-b707-eb09d76afcef"><ac:plain-text-body><![CDATA[ | Command used: |
|
Java class: | org.dspace.statistics.util.ClassicDSpaceLogConverter | |
Arguments (short and long forms): | Description | |
| Input file | |
| Output file | |
| Adds a wildcard at the end of input and output, so it would mean dspace.log* would be converted. (For example, the following files would be included because of this argument: dspace.log, dspace.log.1, dspace.log.2, dspace.log.3, etc.) | |
| If the log files have been created with DSpace 1.6 | |
| Display verbose output (helpful for debugging) | |
| Help |
This command loads the intermediate log files created by the Log Converter (described above) into SOLR.
<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="b5a177f3-ef55-4e13-9caa-e785369493dc"><ac:plain-text-body><![CDATA[ | Command used: | | ]]></ac:plain-text-body></ac:structured-macro> |
Java class: | org.dspace.statistics.util.StatisticsImporter | ||
Arguments (short and long forms): | Description | ||
| input file | ||
| Adds a wildcard at the end of the input, so it would mean dspace.log* would be imported | ||
| To skip the reverse DNS lookups that work out where a user is from. (The DNS lookup finds the information about the host from its IP address, such as geographical location, etc. This can be slow, and wouldn't work on a server not connected to the internet.) | ||
| Display verbose output (helpful for debugging) | ||
| For developers: allows you to import a log file from another system. Because the handles from that system won't exist locally, it looks up random items in your local system and adds the hits to those instead. | ||
| Help |
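The wildcard argument described in the tables above can be pictured as a trailing `*` appended to the given file name. The following is a minimal Python sketch of that idea only; it does not show how the DSpace tools expand the wildcard internally, and the function name `select_log_files` and the sample file list are hypothetical.

```python
from fnmatch import fnmatchcase


def select_log_files(names, base_name):
    """Return the names a trailing wildcard on base_name would select.

    Hypothetical helper for illustration: it applies a shell-style
    pattern (base_name + "*") to a list of file names.
    """
    pattern = base_name + "*"
    return sorted(name for name in names if fnmatchcase(name, pattern))


# Sample directory listing (made up for this example).
candidates = ["dspace.log", "dspace.log.1", "dspace.log.2", "cocoon.log", "solr.log"]

print(select_log_files(candidates, "dspace.log"))
```

With this reading, only the files whose names begin with `dspace.log` are picked up; unrelated logs in the same directory are left alone.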
...
Filtering and Pruning Spiders
...
Command used: |
|
Java class: | org.dspace.statistics.util.StatisticsClient |
Arguments (short and long forms): | Description |
| Update Spider IP Files from internet into |
| Delete Spiders in Solr By isBot Flag. Will prune out all records that have |
| Delete Spiders in Solr By IP Address. Will prune out all records whose IP addresses match spider IPs. |
| Update isBot Flag in Solr. Marks any records currently stored in statistics that have IP addresses matched in spiders files. |
| Calls up this brief help table at command line. |
...
If they want to keep the spiders out of the Solr repository, they can just use the "-i" option and the spiders will be removed immediately.
There are guards in place to control what can be defined as an IP range for a bot. In [dspace]/config/spiders, spider IP address ranges have to be at least 3 subnet sections in length (for example, 123.123.123), and IP ranges can only be on the smallest subnet (for example, [123.123.123.0 - 123.123.123.255]). If not, loading that row will cause exceptions in the DSpace logs and exclude that IP entry.
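To check a spiders file before loading it, the two guards described above can be approximated with a small script. This is a sketch under stated assumptions, not DSpace's own parser: the function names are hypothetical, and the requirement that a range's start not exceed its end is an added assumption that the text does not state.

```python
def _octets(address):
    """Split a dotted address into integer sections, or return None if malformed."""
    parts = address.strip().split(".")
    if not all(p.isascii() and p.isdigit() and int(p) <= 255 for p in parts):
        return None
    return [int(p) for p in parts]


def is_acceptable_spider_entry(entry):
    """Hypothetical pre-check mirroring the guards described in the text.

    - A single entry needs at least 3 subnet sections (e.g. 123.123.123).
    - A range may only vary in the last section
      (e.g. 123.123.123.0 - 123.123.123.255).
    """
    if "-" in entry:
        start_text, end_text = entry.split("-", 1)
        start, end = _octets(start_text), _octets(end_text)
        if start is None or end is None:
            return False
        if len(start) != 4 or len(end) != 4:
            return False
        # Assumption: the range must not run backwards.
        return start[:3] == end[:3] and start[3] <= end[3]
    octets = _octets(entry)
    return octets is not None and 3 <= len(octets) <= 4


for line in ["123.123.123", "123.123", "123.123.123.0 - 123.123.123.255",
             "123.123.0.0 - 123.123.255.255"]:
    print(line, "->", is_acceptable_spider_entry(line))
```

Entries that fail this check are the kind the text says would raise exceptions in the logs and be excluded when the file is loaded.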
Routine SOLR Index Maintenance
<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="a0b8af23-c5b4-46e4-b41a-4c9674b95d45"><ac:plain-text-body><![CDATA[ | Command used: |
|
Java class: | org.dspace.statistics.util.StatisticsClient | |
Arguments (short and long forms): | Description | |
| Run maintenance on the SOLR index. Running this daily is recommended, to prevent your servlet container from running out of memory.
...