...
Name | Java Class | Function | Enabled by Default? |
---|---|---|---|
HTML Text Extractor | | Extracts the full text of HTML documents for full text indexing. (Uses Swing's HTML Parser.) | true |
JPEG Thumbnail | | Creates thumbnail images of GIF, JPEG and PNG files. | true |
Branded Preview JPEG | | Creates a branded preview image for GIF, JPEG and PNG files. | false |
PDF Text Extractor | | Extracts the full text of Adobe PDF documents (only if text-based or OCRed) for full text indexing. (Uses the Apache PDFBox tool.) | true |
XPDF Text Extractor | | Extracts the full text of Adobe PDF documents (only if text-based or OCRed) for full text indexing. (Uses the XPDF command line tools (http://www.foolabs.com/xpdf/), available for Unix.) See XPDF Filter Configuration for details on installing/enabling. | false |
Word Text Extractor | | Extracts the full text of Microsoft Word or Plain Text documents for full text indexing. (Uses the "Microsoft Word Text Mining" tools.) | true |
PowerPoint Text Extractor | | Extracts the full text of slides and notes in Microsoft PowerPoint and PowerPoint XML documents for full text indexing. (Uses the Apache POI tools.) | true |
Please note that the filter-media script automatically updates the DSpace search index by default (see ReIndexing Content (for Browse or Search)). This is the recommended way to run the script. If you wish to disable the index update, pass the -n flag to the script (see Executing (via Command Line) below).
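As a rough illustration of the note above, the following shell sketch prints the two forms of the command: the default run, which updates the search index, and the run with -n, which skips that update. It assumes the launcher is located at [dspace]/bin/dspace and that DSPACE_DIR points at your installation directory. The /dspace default and the helper name filter_media_cmd are hypothetical and exist only for this example. The commands are echoed rather than executed, so the paths can be checked before anything runs.

```shell
#!/bin/sh
# Assumption: DSPACE_DIR is your DSpace installation directory ([dspace]).
# The /dspace default is only a placeholder for this sketch.
DSPACE_DIR="${DSPACE_DIR:-/dspace}"

# Hypothetical helper: print the filter-media command line.
# Pass "noindex" as the first argument to append the -n flag.
filter_media_cmd() {
    if [ "${1:-}" = "noindex" ]; then
        echo "$DSPACE_DIR/bin/dspace filter-media -n"
    else
        echo "$DSPACE_DIR/bin/dspace filter-media"
    fi
}

# Default form: filter media and update the search index.
filter_media_cmd

# Same run, but without updating the search index.
filter_media_cmd noindex
```

To actually run one of the forms, copy the printed line into a shell on the DSpace server once the path has been confirmed.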
...