...
Name | Java Class | Function | Enabled by Default? |
---|---|---|---|
HTML Text Extractor | | extracts the full text of HTML documents for full text indexing. (Uses Swing's HTML Parser.) | yes |
JPEG Thumbnail | | creates thumbnail images of GIF, JPEG and PNG files | yes |
Branded Preview JPEG | | creates a branded preview image for GIF, JPEG and PNG files | no |
PDF Text Extractor | | extracts the full text of Adobe PDF documents (only if text-based or OCRed) for full text indexing. (Uses the Apache PDFBox tool.) | yes |
XPDF Text Extractor | | extracts the full text of Adobe PDF documents (only if text-based or OCRed) for full text indexing. (Uses the XPDF command line tools available for Unix.) See XPDF Filter Configuration for details on installing/enabling. | no |
Word Text Extractor | | extracts the full text of Microsoft Word or Plain Text documents for full text indexing. (Uses the "Microsoft Word Text Mining" tools.) | yes |
Excel Text Extractor | org.dspace.app.mediafilter.ExcelFilter | extracts the full text of Microsoft Excel documents for full text indexing. (Uses the "Apache POI" tools.) | yes |
PowerPoint Text Extractor | | extracts the full text of slides and notes in Microsoft PowerPoint and PowerPoint XML documents for full text indexing. (Uses the Apache POI tools.) | yes |
ImageMagick Image Thumbnail Generator | | uses ImageMagick to generate thumbnails for image bitstreams. Requires installation of ImageMagick on your server. See ImageMagick Media Filters. | no |
ImageMagick PDF Thumbnail Generator | org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter | uses ImageMagick and Ghostscript to generate thumbnails for PDF bitstreams. Requires installation of ImageMagick and Ghostscript on your server. See ImageMagick Media Filters. | no |
Please note that the filter-media script will automatically update the DSpace search index by default.
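
As a rough illustration of that default, the commands below sketch a plain run, a run that skips the index update, and a run limited to one filter from the table. Here `[dspace]` is a placeholder for your installation directory, and the `-n` (no-index) and `-p` (plugin) options are assumptions based on the `filter-media` command-line help; confirm them with `[dspace]/bin/dspace filter-media -h` on your version before relying on them.

```shell
# Default run: applies the enabled filters, then updates the search index.
[dspace]/bin/dspace filter-media

# Assumed no-index option: applies the filters but leaves the search index untouched.
[dspace]/bin/dspace filter-media -n

# Assumed plugin option: runs only the named filter (name as shown in the table).
[dspace]/bin/dspace filter-media -p "PDF Text Extractor"
```

Skipping the index update is mainly useful when several filter runs are batched and the index is rebuilt once afterwards.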
...