...
Code Block |
---|
User-agent: *
# Disable access to Discovery search and filters
Disallow: /discover
Disallow: /search-filter
# This should be the FULL URL to your HTML Sitemap.
# Make sure to replace "[dspace.url]" with the value of your 'dspace.url' setting in your dspace.cfg file.
Sitemap: http://[dspace.url]/htmlmap
# If you have configured DSpace (Solr-based) Statistics to be publicly accessible,
# then you likely do not want this content to be indexed
# Disallow: /displaystats
# Uncomment the following line ONLY if sitemaps.org or HTML sitemaps are used
# and you have verified that your site is being indexed correctly.
# Disallow: /browse
# You also may wish to disallow access to the following paths, in order
# to stop web spiders from accessing user-based content:
# Disallow: /advanced-search
# Disallow: /contact
# Disallow: /feedback
# Disallow: /forgot
# Disallow: /login
# Disallow: /register
# Disallow: /search |
Note that for your additional Disallow statements to be recognized under the User-agent: * group, they cannot be separated from the User-agent: * line by blank lines. A blank line indicates the start of a new user-agent block, and a block without a leading User-agent declaration on its first line is ignored. Comment lines are allowed and do not break the user-agent block.
This is OK:
Code Block |
---|
User-agent: *
# Disable access to Discovery search and filters
Disallow: /discover
Disallow: /search-filter
Disallow: /displaystats
Disallow: /advanced-search |
This is not OK, as the two lines at the bottom will be completely ignored.
Code Block |
---|
User-agent: *
# Disable access to Discovery search and filters
Disallow: /discover
Disallow: /search-filter

Disallow: /displaystats
Disallow: /advanced-search |
To check whether a specific user agent has access to a particular URL, you can use a robots.txt tester.
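If an online tester is not at hand, a similar check can be sketched locally with Python's standard-library `urllib.robotparser`. This is only an illustration, not part of DSpace: the host `example.org`, the agent name `ExampleBot`, and the helper `is_allowed` are assumptions made for the example. Python's parser happens to apply the blank-line rule described above; individual crawlers may parse robots.txt differently, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical base URL; substitute the value of your own dspace.url setting.
BASE_URL = "http://example.org"

# All Disallow lines stay in the User-agent: * block (comments are fine).
GROUPED = """\
User-agent: *
# Disable access to Discovery search and filters
Disallow: /discover
Disallow: /search-filter
Disallow: /displaystats
Disallow: /advanced-search
"""

# A blank line ends the block, leaving the last two rules without a user agent.
SPLIT_BY_BLANK_LINE = """\
User-agent: *
# Disable access to Discovery search and filters
Disallow: /discover
Disallow: /search-filter

Disallow: /displaystats
Disallow: /advanced-search
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, BASE_URL + path)


if __name__ == "__main__":
    for label, text in (("grouped", GROUPED), ("split", SPLIT_BY_BLANK_LINE)):
        for path in ("/discover", "/displaystats"):
            print(label, path, is_allowed(text, "ExampleBot", path))
```

With this parser, `/displaystats` is reported as blocked for the grouped file but as allowed for the file split by a blank line, while `/discover` is blocked in both.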
Ensure Item Metadata appears in the HTML HEAD
...