...
- The code is all open, so universities may prefer to run their own instance (in which case you ask them to sponsor the project to help keep the code updated)
- Service providers may decide they could host competing search services, adding their own value through tweaks to relevance ranking, etc.
- As the price goes up, a cost-benefit analysis will steer people to other options, such as custom Google search appliances
Technical Risks
Indexing is too slow
This could be a problem for two reasons:
- Indexing consumes too many resources and is costly to support
- Updates cannot happen with sufficient frequency to satisfy local requirements or incentivize updates
  - People have more confidence in a system they can control themselves, where they can see their corrections take effect
What will contribute to slowness?
- Indexing more detail, especially detail that is more remotely connected to the individual entity being indexed
  - e.g., if you want to get the names of all co-authors for a researcher, not just the titles of their papers
- Doing more to align or disambiguate data at indexing time
  - e.g., including in the index a list of possible alternative organizations, people, journals, etc. to facilitate corrections during use
- Doing more queries, computation or analysis to improve relevance ranking
  - e.g., boosting relevance based on the number of papers or grants
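To make the last example concrete, here is a minimal sketch of what a count-based boost might look like. It is an illustration only, not the project's actual ranking: the function name boosted_score, the weight value and the sample candidates are all hypothetical. The point for indexing speed is that n_papers and n_grants have to be gathered for every entity at indexing time, which is where the extra queries come from.

```python
import math

def boosted_score(text_score, n_papers, n_grants, weight=0.1):
    """Scale a base text-match score by a damped count of papers and grants.

    All names and the default weight are hypothetical, chosen for illustration.
    """
    # log1p damps large counts so that prolific researchers get a modest
    # boost instead of swamping the text-match score entirely.
    boost = 1.0 + weight * math.log1p(n_papers + n_grants)
    return text_score * boost

# Hypothetical candidates: (name, base text score, papers, grants)
candidates = [
    ("A", 1.0, 0, 0),
    ("B", 0.9, 40, 5),
]

# Rank candidates by boosted score, highest first.
ranked = sorted(
    candidates,
    key=lambda c: boosted_score(c[1], c[2], c[3]),
    reverse=True,
)
```

With these made-up numbers, candidate B overtakes A despite a slightly lower text score. Whether that is desirable is exactly the kind of tuning question that makes relevance ranking costly.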
Indexing interferes with performance on the site being indexed
This may become the countervailing pressure from distributed sites --
Relevance ranking proves intractably messy
...