...

  • The code is all open, so universities may prefer to run their own instance (in which case you could ask them to sponsor the project to help keep the code updated)
  • Service providers may decide they could host competitive search services, with their own value added in tweaks to relevance ranking, etc.
  • As the price goes up, a cost-benefit analysis will steer people to other options, such as custom Google search appliances

Technical Risks

Indexing is too slow

This could be a problem for two reasons:

  • Indexing consumes too many resources and is costly to support
  • Updates cannot happen with sufficient frequency to satisfy local requirements or incentivize updates
    • People have more confidence in a system they can control themselves, where they can see their corrections take effect

What will contribute to slowness?

  • Indexing more detail, especially detail that is more remotely connected to the individual entity being indexed
    • e.g., if you want to get the names of all co-authors for a researcher, not just the titles of their papers
  • Doing more to align or disambiguate data at indexing time
    • e.g., including in the index a list of possible alternative organizations, people, journals, etc. to facilitate corrections during use
  • Doing more queries, computation or analysis to improve relevance ranking
    • e.g., boosting relevance based on the number of papers or grants
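
As an illustration of the last point, here is a minimal sketch of an index-time relevance boost derived from paper and grant counts. All names (`ResearcherRecord`, `index_time_boost`, `build_index_document`) and the weights are hypothetical and chosen only for illustration; this is not the project's actual ranking scheme. The logarithm is one common way to keep very prolific researchers from overwhelming text relevance.

```python
import math
from dataclasses import dataclass


@dataclass
class ResearcherRecord:
    # Hypothetical record; in practice each count would need its own
    # query against the source site, which is part of the indexing cost.
    name: str
    paper_count: int
    grant_count: int


def index_time_boost(record: ResearcherRecord,
                     paper_weight: float = 1.0,
                     grant_weight: float = 2.0) -> float:
    """Return a multiplier >= 1.0 that grows slowly with activity.

    The weights are arbitrary assumptions, not tuned values.
    """
    weighted = (paper_weight * record.paper_count
                + grant_weight * record.grant_count)
    # log1p(0) == 0, so a record with no papers or grants gets exactly 1.0.
    return 1.0 + math.log1p(weighted)


def build_index_document(record: ResearcherRecord) -> dict:
    # The boost is stored with the document so that no extra
    # computation is needed at query time.
    return {"name": record.name, "boost": index_time_boost(record)}
```

The trade-off is the one described above: computing the boost at indexing time keeps searches fast, but every additional count gathered per entity adds queries and lengthens each indexing pass.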

Indexing interferes with performance on the site being indexed

This may become the countervailing pressure from distributed sites --

Relevance ranking proves intractably messy

...