Current released version

GSearch 2.1 (May 2008) is the currently released version.

...

The following figure gives a first overview for developers who will use GSearch in a Fedora application:

The figure shows:

  • A REST client, running in a user's browser, may combine access to Fedora and to the Search Service.
  • A SOAP client, running anywhere, may do the same.
  • The Search Service implements a generic set of operations:
    • updateIndex - indexing the contents of the Fedora repository.
    • gfindObjects - search similar to Fedora findObjects and to the SRW/SRU operation searchRetrieve.
    • browseIndex - browsing terms in a given index, similar to the SRW/SRU operation scan.
    • getRepositoryInfo - describing the properties of a repository.
    • getIndexInfo - describing the properties of an index.
  • Engine-specific implementations of the operations receive client requests, communicate with the engine's indexer and search server, and return the responses to the clients in the appropriate form.
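
As a rough illustration of how a REST client might address the generic operations listed above, the following Python sketch builds request URLs without sending them. The base URL and the parameter names (operation, query, indexName) are assumptions made for this example, not confirmed by this page; check them against your own GSearch installation.

```python
from urllib.parse import urlencode

# Assumed endpoint of a local GSearch deployment (hypothetical, adjust as needed).
BASE_URL = "http://localhost:8080/fedoragsearch/rest"

# The generic operations named in the list above.
OPERATIONS = {
    "updateIndex",
    "gfindObjects",
    "browseIndex",
    "getRepositoryInfo",
    "getIndexInfo",
}


def build_request_url(operation, **params):
    """Return a REST URL for one operation; parameter names are assumptions."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    query = {"operation": operation}
    # Skip parameters the caller left unset, so server-side defaults apply.
    query.update({k: v for k, v in params.items() if v is not None})
    return BASE_URL + "?" + urlencode(query)


if __name__ == "__main__":
    # Example search request; "DemoOnLucene" is a made-up index name.
    print(build_request_url("gfindObjects", query="fedora", indexName="DemoOnLucene"))
```

Leaving a parameter out of the URL mirrors the idea that defaults can come from the properties file rather than from the client.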

...

Architectural Snapshots



  • All engine-specific operations return an engine-specific XML answer, which an engine-specific XSLT stylesheet transforms into result page XML. For a SOAP request, this is the answer. For a REST request, it is further transformed into an HTML answer. Any number of XSLT stylesheets may be available to select from; the default ones are set in the properties file. Selecting a copy stylesheet passes the answer through untransformed. An alternative result page format is OpenSearch, which is an RSS 2.0 extension.
  • Parameters allow clients to select the repository, index, and XSLT stylesheets by name. In a real application, these values may be set by the developer in the code or by the administrator in the properties file.
  • Objects in the Fedora repository are exported in FOXML format, transformed into an appropriate document format by the indexing stylesheet, and indexed by the engine in question. The XML datastreams are indexed as specified in the stylesheet. One managed or external datastream may be indexed per FedoraObject (which one is configurable), on the assumption that an object's managed or external datastreams contain the same text in different MIME types.
  • The following updateIndex actions are available:
    • createEmpty - creating or emptying the index. For a new index, you have to run createEmpty once, before you can run the other actions.
    • fromFoxmlFiles ( filePath ) - indexing FOXML records; filePath may be null, in which case the configured Fedora Object Directory is used, so that the whole of the Fedora registry is indexed.
    • fromPid ( PID ) - indexing one FOXML record, as exported by Fedora API-M; if a previous index document with the same PID exists, it is deleted first. This is the incremental update operation, which should be called after every Fedora API-M operation that modifies a FedoraObject.
    • deletePid ( PID ) - deleting one index document.
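
The semantics of the updateIndex actions above can be sketched with a toy in-memory index. This is only an illustration of the described behaviour, not the GSearch implementation: the class ToyIndex, its method names, and the choice to ignore deletePid for an unknown PID are all assumptions made for this example.

```python
class ToyIndex:
    """In-memory stand-in for an engine index (hypothetical, for illustration)."""

    def __init__(self):
        # None means the index does not exist yet; createEmpty must run first.
        self.docs = None

    def _require_created(self):
        if self.docs is None:
            raise RuntimeError("run createEmpty once before the other actions")

    def create_empty(self):
        """createEmpty: create the index, or empty an existing one."""
        self.docs = {}

    def from_pid(self, pid, document):
        """fromPid: index one record, deleting any previous document first."""
        self._require_created()
        self.docs.pop(pid, None)
        self.docs[pid] = document

    def delete_pid(self, pid):
        """deletePid: delete one index document (unknown PIDs are ignored here)."""
        self._require_created()
        self.docs.pop(pid, None)

    def from_foxml_files(self, records):
        """fromFoxmlFiles: index many records, given here as a PID-to-document dict."""
        self._require_created()
        for pid, document in records.items():
            self.from_pid(pid, document)


if __name__ == "__main__":
    index = ToyIndex()
    index.create_empty()
    index.from_foxml_files({"demo:1": "first object", "demo:2": "second object"})
    index.from_pid("demo:1", "first object, modified")  # incremental update
    index.delete_pid("demo:2")
    print(index.docs)
```

Because from_pid always removes the old document before adding the new one, calling it after each modification keeps exactly one index document per PID.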

A typical application will index one repository in one index. However, you can also index many repositories in one or more indexes in parallel, as illustrated here:

  • There are OperationsImpl classes for Zebra, Lucene and Solr. The configManyToMany example has indexes for two engines, so similar searches can be compared across them.

...