From Eric Meeks, Friday, March 29, 2013:

Oddly enough I was just working on this, so passing you the info in case it helps.  You can see a form page at  http://profiles.ucsf.edu/sparql/ and for programmatic access to do a direct POST of a query you can see what to do from below. Should be consistent with what Griffin showed you.

As noted, the performance is atrocious.  Hoping to find an improvement when we install the new version of profiles (in staging next week). If that doesn’t work, I know how to architect something that might do the trick as long as we can tolerate the latency.

Thanks
Eric


From: Meeks, Eric
Sent: Wednesday, March 27, 2013 11:12 AM
To: David Eichmann
Subject: RE: SPARQL internal server errors

Ok, I can help with part of this.  First the good news.

If you go to http://profiles.ucsf.edu/Profiles102SPARQLAPI/ProfilesSPARQLAPI.svc and click on the “Help page” link you will see some very ugly yet helpful information (automatically generated by .NET) that explains how to build an HTTP Post into our service.  As an example, from the form the code builds the following XML and does a POST with content type “text/xml” to http://prod1-profiles.ucsf.edu/Profiles102SPARQLAPI/ProfilesSPARQLAPI.svc/Search :
<query-request><query>PREFIX core: &lt;http://vivoweb.org/ontology/core#&gt;PREFIX foaf: &lt;http://xmlns.com/foaf/0.1/&gt;SELECT DISTINCT ?p ?o WHERE {    ?s foaf:firstName &quot;Griffin&quot; .    ?s foaf:lastName &quot;Weber&quot; .    ?s ?p ?o}</query></query-request>
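For anyone scripting this rather than using the form, here is a minimal Python sketch of the POST described above. It assumes the service wants nothing more than the `<query-request>` envelope with the query XML-escaped and a `text/xml` content type, as the example shows. The function names (`build_query_request`, `make_post`, `send`) and the timeout value are illustrative and not part of the Profiles API, and the response format is not described here, so the reply is returned as raw text.

```python
import urllib.request
from xml.sax.saxutils import escape

# Search URL quoted in the message above.
SEARCH_URL = (
    "http://prod1-profiles.ucsf.edu/"
    "Profiles102SPARQLAPI/ProfilesSPARQLAPI.svc/Search"
)

# The sample query from the message, written unescaped.
SAMPLE_QUERY = (
    "PREFIX core: <http://vivoweb.org/ontology/core#> "
    "PREFIX foaf: <http://xmlns.com/foaf/0.1/> "
    "SELECT DISTINCT ?p ?o WHERE { "
    '?s foaf:firstName "Griffin" . '
    '?s foaf:lastName "Weber" . '
    "?s ?p ?o }"
)


def build_query_request(sparql: str) -> bytes:
    """Wrap a SPARQL query in the <query-request> envelope.

    escape() handles &, < and >; double quotes are mapped to &quot;
    to match the example body shown above.
    """
    escaped = escape(sparql, {'"': "&quot;"})
    body = "<query-request><query>" + escaped + "</query></query-request>"
    return body.encode("utf-8")


def make_post(sparql: str, url: str = SEARCH_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request with content type text/xml."""
    return urllib.request.Request(
        url,
        data=build_query_request(sparql),
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


def send(sparql: str, timeout: float = 900.0) -> str:
    """Send the query and return the raw response text.

    The long default timeout is an assumption, chosen because of the
    slow response times described in this thread.
    """
    with urllib.request.urlopen(make_post(sparql), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `send(SAMPLE_QUERY)` would issue the same request the form does. Building the request separately from sending it makes it easy to inspect the exact bytes before hitting the server.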

The not so good news.  The performance is horrid and I don’t know if there is much we can do about it.

The SemWeb code does a lookup based on some nested views where one of the columns in the WHERE clause is a function based on underlying physical tables. The following example query (part of what you need just to look up someone by name, as in our example) takes 13 minutes to return: SELECT id, hash FROM [profiles_prod].[RDF.SemWeb].vwPublic_literals WHERE hash IN (N'+nz2QpCjYuclfcfCvvC4FYZUnms=' , N'dR0IoA0fUefnVXqcOQV56k5zrnU=');

SQL Server allows you to create indexes on views and on computed columns, and that’s been done. Yet still, a part of our process for a simple name lookup takes that long.

Eric


From: David Eichmann [mailto:david-eichmann@uiowa.edu]
Sent: Wednesday, March 27, 2013 9:15 AM
To: Meeks, Eric
Subject: Re: SPARQL internal server errors

Eric,

   Griffin has never actually provided the URL for their endpoint, only said that they have one.  Our SPARQL endpoint human interface is at:

http://research.icts.uiowa.edu/rdfexpo/createQuery.html

which targets

http://research.icts.uiowa.edu/rdfexpo/query.html

This second one is what I'm hitting programmatically.  The Northwestern endpoint is a little funky in that you have to double escape the query, but their endpoint is at

http://vivo.scholars.northwestern.edu/sparql
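A rough Python sketch of what "double escape" might look like in practice. It assumes the phrase means percent-encoding the query twice, and that the query travels in a `query` parameter as in the standard SPARQL protocol; neither detail is confirmed in this thread, and `double_escaped_url` is just an illustrative name.

```python
from urllib.parse import quote

# Endpoint URL quoted in the message above.
ENDPOINT = "http://vivo.scholars.northwestern.edu/sparql"


def double_escaped_url(sparql: str,
                       endpoint: str = ENDPOINT,
                       param: str = "query") -> str:
    """Percent-encode the query twice and append it to the endpoint URL.

    The parameter name "query" is an assumption, not taken from the thread.
    """
    once = quote(sparql, safe="")   # "?s" becomes "%3Fs"
    twice = quote(once, safe="")    # "%3Fs" becomes "%253Fs"
    return f"{endpoint}?{param}={twice}"
```

After the second pass every `%` from the first pass has become `%25`, so a server that decodes twice ends up with the original query text.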

I've endpoints for Cornell and Florida as well, but they seem to be offline at the moment…  The current issue with programmatically accessing UCSF is that there doesn't seem to be a 'raw' endpoint to interrogate - it's masked by the asp page.  Do you have a means of generating a full dump of your RDF?  I could stand an endpoint up with the data for us to experiment with.

- Dave

On Mar 26, 2013, at 9:23 PM, Meeks, Eric wrote:


Hmm.  Can I ask, are you seeing the same issues when you try and query Harvard’s system?

This is a part of the system that we haven’t touched, so we have what Harvard handed to us. Then based on what I see in their code, for the SPARQL endpoint they’ve more or less just packaged up what they found here: http://razor.occams.info/code/semweb/

Regardless of what you are seeing from Harvard’s site, this is something I hope we can resolve.  And to better help understand what a proper programmatic endpoint would look like, can you point me to a good example?  I’ve started to dive into some of the SemWeb code, so I do at least know where to start.

Thanks
Eric

From: David Eichmann [mailto:david-eichmann@uiowa.edu]
Sent: Tuesday, March 26, 2013 7:10 PM
To: Meeks, Eric
Subject: Re: SPARQL internal server errors

Eric,

   Looks like I can run *some* queries now.  Unfortunately, it appears that your endpoint doesn't support count() as an aggregate function - I get a syntax error when plugging one of my queries into your form.  I also don't see a means of programmatically interrogating the endpoint… :-(
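Until count() works, one possible workaround is to run the plain SELECT and count the rows on the client. The Python sketch below assumes the endpoint returns the W3C SPARQL Query Results XML Format; that has not been confirmed for the UCSF service, and `count_results` is an illustrative helper, not part of any endpoint's API.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
# Assumption: the endpoint actually returns results in this format.
SR = "{http://www.w3.org/2005/sparql-results#}"


def count_results(results_xml: str) -> int:
    """Count the <result> rows in a SPARQL XML results document."""
    root = ET.fromstring(results_xml)
    return len(root.findall(f"{SR}results/{SR}result"))
```

This pulls every row over the wire just to count it, so it only makes sense for small result sets, and a DISTINCT in the SELECT is still applied on the server.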

- Dave

On Mar 26, 2013, at 2:01 PM, Meeks, Eric wrote:


I’ve fixed the issue, so now it just times out instead :-)
 
We need the latest version of the Profiles code and even that might not resolve our issue. From what I understand the .NET Sparql framework is something that the Profiles team found in Open Source, and I question if it can realistically operate at our scale without some serious tuning.
 
Thanks for the heads up!
Eric
 
From: David Eichmann [mailto:david-eichmann@uiowa.edu]
Sent: Monday, March 25, 2013 4:32 PM
To: Meeks, Eric
Subject: Re: SPARQL internal server errors
 
I just tried http://profiles.ucsf.edu/sparql and got a form, so I tried it…  ;-)
 
- Dave
 
On Mar 25, 2013, at 6:28 PM, Meeks, Eric wrote:


I'll look into this. What URL are you using?
I'm not sure we've published our endpoint!

Sent from my Verizon Wireless Droid


-----Original message-----
From: David Eichmann <david-eichmann@uiowa.edu>
To: "Meeks, Eric" <eric.meeks@ucsf.edu>
Sent: Mon, Mar 25, 2013 23:26:01 GMT+00:00
Subject: SPARQL internal server errors

Eric,

    I just tried your SPARQL endpoint with the sample query and got a 500 internal server error…

- Dave