List authors by number of publications

The WHERE clause finds people who have one or more authorships.  Each person in the result must also have an rdfs:label.  The query groups by author and uses a COUNT aggregation to get each author's publication count.  The results are ordered by descending publication count, so the authors with the most publications are listed first.  Some people have more than one rdfs:label: they may have name variations, and the same name may appear with or without language tags.  Because each additional label repeats every authorship row for that person, COUNT uses DISTINCT so that each authorship is counted only once.  Grouping by author also means an aggregation on the author label is needed to return a single name per author.  The query uses STR to remove any language tag and then MIN to select the name variant that sorts first alphabetically.  If the person has just one rdfs:label, that label is returned as the name.

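Note:  When GROUP BY is used, every variable in the SELECT clause that is not a grouping variable must be wrapped in an aggregation.  The aggregation specifies how the multiple values of that variable within a group are combined into the single value that appears in the result.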

Note:  The use of the COUNT aggregation with a GROUP BY is a standard SPARQL pattern for generating a frequency table.  In this query we are creating a frequency table of publications by author.

SELECT (MIN(STR(?author_label)) AS ?author_name) (COUNT(DISTINCT ?authorship) AS ?pub_count)
WHERE {
    ?author vivo:relatedBy ?authorship .
    ?author a foaf:Person .
    ?authorship a vivo:Authorship .
    
    ?author rdfs:label ?author_label .

}
GROUP BY ?author
ORDER BY DESC(?pub_count)
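
The aggregation semantics of the query above can be sketched in plain Python. This is only an illustration, not how a SPARQL engine works internally: the `ex:` identifiers, the sample rows, and the function name `authors_by_pub_count` are all hypothetical, and Python's string ordering stands in for the ordering that MIN applies to plain strings. The sketch shows why DISTINCT matters: a person with two labels and two authorships produces four solution rows, but should be counted as having two publications.

```python
from collections import defaultdict

# Hypothetical solution rows produced by the WHERE clause:
# (author, authorship, label text, language tag or None).
# Author ex:a1 has two labels, so each authorship appears twice.
rows = [
    ("ex:a1", "ex:s1", "Smith, Jane", "en"),
    ("ex:a1", "ex:s1", "Smith, J.", None),
    ("ex:a1", "ex:s2", "Smith, Jane", "en"),
    ("ex:a1", "ex:s2", "Smith, J.", None),
    ("ex:a2", "ex:s3", "Doe, John", None),
]

def authors_by_pub_count(rows):
    """Mimic GROUP BY ?author with MIN(STR(label)) and COUNT(DISTINCT authorship)."""
    labels = defaultdict(set)
    authorships = defaultdict(set)
    for author, authorship, label, _lang in rows:
        labels[author].add(label)            # like STR(): keep the text, ignore the tag
        authorships[author].add(authorship)  # a set gives DISTINCT semantics
    table = [(min(labels[a]), len(authorships[a])) for a in labels]
    # Like ORDER BY DESC(?pub_count)
    table.sort(key=lambda row: row[1], reverse=True)
    return table

result = authors_by_pub_count(rows)
for name, pub_count in result:
    print(name, pub_count)
```

Without the set used for authorships, the first author would be credited with four publications instead of two, which is the same overcount the query would produce if DISTINCT were omitted.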
