Calls are held every Thursday at 1 pm eastern time – convert to your time at http://www.thetimezoneconverter.com
Announcements
Ontology Working Group: next call is Thursday, May 15 at noon EDT / 9 am PDT
Agenda to be determined -- look for an announcement
addressing persistent VIVO URIs and related issues
today’s call was focused on datasets and dataset ontologies, including options to engage with a W3C working group active in that area
The Health Care Life Sciences working group has a new draft standard for dataset description
We’re trying to think how VIVO can contribute
Submitting use cases on how we represent datasets in VIVO
Patrick West: DCAT ontology is what we’re using at RPI for the Deep Carbon project. Also have our own DCO ontology that we’d be happy to share. Integrating with CKAN for data flow and management.
Making people aware that VIVOs will have extensive contextual data around datasets
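For the use-case submissions mentioned above, a DCAT-style description of a dataset in VIVO might look roughly like this sketch (URIs and values are invented for illustration):

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

# Hypothetical dataset individual with one distribution
<http://example.org/individual/dataset123>
    a dcat:Dataset ;
    dcterms:title   "Example carbon flux measurements" ;
    dcterms:creator <http://example.org/individual/person456> ;
    dcat:distribution [
        a dcat:Distribution ;
        dcat:downloadURL <http://example.org/data/dataset123.csv> ;
        dcat:mediaType "text/csv"
    ] .
```

The contextual data VIVO holds (people, grants, organizations) would hang off the creator and related individuals.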
Apps & Tools Working Group: next call is May 13th at 1 pm EDT
April 29th call:
Find links to YouTube videos of all previous calls there
On April 15 call: Chris talked about FundRef and funding org data and stable URIs -- match with citizen(something).org and then visualize this
Alexandre Rademaker (IBM Brazil) and getting specialized data into RDF and then into VIVO -- video will be on YouTube soon
did his whole presentation with Emacs
Had a call with David Wood yesterday about Callimachus
he spoke at the VIVO conference last year
Apps and Tools workshop at the conference. Looking for participants to do demos -- looking for the best ways to create, use, visualize VIVO data and would love to have additional authors and help
VIVO Bootcamp at ELAG 2014 (June 10 at University of Bath, United Kingdom)
Violeta Ilik from Texas A&M will be there
- the organizers (including Violeta) are looking for another representative from the VIVO community
Upcoming Activities
First annual survey of VIVO sites
see Paul's email to the lists: What does your site care about? Participate in the 2014 VIVO Annual Survey
The survey form is live: https://www.formstack.com/forms/?1704676-gIMWlsYzom
A presentation proposal has been submitted to the conference and in order to have time to compile and analyze the data, responses are requested by May 29
Next themed weekly call topic – May 8th: VIVO End User Documentation – for supporting end users who will be editing in VIVO, whether individually or as proxy editors for a whole department (requested by University of Florida). Can we make templated documentation where individual sites can put in tips or extra information? Rumor has it there’s a good document underway -- can this be shared on GitHub using GitHub Pages? (Alex: GitHub is being used by non-coders for local political campaigns and other collaborative writing projects)
Are there examples from other projects that do documentation better?
Format of themed calls could be a mix of presentations and working session
Note that we are seeking volunteer facilitators for each themed call -- let Alex, Paul, or Jon know directly or via the listserv(s)
May 22 -- will resume discussion of performance
Updates
Colorado Brown (Don)
focused on effort to roll out ORCID iDs to our faculty and PRAs
sending a pre-registration email next week
collecting those over the summer; they will be added to the Faculty Information System (FIS) database
inclined to wait for 1.6 since the ontology changed slightly in its representation of ORCID iDs with 1.6
looking at upgrading to VIVO 1.6 prior to ingesting ORCID iDs into VIVO
big challenge will be to upgrade our harvest process. If there are any links to harvester code that has been converted to 1.6 I would like to see this to help bootstrap my process.
John Fereira <jaf30@cornell.edu> has volunteered to be the VIVO Harvester project maintainer
LASP - ontology complete.
Other people seem to be using other tools -- have other organizations done the upgrade with other tools?
a mix -- the Harvester can be seen as a framework designed to have modular steps, where the code used for each step may be Python just as well as Java
mention of BESSIG April 16 meeting with LASP VIVO team? Attendees from local labs including a USGS (?) group
LASP is thinking of putting up a SPARQL endpoint
Cornell (Jon, Jim, Brian, et al)
VIVO 1.6.1 released -- not a lot of feedback yet
We have a number of other projects that may use all or elements of VIVO
Working toward a 1.7 release, keeping it modest, to get on a schedule of two releases a year in accordance with discussions at the DuraSpace Sponsor Summit in DC in early March
May and November seem to be the logical times to not get involved in the summer vacation and winter holiday periods
looking for a code freeze the end of May
Question: will there be major ontology changes with 1.7?
Answer: No -- just bug fixes or improvements/clarifications in response to feedback about problems with 1.6
We do hope to produce the VIVO-ISF modules for 1.7 from the master ISF ontology in a more automated way, as is being worked on for eagle-i
Duke (Patrick)
Question about how to add book or artistic work reviews -- helpful reply from Michaeleen (using “featured in” predicate) -- starting to look into how to represent this as “review of” (BIBO) in VIVO, as well as in Elements
Upgrading Duke’s Ruby-based “Mapper” interop to Elements 4.6 API
Released core of “Mapper” and Duke ontology extensions on GitHub -- will send some info to list -- contact them with questions -- started this process at Hackathon
Florida (Chris)
Starting our 1.6 migration planning -- hoping to be migrated over before the Conference
Have a lot of data ingests built around the 1.5 ontology that need some re-working
Presented 2 posters (on figshare?) at American Medical Informatics Association (AMIA) Clinical Research Informatics Meeting.
dChecker (on GitHub) - a Semantic Web data quality tool based on SPARQL and Python.
Enterprise Data 2 RDF - PeopleSoft to UF VIVO RDF
have started to understand more of the test cases that we have to check for in the ingest -- dual appointments, etc.
Working on a universal list of funding organizations (e.g., NIH, Sloan, etc.) with URIs, geospatial location data, and IRS nonprofit data associated.
Wants to create a map showing the organizations funding active grants at UF
About 6,000 funding organizations have DOIs, and each of them has an EIN (Employer Identification Number) from the IRS -- an important linking variable since it is in FundRef as well
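The EIN-based linking described above can be sketched as a simple join (records, URIs, and the DOI are made up for illustration; this is not UF's actual code):

```python
# Illustrative sketch: use the IRS EIN as the join key between a local
# sponsor list and FundRef-derived records, as described in the notes.
local_sponsors = [
    {"name": "Example Foundation", "ein": "12-3456789",
     "uri": "http://example.org/org1"},
]
fundref = [
    {"ein": "12-3456789", "fundref_doi": "10.13039/999999999"},  # fake DOI
]

def link_by_ein(sponsors, fundref_records):
    """Attach a FundRef DOI to each sponsor whose EIN matches; None if no match."""
    by_ein = {r["ein"]: r["fundref_doi"] for r in fundref_records}
    return [dict(s, fundref_doi=by_ein.get(s["ein"])) for s in sponsors]

linked = link_by_ein(local_sponsors, fundref)
assert linked[0]["fundref_doi"] == "10.13039/999999999"
```

The same keyed join would work against the full 6,000-organization list.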
Link to Mike’s Python tools on GitHub https://github.com/mconlon17
Apps & Tools group has been recruiting presenters -- not necessarily just on finished products -- starting to have working discussion sessions
IBM Research and FGV (http://www.fgv.br) (Alexandre)
Interested in VIVO+CKAN (datasets and people, along the lines of what the Deep Carbon Observatory at RPI has done). IBM Research and City of Rio de Janeiro project can be a very interesting use case.
Involved in the implementation of VIVO at FGV (unfortunately, the VIVO instance is not yet public; it has 23k people and 72k research instances). Problems and questions about VIVO performance! -- a good topic for a themed call in May
Memorial University (Max)
have been trying to find documentation on migrating from 1.5 to 1.6 -- to be able to migrate their 1.5-era Knowledge Mobilization Ontology to be compatible with 1.6
had a session on Friday at the I-Fest with Melissa and Brian -- but Max’s hard drive crashed subsequently so is looking for notes
Brian -- can re-send the ontology file we were working on in Protégé, and instructions about how to download the ISF again
Yaffle update -- have been communicating with University of New Brunswick (UNB) and CNA (College of the North Atlantic) about sharing the Yaffle code
Thinking about separate instances for now
Have a front end design and have been doing usability tests
Have a Drupal front end that has been hooked up to VIVO, using MilesW’s Drupal RDF extension (link?)
Smithsonian (Alvin)
Meeting with Institutional stakeholders/funders to address their specific needs
Identifying data sources for import and writing RFP for contractor to import and establish method for updating/reporting per user needs
Jon: DuraSpace VIVO project starting to work on a list of Registered Service Providers -- please tell any contractors interested in VIVO work about DuraSpace’s RSP directory.
Symplectic (Alex)
attended UCSF-hosted Profiles developer call last Friday
joined SHARE Technical Working Group (SHARE webpage) -- attended call this past Tuesday
RPI (Patrick)
Upgraded from 1.5 to 1.6 and migrated to a new machine all in the same step.
Dealt with this issue: https://jira.duraspace.org/browse/VIVO-653
Dealt with this issue: https://jira.duraspace.org/browse/VIVO-711
Applied our changes to the code (CKAN and Handle integration)
Changed the URI of the DCO ontology. This caused an issue with the Jena index -- basically it was corrupt, so we would have to query using the old URI <http://deepcarbon.net/ontology/schema#projectAssociatedWith> but the result of the query would show the new URI <http://info.deepcarbon.net/schema#projectAssociatedWith>
Basically had to export all the RDF as n-triples (ingesting RDF/XML is broken in VIVO 1.6). Dropped the MySQL database, created a new empty database, re-installed VIVO, started Tomcat and let it fill in, then ingested the n-triples file. It’s still working (105,000 statements to work through)
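Because N-Triples is line-based with full (unabbreviated) URIs, a namespace migration like the one described above can be done with plain string substitution before re-ingesting. A minimal sketch (not RPI's actual script):

```python
# Rewrite the old DCO ontology namespace to the new one in an N-Triples dump.
# Safe for N-Triples specifically, since every URI appears in full on one line.
OLD_NS = "http://deepcarbon.net/ontology/schema#"
NEW_NS = "http://info.deepcarbon.net/schema#"

def migrate_ntriples(lines):
    """Return the input N-Triples lines with OLD_NS rewritten to NEW_NS."""
    return [line.replace(OLD_NS, NEW_NS) for line in lines]

dump = [
    "<http://example.org/p1> "
    "<http://deepcarbon.net/ontology/schema#projectAssociatedWith> "
    "<http://example.org/proj1> .",
]
migrated = migrate_ntriples(dump)
assert "info.deepcarbon.net/schema#" in migrated[0]
```

For a real dump you would stream the file line by line rather than hold it in memory.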
Integration with CKAN broke with upgrade to 1.6. Fix to be installed later today
Still trying to work through the ontology changes in 1.6. The OBO ontology is the focus of this, since the predicate URIs are not as self-explanatory as you usually find.
Issue on Tuesday with MySQL creating massive temporary files (16GB each) given more complex SPARQL queries. This caused a file system full issue that had to be resolved by changing the MySQL configuration to point to a different temp directory.
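The workaround described above amounts to pointing MySQL's temp-file directory at a volume with room for multi-GB sort/join temp tables; in my.cnf it is a one-line change (path is illustrative, and mysqld must be restarted afterward):

```ini
[mysqld]
# Put SPARQL-driven temporary tables on a roomier filesystem
tmpdir = /data/mysql-tmp
```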
Still having issues with Server Status … huh … actually, not seeing those issues at the moment. Perhaps it was tied to the index problem we were having.
Single sign-on integration using Shibboleth is complete in the test environment. We’re running tests now and hope to have it in production by the end of the month.
We now have over 1000 individuals, and each one of those will be able to log in to VIVO and add/modify content given a certain set of permissions.
We’re starting to create some really cool visualizations of VIVO data -- using our faceted browser, Drupal, Google Maps and Google Earth, and simple web pages.
Don in chat window: Hi Patrick - have you had a chance to sit on the VIVO ontology calls? We talked about the HCLSdataset w3c working group. Curious about your dcat extensions and this: http://htmlpreview.github.io/?https://github.com/joejimbo/HCLSDatasetDescriptions/blob/master/Overview.html
UCSF (Eric is out of the office today)
Virginia Tech (Keith)
Implementing Symplectic now and about 4 weeks out from getting data ingested into VIVO -- starting with 1 or 2 departments but are planning to expand to the whole of Virginia Tech
Weill Cornell (Paul)
Performance
Still struggling with downtime and slower performance of larger profiles
Brian was able to get our code/data performing well on his instance
Goal #1: test Brian’s instance with our load-testing tool
Goal #2: try installing VIVO on a dedicated server
Goal #3: install VIVO on a dedicated VM at Cornell Ithaca for performance testing so that we have an instance instrumented for test data coming from Weill and other orgs
Where do sites store their official organizational hierarchy including official name, type, sub/super organizations, leadership, etc.?
Chris -- at UF, VIVO is now the only place where there is a complete organization hierarchy
- Believe this is the case at Brown as well -- dealing with research institutes and programmers
Notable list traffic
Brown (Steve)
The Brown site is live http://vivo.brown.edu and also redirects from http://research.brown.edu
Patrick: we used a special JDBC setting to control for special character garbage created by ingest
in deploy.properties, add some configuration:
VitroConnection.DataSource.url = jdbc:mysql://localhost/dev_vivo?useUnicode=true&characterEncoding=utf-8
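Note that the JDBC flags above only fix the connection side; the database itself also needs a UTF-8 character set, along these lines (database name taken from the example above; existing tables would additionally need a CONVERT TO):

```sql
-- Make newly created tables default to UTF-8 so special characters
-- survive at rest as well as on the wire
ALTER DATABASE dev_vivo CHARACTER SET utf8 COLLATE utf8_general_ci;
```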
Did CVs come from another source? They were attached to the old site but also came from other sources. Faculty members submitted CVs manually at the start of the process, but there was a legacy system for making them available, where new ones had sometimes been updated.
The CVs are held within the VIVO system itself
Next -- working on the publication layer and building that out -- a foundation to build more and more stuff on
Ted put together a Django app for editing as part of the umbrella management application -- Faculty will use the same system for editing/adding publications, updating the overview, adding pictures, etc.
Wired into 1.5 for the moment but will use web services in VIVO 1.6 -- communicating with SPARQL endpoints for now using add and remove requests
Thoughts of migrating to 1.6? Also on the docket -- would simplify the process of updating from the Django interface
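The add and remove requests mentioned above amount to SPARQL Update payloads; a hypothetical sketch (not Brown's actual code -- the graph URI and triple are illustrative) of building one such payload:

```python
# Build the kind of SPARQL INSERT DATA payload an external editing app
# (e.g. a Django front end) could POST to a VIVO SPARQL update endpoint.
def build_insert(graph_uri, triples):
    """Render an INSERT DATA update targeting a named graph.

    triples: list of (subject, predicate, object) strings already in
    SPARQL syntax (<uri> or "literal").
    """
    body = "\n".join(f"  {s} {p} {o} ." for s, p, o in triples)
    return f"INSERT DATA {{ GRAPH <{graph_uri}> {{\n{body}\n}} }}"

update = build_insert(
    "http://vitro.mannlib.cornell.edu/default/vitro-kb-2",  # assumed graph name
    [("<http://example.org/n1>",
      "<http://www.w3.org/2000/01/rdf-schema#label>",
      '"Example label"')],
)
assert "INSERT DATA" in update
```

A matching DELETE DATA payload handles the remove side.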
Cornell (Jon, Jim, Tim, Brian, Huda)
We start the migration to 1.6 on May 8, Slope Day -- estimating one week to complete migration and plan to disable editing during that time
Have been using the 3-tier build along with Florida and Colorado, where local changes are added on top of Vitro and VIVO
Duke (Patrick, Sheri, Richard)
Working on putting together a plan for our next phase of work. This includes: non-faculty opt-in, collecting additional professional activity data, reporting by year, and a news feed section.
The News site will include references to Scholars
Big task is to support reporting data out of the system
Adding artistic works to our widgets tool
Florida (Chris)
Ingesting grants data and continuing to work on sponsors data for sharing related to FundRef.org
update to dChecker software. Now adds links to bad data in the email report. Code release soon on GitHub.
Figure: Image of Data Quality Report
Memorial University (Max Hu)
We just had an internal site running (Drupal), the old Yaffle accounts have been migrated. It’s available for testing and hooking up to VIVO.
The graphic design (Ver. 02) will be completed by Friday (May 2nd).
Continue working on the KM Yaffle ontology to meet VIVO 1.6 ISF requirements.
Prepare data to ingest to VIVO 1.6
Scripps (Michaeleen)
no update
in middle of annual budget process so will fill out VIVO survey after budget deadline
kudos to Brown
Smithsonian (Alvin)
Have a pilot VIVO set up as a proof of concept -- upgraded to 1.6.1 this week and has suggested edits which he’s entered on the wiki
Met earlier in the week with people from EPA and IFPRI (Cristina Perez) on practical examples of getting data in.
Will be looking at ingest options and will share questions with the list
Stony Brook (Tammy)
Also working on uploading grants data -- interested in Chris’s work on shareable sponsor URIs at UF
Chris wants ontology advice on how to represent the data
a DOI for each of 6,000 sponsors in the US, but not shared as RDF on the Internet with URIs
idea is to have a single URI referencing the Sloan Foundation across multiple VIVOs
Needed to remove data, so pulled it out (confirmed by logs), but it all still appeared to be there in SPARQL queries
Does add/remove go to a scratchpad model?
Brian: any of the asserted triples that you’re directly deleting should disappear from KB-2, but not the inferred statements. Those that get removed by the inferencer are put in a scratchpad model that should be deleted after processing completes; if the inferencer doesn’t complete its removal, a scratch model may be left hanging around
Tammy -- would a problem in the course of retraction abort removal of the remaining RDF included in the retraction?
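One way to check for a leftover scratch model of the kind Brian describes is to list the named graphs in the triple store. An illustrative SPARQL query (runnable from VIVO's SPARQL query page; a stray scratchpad would show up alongside the expected kb-2 and inference graphs):

```sparql
# List every named graph and its triple count
SELECT DISTINCT ?g (COUNT(*) AS ?triples)
WHERE { GRAPH ?g { ?s ?p ?o } }
GROUP BY ?g
```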
Symplectic (Alex)
Information, Interaction, and Influence -- Digital Science workshop organized by Amy Brand (VP Academic & Research Relations) at University of Chicago on May 19-20: “Research Information Technologies and their Role in Advancing Science”
Free to attend for academics. Invited speakers include friends of VIVO: Kristi Holmes (Northwestern), Bill Barnett (Indiana), Simon Porter (Melbourne), Griffin Weber (Harvard), Leslie Yuan (UCSF), Rebecca Bryant (ORCID), Daniel Hook (Digital Science), Euan Adie (Altmetric)
More info at http://www.digital-science.com/events/information-interaction-and-influence -- link to Eventbrite registration
the ORCID Outreach Meeting is May 21 and 22, immediately following – https://orcid.org/content/orcid-outreach-meeting-and-codefest-may-2014
RPI (Patrick)
An engagement team for the DCO project has taken a look at the interface in general and has several questions and concerns -- can we have a dialog?
Sure -- let’s have a themed call to discuss, or can discuss offline
1. Project displaying issue on person’s page - when adding people to projects through the “other research activities” property on person’s page, the project is shown as “missing activity”, and this “missing activity” node is not associated with the project. Go to http://info.deepcarbon.net/vivo/display/da5b50ce3-2877-4e9f-9ba4-f645d811bf43 and click on the Research tab.
Tim -- has been in correspondence with Han about this -- relates to the most specific type
2. There is no custom form page for adding people to projects from a project’s page. That’s why we had to do the other way around and ran into issue #1. Any progress on this?
3. Two weeks in a row now we’ve had to restart tomcat due to out of memory issues. We’ve upped the memory numerous times, and there should be enough memory. Looking at the vivo log file we’re getting GRIDLOCK issues. Once the gridlock issue happens we run out of memory. https://scm.escience.rpi.edu/trac/ticket/2130
Brian -- the out of memory error is related to the PermGen setting, not the heap -- did you update both heap and PermGen?
Patrick -- updated both; not a lot before the error in the log that looks like a problem
Brian -- let’s pursue offline
4. Our Publications page, which used to show information, now simply displays the page that says “This page is not yet configured. Implement a link to configure this page if the user has permission.” http://info.deepcarbon.net/vivo/publications . It is working in 1.5.
Our plan is that hundreds of users will be able to create, edit, and delete instance data from our VIVO site. We’re not ingesting data from anywhere; rather than users simply viewing the content, users will actually be able to create content. And we’ll be creating various policies for different groups of users. Are there any sites that currently do this?
The Engagement Team of Deep Carbon Observatory is requesting a usability study for VIVO. They have concerns with what they’ve seen so far (related to the previous bullet).
Patrick: to what extent do sites enable self-editing?
Jon/Chris: both our sites allow self-editing
Patrick/Jon: let’s have a themed call on usability
UCSF (Eric)
Just last week released a new version of Profiles code with the latest OpenSocial engine -- seems to be fast
Have a gadget to pull in grants that somebody else has adopted
Still scraping together data about co-authors, pulling in part from VIVOs
with Fuseki, speed is good but then runs off a cliff at a certain amount of data -- is it likely to need more memory?
Will be working with Symplectic to pull in publications from sources other than PubMed, including Scopus (Paul -- talk to Weill about their experience and scripts)
might try the Harvesters built for VIVO
Virginia Tech (Julie)
Keith is right in the middle of the Symplectic Elements implementation -- installation and configuration and connecting to data sources that require institutional subscriptions
Anticipate working with Symplectic’s VIVO connector about the middle of May
Interested in other institutions’ experience with user engagement
Committees of researchers to make decisions about ontology extensions, for example? -- Duke has a set of power users who help with training but also provide feedback from users
Would this be a good topic for a themed call?
Weill Cornell (Paul)
Performance? Gave Brian and Jim the Weill code and data to install on a Mac and a VM in Ithaca; Jim has been helping install VIVO on a new VM at Weill and are seeing better performance with that installation. Currently looking at fine tuning and documenting settings in detail.
A Google Doc for the emerging performance documentation (based on last week’s themed call) while it’s still in flux -- will then go on the wiki
Wiki home for performance guide will be https://wiki.duraspace.org/display/VIVO/Performance+Troubleshooting+Guide
Temporary doc linked from there for collaborative editing: https://docs.google.com/a/symplectic.co.uk/document/d/1ylp9HEzJiBsBP6vx1vd-Irf8o3Ff-5vDhytOVTI5_Ho/edit#heading=h.vdtjwwvnjdn7
Notable list traffic
- VIVO Harvester, Pubmed Sample Issue
- Cache map of science
- OpenVis Conf in Boston this week
- Harvester 1.5 Code
sometimes when you log in you get a bunch of status warnings -- fixed by dropping MySQL VIVO DB
a complaint about the XML prefix when trying to load an RDF/XML document in 1.6 -- is this a known problem? Jim -- there was a problem in the develop branch, but should be fixed in 1.6.1
the problem manifests when uploading files: settings in form fields selecting formats other than RDF/XML are ignored
this sounds like it may be a different problem -- if you see the exception again, please send a screenshot
- URLs
See the vivo-dev-all archive and vivo-imp-issues archive for complete email threads
Call-in Information
...