One caution here: think carefully about the default namespace you use for URIs in VIVO if you want linked data requests to work. See "A simple installation" and look for Vitro.defaultNamespace among the basic settings.
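As a rough illustration of why the namespace matters, the sketch below checks a configured namespace against the public address of the site. It assumes the common convention that the default namespace is the site's public URL followed by /individual/. The function names and the example.edu host are hypothetical and not part of VIVO.

```python
def expected_default_namespace(site_url):
    """Return the namespace implied by the site's public URL.

    Assumption: the default namespace is the public URL of the VIVO
    application followed by "individual/".
    """
    return site_url.rstrip("/") + "/individual/"


def namespace_matches_site(namespace, site_url):
    """True when the configured namespace equals the expected one exactly."""
    return namespace == expected_default_namespace(site_url)
```

For example, namespace_matches_site("http://localhost:8080/individual/", "http://vivo.example.edu/") returns False, which is the kind of mismatch that would leave URIs minted during testing unresolvable in production.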
Doing further cleanup once in VIVO
VIVO's interactive editing can be helpful in fixing problems in relatively small datasets, but if data originate outside of VIVO and are not corrected at the source, subsequent updates will likely reintroduce the error.
It's also common to have discrepancies in the source data – for example, the naming conventions and identifiers used for departments in a personnel database vs. those used in a grants administration system. There is a command to merge two individuals in VIVO, specifying which URI to retain, but that will combine the statements associated with both, leading to duplicate labels and potentially other duplicates.
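To see why a merge leaves duplicate labels, here is a toy model that treats the data as a set of (subject, predicate, object) tuples. It is only a sketch of the effect described above, not VIVO's merge implementation; the function names and the department URIs in the usage note are made up for illustration.

```python
from collections import defaultdict

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def merge_individuals(triples, keep_uri, drop_uri):
    """Rewrite every statement that mentions drop_uri to use keep_uri."""
    merged = set()
    for s, p, o in triples:
        s = keep_uri if s == drop_uri else s
        o = keep_uri if o == drop_uri else o
        merged.add((s, p, o))
    return merged


def subjects_with_multiple_labels(triples):
    """Map each subject carrying more than one distinct label to its labels."""
    labels = defaultdict(set)
    for s, p, o in triples:
        if p == RDFS_LABEL:
            labels[s].add(o)
    return {s: sorted(v) for s, v in labels.items() if len(v) > 1}
```

If one source calls a department "Chemistry" and another calls it "Dept. of Chemistry", merging the two individuals leaves the retained URI with both labels, and subjects_with_multiple_labels reports it so that one label can be removed by hand.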
VIVO has only limited support for owl:sameAs reasoning due to the performance implications of having to query for all statements about more than one URI whenever rendering information about any one URI declared to be sameAs another.
Many VIVO installations have developed workflows for checking VIVO data, ranging from broken link checkers to nightly SPARQL queries to detect malformed data such as publications without authors, people without identifiers, 'orphaned' dates no longer referenced by any property statements, and so forth. These tools have been discussed on previous Apps and Tools Interest Group calls that have been recorded and uploaded to YouTube.
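A nightly check of this kind might look like the sketch below, which prepares a query for publications without authors. It rests on several assumptions: that publications are typed bibo:Document and linked to authors through vivo:Authorship via vivo:relatedBy, as in the VIVO-ISF ontology; that the site exposes the SPARQL query API at /api/sparqlQuery, accepting email, password and query form parameters; and that the base URL and credentials are placeholders. The helper name is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Publications with no authorship attached (assumes VIVO-ISF modeling).
PUBS_WITHOUT_AUTHORS = """
PREFIX bibo: <http://purl.org/ontology/bibo/>
PREFIX vivo: <http://vivoweb.org/ontology/core#>
SELECT ?pub WHERE {
  ?pub a bibo:Document .
  FILTER NOT EXISTS {
    ?pub vivo:relatedBy ?authorship .
    ?authorship a vivo:Authorship .
  }
}
"""


def build_check_request(base_url, email, password, query):
    """Prepare (but do not send) a POST to the assumed SPARQL query API."""
    body = urlencode(
        {"email": email, "password": password, "query": query}
    ).encode("utf-8")
    return Request(
        base_url.rstrip("/") + "/api/sparqlQuery",
        data=body,
        headers={"Accept": "application/sparql-results+json"},
        method="POST",
    )
```

A scheduled job could pass the returned request to urllib.request.urlopen, parse the JSON result, and report any ?pub bindings. Similar queries could cover people without identifiers or orphaned dates.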