THIS IS AN OLD VERSION. HERE IS THE LATEST ARKs FAQ.

Table of Contents

Basics

How can I give feedback on this document?

...

The only prerequisite is to fill out an online request for a NAAN on behalf of your organization. There is no charge to obtain a NAAN, and all memory organizations are welcome. Within a day or two you should receive an email containing a NAAN for your organization's exclusive use. Meanwhile, consider the following questions.

...

  • What things do you want to name with ARKs? Generally you name objects that you own, control, or manage.
  • Will you assign ARKs to things contained in larger things that have ARKs? This (granularity) is not a problem, and the '/' character may help.
  • Where do you want your ARKs to resolve to? Examples: formatted file, surrogate for a physical thing, landing page with choices, etc.
  • Which web server will host your objects? You are asked this when you request a NAAN, even if the server is not yet working.
  • Which web server/resolver will you use as hostname in the ARK-based URLs that you advertise/publish?
  • ...

    It's like serving ordinary URLs. Incoming URL strings get mapped to content that you return, and if your resolver redirects ARKs to those URLs, you're all set. If you're dealing directly with incoming ARK strings, you can map (convert) them to a form your server handles (eg, map them to URLs on arrival). In this second case, your server is acting as a local resolver.
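
    The second case, a server acting as a local resolver, might be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the NAAN 12345, the example.org URLs, and the names ARK_TARGETS, normalize_ark, and resolve are all hypothetical, and the sketch assumes that both the "ark:/" and "ark:" spellings should be accepted.

```python
import re
from typing import Optional

# Hypothetical lookup table: ARK identifier -> URL your web server already serves.
# The NAAN 12345 and the example.org URLs are placeholders, not real assignments.
ARK_TARGETS = {
    "ark:12345/x5wf8q": "https://example.org/objects/report.pdf",
    "ark:12345/x5wf9r": "https://example.org/objects/photo-42",
}

# Assumption: accept both "ark:/NAAN/name" and "ark:NAAN/name" spellings.
ARK_PATTERN = re.compile(r"ark:/?(?P<naan>[0-9a-z]+)/(?P<name>[^?#]+)")


def normalize_ark(incoming: str) -> Optional[str]:
    """Pull the ARK out of an incoming path or URL and normalize its spelling."""
    match = ARK_PATTERN.search(incoming)
    if match is None:
        return None
    return f"ark:{match['naan']}/{match['name']}"


def resolve(incoming: str) -> Optional[str]:
    """Map an incoming ARK string to a URL the server can handle, if known."""
    ark = normalize_ark(incoming)
    if ark is None:
        return None
    return ARK_TARGETS.get(ark)
```

    A real server would answer with an HTTP redirect to the returned URL and a "not found" response when the result is None.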

    If you choose to run your own ARK infrastructure, you get complete autonomy at the expense of maintaining a server/resolver. At one extreme, you might run all custom infrastructure – including content management, web hosting, minting (generating unique identifier strings), and running your own server/resolver. That infrastructure could be very simple, such as a server configured to map incoming ARK-based URLs to server file pathnames. When you request your NAAN you will be asked to supply the base URL of your local server or resolver.

    At the other extreme, you might work with a vendor that supplies all the infrastructure so that, for example, you can focus on creating content. Hybrid solutions are also common, such as keeping your current web server arrangement and adding only an identifier management piece (eg, the API/UI provided by ezid.cdlib.org, which partners with n2t.net).

    You will also want to think about whether to advertise (release, publish, disseminate) your ARKs based at your resolver or at n2t.net. You might choose the former for branding or the latter for stability. Resolving your ARKs through n2t.net is always possible, regardless of how you advertise them (this is a side-effect of obtaining a NAAN).
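
    The two ways of advertising the same ARK differ only in the resolver placed in front of it, which a short sketch can make concrete. The hostname ark.example.org, the NAAN 12345, and the function name advertised_url are hypothetical; only n2t.net comes from the text above.

```python
N2T_BASE = "https://n2t.net/"
LOCAL_BASE = "https://ark.example.org/"  # hypothetical local resolver


def advertised_url(ark: str, resolver_base: str) -> str:
    """Build the ARK-based URL you would publish for a given resolver."""
    if not ark.startswith("ark:"):
        raise ValueError(f"not an ARK: {ark!r}")
    # Avoid a doubled slash regardless of how the base URL was written.
    return resolver_base.rstrip("/") + "/" + ark


ark = "ark:/12345/x5wf8q"  # placeholder identifier
branded = advertised_url(ark, LOCAL_BASE)  # favors branding
stable = advertised_url(ark, N2T_BASE)     # favors stability
```

    Because the part after the hostname is identical in both forms, switching how you advertise later does not require changing the identifier itself.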

    ...

    • To keep costs down.
    • To work with exactly the metadata you want.
    • To be able to create identifiers without metadata.
    • To have an identifier as soon as you create the first draft of your data.
    • To hold that identifier private while the data and metadata evolve, and decide (maybe years) later, to publish or discard it.
    • To retain that identifier upon publication, perhaps then assigning an additional identifier, such as a DOI.
    • Because ARKs were built for generic application and don't have to be tortured into identifying physical samples or field stations.
    • To be able to change vendor and/or infrastructure without having to coordinate database transfers with a central authority.
    • To be able to deal with the namespace splitting problem without losing control of your identifiers.
    • To link identifiers to different kinds of nuanced persistence commitments.
    • To be able to add queries (eg, ?lang=en) when resolving your identifiers.
    • To use open infrastructure consistent with your organization's values.
    • To link directly to the objects you value instead of to landing pages.
    • To create one identifier that enables millions (suffix passthrough).
    • To access convenient, full-function metadata via inflections.
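
    Two items in the list above, queries and suffix passthrough, can be illustrated together. In this sketch a single registered ARK serves any longer ARK that extends it, with the unregistered remainder (including a query such as ?lang=en) carried over to the target. The table REGISTERED, the function name, the identifiers, and the choice of '/', '.', and '?' as cut points are assumptions made for illustration; an actual resolver may draw its boundaries differently.

```python
from typing import Optional

# Hypothetical registry with exactly one registered ARK.
REGISTERED = {
    "ark:/12345/x5wf8q": "https://example.org/datasets/survey",
}


def resolve_with_passthrough(ark: str) -> Optional[str]:
    """Resolve via the longest registered prefix, passing the rest through."""
    candidate = ark
    while candidate:
        target = REGISTERED.get(candidate)
        if target is not None:
            # Append whatever followed the registered prefix to the target.
            return target + ark[len(candidate):]
        # Shorten the candidate at its last '/', '.', or '?' and try again.
        cut = max(candidate.rfind("/"), candidate.rfind("."), candidate.rfind("?"))
        if cut <= 0:
            return None
        candidate = candidate[:cut]
    return None
```

    With this table, resolve_with_passthrough("ark:/12345/x5wf8q/chapter3/fig2.png") returns "https://example.org/datasets/survey/chapter3/fig2.png", even though only the shorter ARK was ever registered.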

    What does ARK have in common with DOI, Handle, PURL, and URN?

    ...

    • Landing pages: Crossref and DataCite DOIs link to publisher landing pages constructed around, but not directly to, the objects you care about. ARKs can freely link directly to those objects, which is machine- and human-friendly because it requires no extra human navigation step for common tasks such as
      • opening an article's PDF file for reading,
      • referencing an image file meant to be incorporated automatically inline into a document, and
      • citing a spreadsheet to be used directly by data analysis software.
    • DOIs, Handles, etc. do not support ARK-style inflections that permit access to metadata regardless of whether an identifier points to an object or its landing page.
    • Unlike DOIs and Handles, ARKs don't have metadata requirements. ARKs that haven't been released into the world are easy to delete.
    • All things eventually pass, including hostnames and the web itself and the "https://" protocol. When that first part of the identifier ceases to have meaning, only ARKs and URNs will include the label (eg, "ark:") indicating the type of identifier that remains.
    • For DOIs, Handles, and PURLs, you are required to use their respective resolvers. ARKs and URNs permit you to use your own resolver.
    • To create DOIs and Handles, you are required to pay a membership fee and, for DOIs, per-DOI charges. There are no fees for ARKs, PURLs, and URNs.
    • To create Handles, you are required to install and maintain a local Handle server, which gives you another system to monitor, patch, and troubleshoot.
    • Although you can use a local or vendor resolver for your ARKs and URNs, ARKs can also be resolved via the global n2t.net resolver.
    • The envisioned URN resolution infrastructure was never built, so URNs are currently resolved as URLs, and there is no designated global URN-as-URL resolver. In order to register to create URNs, you must apply for a URN namespace.

    ARKs have some unique features that support early object development: ARKs can be deleted, can be born with no metadata, and can exist with any metadata you care to store. 

    ...

    Yes. Sometimes having two identifiers is useful, although it can become confusing when it happens often. Many people start by assigning ARKs to each thing they create in order to have a stable reference right from the beginning, even before they know whether they want to publish it, let alone keep it. Starting with an ARK lets you keep the original identifier from birth through to public release as the object and its metadata mature. For the subset of things that you end up wanting to publish in places that require DOIs, you can assign DOIs at publication time. This is a way in which ARKs support early object development.

    In such a scenario, to reduce the burden of maintaining both identifiers you could register the DOI to redirect to the ARK. At the cost of maintaining just one identifier (the ARK), this would keep newly published links and links previously stored and bookmarked by your collaborators from breaking.
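
    The reasoning can be shown with a small simulation of the redirect chain. Everything here is hypothetical: the DOI, the ARK, the example.org locations, and the two in-memory tables standing in for the DOI and ARK resolvers.

```python
from typing import List

# Hypothetical redirect tables standing in for the two resolvers.
DOI_TARGETS = {
    "doi:10.5555/example": "https://n2t.net/ark:/12345/x5wf8q",
}
ARK_TARGETS = {
    "https://n2t.net/ark:/12345/x5wf8q": "https://example.org/v1/paper.pdf",
}


def follow(identifier: str, max_hops: int = 5) -> List[str]:
    """Return every stop on the way from an identifier to its final location."""
    hops = [identifier]
    for _ in range(max_hops):
        next_stop = DOI_TARGETS.get(hops[-1]) or ARK_TARGETS.get(hops[-1])
        if next_stop is None:
            break
        hops.append(next_stop)
    return hops


# The object moves: only the ARK's target is updated.
ARK_TARGETS["https://n2t.net/ark:/12345/x5wf8q"] = "https://example.org/v2/paper.pdf"

# Both the DOI and the ARK now lead to the new location.
doi_path = follow("doi:10.5555/example")
ark_path = follow("https://n2t.net/ark:/12345/x5wf8q")
```

    The DOI record is never touched after registration, so the only ongoing maintenance is keeping the ARK's target current.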

    ...

    Creating metadata (extra information associated with or describing an object) has several key benefits. First, no matter what the ARK redirects to – whether a landing page or a file – metadata gives users vital information about the object, such as references to newer versions, creation date, provenance, etc. For ARKs, metadata is typically accessed via inflections.
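
    As a rough sketch, a metadata request can be formed by appending an inflection to the end of the ARK-based URL. The "?info" inflection used below is an assumption drawn from the ARK specification rather than from this FAQ, and whether a given resolver honors it should be checked; the identifier and the function name metadata_url are hypothetical.

```python
def metadata_url(ark_url: str, inflection: str = "?info") -> str:
    """Turn an ARK-based URL into a request for metadata about the object."""
    # Drop any existing query or fragment so the inflection ends the URL.
    base = ark_url.split("?", 1)[0].split("#", 1)[0]
    return base + inflection


object_url = "https://n2t.net/ark:/12345/x5wf8q"  # placeholder identifier
info_url = metadata_url(object_url)
```

    The same identifier thus serves both purposes: without the inflection it leads to the object or its landing page, and with it the resolver is asked for a description instead.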

    Metadata also eases some persistence pain. By themselves, persistent identifier strings are often opaque, revealing little about what they identify (because non-opaque identifiers do not age or travel well). But opaque identifiers are difficult because they give you no clues as to what the identifiers were meant to identify. In the absence of metadata you are forced to access the object itself to remind yourself what it is, and to trust that it's the correct object. Metadata really helps. Moreover, discrepancies between returned metadata and the accessed object help everyone detect identifier changes and errors. 

    ...