February 24 LD4L Workshop breakout session: Usage Data

facilitator: Paul Deschner

  1. Usage data sources

    • OCR-ed bibliographies and page rank

    • ILL usage

    • Yahoo circ logs

    • Web analytics (e.g., DPLA UI analytics, esp. contextual granularity)

    • Search terms as form of usage; also as compared to other usage data

    • Entities extracted from queries, not simply literal queries themselves

    • How often a link is traversed; how many times your link has been reconciled in a triple store

    • Browsed materials

    • Citations; also citation networks as compared to other usage data

    • Course-book lists across institutions

  2. StackScore

    • Makes data muddy

    • Too many metrics mixed together; need to separate out the metrics

    • Common metrics needed across institutions

    • Computational transparency important: metrics and algorithms

  3. Negative usage data at local institution

    • Important to see what users are looking for that the local institution doesn’t have

    • What doesn’t circulate in-house but is available via ILL

    • What isn’t read at Columbia but is read at Yale

  4. Usage data runs risk of becoming prescriptive

    • Blandness of collections when everyone acquires most popular items

  5. Use cases

    • Keeping tabs on popularity of colleagues’ publications

    • Usage data as diagnostic tool for targeted collections: if heavily invested-in parts of the collection are not being used, that could prompt an exhibition to increase awareness

    • Scholars doing research on other scholars’ research and publications

    • Look at when items were used: what was checked out in last week, month, year, etc.

    • Link traversals and other link metrics could be sent to link’s source

  6. Long tail issue generally and at own institution

    • Options: random selection out of tail for exposure, subject-filtered selection

    • Important that UI expose long-tail possibilities prominently, above the page-fold

    • Usage data from other institutions and ILL balances out local-institution’s biases
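
The two long-tail options noted above (random selection and subject-filtered selection) could be sketched roughly as follows. This is only an illustration: the record fields (`title`, `subject`, `checkouts`), the function name `pick_from_long_tail`, and the `max_checkouts` cutoff that defines the "tail" are assumptions, not anything specified in the session.

```python
import random


def pick_from_long_tail(items, max_checkouts=1, subject=None, k=3, seed=None):
    """Return up to k randomly chosen low-use items, optionally within one subject.

    items: list of dicts with hypothetical keys 'title', 'subject', 'checkouts'.
    max_checkouts: assumed cutoff; items at or below it count as long tail.
    """
    tail = [
        item for item in items
        if item["checkouts"] <= max_checkouts
        and (subject is None or item["subject"] == subject)
    ]
    rng = random.Random(seed)  # seed only for reproducible demos
    return rng.sample(tail, min(k, len(tail)))


if __name__ == "__main__":
    catalog = [
        {"title": "A", "subject": "history", "checkouts": 0},
        {"title": "B", "subject": "history", "checkouts": 12},
        {"title": "C", "subject": "physics", "checkouts": 1},
        {"title": "D", "subject": "history", "checkouts": 1},
    ]
    for item in pick_from_long_tail(catalog, subject="history"):
        print(item["title"])
```

A real cutoff would more plausibly be derived from the usage distribution (for example a percentile) and could draw on ILL or other institutions' data rather than local checkouts alone.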

  7. Privacy

    • Opt-in option for users willing to share their usage data

    • Huddersfield University (England): more liberal approach to data exposure, including access to clustering (users who borrowed this also borrowed that) and usage by academic course and school

    • IP-based web stats inherently less risky than personal ID-based circulation data

    • Anonymization tools important

    • Clustering dangerous
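
One way to reconcile the clustering use case ("users who borrowed this also borrowed that") with the privacy concerns above is sketched below, under stated assumptions. Patron IDs are replaced with keyed hashes before counting, and item pairs shared by fewer than `min_patrons` borrowers are suppressed. The function names, the `(patron_id, item_id)` loan format, and the threshold value are hypothetical. Keyed hashing is pseudonymization rather than full anonymization, and small-count suppression reduces but does not eliminate re-identification risk.

```python
import hashlib
import hmac
from collections import defaultdict
from itertools import combinations


def pseudonymize(patron_id, secret_key):
    """Replace a patron ID with a keyed SHA-256 digest (HMAC)."""
    return hmac.new(secret_key, patron_id.encode("utf-8"), hashlib.sha256).hexdigest()


def co_borrow_counts(loans, secret_key, min_patrons=2):
    """Count item pairs borrowed by the same patron.

    loans: iterable of (patron_id, item_id) tuples (assumed format).
    min_patrons: assumed suppression threshold; pairs shared by fewer
    patrons are dropped from the result.
    """
    items_by_patron = defaultdict(set)
    for patron_id, item_id in loans:
        items_by_patron[pseudonymize(patron_id, secret_key)].add(item_id)

    pair_counts = defaultdict(int)
    for items in items_by_patron.values():
        # Sort so each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(items), 2):
            pair_counts[(a, b)] += 1

    return {pair: n for pair, n in pair_counts.items() if n >= min_patrons}


if __name__ == "__main__":
    sample_loans = [
        ("p1", "A"), ("p1", "B"),
        ("p2", "A"), ("p2", "B"),
        ("p3", "A"), ("p3", "C"),
    ]
    print(co_borrow_counts(sample_loans, secret_key=b"demo-key-only"))
```

In the sample data the pair (A, C) is borrowed together by only one patron, so it is withheld; only (A, B) is reported.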