The WHG data pipeline: upload, enhance, link, publish

I'd like to explain the basics of the WHG pipeline here, but haven't got the time right now.

Upload and Enhance

Upload your place data to a private workspace in the WHG, then augment it with geometry and authority identifiers from Wikidata.

Upload data guide

WHG supports Linked Places format (LPF) and a simpler delimited file format, LP-Delimited (LP-D).

Records from valid uploaded files (one per place) are written to our relational database. Each is assigned a permanent identifier (pid), which is maintained alongside your own unique identifier for as long as the dataset remains in the system.
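
As a rough illustration of the upload step described above, the following sketch parses a small tab-delimited file with one row per place and pairs each source identifier with a newly assigned pid. The column names (`id`, `title`), the sample rows, and the counter-based pid assignment are assumptions made for this example; they are not taken from the LP-D specification or from WHG's actual implementation.

```python
import csv
import io
from itertools import count

# Hypothetical sample data; column names are illustrative, not the LP-D spec.
SAMPLE = "id\ttitle\nabc1\tAlexandria\nabc2\tAntioch\n"


def load_places(text, pid_start=1):
    """Parse tab-delimited rows (one per place) and pair each with a new pid."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    pids = count(pid_start)  # stand-in for a database sequence
    seen = set()
    records = []
    for row in reader:
        src_id = row["id"]
        # The contributor's own identifier must be unique within the file.
        if src_id in seen:
            raise ValueError(f"duplicate source id: {src_id}")
        seen.add(src_id)
        # Keep the source id alongside the system-assigned pid.
        records.append({"pid": next(pids), "src_id": src_id, "title": row["title"]})
    return records


records = load_places(SAMPLE)
```

Each resulting record carries both identifiers, so a place can be traced back to the contributor's own data at any time.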

Enhance data guide

WHG's reconciliation service finds potential matches to your records in our index of 3.5 million Wikidata places. These are presented in a review screen, and for each accepted match, one or more geometry and authority URI records are written to the database—enhancing your record without altering it.
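
The idea of enhancing a record without altering it can be sketched as follows: accepted matches are stored as separate geometry and link entries keyed by pid, while the uploaded record itself is immutable. The class names, the `accept_match` helper, the candidate dictionary shape, and the example.org URI are all hypothetical, chosen only to illustrate the pattern.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Place:
    """An uploaded record; frozen so that enhancement cannot modify it."""
    pid: int
    title: str


@dataclass
class Enhancements:
    """Geometry and authority links added by accepted matches."""
    geoms: list = field(default_factory=list)
    links: list = field(default_factory=list)


def accept_match(store, place, candidate):
    """Record an accepted match as separate rows keyed by the place's pid."""
    enh = store.setdefault(place.pid, Enhancements())
    if candidate.get("geom") is not None:
        enh.geoms.append(candidate["geom"])
    enh.links.append(candidate["uri"])
    return enh


store = {}
place = Place(pid=1, title="Alexandria")
# Hypothetical candidate, as a reviewer might accept it on the review screen.
candidate = {"uri": "https://example.org/entity/Q1", "geom": (29.9, 31.2)}
accept_match(store, place, candidate)
```

Keeping enhancements in their own structure means a rejected or later withdrawn match can be removed without touching the contributor's original data.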

Publish and Link

Your published place data can be very useful to others working in the same region—more so when your individual records are linked with those of other datasets in WHG's "union index."

Publish data guide

Once a dataset has been reconciled with Wikidata and its metadata is complete, WHG staff can flag it as public, making its records discoverable in the platform's search, browse, and API features.

Link data guide

The final step in accessioning a dataset to WHG is reconciling its records against our growing "union index." This step links matched records for each given place, generating a cluster or "mini-graph" of one or more attestations.
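
One way to picture how linked records form clusters is a simple union-find pass over accepted matches, sketched below. This is an illustrative technique, not a description of how the union index is actually built; the record labels and the `cluster` function are hypothetical.

```python
def cluster(records, links):
    """Group records into clusters connected by accepted links (union-find)."""
    parent = {r: r for r in records}

    def find(x):
        # Follow parent pointers to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    groups = {}
    for r in records:
        groups.setdefault(find(r), set()).add(r)
    # Sort clusters by their smallest member for a stable result.
    return sorted(groups.values(), key=min)


# Hypothetical records from four datasets; two accepted links.
records = ["a:1", "b:7", "c:3", "d:9"]
links = [("a:1", "b:7"), ("b:7", "c:3")]
clusters = cluster(records, links)
```

Here the three linked records end up in one cluster, and the unmatched record forms a cluster of its own, which corresponds to a place with a single attestation.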