WHG Walkthrough: Contributing and preparing data

Take the following steps to upload a sample dataset to WHG and reconcile it against our internal index of the complete Getty Thesaurus of Geographic Names (TGN).

Register and Upload

  • Register on the site and log in
  • Click menu option 'My Data'
  • Click "add new" link
  • Fill in Create Dataset form
    • Title: any string
    • Label: a unique string, 20 or fewer characters
      • Hint: try part of the title + '_' + your initials
    • Data type: Places is the only choice right now
    • Description: a brief summary of the dataset
    • URI base: leave blank unless the data is published; if it is, enter the base of its record URIs, e.g. 'http://myorg.org/places/'
    • Public?: Check if it's okay for anyone to view the data
    • Initial file
      • Choose a file: Example files are available via a link on the right side of the screen. Click to download the .zip file to a location you'll remember, then expand it. Select one of the example files, in either tsv or lpf format.
      • Format: choose the one that matches your file, either LP-TSV (delimited text; spec) or Linked Places format (JSON-LD & GeoJSON compatible; spec).
  • Click the "Upload" button
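
The Label rule above (20 or fewer characters, part of the title + '_' + your initials) can be sketched in a few lines of Python. This is only an illustration: the function name, the lowercasing, and the removal of punctuation are assumptions, not WHG behavior, and uniqueness can only be confirmed by the site when the form is submitted.

```python
import re

MAX_LABEL_LEN = 20  # limit stated in the Create Dataset form instructions


def suggest_label(title: str, initials: str) -> str:
    """Build a candidate label: part of the title + '_' + initials."""
    # Keep only letters and digits, lowercased (an assumed convention).
    suffix = "_" + re.sub(r"[^A-Za-z0-9]", "", initials).lower()
    stem = re.sub(r"[^A-Za-z0-9]", "", title).lower()
    # Trim the title part so the whole label stays within the limit.
    room = max(0, MAX_LABEL_LEN - len(suffix))
    return stem[:room] + suffix


print(suggest_label("Historical Places of Peru", "kg"))
```

If the site reports that the label is already taken, shortening the title part or adding a digit before the underscore is one way to vary it.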

The data file is validated against the chosen format; if there are no errors, its contents are inserted into the WHG database and you are directed to the dataset's Portal page.
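
Before uploading, a rough local check can catch files that are obviously malformed. The sketch below is not WHG's validator and does not check the LP-TSV or Linked Places specs. It only assumes that a delimited file should have the same number of tab-separated columns in every row, and that a Linked Places file should parse as JSON with a top-level "features" list, as in GeoJSON. Both function names are hypothetical.

```python
import csv
import io
import json


def precheck_tsv(text: str) -> list[str]:
    """Return problems found in tab-delimited text (empty list if none)."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows:
        return ["file is empty"]
    header = rows[0]
    problems = []
    for number, row in enumerate(rows[1:], start=2):
        if not row:
            continue  # ignore blank lines
        if len(row) != len(header):
            problems.append(
                f"row {number}: expected {len(header)} columns, found {len(row)}"
            )
    return problems


def precheck_lpf(text: str) -> list[str]:
    """Return problems found in Linked Places JSON text (empty list if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(data, dict) or not isinstance(data.get("features"), list):
        return ["expected a JSON object with a 'features' list"]
    return []


print(precheck_tsv("id\ttitle\n1\tLima\n2\n"))
```

A file that passes these checks can still fail validation on upload, since the real validation is done against the chosen format's spec.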

At this stage, you can:

  • browse the contents of the uploaded data (Browse tab)
  • initiate reconciliation tasks in order to identify records for matching places in Wikidata and the Getty Thesaurus of Geographic Names (TGN)

Reconciliation

WHG reconciliation services allow dataset owners to augment their data with additional geometry for more complete mapping and analysis, and with links to modern name authority resources such as Getty TGN and, via Wikidata, GeoNames, VIAF, and DBpedia. Those links are the essential "glue" enabling the semi-automation of data linking.

  • Leave default settings in place, with Getty TGN selected
  • Click the "Start" button
  • For each record in the uploaded data file, a search is performed against an indexed copy of the entire TGN. Up to three passes (queries) are made; if the first returns no results, the second is performed, and so on. These are labeled pass1, pass2, and pass3 in the results.
  • Upon completion, a result summary is displayed, with links to review the prospective matches (hits) for each pass.
  • Click on the first 'review' link in the list on the right to begin
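
The pass logic described above (try the first query, fall back to the next only if it returns nothing) can be sketched as follows. The walkthrough does not say what each pass actually queries, so the three query functions and the toy index here are invented purely for illustration; only the fall-through behavior and the pass1/pass2/pass3 labels come from the description.

```python
from typing import Callable, Optional

Query = Callable[[dict], list]

# Toy stand-in for the indexed TGN; the entries are made up.
TOY_INDEX = [
    {"id": "toy:1", "name": "Lima", "country": "PE"},
    {"id": "toy:2", "name": "Lima", "country": "US"},
    {"id": "toy:3", "name": "Cuzco", "country": "PE"},
]


def strict_query(record: dict) -> list:
    # Hypothetical pass: name and country must both agree.
    return [e for e in TOY_INDEX
            if e["name"] == record["title"] and e["country"] == record.get("country")]


def name_query(record: dict) -> list:
    # Hypothetical pass: name only.
    return [e for e in TOY_INDEX if e["name"] == record["title"]]


def loose_query(record: dict) -> list:
    # Hypothetical pass: case-insensitive match on the first three letters.
    prefix = record["title"].lower()[:3]
    return [e for e in TOY_INDEX if e["name"].lower().startswith(prefix)]


PASSES = [("pass1", strict_query), ("pass2", name_query), ("pass3", loose_query)]


def reconcile_record(record: dict,
                     passes: list[tuple[str, Query]]) -> tuple[Optional[str], list]:
    """Run each pass in order and stop at the first one that returns hits."""
    for label, query in passes:
        hits = query(record)
        if hits:
            return label, hits
    return None, []  # no pass produced any hits


print(reconcile_record({"title": "Lima"}, PASSES))
```

Because later passes run only when earlier ones find nothing, every hit for a record carries a single pass label, which is why the result summary can group hits for review by pass.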

Reconciliation Review

Once a reconciliation task is complete, dataset owners and team members must review the prospective matches, declaring match/no match for each. This is made easier with our Review screen.

  • The Reconciliation Review screen presents each uploaded record that had any hits, one at a time, on the left, with a list of its hits on the right.
  • If any of the hits on the right are a 'close match' or 'exact match' with your record, click the appropriate radio button, then the "Save" button. The screen then advances to the next record and the previous record is removed from the queue.
  • Assertions of matches are saved to the WHG database as 'place_link' records, associated with your dataset's place record.
  • Additionally, if the "accept geometries in matches" box was checked when creating this reconciliation task, any geometries in the authority record (TGN in this case) are saved as new place_geom records, also linked to your dataset's place record.
  • If none of the hits are a match, simply click "Save" and the screen advances.
  • A help icon links to an explanation of the formal relations, closeMatch and exactMatch.
  • The "related" choice is available experimentally. Any "related" assertions are recorded, but are not reflected in the interface at this time.
  • Note that if your record has a geometry, it will show up in the map as a green marker, and geometries from all of the hits appear as orange markers. Hovering over the globe symbol for a hit will highlight that record on the map.
  • Note that after the first save, an undo link appears on the left side of the grey banner. This will undo any result of the last save and return that record to the queue.
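
The bookkeeping behind the Review screen can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the WHG implementation: the class, its field names, and the record layouts are hypothetical, and only the ideas of place_link and place_geom records, the review queue, the "accept geometries in matches" option, and a single-level undo come from the steps above. It also assumes geometries are copied only for closeMatch and exactMatch, not for "related".

```python
from dataclasses import dataclass, field
from typing import Optional

RELATIONS = {"closeMatch", "exactMatch", "related"}


@dataclass
class ReviewSession:
    queue: list                      # place ids still awaiting review
    accept_geometries: bool = False  # the "accept geometries in matches" option
    place_link: list = field(default_factory=list)
    place_geom: list = field(default_factory=list)
    _undo: Optional[tuple] = None    # what is needed to reverse the last save

    def save(self, hits: list, choices: dict) -> None:
        """Save choices for the record at the head of the queue.

        choices maps a hit id to a relation; hits left out count as no match.
        """
        place_id = self.queue.pop(0)
        # Remember the list lengths so this save can be undone.
        self._undo = (place_id, len(self.place_link), len(self.place_geom))
        for hit in hits:
            relation = choices.get(hit["id"])
            if relation is None:
                continue
            if relation not in RELATIONS:
                raise ValueError(f"unknown relation: {relation}")
            self.place_link.append(
                {"place_id": place_id, "authority_id": hit["id"], "relation": relation}
            )
            is_match = relation in ("closeMatch", "exactMatch")
            if self.accept_geometries and is_match and hit.get("geom"):
                self.place_geom.append(
                    {"place_id": place_id, "geom": hit["geom"], "source": hit["id"]}
                )

    def undo(self) -> None:
        """Reverse the last save and return its record to the queue."""
        if self._undo is None:
            return
        place_id, n_links, n_geoms = self._undo
        del self.place_link[n_links:]
        del self.place_geom[n_geoms:]
        self.queue.insert(0, place_id)
        self._undo = None


session = ReviewSession(queue=["p1", "p2"], accept_geometries=True)
hits = [{"id": "hit:1", "geom": {"type": "Point", "coordinates": [-77.03, -12.04]}},
        {"id": "hit:2"}]
session.save(hits, {"hit:1": "exactMatch"})
print(session.place_link, session.queue)
```

Saving with an empty choices mapping corresponds to clicking "Save" when none of the hits match: the record leaves the queue and nothing is written.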

After reviewing all hits from all passes and affirming any matches discovered, you will have augmented your dataset in the WHG database with new place_link and place_geom records. Those additions will be reflected in the Browse tab map and record details. Your dataset is also now ready for accessioning to the WHG index, a step performed by WHG staff in consultation with contributors.