whg.bulk_import

Bulk-import dataset metadata from the World Historical Gazetteer (WHG) API.

The WHG public API returns a list of datasets (Feature objects) at:

GET https://whgazetteer.org/api/datasets

Attributes

WHG_DATASETS_URL

logger

plugin

here

Functions

_parse_features(...)

Validate raw API items into Feature records.

fetch_datasets(...)

Fetch all datasets from the WHG API.

bulk_translate(...)

Translate a list of raw WHG feature dicts to OME EducationResource cards.

bulk_import(→ list[dict])

Fetch WHG datasets and return serialised OME records.

Module Contents

whg.bulk_import.WHG_DATASETS_URL = 'https://whgazetteer.org/api/datasets'
whg.bulk_import.logger
whg.bulk_import.plugin
whg.bulk_import._parse_features(items: list) → list[server.plugins.whg.whg_models.Feature]

Validate raw API items into Feature records.

Uses an ExceptionGroup (built in since Python 3.11) to gather and report all validation errors at once rather than stopping at the first malformed record. Valid records are returned even when some items fail validation.

whg.bulk_import.fetch_datasets(url: str = WHG_DATASETS_URL) → list[server.plugins.whg.whg_models.Feature]

Fetch all datasets from the WHG API.

Args:

url: WHG datasets API endpoint.

Returns:

A list of Feature records.
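
A rough sketch of the fetch step, using only the standard library. The helper name fetch_raw_items, the opener and timeout parameters, and the assumption that the response body is a JSON array are all illustrative; the real function may use a different HTTP client and also validates the items into Feature records.

```python
import json
from urllib.request import urlopen

WHG_DATASETS_URL = "https://whgazetteer.org/api/datasets"


def fetch_raw_items(url: str = WHG_DATASETS_URL, opener=urlopen, timeout: float = 30.0) -> list:
    """Download the endpoint and return the decoded JSON items.

    `opener` is a hypothetical injection point so the function can be
    exercised without network access.
    """
    with opener(url, timeout=timeout) as response:
        payload = json.load(response)
    # Assumption: the endpoint answers with a JSON array of dataset items.
    if not isinstance(payload, list):
        raise ValueError(f"expected a JSON array, got {type(payload).__name__}")
    return payload
```

Passing a fake opener that returns an in-memory byte stream is enough to test the decoding logic offline.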

whg.bulk_import.bulk_translate(features: list[dict]) → collections.abc.Iterator[server.plugins.ome_plugin.EducationResource]

Translate a list of raw WHG feature dicts to OME EducationResource cards.

whg.bulk_import.bulk_import(url: str = WHG_DATASETS_URL, cache_path: pathlib.Path | None = None) → list[dict]

Fetch WHG datasets and return serialised OME records.

Results are cached locally so that repeated runs do not re-fetch the API.

Args:

url: WHG datasets API endpoint.

cache_path: Path to the local JSON cache file. If None, defaults to whg.json next to this module.

Returns:

A list of serialised EducationResource dicts.
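
The caching behaviour described above might look roughly like this. The helper cached_import and its fetch callable are hypothetical; the sketch assumes the cache is a plain JSON file that is reused whenever it exists, and it says nothing about how the real module invalidates or refreshes the cache.

```python
import json
from collections.abc import Callable
from pathlib import Path


def cached_import(fetch: Callable[[], list[dict]], cache_path: Path) -> list[dict]:
    """Return cached records if present; otherwise fetch and store them."""
    if cache_path.exists():
        # Cache hit: skip the API call entirely.
        return json.loads(cache_path.read_text(encoding="utf-8"))
    records = fetch()
    # Cache miss: persist the serialised records for the next run.
    cache_path.write_text(
        json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return records
```

With this shape, deleting the cache file is the simplest way to force a fresh fetch on the next run.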

whg.bulk_import.here