whg.bulk_import¶
Bulk-import dataset metadata from the World Historical Gazetteer (WHG) API.
- The WHG public API returns a list of datasets (Feature objects) at:
Attributes¶
Functions¶
| _parse_features(items) | Validate raw API items into Feature records. |
| fetch_datasets(url) | Fetch all datasets from the WHG API. |
| bulk_translate(features) | Translate a list of raw WHG feature dicts to OME EducationResource cards. |
| bulk_import(url, cache_path) | Fetch WHG datasets and return serialised OME records. |
Module Contents¶
- whg.bulk_import.WHG_DATASETS_URL = 'https://whgazetteer.org/api/datasets'¶
- whg.bulk_import.logger¶
- whg.bulk_import.plugin¶
- whg.bulk_import._parse_features(items: list) -> list[server.plugins.whg.whg_models.Feature]¶
Validate raw API items into Feature records.
Uses an ExceptionGroup (built in since Python 3.11) to gather and report all validation errors at once rather than stopping at the first malformed record. Valid records are returned even when some items fail validation.
- whg.bulk_import.fetch_datasets(url: str = WHG_DATASETS_URL) -> list[server.plugins.whg.whg_models.Feature]¶
Fetch all datasets from the WHG API.
- Args:
url: WHG datasets API endpoint.
- Returns:
A list of Feature records.
- whg.bulk_import.bulk_translate(features: list[dict]) -> collections.abc.Iterator[server.plugins.ome_plugin.EducationResource]¶
Translate a list of raw WHG feature dicts to OME EducationResource cards.
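Because the signature returns an Iterator, the translation is presumably lazy. The sketch below shows one way such a generator could look. The `EducationResource` dataclass here, its `title` and `url` fields, the input keys and the name `translate_features` are all invented for illustration; the real OME model and field mapping are not documented on this page.

```python
from collections.abc import Iterator
from dataclasses import asdict, dataclass


@dataclass
class EducationResource:
    # Hypothetical stand-in for ome_plugin.EducationResource; fields are assumptions.
    title: str
    url: str


def translate_features(features: list[dict]) -> Iterator[EducationResource]:
    """Yield one card per raw feature dict, building each card only when requested."""
    for feature in features:
        yield EducationResource(
            title=str(feature.get("title", "")),
            url=str(feature.get("url", "")),
        )


# Usage: the generator does no work until it is consumed.
raw = [{"title": "Sample gazetteer", "url": "https://example.org/1"}]
records = [asdict(card) for card in translate_features(raw)]
```

Consuming the iterator with a list comprehension, as in the last line, is one way a caller could obtain serialised dicts.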
- whg.bulk_import.bulk_import(url: str = WHG_DATASETS_URL, cache_path: pathlib.Path | None = None) -> list[dict]¶
Fetch WHG datasets and return serialised OME records.
Results are cached locally so that repeated runs do not re-fetch the API.
- Args:
url: WHG datasets API endpoint.
cache_path: Path to the local JSON cache file. If None, defaults to whg.json next to this module.
- Returns:
A list of serialised EducationResource dicts.
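The caching behaviour described for `bulk_import` can be illustrated with a small read-through file cache. This is a sketch under assumptions, not the module's implementation: the helper name `cached_records` and the `fetch` callback are hypothetical, and the real function may handle cache invalidation or errors differently.

```python
import json
from collections.abc import Callable
from pathlib import Path


def cached_records(cache_path: Path, fetch: Callable[[], list[dict]]) -> list[dict]:
    """Return records from the JSON cache if it exists, otherwise fetch and store them."""
    if cache_path.exists():
        # Cache hit: skip the network call entirely.
        return json.loads(cache_path.read_text(encoding="utf-8"))
    records = fetch()
    cache_path.write_text(
        json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return records
```

With this shape, deleting the cache file is enough to force a fresh fetch on the next run.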
- whg.bulk_import.here¶