prelinger.fetch_prelinger_videos

Fetch Prelinger video metadata from the Internet Archive.

Two complementary functions are provided:

  • search_prelinger - queries the Internet Archive Advanced Search API to discover video identifiers within the Prelinger collection.

  • fetch_item_metadata - retrieves the full metadata for a single item via the Internet Archive Metadata (md-read) API.

Usage (run directly):

uv run server/plugins/prelinger/fetch_prelinger_videos.py

Results are cached in prelinger_finland_videos.json inside this directory.

Attributes

IA_SEARCH_URL

IA_METADATA_URL

HEADERS

plugin

Functions

search_prelinger(→ list[str])

Search the Prelinger collection and return a list of item identifiers.

fetch_item_metadata(...)

Fetch the full metadata for a single Prelinger item via the md-read API.

bulk_import(...)

Search the Prelinger collection for query and return a list of PrelingerItem objects with full metadata.

Module Contents

prelinger.fetch_prelinger_videos.IA_SEARCH_URL = 'https://archive.org/advancedsearch.php'
prelinger.fetch_prelinger_videos.IA_METADATA_URL = 'https://archive.org/metadata/{identifier}'
prelinger.fetch_prelinger_videos.HEADERS
prelinger.fetch_prelinger_videos.search_prelinger(query: str, rows: int = 50, *, httpx_client: httpx.Client) → list[str]

Search the Prelinger collection and return a list of item identifiers.

Uses the Internet Archive Advanced Search API: GET https://archive.org/advancedsearch.php
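
As a rough illustration of the request described above, the sketch below builds Advanced Search parameters and extracts identifiers from the JSON response. It is not the module's implementation. The helper names (build_search_params, extract_identifiers, search_identifiers) are hypothetical, and the exact query string, including the collection:prelinger filter, is an assumption about how the search might be scoped. The client argument is any object with an httpx-style get() method.

```python
from typing import Any

IA_SEARCH_URL = "https://archive.org/advancedsearch.php"


def build_search_params(query: str, rows: int = 50) -> dict[str, Any]:
    """Build query-string parameters for an Advanced Search request."""
    return {
        # Assumed query shape: restrict results to the Prelinger collection.
        "q": f"collection:prelinger AND ({query})",
        "fl[]": "identifier",  # ask only for the identifier field
        "rows": rows,
        "page": 1,
        "output": "json",
    }


def extract_identifiers(payload: dict[str, Any]) -> list[str]:
    """Pull identifiers out of the JSON body, tolerating missing keys."""
    docs = payload.get("response", {}).get("docs", [])
    return [doc["identifier"] for doc in docs if "identifier" in doc]


def search_identifiers(client: Any, query: str, rows: int = 50) -> list[str]:
    """Run the search with any client exposing an httpx-style get()."""
    response = client.get(IA_SEARCH_URL, params=build_search_params(query, rows))
    response.raise_for_status()
    return extract_identifiers(response.json())
```

With httpx installed, a call might look like search_identifiers(httpx.Client(), "finland", rows=10). Keeping parameter building and response parsing in separate functions lets both be checked without network access.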

prelinger.fetch_prelinger_videos.fetch_item_metadata(identifier: str, *, httpx_client: httpx.Client) → server.plugins.prelinger.prelinger_models.PrelingerItem

Fetch the full metadata for a single Prelinger item via the md-read API.

GET https://archive.org/metadata/{identifier}
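
The following sketch shows one way a metadata response could be mapped onto a typed object. The fields of PrelingerItem are not documented on this page, so ItemSummary is a hypothetical stand-in, and parse_item and fetch_summary are illustrative names only. It assumes the response carries top-level "metadata" and "files" keys and that an unknown identifier yields an empty JSON object.

```python
from dataclasses import dataclass
from typing import Any

IA_METADATA_URL = "https://archive.org/metadata/{identifier}"


@dataclass
class ItemSummary:
    """Hypothetical stand-in for PrelingerItem; the real fields may differ."""

    identifier: str
    title: str
    file_names: list[str]


def parse_item(identifier: str, payload: dict[str, Any]) -> ItemSummary:
    """Map a metadata response onto the summary type."""
    if not payload:
        # Assumption: an empty JSON object means the item does not exist.
        raise LookupError(f"no such item: {identifier}")
    metadata = payload.get("metadata", {})
    files = payload.get("files", [])
    return ItemSummary(
        identifier=identifier,
        # Metadata values can be strings or lists; str() is a crude normalizer.
        title=str(metadata.get("title", "")),
        file_names=[f["name"] for f in files if "name" in f],
    )


def fetch_summary(client: Any, identifier: str) -> ItemSummary:
    """Fetch one item with any client exposing an httpx-style get()."""
    response = client.get(IA_METADATA_URL.format(identifier=identifier))
    response.raise_for_status()
    return parse_item(identifier, response.json())
```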

prelinger.fetch_prelinger_videos.bulk_import(query: str = 'finland', rows: int = 50) → list[server.plugins.prelinger.prelinger_models.PrelingerItem]

Search the Prelinger collection for query and return a list of PrelingerItem objects with full metadata.

Results are cached in prelinger_finland_videos.json (when query is "finland"). Re-run with a fresh environment to bypass the cache.
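
The caching behaviour described above can be pictured with a minimal read-through file cache. The sketch is an assumption about the general pattern, not the module's actual logic: load_or_fetch is a hypothetical helper, and the JSON layout of the real cache file is not documented here.

```python
import json
from pathlib import Path
from typing import Callable


def load_or_fetch(cache_path: Path, fetch: Callable[[], list[dict]]) -> list[dict]:
    """Return cached records if the file exists; otherwise fetch and store them."""
    if cache_path.exists():
        # Cache hit: skip the network entirely.
        return json.loads(cache_path.read_text(encoding="utf-8"))
    records = fetch()
    cache_path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return records
```

Under this pattern, deleting the cache file forces the next call to fetch fresh data.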

prelinger.fetch_prelinger_videos.plugin