project_gutenberg.bulk_import
Attributes
Functions
| fetch_books(query, limit) | Fetch books from the Gutendex REST API matching query. |
| bulk_translate(books) | Translate a list of raw Gutendex book dicts to OME EducationResource cards. |
| bulk_import(query, limit, cache_path) | Fetch Project Gutenberg books by search query and cache results locally. |
Module Contents
- project_gutenberg.bulk_import.GUTENDEX_BOOKS_URL = 'https://gutendex.com/books/'
- project_gutenberg.bulk_import.DEFAULT_SEARCH_QUERY = 'Sherlock Holmes'
- project_gutenberg.bulk_import.DEFAULT_LIMIT = 32
- project_gutenberg.bulk_import.API_TIMEOUT_SECONDS = 30.0
- project_gutenberg.bulk_import.logger
- project_gutenberg.bulk_import.plugin
- async project_gutenberg.bulk_import.fetch_books(query: str = DEFAULT_SEARCH_QUERY, limit: int = DEFAULT_LIMIT) -> list[server.plugins.project_gutenberg.gutenberg_models.GutenbergBook]
Fetch books from the Gutendex REST API matching query.
Gutendex paginates in pages of 32 books. This function fetches pages until limit books have been collected or no more pages are available.
- Args:
  query: Search query string (e.g. "Sherlock Holmes").
  limit: Maximum total number of books to return.
- Returns:
  A list of GutenbergBook records.
- Raises:
  RuntimeError: If an API request fails.
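The pagination behaviour described for fetch_books can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's actual implementation: collect_books and the injected get_page callable are hypothetical names, and the sketch relies only on the "results" and "next" fields that Gutendex returns with each page.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import Any

# Hypothetical type: an async callable that takes a page URL and returns
# the decoded JSON body of that Gutendex page.
PageGetter = Callable[[str], Awaitable[dict[str, Any]]]


async def collect_books(
    get_page: PageGetter, first_url: str, limit: int
) -> list[dict[str, Any]]:
    """Follow Gutendex "next" links until `limit` books are collected."""
    books: list[dict[str, Any]] = []
    url = first_url
    while url is not None and len(books) < limit:
        try:
            page = await get_page(url)
        except Exception as exc:
            # Mirror the documented contract: surface failures as RuntimeError.
            raise RuntimeError(f"Gutendex request failed: {url}") from exc
        books.extend(page.get("results", []))
        url = page.get("next")  # None once the last page has been read
    # The final page may overshoot the limit, so trim the excess.
    return books[:limit]
```

Injecting get_page keeps the loop independent of any particular HTTP client, so it can be exercised with canned pages and no network access.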
- project_gutenberg.bulk_import.bulk_translate(books: list[dict]) -> collections.abc.Iterator[server.plugins.ome_plugin.EducationResource]
Translate a list of raw Gutendex book dicts to OME EducationResource cards.
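A lazy translation of this kind might look like the sketch below. The Card dataclass and its fields are hypothetical stand-ins, since the real EducationResource schema is not shown on this page; only the "id", "title" and "authors" keys of a Gutendex book dict are assumed.

```python
from collections.abc import Iterator
from dataclasses import dataclass
from typing import Any


@dataclass
class Card:
    """Hypothetical stand-in for EducationResource (real fields not shown here)."""

    title: str
    authors: list[str]
    source_url: str


def translate(books: list[dict[str, Any]]) -> Iterator[Card]:
    """Yield one card per raw Gutendex book dict, without building a list."""
    for book in books:
        yield Card(
            title=book.get("title", ""),
            authors=[author["name"] for author in book.get("authors", [])],
            source_url=f"https://www.gutenberg.org/ebooks/{book['id']}",
        )
```

Because the function is a generator, nothing is translated until the caller iterates over the result, for example with list(translate(books)).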
- project_gutenberg.bulk_import.bulk_import(query: str = DEFAULT_SEARCH_QUERY, limit: int = DEFAULT_LIMIT, cache_path: pathlib.Path | None = None) -> list[dict]
Fetch Project Gutenberg books by search query and cache results locally.
On the first run the function calls the Gutendex REST API and writes the results to cache_path. Subsequent calls read from the cache so that the network is not hit again.
- Args:
  query: Search query passed to the Gutendex REST API.
  limit: Maximum number of books to fetch from the API.
  cache_path: Path for the local JSON cache. Defaults to gutenberg_sherlock_holmes.json next to this file.
- Returns:
  A list of serialised EducationResource dicts.
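The read-through cache described for bulk_import can be illustrated with a minimal sketch. The helper name load_or_fetch and the injected fetch callable are assumptions made for the example; the module's real code may be organised differently.

```python
import json
from collections.abc import Callable
from pathlib import Path
from typing import Any


def load_or_fetch(
    cache_path: Path, fetch: Callable[[], list[dict[str, Any]]]
) -> list[dict[str, Any]]:
    """Return cached records if the file exists; otherwise fetch and cache them."""
    if cache_path.exists():
        # Cache hit: no network call is made.
        return json.loads(cache_path.read_text(encoding="utf-8"))
    records = fetch()
    # Cache miss: persist the serialised records for later runs.
    cache_path.write_text(
        json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return records
```

In this sketch the cache is keyed only by its path, so a call with a different query but the same cache_path would return the earlier results. Passing a distinct cache_path per query avoids that.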
- project_gutenberg.bulk_import.here