API and Data Handoff
This is a handoff snapshot, not the product SSOT.
Product SSOT: mall-docs report-notebooklm docs, snapshot date: 2026-06-03.
Current Implementation Status
Implemented in this repository:
- FastAPI app under /api/report-notebooklm/v1.
- SQLAlchemy model layer for the Phase 1 table set.
- Alembic initial migration.
- Seed import script with institutions, reports, modules, audio assets, users, favorites, and playback-progress fixtures.
- Public read endpoints for health, feeds, reports, modules, institutions, and listen list.
- Tests covering seed counts, public response shape, module visibility, gray-source handling, and listen behavior.
Not implemented yet:
- Auth APIs.
- Personal state APIs.
- Audio stream signing endpoint.
- Outbound events endpoint.
- Internal management APIs.
- Real Redis cache invalidation policy.
- Real object-storage signed URL policy.
- Production pagination/cursor behavior beyond seed-scale responses.
Data Tables
| Table | Purpose | Current model |
|---|---|---|
| institutions | Institution profile, source tier, website, topics, credibility notes. | Implemented |
| reports | Report master record, source, topics, publication state, cache version. | Implemented |
| raw_artifacts | NotebookLM artifact metadata and object-storage references. | Implemented as metadata only |
| display_artifacts | Reviewed display version metadata for App consumption. | Implemented |
| display_modules | Detail-page modules, sort order, visibility, content or content reference. | Implemented |
| audio_assets | Audio metadata and object-storage key. | Implemented |
| related_news | Related-source candidates and reviewed related items. | Implemented |
| users | User account records. | Implemented as seed model, no auth routes |
| favorites | User report favorites. | Implemented as seed model, no API routes |
| reading_history | User reading/history events. | Implemented as model, no API routes |
| saved_listens | User saved-listen records. | Implemented as model, no API routes |
| playback_progress | Playback progress sync records. | Implemented as seed model, no API routes |
| outbound_events | External attribution events. | Implemented as model, no API route |
Public API Implemented
Prefix: /api/report-notebooklm/v1
| Method | Path | Purpose |
|---|---|---|
| GET | /health | Service health. |
| GET | /feed/recommended | Published report cards for recommendation feed. |
| GET | /reports | Published report cards with basic filters. |
| GET | /reports/{report_id} | Report detail skeleton and published modules. |
| GET | /reports/{report_id}/modules/{module_id} | Full content for a visible module. |
| GET | /institutions | Active institution list. |
| GET | /institutions/{institution_id} | Institution detail with latest/recent reports. |
| GET | /listen | Published audio-backed report list. |
Current filters:
- /reports: topic, institution_id, has_audio, source_tier, q, page_size.
- /institutions: topic, source_tier, page_size.
- /feed/recommended and /listen: page_size.
Current pagination is seed-scale. Responses return next_cursor: null and has_more: false.
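The filter and pagination behavior described above can be sketched roughly as follows. This is an illustration only, not the repository's code: the `Page` class, the `items` field name, the `list_reports` helper, and the default `page_size` of 20 are all assumptions. Only the filter names, `next_cursor`, and `has_more` come from this document.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Page:
    """Seed-scale list envelope. `items` is an assumed field name."""
    items: list
    next_cursor: Optional[str] = None  # always null at seed scale
    has_more: bool = False             # always false at seed scale


def list_reports(reports: list, *, topic: Optional[str] = None,
                 has_audio: Optional[bool] = None, page_size: int = 20) -> Page:
    """Apply two of the documented /reports filters, then cap at page_size."""
    matched = [
        r for r in reports
        if (topic is None or topic in r["topics"])
        and (has_audio is None or r["has_audio"] == has_audio)
    ]
    # No cursor is produced: the whole result is assumed to fit in one page.
    return Page(items=matched[:page_size])
```

Once real cursor pagination lands, only the envelope values would need to change; clients that already read next_cursor and has_more should not need a new response shape.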
Planned Public API
The Phase 1 contract also expects:
| Method | Path | Purpose |
|---|---|---|
GET |
/audio/{audio_id}/stream |
Return short-lived playable URL. |
POST |
/outbound/events |
Persist external attribution click event. |
The audio stream endpoint must not return a permanent object-storage URL. The planned behavior is a backend-signed, short-lived playback URL with no download URL.
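One way to picture a short-lived playback URL is an HMAC-signed link with an expiry timestamp. The sketch below is a generic illustration of that idea and is not the planned object-storage signing policy, which this document lists as not yet implemented. The secret, the host `media.example.com`, the query parameter names, and the 300-second default are all hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import urlencode

SECRET = b"dev-only-secret"  # hypothetical; a real key would come from config


def _signature(audio_id: str, expires: int) -> str:
    message = f"{audio_id}:{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_stream_url(audio_id: str, ttl_seconds: int = 300,
                    now: Optional[int] = None) -> str:
    """Build a playback URL that stops verifying ttl_seconds after `now`."""
    issued = int(time.time()) if now is None else now
    expires = issued + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(audio_id, expires)})
    # Hypothetical media host and path layout.
    return f"https://media.example.com/audio/{audio_id}?{query}"


def verify_stream_url(audio_id: str, expires: int, sig: str,
                      now: Optional[int] = None) -> bool:
    """Reject expired links and signatures that do not match."""
    current = int(time.time()) if now is None else now
    if current > expires:
        return False
    return hmac.compare_digest(_signature(audio_id, expires), sig)
```

Because the expiry is part of the signed message, a client cannot extend a link by editing the `expires` parameter.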
Planned Auth and Personal State API
Auth:
- POST /auth/phone/start
- POST /auth/phone/verify
- POST /auth/wechat
- POST /auth/apple
Personal state:
- GET /me
- GET /me/favorites
- POST /me/favorites
- DELETE /me/favorites/{report_id}
- GET /me/history
- POST /me/history
- GET /me/listens/saved
- POST /me/listens/saved
- DELETE /me/listens/saved/{audio_id}
- POST /me/playback-progress
- GET /me/playback-progress/{audio_id}
These endpoints are contract-level requirements but are not implemented in this scaffold.
Planned Internal API
Internal APIs should require both a service token and a network allowlist. They must never be exposed to the App.
- POST /internal/reports
- POST /internal/reports/{report_id}/raw-artifacts
- GET /internal/reports/{report_id}/raw-artifacts
- POST /internal/reports/{report_id}/display-artifacts
- PATCH /internal/modules/{module_id}
- POST /internal/reports/{report_id}/publish
- POST /internal/reports/{report_id}/hide
- POST /internal/related-news/candidates
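A guard for these routes could combine the two checks along the following lines. This is a minimal sketch under stated assumptions: the token value, the `10.0.0.0/8` allowlist, and the function name are hypothetical, and a real deployment would load both from configuration.

```python
import hmac
import ipaddress

SERVICE_TOKEN = "dev-only-token"                          # hypothetical value
ALLOWED_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]   # hypothetical allowlist


def is_internal_request_allowed(token: str, client_ip: str) -> bool:
    """Return True only when both the token and the source network match."""
    # Constant-time comparison avoids leaking token contents through timing.
    token_ok = hmac.compare_digest(token.encode(), SERVICE_TOKEN.encode())
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable address: treat as outside the allowlist
    network_ok = any(address in network for network in ALLOWED_NETWORKS)
    return token_ok and network_ok
```

Requiring both conditions means a leaked token alone is not enough to reach the internal surface from outside the allowed network.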
Publishing should update report display status, update has_audio, bump cache_version, and clear related cache keys.
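The four publishing steps can be sketched against plain dictionaries. The field name `display_status`, the value `"published"`, the numeric-string form of cache_version, and the `report:{report_id}:` cache key prefix are all assumptions made for illustration; the real model and Redis key scheme may differ.

```python
def publish_report(report: dict, audio_assets: list, cache: dict) -> dict:
    """Apply the four publish side effects to in-memory stand-ins."""
    # 1. Update display status (assumed field name and value).
    report["display_status"] = "published"

    # 2. Recompute has_audio from the audio assets linked to this report.
    report["has_audio"] = any(
        asset["report_id"] == report["report_id"] for asset in audio_assets
    )

    # 3. Bump cache_version (assumed here to be a numeric string).
    report["cache_version"] = str(int(report["cache_version"]) + 1)

    # 4. Clear related cache keys (hypothetical key prefix).
    prefix = f"report:{report['report_id']}:"
    for key in [k for k in cache if k.startswith(prefix)]:
        del cache[key]
    return report
```

In a database-backed implementation the first three steps would normally share one transaction, with cache clearing done after commit so readers never repopulate the cache from stale rows.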
Public vs Internal Fields
Public responses may expose:
- Report identity, title, subtitle, one-liner, topics, institution card, release time, source tier, interpretation label, has_audio, and cache_version.
- Detail source note, source URL where allowed, risk disclaimer, and published display modules.
- Module metadata needed by the client: module_id, type, layer, render_mode, has_detail_page, is_publish_blocking, requires_human_review, sort_order, title_cn, content, preview, content_ref, content_etag.
Public responses must not expose:
- Raw artifact payload.
- Object-storage private paths for raw artifacts.
- NotebookLM notebook IDs, source IDs, conversation IDs, or local account information.
- Local filesystem paths.
- display_version or module.version.
- User phone hash, WeChat OpenID, Apple user ID, or auth internals.
The public cache contract is a single cache_version string. display_version and module version are server-internal fields only.
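An allowlist is one straightforward way to keep internal fields out of public responses: copy only the named public fields and drop everything else by default. The sketch below uses the module field names listed above; the function name and the dictionary-based row are assumptions, not the repository's serializer.

```python
# Public module fields, taken from the list in this section.
PUBLIC_MODULE_FIELDS = (
    "module_id", "type", "layer", "render_mode", "has_detail_page",
    "is_publish_blocking", "requires_human_review", "sort_order",
    "title_cn", "content", "preview", "content_ref", "content_etag",
)


def to_public_module(row: dict) -> dict:
    """Copy only allowlisted fields; anything unlisted is dropped."""
    return {name: row[name] for name in PUBLIC_MODULE_FIELDS if name in row}
```

With an allowlist, a newly added internal column stays private until someone deliberately adds it to the tuple, which is safer than maintaining a list of fields to strip.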
Seed Data
The seed importer currently creates:
- 18 institutions.
- 27 reports, including one NotebookLM sample report and multiple boundary cases.
- 15 audio assets.
- More than 120 display modules.
- Test users, favorites, and playback progress.
Seed boundary cases intentionally cover:
- Reports with audio and reports without audio.
- Hidden/unpublished report behavior.
- Gray broker source with restricted source URL behavior.
- Published modules vs review-only modules.
- study_guide module replacing legacy faq.
- Heavy modules using card_plus_page preview plus full-module endpoint.
Do not treat seed content as production content. It exists to exercise app/API behavior and edge cases.
Detail Module Model
The detail page uses a skeleton plus module model:
- Inline modules include small content directly in the detail response.
- Heavy modules use render_mode=card_plus_page, return preview in detail, and load full content from /reports/{report_id}/modules/{module_id}.
- Unknown future module types should not break the App; they should fall back to hidden or generic rendering.
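The split between inline and heavy modules, plus the unknown-type fallback, might look like this in outline. The `detail_path` field, the renderer names, and both helper functions are hypothetical; only render_mode=card_plus_page, preview, content, and the module endpoint path come from this document.

```python
API_PREFIX = "/api/report-notebooklm/v1"


def detail_entry(report_id: str, module: dict) -> dict:
    """Shape one module for the detail response (server side)."""
    entry = {
        "module_id": module["module_id"],
        "type": module["type"],
        "render_mode": module["render_mode"],
    }
    if module["render_mode"] == "card_plus_page":
        # Heavy module: ship only the preview; full content loads on demand.
        entry["preview"] = module["preview"]
        entry["detail_path"] = (  # hypothetical convenience field
            f"{API_PREFIX}/reports/{report_id}/modules/{module['module_id']}"
        )
    else:
        # Inline module: small content travels with the detail response.
        entry["content"] = module["content"]
    return entry


# Hypothetical client-side renderer registry.
RENDERERS = {"basic_info": "basic_info_card", "timeline": "timeline_card"}


def pick_renderer(module_type: str) -> str:
    """Fall back to a generic card when the module type is not recognized."""
    return RENDERERS.get(module_type, "generic_card")
```

The fallback lives on the client because the server may start emitting a new module type before every App version knows how to render it.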
Core module types:
- basic_info
- executive_overview
- core_insights
- key_data
- source_compliance
- institution
- differentiated_view
- weaknesses
- timeline
- study_guide
- structure_graph
- related_sources
- infographic
- audio
- research_discovery