Content Pipeline Handoff
This is a handoff snapshot, not the product SSOT.
Product SSOT: mall-docs report-notebooklm docs, snapshot date: 2026-06-03.
Content Principle
Use NotebookLM as a source-driven research engine, not as a generic rewriting model.
The pipeline may orchestrate, clean, validate, map, and review NotebookLM-native artifacts. It must not silently replace missing NotebookLM artifacts with locally rewritten publishable content.
Source Inputs
Phase 1 content is based on public or authorized institutional research reports. Priority source categories:
- Official public sources.
- Authorized partner sources.
- Gray broker public sources, with stricter review and source display handling.
Vision source lists, tiering, and historical source-health experience may be used as reference material. Production data must not depend on a local Vision runtime, local path, local cache, or local account state.
NotebookLM Workflow
Recommended report run order:
- Inspect the source PDF: title, institution, date, page count, size, and report type.
- Create or reuse one notebook for one report source unless a multi-report synthesis is explicitly planned.
- Upload the report source.
- Generate the P0 text package:
  - source description
  - native Briefing Doc
  - native Blog Post
  - data table
  - query dimensions
  - query key data
  - query divergence
  - query weaknesses
- Generate useful P1 artifacts:
  - query timeline
  - query related sources
  - Study Guide
  - mind map, if download succeeds
- Generate P2 artifacts asynchronously:
  - infographic candidate
  - audio brief
  - research discovery
- Persist every artifact status in a manifest.
- Deterministically assemble display modules from reviewed artifacts.
- Run human review before publishing.
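The tiered order above can be written down as plain data so a runner walks it the same way every time. This is a minimal sketch, not the pipeline's actual code: the `RUN_PLAN` name, the dict layout, and the mapping of "source description" to `source_summary` and "Study Guide" to `native_study_guide` are assumptions made for illustration.

```python
# Hypothetical run plan: tiers and order follow the list above.
# Artifact identifiers reuse the schema names where the mapping is obvious;
# "source description" -> source_summary is an assumption.
RUN_PLAN = {
    "P0": [
        "source_summary",
        "native_briefing_doc",
        "native_blog_post",
        "data_table",
        "query_dimensions",
        "query_key_data",
        "query_divergence",
        "query_weaknesses",
    ],
    "P1": [
        "query_timeline",
        "query_related_sources",
        "native_study_guide",
        "mind_map",
    ],
    "P2": [
        "infographic",
        "audio_brief",
        "research_discovery",
    ],
}


def ordered_steps(plan=RUN_PLAN):
    """Yield (tier, artifact_type) pairs: all of P0, then P1, then P2."""
    for tier in ("P0", "P1", "P2"):
        for artifact_type in plan[tier]:
            yield tier, artifact_type
```

Keeping the plan as data rather than as hard-coded calls makes it easy to record one manifest entry per step and to resume a run from the first step that has no recorded status.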
Artifact Types
The Phase 1 schema supports these NotebookLM artifact types:
| Artifact type | Purpose | Publish blocking | Human review |
|---|---|---|---|
| `source_summary` | Source-level summary. | No | No |
| `notebook_summary` | Notebook-level summary. | No | No |
| `native_briefing_doc` | Native briefing document. | Yes | No |
| `native_blog_post` | Native blog post. | Yes | No |
| `native_study_guide` | FAQ, study guide, glossary. | No | No |
| `data_table` | Structured table data. | Yes | No |
| `mind_map` | Mind map or graph source. | No | No |
| `query_dimensions` | Analysis dimensions. | Yes | No |
| `query_key_data` | Key data points. | Yes | No |
| `query_divergence` | Views that diverge from consensus. | No | No |
| `query_weaknesses` | Weaknesses and open questions. | No | No |
| `query_timeline` | Timeline and turning points. | No | No |
| `query_related_sources` | Related source candidates. | No | Yes |
| `research_discovery` | Enrichment queue. | No | Yes |
| `infographic` | Candidate public image. | No | Yes |
| `audio_brief` | Listening preview or audio source. | No | No |
Artifact records should keep status, object reference, format, size, hash, generated time, error, and review flags. Raw payloads should stay in object storage and remain internal.
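As an illustration of the record shape described above, the sketch below stores only metadata and an object reference, never the payload itself. The field names, the status values, and the `obj://` reference style are hypothetical; the real schema may differ.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArtifactRecord:
    """Manifest entry for one artifact. Field names are assumptions."""
    artifact_type: str
    status: str                        # e.g. "pending", "succeeded", "failed" (hypothetical values)
    object_ref: Optional[str] = None   # pointer into object storage, not the payload
    format: Optional[str] = None
    size_bytes: Optional[int] = None
    sha256: Optional[str] = None
    generated_at: Optional[str] = None
    error: Optional[str] = None
    needs_human_review: bool = False
    review_approved: bool = False


def record_for_payload(artifact_type, payload, object_ref, fmt, generated_at):
    """Build a record from raw bytes; the bytes themselves are not stored on it."""
    return ArtifactRecord(
        artifact_type=artifact_type,
        status="succeeded",
        object_ref=object_ref,
        format=fmt,
        size_bytes=len(payload),
        sha256=hashlib.sha256(payload).hexdigest(),
        generated_at=generated_at,
    )
```

Because the record carries a hash and size instead of content, it can be serialized into a manifest or an internal API response without exposing the raw payload.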
Module Mapping
| Product module | Primary artifact sources | Notes |
|---|---|---|
| `basic_info` | Source metadata and source summary. | P0, inline. |
| `executive_overview` | Briefing Doc and Blog Post. | P0, heavy card plus page. |
| `core_insights` | Briefing Doc and query dimensions. | P0, inline with optional detail page. |
| `key_data` | Data table and query key data. | P0, heavy card plus page. |
| `source_compliance` | Source metadata and review notes. | P0, inline, must include disclaimer. |
| `institution` | Institution record. | P0, inline. |
| `differentiated_view` | Query divergence. | P1, optional. |
| `weaknesses` | Query weaknesses. | P1, optional, avoid investment-advice wording. |
| `timeline` | Query timeline. | P1, optional. |
| `study_guide` | Native Study Guide. | P1, optional, replaces legacy `faq`. |
| `structure_graph` | Mind map or deterministic fallback. | P1, optional. |
| `related_sources` | Related-source query and review queue. | P1, review required before display. |
| `infographic` | Infographic candidate. | P2, review required before display. |
| `audio` | Audio brief or reviewed audio asset. | P2, not required for text publish. |
| `research_discovery` | Research discovery queue. | P2, internal or reviewed only. |
Publish Gates
Blocking before public release:
- Source upload succeeded and is traceable.
- Required P0 text artifacts exist and have usable content.
- `basic_info`, `executive_overview`, `core_insights`, `key_data`, and `source_compliance` are present unless a product decision allows a partial report.
- Display artifact is reviewed and approved.
- Source attribution and risk disclaimer are present.
- No raw artifact payload, local path, private notebook ID, or account information appears in public responses.
Non-blocking:
- Mind map.
- Study guide.
- Timeline.
- Related-source candidates.
- Research discovery.
- Infographic.
- Audio.
If optional artifacts fail, record the failure and continue without inventing fallback public copy. Deterministic fallback is allowed for structure graph from already available artifacts.
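The blocking and non-blocking rules above can be sketched as a single check that returns the reasons a report cannot be published. This is an illustration under assumptions: the `report` dict keys (`source_uploaded`, `artifacts`, `modules`, `allow_partial`, `display_approved`, `has_attribution`, `has_disclaimer`) are hypothetical, and the leak check for raw payloads, local paths, notebook IDs, and account information is left out because it needs to inspect the actual public response.

```python
# Artifact types marked "Publish blocking: Yes" in the artifact table.
BLOCKING_ARTIFACTS = (
    "native_briefing_doc",
    "native_blog_post",
    "data_table",
    "query_dimensions",
    "query_key_data",
)

# Modules that must be present unless a partial report is explicitly allowed.
REQUIRED_MODULES = (
    "basic_info",
    "executive_overview",
    "core_insights",
    "key_data",
    "source_compliance",
)


def publish_blockers(report):
    """Return a list of human-readable reasons that block public release."""
    blockers = []
    if not report.get("source_uploaded"):
        blockers.append("source upload missing or untraceable")
    artifacts = report.get("artifacts", {})
    for name in BLOCKING_ARTIFACTS:
        if artifacts.get(name) != "succeeded":
            blockers.append(f"P0 artifact not usable: {name}")
    if not report.get("allow_partial"):
        modules = report.get("modules", ())
        for name in REQUIRED_MODULES:
            if name not in modules:
                blockers.append(f"required module missing: {name}")
    if not report.get("display_approved"):
        blockers.append("display artifact not reviewed and approved")
    if not (report.get("has_attribution") and report.get("has_disclaimer")):
        blockers.append("source attribution or risk disclaimer missing")
    # Optional artifacts (mind map, audio, ...) are deliberately not inspected here:
    # their failures are recorded in the manifest but never block release.
    return blockers
```

An empty list means the listed gates pass. A failed `mind_map` or `audio_brief` entry in `artifacts` has no effect on the result, which matches the non-blocking list.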
Cadence Notes
NotebookLM operations should be conservative by default:
- One active NotebookLM operation per account.
- Text artifacts first.
- Media artifacts after text success.
- Heavy media should not block publishable text.
- On transient failure, retry once; if an optional artifact fails again, mark it failed and continue.
The seed importer is not a production runner. A production runner should persist manifests after every operation and support resumable review/import.
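A rough sketch of the retry-once rule and the persist-after-every-operation requirement follows. `TransientError`, the `persist` callback, and the manifest layout are hypothetical names introduced for illustration; operations run strictly one at a time, in line with the one-active-operation rule.

```python
import json


class TransientError(Exception):
    """Hypothetical marker for failures that are worth one retry."""


def run_with_one_retry(operation):
    """Call operation(); retry once on TransientError.

    Returns ("succeeded", result) or ("failed", error_message).
    """
    last_error = None
    for _attempt in range(2):  # first try plus a single retry
        try:
            return "succeeded", operation()
        except TransientError as exc:
            last_error = str(exc)
    return "failed", last_error


def run_optional_artifacts(operations, persist):
    """Run optional artifact operations sequentially.

    `operations` maps artifact type to a zero-argument callable.
    `persist` receives the serialized manifest after every operation,
    so an interrupted run can be resumed from the last saved state.
    """
    manifest = {}
    for artifact_type, operation in operations.items():
        status, detail = run_with_one_retry(operation)
        manifest[artifact_type] = {
            "status": status,
            "error": detail if status == "failed" else None,
        }
        persist(json.dumps(manifest, indent=2))
    return manifest
```

In a real runner, `persist` would write to durable storage; passing it in as a callback keeps the retry logic testable without touching the filesystem.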
Human Review
Review is mandatory for:
- Gray broker sources.
- Related-source candidate display.
- Infographic or generated media.
- Any content where citations/page labels are ambiguous.
- Any copy that could be interpreted as investment advice.
Do not display raw NotebookLM page labels until they are normalized against verifiable source pages or sections.
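The mandatory-review conditions above could be collected into one predicate that also reports why review is needed. This is only a sketch: the `ReviewContext` fields and the source category labels are assumptions, and how ambiguity or investment-advice risk gets detected is outside its scope.

```python
from dataclasses import dataclass


@dataclass
class ReviewContext:
    """Hypothetical inputs for the review decision."""
    source_category: str   # e.g. "official", "authorized_partner", "gray_broker" (assumed labels)
    artifact_type: str
    is_generated_media: bool = False
    citations_ambiguous: bool = False
    may_read_as_investment_advice: bool = False


def review_reasons(ctx):
    """Return the reasons human review is mandatory; an empty list means none apply."""
    reasons = []
    if ctx.source_category == "gray_broker":
        reasons.append("gray broker source")
    if ctx.artifact_type == "query_related_sources":
        reasons.append("related-source candidate display")
    if ctx.artifact_type == "infographic" or ctx.is_generated_media:
        reasons.append("infographic or generated media")
    if ctx.citations_ambiguous:
        reasons.append("ambiguous citations or page labels")
    if ctx.may_read_as_investment_advice:
        reasons.append("possible investment advice")
    return reasons
```

Returning reasons rather than a bare boolean lets the review queue show reviewers what to look at first.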