Content Pipeline Handoff

This is a handoff snapshot, not the product SSOT.

Product SSOT: mall-docs report-notebooklm docs, snapshot date: 2026-06-03.

Content Principle

Use NotebookLM as a source-driven research engine, not as a generic rewriting model.

The pipeline may orchestrate, clean, validate, map, and review NotebookLM-native artifacts. It must not silently replace missing NotebookLM artifacts with locally rewritten publishable content.

Source Inputs

Phase 1 content is based on public or authorized institutional research reports. Priority source categories:

Official public sources.
Authorized partner sources.
Gray broker public sources, with stricter review and source display handling.

Vision source lists, tiering, and historical source-health experience may be used as reference material. Production data must not depend on a local Vision runtime, local path, local cache, or local account state.

NotebookLM Workflow

Recommended report run order:

1. Inspect the source PDF: title, institution, date, page count, size, and report type.
2. Create or reuse one notebook for one report source unless a multi-report synthesis is explicitly planned.
3. Upload the report source.
4. Generate the P0 text package:
   - source description
   - native Briefing Doc
   - native Blog Post
   - data table
   - query dimensions
   - query key data
   - query divergence
   - query weaknesses
5. Generate useful P1 artifacts:
   - query timeline
   - query related sources
   - Study Guide
   - mind map, if download succeeds
6. Generate P2 artifacts asynchronously:
   - infographic candidate
   - audio brief
   - research discovery
7. Persist every artifact status in a manifest.
8. Deterministically assemble display modules from reviewed artifacts.
9. Run human review before publishing.
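
The run order above can be sketched as a tiered plan that records a status for every artifact. This is a minimal illustration, not the production runner: `RUN_PLAN`, `run_report`, and the `generate` callback are hypothetical names, and mapping "source description" to the `source_summary` type is an assumption.

```python
from typing import Callable, Dict

# Tiers and artifact types follow the run order above.
# Assumption: "source description" corresponds to the `source_summary` type.
RUN_PLAN = {
    "P0": ["source_summary", "native_briefing_doc", "native_blog_post", "data_table",
           "query_dimensions", "query_key_data", "query_divergence", "query_weaknesses"],
    "P1": ["query_timeline", "query_related_sources", "native_study_guide", "mind_map"],
    "P2": ["infographic", "audio_brief", "research_discovery"],
}


def run_report(generate: Callable[[str], str], manifest: Dict[str, dict]) -> Dict[str, dict]:
    """Run artifacts tier by tier and record a status for each one.

    `generate` is a hypothetical callback that returns an object-storage
    reference for the artifact, or raises on failure.
    """
    for tier, names in RUN_PLAN.items():
        for name in names:
            try:
                ref = generate(name)
                manifest[name] = {"tier": tier, "status": "succeeded", "object_ref": ref}
            except Exception as exc:
                # Failures are recorded, never replaced with locally written copy.
                manifest[name] = {"tier": tier, "status": "failed", "error": str(exc)}
    return manifest
```

A real runner would also run P2 asynchronously and hand reviewed artifacts to module assembly; this sketch only shows ordering and status recording.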

Artifact Types

The Phase 1 schema supports these NotebookLM artifact types:

Artifact type	Purpose	Publish blocking	Human review
`source_summary`	Source-level summary.	No	No
`notebook_summary`	Notebook-level summary.	No	No
`native_briefing_doc`	Native briefing document.	Yes	No
`native_blog_post`	Native blog post.	Yes	No
`native_study_guide`	FAQ, study guide, glossary.	No	No
`data_table`	Structured table data.	Yes	No
`mind_map`	Mind map or graph source.	No	No
`query_dimensions`	Analysis dimensions.	Yes	No
`query_key_data`	Key data points.	Yes	No
`query_divergence`	Views that diverge from consensus.	No	No
`query_weaknesses`	Weaknesses and open questions.	No	No
`query_timeline`	Timeline and turning points.	No	No
`query_related_sources`	Related source candidates.	No	Yes
`research_discovery`	Enrichment queue.	No	Yes
`infographic`	Candidate public image.	No	Yes
`audio_brief`	Listening preview or audio source.	No	No

Artifact records should keep status, object reference, format, size, hash, generated time, error, and review flags. Raw payloads should stay in object storage and remain internal.
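
As an illustration of the record fields listed above, the following sketch models one artifact record. The class name, field names, and status values are assumptions, not the actual schema; the raw payload is only hashed and measured, never stored on the record.

```python
import hashlib
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ArtifactRecord:
    artifact_type: str
    status: str                        # assumed values: "pending", "succeeded", "failed"
    object_ref: Optional[str] = None   # key in object storage, not a local path
    format: Optional[str] = None
    size_bytes: Optional[int] = None
    sha256: Optional[str] = None
    generated_at: Optional[str] = None
    error: Optional[str] = None
    needs_review: bool = False
    reviewed: bool = False


def record_for_payload(artifact_type: str, payload: bytes, object_ref: str,
                       fmt: str, needs_review: bool = False) -> ArtifactRecord:
    """Build a record from a payload without keeping the payload itself."""
    return ArtifactRecord(
        artifact_type=artifact_type,
        status="succeeded",
        object_ref=object_ref,
        format=fmt,
        size_bytes=len(payload),
        sha256=hashlib.sha256(payload).hexdigest(),
        generated_at=datetime.now(timezone.utc).isoformat(),
        needs_review=needs_review,
    )
```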

Module Mapping

Product module	Primary artifact sources	Notes
`basic_info`	Source metadata and source summary.	P0, inline.
`executive_overview`	Briefing Doc and Blog Post.	P0, heavy card plus page.
`core_insights`	Briefing Doc and query dimensions.	P0, inline with optional detail page.
`key_data`	Data table and query key data.	P0, heavy card plus page.
`source_compliance`	Source metadata and review notes.	P0, inline, must include disclaimer.
`institution`	Institution record.	P0, inline.
`differentiated_view`	Query divergence.	P1, optional.
`weaknesses`	Query weaknesses.	P1, optional, avoid investment-advice wording.
`timeline`	Query timeline.	P1, optional.
`study_guide`	Native Study Guide.	P1, optional, replaces legacy `faq`.
`structure_graph`	Mind map or deterministic fallback.	P1, optional.
`related_sources`	Related-source query and review queue.	P1, review required before display.
`infographic`	Infographic candidate.	P2, review required before display.
`audio`	Audio brief or reviewed audio asset.	P2, not required for text publish.
`research_discovery`	Research discovery queue.	P2, internal or reviewed only.
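
The mapping above can be expressed as a lookup table plus a deterministic assembly step. This is a sketch under assumptions: only artifact-backed sources are listed (source metadata, institution records, and review notes come from elsewhere), and a module is assumed to need all of its listed artifacts. `MODULE_SOURCES` and `assemble_modules` are hypothetical names.

```python
from typing import Dict, List

# Artifact-backed sources per module, following the table above.
MODULE_SOURCES: Dict[str, List[str]] = {
    "basic_info": ["source_summary"],
    "executive_overview": ["native_briefing_doc", "native_blog_post"],
    "core_insights": ["native_briefing_doc", "query_dimensions"],
    "key_data": ["data_table", "query_key_data"],
    "differentiated_view": ["query_divergence"],
    "weaknesses": ["query_weaknesses"],
    "timeline": ["query_timeline"],
    "study_guide": ["native_study_guide"],
    "structure_graph": ["mind_map"],
    "related_sources": ["query_related_sources"],
    "infographic": ["infographic"],
    "audio": ["audio_brief"],
    "research_discovery": ["research_discovery"],
}


def assemble_modules(reviewed: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Map reviewed artifact references onto modules.

    `reviewed` maps artifact type to an object reference. Iterating in sorted
    order keeps the output identical for identical input.
    """
    modules: Dict[str, Dict[str, str]] = {}
    for module in sorted(MODULE_SOURCES):
        sources = MODULE_SOURCES[module]
        if all(s in reviewed for s in sources):
            modules[module] = {s: reviewed[s] for s in sources}
    return modules
```

Modules whose artifacts are missing are simply absent from the result; nothing is synthesized to fill the gap.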

Publish Gates

Blocking before public release:

Source upload succeeded and is traceable.
Required P0 text artifacts exist and have usable content.
`basic_info`, `executive_overview`, `core_insights`, `key_data`, and `source_compliance` are present unless a product decision allows a partial report.
Display artifact is reviewed and approved.
Source attribution and risk disclaimer are present.
No raw artifact payload, local path, private notebook ID, or account information appears in public responses.
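
The blocking gates above could be checked with a function that returns every unmet condition. This is a rough sketch: the `report` dictionary shape, its key names, and the local-path pattern are all assumptions, and a real leak check would also cover notebook IDs and account details.

```python
import re
from typing import List

REQUIRED_MODULES = ("basic_info", "executive_overview", "core_insights",
                    "key_data", "source_compliance")
# Artifact types marked publish-blocking in the schema table.
BLOCKING_ARTIFACTS = ("native_briefing_doc", "native_blog_post", "data_table",
                      "query_dimensions", "query_key_data")
# Hypothetical pattern for local filesystem paths.
LOCAL_PATH = re.compile(r"(/Users/|/home/|[A-Za-z]:\\)")


def publish_blockers(report: dict) -> List[str]:
    """Return a list of reasons the report cannot be published (empty if none)."""
    blockers: List[str] = []
    if not report.get("source_uploaded"):
        blockers.append("source upload not confirmed")
    artifacts = report.get("artifacts", {})
    for name in BLOCKING_ARTIFACTS:
        if not str(artifacts.get(name, "")).strip():
            blockers.append(f"P0 artifact missing or empty: {name}")
    if not report.get("partial_allowed"):
        for module in REQUIRED_MODULES:
            if module not in report.get("modules", ()):
                blockers.append(f"required module missing: {module}")
    if not report.get("display_approved"):
        blockers.append("display artifact not approved")
    if not (report.get("has_attribution") and report.get("has_disclaimer")):
        blockers.append("source attribution or risk disclaimer missing")
    if LOCAL_PATH.search(report.get("public_body", "")):
        blockers.append("public response contains a local path")
    return blockers
```

Returning all blockers at once, rather than failing on the first, gives reviewers a complete list per report.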

Non-blocking:

Mind map.
Study guide.
Timeline.
Related-source candidates.
Research discovery.
Infographic.
Audio.

If optional artifacts fail, record the failure and continue without inventing fallback public copy. The one exception is the structure graph, which may use a deterministic fallback built from artifacts that are already available.
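
One possible shape for the structure graph fallback is sketched below. It assumes the reviewed query dimensions are available as a list of strings and that a simple root-and-children graph is acceptable; the function name and the node and edge format are hypothetical.

```python
from typing import Dict, List, Optional


def structure_graph(title: str, mind_map: Optional[dict],
                    dimensions: List[str]) -> Optional[Dict[str, list]]:
    """Prefer the native mind map; otherwise derive a graph from dimensions."""
    if mind_map is not None:
        return mind_map
    # Deduplicate while preserving input order so the result is reproducible.
    labels = list(dict.fromkeys(d.strip() for d in dimensions if d.strip()))
    if not labels:
        return None  # nothing to build from: record the failure, show no graph
    nodes = [{"id": "root", "label": title}]
    edges = []
    for i, label in enumerate(labels, start=1):
        nodes.append({"id": f"dim-{i}", "label": label})
        edges.append({"from": "root", "to": f"dim-{i}"})
    return {"nodes": nodes, "edges": edges}
```

The fallback only rearranges labels that already exist in reviewed artifacts, so it adds no new public copy.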

Cadence Notes

NotebookLM operations should be conservative by default:

One active NotebookLM operation per account.
Text artifacts first.
Media artifacts after text success.
Heavy media should not block publishable text.
On transient failure, retry once; if an optional artifact fails again, mark it failed and continue.
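
The retry rule above might look like the following. `TransientError` and `run_with_retry` are hypothetical names; how the pipeline actually classifies a failure as transient is not specified here.

```python
from typing import Callable, Optional


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


def run_with_retry(op: Callable[[], object], *, optional: bool, retries: int = 1) -> dict:
    """Run `op`, retrying transient failures up to `retries` extra times.

    Optional artifacts that still fail are marked failed so the run can
    continue; required artifacts re-raise so the caller can stop.
    """
    last: Optional[TransientError] = None
    for attempt in range(retries + 1):
        try:
            return {"status": "succeeded", "value": op(), "attempts": attempt + 1}
        except TransientError as exc:
            last = exc
    if optional:
        return {"status": "failed", "error": str(last), "attempts": retries + 1}
    raise last
```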

The seed importer is not a production runner. A production runner should persist manifests after every operation and support resumable review/import.
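
A minimal sketch of per-operation persistence and resume follows. A JSON file stands in for whatever store a production runner would use, and `save_manifest`, `load_manifest`, and `run_resumable` are hypothetical names.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def save_manifest(path: Path, manifest: Dict[str, dict]) -> None:
    """Write to a temp file, then atomically swap it into place."""
    fd, tmp = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(manifest, fh, indent=2, sort_keys=True)
    os.replace(tmp, path)


def load_manifest(path: Path) -> Dict[str, dict]:
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}


def run_resumable(path: Path, names: List[str],
                  generate: Callable[[str], str]) -> Dict[str, dict]:
    """Skip artifacts that already succeeded; persist after every operation."""
    manifest = load_manifest(path)
    for name in names:
        if manifest.get(name, {}).get("status") == "succeeded":
            continue
        try:
            manifest[name] = {"status": "succeeded", "object_ref": generate(name)}
        except Exception as exc:
            manifest[name] = {"status": "failed", "error": str(exc)}
        save_manifest(path, manifest)
    return manifest
```

Because the manifest is saved after each artifact, an interrupted run can restart and redo only the entries that are missing or failed.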

Human Review

Review is mandatory for:

Gray broker sources.
Related-source candidate display.
Infographic or generated media.
Any content where citations/page labels are ambiguous.
Any copy that could be interpreted as investment advice.
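
The mandatory-review conditions above can be collected into a single predicate that reports why review is needed. The category label `gray_broker`, the `ReviewInput` fields, and the contents of `GENERATED_MEDIA` are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List

# Assumption: extend with other generated media types as the product docs require.
GENERATED_MEDIA = frozenset({"infographic"})


@dataclass(frozen=True)
class ReviewInput:
    source_category: str            # assumed labels: "official", "partner", "gray_broker"
    artifact_type: str
    citations_ambiguous: bool = False
    advice_risk: bool = False


def review_reasons(item: ReviewInput) -> List[str]:
    """Return the reasons human review is mandatory (empty list if none)."""
    reasons: List[str] = []
    if item.source_category == "gray_broker":
        reasons.append("gray broker source")
    if item.artifact_type == "query_related_sources":
        reasons.append("related-source candidate display")
    if item.artifact_type in GENERATED_MEDIA:
        reasons.append("infographic or generated media")
    if item.citations_ambiguous:
        reasons.append("ambiguous citations or page labels")
    if item.advice_risk:
        reasons.append("possible investment-advice wording")
    return reasons
```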

Do not display raw NotebookLM page labels until they are normalized against verifiable source pages or sections.
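
As one way to approach normalization, the sketch below accepts a raw label only when it resolves to a page that exists in the source PDF. The accepted label shapes ("p. 12", "page 12", "12") are assumed for illustration; the real NotebookLM label format is not specified here, and section-based labels would need their own handling.

```python
import re
from typing import Optional

# Assumed raw label shapes; anything else is treated as unverifiable.
_LABEL = re.compile(r"^\s*(?:p\.?|page)?\s*(\d+)\s*$", re.IGNORECASE)


def normalize_page_label(raw: str, page_count: int) -> Optional[int]:
    """Return a verified page number, or None if the label cannot be verified."""
    match = _LABEL.match(raw)
    if not match:
        return None
    page = int(match.group(1))
    # Only pages that exist in the inspected source PDF are displayable.
    return page if 1 <= page <= page_count else None
```

Labels that return `None` would stay hidden and be routed to human review instead of being shown as-is.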
