# Content Pipeline Handoff

This is a handoff snapshot, not the product single source of truth (SSOT).

Product SSOT: mall-docs report-notebooklm docs, snapshot date: 2026-06-03.

## Content Principle

Use NotebookLM as a source-driven research engine, not as a generic rewriting model.

The pipeline may orchestrate, clean, validate, map, and review NotebookLM-native artifacts. It must not silently replace missing NotebookLM artifacts with locally rewritten publishable content.

## Source Inputs

Phase 1 content is based on public or authorized institutional research reports. Priority source categories:

- Official public sources.
- Authorized partner sources.
- Gray broker public sources, with stricter review and source display handling.

Vision source lists, tiering, and historical source-health experience may be used as reference material. Production data must not depend on a local Vision runtime, local path, local cache, or local account state.

## NotebookLM Workflow

Recommended report run order:

1. Inspect the source PDF: title, institution, date, page count, size, and report type.
2. Create or reuse one notebook for one report source unless a multi-report synthesis is explicitly planned.
3. Upload the report source.
4. Generate the P0 text package:
   - source description
   - native Briefing Doc
   - native Blog Post
   - data table
   - query dimensions
   - query key data
   - query divergence
   - query weaknesses
5. Generate useful P1 artifacts:
   - query timeline
   - query related sources
   - Study Guide
   - mind map, if download succeeds
6. Generate P2 artifacts asynchronously:
   - infographic candidate
   - audio brief
   - research discovery
7. Persist every artifact status in a manifest.
8. Deterministically assemble display modules from reviewed artifacts.
9. Run human review before publishing.

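The run order above can be sketched as a tiered plan plus a manifest that is updated after every operation. This is a minimal illustration, not the pipeline's actual runner: the function names, the manifest shape, and the status strings are hypothetical, and mapping "source description" to `source_summary` is an assumption.

```python
import json

# Tiers follow the run order above. Mapping "source description" to
# `source_summary` is an assumption; the other names mirror the schema's
# artifact types.
RUN_PLAN = {
    "P0": ["source_summary", "native_briefing_doc", "native_blog_post",
           "data_table", "query_dimensions", "query_key_data",
           "query_divergence", "query_weaknesses"],
    "P1": ["query_timeline", "query_related_sources",
           "native_study_guide", "mind_map"],
    "P2": ["infographic", "audio_brief", "research_discovery"],
}


def new_manifest(report_id):
    """Create a manifest with every planned artifact marked pending."""
    return {
        "report_id": report_id,
        "artifacts": {
            name: {"tier": tier, "status": "pending", "error": None}
            for tier, names in RUN_PLAN.items()
            for name in names
        },
    }


def record(manifest, name, status, error=None):
    """Record the outcome of one operation (step 7 in the list above)."""
    entry = manifest["artifacts"][name]
    entry["status"] = status
    entry["error"] = error
    return manifest


def pending_in_order(manifest):
    """List artifacts that are still pending, in P0 -> P1 -> P2 order."""
    return [name for name, entry in manifest["artifacts"].items()
            if entry["status"] == "pending"]


def dump(manifest):
    """Serialize the manifest so it can be persisted after each step."""
    return json.dumps(manifest, indent=2, sort_keys=True)
```

Because the manifest is plain JSON, a runner could write `dump(manifest)` to storage after each call to `record` and resume from `pending_in_order` after an interruption.
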
## Artifact Types

The Phase 1 schema supports these NotebookLM artifact types:

| Artifact type | Purpose | Publish blocking | Human review |
|---|---|---:|---:|
| `source_summary` | Source-level summary. | No | No |
| `notebook_summary` | Notebook-level summary. | No | No |
| `native_briefing_doc` | Native briefing document. | Yes | No |
| `native_blog_post` | Native blog post. | Yes | No |
| `native_study_guide` | FAQ, study guide, glossary. | No | No |
| `data_table` | Structured table data. | Yes | No |
| `mind_map` | Mind map or graph source. | No | No |
| `query_dimensions` | Analysis dimensions. | Yes | No |
| `query_key_data` | Key data points. | Yes | No |
| `query_divergence` | Views that diverge from consensus. | No | No |
| `query_weaknesses` | Weaknesses and open questions. | No | No |
| `query_timeline` | Timeline and turning points. | No | No |
| `query_related_sources` | Related source candidates. | No | Yes |
| `research_discovery` | Enrichment queue. | No | Yes |
| `infographic` | Candidate public image. | No | Yes |
| `audio_brief` | Listening preview or audio source. | No | No |

Artifact records should keep status, object reference, format, size, hash, generated time, error, and review flags. Raw payloads should stay in object storage and remain internal.

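One way to picture an artifact record is a small dataclass that carries the fields listed above and derives the two flags from the table. This is a sketch under stated assumptions: the class name, field names, status values, and the choice of which fields count as public are all hypothetical, not the Phase 1 schema itself.

```python
from dataclasses import dataclass
from typing import Optional

# Flags copied from the artifact table above.
PUBLISH_BLOCKING = {"native_briefing_doc", "native_blog_post", "data_table",
                    "query_dimensions", "query_key_data"}
HUMAN_REVIEW = {"query_related_sources", "research_discovery", "infographic"}


@dataclass
class ArtifactRecord:
    artifact_type: str
    status: str                          # assumed values: pending/succeeded/failed
    object_ref: Optional[str] = None     # object-storage key; internal only
    format: Optional[str] = None
    size_bytes: Optional[int] = None
    sha256: Optional[str] = None
    generated_at: Optional[str] = None   # ISO 8601 timestamp (assumed)
    error: Optional[str] = None
    reviewed: bool = False

    @property
    def publish_blocking(self):
        return self.artifact_type in PUBLISH_BLOCKING

    @property
    def needs_human_review(self):
        return self.artifact_type in HUMAN_REVIEW

    def public_view(self):
        """Return only fields assumed safe to expose; the object reference,
        hash, and error text stay internal."""
        return {
            "artifact_type": self.artifact_type,
            "status": self.status,
            "format": self.format,
            "generated_at": self.generated_at,
        }
```

Keeping `object_ref` out of `public_view` reflects the rule that raw payloads remain internal; the record points at storage but never embeds the payload.
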
## Module Mapping

| Product module | Primary artifact sources | Notes |
|---|---|---|
| `basic_info` | Source metadata and source summary. | P0, inline. |
| `executive_overview` | Briefing Doc and Blog Post. | P0, heavy card plus page. |
| `core_insights` | Briefing Doc and query dimensions. | P0, inline with optional detail page. |
| `key_data` | Data table and query key data. | P0, heavy card plus page. |
| `source_compliance` | Source metadata and review notes. | P0, inline, must include disclaimer. |
| `institution` | Institution record. | P0, inline. |
| `differentiated_view` | Query divergence. | P1, optional. |
| `weaknesses` | Query weaknesses. | P1, optional, avoid investment-advice wording. |
| `timeline` | Query timeline. | P1, optional. |
| `study_guide` | Native Study Guide. | P1, optional, replaces legacy `faq`. |
| `structure_graph` | Mind map or deterministic fallback. | P1, optional. |
| `related_sources` | Related-source query and review queue. | P1, review required before display. |
| `infographic` | Infographic candidate. | P2, review required before display. |
| `audio` | Audio brief or reviewed audio asset. | P2, not required for text publish. |
| `research_discovery` | Research discovery queue. | P2, internal or reviewed only. |

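Deterministic assembly from the mapping above could look roughly like the following. The sketch covers only the modules whose sources are NotebookLM artifacts, and it assumes a module may be assembled when at least one of its sources is approved; both the function name and that rule are illustrative assumptions.

```python
# Subset of the mapping table: modules fed directly by NotebookLM artifacts.
MODULE_SOURCES = {
    "executive_overview": ["native_briefing_doc", "native_blog_post"],
    "core_insights": ["native_briefing_doc", "query_dimensions"],
    "key_data": ["data_table", "query_key_data"],
    "differentiated_view": ["query_divergence"],
    "weaknesses": ["query_weaknesses"],
    "timeline": ["query_timeline"],
    "study_guide": ["native_study_guide"],
}


def assemble_modules(approved):
    """Build modules from approved artifact content.

    `approved` maps artifact type -> reviewed content. Modules with no
    approved source are skipped rather than filled with invented copy.
    Iterating in sorted order keeps the output stable across runs.
    """
    modules = {}
    for module in sorted(MODULE_SOURCES):
        parts = [(name, approved[name])
                 for name in MODULE_SOURCES[module]
                 if name in approved]
        if parts:
            modules[module] = parts
    return modules
```

The same input always yields the same output because nothing is generated at this stage; content is only selected and ordered.
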
## Publish Gates

Blocking before public release:

- Source upload succeeded and is traceable.
- Required P0 text artifacts exist and have usable content.
- `basic_info`, `executive_overview`, `core_insights`, `key_data`, and `source_compliance` are present unless a product decision allows a partial report.
- Display artifact is reviewed and approved.
- Source attribution and risk disclaimer are present.
- No raw artifact payload, local path, private notebook ID, or account information appears in public responses.

Non-blocking:

- Mind map.
- Study guide.
- Timeline.
- Related-source candidates.
- Research discovery.
- Infographic.
- Audio.

If optional artifacts fail, record the failure and continue without inventing fallback public copy. Deterministic fallback is allowed for structure graph from already available artifacts.

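The blocking gates above can be expressed as a single check that returns every reason a report cannot be published. This is a hedged sketch: the `report` dictionary shape and its key names (`source_traceable`, `p0_text_usable`, `leaks`, and so on) are hypothetical, and leak detection is assumed to happen in a separate scanner that fills `leaks`.

```python
REQUIRED_MODULES = ("basic_info", "executive_overview", "core_insights",
                    "key_data", "source_compliance")


def blocking_failures(report, allow_partial=False):
    """Return the blocking reasons; an empty list means the gates pass.

    `allow_partial` stands in for the product decision that permits a
    partial report.
    """
    failures = []
    if not report.get("source_traceable"):
        failures.append("source upload not traceable")
    if not report.get("p0_text_usable"):
        failures.append("required P0 text artifacts missing or unusable")
    present = set(report.get("modules", ()))
    missing = [m for m in REQUIRED_MODULES if m not in present]
    if missing and not allow_partial:
        failures.append("missing modules: " + ", ".join(missing))
    if not report.get("display_approved"):
        failures.append("display artifact not approved")
    if not (report.get("has_attribution") and report.get("has_disclaimer")):
        failures.append("source attribution or risk disclaimer missing")
    if report.get("leaks"):
        failures.append("internal data in public response")
    return failures
```

Non-blocking artifacts are deliberately absent from this check: their failures are recorded in the manifest but never appear in the returned list.
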
## Cadence Notes

NotebookLM operations should be conservative by default:

- One active NotebookLM operation per account.
- Text artifacts first.
- Media artifacts after text success.
- Heavy media should not block publishable text.
- On transient failure, retry once; if an optional artifact fails again, mark it failed and continue.

The seed importer is not a production runner. A production runner should persist manifests after every operation and support resumable review/import.

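The retry rule above can be sketched as a small wrapper. `TransientError` and `run_with_single_retry` are hypothetical names, and re-raising for required artifacts is an assumption, since the notes only define the behavior for optional ones.

```python
class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, quota)."""


def run_with_single_retry(operation, optional):
    """Run `operation`, retrying once on a transient failure.

    Returns ("succeeded", result) on success. If both attempts fail, an
    optional artifact yields ("failed", message) so the run can continue;
    a required artifact re-raises the last error (assumed behavior).
    """
    last_error = None
    for _attempt in range(2):  # first try plus exactly one retry
        try:
            return "succeeded", operation()
        except TransientError as exc:
            last_error = exc
    if optional:
        return "failed", str(last_error)
    raise last_error
```

A runner would call this sequentially, one operation at a time per account, and write the returned status to the manifest before starting the next operation.
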
## Human Review

Review is mandatory for:

- Gray broker sources.
- Related-source candidate display.
- Infographic or generated media.
- Any content where citations/page labels are ambiguous.
- Any copy that could be interpreted as investment advice.

Do not display raw NotebookLM page labels until they are normalized against verifiable source pages or sections.

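The mandatory-review rules above could be encoded as a function that lists why an item needs a reviewer. The item keys used here (`source_tier`, `generated_media`, `ambiguous_citations`, `possible_investment_advice`) are hypothetical flags that an upstream step would have to set; nothing in this sketch detects those conditions itself.

```python
def review_reasons(item):
    """Return the reasons an item requires human review (empty if none)."""
    reasons = []
    if item.get("source_tier") == "gray_broker":
        reasons.append("gray broker source")
    if item.get("artifact_type") == "query_related_sources":
        reasons.append("related-source candidate display")
    if item.get("artifact_type") == "infographic" or item.get("generated_media"):
        reasons.append("infographic or generated media")
    if item.get("ambiguous_citations"):
        reasons.append("ambiguous citations or page labels")
    if item.get("possible_investment_advice"):
        reasons.append("possible investment-advice wording")
    return reasons
```

Returning reasons rather than a boolean lets the review queue show the reviewer what to look for.
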