chore: prepare yanting monorepo handoff
# Content Pipeline Handoff

This is a handoff snapshot, not the product single source of truth (SSOT).

Product SSOT: mall-docs report-notebooklm docs, snapshot date: 2026-06-03.

## Content Principle

Use NotebookLM as a source-driven research engine, not as a generic rewriting model.

The pipeline may orchestrate, clean, validate, map, and review NotebookLM-native artifacts. It must not silently replace missing NotebookLM artifacts with locally rewritten publishable content.

## Source Inputs

Phase 1 content is based on public or authorized institutional research reports. Priority source categories:

- Official public sources.
- Authorized partner sources.
- Gray broker public sources, with stricter review and source display handling.

Vision source lists, tiering, and historical source-health experience may be used as reference material. Production data must not depend on a local Vision runtime, local path, local cache, or local account state.

## NotebookLM Workflow

Recommended report run order:

1. Inspect the source PDF: title, institution, date, page count, size, and report type.
2. Create or reuse one notebook for one report source unless a multi-report synthesis is explicitly planned.
3. Upload the report source.
4. Generate the P0 text package:
   - source description
   - native Briefing Doc
   - native Blog Post
   - data table
   - query dimensions
   - query key data
   - query divergence
   - query weaknesses
5. Generate useful P1 artifacts:
   - query timeline
   - query related sources
   - Study Guide
   - mind map, if download succeeds
6. Generate P2 artifacts asynchronously:
   - infographic candidate
   - audio brief
   - research discovery
7. Persist every artifact status in a manifest.
8. Deterministically assemble display modules from reviewed artifacts.
9. Run human review before publishing.

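Steps 4 through 7 above could be tracked with a manifest along the following lines. This is a minimal sketch, not the pipeline's actual runner: the `RUN_PLAN` keys, the `pending`/`failed` status strings, and the function names are illustrative assumptions, and the manifest is written to a local JSON file only to keep the example self-contained.

```python
import json
from pathlib import Path

# Tiers follow the run order above. The snake_case keys are illustrative
# labels for the listed items, not confirmed schema identifiers.
RUN_PLAN = {
    "P0": ["source_description", "native_briefing_doc", "native_blog_post",
           "data_table", "query_dimensions", "query_key_data",
           "query_divergence", "query_weaknesses"],
    "P1": ["query_timeline", "query_related_sources", "study_guide", "mind_map"],
    "P2": ["infographic", "audio_brief", "research_discovery"],
}


def new_manifest(report_id: str) -> dict:
    """Start every planned artifact as 'pending' so nothing is silently skipped."""
    return {
        "report_id": report_id,
        "artifacts": {
            name: {"tier": tier, "status": "pending", "error": None}
            for tier, names in RUN_PLAN.items()
            for name in names
        },
    }


def record_status(manifest: dict, name: str, status: str, error=None) -> None:
    """Update one artifact entry; unknown names raise KeyError instead of being added."""
    entry = manifest["artifacts"][name]
    entry["status"] = status
    entry["error"] = error


def save_manifest(manifest: dict, path: Path) -> None:
    """Write to a temp file, then rename, so a crash cannot leave a half-written manifest."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8")
    tmp.replace(path)
```

Calling `save_manifest` after every `record_status` keeps the file current with the last completed operation, which is what makes a run resumable.
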
## Artifact Types

The Phase 1 schema supports these NotebookLM artifact types:

| Artifact type | Purpose | Publish blocking | Human review |
|---|---|---:|---:|
| `source_summary` | Source-level summary. | No | No |
| `notebook_summary` | Notebook-level summary. | No | No |
| `native_briefing_doc` | Native briefing document. | Yes | No |
| `native_blog_post` | Native blog post. | Yes | No |
| `native_study_guide` | FAQ, study guide, glossary. | No | No |
| `data_table` | Structured table data. | Yes | No |
| `mind_map` | Mind map or graph source. | No | No |
| `query_dimensions` | Analysis dimensions. | Yes | No |
| `query_key_data` | Key data points. | Yes | No |
| `query_divergence` | Views that diverge from consensus. | No | No |
| `query_weaknesses` | Weaknesses and open questions. | No | No |
| `query_timeline` | Timeline and turning points. | No | No |
| `query_related_sources` | Related source candidates. | No | Yes |
| `research_discovery` | Enrichment queue. | No | Yes |
| `infographic` | Candidate public image. | No | Yes |
| `audio_brief` | Listening preview or audio source. | No | No |

Artifact records should keep status, object reference, format, size, hash, generated time, error, and review flags. Raw payloads should stay in object storage and remain internal.

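One way to picture an artifact record is the sketch below. The flag table is copied from the table above; the field names (`object_ref`, `size_bytes`, `sha256`, and so on), the status strings, and the choice of a dataclass are assumptions for illustration, not the Phase 1 schema itself.

```python
from dataclasses import dataclass, field
from typing import Optional

# (publish_blocking, human_review) per artifact type, copied from the table above.
ARTIFACT_FLAGS = {
    "source_summary": (False, False),
    "notebook_summary": (False, False),
    "native_briefing_doc": (True, False),
    "native_blog_post": (True, False),
    "native_study_guide": (False, False),
    "data_table": (True, False),
    "mind_map": (False, False),
    "query_dimensions": (True, False),
    "query_key_data": (True, False),
    "query_divergence": (False, False),
    "query_weaknesses": (False, False),
    "query_timeline": (False, False),
    "query_related_sources": (False, True),
    "research_discovery": (False, True),
    "infographic": (False, True),
    "audio_brief": (False, False),
}


@dataclass
class ArtifactRecord:
    artifact_type: str
    status: str                         # assumed values: "pending", "succeeded", "failed"
    object_ref: Optional[str] = None    # object-storage key; the raw payload stays there
    format: Optional[str] = None
    size_bytes: Optional[int] = None
    sha256: Optional[str] = None
    generated_at: Optional[str] = None  # ISO 8601 timestamp
    error: Optional[str] = None
    review_flags: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject types the schema does not list instead of storing them silently.
        if self.artifact_type not in ARTIFACT_FLAGS:
            raise ValueError(f"unknown artifact type: {self.artifact_type}")

    @property
    def publish_blocking(self) -> bool:
        return ARTIFACT_FLAGS[self.artifact_type][0]

    @property
    def needs_human_review(self) -> bool:
        return ARTIFACT_FLAGS[self.artifact_type][1]
```

The record holds only a reference and a hash, never the payload, which matches the rule that raw payloads remain internal in object storage.
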
## Module Mapping

| Product module | Primary artifact sources | Notes |
|---|---|---|
| `basic_info` | Source metadata and source summary. | P0, inline. |
| `executive_overview` | Briefing Doc and Blog Post. | P0, heavy card plus page. |
| `core_insights` | Briefing Doc and query dimensions. | P0, inline with optional detail page. |
| `key_data` | Data table and query key data. | P0, heavy card plus page. |
| `source_compliance` | Source metadata and review notes. | P0, inline, must include disclaimer. |
| `institution` | Institution record. | P0, inline. |
| `differentiated_view` | Query divergence. | P1, optional. |
| `weaknesses` | Query weaknesses. | P1, optional, avoid investment-advice wording. |
| `timeline` | Query timeline. | P1, optional. |
| `study_guide` | Native Study Guide. | P1, optional, replaces legacy `faq`. |
| `structure_graph` | Mind map or deterministic fallback. | P1, optional. |
| `related_sources` | Related-source query and review queue. | P1, review required before display. |
| `infographic` | Infographic candidate. | P2, review required before display. |
| `audio` | Audio brief or reviewed audio asset. | P2, not required for text publish. |
| `research_discovery` | Research discovery queue. | P2, internal or reviewed only. |

## Publish Gates

Blocking before public release:

- Source upload succeeded and is traceable.
- Required P0 text artifacts exist and have usable content.
- `basic_info`, `executive_overview`, `core_insights`, `key_data`, and `source_compliance` are present unless a product decision allows a partial report.
- Display artifact is reviewed and approved.
- Source attribution and risk disclaimer are present.
- No raw artifact payload, local path, private notebook ID, or account information appears in public responses.

Non-blocking:

- Mind map.
- Study guide.
- Timeline.
- Related-source candidates.
- Research discovery.
- Infographic.
- Audio.

If optional artifacts fail, record the failure and continue without inventing fallback public copy. A deterministic fallback is allowed for the structure graph, built from artifacts that are already available.

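The blocking gates above can be expressed as a single check that returns the reasons a report is not yet publishable. This is a sketch under stated assumptions: the shape of the `report` dict, the forbidden key names, and the use of the publish-blocking artifact types as the "required P0 text artifacts" are all illustrative choices, not the production gate.

```python
# Publish-blocking artifact types from the Artifact Types table.
REQUIRED_ARTIFACTS = ("native_briefing_doc", "native_blog_post", "data_table",
                      "query_dimensions", "query_key_data")
REQUIRED_MODULES = ("basic_info", "executive_overview", "core_insights",
                    "key_data", "source_compliance")
# Hypothetical key names standing in for internal-only fields.
FORBIDDEN_PUBLIC_KEYS = {"raw_payload", "local_path", "notebook_id", "account"}


def publish_blockers(report: dict, allow_partial: bool = False) -> list[str]:
    """Return the reasons a report cannot be published; an empty list means clear."""
    blockers = []
    if not report.get("source_uploaded", False):
        blockers.append("source upload not confirmed")
    artifacts = report.get("artifacts", {})
    for name in REQUIRED_ARTIFACTS:
        # An artifact with only whitespace counts as unusable content.
        if not artifacts.get(name, "").strip():
            blockers.append(f"missing or empty artifact: {name}")
    if not allow_partial:  # a product decision may allow a partial report
        present = set(report.get("modules", ()))
        for name in REQUIRED_MODULES:
            if name not in present:
                blockers.append(f"missing module: {name}")
    if not report.get("review_approved", False):
        blockers.append("display artifact not approved")
    if not (report.get("has_attribution") and report.get("has_disclaimer")):
        blockers.append("source attribution or risk disclaimer missing")
    leaked = FORBIDDEN_PUBLIC_KEYS & set(report.get("public_response", {}))
    for key in sorted(leaked):
        blockers.append(f"internal field in public response: {key}")
    return blockers
```

Returning every blocker instead of stopping at the first one gives a reviewer the full list in a single pass.
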
## Cadence Notes

NotebookLM operations should be conservative by default:

- One active NotebookLM operation per account.
- Text artifacts first.
- Media artifacts after text success.
- Heavy media should not block publishable text.
- On transient failure, retry once; if an optional artifact fails again, mark it failed and continue.

The seed importer is not a production runner. A production runner should persist manifests after every operation and support resumable review/import.

## Human Review

Review is mandatory for:

- Gray broker sources.
- Related-source candidate display.
- Infographic or generated media.
- Any content where citations/page labels are ambiguous.
- Any copy that could be interpreted as investment advice.

Do not display raw NotebookLM page labels until they are normalized against verifiable source pages or sections.

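The mandatory-review rules above can be collected into one function that reports why an item needs a reviewer. Every field name here (`source_category`, `kind`, `ambiguous_citations`, `possible_investment_advice`) is a hypothetical placeholder, and how the ambiguity and investment-advice flags get set is left open; this is a sketch of the routing, not a detector.

```python
# Item kinds that always go to review, per the list above (names are illustrative).
REVIEW_REQUIRED_KINDS = {"related_sources", "infographic", "generated_media"}


def review_reasons(item: dict) -> list[str]:
    """Return the reasons an item must be reviewed; empty means no mandatory rule applies."""
    reasons = []
    if item.get("source_category") == "gray_broker":
        reasons.append("gray broker source")
    if item.get("kind") in REVIEW_REQUIRED_KINDS:
        reasons.append(f"review-required kind: {item['kind']}")
    if item.get("ambiguous_citations"):
        reasons.append("ambiguous citations or page labels")
    if item.get("possible_investment_advice"):
        reasons.append("possible investment-advice wording")
    return reasons
```

An empty result means only that none of the mandatory rules fired; the workflow still runs human review before publishing.
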