First real-data export against v0.4.0 surfaced 66 unknown blocks across three content types, all captured live and added.

Added:
- execution_output (Code Interpreter / container.exec / python tool output) → tool_result block. output=content.text, tool_name=author.name, is_error=metadata.aggregate_result.status, summary=metadata.reasoning_title
- system_error → error tool_result with tool_name=author.name
- tether_browsing_display: spinner placeholders (empty result+summary) skip silently with DEBUG log; defensive populated-case branch maps to tool_result (untested in real data)
- tool_result block schema: optional `summary` field rendered as italic line between header and fence
- tool_result rendering: tool_name appears in header when present (e.g. `📤 Result: container.exec`); existing tool_name=None calls unchanged
- _ROLE_LABELS["tool"] = ("🔧 Tool", "tool")

Fixed:
- chatgpt.normalize_conversation reads `conversation_id` as fallback for `id`. Live API uses conversation_id; fixtures use id. Pre-fix: empty id in YAML frontmatter and missing context in WARNING logs.

Tests: 11 new (192 total, 0 failures). Fixture extended with 4 tool-output cases (execution_output success, empty execution_output that should skip, system_error, tether_browsing_display spinner).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

# Changelog
All notable changes to this project will be documented here.
Format follows [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
## [0.4.1] - Unreleased
### Added
- ChatGPT `execution_output` (Code Interpreter / `container.exec` / `python`) renders as a `tool_result` block with `tool_name` from `author.name`, `is_error` from `metadata.aggregate_result.status`, and the optional `summary` line populated from `metadata.reasoning_title`. Captured live during planning.
- ChatGPT `system_error` content (e.g. browse-service 503) renders as an error `tool_result` block with `tool_name` from `author.name` (typically `"web"`).
- ChatGPT `tether_browsing_display` populated case (defensive, not observed in real data) renders as a `tool_result` block; transient spinner placeholders (empty `result` and `summary`) are skipped silently with a DEBUG log.
- `tool_result` block schema gains an optional `summary: str | None` field, rendered as an italic line between the header and the fenced output.
- `tool_result` rendering shows `tool_name` in the header when present (e.g. `📤 **Result: container.exec**`); when absent, header stays as `📤 **Result**` (no regression).
- Markdown exporter: `_ROLE_LABELS["tool"] = ("🔧 Tool", "tool")` so tool-role messages render under a recognisable header instead of the generic fallback.
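- 11 new tests covering the four tool-output fixture cases (`execution_output` success, empty `execution_output`, `system_error`, `tether_browsing_display` spinner) plus the `conversation_id` fallback (192 total, all passing).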
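
The `execution_output` mapping and the header/summary rendering listed above can be sketched roughly as follows. This is an illustration only, not the project's code: the `ToolResult` dataclass, `map_execution_output`, and `render_tool_result` are hypothetical names, the rule that any status other than `"success"` counts as an error is an assumption, and a fixed three-backtick fence stands in for the `_safe_fence` helper.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToolResult:
    """Hypothetical stand-in for the tool_result block."""
    output: str
    tool_name: Optional[str] = None
    is_error: bool = False
    summary: Optional[str] = None  # optional field; None on legacy data


def map_execution_output(message: dict[str, Any]) -> Optional[ToolResult]:
    """Map a raw execution_output message to a ToolResult.

    Returns None for empty output so the caller can skip the message.
    """
    content = message.get("content") or {}
    metadata = message.get("metadata") or {}
    author = message.get("author") or {}
    text = content.get("text") or ""
    if not text.strip():
        return None
    status = (metadata.get("aggregate_result") or {}).get("status")
    return ToolResult(
        output=text,
        tool_name=author.get("name"),
        # Assumption: a missing status or "success" means no error.
        is_error=status is not None and status != "success",
        summary=metadata.get("reasoning_title"),
    )


def render_tool_result(block: ToolResult) -> str:
    """Render header, optional italic summary line, then the fenced output."""
    if block.tool_name:
        header = f"📤 **Result: {block.tool_name}**"
    else:
        header = "📤 **Result**"
    fence = "`" * 3  # simplification; real content may need a longer fence
    lines = [header]
    if block.summary:
        lines.append(f"*{block.summary}*")
    lines += [fence, block.output, fence]
    return "\n".join(lines)
```

With `tool_name` unset and no summary, the rendered block starts with the plain `📤 **Result**` header, which matches the no-regression note above.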
### Fixed
- ChatGPT `normalize_conversation` now reads `conversation_id` as a fallback for `id`. Live ChatGPT detail responses use `conversation_id` at the top level; fixtures and listing summaries use `id`. Without the fallback, normalized conversations had an empty `id` (visible as a blank `conversation_id:` in YAML frontmatter and as missing context in WARNING log lines).
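
The fallback amounts to a two-key lookup. A minimal sketch, assuming the raw conversation is a plain dict; `resolve_conversation_id` is a hypothetical helper name, not a function from the project:

```python
from typing import Any


def resolve_conversation_id(raw: dict[str, Any]) -> str:
    """Prefer `id`; fall back to `conversation_id` from live detail responses."""
    return raw.get("id") or raw.get("conversation_id") or ""
```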
### Migration
- No breaking schema changes; `tool_result` blocks gain a `summary` field that defaults to `None` on legacy data. Existing exports re-render cleanly with the cache-clear-and-export workflow from v0.4.0.
## [0.4.0] - Unreleased
### Added
- Rich content support: messages now carry an ordered `blocks` list (text, code, thinking, tool_use, tool_result, citation, image_placeholder, file_placeholder, unknown)
- ChatGPT voice mode: `audio_transcription` parts render as text blocks; `audio_asset_pointer` and `real_time_user_audio_video_asset_pointer` render as `📎 File attached` placeholders with size and duration metadata
- ChatGPT Custom Instructions: `user_editable_context` and `model_editable_context` messages now appear in exports (were silently dropped — pre-existing bug fixed); rendered with a `> ℹ️ Hidden context` marker driven by the `is_visually_hidden_from_conversation` flag
- Image placeholders for `image_asset_pointer` parts (uploads + DALL-E) inside `multimodal_text` and at message level
- Defensive Claude block extraction: `text`, `thinking`, `tool_use`, `tool_result` (including nested-block flattening), `image` blocks (untested against real data; will fix-forward in v0.4.1 if real shapes diverge)
- `LossReport` summary table emitted at end of every `export` run, breaking down `unknown blocks` and `extraction failures` by raw type so silently-dropped data becomes visible
- `_safe_fence` helper picks a fence longer than any backtick run in extracted content, preventing embedded triple-backticks from corrupting downstream rendering (verified live in Joplin during planning)
- `unknown` blocks render as `> ⚠️ Unsupported content` with the raw type, observed top-level keys, and reason — so future API additions are visible rather than silent
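
The fence-length idea behind `_safe_fence` can be sketched as below. This is an assumed reconstruction from the description above, not the project's implementation; `safe_fence`, `fenced`, and the `minimum` parameter are illustrative names.

```python
import re


def safe_fence(content: str, minimum: int = 3) -> str:
    """Return a backtick fence longer than any backtick run in `content`."""
    runs = re.findall(r"`+", content)
    longest = max((len(run) for run in runs), default=0)
    return "`" * max(minimum, longest + 1)


def fenced(content: str) -> str:
    """Wrap `content` in a fence that embedded backticks cannot close early."""
    fence = safe_fence(content)
    return f"{fence}\n{content}\n{fence}"
```

Content that itself contains a triple-backtick run is wrapped in four backticks, so the embedded run is treated as literal text by the Markdown renderer.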
### Changed
- ChatGPT role filter (previously dropped `tool` and `system` messages) is **lifted**: all roles now route through normal extraction; truly empty messages skip via the existing empty-content guard
- Markdown rendering moves from provider-time to exporter-write-time. Providers produce blocks; exporters call `render_blocks_to_markdown` at write time. This unblocks future Obsidian/HTML exporters
- `BaseProvider.normalize_conversation` signature now accepts an optional `LossReport` parameter (breaking change for any future custom subclass; FileProvider hasn't shipped yet)
- `o1`/`o3` reasoning subparts inside `text` content_type messages remain rendered as plain text (defensive; reclassification to `thinking` block deferred until live shape is captured)
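
The split between provider-time extraction and exporter-time rendering, together with the optional `LossReport` parameter, might look roughly like this. Everything here is a simplified stand-in: the dict-based block shape, `normalize_message`, and the `LossReport` fields are assumptions, and this `render_blocks_to_markdown` covers only text and unknown blocks.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class LossReport:
    """Hypothetical tally of content that could not be extracted."""
    unknown_blocks: Counter = field(default_factory=Counter)

    def record_unknown(self, raw_type: str) -> None:
        self.unknown_blocks[raw_type] += 1


def normalize_message(
    raw: dict[str, Any], loss_report: Optional[LossReport] = None
) -> list[dict[str, Any]]:
    """Provider side: turn a raw message into blocks, with no Markdown yet."""
    content_type = raw.get("content_type", "")
    if content_type == "text":
        return [{"type": "text", "text": raw.get("text", "")}]
    if loss_report is not None:
        loss_report.record_unknown(content_type)
    return [{"type": "unknown", "raw_type": content_type}]


def render_blocks_to_markdown(blocks: list[dict[str, Any]]) -> str:
    """Exporter side: render blocks only when the file is written."""
    parts = []
    for block in blocks:
        if block["type"] == "text":
            parts.append(block["text"])
        else:
            parts.append(f"> ⚠️ Unsupported content: `{block['raw_type']}`")
    return "\n\n".join(parts)
```

Because providers return blocks rather than strings, a different exporter could consume the same blocks with its own renderer.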
### Fixed
- `user_editable_context` / `model_editable_context` extraction (parts-vs-direct-fields mismatch) — Custom Instructions are no longer silently dropped from every conversation
### Migration
- Existing exports are not re-rendered automatically. To pick up v0.4.0 rendering for previously exported conversations:
```
python -m src.main cache --clear
python -m src.main export --provider all
```
- JSON exports: messages now contain `blocks` (typed structured content) and may omit the legacy `content` field. External consumers reading JSON should prefer `blocks`.
- Per-conversation message counts may increase: previously-dropped Custom Instructions, image-only user turns, and tool-only assistant turns now appear.
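
For external consumers of the JSON export, preferring `blocks` over the legacy `content` field could be handled along these lines. The block keys (`type`, `text`) and the helper name `message_text` are assumptions made for illustration; check them against an actual export before relying on them.

```python
from typing import Any


def message_text(message: dict[str, Any]) -> str:
    """Join text blocks when `blocks` is present; otherwise use legacy `content`."""
    blocks = message.get("blocks")
    if blocks is not None:
        return "\n\n".join(
            block.get("text", "") for block in blocks if block.get("type") == "text"
        )
    return message.get("content") or ""
```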
### Out of scope (deferred to v0.5.0+)
- Binary downloads of images and audio assets (placeholders show metadata only; `content not preserved in this export`)
- Joplin resource upload for embedded media
- Filename resolution for `file-XYZ` / `sediment://` references
- Speculative ChatGPT types (`tether_browsing_display`, `tether_quote`) and DALL-E assistant images — fall through to `unknown` blocks if encountered
## [0.2.0] - Unreleased
### Added
- Joplin import automation: `joplin` command syncs exported Markdown files to Joplin as notes
- Notebooks created automatically per provider+project (`ChatGPT - My Project`, etc.)
- Re-running is safe: notes are updated, not duplicated (Joplin note ID stored in manifest)
- `JOPLIN_API_TOKEN`, `JOPLIN_API_URL`, `JOPLIN_REQUEST_TIMEOUT` config variables
- Configurable request timeout with clear error messages and actionable hints on timeout
- `--project` filter on `export` and `list` commands (case-insensitive substring or `none`)
- ChatGPT Projects support via `CHATGPT_PROJECT_IDS` env var
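
The `--project` matching rule (case-insensitive substring or `none`) can be illustrated with a small predicate. This is a sketch under assumptions: `matches_project` is a hypothetical name, and reading `none` as "conversations that belong to no project" is an interpretation of the entry above rather than confirmed behaviour.

```python
from typing import Optional


def matches_project(project_name: Optional[str], project_filter: Optional[str]) -> bool:
    """Return True when a conversation's project passes the --project filter."""
    if project_filter is None:
        return True  # no filter given: keep everything
    needle = project_filter.lower()
    if needle == "none":
        # Assumption: `none` selects conversations without a project.
        return project_name is None
    return project_name is not None and needle in project_name.lower()
```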
## [0.1.0] - Unreleased
### Added
- Initial implementation: ChatGPT and Claude export via internal web APIs
- Markdown and JSON exporters
- Local cache/manifest for incremental sync
- CLI with export, list, cache, doctor, and auth commands