ai-chatexport/CHANGELOG.md
JesseMarkowitz 68e8d532be feat: v0.4.1 — ChatGPT tool-output content types and conv_id fix
First real-data export against v0.4.0 surfaced 66 unknown blocks across
three content types — captured live and added.

Added:
- execution_output (Code Interpreter / container.exec / python tool
  output) → tool_result block. output=content.text,
  tool_name=author.name, is_error=metadata.aggregate_result.status,
  summary=metadata.reasoning_title
- system_error → error tool_result with tool_name=author.name
- tether_browsing_display: spinner placeholders (empty result+summary)
  skip silently with DEBUG log; defensive populated-case branch maps
  to tool_result (untested in real data)
- tool_result block schema: optional `summary` field rendered as
  italic line between header and fence
- tool_result rendering: tool_name appears in header when present
  (e.g. `📤 Result: container.exec`); existing tool_name=None calls
  unchanged
- _ROLE_LABELS["tool"] = ("🔧 Tool", "tool")

Fixed:
- chatgpt.normalize_conversation reads `conversation_id` as fallback
  for `id`. Live API uses conversation_id; fixtures use id.
  Pre-fix: empty id in YAML frontmatter and missing context in
  WARNING logs.

Tests: 11 new (192 total, 0 failures). Fixture extended with 4
tool-output cases (execution_output success, empty execution_output
that should skip, system_error, tether_browsing_display spinner).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-05 09:25:55 -04:00

# Changelog
All notable changes to this project will be documented here.
Format follows [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
## [0.4.1] - Unreleased
### Added
- ChatGPT `execution_output` (Code Interpreter / `container.exec` / `python`) renders as a `tool_result` block with `tool_name` from `author.name`, `is_error` from `metadata.aggregate_result.status`, and the optional `summary` line populated from `metadata.reasoning_title`. Captured live during planning.
- ChatGPT `system_error` content (e.g. browse-service 503) renders as an error `tool_result` block with `tool_name` from `author.name` (typically `"web"`).
- ChatGPT `tether_browsing_display` populated case (defensive, not observed in real data) renders as a `tool_result` block; transient spinner placeholders (empty `result`+`summary`) skip silently with DEBUG log.
- `tool_result` block schema gains optional `summary: str | None` field, rendered as italic line between header and fenced output.
- `tool_result` rendering shows `tool_name` in the header when present (e.g. `📤 **Result: container.exec**`); when absent, header stays as `📤 **Result**` (no regression).
- Markdown exporter: `_ROLE_LABELS["tool"] = ("🔧 Tool", "tool")` so tool-role messages render under a recognisable header instead of the generic fallback.
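- 11 new tests covering the four tool-output fixture cases (successful `execution_output`, empty `execution_output` that is skipped, `system_error`, `tether_browsing_display` spinner) plus the `conversation_id` fallback (192 total, all passing).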
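
The `execution_output` mapping and header rendering described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function names `execution_output_to_block` and `render_tool_result` are hypothetical, the block is modelled as a plain dict, and the rule that any `status` other than `"success"` counts as an error is an assumption, since the accepted status values are not listed here.

```python
from typing import Any, Optional

FENCE = "`" * 3  # plain triple-backtick fence; the real exporter may pick a longer one


def execution_output_to_block(message: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Map a raw `execution_output` message to a tool_result block (hypothetical helper)."""
    content = message.get("content") or {}
    metadata = message.get("metadata") or {}
    author = message.get("author") or {}

    output = content.get("text") or ""
    if not output.strip():
        return None  # empty output: nothing to render, so the caller skips it

    status = (metadata.get("aggregate_result") or {}).get("status")
    return {
        "type": "tool_result",
        "tool_name": author.get("name"),
        "output": output,
        # Assumption: any status other than "success" is treated as an error.
        "is_error": status is not None and status != "success",
        "summary": metadata.get("reasoning_title"),
    }


def render_tool_result(block: dict[str, Any]) -> str:
    """Render header, optional italic summary line, then the fenced output."""
    tool_name = block.get("tool_name")
    header = f"📤 **Result: {tool_name}**" if tool_name else "📤 **Result**"
    lines = [header]
    if block.get("summary"):
        lines.append(f"*{block['summary']}*")
    lines += [FENCE, block["output"], FENCE]
    return "\n".join(lines)
```

A block without `tool_name` or `summary` falls back to the bare `📤 **Result**` header with no italic line, which matches the no-regression note above.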
### Fixed
- ChatGPT `normalize_conversation` now reads `conversation_id` as a fallback for `id`. Live ChatGPT detail responses use `conversation_id` at top level; fixtures and listing summaries use `id`. Without the fallback, normalized conversations had empty `id` (visible as blank `conversation_id:` in YAML frontmatter and missing context in WARNING log lines).
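
The fallback amounts to a two-key lookup. The sketch below shows the idea only; `resolve_conversation_id` is a hypothetical name, and treating an empty string as missing is an assumption about the intended behaviour.

```python
from typing import Any


def resolve_conversation_id(raw: dict[str, Any]) -> str:
    """Return the conversation id from either response shape (hypothetical helper)."""
    # `id` appears in fixtures and listing summaries; `conversation_id`
    # appears in live detail responses. An empty value counts as missing.
    return raw.get("id") or raw.get("conversation_id") or ""
```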
### Migration
- No breaking schema changes; `tool_result` blocks gain a `summary` field that defaults to `None` on legacy data. Existing exports re-render cleanly with the cache-clear-and-export workflow from v0.4.0.
## [0.4.0] - Unreleased
### Added
- Rich content support: messages now carry an ordered `blocks` list (text, code, thinking, tool_use, tool_result, citation, image_placeholder, file_placeholder, unknown)
- ChatGPT voice mode: `audio_transcription` parts render as text blocks; `audio_asset_pointer` and `real_time_user_audio_video_asset_pointer` render as `📎 File attached` placeholders with size and duration metadata
- ChatGPT Custom Instructions: `user_editable_context` and `model_editable_context` messages now appear in exports (they were silently dropped before; see Fixed below); rendered with a `> Hidden context` marker driven by the `is_visually_hidden_from_conversation` flag
- Image placeholders for `image_asset_pointer` parts (uploads + DALL-E) inside `multimodal_text` and at message level
- Defensive Claude block extraction: `text`, `thinking`, `tool_use`, `tool_result` (including nested-block flattening), `image` blocks (untested against real data; will fix-forward in v0.4.1 if real shapes diverge)
- `LossReport` summary table emitted at end of every `export` run, breaking down `unknown blocks` and `extraction failures` by raw type so silently-dropped data becomes visible
- `_safe_fence` helper picks a fence longer than any backtick run in extracted content, preventing embedded triple-backticks from corrupting downstream rendering (verified live in Joplin during planning)
- `unknown` blocks render as `> ⚠️ Unsupported content` with the raw type, observed top-level keys, and reason — so future API additions are visible rather than silent
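
The fence-selection idea behind `_safe_fence` can be sketched as below. This is an illustration under assumptions, not the project's helper: the name `safe_fence`, its signature, and the three-backtick minimum are chosen here for the example.

```python
import re


def safe_fence(content: str, minimum: int = 3) -> str:
    """Return a backtick fence longer than any backtick run in `content`."""
    runs = re.findall(r"`+", content)  # every maximal run of backticks
    longest = max((len(run) for run in runs), default=0)
    return "`" * max(minimum, longest + 1)
```

Because a Markdown fence only closes on a backtick run at least as long as the opening one, a fence one backtick longer than the longest run in the content cannot be closed early by that content.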
### Changed
- ChatGPT role filter (previously dropped `tool` and `system` messages) is **lifted**: all roles now route through normal extraction; truly empty messages skip via the existing empty-content guard
- Markdown rendering moves from provider-time to exporter-write-time. Providers produce blocks; exporters call `render_blocks_to_markdown` at write time. This unblocks future Obsidian/HTML exporters
- `BaseProvider.normalize_conversation` signature now accepts an optional `LossReport` parameter (breaking change for any future custom subclass; FileProvider hasn't shipped yet)
- `o1`/`o3` reasoning subparts inside `text` content_type messages remain rendered as plain text (defensive; reclassification to `thinking` block deferred until live shape is captured)
### Fixed
- `user_editable_context` / `model_editable_context` extraction (parts-vs-direct-fields mismatch) — Custom Instructions are no longer silently dropped from every conversation
### Migration
- Existing exports are not re-rendered automatically. To pick up v0.4.0 rendering for previously exported conversations:
```
python -m src.main cache --clear
python -m src.main export --provider all
```
- JSON exports: messages now contain `blocks` (typed structured content) and may omit the legacy `content` field. External consumers reading JSON should prefer `blocks`.
- Per-conversation message counts may increase: previously-dropped Custom Instructions, image-only user turns, and tool-only assistant turns now appear.
### Out of scope (deferred to v0.5.0+)
- Binary downloads of images and audio assets (placeholders show metadata only; `content not preserved in this export`)
- Joplin resource upload for embedded media
- Filename resolution for `file-XYZ` / `sediment://` references
- Speculative ChatGPT types (`tether_browsing_display`, `tether_quote`) and DALL-E assistant images — fall through to `unknown` blocks if encountered
## [0.2.0] - Unreleased
### Added
- Joplin import automation: `joplin` command syncs exported Markdown files to Joplin as notes
- Notebooks created automatically per provider+project (`ChatGPT - My Project`, etc.)
- Re-running is safe: notes are updated, not duplicated (Joplin note ID stored in manifest)
- `JOPLIN_API_TOKEN`, `JOPLIN_API_URL`, `JOPLIN_REQUEST_TIMEOUT` config variables
- Configurable request timeout with clear error messages and actionable hints on timeout
- `--project` filter on `export` and `list` commands (case-insensitive substring or `none`)
- ChatGPT Projects support via `CHATGPT_PROJECT_IDS` env var
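
The `--project` matching rule can be sketched as a small predicate. The function name `matches_project` is hypothetical, and the reading that `none` selects conversations without any project is an assumption based on the wording of the entry above.

```python
from typing import Optional


def matches_project(project_name: Optional[str], project_filter: Optional[str]) -> bool:
    """Decide whether a conversation passes the --project filter (hypothetical helper)."""
    if project_filter is None:
        return True  # no filter given: keep every conversation
    if project_filter.strip().lower() == "none":
        # Assumption: `none` selects conversations that belong to no project.
        return not project_name
    # Case-insensitive substring match against the project name.
    return project_filter.lower() in (project_name or "").lower()
```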
## [0.1.0] - Unreleased
### Added
- Initial implementation: ChatGPT and Claude export via internal web APIs
- Markdown and JSON exporters
- Local cache/manifest for incremental sync
- CLI with export, list, cache, doctor, and auth commands