JesseMarkowitz 68e8d532be feat: v0.4.1 — ChatGPT tool-output content types and conv_id fix

First real-data export against v0.4.0 surfaced 66 unknown blocks across
three content types — captured live and added.

Added:
- execution_output (Code Interpreter / container.exec / python tool
  output) → tool_result block. output=content.text,
  tool_name=author.name, is_error=metadata.aggregate_result.status,
  summary=metadata.reasoning_title
- system_error → error tool_result with tool_name=author.name
- tether_browsing_display: spinner placeholders (empty result+summary)
  skip silently with DEBUG log; defensive populated-case branch maps
  to tool_result (untested in real data)
- tool_result block schema: optional `summary` field rendered as
  italic line between header and fence
- tool_result rendering: tool_name appears in header when present
  (e.g. `📤 Result: container.exec`); existing tool_name=None calls
  unchanged
- _ROLE_LABELS["tool"] = ("🔧 Tool", "tool")

Fixed:
- chatgpt.normalize_conversation reads `conversation_id` as fallback
  for `id`. Live API uses conversation_id; fixtures use id.
  Pre-fix: empty id in YAML frontmatter and missing context in
  WARNING logs.

Tests: 11 new (192 total, 0 failures). Fixture extended with 4
tool-output cases (execution_output success, empty execution_output
that should skip, system_error, tether_browsing_display spinner).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-05 09:25:55 -04:00

Changelog

All notable changes to this project will be documented here. Format follows Keep a Changelog.

[0.4.1] - Unreleased

Added

ChatGPT execution_output (Code Interpreter / container.exec / python) renders as a tool_result block with output from content.text, tool_name from author.name, is_error from metadata.aggregate_result.status, and the optional summary line populated from metadata.reasoning_title. The shape was captured from live data during planning.
ChatGPT system_error content (e.g. browse-service 503) renders as an error tool_result block with tool_name from author.name (typically "web").
ChatGPT tether_browsing_display: the populated case (defensive, not observed in real data) renders as a tool_result block; transient spinner placeholders (empty result and summary) are skipped silently with a DEBUG log entry.
tool_result block schema gains an optional summary: str | None field, rendered as an italic line between the header and the fenced output.
tool_result rendering shows tool_name in the header when present (e.g. 📤 **Result: container.exec**); when absent, header stays as 📤 **Result** (no regression).
Markdown exporter: _ROLE_LABELS["tool"] = ("🔧 Tool", "tool") so tool-role messages render under a recognisable header instead of the generic fallback.
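11 new tests covering the four tool-output fixture cases (execution_output success, empty execution_output that should skip, system_error, tether_browsing_display spinner) plus the conversation_id fallback (192 total, all passing).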
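
The execution_output mapping described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `ToolResultBlock` and `execution_output_to_block` are hypothetical names, and the rule that any status other than `"success"` counts as an error is an assumption, since the changelog only says that is_error is derived from metadata.aggregate_result.status.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToolResultBlock:
    """Hypothetical stand-in for the project's tool_result block."""
    output: str
    tool_name: Optional[str] = None
    is_error: bool = False
    summary: Optional[str] = None


def execution_output_to_block(message: dict[str, Any]) -> Optional[ToolResultBlock]:
    """Map a ChatGPT execution_output message to a tool_result block."""
    text = (message.get("content") or {}).get("text") or ""
    if not text.strip():
        # Empty execution_output is skipped rather than rendered.
        return None
    metadata = message.get("metadata") or {}
    status = (metadata.get("aggregate_result") or {}).get("status")
    return ToolResultBlock(
        output=text,
        tool_name=(message.get("author") or {}).get("name"),
        # Assumption: a missing status is not an error; anything other
        # than "success" is treated as one.
        is_error=status is not None and status != "success",
        summary=metadata.get("reasoning_title"),
    )


example = {
    "author": {"role": "tool", "name": "container.exec"},
    "content": {"content_type": "execution_output", "text": "hello\n"},
    "metadata": {
        "aggregate_result": {"status": "success"},
        "reasoning_title": "Running script",
    },
}
block = execution_output_to_block(example)
```

Returning `None` for the empty case keeps the skip decision in one place, so the caller can simply drop messages that produce no block.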

Fixed

ChatGPT normalize_conversation now reads conversation_id as a fallback for id. Live ChatGPT detail responses use conversation_id at the top level; fixtures and listing summaries use id. Without the fallback, normalized conversations had an empty id (visible as a blank conversation_id: in YAML frontmatter and as missing context in WARNING log lines).
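
As a rough sketch of the fallback, assuming the raw conversation is a plain dict (the helper name `resolve_conversation_id` is hypothetical and not taken from the codebase):

```python
from typing import Any


def resolve_conversation_id(raw: dict[str, Any]) -> str:
    """Prefer `id`, fall back to `conversation_id`, else empty string."""
    return raw.get("id") or raw.get("conversation_id") or ""


# Fixture / listing-summary shape uses `id`.
from_fixture = resolve_conversation_id({"id": "abc-123"})
# Live detail-response shape uses `conversation_id`.
from_live = resolve_conversation_id({"conversation_id": "abc-123"})
```

Using `or` instead of a default argument to `get` also covers the case where `id` is present but empty or `None`.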

Migration

No breaking schema changes; tool_result blocks gain a summary field that defaults to None on legacy data. Existing exports re-render cleanly with the cache-clear-and-export workflow from v0.4.0.
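
To illustrate why legacy blocks are unaffected, here is a hedged sketch of the header and summary rendering. `render_tool_result` is a hypothetical function; the exact spacing between header, summary, and fence in the real exporter is not specified here, and a fixed three-backtick fence is used for brevity where the project would presumably use its _safe_fence helper.

```python
from typing import Optional

FENCE = "`" * 3  # simplification: fixed-length fence


def render_tool_result(
    output: str,
    tool_name: Optional[str] = None,
    summary: Optional[str] = None,
) -> str:
    """Render a tool_result block as Markdown lines."""
    # Header only mentions the tool when a name is present.
    header = f"📤 **Result: {tool_name}**" if tool_name else "📤 **Result**"
    lines = [header]
    if summary:
        # Optional summary sits between the header and the fenced output.
        lines.append(f"*{summary}*")
    lines.extend([FENCE, output.rstrip("\n"), FENCE])
    return "\n".join(lines)


legacy = render_tool_result("ok")
rich = render_tool_result("ok", tool_name="container.exec", summary="Running script")
```

Because both new arguments default to `None`, a block exported before v0.4.1 produces the same header and fence as before.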

[0.4.0] - Unreleased

Added

Rich content support: messages now carry an ordered blocks list (text, code, thinking, tool_use, tool_result, citation, image_placeholder, file_placeholder, unknown)
ChatGPT voice mode: audio_transcription parts render as text blocks; audio_asset_pointer and real_time_user_audio_video_asset_pointer render as 📎 File attached placeholders with size and duration metadata
ChatGPT Custom Instructions: user_editable_context and model_editable_context messages now appear in exports (were silently dropped — pre-existing bug fixed); rendered with a > ℹ️ Hidden context marker driven by the is_visually_hidden_from_conversation flag
Image placeholders for image_asset_pointer parts (uploads + DALL-E) inside multimodal_text and at message level
Defensive Claude block extraction: text, thinking, tool_use, tool_result (including nested-block flattening), image blocks (untested against real data; will fix-forward in v0.4.1 if real shapes diverge)
LossReport summary table emitted at end of every export run, breaking down unknown blocks and extraction failures by raw type so silently-dropped data becomes visible
_safe_fence helper picks a fence longer than any backtick run in extracted content, preventing embedded triple-backticks from corrupting downstream rendering (verified live in Joplin during planning)
unknown blocks render as > ⚠️ Unsupported content with the raw type, observed top-level keys, and reason — so future API additions are visible rather than silent
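
The fence-length idea behind _safe_fence can be sketched as below. This is an independent illustration of the technique (choose a fence one backtick longer than the longest backtick run in the content), not the project's implementation; the name `safe_fence` and the `minimum` parameter are assumptions.

```python
import re


def safe_fence(content: str, minimum: int = 3) -> str:
    """Return a backtick fence longer than any backtick run in content."""
    longest = max((len(run) for run in re.findall(r"`+", content)), default=0)
    return "`" * max(minimum, longest + 1)


def fence_block(content: str) -> str:
    """Wrap content in a fence that embedded backticks cannot close."""
    fence = safe_fence(content)
    return f"{fence}\n{content}\n{fence}"


embedded = "before\n" + "`" * 3 + "\ncode\n" + "`" * 3 + "\nafter"
wrapped = fence_block(embedded)
```

In the example, the content contains three-backtick runs, so the wrapper uses four backticks and the inner fences stay literal text.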

Changed

ChatGPT role filter (previously dropped tool and system messages) is lifted: all roles now route through normal extraction; truly empty messages skip via the existing empty-content guard
Markdown rendering moves from provider-time to exporter-write-time. Providers produce blocks; exporters call render_blocks_to_markdown at write time. This unblocks future Obsidian/HTML exporters
BaseProvider.normalize_conversation signature now accepts an optional LossReport parameter (breaking change for any future custom subclass; FileProvider hasn't shipped yet)
o1/o3 reasoning subparts inside text content_type messages remain rendered as plain text (defensive; reclassification to thinking block deferred until live shape is captured)
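
A minimal sketch of how an optional LossReport parameter might be threaded through normalization. Everything here is illustrative: the `LossReport` fields, the `KNOWN_TYPES` set, and the simplified message shape are assumptions, and the real BaseProvider.normalize_conversation is a method with a richer return type.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Any, Optional

KNOWN_TYPES = {"text", "code"}  # hypothetical subset of supported types


@dataclass
class LossReport:
    """Counts unknown blocks by raw type so dropped data is visible."""
    unknown_blocks: Counter = field(default_factory=Counter)

    def record_unknown(self, raw_type: str) -> None:
        self.unknown_blocks[raw_type] += 1


def normalize_conversation(
    raw: dict[str, Any],
    loss_report: Optional[LossReport] = None,
) -> list[dict[str, Any]]:
    """Turn raw messages into blocks, recording unknown types if asked."""
    blocks: list[dict[str, Any]] = []
    for message in raw.get("messages", []):
        raw_type = message.get("content_type", "")
        if raw_type in KNOWN_TYPES:
            blocks.append({"type": raw_type, "text": message.get("text", "")})
        else:
            blocks.append({"type": "unknown", "raw_type": raw_type})
            if loss_report is not None:
                loss_report.record_unknown(raw_type)
    return blocks


report = LossReport()
sample = {
    "messages": [
        {"content_type": "text", "text": "hi"},
        {"content_type": "tether_quote"},
        {"content_type": "tether_quote"},
    ]
}
result = normalize_conversation(sample, report)
```

Keeping the parameter optional means callers that do not care about loss accounting can omit it, while an export run can pass one shared report and print its counts at the end.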

Fixed

user_editable_context / model_editable_context extraction (parts-vs-direct-fields mismatch) — Custom Instructions are no longer silently dropped from every conversation

Migration

Existing exports are not re-rendered automatically. To pick up v0.4.0 rendering for previously exported conversations:
```
python -m src.main cache --clear
python -m src.main export --provider all
```
JSON exports: messages now contain blocks (typed structured content) and may omit the legacy content field. External consumers reading JSON should prefer blocks.
Per-conversation message counts may increase: previously-dropped Custom Instructions, image-only user turns, and tool-only assistant turns now appear.
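
For external consumers of the JSON export, a reader that prefers blocks and falls back to the legacy content field might look like this. The block shape used here (a dict with `type` and `text` keys) is an assumption for illustration; check the actual exported JSON before relying on it.

```python
from typing import Any


def message_text(message: dict[str, Any]) -> str:
    """Return the text of a message, preferring `blocks` over `content`."""
    blocks = message.get("blocks")
    if blocks:
        # Assumed shape: text blocks carry their body under "text".
        parts = [b.get("text", "") for b in blocks if b.get("type") == "text"]
        return "\n\n".join(parts)
    # Legacy exports (pre-v0.4.0) only have `content`.
    return message.get("content") or ""


new_style = message_text({"blocks": [{"type": "text", "text": "hello"}]})
old_style = message_text({"content": "hello"})
```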

Out of scope (deferred to v0.5.0+)

Binary downloads of images and audio assets (placeholders show metadata only; content not preserved in this export)
Joplin resource upload for embedded media
Filename resolution for file-XYZ / sediment:// references
Speculative ChatGPT types (tether_browsing_display, tether_quote) and DALL-E assistant images — fall through to unknown blocks if encountered

[0.2.0] - Unreleased

Added

Joplin import automation: joplin command syncs exported Markdown files to Joplin as notes
Notebooks created automatically per provider+project (ChatGPT - My Project, etc.)
Re-running is safe: notes are updated, not duplicated (Joplin note ID stored in manifest)
JOPLIN_API_TOKEN, JOPLIN_API_URL, JOPLIN_REQUEST_TIMEOUT config variables
Configurable request timeout with clear error messages and actionable hints on timeout
--project filter on export and list commands (case-insensitive substring or none)
ChatGPT Projects support via CHATGPT_PROJECT_IDS env var
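
One possible reading of the --project filter semantics, sketched under an explicit assumption: that the literal value `none` selects conversations without a project, and any other value is matched as a case-insensitive substring of the project name. The function name `matches_project` is hypothetical.

```python
from typing import Optional


def matches_project(project_name: Optional[str], project_filter: Optional[str]) -> bool:
    """Decide whether a conversation passes the --project filter."""
    if project_filter is None:
        return True  # no filter given: keep everything
    if project_filter.lower() == "none":
        # Assumption: "none" means "conversations with no project".
        return project_name is None
    if project_name is None:
        return False
    return project_filter.lower() in project_name.lower()


hit = matches_project("My Project", "project")
miss = matches_project("My Project", "other")
unfiled = matches_project(None, "none")
```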

[0.1.0] - Unreleased

Added

Initial implementation: ChatGPT and Claude export via internal web APIs
Markdown and JSON exporters
Local cache/manifest for incremental sync
CLI with export, list, cache, doctor, and auth commands
