Extracts per-message content into a typed `blocks` list (text, code,
thinking, tool_use, tool_result, image_placeholder, file_placeholder,
unknown) and renders them at exporter write time. Voice transcripts,
Custom Instructions, and image references now appear in exports
instead of being silently dropped.
Foundation:
- src/blocks.py: pure block constructors, _safe_fence (fence-corruption
defense, verified live in Joplin), _blockquote_prefix, render
- src/loss_report.py: per-run tally surfaced as an INFO summary at the end
of export so silently dropped data becomes visible
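For reviewers unfamiliar with fence corruption, here is a minimal sketch
of the idea behind a helper like _safe_fence. It is not the code in
src/blocks.py: the name safe_fence, its signature, and the choice of
"one backtick longer than the longest run in the content" are
assumptions made for illustration.

```python
import re


def safe_fence(code: str, lang: str = "") -> str:
    """Wrap code in a backtick fence that the content cannot close early.

    Hypothetical helper for illustration; not the real _safe_fence.
    """
    # Longest run of consecutive backticks anywhere in the content.
    longest = max((len(run) for run in re.findall(r"`+", code)), default=0)
    # A fence needs at least three backticks; go one past the longest run.
    fence = "`" * max(3, longest + 1)
    return f"{fence}{lang}\n{code}\n{fence}"
```

Content that itself contains a three-backtick line is wrapped in a
four-backtick fence, so a Markdown renderer keeps it inside one block.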
Providers:
- ChatGPT: dispatch on content_type produces typed blocks; voice shapes
(audio_transcription, audio_asset_pointer,
real_time_user_audio_video_asset_pointer) locked from live DevTools
capture; Custom Instructions bug fix (parts-vs-direct-fields); role
filter lifted; hidden-context marker driven by the
is_visually_hidden_from_conversation flag
- Claude: defensive dispatch for text/thinking/tool_use/tool_result/image
with recursive nested-block flattening; untested against real
rich-content data; fix forward in v0.4.1
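The dispatch pattern used by both providers can be sketched roughly as
below. This is an illustration under assumptions, not the provider code:
the handler names, the block dict layout, and the two payload shapes
shown (text in a "parts" list, code in direct fields) are hypothetical
stand-ins for the real export shapes.

```python
from typing import Any, Callable

Block = dict[str, Any]


def _text_blocks(content: dict) -> list[Block]:
    # Assumed shape: text lives in a "parts" list of strings.
    parts = content.get("parts", [])
    return [{"type": "text", "text": p} for p in parts if isinstance(p, str) and p]


def _code_blocks(content: dict) -> list[Block]:
    # Assumed shape: code lives in direct fields rather than in "parts".
    return [{
        "type": "code",
        "language": content.get("language", ""),
        "text": content.get("text", ""),
    }]


# Hypothetical handler table keyed by content_type.
HANDLERS: dict[str, Callable[[dict], list[Block]]] = {
    "text": _text_blocks,
    "code": _code_blocks,
}


def extract_blocks(content: dict) -> list[Block]:
    """Turn one message payload into typed blocks; never drop it silently."""
    content_type = content.get("content_type", "")
    handler = HANDLERS.get(content_type)
    if handler is None:
        # Unrecognized shapes become an explicit "unknown" block.
        return [{"type": "unknown", "content_type": content_type}]
    return handler(content)
```

Falling back to an "unknown" block rather than returning nothing is what
keeps unrecognized content countable in the loss report.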
Exporter:
- Markdown renders from blocks at write time via render_blocks_to_markdown;
backward-compat fallback to content for any pre-v0.4.0 cached data
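The write-time fallback can be sketched as follows. This is a simplified
illustration, not render_blocks_to_markdown itself: the function name
render_message, the message dict layout, and the bracketed placeholder
for non-text blocks are assumptions.

```python
def render_message(message: dict) -> str:
    """Render one cached message, preferring blocks over legacy content.

    Hypothetical helper for illustration only.
    """
    blocks = message.get("blocks")
    if blocks is None:
        # Data cached before v0.4.0 carries only the flat "content" string.
        return message.get("content", "")

    rendered = []
    for block in blocks:
        if block.get("type") == "text":
            rendered.append(block.get("text", ""))
        else:
            # Non-text blocks get a visible placeholder in this sketch.
            rendered.append(f"[{block.get('type', 'unknown')}]")
    return "\n\n".join(rendered)
```

Checking for a missing blocks key, rather than an empty list, lets a
message that legitimately has no blocks render as empty instead of
falling back to stale content.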
Tests:
- 27 new tests across providers, exporters, CLI; fixtures rebuilt with
real-shape ChatGPT voice + Custom Instructions cases
- 181/181 pass
Behavior changes (intentional):
- JSON output omits content; consumers should read blocks
- Per-conversation message counts increase (Custom Instructions,
image-only, and tool-only messages now appear)
- Existing exports are not re-rendered automatically; for fresh output,
run cache --clear, then export
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>