feat: v0.4.0 — rich content support with typed blocks and loss visibility

Extracts per-message content into a typed `blocks` list (text, code, thinking, tool_use, tool_result, image_placeholder, file_placeholder, unknown) and renders them at exporter write time. Voice transcripts, Custom Instructions, and image references now appear in exports instead of being silently dropped.

Foundation:
- src/blocks.py: pure block constructors, _safe_fence (fence-corruption defense, verified live in Joplin), _blockquote_prefix, render
- src/loss_report.py: per-run tally surfaced as INFO summary at end of export so silently-dropped data becomes visible

Providers:
- ChatGPT: dispatch on content_type produces typed blocks; voice shapes (audio_transcription, audio_asset_pointer, real_time_user_audio_video_asset_pointer) locked from live DevTools capture; Custom Instructions bug fix (parts-vs-direct-fields); role filter lifted; hidden-context marker driven by is_visually_hidden_from_conversation flag
- Claude: defensive dispatch for text/thinking/tool_use/tool_result/image with recursive nested-block flattening; untested against real rich-content data, fix-forward in v0.4.1

Exporter:
- Markdown renders from blocks at write time via render_blocks_to_markdown; backward-compat fallback to content for any pre-v0.4.0 cached data

Tests:
- 27 new tests across providers, exporters, CLI; fixtures rebuilt with real-shape ChatGPT voice + Custom Instructions cases
- 181/181 pass

Behavior changes (intentional):
- JSON output omits content; consumers should read blocks
- Per-conversation message counts increase (Custom Instructions, image-only, tool-only messages now appear)
- Existing exports not auto-re-rendered; users wanting fresh output run cache --clear then export

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
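For readers unfamiliar with fence corruption: a code block whose body itself contains a run of backticks can prematurely close a Markdown fence. The body of `_safe_fence` is not part of this diff, so the following is only a minimal sketch of one common defense (choosing a fence longer than any backtick run in the content). The name `safe_fence` and its signature are hypothetical, not the project's API.

```python
import re


def safe_fence(code: str, lang: str = "") -> str:
    """Wrap `code` in a Markdown fence that the content cannot close early.

    Hypothetical helper for illustration; not the project's _safe_fence.
    """
    # Find the longest run of backticks anywhere in the content.
    longest = max((len(run) for run in re.findall(r"`+", code)), default=0)
    # A fence must be at least 3 backticks and longer than any run inside it.
    fence = "`" * max(3, longest + 1)
    return f"{fence}{lang}\n{code}\n{fence}"
```

With this approach, content containing a triple-backtick line is wrapped in a four-backtick fence, while ordinary content keeps the usual three.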
2026-05-04 23:17:18 -04:00
parent 4798edcea7
commit 473d02f71a
16 changed files with 1786 additions and 232 deletions
--- a/tests/test_cli.py
+++ b/tests/test_cli.py
@@ -127,3 +127,50 @@ class TestExportSinceValidation:
            },
        )
        assert "Invalid --since date" not in result.output
+
+
+# ---------------------------------------------------------------------------
+# LossReport summary
+# ---------------------------------------------------------------------------
+
+
+class TestLossReportSummary:
+    """The LossReport's format_summary() pinned format covers zero, top-5, and overflow cases."""
+
+    def test_zero_summary_uses_none_sentinel(self):
+        from src.loss_report import LossReport
+
+        report = LossReport()
+        out = report.format_summary()
+        assert "[export] Run summary:" in out
+        assert "conversations:        0" in out
+        assert "messages rendered:    0" in out
+        # Both "(none)" sentinels present — never empty parens
+        assert out.count("(none)") == 2
+
+    def test_top_5_breakdown(self):
+        from src.loss_report import LossReport
+
+        report = LossReport()
+        for raw_type in ("a", "b", "c", "d", "e", "f", "g"):
+            report.record_unknown(raw_type)
+            if raw_type == "a":
+                # Make 'a' the most common
+                for _ in range(4):
+                    report.record_unknown("a")
+        out = report.format_summary()
+        # Top entry shown
+        assert "a=5" in out
+        # Overflow line present (7 types, top 5 + 2 more)
+        assert "+ 2 more types" in out
+
+    def test_messages_and_conversations_recorded(self):
+        from src.loss_report import LossReport
+
+        report = LossReport()
+        report.record_conversation()
+        report.record_message()
+        report.record_message()
+        out = report.format_summary()
+        assert "conversations:        1" in out
+        assert "messages rendered:    2" in out
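
The tests above pin only the output format; `src/loss_report.py` itself is not shown in this hunk. A minimal sketch of a class that would satisfy these assertions might look like the following. The 22-character label column, the `placeholders` counter (assumed here as the second source of a "(none)" sentinel), and the row labels for the two breakdown lines are guesses for illustration, not the project's actual implementation.

```python
from collections import Counter


class LossReport:
    """Illustrative sketch only; the real class may differ."""

    TOP_N = 5  # how many types to list before collapsing into an overflow note

    def __init__(self):
        self.conversations = 0
        self.messages = 0
        self.unknown = Counter()
        self.placeholders = Counter()  # assumed second breakdown category

    def record_conversation(self):
        self.conversations += 1

    def record_message(self):
        self.messages += 1

    def record_unknown(self, raw_type):
        self.unknown[raw_type] += 1

    @classmethod
    def _breakdown(cls, counts):
        # Empty tallies render as "(none)" so the summary never shows empty parens.
        if not counts:
            return "(none)"
        top = counts.most_common(cls.TOP_N)
        text = ", ".join(f"{name}={n}" for name, n in top)
        extra = len(counts) - len(top)
        if extra > 0:
            text += f" + {extra} more types"
        return text

    def format_summary(self):
        rows = [
            ("conversations:", self.conversations),
            ("messages rendered:", self.messages),
            ("unknown types:", self._breakdown(self.unknown)),
            ("placeholders:", self._breakdown(self.placeholders)),
        ]
        lines = ["[export] Run summary:"]
        # Left-align each label in a fixed-width column so values line up.
        lines += [f"  {label:<22}{value}" for label, value in rows]
        return "\n".join(lines)
```

Using `Counter.most_common` keeps the top-5 selection and the overflow count in one place, which is what makes the `a=5` and `+ 2 more types` assertions straightforward to satisfy.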