Commit Graph

27 Commits

Author SHA1 Message Date
JesseMarkowitz
473d02f71a feat: v0.4.0 — rich content support with typed blocks and loss visibility
Extracts per-message content into a typed `blocks` list (text, code,
thinking, tool_use, tool_result, image_placeholder, file_placeholder,
unknown) and renders them at exporter write time. Voice transcripts,
Custom Instructions, and image references now appear in exports
instead of being silently dropped.

Foundation:
- src/blocks.py: pure block constructors, _safe_fence (fence-corruption
  defense, verified live in Joplin), _blockquote_prefix, render
- src/loss_report.py: per-run tally surfaced as INFO summary at end of
  export so silently-dropped data becomes visible
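
The fence-corruption defense attributed to _safe_fence above can be sketched roughly as follows. This is an assumption about the general technique, not the contents of src/blocks.py: the fence is made one backtick longer than the longest backtick run inside the content, so embedded fences cannot close it early. The name `safe_fence` and its signature are hypothetical.

```python
import re


def safe_fence(code: str, lang: str = "") -> str:
    """Wrap code in a backtick fence that the content cannot terminate.

    Hypothetical helper: the real _safe_fence may work differently.
    """
    # Find every run of consecutive backticks inside the content.
    runs = re.findall(r"`+", code)
    longest = max((len(run) for run in runs), default=0)
    # A fence needs at least three backticks and must be longer than any inner run.
    fence = "`" * max(3, longest + 1)
    return f"{fence}{lang}\n{code}\n{fence}"
```

Content without backticks gets an ordinary three-backtick fence; content that itself contains a three-backtick line gets a four-backtick fence.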

Providers:
- ChatGPT: dispatch on content_type produces typed blocks; voice shapes
  (audio_transcription, audio_asset_pointer,
  real_time_user_audio_video_asset_pointer) locked from live DevTools
  capture; Custom Instructions bug fix (parts-vs-direct-fields); role
  filter lifted; hidden-context marker driven by
  is_visually_hidden_from_conversation flag
- Claude: defensive dispatch for text/thinking/tool_use/tool_result/image
  with recursive nested-block flattening; untested against real rich-
  content data — fix-forward in v0.4.1
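
One way to picture the recursive nested-block flattening described for the Claude provider is the sketch below. The input shape (dicts with `type`, `text`, and an optional nested `content` list) and the name `flatten_blocks` are assumptions for illustration; the commit itself notes the real dispatch is untested against live rich-content data.

```python
from typing import Any, Dict, Iterator, List

# Block types named in the commit message for the Claude provider.
KNOWN_TYPES = {"text", "thinking", "tool_use", "tool_result", "image"}


def flatten_blocks(items: List[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield a flat sequence of blocks, descending into nested content lists."""
    for item in items:
        if not isinstance(item, dict):
            continue  # defensive: ignore shapes we do not understand
        kind = item.get("type")
        if kind not in KNOWN_TYPES:
            kind = "unknown"  # keep the block visible instead of dropping it
        nested = item.get("content")
        if isinstance(nested, list):
            # Emit the parent first, then its children in order.
            yield {"type": kind, "text": ""}
            yield from flatten_blocks(nested)
        else:
            text = item.get("text") or (nested if isinstance(nested, str) else "")
            yield {"type": kind, "text": text}
```

Unrecognized types are mapped to "unknown" rather than skipped, in keeping with the commit's goal of making losses visible.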

Exporter:
- Markdown renders from blocks at write time via render_blocks_to_markdown;
  backward-compat fallback to content for any pre-v0.4.0 cached data
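
The backward-compat fallback could look something like the sketch below: render from `blocks` when present, otherwise fall back to the legacy `content` string. The function names and the simplified per-block rendering are hypothetical and stand in for whatever render_blocks_to_markdown actually does.

```python
FENCE = "`" * 3


def render_block(block: dict) -> str:
    """Very reduced stand-in for per-block rendering (hypothetical)."""
    kind = block.get("type", "unknown")
    text = block.get("text", "")
    if kind == "text":
        return text
    if kind == "code":
        return f"{FENCE}\n{text}\n{FENCE}"
    # Placeholders and unknown blocks stay visible as a marker.
    return f"[{kind}]"


def message_markdown(message: dict) -> str:
    """Prefer typed blocks; fall back to pre-v0.4.0 cached content."""
    blocks = message.get("blocks")
    if blocks:
        return "\n\n".join(render_block(b) for b in blocks)
    return message.get("content", "")
```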

Tests:
- 27 new tests across providers, exporters, CLI; fixtures rebuilt with
  real-shape ChatGPT voice + Custom Instructions cases
- 181/181 pass

Behavior changes (intentional):
- JSON output omits content; consumers should read blocks
- Per-conversation message counts increase (Custom Instructions, image-
  only, tool-only messages now appear)
- Existing exports not auto-re-rendered; users wanting fresh output run
  cache --clear then export

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-04 23:17:18 -04:00
JesseMarkowitz
4798edcea7 docs: update README for chunked ChatGPT session cookies
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 22:32:01 -04:00
JesseMarkowitz
19bfdaecbe fix: v0.2.1 — chunked ChatGPT cookies and Claude project path
- Support __Secure-next-auth.session-token.0/.1 split cookies; ChatGPT
  now issues tokens that exceed the 4KB per-cookie limit and must be
  sent as two named chunks or the auth endpoint returns no accessToken.
  Add CHATGPT_SESSION_TOKEN_1 env var; update auth wizard instructions.
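
A minimal sketch of how the two chunks might be turned into cookies, assuming the first chunk lives in an env var named CHATGPT_SESSION_TOKEN (that name is an assumption; only CHATGPT_SESSION_TOKEN_1 is named above). The function name is hypothetical.

```python
from typing import Dict, Mapping

COOKIE_BASE = "__Secure-next-auth.session-token"


def session_cookies(env: Mapping[str, str]) -> Dict[str, str]:
    """Build the cookie dict for a single or a chunked session token."""
    first = env.get("CHATGPT_SESSION_TOKEN", "")  # assumed variable name
    second = env.get("CHATGPT_SESSION_TOKEN_1", "")
    if not first:
        return {}
    if second:
        # Oversized token: send both named chunks.
        return {f"{COOKIE_BASE}.0": first, f"{COOKIE_BASE}.1": second}
    # Token fits in one cookie: use the unsuffixed name.
    return {COOKIE_BASE: first}
```

Passing `os.environ` as `env` would cover the normal case; a plain dict is enough for tests.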

- Fix Claude conversations exported to wrong directory when project name
  is present in the listing but absent from the detail endpoint response.
  Explicitly propagate "project" alongside _-prefixed annotation keys.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 22:32:14 -04:00
JesseMarkowitz
4ccd918eb1 fix: list command shows Claude titles and fits 80-col terminals
Claude's list endpoint returns conversations with a `name` field rather
than `title`, so every Claude row was falling through to "Untitled".
Also set no_wrap + ellipsis overflow and tune column widths so the table
renders one row per conversation in Windows Command Prompt (80 cols).
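
The title fix amounts to a fallback chain along these lines. This is a sketch with a hypothetical function name, not the project's actual code.

```python
def display_title(conversation: dict) -> str:
    """Return the ChatGPT-style title, else the Claude-style name, else a default."""
    return conversation.get("title") or conversation.get("name") or "Untitled"
```

Using `or` rather than a default argument also covers entries where the field is present but empty or null.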

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 14:49:58 -04:00
JesseMarkowitz
a869e8c7ba fix for project files written to wrong directory
2026-03-30 15:25:18 -04:00
JesseMarkowitz
340293ab94 fix for project files not extracted
2026-03-30 13:22:05 -04:00
JesseMarkowitz
050cd49124 updated to run on Windows and add est capabilities
2026-03-30 11:08:05 -04:00
JesseMarkowitz
304cf4fde4 feat: v0.2.0 — Joplin import, ChatGPT Projects, --project filter
Core features:
- Add `joplin` command: syncs exported Markdown to Joplin via local REST API
- Notebooks auto-created per provider+project (e.g. "ChatGPT - My Project")
- Idempotent: notes updated (not duplicated) on re-run; note ID tracked in manifest
- Add `--project` filter to `export` and `list` commands (substring or 'none')
- Add ChatGPT Projects support via CHATGPT_PROJECT_IDS env var
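
The `--project` matching rule (substring or 'none') might be expressed as below. Case-insensitive comparison and the function name are assumptions; the commit only states "substring or 'none'".

```python
from typing import Optional


def matches_project(project: Optional[str], wanted: Optional[str]) -> bool:
    """Decide whether a conversation passes the --project filter."""
    if wanted is None:
        return True  # no filter given: keep everything
    if wanted.lower() == "none":
        return not project  # keep only conversations outside any project
    # Substring match; case-insensitivity is an assumption.
    return bool(project) and wanted.lower() in project.lower()
```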

Config:
- Add JOPLIN_API_TOKEN, JOPLIN_API_URL, JOPLIN_REQUEST_TIMEOUT
- Version now read from importlib.metadata (single source of truth: pyproject.toml)
- Bump version to 0.2.0

Quality:
- Explicit Timeout handling in JoplinClient with actionable error messages
- Token validation (validate_token) separate from connectivity (ping)
- Remove debug_auth.py, debug_claude.py, and untracked .har file
- Add *.har to .gitignore (may contain auth cookies/session tokens)
- Update README, CHANGELOG, FUTURE.md to reflect v0.2.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 06:04:03 -05:00
JesseMarkowitz
23d7c17255 fix: use curl_cffi Chrome TLS impersonation for Claude provider
claude.ai has the same Cloudflare TLS fingerprinting protection as
chatgpt.com. Apply the same fix: curl_cffi impersonate=chrome120,
remove base class User-Agent to avoid JA3/UA mismatch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 05:34:35 -05:00
JesseMarkowitz
51c806c2c6 fix: set Claude sessionKey in cookie jar instead of raw header
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 05:33:04 -05:00
JesseMarkowitz
f500c038cb fix: remove User-Agent override to prevent TLS/UA fingerprint mismatch
curl_cffi sets a User-Agent consistent with its JA3 TLS fingerprint.
BaseProvider's custom UA (Chrome/121) conflicted with the chrome120
TLS fingerprint, causing Cloudflare to flag the request as a bot.
Removing the UA from session headers lets curl_cffi manage its own.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 05:27:53 -05:00
JesseMarkowitz
bb92ed2731 fix: update debug_auth.py to check accessToken presence by key
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 05:24:18 -05:00
JesseMarkowitz
5c6dcafa34 fix: use curl_cffi Chrome TLS impersonation to bypass Cloudflare
chatgpt.com uses Cloudflare's TLS fingerprinting (JA3/JA4) which
blocks Python requests regardless of cookies. curl_cffi impersonates
Chrome's exact TLS handshake, making requests indistinguishable from
a real browser at the transport layer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 05:20:52 -05:00
JesseMarkowitz
d236fdb21a fix: set session cookie in cookie jar instead of manual header
Using self._session.cookies.set() ensures the cookie is sent correctly
by the requests session on all calls, including /api/auth/session.
Also add sec-fetch-* headers required by chatgpt.com.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 23:39:38 -05:00
JesseMarkowitz
6a33de682a fix: implement two-step ChatGPT auth (session cookie → access token)
The __Secure-next-auth.session-token cannot be used directly as a Bearer
token. It must first be exchanged via GET /api/auth/session (with the token
sent as a Cookie) to obtain a short-lived accessToken. This accessToken is
then used as the Authorization: Bearer header for all backend-api calls.
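
The exchange described above can be sketched as follows. The `session` argument is assumed to be any object with a requests-style `get()` whose cookie jar already holds the session token; the function name is hypothetical, and error handling is reduced to the one failure the commit mentions.

```python
from typing import Dict


def exchange_session_token(session, base_url: str = "https://chatgpt.com") -> Dict[str, str]:
    """Trade the session cookie for a short-lived access token.

    Returns the Authorization header to use on backend-api calls.
    """
    # Step 1: the session cookie is sent implicitly by the session's cookie jar.
    response = session.get(f"{base_url}/api/auth/session")
    token = response.json().get("accessToken")
    if not token:
        raise RuntimeError("no accessToken in /api/auth/session response")
    # Step 2: the access token, not the cookie, becomes the Bearer credential.
    return {"Authorization": f"Bearer {token}"}
```

Because the access token is short-lived, a caller would presumably repeat the exchange when backend calls start failing with an auth error.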

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 23:37:42 -05:00
JesseMarkowitz
b41634d892 fix: load .env in doctor command; handle JWE tokens gracefully
Doctor was reading env vars before loading .env, so tokens set in .env
were invisible. ChatGPT now uses JWE (encrypted JWT) tokens which
PyJWT cannot decode without the server key — treat decode failure as
"token set, expiry unknown" rather than a FAIL.
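
A rough illustration of the "expiry unknown" handling, written with the standard library instead of PyJWT and without any signature verification. It relies on the compact serializations: a signed JWT has three dot-separated segments with a readable payload, while a JWE has five and an encrypted payload. The function name is hypothetical.

```python
import base64
import json
from typing import Optional


def token_expiry(token: str) -> Optional[float]:
    """Return the exp claim if readable, else None for "expiry unknown"."""
    parts = token.split(".")
    if len(parts) != 3:
        return None  # e.g. five-segment JWE: payload is encrypted
    payload = parts[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped base64 padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:
        return None  # undecodable payload is not a failure, just unknown
    if not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    return exp if isinstance(exp, (int, float)) else None
```

A doctor-style check would then report a None result as "token set, expiry unknown" rather than FAIL.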

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 23:34:54 -05:00
JesseMarkowitz
8e9ca36b57 docs: add README
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 23:14:37 -05:00
JesseMarkowitz
726905cc09 test: add unit tests and fixtures
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 23:13:15 -05:00
JesseMarkowitz
389732fd9e feat: add CLI
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 23:10:31 -05:00
JesseMarkowitz
d1cac3ce04 feat: add markdown and JSON exporters
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 23:03:58 -05:00
JesseMarkowitz
f4ef937aa1 feat: add cache module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 23:01:15 -05:00
JesseMarkowitz
3adb2d2b48 feat: add ChatGPT and Claude providers
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 22:59:06 -05:00
JesseMarkowitz
6073034789 feat: add provider base class
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 22:56:07 -05:00
JesseMarkowitz
6a32e127fd feat: add config loader with validation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 22:54:37 -05:00
JesseMarkowitz
1f347b581f feat: add utils module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 22:53:07 -05:00
JesseMarkowitz
3efc4f3045 feat: add logging config
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 22:49:17 -05:00
JesseMarkowitz
62445c7c0c chore: initialize project scaffold
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 22:45:46 -05:00