feat: v0.2.0 — Joplin import, ChatGPT Projects, --project filter

Core features:
- Add `joplin` command: syncs exported Markdown to Joplin via local REST API
- Notebooks auto-created per provider+project (e.g. "ChatGPT - My Project")
- Idempotent: notes updated (not duplicated) on re-run; note ID tracked in manifest
- Add `--project` filter to `export` and `list` commands (substring or 'none')
- Add ChatGPT Projects support via CHATGPT_PROJECT_IDS env var

Config:
- Add JOPLIN_API_TOKEN, JOPLIN_API_URL, JOPLIN_REQUEST_TIMEOUT
- Version now read from importlib.metadata (single source of truth: pyproject.toml)
- Bump version to 0.2.0

Quality:
- Explicit Timeout handling in JoplinClient with actionable error messages
- token validation (validate_token) separate from connectivity (ping)
- Remove debug_auth.py, debug_claude.py, and untracked .har file
- Add *.har to .gitignore (may contain auth cookies/session tokens)
- Update README, CHANGELOG, FUTURE.md to reflect v0.2.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 06:04:03 -05:00
parent 23d7c17255
commit 304cf4fde4
16 changed files with 1795 additions and 133 deletions
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # AI Chat Exporter

-A personal backup tool for ChatGPT and Claude conversation history. Exports your chats to Markdown files structured for archival in [Joplin](https://joplinapp.org/). Each conversation becomes a single `.md` file with YAML frontmatter, organised into folders that map directly to Joplin notebooks.
+A personal backup tool for ChatGPT and Claude conversation history. Exports your chats to Markdown files and syncs them to [Joplin](https://joplinapp.org/) as notes. Each conversation becomes a single `.md` file with YAML frontmatter, organised into folders that map directly to Joplin notebooks.

 Supports incremental sync — only new or updated conversations are exported on each run. Every run is resumable: if interrupted, re-running picks up exactly where it left off.

@@ -101,20 +101,62 @@ Copy `.env.example` to `.env` and fill in your values:
 cp .env.example .env
 ```

+### Provider tokens
+
+| Variable | Description |
+|----------|-------------|
+| `CHATGPT_SESSION_TOKEN` | Your ChatGPT JWT session token (`eyJ…`) |
+| `CHATGPT_PROJECT_IDS` | Comma-separated ChatGPT project IDs (see below) |
+| `CLAUDE_SESSION_KEY` | Your Claude session key |
+
+### Output
+
 | Variable | Default | Description |
 |----------|---------|-------------|
-| `CHATGPT_SESSION_TOKEN` | — | Your ChatGPT JWT session token |
-| `CLAUDE_SESSION_KEY` | — | Your Claude session key |
-| `EXPORT_DIR` | `./exports` | Where to write exported files |
+| `EXPORT_DIR` | `./exports` | Where to write exported Markdown files |
 | `OUTPUT_STRUCTURE` | `provider/project/year` | Folder structure (see below) |
+
+### Joplin
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `JOPLIN_API_TOKEN` | — | Authorization token from Joplin Web Clipper settings |
+| `JOPLIN_API_URL` | `http://localhost:41184` | Joplin API URL (change only if you've customised the port) |
+| `JOPLIN_REQUEST_TIMEOUT` | `30` | Seconds before an API call times out. Increase for very large conversations. |
+
+### Cache & logging
+
+| Variable | Default | Description |
+|----------|---------|-------------|
 | `CACHE_DIR` | `~/.ai-chat-exporter` | Where to store the sync manifest |
 | `LOG_FILE` | `~/.ai-chat-exporter/logs/exporter.log` | Log file path (`none` to disable) |

 ---

+## ChatGPT Projects
+
+ChatGPT project conversations are stored separately from your main conversation list and require extra configuration.
+
+### Finding your project IDs
+
+1. Open ChatGPT and click a Project in the left sidebar
+2. Look at the browser URL — it will look like:
+   `https://chatgpt.com/g/g-p-68c2b2b3037c8191890036fb4ae3ed9f-my-project/project`
+3. Copy the `g-p-…` part: the `g-p-` prefix plus the ID that follows it, stopping before the hyphen that starts the slug (`-my-project` in the example URL)
+
+Add all your project IDs to `.env` as a comma-separated list:
+
+```
+CHATGPT_PROJECT_IDS=g-p-68c2b2b3037c8191890036fb4ae3ed9f,g-p-anotherprojectid
+```
+
+The `auth` wizard can also guide you through this step interactively.
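
The manual steps above could also be scripted. The following is a minimal sketch, not code from this commit: `extract_project_id` and `format_env_line` are hypothetical helpers, and the pattern assumes an ID is `g-p-` followed by lowercase hex digits, which is inferred only from the single example URL.

```python
import re
from typing import List, Optional

# Assumption: a project ID is "g-p-" followed by lowercase hex digits,
# as in the example URL. The real ID format may differ.
_PROJECT_ID = re.compile(r"g-p-[0-9a-f]+")


def extract_project_id(url: str) -> Optional[str]:
    """Return the g-p-... ID found in a ChatGPT project URL, or None."""
    match = _PROJECT_ID.search(url)
    return match.group(0) if match else None


def format_env_line(urls: List[str]) -> str:
    """Build a CHATGPT_PROJECT_IDS line, skipping non-matches and duplicates."""
    ids: List[str] = []
    for url in urls:
        project_id = extract_project_id(url)
        if project_id and project_id not in ids:
            ids.append(project_id)
    return "CHATGPT_PROJECT_IDS=" + ",".join(ids)
```

Because the match stops at the first character that is not a hex digit, the trailing slug (`-my-project`) is dropped without counting hyphens.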
+
+---
+
 ## Output Structure

-All exported files go under `EXPORT_DIR`. The structure maps to Joplin notebooks.
+All exported files go under `EXPORT_DIR`. The folder structure maps directly to Joplin notebooks.

 ### Default: `provider/project/year`

@@ -136,7 +178,9 @@ exports/
            └── 2024-06-10_manifest-setup_jkl22222.md
 ```

-### Joplin Notebook Mapping (for future automated import)
+### Joplin Notebook Mapping
+
+Each provider+project combination maps to a flat Joplin notebook created automatically by the `joplin` command:

 | Export folder | Joplin notebook |
 |---------------|-----------------|
@@ -177,7 +221,7 @@ exports/
 python -m src.main auth
 ```

-Guided wizard to find and save session tokens. Detects OS and shows the correct DevTools shortcut.
+Guided wizard to find and save session tokens and ChatGPT project IDs. Detects OS and shows the correct DevTools shortcut.

 ### `doctor` — Health check

@@ -205,6 +249,12 @@ python -m src.main export --format both
 # Only conversations updated since a date
 python -m src.main export --since 2024-06-01

+# Only conversations in a specific project (case-insensitive substring)
+python -m src.main export --project "learning python"
+
+# Only conversations outside any project
+python -m src.main export --project none
+
 # Write to a custom directory
 python -m src.main export --output /path/to/my/notes

@@ -212,15 +262,54 @@ python -m src.main export --output /path/to/my/notes
 python -m src.main export --dry-run
 ```

-Options: `--provider [chatgpt|claude|all]`, `--format [markdown|json|both]`, `--output PATH`, `--since YYYY-MM-DD`, `--dry-run`
+Options: `--provider [chatgpt|claude|all]`, `--format [markdown|json|both]`, `--output PATH`, `--since YYYY-MM-DD`, `--project NAME`, `--dry-run`
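
The `--project` semantics described above (case-insensitive substring, or `none` for conversations outside any project) can be sketched as a small predicate. This is an illustration under stated assumptions, not the tool's actual code: `matches_project` is a hypothetical name, and treating the `none` keyword itself as case-insensitive is a guess.

```python
from typing import Optional


def matches_project(project: Optional[str], project_filter: Optional[str]) -> bool:
    """Decide whether a conversation passes a --project filter.

    project: the conversation's project name, or None if it is in no project.
    project_filter: the raw --project value, or None when the flag is absent.
    """
    if project_filter is None:
        return True  # no filter given: keep everything
    if project_filter.strip().lower() == "none":
        # Assumption: the keyword is matched case-insensitively.
        return project is None  # keep only conversations outside any project
    if project is None:
        return False  # a name filter never matches project-less conversations
    return project_filter.lower() in project.lower()
```

One consequence of this design is that a project whose name contains "none" cannot be selected by the exact filter value `none`; a longer substring of the name would still match it.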

 ### `list` — List conversations

 ```bash
+# List all conversations for all providers
+python -m src.main list
+
+# Single provider
 python -m src.main list --provider chatgpt
+
+# Filter by project
+python -m src.main list --project "learning python"
+
+# Only conversations outside any project
+python -m src.main list --project none
 ```

-Fetches and displays all conversations without exporting them.
+Fetches and displays all conversations without exporting them. Useful for verifying what the tool can see before running an export.
+
+### `joplin` — Sync to Joplin
+
+```bash
+# Sync all pending conversations to Joplin
+python -m src.main joplin
+
+# Preview what would be synced without sending anything
+python -m src.main joplin --dry-run
+
+# Sync a single provider
+python -m src.main joplin --provider chatgpt
+
+# Sync only conversations in a specific project
+python -m src.main joplin --project "learning python"
+
+# Sync only conversations outside any project
+python -m src.main joplin --project none
+```
+
+Reads the local export cache and pushes each exported Markdown file to Joplin as a note. Notebooks are created automatically. Re-running is safe — notes are updated (not duplicated).
+
+**Prerequisites:**
+1. Run `export` first to generate the Markdown files
+2. Open Joplin → Tools → Options → Web Clipper → enable the service
+3. Copy the Authorization token and add `JOPLIN_API_TOKEN=<token>` to your `.env`
+4. Joplin desktop must be open when you run this command
+
+Options: `--provider [chatgpt|claude|all]`, `--project NAME`, `--dry-run`

 ### `cache` — Manage the sync manifest

@@ -239,15 +328,20 @@ python -m src.main cache --clear --provider claude

 ## How the Cache Works

-The cache manifest lives at `~/.ai-chat-exporter/manifest.json` and records every exported conversation: its title, project, `updated_at` timestamp, and output file path.
+The cache manifest lives at `~/.ai-chat-exporter/manifest.json` and records every exported conversation: its title, project, `updated_at` timestamp, output file path, and (after Joplin sync) the Joplin note ID.

-On every run:
+On every `export` run:
 1. Fetch the full conversation list from the provider
 2. Compare each conversation's `updated_at` against the manifest
 3. Export only conversations that are new or have been updated
 4. Write each successfully exported conversation to the manifest **immediately** (not batched)

-**This design makes every run inherently resumable.** If the tool is interrupted for any reason — rate limit, network drop, Ctrl+C, crash — simply re-run the same command. It will skip already-exported conversations and continue from where it stopped.
+On every `joplin` run:
+1. Read the manifest to find conversations not yet synced to Joplin, or re-exported since last sync
+2. Push each pending Markdown file to Joplin (create or update)
+3. Store the Joplin note ID in the manifest so subsequent runs update rather than duplicate
+
+**This design makes every run inherently resumable.** If the tool is interrupted for any reason — rate limit, network drop, Ctrl+C, crash — simply re-run the same command. It will skip already-processed conversations and continue from where it stopped.

 To force a full re-export: `python -m src.main cache --clear` then re-run export.
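
Step 1 of the `joplin` run (finding conversations that are unsynced or re-exported since the last sync) might look roughly like the sketch below. The manifest field names `exported_at`, `joplin_note_id`, and `joplin_synced_at` are assumptions made for illustration; the README states only that the note ID is stored, not how the manifest is laid out.

```python
from typing import Dict, List


def pending_for_joplin(manifest: Dict[str, dict]) -> List[str]:
    """Return IDs of conversations that still need a Joplin create or update.

    Assumes timestamps are ISO 8601 strings in one consistent format and
    timezone, so plain string comparison orders them chronologically.
    """
    pending: List[str] = []
    for conversation_id, entry in manifest.items():
        note_id = entry.get("joplin_note_id")      # hypothetical field name
        synced_at = entry.get("joplin_synced_at")  # hypothetical field name
        exported_at = entry.get("exported_at", "")  # hypothetical field name
        never_synced = note_id is None or synced_at is None
        if never_synced or exported_at > synced_at:
            pending.append(conversation_id)
    return pending
```

Since the result depends only on what is already recorded in the manifest, an interrupted run leaves unprocessed entries pending and the next run picks them up, which is the resumability property described above.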

@@ -265,11 +359,36 @@ Note: Claude's `sessionKey` is an opaque string — the only way to know it's ex
 ### `429 Rate Limited`
 The tool automatically pauses, saves progress, and exits with a clear message showing how many conversations were exported vs remaining. Just re-run the same export command to resume — the cache picks up exactly where it left off.

+### Joplin: "JOPLIN_API_TOKEN is not set"
+You need to configure the token before running the `joplin` command:
+1. Open Joplin desktop
+2. Go to Tools → Options → Web Clipper
+3. Enable the Web Clipper service
+4. Copy the Authorization token shown on that page
+5. Add `JOPLIN_API_TOKEN=<token>` to your `.env` file
+
+### Joplin: "Joplin is not responding"
+Joplin desktop must be running when you run the `joplin` command. The Web Clipper service shuts down when Joplin is closed.
+
+### Joplin: "Joplin rejected the API token (HTTP 401)"
+The token in `JOPLIN_API_TOKEN` doesn't match what Joplin expects. Get a fresh token from Joplin → Tools → Options → Web Clipper → Authorization token.
+
+### Joplin: note timed out
+If you see a timeout error, Joplin took longer than `JOPLIN_REQUEST_TIMEOUT` seconds (default: 30) to respond. Possible causes:
+- The conversation is very large and Joplin is slow to index it
+- Joplin is busy syncing or loading a large library
+- Joplin has frozen — try restarting it
+
+To increase the timeout: add `JOPLIN_REQUEST_TIMEOUT=60` to your `.env`.
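
For reference, reading `JOPLIN_REQUEST_TIMEOUT` with the documented default of 30 seconds could be done along these lines. This is a hedged sketch: `read_joplin_timeout` is a hypothetical helper, and rejecting non-numeric or non-positive values is an assumption about sensible behaviour rather than something the commit describes.

```python
from typing import Mapping

DEFAULT_TIMEOUT_SECONDS = 30.0  # the default documented in the README


def read_joplin_timeout(env: Mapping[str, str]) -> float:
    """Parse JOPLIN_REQUEST_TIMEOUT from an environment-like mapping."""
    raw = env.get("JOPLIN_REQUEST_TIMEOUT", "").strip()
    if not raw:
        return DEFAULT_TIMEOUT_SECONDS  # unset or empty: use the default
    try:
        timeout = float(raw)
    except ValueError:
        raise ValueError(
            f"JOPLIN_REQUEST_TIMEOUT must be a number of seconds, got {raw!r}"
        ) from None
    if not timeout > 0:  # also rejects NaN, which fails every comparison
        raise ValueError(
            f"JOPLIN_REQUEST_TIMEOUT must be greater than zero, got {raw!r}"
        )
    return timeout
```

In practice it would be called as `read_joplin_timeout(os.environ)` after the `.env` file has been loaded.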
+
+### ChatGPT project conversations not appearing
+Make sure you've added the project IDs to `CHATGPT_PROJECT_IDS` in your `.env`. See [ChatGPT Projects](#chatgpt-projects) for how to find them. Project conversations are not included in the default conversation listing — they must be fetched separately.
+
 ### Schema warnings in logs (`Unexpected API response shape`)
 The provider's internal API may have changed. Run with `--debug`, sanitize the output (remove any personal content), and check the project's GitHub Issues for known fixes.

 ### Non-text content warnings
-Images, code interpreter outputs, DALL-E generations, and Claude artifacts are not exported in v0.1.0. A WARNING is logged for each skipped item. See `FUTURE.md` for the v0.4.0 roadmap.
+Images, code interpreter outputs, DALL-E generations, and Claude artifacts are not exported in v0.2.0. A WARNING is logged for each skipped item. See `FUTURE.md` for the roadmap.

 ### Empty export / all conversations skipped
 No new or updated conversations since your last run. To verify: `python -m src.main cache --show`. To force a full re-export: `python -m src.main cache --clear`.
@@ -285,17 +404,18 @@ No new or updated conversations since your last run. To verify: `python -m src.m

 See `FUTURE.md` for planned features:

- **v0.1.x** — `export --force` flag to bypass cache for a single run
- **v0.2.0** — Joplin integration: auto-import exported files via Joplin's local REST API
+- **v0.2.x** — `export --force` flag; `joplin --force` flag; per-conversation cache reset
 - **v0.3.0** — Official API fallback: parse export ZIP files from ChatGPT/Claude settings
 - **v0.4.0** — Rich content: images, artifacts, code interpreter output, extended thinking
+- **v0.5.0** — Watch/scheduled mode; Obsidian vault output

 ---

 ## Security Notes

- All exported data is stored **locally only** — nothing is sent anywhere
+- All exported data is stored **locally only** — nothing is sent anywhere except to your local Joplin instance
 - Exported files and the cache manifest are created with `600` permissions (owner read/write only)
 - `.env` is in `.gitignore` — **never commit it**
 - Session tokens are never logged, printed, or included in error messages
+- The Joplin API token is only ever sent to `localhost` — it never leaves your machine
 - If you accidentally commit `.env`: immediately log out and back in to invalidate the token, then remove it from git history using [BFG Repo Cleaner](https://rtyley.github.io/bfg-repo-cleaner/) or `git filter-branch`