feat: v0.2.0 — Joplin import, ChatGPT Projects, --project filter

Core features:
- Add `joplin` command: syncs exported Markdown to Joplin via local REST API
- Notebooks auto-created per provider+project (e.g. "ChatGPT - My Project")
- Idempotent: notes updated (not duplicated) on re-run; note ID tracked in manifest
- Add `--project` filter to `export` and `list` commands (substring or 'none')
- Add ChatGPT Projects support via CHATGPT_PROJECT_IDS env var

Config:
- Add JOPLIN_API_TOKEN, JOPLIN_API_URL, JOPLIN_REQUEST_TIMEOUT
- Version now read from importlib.metadata (single source of truth: pyproject.toml)
- Bump version to 0.2.0

Quality:
- Explicit Timeout handling in JoplinClient with actionable error messages
- Token validation (validate_token) kept separate from the connectivity check (ping)
- Remove debug_auth.py, debug_claude.py, and untracked .har file
- Add *.har to .gitignore (may contain auth cookies/session tokens)
- Update README, CHANGELOG, FUTURE.md to reflect v0.2.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
JesseMarkowitz
2026-03-01 06:04:03 -05:00
parent 23d7c17255
commit 304cf4fde4
16 changed files with 1795 additions and 133 deletions

README.md

@@ -1,6 +1,6 @@
# AI Chat Exporter
A personal backup tool for ChatGPT and Claude conversation history. Exports your chats to Markdown files structured for archival in [Joplin](https://joplinapp.org/). Each conversation becomes a single `.md` file with YAML frontmatter, organised into folders that map directly to Joplin notebooks.
A personal backup tool for ChatGPT and Claude conversation history. Exports your chats to Markdown files and syncs them to [Joplin](https://joplinapp.org/) as notes. Each conversation becomes a single `.md` file with YAML frontmatter, organised into folders that map directly to Joplin notebooks.
Supports incremental sync — only new or updated conversations are exported on each run. Every run is resumable: if interrupted, re-running picks up exactly where it left off.
@@ -101,20 +101,62 @@ Copy `.env.example` to `.env` and fill in your values:
cp .env.example .env
```
### Provider tokens
| Variable | Description |
|----------|-------------|
| `CHATGPT_SESSION_TOKEN` | Your ChatGPT JWT session token (`eyJ…`) |
| `CHATGPT_PROJECT_IDS` | Comma-separated ChatGPT project IDs (see below) |
| `CLAUDE_SESSION_KEY` | Your Claude session key |
### Output
| Variable | Default | Description |
|----------|---------|-------------|
| `CHATGPT_SESSION_TOKEN` | — | Your ChatGPT JWT session token |
| `CLAUDE_SESSION_KEY` | — | Your Claude session key |
| `EXPORT_DIR` | `./exports` | Where to write exported files |
| `EXPORT_DIR` | `./exports` | Where to write exported Markdown files |
| `OUTPUT_STRUCTURE` | `provider/project/year` | Folder structure (see below) |
### Joplin
| Variable | Default | Description |
|----------|---------|-------------|
| `JOPLIN_API_TOKEN` | — | Authorization token from Joplin Web Clipper settings |
| `JOPLIN_API_URL` | `http://localhost:41184` | Joplin API URL (change only if you've customised the port) |
| `JOPLIN_REQUEST_TIMEOUT` | `30` | Seconds before an API call times out. Increase for very large conversations. |
### Cache & logging
| Variable | Default | Description |
|----------|---------|-------------|
| `CACHE_DIR` | `~/.ai-chat-exporter` | Where to store the sync manifest |
| `LOG_FILE` | `~/.ai-chat-exporter/logs/exporter.log` | Log file path (`none` to disable) |
---
## ChatGPT Projects
ChatGPT project conversations are stored separately from your main conversation list and require extra configuration.
### Finding your project IDs
1. Open ChatGPT and click a Project in the left sidebar
2. Look at the browser URL — it will look like:
`https://chatgpt.com/g/g-p-68c2b2b3037c8191890036fb4ae3ed9f-my-project/project`
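3. Copy the `g-p-…` part: the `g-p-` prefix plus the string of letters and digits that follows it, stopping before the hyphen that starts the human-readable slug (`-my-project` in the example above)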
Add all your project IDs to `.env` as a comma-separated list:
```
CHATGPT_PROJECT_IDS=g-p-68c2b2b3037c8191890036fb4ae3ed9f,g-p-anotherprojectid
```
The `auth` wizard can also guide you through this step interactively.
---
## Output Structure
All exported files go under `EXPORT_DIR`. The structure maps to Joplin notebooks.
All exported files go under `EXPORT_DIR`. The folder structure maps directly to Joplin notebooks.
### Default: `provider/project/year`
@@ -136,7 +178,9 @@ exports/
└── 2024-06-10_manifest-setup_jkl22222.md
```
### Joplin Notebook Mapping (for future automated import)
### Joplin Notebook Mapping
Each provider+project combination maps to a flat Joplin notebook created automatically by the `joplin` command:
| Export folder | Joplin notebook |
|---------------|-----------------|
@@ -177,7 +221,7 @@ exports/
python -m src.main auth
```
Guided wizard to find and save session tokens. Detects OS and shows the correct DevTools shortcut.
Guided wizard to find and save session tokens and ChatGPT project IDs. Detects OS and shows the correct DevTools shortcut.
### `doctor` — Health check
@@ -205,6 +249,12 @@ python -m src.main export --format both
# Only conversations updated since a date
python -m src.main export --since 2024-06-01
# Only conversations in a specific project (case-insensitive substring)
python -m src.main export --project "learning python"
# Only conversations outside any project
python -m src.main export --project none
# Write to a custom directory
python -m src.main export --output /path/to/my/notes
@@ -212,15 +262,54 @@ python -m src.main export --output /path/to/my/notes
python -m src.main export --dry-run
```
Options: `--provider [chatgpt|claude|all]`, `--format [markdown|json|both]`, `--output PATH`, `--since YYYY-MM-DD`, `--dry-run`
Options: `--provider [chatgpt|claude|all]`, `--format [markdown|json|both]`, `--output PATH`, `--since YYYY-MM-DD`, `--project NAME`, `--dry-run`
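The `--project` matching rule described above can be sketched as follows. This is an illustration only, not the tool's actual code: the function name `matches_project` is hypothetical, and treating `none` case-insensitively is an assumption.

```python
from typing import Optional


def matches_project(project_name: Optional[str], project_filter: Optional[str]) -> bool:
    """Return True if a conversation passes the --project filter.

    Assumed semantics, taken from the README examples:
    - no filter given: every conversation matches
    - "none": only conversations that belong to no project
    - anything else: case-insensitive substring match on the project name
    """
    if project_filter is None:
        return True
    if project_filter.strip().lower() == "none":
        return not project_name
    if not project_name:
        # A named filter can never match a conversation without a project.
        return False
    return project_filter.lower() in project_name.lower()
```

With this rule, `--project "learning python"` would match a project called "Learning Python 101", while `--project none` would keep only conversations that have no project at all.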
### `list` — List conversations
```bash
# List all conversations for all providers
python -m src.main list
# Single provider
python -m src.main list --provider chatgpt
# Filter by project
python -m src.main list --project "learning python"
# Only conversations outside any project
python -m src.main list --project none
```
Fetches and displays all conversations without exporting them.
Fetches and displays all conversations without exporting them. Useful for verifying what the tool can see before running an export.
### `joplin` — Sync to Joplin
```bash
# Sync all pending conversations to Joplin
python -m src.main joplin
# Preview what would be synced without sending anything
python -m src.main joplin --dry-run
# Sync a single provider
python -m src.main joplin --provider chatgpt
# Sync only conversations in a specific project
python -m src.main joplin --project "learning python"
# Sync only conversations outside any project
python -m src.main joplin --project none
```
Reads the local export cache and pushes each exported Markdown file to Joplin as a note. Notebooks are created automatically. Re-running is safe — notes are updated (not duplicated).
**Prerequisites:**
1. Run `export` first to generate the Markdown files
2. Open Joplin → Tools → Options → Web Clipper → enable the service
3. Copy the Authorization token and add `JOPLIN_API_TOKEN=<token>` to your `.env`
4. Joplin desktop must be open when you run this command
Options: `--provider [chatgpt|claude|all]`, `--project NAME`, `--dry-run`
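The create-or-update behaviour can be sketched against Joplin's local Data API, which accepts `POST /notes` to create a note and `PUT /notes/:id` to update one, with the token passed as a `token` query parameter. The helper name `build_note_request` is hypothetical; this only builds the request and is not necessarily how `JoplinClient` is implemented.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_note_request(
    base_url: str,
    token: str,
    title: str,
    body: str,
    notebook_id: str,
    note_id: Optional[str] = None,
) -> Request:
    """Build a create (POST) or update (PUT) request for a single note."""
    payload = {"title": title, "body": body, "parent_id": notebook_id}
    query = urlencode({"token": token})
    if note_id:
        # A note ID is already known (e.g. stored in the manifest): update in place.
        url = f"{base_url}/notes/{note_id}?{query}"
        method = "PUT"
    else:
        # First sync of this conversation: create a new note.
        url = f"{base_url}/notes?{query}"
        method = "POST"
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )
```

The request could then be sent with `urllib.request.urlopen(request, timeout=...)`; the JSON response to a create call includes the new note's `id`, which is the value a later run would pass as `note_id`.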
### `cache` — Manage the sync manifest
@@ -239,15 +328,20 @@ python -m src.main cache --clear --provider claude
## How the Cache Works
The cache manifest lives at `~/.ai-chat-exporter/manifest.json` and records every exported conversation: its title, project, `updated_at` timestamp, and output file path.
The cache manifest lives at `~/.ai-chat-exporter/manifest.json` and records every exported conversation: its title, project, `updated_at` timestamp, output file path, and (after Joplin sync) the Joplin note ID.
On every run:
On every `export` run:
1. Fetch the full conversation list from the provider
2. Compare each conversation's `updated_at` against the manifest
3. Export only conversations that are new or have been updated
4. Write each successfully exported conversation to the manifest **immediately** (not batched)
**This design makes every run inherently resumable.** If the tool is interrupted for any reason — rate limit, network drop, Ctrl+C, crash — simply re-run the same command. It will skip already-exported conversations and continue from where it stopped.
On every `joplin` run:
1. Read the manifest to find conversations not yet synced to Joplin, or re-exported since last sync
2. Push each pending Markdown file to Joplin (create or update)
3. Store the Joplin note ID in the manifest so subsequent runs update rather than duplicate
**This design makes every run inherently resumable.** If the tool is interrupted for any reason — rate limit, network drop, Ctrl+C, crash — simply re-run the same command. It will skip already-processed conversations and continue from where it stopped.
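The two decisions described above (what to export, what to sync) can be sketched as a pair of predicates over a manifest entry. The field names `exported_at`, `joplin_note_id`, and `joplin_synced_at` are assumptions made for illustration; only `updated_at` and the Joplin note ID are named in this README, and the real manifest schema may differ.

```python
from datetime import datetime
from typing import Optional


def needs_export(remote_updated_at: str, entry: Optional[dict]) -> bool:
    """True if the conversation is new or has changed since it was exported."""
    if entry is None:
        return True  # never exported before
    remote = datetime.fromisoformat(remote_updated_at)
    recorded = datetime.fromisoformat(entry["updated_at"])
    return remote > recorded


def needs_joplin_sync(entry: dict) -> bool:
    """True if the exported file was never synced, or was re-exported since."""
    if not entry.get("joplin_note_id"):
        return True  # no note exists in Joplin yet
    synced_at = entry.get("joplin_synced_at")
    if synced_at is None:
        return True
    exported = datetime.fromisoformat(entry["exported_at"])
    return exported > datetime.fromisoformat(synced_at)
```

Because each entry is written as soon as its conversation succeeds, an interrupted run leaves the manifest accurate, and predicates like these naturally skip whatever was already done.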
To force a full re-export: `python -m src.main cache --clear` then re-run export.
@@ -265,11 +359,36 @@ Note: Claude's `sessionKey` is an opaque string — the only way to know it's ex
### `429 Rate Limited`
The tool automatically pauses, saves progress, and exits with a clear message showing how many conversations were exported vs remaining. Just re-run the same export command to resume — the cache picks up exactly where it left off.
### Joplin: "JOPLIN_API_TOKEN is not set"
You need to configure the token before running the `joplin` command:
1. Open Joplin desktop
2. Go to Tools → Options → Web Clipper
3. Enable the Web Clipper service
4. Copy the Authorization token shown on that page
5. Add `JOPLIN_API_TOKEN=<token>` to your `.env` file
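For orientation, here is a minimal sketch of how the three Joplin settings might be read, with the documented defaults and an actionable error when the token is missing. `JoplinConfig` and `load_joplin_config` are hypothetical names, and the real tool may load `.env` differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping

DEFAULT_JOPLIN_URL = "http://localhost:41184"


@dataclass(frozen=True)
class JoplinConfig:
    api_token: str
    api_url: str = DEFAULT_JOPLIN_URL
    request_timeout: float = 30.0


def load_joplin_config(env: Mapping[str, str] = os.environ) -> JoplinConfig:
    """Read Joplin settings from the environment, applying documented defaults."""
    token = env.get("JOPLIN_API_TOKEN", "").strip()
    if not token:
        raise ValueError(
            "JOPLIN_API_TOKEN is not set. In Joplin, open Tools > Options > "
            "Web Clipper, enable the service, copy the Authorization token, "
            "and add JOPLIN_API_TOKEN=<token> to your .env file."
        )
    url = env.get("JOPLIN_API_URL", DEFAULT_JOPLIN_URL).rstrip("/")
    timeout = float(env.get("JOPLIN_REQUEST_TIMEOUT", "30"))
    if timeout <= 0:
        raise ValueError("JOPLIN_REQUEST_TIMEOUT must be a positive number of seconds.")
    return JoplinConfig(api_token=token, api_url=url, request_timeout=timeout)
```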
### Joplin: "Joplin is not responding"
Joplin desktop must be running when you run the `joplin` command. The Web Clipper service shuts down when Joplin is closed.
### Joplin: "Joplin rejected the API token (HTTP 401)"
The token in `JOPLIN_API_TOKEN` doesn't match what Joplin expects. Get a fresh token from Joplin → Tools → Options → Web Clipper → Authorization token.
### Joplin: note timed out
If you see a timeout error, Joplin took longer than `JOPLIN_REQUEST_TIMEOUT` seconds (default: 30) to respond. Possible causes:
- The conversation is very large and Joplin is slow to index it
- Joplin is busy syncing or loading a large library
- Joplin has frozen — try restarting it
To increase the timeout: add `JOPLIN_REQUEST_TIMEOUT=60` to your `.env`.
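The Joplin errors above correspond to distinct failure modes: a rejected token, a timeout, and a service that is not reachable. The sketch below shows one way to map `urllib` exceptions to those messages. The function name `describe_joplin_failure` is hypothetical, and the real `JoplinClient` may use a different HTTP library and different exception types.

```python
import socket
from urllib.error import HTTPError, URLError


def describe_joplin_failure(exc: Exception, timeout: float) -> str:
    """Translate a low-level request failure into an actionable message."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, HTTPError):
        if exc.code == 401:
            return (
                "Joplin rejected the API token (HTTP 401). Copy a fresh token "
                "from Tools > Options > Web Clipper."
            )
        return f"Joplin returned HTTP {exc.code}."
    # urllib may raise a timeout directly or wrap it in URLError.reason.
    reason = exc.reason if isinstance(exc, URLError) else exc
    if isinstance(reason, (TimeoutError, socket.timeout)):
        return (
            f"Joplin did not respond within {timeout:g} seconds. "
            "Increase JOPLIN_REQUEST_TIMEOUT or restart Joplin."
        )
    if isinstance(exc, (URLError, ConnectionError)):
        return "Joplin is not responding. Make sure Joplin desktop is running."
    return f"Unexpected error while talking to Joplin: {exc}"
```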
### ChatGPT project conversations not appearing
Make sure you've added the project IDs to `CHATGPT_PROJECT_IDS` in your `.env`. See [ChatGPT Projects](#chatgpt-projects) for how to find them. Project conversations are not included in the default conversation listing — they must be fetched separately.
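If copying IDs by hand is error-prone, the extraction can be sketched in a few lines. The pattern assumes an ID is `g-p-` followed by hexadecimal digits, which is inferred only from the example URL in this README; both function names are hypothetical.

```python
import re
from typing import List, Optional

# Assumed ID shape, inferred from the example URL: "g-p-" plus hex digits.
_PROJECT_ID_RE = re.compile(r"g-p-[0-9a-f]+")


def extract_project_id(url: str) -> Optional[str]:
    """Pull the project ID out of a ChatGPT project URL, dropping the slug."""
    match = _PROJECT_ID_RE.search(url)
    return match.group(0) if match else None


def parse_project_ids(raw: str) -> List[str]:
    """Split a CHATGPT_PROJECT_IDS value into a clean, de-duplicated list."""
    ids: List[str] = []
    for item in raw.split(","):
        item = item.strip()
        if item and item not in ids:
            ids.append(item)
    return ids
```

For the example URL shown earlier, `extract_project_id` returns `g-p-68c2b2b3037c8191890036fb4ae3ed9f`, leaving out the `-my-project` slug.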
### Schema warnings in logs (`Unexpected API response shape`)
The provider's internal API may have changed. Run with `--debug`, sanitize the output (remove any personal content), and check the project's GitHub Issues for known fixes.
### Non-text content warnings
Images, code interpreter outputs, DALL-E generations, and Claude artifacts are not exported in v0.1.0. A WARNING is logged for each skipped item. See `FUTURE.md` for the v0.4.0 roadmap.
Images, code interpreter outputs, DALL-E generations, and Claude artifacts are not exported in v0.2.0. A WARNING is logged for each skipped item. See `FUTURE.md` for the roadmap.
### Empty export / all conversations skipped
No new or updated conversations since your last run. To verify: `python -m src.main cache --show`. To force a full re-export: `python -m src.main cache --clear`.
@@ -285,17 +404,18 @@ No new or updated conversations since your last run. To verify: `python -m src.m
See `FUTURE.md` for planned features:
- **v0.1.x** — `export --force` flag to bypass cache for a single run
- **v0.2.0** — Joplin integration: auto-import exported files via Joplin's local REST API
- **v0.2.x** — `export --force` flag; `joplin --force` flag; per-conversation cache reset
- **v0.3.0** — Official API fallback: parse export ZIP files from ChatGPT/Claude settings
- **v0.4.0** — Rich content: images, artifacts, code interpreter output, extended thinking
- **v0.5.0** — Watch/scheduled mode; Obsidian vault output
---
## Security Notes
- All exported data is stored **locally only** — nothing is sent anywhere
- All exported data is stored **locally only** — nothing is sent anywhere except to your local Joplin instance
- Exported files and the cache manifest are created with `600` permissions (owner read/write only)
- `.env` is in `.gitignore`; **never commit it**
- Session tokens are never logged, printed, or included in error messages
- The Joplin API token is sent only to the address in `JOPLIN_API_URL` (`localhost` by default), so it stays on your machine unless you point that setting at another host
- If you accidentally commit `.env`: immediately log out and back in to invalidate the token, then remove it from git history using [BFG Repo Cleaner](https://rtyley.github.io/bfg-repo-cleaner/) or `git filter-branch`
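The `600` permission behaviour mentioned above can be sketched on POSIX systems as follows. The helper name `write_private_file` is hypothetical and the tool's own implementation may differ; on Windows the mode bits are largely ignored.

```python
import os


def write_private_file(path: str, text: str) -> None:
    """Write text to path so that only the owner can read or write it."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # Passing the mode at creation time avoids a window in which the
    # file exists with wider, umask-derived permissions.
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    # The creation mode is ignored for files that already exist, so
    # tighten the permissions explicitly as well.
    os.chmod(path, 0o600)
```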