# Planned Future Work

Items completed in each release are moved to the changelog. Items here are
designed for but not yet implemented. The codebase is structured to make each
of these additions straightforward.

**Completed:**

- v0.1.0 — Core export: ChatGPT + Claude, incremental sync, Markdown + JSON output
- v0.2.0 — Joplin import automation (`joplin` command, create/update notes, notebook auto-creation)
- v0.4.0 — Rich content support: typed message blocks (text, code, thinking, tool_use, tool_result, image_placeholder, file_placeholder, unknown); ChatGPT voice transcripts as text + audio placeholders; Custom Instructions extraction; data-loss visibility via `LossReport` summary and visible `unknown` blocks

---
## Export `--force` Flag (v0.2.x)

Add `--force` to the `export` command to re-export already-cached conversations
without permanently clearing the entire manifest. Useful for regenerating files
after changing the Markdown template or output structure.

Implementation: pass a `force=True` flag to `cache.get_new_or_updated()`, which
returns all conversations regardless of cache state when `force` is true.

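A minimal sketch of how the flag could thread through, using an in-memory stand-in for the manifest-backed cache (the `entries` shape and the `updated_at` comparison are assumptions; the real method lives on the project's cache class):

```python
from dataclasses import dataclass, field

@dataclass
class Cache:
    """Hypothetical in-memory stand-in for the manifest-backed cache."""
    entries: dict = field(default_factory=dict)  # conversation_id -> updated_at

    def get_new_or_updated(self, conversations, force=False):
        # force=True bypasses the cache check and returns everything,
        # so already-exported conversations get re-rendered.
        if force:
            return list(conversations)
        return [
            c for c in conversations
            if self.entries.get(c["id"]) != c["updated_at"]
        ]

cache = Cache(entries={"abc": "2024-01-01"})
convs = [
    {"id": "abc", "updated_at": "2024-01-01"},  # unchanged: normally skipped
    {"id": "def", "updated_at": "2024-02-01"},  # new: always exported
]
assert [c["id"] for c in cache.get_new_or_updated(convs)] == ["def"]
assert [c["id"] for c in cache.get_new_or_updated(convs, force=True)] == ["abc", "def"]
```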
Current workaround: `python -m src.main cache --clear`, then re-run export.

## Joplin `--force` Flag (v0.2.x)

Similarly, add `--force` to the `joplin` command to re-sync all cached
conversations to Joplin regardless of whether they've been synced before.
Useful after making formatting changes to the Markdown exporter.

Implementation: in `get_joplin_pending()`, return all entries that have a
`file_path` when `force=True`, ignoring `joplin_synced_at`.

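A sketch of that filter, assuming a flat manifest mapping of conversation id to entry (the exact manifest layout is an assumption):

```python
def get_joplin_pending(entries, force=False):
    """Manifest entries that still need a Joplin sync.

    Only entries with a rendered file_path are syncable; force=True
    ignores joplin_synced_at.
    """
    return [
        e for e in entries.values()
        if e.get("file_path") and (force or not e.get("joplin_synced_at"))
    ]

manifest = {
    "a": {"file_path": "a.md", "joplin_synced_at": "2024-01-01"},  # synced
    "b": {"file_path": "b.md", "joplin_synced_at": None},          # pending
    "c": {"file_path": None},                                      # never exported
}
assert [e["file_path"] for e in get_joplin_pending(manifest)] == ["b.md"]
assert [e["file_path"] for e in get_joplin_pending(manifest, force=True)] == ["a.md", "b.md"]
```

Entries without a `file_path` never qualify, even under `--force`, since there is nothing on disk to sync.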
## Per-Conversation Cache Reset (v0.2.x)

Add `cache --reset --conversation <id>` to force re-export or re-sync of a
single conversation without clearing the entire provider cache.

Current workaround: manually edit `~/.ai-chat-exporter/manifest.json` and
delete the entry, then re-run export.

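The reset itself is small. A sketch, assuming a flat `{conversation_id: entry}` manifest layout (the real manifest may nest entries per provider):

```python
import json
import tempfile
from pathlib import Path

def reset_conversation(manifest_path, conversation_id):
    """Drop one conversation's manifest entry so the next run re-processes it."""
    path = Path(manifest_path)
    manifest = json.loads(path.read_text())
    removed = manifest.pop(conversation_id, None) is not None
    path.write_text(json.dumps(manifest, indent=2))
    return removed

with tempfile.TemporaryDirectory() as d:
    p = Path(d) / "manifest.json"
    p.write_text(json.dumps({"abc": {"file_path": "abc.md"}}))
    assert reset_conversation(p, "abc") is True   # entry removed
    assert reset_conversation(p, "abc") is False  # already gone
```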
---
## Official API Fallback (v0.3.0)

If the unofficial internal web API approach breaks, migrate to official export
file parsing as a fallback:

- ChatGPT: parse `conversations.json` from Settings → Export Data
- Claude: parse `conversations.json` from Settings → Privacy → Export Data

The `BaseProvider` abstract class is intentionally designed so that a
`FileProvider` subclass can implement the same interface
(`list_conversations`, `get_conversation`, `normalize_conversation`)
without any changes to cache, exporters, or CLI code.

To add this: implement `src/providers/file_chatgpt.py` and
`src/providers/file_claude.py`, then add an `--input-file` flag to the
export command to accept a pre-downloaded export ZIP or JSON.

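A skeleton of what such a subclass could look like. The three method names come from the interface above; the `BaseProvider` stub, the class name `FileChatGPTProvider`, and the export-file field names are assumptions for illustration:

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path

class BaseProvider(ABC):
    """Assumed shape of the existing abstract interface."""
    @abstractmethod
    def list_conversations(self): ...
    @abstractmethod
    def get_conversation(self, conversation_id): ...
    @abstractmethod
    def normalize_conversation(self, raw): ...

class FileChatGPTProvider(BaseProvider):
    """Reads a pre-downloaded conversations.json instead of the web API."""

    def __init__(self, input_file):
        self._raw = json.loads(Path(input_file).read_text())
        self._by_id = {c["id"]: c for c in self._raw}

    def list_conversations(self):
        return [{"id": c["id"], "title": c.get("title", "")} for c in self._raw]

    def get_conversation(self, conversation_id):
        return self._by_id[conversation_id]

    def normalize_conversation(self, raw):
        # The real version would reuse the existing ChatGPT message walker.
        return {"id": raw["id"], "title": raw.get("title", ""), "messages": []}

with tempfile.TemporaryDirectory() as d:
    f = Path(d) / "conversations.json"
    f.write_text(json.dumps([{"id": "c1", "title": "Hello"}]))
    prov = FileChatGPTProvider(f)
    assert prov.list_conversations() == [{"id": "c1", "title": "Hello"}]
```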
---
## Binary Content Downloads (v0.5.0)

v0.4.0 ships placeholders for images and audio assets but does not download
the binary content. The `_safe_fence`-wrapped placeholders include the asset
reference (`sediment://...` or `file-service://...`), MIME type, size, and
duration where available; the actual bytes are not preserved.

Next steps:

- Download attached images alongside the Markdown export, save under a
  `media/` sibling directory with a stable filename derived from the asset
  reference.
- Replace `image_placeholder` rendering with an inline Markdown image
  reference once the file is on disk.
- Joplin integration: upload binaries as Joplin resources via `POST /resources`,
  rewrite the rendered Markdown to use `:/resourceId` references, and track
  the resource ID in the cache manifest so re-syncs stay idempotent.
- DALL-E images on the assistant side: not observed in this user's data; the
  code path exists (`source = "model_generated"`) but is untested.

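The stable-filename step is worth pinning down early, since re-exports must not duplicate media. One possible scheme (the MIME-to-extension map and hash length are assumptions):

```python
import hashlib

def media_filename(asset_ref, mime_type="image/png"):
    """Stable filename derived from the asset reference, so re-exports
    reuse the same file instead of downloading a duplicate."""
    ext = {"image/png": "png", "image/jpeg": "jpg", "audio/mpeg": "mp3"}.get(
        mime_type.lower(), "bin")
    digest = hashlib.sha256(asset_ref.encode()).hexdigest()[:16]
    return f"{digest}.{ext}"

name = media_filename("sediment://file_abc123", "image/png")
assert name == media_filename("sediment://file_abc123", "image/png")  # stable
assert name.endswith(".png")
```

Hashing the reference rather than slugifying it sidesteps unsafe characters in `sediment://` and `file-service://` URIs.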

The block-level schema is already in place; only the file-fetch and rewrite
layer needs to be added. See the `image_placeholder` and `file_placeholder`
block definitions in `src/blocks.py`.

## Reclassify o1/o3 Reasoning Subparts (v0.4.1)

v0.4.0 leaves dict parts inside `text` content_type messages with shape
`{"summary": ..., "content": ...}` rendered as plain text (defensive: the
shape was inferred from a code comment, not captured live). Once a real
reasoning conversation is captured, reclassify these as `thinking` blocks.

## Suppress Hidden Context (v0.4.x)

If Custom Instructions duplication across conversations becomes a storage
problem, add an `EXPORTER_INCLUDE_HIDDEN_CONTEXT=false` env var. The toggle is
a single `os.getenv()` check at the start of
`_extract_editable_context_blocks` in `src/providers/chatgpt.py`: return an
empty list if disabled.

---
## Scheduled / Watch Mode (v0.5.0)

Add a `watch` command (or cron integration helper) to run exports automatically
on a schedule:

```bash
python -m src.main watch --interval 6h   # poll every 6 hours
```

This would run `export` + `joplin` in sequence, then sleep. Alternatively,
provide a `cron` command that prints the correct crontab line for the user's
setup.

Implementation: a simple loop with `time.sleep()`, or emit a crontab entry
string that calls the export and joplin commands in sequence. A `--once`
flag would do a single run and then exit (useful for cron itself).

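The loop could look like this; `parse_interval` and its `6h`-style format are assumptions matching the example above:

```python
import time

def parse_interval(spec):
    """Parse '6h' / '30m' / '45s' into seconds (format is an assumption)."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(spec[:-1]) * units[spec[-1]]

def watch(run_pipeline, interval="6h", once=False):
    """Run the export+joplin pipeline, then sleep until the next poll."""
    while True:
        run_pipeline()
        if once:  # single pass, then exit: cron calls this repeatedly
            return
        time.sleep(parse_interval(interval))

runs = []
watch(lambda: runs.append("export+joplin"), once=True)
assert runs == ["export+joplin"]
assert parse_interval("6h") == 21600
```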
---
## Obsidian Vault Output (v0.5.0)

Add an `obsidian` command (or `--target obsidian` flag) to sync exported
conversations into an Obsidian vault directory. The current Markdown format
is already largely compatible; the main differences are:

- Obsidian uses YAML frontmatter `properties` (same format, already supported)
- Tags should use `#tag` inline or a `tags:` list in frontmatter (already done)
- Wikilinks (`[[Title]]`) instead of Markdown links (optional; Obsidian
  supports both)

Implementation: the existing `MarkdownExporter` output is already valid in
Obsidian. An `ObsidianSyncer` class (mirroring `JoplinClient`) would simply
copy files to the vault directory and maintain a flat or nested folder
structure matching the user's Obsidian setup. No API needed; just file I/O.

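A sketch of that syncer, mirroring the export tree into the vault (the class name comes from the text above; everything else is an assumption):

```python
import shutil
import tempfile
from pathlib import Path

class ObsidianSyncer:
    """Copies rendered Markdown into a vault, mirroring the export layout."""

    def __init__(self, vault_dir):
        self.vault_dir = Path(vault_dir)

    def sync(self, export_dir):
        copied = []
        for src in sorted(Path(export_dir).rglob("*.md")):
            dest = self.vault_dir / src.relative_to(export_dir)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)  # copy2 preserves mtimes for later diffing
            copied.append(dest)
        return copied

with tempfile.TemporaryDirectory() as d:
    export, vault = Path(d) / "export", Path(d) / "vault"
    (export / "ChatGPT").mkdir(parents=True)
    (export / "ChatGPT" / "note.md").write_text("# hi")
    ObsidianSyncer(vault).sync(export)
    assert (vault / "ChatGPT" / "note.md").read_text() == "# hi"
```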
---
## Joplin Nested Notebooks (future)

Currently notebooks are flat: `ChatGPT - My Project`. Joplin supports nested
notebooks via `parent_id`. A future option (`JOPLIN_NESTED_NOTEBOOKS=true`)
could create a two-level hierarchy:

```
ChatGPT/
  My Project/
  No Project/
Claude/
  Budget Tracker/
```

Implementation: `get_or_create_notebook` would first find/create the provider
notebook, then find/create the project notebook as a child.

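The two-step lookup, sketched against an in-memory stand-in for the Joplin data API (the `find_or_create` helper is hypothetical; the real client makes HTTP calls):

```python
class FakeJoplin:
    """In-memory stand-in for the Joplin data API (real calls go over HTTP)."""

    def __init__(self):
        self.notebooks = {}  # (title, parent_id) -> notebook id
        self._next = 0

    def find_or_create(self, title, parent_id=""):
        key = (title, parent_id)
        if key not in self.notebooks:
            self._next += 1
            self.notebooks[key] = f"nb{self._next}"
        return self.notebooks[key]

def get_or_create_notebook(api, provider, project, nested=True):
    if not nested:
        return api.find_or_create(f"{provider} - {project}")  # current flat scheme
    parent = api.find_or_create(provider)                     # e.g. "ChatGPT"
    return api.find_or_create(project, parent_id=parent)      # child notebook

api = FakeJoplin()
first = get_or_create_notebook(api, "ChatGPT", "My Project")
assert get_or_create_notebook(api, "ChatGPT", "My Project") == first  # idempotent
```

Keying on `(title, parent_id)` lets two providers each have a project with the same name without colliding.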
---
## Token Expiry Notifications (future)

Proactively warn when a token is close to expiry (within 48h for ChatGPT),
rather than only surfacing the warning at startup. Options:

- Add an `expiry` subcommand that prints token status and exits non-zero if
  any token is expired or expiring soon (useful in scripts/cron)
- Send a desktop notification via `notify-send` (Linux) or `osascript` (macOS)
  when a token is within 24h of expiry

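The status classification behind both options could be as small as this (function name and return values are assumptions):

```python
from datetime import datetime, timedelta, timezone

def expiry_status(expires_at, warn_within=timedelta(hours=48)):
    """Classify a token expiry timestamp for scripting/notification use."""
    now = datetime.now(timezone.utc)
    if expires_at <= now:
        return "expired"   # expiry subcommand would exit non-zero
    if expires_at - now <= warn_within:
        return "expiring"  # warn, or fire a desktop notification
    return "ok"

soon = datetime.now(timezone.utc) + timedelta(hours=24)
assert expiry_status(soon) == "expiring"
```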
---
## Search Command (future)

Add a `search` command to full-text search across all exported Markdown files:

```bash
python -m src.main search "kubernetes ingress"
python -m src.main search "kubernetes ingress" --provider claude --project devops
```


Implementation: `grep`/`ripgrep` over `EXPORT_DIR`, display results with
conversation title, date, and a snippet. No index needed; Markdown files are
small enough to grep directly.

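If shelling out to `grep` proves awkward, a pure-Python fallback is only a few lines. A sketch, assuming the provider can be matched against the file path (the tuple result shape is an assumption):

```python
import re
import tempfile
from pathlib import Path

def search(export_dir, query, provider=None):
    """Naive full-text search over exported Markdown; no index needed."""
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    hits = []
    for path in Path(export_dir).rglob("*.md"):
        if provider and provider.lower() not in str(path).lower():
            continue  # crude provider filter via directory/file name
        for lineno, line in enumerate(path.read_text().splitlines(), 1):
            if pattern.search(line):
                hits.append((path.name, lineno, line.strip()))
    return hits

with tempfile.TemporaryDirectory() as d:
    sub = Path(d) / "claude"
    sub.mkdir()
    (sub / "conv.md").write_text("Setting up Kubernetes ingress rules")
    assert search(d, "kubernetes ingress")[0][0] == "conv.md"
    assert search(d, "kubernetes ingress", provider="chatgpt") == []
```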
|