# Wakatipu Expanded API Catalog - 2026-06-05

This catalog is generated from `app.future_tools.FUTURE_ENDPOINTS` plus the migrated Lucerne JSON endpoint catalog. The first 70 endpoints extend the existing Wakatipu extraction API while preserving API-key authentication, request logging, tool flags, temp-file cleanup, and JSON-first responses. The Lucerne endpoints add web, text, ML, vector, workflow, and retained-output APIs to the same Wakatipu app.

| Endpoint | Tool | Package/Backend |
| --- | --- | --- |
| `/api/pdf/split` | `pdf.split` | `pypdf` |
| `/api/pdf/rotate-pages` | `pdf.rotate` | `pypdf` |
| `/api/pdf/delete-pages` | `pdf.delete-pages` | `pypdf` |
| `/api/pdf/extract-images` | `pdf.extract-images` | `pymupdf` |
| `/api/pdf/annotations` | `pdf.annotations` | `pymupdf` |
| `/api/pdf/form-fields` | `pdf.form-fields` | `pypdf` |
| `/api/pdf/fill-form` | `pdf.fill-form` | `pypdf` |
| `/api/pdf/flatten-form` | `pdf.flatten-form` | `pypdf` |
| `/api/pdf/redact` | `pdf.redact` | `pymupdf` |
| `/api/pdf/encryption-status` | `pdf.encryption-status` | `pypdf` |
| `/api/pdf/decrypt` | `pdf.decrypt` | `pypdf` |
| `/api/pdf/compress` | `pdf.compress` | `pypdf` |
| `/api/pdf/pages-to-images` | `pdf.pages-to-images` | `pymupdf` |
| `/api/pdf/ocr-searchable` | `pdf.ocr-searchable` | `ocrmypdf` |
| `/api/pdf/invoice-fields` | `pdf.invoice-fields` | `regex` |
| `/api/pdf/tables-export` | `pdf.tables-export` | `pdfplumber` |
| `/api/files/type` | `file.type` | `python-magic` |
| `/api/files/metadata` | `file.metadata` | `hashlib` |
| `/api/files/duplicate-fingerprint` | `file.duplicate-fingerprint` | `hashlib` |
| `/api/docx/structure` | `docx.structure` | `python-docx` |
| `/api/docx/comments-changes` | `docx.comments-changes` | `python-docx` |
| `/api/docx/to-markdown` | `docx.to-markdown` | `python-docx` |
| `/api/pptx/speaker-notes` | `pptx.speaker-notes` | `python-pptx` |
| `/api/pptx/images` | `pptx.images` | `python-pptx` |
| `/api/pptx/thumbnails` | `pptx.thumbnails` | `python-pptx` |
| `/api/odf/extract` | `odf.extract` | `odfpy` |
| `/api/rtf/extract` | `rtf.extract` | `builtin` |
| `/api/spreadsheets/schema` | `spreadsheet.schema` | `csv/openpyxl` |
| `/api/spreadsheets/formulas` | `spreadsheet.formulas` | `openpyxl` |
| `/api/spreadsheets/comments` | `spreadsheet.comments` | `openpyxl` |
| `/api/spreadsheets/named-ranges` | `spreadsheet.named-ranges` | `openpyxl` |
| `/api/spreadsheets/normalized-json` | `spreadsheet.normalized-json` | `csv/pandas` |
| `/api/csv/detect` | `csv.detect` | `csv` |
| `/api/csv/profile` | `csv.profile` | `csv` |
| `/api/images/exif` | `image.exif` | `pillow` |
| `/api/images/palette` | `image.palette` | `pillow` |
| `/api/images/thumbnail` | `image.thumbnail` | `pillow` |
| `/api/images/convert` | `image.convert` | `pillow` |
| `/api/images/orientation` | `image.orientation` | `pillow` |
| `/api/images/phash` | `image.phash` | `pillow` |
| `/api/ocr/boxes` | `ocr.boxes` | `pytesseract` |
| `/api/ocr/language` | `ocr.language` | `pytesseract` |
| `/api/ocr/confidence` | `ocr.confidence` | `pytesseract` |
| `/api/ocr/layout` | `ocr.layout` | `pytesseract` |
| `/api/barcodes/generate` | `barcode.generate` | `qrcode` |
| `/api/archives/manifest` | `archive.manifest` | `zipfile/tarfile` |
| `/api/archives/risk-scan` | `archive.risk-scan` | `zipfile/tarfile` |
| `/api/archives/safe-extract` | `archive.safe-extract` | `zipfile/tarfile` |
| `/api/archives/nested-inspect` | `archive.nested-inspect` | `zipfile/tarfile` |
| `/api/archives/password-detect` | `archive.password-detect` | `zipfile` |
| `/api/email/attachments-list` | `email.attachments-list` | `email` |
| `/api/email/attachments-extract` | `email.attachments-extract` | `email` |
| `/api/email/headers-analysis` | `email.headers-analysis` | `email` |
| `/api/email/thread-summary` | `email.thread-summary` | `regex` |
| `/api/email/pii-contacts` | `email.pii-contacts` | `regex` |
| `/api/media/metadata` | `media.metadata` | `ffprobe` |
| `/api/audio/waveform` | `audio.waveform` | `ffmpeg` |
| `/api/media/thumbnail` | `media.thumbnail` | `ffmpeg` |
| `/api/media/trim` | `media.trim` | `ffmpeg` |
| `/api/media/transcode` | `media.transcode` | `ffmpeg` |
| `/api/transcription/diarization` | `transcription.diarization` | `diarization` |
| `/api/transcription/subtitles` | `transcription.subtitles` | `srt/vtt` |
| `/api/pdf/merge` | `pdf.merge` | `pypdf` |
| `/api/spreadsheets/compare` | `spreadsheets.compare` | `csv/openpyxl` |
| `/api/batch/run` | `batch.run` | `background-jobs` |
| `/api/outputs/{job_id}/download` | `outputs.download` | `retained-output` |
| `/api/outputs/{job_id}/signed-link` | `outputs.signed-link` | `retained-output` |
| `/api/jobs/{job_id}/cancel` | `jobs.cancel` | `database` |
| `/api/jobs/{job_id}/retry` | `jobs.retry` | `database` |
| `/api/jobs/{job_id}/webhook` | `jobs.webhook` | `webhook-contract` |
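
The table above is described as generated from `app.future_tools.FUTURE_ENDPOINTS`. The shape of that object is not shown in this document, so the following is only a minimal sketch of how such a table could be rendered, assuming each entry reduces to a path, a tool name, and a backend label. `EndpointRow` and `render_catalog` are hypothetical names used for illustration.

```python
from typing import Iterable, NamedTuple


class EndpointRow(NamedTuple):
    """Hypothetical record; the real FUTURE_ENDPOINTS structure may differ."""
    path: str
    tool: str
    backend: str


def render_catalog(rows: Iterable[EndpointRow]) -> str:
    """Render rows as a Markdown table in the same three-column layout."""
    lines = ["| Endpoint | Tool | Package/Backend |", "| --- | --- | --- |"]
    for row in rows:
        lines.append(f"| `{row.path}` | `{row.tool}` | `{row.backend}` |")
    return "\n".join(lines)


sample = [
    EndpointRow("/api/pdf/split", "pdf.split", "pypdf"),
    EndpointRow("/api/files/type", "file.type", "python-magic"),
]
table = render_catalog(sample)
```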

## Request Pattern

Most endpoints accept `multipart/form-data` with a `file` field and an optional `options` field carrying a JSON string. Multi-file endpoints accept repeated `files` fields. Job and output endpoints take the path parameters shown in the table above (for example `{job_id}`).
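
As a rough illustration of this request pattern, the sketch below builds (but does not send) a single-file upload using only the standard library. The base URL, the `POST` method, and the `pages` option key are assumptions made for the example; none of them is specified in this catalog. The API-key header is omitted because its name is not given here.

```python
import json
import uuid
from typing import Optional, Tuple
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: hypothetical host


def encode_multipart(
    file_name: str, file_bytes: bytes, options: Optional[dict] = None
) -> Tuple[bytes, str]:
    """Return (body, content_type) with a `file` part and optional `options`."""
    boundary = uuid.uuid4().hex
    parts = []
    if options is not None:
        # `options` travels as a plain form field holding a JSON string.
        parts.append(
            (
                f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="options"\r\n\r\n'
                f"{json.dumps(options)}\r\n"
            ).encode("utf-8")
        )
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_upload(
    endpoint: str, file_name: str, file_bytes: bytes, options: Optional[dict] = None
) -> Request:
    """Build an unsent upload request; POST is assumed, not documented."""
    body, content_type = encode_multipart(file_name, file_bytes, options)
    return Request(
        BASE_URL + endpoint,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


# The option key "pages" is hypothetical.
upload = build_upload("/api/pdf/split", "report.pdf", b"%PDF-1.4", {"pages": "1-2"})
```

Passing `upload` to `urllib.request.urlopen` would send it. A multi-file endpoint would repeat a part named `files` once per file instead of a single `file` part.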

## Migrated Lucerne JSON Catalog

These endpoints accept JSON request bodies unless noted.

| Endpoint | Tool | Package/Backend |
| --- | --- | --- |
| `/api/web/extract-article` | `web.extract_article` | `trafilatura` |
| `/api/web/extract-readable` | `web.extract_readable` | `readability-lxml` |
| `/api/web/browser-snapshot` | `web.browser_snapshot` | `playwright/httpx-fallback` |
| `/api/web/links` | `web.links` | `beautifulsoup4` |
| `/api/web/scrape-selectors` | `web.scrape_selectors` | `beautifulsoup4` |
| `/api/text/clean` | `text.clean` | `ftfy` |
| `/api/text/language` | `text.language` | `langdetect` |
| `/api/text/entities` | `text.entities` | `spacy/regex-fallback` |
| `/api/text/keywords` | `text.keywords` | `nltk/counter-fallback` |
| `/api/text/dates` | `text.dates` | `dateparser/regex-fallback` |
| `/api/text/embed` | `text.embed` | `sentence-transformers/hash-fallback` |
| `/api/text/similarity` | `text.similarity` | `rapidfuzz` |
| `/api/text/dedupe` | `text.dedupe` | `dedupe/fallback` |
| `/api/text/topics` | `text.topics` | `gensim/bertopic/fallback` |
| `/api/ml/classify` | `ml.classify` | `scikit-learn` |
| `/api/ml/cluster` | `ml.cluster` | `scikit-learn/fallback` |
| `/api/vectors/search` | `vectors.search` | `faiss/qdrant/fallback` |
| `/api/workflows/run` | `workflows.run` | `inprocess` |
| `/api/workflows/jobs` | `workflows.jobs` | `inprocess` |
| `/api/workflows/jobs/{job_key}` | `workflows.jobs` | `inprocess` |
| `/api/outputs` | `storage.outputs` | `database/S3` |
| `/api/outputs/{output_key}` | `storage.outputs` | `database/S3` |
| `/api/tools/catalog` | `tools.catalog` | `database` |
| `/api/web/url-metadata` | `web.url_metadata` | `beautifulsoup4` |
| `/api/web/sitemap` | `web.sitemap` | `xml` |
| `/api/web/robots` | `web.robots` | `httpx` |
| `/api/web/crawl` | `web.crawl` | `httpx/beautifulsoup4` |
| `/api/web/feed` | `web.feed` | `beautifulsoup4` |
| `/api/web/screenshot` | `web.screenshot` | `playwright` |
| `/api/web/pdf-export` | `web.pdf_export` | `playwright` |
| `/api/web/structured-data` | `web.structured_data` | `json-ld` |
| `/api/web/social-cards` | `web.social_cards` | `beautifulsoup4` |
| `/api/web/broken-links` | `web.broken_links` | `httpx` |
| `/api/web/readability-score` | `web.readability_score` | `textstat` |
| `/api/web/sanitize-html` | `web.sanitize_html` | `beautifulsoup4` |
| `/api/web/diff` | `web.diff` | `rapidfuzz` |
| `/api/web/paywall-heuristic` | `web.paywall_heuristic` | `heuristic` |
| `/api/web/locale-language` | `web.locale_language` | `langdetect` |
| `/api/text/summarize` | `text.summarize` | `heuristic` |
| `/api/text/rewrite` | `text.rewrite` | `ftfy` |
| `/api/text/sentiment` | `text.sentiment` | `lexicon` |
| `/api/text/risk-flags` | `text.risk_flags` | `regex` |
| `/api/text/pii-redact` | `text.pii_redact` | `regex` |
| `/api/text/entity-link` | `text.entity_link` | `dictionary` |
| `/api/text/keywords-custom` | `text.keywords_custom` | `counter` |
| `/api/ml/taxonomy-classify` | `ml.taxonomy_classify` | `scikit-learn` |
| `/api/ml/multilabel` | `ml.multilabel` | `scikit-learn` |
| `/api/ml/cluster-summaries` | `ml.cluster_summaries` | `scikit-learn` |
| `/api/vectors/index` | `vectors.index` | `local-memory` |
| `/api/vectors/qdrant-search` | `vectors.qdrant_search` | `qdrant/fallback` |
| `/api/vectors/semantic-dedupe` | `vectors.semantic_dedupe` | `rapidfuzz` |
| `/api/workflows/templates` | `workflows.templates` | `inprocess` |
| `/api/workflows/schedule` | `workflows.schedule` | `inprocess` |
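
For the JSON endpoints in this table, a request might be assembled as in the sketch below. This is an illustration under assumptions: the base URL, the `POST` method, and the `text` body field are hypothetical, since the catalog lists paths and backends but not request schemas. The `fill_path` helper shows one way to substitute templated segments such as `{job_key}`.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: hypothetical host


def fill_path(template: str, **params: str) -> str:
    """Substitute `{name}` segments, percent-encoding each value."""
    encoded = {key: quote(str(value), safe="") for key, value in params.items()}
    return template.format(**encoded)


def build_json_request(path: str, payload: dict) -> Request:
    """Build an unsent JSON request; POST is assumed, not documented."""
    return Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The "text" field name is hypothetical.
request = build_json_request("/api/text/language", {"text": "Kia ora koutou"})
job_path = fill_path("/api/workflows/jobs/{job_key}", job_key="job 42/a")
```

Encoding path values with `safe=""` keeps a slash inside a key from being read as an extra path segment.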
