Feature Guide
Core Features
Translate Once, Output Many
Translate the same file into multiple languages in a single run, which suits multilingual subtitles and i18n projects. For example, translate an English subtitle file into Chinese, Japanese, German, and French at once and download all versions as a single package. 120+ languages are supported, with more added regularly.
Translation Cache
Translation results are saved locally in your browser. When parameters match, the tool returns the cached result and skips the API call:
- Persistent: survives refreshes and browser restarts
- High capacity: holds millions of records without bloating memory
- Toggle off: disable temporarily when debugging prompts or model settings
- One-click clear: clean up everything from the settings panel
- Hit conditions: source text + source/target language + service-specific key params must all match (LLMs key off prompt/temperature/thinking effort; Qwen-MT off domains; traditional MT only off the source text). Changing temperature, editing prompts, or toggling thinking will miss the cache.
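The hit conditions can be pictured as a deterministic key built from every field that takes part in the match. The following is a minimal sketch, not the tool's actual implementation: the service-type labels, the parameter names (`system_prompt`, `user_prompt`, `temperature`, `thinking_effort`, `domains`), and the use of SHA-256 are all assumptions chosen for illustration.

```python
import hashlib
import json

# Hypothetical mapping: which extra parameters take part in the key per service type.
KEY_PARAMS = {
    "llm": ("system_prompt", "user_prompt", "temperature", "thinking_effort"),
    "qwen-mt": ("domains",),
    "mt": (),  # traditional MT: nothing beyond text and languages
}


def cache_key(service_type, source_text, source_lang, target_lang, params):
    """Build a deterministic key; changing any participating field yields a new key."""
    relevant = {name: params.get(name) for name in KEY_PARAMS[service_type]}
    payload = json.dumps(
        [service_type, source_text, source_lang, target_lang, relevant],
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

With a key like this, changing the temperature on an LLM service produces a different key (a cache miss), while the same change on a traditional MT service leaves the key untouched.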
Long Text & Concurrency
Tuned for large documents and batch jobs:
- Concurrency control: customize request rate — max out paid APIs or throttle free ones to avoid bans
- Streaming for large files: chunk-based handling keeps the UI responsive
- Context-aware translation: subtitles and documents get sent with surrounding context so the AI understands flow
- Per-line retry: failed lines are tracked separately; the rest of the batch isn't blocked
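As a rough illustration of concurrency control with per-line failure tracking, the sketch below caps the number of in-flight requests with a semaphore and records failures by index so that one bad line does not block the rest. The function name `translate_all` and the `translate_line` callback are hypothetical; the tool's own scheduler may work differently.

```python
import asyncio


async def translate_all(lines, translate_line, max_concurrent=20):
    """Translate lines in parallel, with at most max_concurrent requests in flight.

    Returns (results, failed_indices); a failed line is left as None in results.
    """
    semaphore = asyncio.Semaphore(max_concurrent)
    results = [None] * len(lines)
    failed = []

    async def worker(index, text):
        async with semaphore:
            try:
                results[index] = await translate_line(text)
            except Exception:
                # Record the failure and let the other lines continue.
                failed.append(index)

    await asyncio.gather(*(worker(i, t) for i, t in enumerate(lines)))
    return results, sorted(failed)
```

Lowering `max_concurrent` throttles a free endpoint; raising it lets a paid API run closer to its rate limit.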
Failed-Line Retry
LLMs occasionally drop a line, return an empty response, or break formatting. When that happens:
- Failed lines automatically fall back to the original text — never empty, so the output is always usable
- A red alert at the top of the result panel says how many lines failed
- Click Retry failed lines to reissue only those — completed content stays as-is and isn't re-billed
- A copy button lets you grab the failed source rows for manual handling elsewhere
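The fallback and retry behaviour above can be sketched as two small steps: merge results with a fallback to the source text, then reissue only the failed indices. The function names and the `translate_line` callback are hypothetical, and this is an illustration rather than the tool's code.

```python
def apply_with_fallback(source_lines, translated_lines):
    """Merge results; an empty or missing translation falls back to the source line."""
    merged, failed = [], []
    for i, src in enumerate(source_lines):
        out = translated_lines[i] if i < len(translated_lines) else None
        if out is None or not out.strip():
            merged.append(src)
            failed.append(i)
        else:
            merged.append(out)
    return merged, failed


def retry_failed(source_lines, merged, failed, translate_line):
    """Reissue only the failed indices; completed lines are left untouched."""
    still_failed = []
    for i in failed:
        try:
            out = translate_line(source_lines[i])
        except Exception:
            out = None
        if out and out.strip():
            merged[i] = out
        else:
            still_failed.append(i)
    return merged, still_failed
```

Because `retry_failed` only calls the translator for indices in `failed`, lines that already succeeded generate no further requests.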
Multi-language batch mode: when translating into multiple targets at once, if a whole target language fails (quota exhausted, model refusal, etc.), the failing language codes get aggregated into a dedicated panel. One-click copy them back into the "target languages" field to retry.
Network blips, 429 rate-limits, and 5xx errors retry automatically. Bad API key, timeouts, context-length-exceeded, and max_tokens truncation never retry. See FAQ → Will failed translations retry? for the full list.
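One way to picture the split between transient and permanent failures is a wrapper that retries only a known set of error kinds. In this sketch the error labels, the `TranslationError` class, and the exponential backoff schedule are assumptions made for illustration; the tool's real retry timing is not documented here.

```python
import time

# Hypothetical error labels; a real client would map HTTP statuses and exceptions onto them.
RETRYABLE = {"network", "rate_limit_429", "server_5xx"}


class TranslationError(Exception):
    def __init__(self, kind):
        super().__init__(kind)
        self.kind = kind


def call_with_retry(request, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call request(); retry transient errors with backoff, re-raise everything else."""
    attempt = 0
    while True:
        try:
            return request()
        except TranslationError as err:
            # Bad key, timeout, context overflow, truncation: fail immediately.
            if err.kind not in RETRYABLE or attempt >= max_retries:
                raise
            sleep(base_delay * 2 ** attempt)  # assumed backoff: 1s, 2s, 4s, ...
            attempt += 1
```

Here `max_retries` plays the role of the Retry Count setting.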
Cancel Translation
Click the close button on the progress modal to abort a running batch. Already-translated lines are cached, so clicking "Translate" again resumes from where you stopped.
RTL Language Auto-Adaptation
Right-to-left languages (Arabic, Hebrew, Persian, Urdu) automatically render right-to-left in the textarea and result view — no manual configuration.
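A minimal sketch of how the direction could be derived from a language code, assuming the tool identifies languages by ISO 639-1 codes (optionally with a region suffix such as `fa-IR`); the function name is hypothetical.

```python
# ISO 639-1 codes for the RTL languages named above: Arabic, Hebrew, Persian, Urdu.
RTL_LANGS = {"ar", "he", "fa", "ur"}


def text_direction(lang_code):
    """Return "rtl" or "ltr" for a language code, ignoring any region suffix."""
    base = lang_code.lower().replace("_", "-").split("-")[0]
    return "rtl" if base in RTL_LANGS else "ltr"
```

The returned value maps directly onto the HTML `dir` attribute of a textarea or result container.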
Usage Modes
Batch vs. Single-File
The tool switches modes based on what you upload:
- Batch mode (default): drop multiple files, they queue up automatically and download as a bundle when done.
- Single-file mode: upload one file or paste text — review line-by-line, edit before exporting.
Advanced settings let you lock to single-file mode if you prefer.
JSON Translate is single-file mode only.
One-Click Source/Target Swap
A ⇄ button sits between the source and target language dropdowns; click it to swap them. The button greys out when the source is "Auto-detect" or when multi-language mode is on, since neither "auto" nor a list of several targets can become the other side of the swap.
Language Picker
122 languages are grouped by geography + speaker count (Common / Europe / Middle East / Central Asia / South Asia / Southeast Asia / Africa / Americas & Oceania) so you can find what you need fast. Multi-language mode adds four quick-preset buttons that merge-select common bundles:
- Global Top 10: the 10 most-spoken languages (English, Chinese, Spanish, French, Japanese, Portuguese, German, Russian, Hindi, Arabic)
- European mainstream: French, German, Italian, Spanish, Portuguese, Dutch, Polish, and other commercial European languages
- East Asian: Chinese (Simplified + Traditional), Japanese, Korean, Cantonese
- Indian subcontinent: Hindi, Bengali, Tamil, Marathi, Gujarati, and other major South Asian languages
In single-language mode the tool also remembers your last 5 picks and surfaces them in a "Recent" group at the top of the dropdown. The mobile layout collapses to a single column automatically.
API Connection Status
The badge at the top of the main page tells you the current API's status at a glance:
- Not configured / Needs config: URL or API key missing
- Configured: filled in but not yet tested
- Testing → ✓ Connected or Connection failed: test results
- Free API: free, no-config services like GTX
Click the badge to jump to the API settings panel.
Presets: API Config and Prompts, Separately
API configs and prompts are stored as two independent preset types so they combine freely:
- API presets: snapshot the current service's URL, key, model, temperature, etc. Useful for switching between local Ollama, a remote gateway, and a paid cloud endpoint.
- Prompt presets: snapshot the system + user prompts. Switch between a "strict terminology" prompt and a "creative paraphrase" prompt without touching the API config.
Both types support add / load / rename / update / delete, and travel with settings import/export.
Post-Translation Cleanup
After translation, the tool can automatically apply simple string replacements:
- Character filtering: strip stray symbols like ♪ ♫ from subtitles
- Format cleanup: remove leftover HTML tags
This feature does plain string replacement only — no escape sequences (\n, \t etc.). Use the Text Splitter for richer transformations.
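Plain string replacement of this kind can be sketched in a few lines. The function name and the list-of-fragments input are assumptions; the point is that each fragment is matched literally, with no regex and no escape-sequence interpretation.

```python
def clean_translation(text, fragments):
    """Remove each fragment literally; no regex and no escape-sequence handling."""
    for fragment in fragments:
        if fragment:  # ignore empty entries
            text = text.replace(fragment, "")
    return text
```

For example, `clean_translation("♪ Hello <i>world</i> ♪", ["♪", "<i>", "</i>"])` returns `" Hello world "`. A fragment typed as backslash followed by `n` matches those two literal characters only and leaves real newlines in place.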
Advanced Settings
Settings Import/Export
One-click backup of every configuration: API credentials, model parameters, API presets, Prompt presets. The exported JSON imports across devices, ideal for team sharing or moving to a new machine.
General Options
- Use Cache: enabled by default. Reads cached results when parameters match. Disable temporarily while debugging.
- Retry Count: maximum retries on failure. Bump it up on shaky networks or rate-limited free endpoints.
- Retry Timeout (seconds): per-request timeout. Increase for slow models or long text.
- Max response tokens (Custom OpenAI-compatible only): optional maxTokens cap to prevent local small models from getting stuck in repetition loops. Default 0 (unlimited). When a response truncates (finish_reason=length), the line is marked failed and won't retry, because the same params would truncate again. Cloud LLMs rely on the server-side default and don't expose this control.
- Remove characters after translation: auto-strip specified characters or fragments from results (e.g., ♪ in subtitles, leftover <i> tags).
- Custom export filename: standardize filenames in batch exports. Placeholders: {name} (source filename), {lang} (target language), {ext} (extension), {date}, {time}. Example: {name}_{lang}_{date}.{ext}.
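To show how the filename placeholders expand, here is a small sketch using Python's `str.format`. The function name is hypothetical, and the date and time formats (`YYYY-MM-DD`, `HH-MM-SS`) are assumptions, since the guide does not state which formats the tool emits.

```python
from datetime import datetime
from pathlib import Path


def export_filename(template, source_path, lang, now=None):
    """Expand {name}, {lang}, {ext}, {date}, {time} in an export filename template."""
    now = now or datetime.now()
    path = Path(source_path)
    values = {
        "name": path.stem,                  # source filename without extension
        "lang": lang,                       # target language code
        "ext": path.suffix.lstrip("."),     # extension without the dot
        "date": now.strftime("%Y-%m-%d"),   # assumed date format
        "time": now.strftime("%H-%M-%S"),   # assumed time format
    }
    return template.format(**values)
```

With the example template `{name}_{lang}_{date}.{ext}`, a file `movie.srt` translated to `de` on 2 January 2024 would come out as `movie_de_2024-01-02.srt` under these assumed formats.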
API Parameter Tuning
Chunk Size
Non-LLM APIs (Google / Azure) split long text into chunks before sending. Chunk size is the per-chunk character cap. Common limits:
⚠️ Google Translate Web does not preserve line breaks, so chunking is disabled there.
Delay (ms)
The cooldown between chunked requests. Increase on poor networks or free APIs. For example, Azure Translate Free Tier works best at 5000 ms or higher.
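Chunk size and delay work together: text is grouped into chunks under the character cap, and the client pauses between chunk requests. The sketch below splits on line boundaries, which is an assumption about how chunks are formed; the function names and the `translate` callback are hypothetical.

```python
import time


def chunk_lines(text, max_chars):
    """Group whole lines into chunks of at most max_chars characters.

    A single line longer than max_chars becomes its own oversized chunk.
    """
    chunks, current, size = [], [], 0
    for line in text.split("\n"):
        added = len(line) + (1 if current else 0)  # +1 for the joining newline
        if current and size + added > max_chars:
            chunks.append("\n".join(current))
            current, size = [line], len(line)
        else:
            current.append(line)
            size += added
    if current:
        chunks.append("\n".join(current))
    return chunks


def translate_chunks(text, translate, max_chars, delay_ms, sleep=time.sleep):
    """Translate chunk by chunk, pausing delay_ms between consecutive requests."""
    out = []
    for i, chunk in enumerate(chunk_lines(text, max_chars)):
        if i > 0:
            sleep(delay_ms / 1000)  # cooldown between chunked requests
        out.append(translate(chunk))
    return "\n".join(out)
```

A `delay_ms` of 5000 corresponds to the Azure Translate Free Tier recommendation above.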
Concurrent Lines
The max number of lines translated in parallel. The default is already tuned per service: free APIs default to high concurrency, paid APIs to safer values. Raise it if you want more speed, lower it when you hit 429 rate-limits, and otherwise leave it alone.
Defaults per service (reference)
- GTX (Free): 100
- Commercial MT (Google / Azure / DeepL): 20-100
- Cloud LLMs (Claude / Gemini / OpenAI / Qwen, etc.): 20
- Custom local LLM / TranslateGemma / DeepLX: 10
Context Batch Size
- Default: 3 (1 for some providers)
- What it does: in context-aware mode, how many "target lines" go into a single request. Each request also carries the surrounding context.
- Trade-off: larger values give higher throughput but force the model to emit more lines per response, raising the chance of formatting drift. Smaller values are steadier but use more requests. Stick with 3 for documents/subtitles, push to 5 for plain text.
Context Lines
How many surrounding lines accompany each batch sent to the model. More lines give more coherent output but heavier requests. The default is tuned and normally needs no change. Lower it if you hit "context length exceeded"; raise it for better dialogue flow.
Defaults per service (reference)
- Cloud LLMs (Claude / Gemini / Nvidia / Azure OpenAI): 100
- Custom local LLM: 30 (models under 14B tend to drop lines in long batches, hence the smaller default)
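The interplay between Context Batch Size and Context Lines can be sketched as a windowing step: each request carries a few target lines plus surrounding lines for context. How the tool divides the context between preceding and following lines is not stated in this guide, so the even split below is an assumption, as is the function name.

```python
def build_batches(lines, batch_size=3, context_lines=30):
    """Return (before, targets, after) tuples; every line is a target exactly once."""
    half = context_lines // 2  # assumed: context split evenly before and after
    batches = []
    for start in range(0, len(lines), batch_size):
        end = min(start + batch_size, len(lines))
        before = lines[max(0, start - half):start]
        after = lines[end:end + half]
        batches.append((before, lines[start:end], after))
    return batches
```

A larger `batch_size` means fewer requests but more lines the model must emit per response; a larger `context_lines` makes each request heavier without changing the number of requests.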

