Translation API Guide
The tool integrates 7 classic translation APIs and 24 large language models, so you can pick whichever fits your text type, budget, and privacy needs.
Which API should I pick?
For detailed comparisons and how to get API keys, keep reading ↓
Classic Translation APIs
Notes:
- DeepL can't be called directly from the browser — the tool routes through a built-in proxy by default. If you have your own proxy, fill it in the API URL field.
- Qwen-MT is Alibaba Cloud's translation-specialized model. See Qwen-MT essentials below.
- TranslateGemma is Google's open-source translation-specialized Gemma model. You'll need to run it locally with LM Studio / Ollama / llama.cpp — see Local Model Setup.
- GTX API / Edge API are both zero-config free machine translation that back each other up — if one can't connect, switch to the other. See Free machine translation essentials below.
For more reliable service, apply for a commercial API key — see the API application guide.
Large Language Models (LLMs)
Supported: DeepSeek, OpenAI, Claude, Gemini, Qwen, Moonshot (Kimi), Zhipu GLM, Doubao, Xiaomi MiMo, MiniMax, Tencent Hunyuan, Baidu ERNIE, Cohere, xAI (Grok), Mistral, Perplexity, YandexGPT (AI Studio), OpenRouter, Groq, SiliconFlow, GitHub Models, Nvidia NIM, Azure OpenAI, LiteLLM (self-hosted gateway), plus any OpenAI-compatible endpoint.
LLMs work best for:
- Literature and technical documentation that needs deeper understanding
- Multilingual content where consistent terminology matters
- Custom prompts to control translation style
Key parameters:
- Model: enter the model name from your provider; for Azure OpenAI, enter the deployment name.
- Temperature: defaults to 0.7. Try 0.2 for technical content, 0.9 for marketing or creative paraphrasing.
- Thinking mode: lets the AI think before translating — higher quality, slower and pricier. Supported models show a toggle in the UI. The toggle is stored per model so switching models preserves each one's setting independently. The exact UI form depends on the provider:
- Three levels plus off (off / low / medium / high): Claude, Gemini, OpenAI GPT-5, Qwen3, Azure OpenAI, Nvidia NIM, OpenRouter, Groq (GPT-OSS), Perplexity Sonar Deep Research
- Binary toggle (off / on): DeepSeek (ON always uses the top tier — the API's middle tiers add nothing for translation), Doubao, Zhipu GLM, Moonshot (Kimi), Xiaomi MiMo, SiliconFlow, ERNIE 5.0 Thinking, Mistral (Medium 3.5 / Small 4), Cohere Command A Reasoning (underlying APIs accept on/off but not an effort param)
- Low / high only: xAI Grok 4.3 (API limitation)
- Three-state (off / on / auto): shown when you enter an unlisted custom model on a thinking-capable provider (incl. Mistral, Perplexity, and the custom OpenAI-compatible endpoint). Auto omits the thinking param so the model follows its built-in default; it is a fallback for strict providers that return HTTP 422 when a thinking param is sent to a non-thinking SKU. Defaults to Off
- Always-on SKUs (thinking is intrinsic to the model; no toggle shown): MiniMax M2.x, Tencent Hunyuan (TurboS / 2.0 Thinking / T1), Mistral Magistral, Perplexity Sonar Reasoning Pro, Grok 4.20 Reasoning / Multi-Agent
- No thinking toggle: GitHub Models, YandexGPT (the gateway/API doesn't support reasoning params), LiteLLM (the gateway injects no thinking params — upstream model defaults apply)
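The three-state behaviour for custom models can be sketched as follows. This is an illustration only: the thinking field name and its shape are placeholders (each provider uses its own parameter), and withThinking is a hypothetical helper. The only rule taken from the text above is that Auto sends no thinking param at all.

```typescript
type ThinkingMode = "off" | "on" | "auto";

// Hypothetical helper: returns a copy of the request body with the
// thinking setting applied. "thinking" is a placeholder field name.
function withThinking(
  body: Record<string, unknown>,
  mode: ThinkingMode
): Record<string, unknown> {
  if (mode === "auto") {
    // Auto: omit the param entirely so the model's built-in default applies.
    return { ...body };
  }
  // Off / On: send an explicit value.
  return { ...body, thinking: { enabled: mode === "on" } };
}
```

Omitting the field rather than sending a neutral value is what keeps strict providers from rejecting a request aimed at a non-thinking SKU.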
Provider-Specific Notes
- GitHub Models: authenticates with a GitHub PAT (needs the models:read scope); free quota is tiered per model, which makes it a good entry point if you don't have a paid key.
- Tencent Hunyuan: the official endpoint currently rejects browser CORS preflights (direct calls always fail), so API Relay is ON by default to make it work out of the box. If Tencent fixes this, you can switch back to direct in API Settings. See API Relay.
- YandexGPT (AI Studio): needs a Folder ID in addition to the API key (grab it from the folders page in the Yandex AI Studio console). For the model field, enter a SKU name (e.g. yandexgpt-5.1; open-weight SKUs like Qwen3, DeepSeek, and GPT-OSS are also offered) or paste a full gpt://<folder_id>/<model>/latest URI. Yandex's API sends no CORS headers, so API Relay is ON by default (the switch stays user-controllable, and you can point the URL at your own relay instead).
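The two accepted forms of the YandexGPT model field (a bare SKU name or a full gpt:// URI) could be reconciled roughly like this. The function name yandexModelUri is hypothetical and the tool's real handling may differ; the URI shape is the one given above.

```typescript
// Hypothetical helper: turns the Model field into a full model URI.
function yandexModelUri(folderId: string, model: string): string {
  const value = model.trim();
  // A pasted full URI is used as-is.
  if (value.startsWith("gpt://")) {
    return value;
  }
  // A bare SKU name is expanded with the Folder ID.
  return `gpt://${folderId.trim()}/${value}/latest`;
}
```

For example, with a placeholder Folder ID of my-folder, yandexModelUri("my-folder", "yandexgpt-5.1") yields gpt://my-folder/yandexgpt-5.1/latest.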
Regional Endpoint Switcher
Many providers run separate endpoints for Mainland China, International, and US regions. The official endpoints appear as quick-pick chips above the URL field — click to switch:
URL Auto-Completion
Self-hosted and multi-region providers (Custom, LiteLLM, TranslateGemma, Qwen / Qwen-MT, Doubao, Nvidia) complete the URL to the full path the moment focus leaves the field; for every other provider, a custom URL is normalized automatically before each request is sent. Paste http://host:port or http://host:port/v1 and the tool fills in the rest — the classic "missing /v1/chat/completions → connection failure" mistake can't happen.
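As a rough illustration of the completion rule, the sketch below expands a base address to the full chat path. It is not the tool's actual code: completeChatUrl is a hypothetical name, and it assumes the target is always an OpenAI-style /v1/chat/completions path.

```typescript
const CHAT_PATH = "/chat/completions";

// Hypothetical helper: completes http://host:port or http://host:port/v1
// to the full chat-completions URL.
function completeChatUrl(input: string): string {
  // Drop surrounding whitespace and trailing slashes so suffix checks work.
  const url = input.trim().replace(/\/+$/, "");
  if (url === "") return url;
  // Already a full path: leave it alone.
  if (url.endsWith(CHAT_PATH)) return url;
  // Ends in a version segment such as /v1: append only the chat path.
  if (/\/v\d+$/.test(url)) return url + CHAT_PATH;
  // Bare host and port: append the version segment and the chat path.
  return url + "/v1" + CHAT_PATH;
}
```

Both http://host:port and http://host:port/v1 come out as http://host:port/v1/chat/completions, and an already complete URL is returned unchanged.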
Free machine translation essentials
The tool ships three zero-config free machine-translation services — GTX (Free), Edge (Free), and DeepLX (Free). None need an API key; they call the official endpoints directly from your browser and your text never touches this tool's servers. They take different routes, so they back each other up: if one can't connect, just switch to another in the dropdown — no key required. GTX is the default.
GTX gateway is switchable
GTX defaults to translate-pa.googleapis.com (the gateway behind Google's web-translate widget — CORS-correct, good availability). Quick-switch chips sit above the URL field:
- translate-pa (default): recommended, works in most network environments
- Legacy gtx: the old translate.googleapis.com/translate_a endpoint. Google has tightened anti-abuse on it (many IPs get redirected to a captcha page, which the browser reports as CORS), but the block is IP-reputation-based and some regions/networks still pass, so it is kept as a fallback
- Self-hosted mirror: paste your own mirror URL (e.g. a Cloudflare Worker); the tool auto-detects the protocol from the address shape
Rate limits and automatic slowdown
The shared free endpoints are rate-limited per user. GTX now translates in batched chunks — many lines packed into ~5000-character blocks, one request per block — so request volume drops sharply versus line-by-line, and everyday use rarely triggers throttling. Huge bursts can still hit limits; the tool handles it automatically:
- When throttled it pauses all requests to the service and resumes on its own shortly after, showing "Rate limited — pausing briefly, will retry automatically"
- Translation slows down but keeps going; in most cases no action is needed
- If the failure panel keeps showing rate-limit messages: wait a few minutes and hit "Retry" (the cache skips completed lines), switch to the Edge (Free) / DeepLX (Free) backups, or move long batch jobs to a keyed service like DeepL / Qwen-MT / DeepSeek
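The batching described above (many lines packed into blocks of about 5000 characters, one request per block) can be sketched as a greedy packer. The function name packLines and the newline-joined size accounting are assumptions for illustration, not the tool's actual implementation.

```typescript
// Hypothetical helper: greedily packs lines into blocks whose joined
// length (lines plus one newline between them) stays within maxChars.
function packLines(lines: string[], maxChars = 5000): string[][] {
  const blocks: string[][] = [];
  let current: string[] = [];
  let size = 0;
  for (const line of lines) {
    // A line after the first also costs one joining newline.
    let cost = current.length === 0 ? line.length : line.length + 1;
    if (current.length > 0 && size + cost > maxChars) {
      // The line does not fit: close this block and start a new one.
      blocks.push(current);
      current = [];
      size = 0;
      cost = line.length;
    }
    // A single oversized line still gets a block of its own.
    current.push(line);
    size += cost;
  }
  if (current.length > 0) blocks.push(current);
  return blocks;
}
```

Each returned block corresponds to one request, so a file of many short lines turns into a handful of requests instead of one per line.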
Can't connect, or seeing CORS errors in the console? First switch the gateway or try Edge (Free) — that fixes most cases in one step. Still failing? Check your network environment (mainland China blocking, corporate network interception, browser extensions) — see FAQ → GTX Free cannot connect for the checklist.
Qwen-MT Essentials
Qwen-MT is a machine translation service (not a general LLM). It has no system-prompt concept and works purely with source/target language codes — so the Prompt settings don't apply.
Picking a Model
You'll need to fill in the Model field manually:
Domain Hint
The domains field tells the model what industry the text is from, so terminology lands closer to the field. Important: write a short English description, not a keyword list. Alibaba's official example:
Leave empty if you don't need it.
Native Glossary Channel
Qwen-MT is one of the few MT services with native glossary support: with the Glossary enabled, matched terms are sent through the official translation_options.terms parameter and applied by the model itself — more reliable than prompt injection.
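A minimal sketch of how glossary matches and the domain hint might be assembled into a request body. It assumes the OpenAI-compatible chat format with a top-level translation_options object; buildQwenMtBody and the substring-based matching rule are illustrative assumptions, so check the field shapes against Alibaba Cloud's current documentation.

```typescript
interface GlossaryTerm {
  source: string;
  target: string;
}

interface QwenMtOptions {
  source_lang: string;
  target_lang: string;
  terms?: GlossaryTerm[];
  domains?: string;
}

interface QwenMtBody {
  model: string;
  messages: { role: string; content: string }[];
  translation_options: QwenMtOptions;
}

// Hypothetical helper: builds a Qwen-MT request body.
function buildQwenMtBody(
  model: string,
  text: string,
  sourceLang: string,
  targetLang: string,
  glossary: GlossaryTerm[],
  domains = ""
): QwenMtBody {
  const options: QwenMtOptions = { source_lang: sourceLang, target_lang: targetLang };
  // Send only glossary entries whose source term occurs in this text.
  const matched = glossary.filter((term) => text.includes(term.source));
  if (matched.length > 0) options.terms = matched;
  // The domain hint is optional; leave it out when empty.
  if (domains.trim() !== "") options.domains = domains.trim();
  // No system prompt: the text to translate is the only message.
  return { model, messages: [{ role: "user", content: text }], translation_options: options };
}
```

Because the terms travel in translation_options rather than in a prompt, the model applies them itself instead of having to follow an instruction.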
Unsupported Languages
Qwen-MT covers ~92 languages; a number of low-resource ones aren't covered and the UI auto-blocks them with a clear message (the in-app blocklist is authoritative): e.g. Kyrgyz (ky), Turkmen (tk), Tajik (tg), Mongolian (mn), Malayalam (ml), Uyghur (ug), Amharic (am), and dozens more.
API Relay & Built-in Proxy
Some providers' official endpoints block direct browser calls (CORS). The tool offers two proxy channels; text is never stored on our servers.
API Relay (user-controlled)
12 providers — DeepSeek, OpenAI, Claude, Qwen, Moonshot (Kimi), Doubao, Zhipu GLM, Mistral, xAI (Grok), Perplexity, Tencent Hunyuan, and YandexGPT — expose an "API Relay" switch in API Settings. When on, requests route through our Cloudflare relay (only the request body and auth headers are forwarded):
- Most providers default to OFF — when a direct call hits a CORS / 403 wall, the UI shows an actionable hint to enable it
- Hunyuan and YandexGPT default to ON: both official endpoints currently can't be reached from browsers, so the relay is what makes them work out of the box. The switch stays available — flip back to direct if the upstream ever fixes CORS
- Self-hosted relay: prefer your own relay? Put its address in the URL field. The precedence is fixed: custom URL > relay switch > official direct — with a URL filled in, the relay switch is grayed out with a note ("clear the URL to re-enable")
- The relay passes through the server's Retry-After header, so rate-limit auto-slowdown is exactly as precise as direct calls
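To show why a passed-through Retry-After header matters, here is a sketch of turning its value into a pause length. Per the HTTP specification the header carries either a number of seconds or an HTTP date; the function name retryAfterMs and the 5-second fallback are assumptions for illustration, not the tool's actual values.

```typescript
// Hypothetical helper: converts a Retry-After header value into a delay
// in milliseconds, falling back when the header is absent or unparsable.
function retryAfterMs(header: string | null, nowMs: number, fallbackMs = 5000): number {
  if (header === null) return fallbackMs;
  const value = header.trim();
  // Form 1: delay in whole seconds, e.g. "30".
  if (/^\d+$/.test(value)) return Number(value) * 1000;
  // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(value);
  if (Number.isNaN(dateMs)) return fallbackMs;
  // A date in the past means no wait is needed.
  return Math.max(0, dateMs - nowMs);
}
```

In a browser the value would typically come from response.headers.get("Retry-After"), with Date.now() passed as nowMs.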
Built-in Proxy (no switch)
DeepL and Nvidia NIM route through a separate built-in proxy by default. If you specify a custom API URL in settings, the proxy is bypassed and requests go directly to your URL.
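The routing rules in this section reduce to one fixed precedence: a custom URL wins, then the relay switch, then the official endpoint. A minimal sketch, with hypothetical type and function names:

```typescript
type Route = "custom" | "relay" | "direct";

interface RouteSettings {
  customUrl: string;     // contents of the API URL field
  relayEnabled: boolean; // state of the API Relay switch
}

// Hypothetical helper: applies custom URL > relay switch > official direct.
function resolveRoute(settings: RouteSettings): Route {
  // Any non-empty custom URL takes priority and bypasses the relay.
  if (settings.customUrl.trim() !== "") return "custom";
  return settings.relayEnabled ? "relay" : "direct";
}
```

The built-in proxy for DeepL and Nvidia NIM follows the same first rule: once a custom URL is set, requests go straight to it.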
Local Model Setup
Want to run models locally for privacy? The tool works with any OpenAI-compatible local server. For decent translation quality with a generic LLM, use qwen3-14b or larger (32B-class works even better); on limited VRAM, switch to the translation-specialized TranslateGemma — solid quality from 4B up.
In mainland China, download models from ModelScope — far faster than direct Hugging Face access or LM Studio's built-in downloader, and the official TranslateGemma repos are mirrored there.
Default Endpoints
These appear as quick-pick chips next to the URL field.
LiteLLM Self-Hosted Gateway
LiteLLM is a self-hosted proxy that fronts 100+ upstream models behind one OpenAI-compatible API. Pick LiteLLM directly from the service list — versus Custom, it gives you a dedicated config slot, so running LiteLLM as your daily driver doesn't force you to rewrite Custom's URL every time you switch to another self-hosted endpoint.
- Default address http://127.0.0.1:4000/v1/chat/completions (litellm's default port); the URL is the credential
- API Key optional: not needed for a bare local proxy; fill it in if your proxy has a master / virtual key configured
- Model can stay empty: when started with litellm --model X (the official quick-start) or with a completion_model server default, an empty model field follows the server's default; for multi-model config.yaml deployments, enter the model alias
- LiteLLM allows browser cross-origin requests by default, so no extra CORS setup is needed
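The optional fields above might translate into a request like the following sketch. It assumes an empty Model field is sent by omitting model from the body and an empty API key by omitting the Authorization header; the tool may encode "empty" differently, and buildLiteLlmRequest is a hypothetical name.

```typescript
interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: { model?: string; messages: { role: string; content: string }[] };
}

// Hypothetical helper: builds a chat request for a local LiteLLM proxy.
function buildLiteLlmRequest(
  text: string,
  model = "",
  apiKey = "",
  url = "http://127.0.0.1:4000/v1/chat/completions"
): ChatRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Only proxies with a master / virtual key need the bearer header.
  if (apiKey.trim() !== "") headers["Authorization"] = `Bearer ${apiKey.trim()}`;
  const body: ChatRequest["body"] = { messages: [{ role: "user", content: text }] };
  // Leave the model out so the server's default applies.
  if (model.trim() !== "") body.model = model.trim();
  return { url, headers, body };
}
```

For a multi-model config.yaml deployment, pass the model alias as the second argument.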
TranslateGemma
Google's translation-specialized Gemma model, trained specifically for translation quality. Quick notes:
- Pick "TranslateGemma" directly from the service list; don't go through "Custom (OpenAI-compatible)" with translategemma-4b-it as the model name. The two take entirely different code paths: the dedicated TranslateGemma service makes line-by-line calls tailored to the Gemma translation model's I/O format, while Custom uses the generic LLM pipeline with batching and context markers, which causes dropped lines and slower runs on small (under 14B) models.
- The default URL points to LM Studio on port 1234; one click switches to Ollama / llama.cpp
- API key is optional: leave it empty for a plain local server; if your deployment requires auth (LM Studio's "require API key", vLLM's --api-key, or a reverse proxy in front), fill it in and requests carry an Authorization: Bearer header
- Recommended models: translategemma-4b-it (compact and fast), translategemma-12b-it (better quality), or translategemma-27b-it (best quality)
- Prompt settings don't apply: like Qwen-MT, it's a machine-translation service, so the prompt is built into the call format; system/user prompts only affect LLM providers
- Source language must be explicit — auto-detect isn't supported
- Limited language coverage: only ~55 mainstream languages (Google's WMT24++ benchmark scope). About 68 low-resource languages — including Cantonese (yue), Bhojpuri (bho), Wolof (wo), Aragonese (an), Guarani (gn), Kurdish (ckb/kmr) — are blocked by the UI. Use DeepL / Google / Azure / Qwen-MT for broader coverage
Solving CORS Issues
If a local model can't be reached, the two usual culprits:
Step 1: Disable ad/privacy extensions, then refresh and retry.
Step 2: Enable CORS on the local server.
Ollama
Run this once in PowerShell (Win + X to open Terminal) to enable it permanently:
* allows all origins. For tighter security, use a specific domain like http://192.168.2.20:3000.
Restart the Ollama service for the change to take effect. To enable temporarily, set the variable when starting:
LM Studio
- Open the "Developer" icon in the left menu
- Go to the local server settings page, click "Settings" at the top
- Check the "Enable CORS" box

That's it — local models should work now. If you're still stuck, check for port conflicts and look at the browser console for the actual error. (Special thanks to mrfragger for the configuration tips.)
Language Support
This tool supports translation between 120+ major languages, organized by region.
Language Code Reference
Use the language codes below for batch multi-language configuration (e.g., en, zh, ja, ko):
Common
Europe
Middle East
Central Asia
South Asia
Southeast Asia
Africa
Americas & Oceania
API Support Documentation
LLMs support all languages. Machine translation API language support:

