What text operations does this tool support?

Regex extract/remove, keyword filtering, line-level dedup, sort (asc/desc), reverse, batch prefix/suffix, JSON beautify, smart paragraph split, adjacent-line swap, plus a few preset combos targeted at web/social-media data. All on one page, stackable.

What if I don't know how to write regex?

The Regex Engine card ships with 5 common presets — URL (strict), URL (loose), Remove Index, Extract JSON Key, GPT cite markers. One click fills in the regex and the matching flags. For more complex patterns, describe what you need to ChatGPT/Claude and paste the generated regex back to iterate.

What does the "Smart Trim" toggle actually do?

When enabled (default), most action buttons (match, sort, dedup, prefix/suffix, …) first trim per-line whitespace and skip empty lines. When off, the original line structure is preserved.

What's the difference between "Remove Matches" and "Run Match"?

"Run Match" extracts every hit and lists them out. "Remove Matches" does the opposite — it deletes hits from the source and keeps everything else, also collapsing 3+ consecutive newlines down to 2.

Can I export the processed text?

Yes — one-click copy, save as text-processed.txt, or click "Result → Source" to feed the current result back into the input for further processing. Runs entirely in your browser — million-character files never leave your machine.

Text Toolbox

Text Toolbox is an integrated browser text utility that consolidates the most common cleanup, extraction, and formatting operations into a single page. Messy text from a webpage, lists of fields, or content that needs bulk reordering — all handled in a few clicks.


What problems does it solve?

Extract specific content: pull URLs, JSON keys, pattern-matching fields from a large body of text
Clean up noise: strip ads, GPT citation markers, HTML tags, blank lines
Tidy terminal-copied text: strip the quote bars and hard line wraps from Claude Code / terminal output and restore paste-ready paragraphs
Batch formatting: add prefixes/suffixes to each line — Markdown lists, CSV, SQL IN clauses
Sort & organize: asc/desc sort, reverse, dedup (with an exclusion list)
Complex pipelines: filter → regex → affix multi-step cleanups

Page layout

Top to bottom, three cards:

Source Text — input area + file upload + the "Smart Trim" toggle (bottom right)
Regex Engine — regex input + 5 presets + 3 flags + two action buttons
Line Tools — line-level operations grouped by purpose

After an operation runs, a Result card appears at the bottom with copy / export / format / move-to-source actions.

Source Text card

Paste: directly into the textarea
Upload: drag-and-drop or click; TXT, MD, JSON, CSV, and other plain-text formats are supported
Smart Trim toggle (default on): when on, almost every processing button first trims per-line whitespace and drops empty lines; when off, the original line structure is preserved
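
As a rough illustration of what the toggle changes, here is a minimal Python sketch. It is not the tool's own code; the function name `smart_trim` and the sample text are made up for this example.

```python
def smart_trim(text: str, enabled: bool = True) -> list[str]:
    """Split text into lines; when enabled, strip each line and drop empty ones."""
    lines = text.splitlines()
    if not enabled:
        return lines  # toggle off: the original line structure is preserved
    return [line.strip() for line in lines if line.strip()]


sample = "  apple  \n\n\tbanana\n   \ncherry"

trimmed = smart_trim(sample)                    # whitespace stripped, blanks dropped
untouched = smart_trim(sample, enabled=False)   # every original line kept as-is
```

With the toggle on, the five input lines shrink to three clean ones; with it off, the blank and whitespace-only lines stay in place.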

Regex Engine card

Regex input: any JavaScript regex pattern
5 common presets (CheckableTags): one click fills in the regex and sets the appropriate flags
- URL (strict): matches plain https:// URLs without trailing punctuation
- URL (loose): catches URLs containing brackets, semicolons, more punctuation cases
- Remove Index: strips leading "1. ", "2、", "3) " line numbers
- Extract JSON Key: pulls every key name out of a JSON blob (multiline)
- GPT cite markers: cleans [1], (cite...) residue from GPT/Claude output
3 flags: global (g) / multiline (m) / case-insensitive (i)
Two buttons:
- Run Match: extract every match and list them; toast shows the count
- Remove Matches: delete matches from the source; also collapses 3+ newlines to 2
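
The two buttons can be pictured with the following Python sketch. It assumes nothing about the tool's internals: `run_match`, `remove_matches`, and both patterns are hypothetical stand-ins, and Python's `re` differs from JavaScript regex in some details.

```python
import re


def run_match(source: str, pattern: str, flags: int = 0) -> list[str]:
    """Collect every match as its own item (the 'Run Match' idea)."""
    return [m.group(0) for m in re.finditer(pattern, source, flags)]


def remove_matches(source: str, pattern: str, flags: int = 0) -> str:
    """Delete every match, then collapse 3+ consecutive newlines to 2."""
    stripped = re.sub(pattern, "", source, flags=flags)
    return re.sub(r"\n{3,}", "\n\n", stripped)


# Hypothetical patterns, not the presets shipped with the tool.
URL = r"https?://[^\s)\]]+"
REMOVE_INDEX = r"^\d+[.、)][ \t]*"

text = "See https://a.example/x\n\nhttps://b.example/y\n\nDone."

hits = run_match(text, URL)
cleaned = remove_matches(text, URL)
# The multiline flag lets ^ match at the start of every line.
unnumbered = remove_matches("1. a\n2、b\n3) c", REMOVE_INDEX, re.MULTILINE)
```

Removing the second URL leaves four newlines in a row, which is why the newline-collapsing step matters.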

Line Tools card (grouped by purpose)

Organize

Sort Ascending / Descending: alphabetical (Unicode order); click toggles direction
Reverse: invert line order
Dedup: drop fully duplicate lines (use with the "Exclude" textarea below to keep certain lines even when duplicated)
Format: drop blank lines + smart trim (or not, depending on the toggle)
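
Unicode-order sorting can surprise: digits sort before uppercase letters, uppercase before lowercase, and "10" lands before "9". The Python sketch below shows the effect with made-up sample lines. Python compares by code point, which should agree with the tool's ordering for ordinary text, though that is an assumption.

```python
lines = ["banana", "Apple", "10", "9", "apple"]

ascending = sorted(lines)                 # code-point order, not numeric order
descending = sorted(lines, reverse=True)  # the same comparison, flipped
reversed_lines = lines[::-1]              # Reverse: only inverts the line order
```

To sort numbers by value, pad them to equal width first (for example 09, 10).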

Filter

Type comma-separated keywords, e.g. ad,promo,channel
Click "Filter Lines": delete every line containing any of the keywords; result lands in the Result card
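
A minimal Python sketch of keyword filtering follows. The helper `filter_lines` is hypothetical, and case-sensitive substring matching is an assumption, since the page does not say how keywords are compared.

```python
def filter_lines(text: str, keywords_csv: str) -> str:
    """Drop every line that contains any of the comma-separated keywords."""
    # Assumption: keywords are trimmed and matched as case-sensitive substrings.
    keywords = [k.strip() for k in keywords_csv.split(",") if k.strip()]
    kept = [
        line
        for line in text.splitlines()
        if not any(k in line for k in keywords)
    ]
    return "\n".join(kept)


sample = "buy now ad\nreal content\npromo code\nchannel 5 news\nkeep me"
result = filter_lines(sample, "ad, promo,channel")
```

Under substring matching, a short keyword such as ad would also remove a line containing "download", so prefer longer, more specific keywords.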

Prefix / Suffix

Prefix input: prepended to every line (empty by default)
Suffix input: appended to every line (defaults to ,100, edit freely)
Example: prefix "- " with an empty suffix → converts plain lines into a Markdown list
Example: prefix ' and suffix ', → turns a list of strings into the body of a SQL IN (...) clause (drop the trailing comma after the last item)
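
Both examples can be reproduced with a few lines of Python. This is a sketch under assumptions: `affix` is a hypothetical helper, and the quoting does not escape apostrophes, so it is unsuitable for untrusted values.

```python
def affix(lines: list[str], prefix: str = "", suffix: str = "") -> list[str]:
    """Prepend prefix and append suffix to every line."""
    return [f"{prefix}{line}{suffix}" for line in lines]


names = ["alice", "bob", "carol"]

# Markdown list: prefix "- ", empty suffix.
markdown = "\n".join(affix(names, prefix="- "))

# SQL IN body: prefix ' and suffix ', then drop the final trailing comma.
quoted = "".join(affix(names, prefix="'", suffix="',"))
in_clause = f"IN ({quoted.rstrip(',')})"
```

Because the suffix is applied to every line, the last line also ends with a comma, which has to be removed before the clause is valid SQL.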

Convert

Smart Split: uses the compromise NLP library for English sentence boundaries, plus Chinese paragraph rules
CLI Text Cleanup: built for text copied out of Claude Code / a terminal — strips leading quote bars (▎ ▌ │) and indentation, reflows lines hard-wrapped at the terminal width back into full paragraphs (CJK lines join directly, Latin lines get a single space), and keeps list items and code blocks line-by-line. The result is auto-copied to the clipboard, ready to paste
JSON Beautify: lenient parse (handles unquoted keys, single quotes, comments) + 2-space indent
Common Link Replace: replace every https://huggingface.co with https://modelscope.cn/models — useful for switching HuggingFace links to a China-accessible mirror
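
To make the CLI Text Cleanup reflow rule concrete, here is a simplified Python sketch. It is not the tool's algorithm. The list-marker pattern, the CJK ranges, and the rule "join directly only when both sides of the seam are CJK" are assumptions, and code-block handling is left out.

```python
import re

BAR = re.compile(r"^[\s▎▌│]+")                 # leading quote bars and indentation
LIST_ITEM = re.compile(r"^([-*+]|\d+[.)])\s")  # assumed list-marker shapes
CJK = re.compile(r"[\u3000-\u303f\u4e00-\u9fff\uff00-\uffef]")


def reflow(text: str) -> str:
    """Strip quote bars, then rejoin hard-wrapped lines into paragraphs."""
    out: list[str] = []
    for raw in text.splitlines():
        line = BAR.sub("", raw).rstrip()
        if not line:                 # a blank line ends the current paragraph
            out.append("")
            continue
        # Start a new output line at a paragraph start or a list item.
        if not out or out[-1] == "" or LIST_ITEM.match(line):
            out.append(line)
            continue
        prev = out[-1]
        # CJK on both sides of the seam joins directly; otherwise one space.
        glue = "" if CJK.match(prev[-1]) and CJK.match(line[0]) else " "
        out[-1] = prev + glue + line
    return "\n".join(out)


sample = (
    "▎ The quick brown fox\n"
    "▎ jumps over the dog.\n"
    "\n"
    "  这是一段被\n"
    "  换行的文字。\n"
    "- item one\n"
    "- item two"
)
cleaned = reflow(sample)
```

The Latin lines are rejoined with a single space, the Chinese lines with none, and each list item stays on its own line.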

Advanced (same row of buttons)

Regex Extract + Affix: extract via regex then batch-apply prefix/suffix in one step
Batch Task Extract: a URL → number pairing pipeline — finds each line's URL, then scans for numbers attached to keywords like 点赞 / 转发 / 评论 / 播放 / 差 / 曝光 / 阅读 (Chinese social-media metrics) and outputs CSV grouped by metric
Adjacent Swap: swap each pair of adjacent lines (1↔2, 3↔4, …); the input must have an even line count
Custom Operation: extract all URLs via the "URL (loose)" preset + reverse order + join with commas (handy for reverse-lookup URL lists)
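
Adjacent Swap and Custom Operation are small enough to sketch in Python. The function names and the `LOOSE_URL` pattern below are hypothetical; the real "URL (loose)" preset may match differently.

```python
import re


def adjacent_swap(lines: list[str]) -> list[str]:
    """Swap lines pairwise: (1, 2) -> (2, 1), (3, 4) -> (4, 3), and so on."""
    if len(lines) % 2:
        raise ValueError("adjacent swap needs an even number of lines")
    out: list[str] = []
    for i in range(0, len(lines), 2):
        out += [lines[i + 1], lines[i]]
    return out


# Hypothetical stand-in for the "URL (loose)" preset.
LOOSE_URL = re.compile(r"https?://\S+")


def custom_operation(text: str) -> str:
    """Extract all URLs, reverse their order, and join them with commas."""
    urls = LOOSE_URL.findall(text)
    return ",".join(reversed(urls))


swapped = adjacent_swap(["name", "1", "title", "2"])
joined = custom_operation("x https://a.example y\nhttps://b.example/p?q=1")
```

The even-count check mirrors the requirement stated above: with an odd number of lines, the last line would have no partner to swap with.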

Exclude (paired with Dedup)

The multi-line textarea at the bottom of the Line Tools card
Used by Dedup: lines listed here are kept even when duplicated
Example: list Home and About on separate lines to preserve those headings across repeated occurrences
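
The interplay between Dedup and the Exclude list can be sketched in Python as below. The helper `dedup` is hypothetical, and two behaviours are assumed rather than documented: the first occurrence of a duplicate is the one kept, and the original order is preserved.

```python
from collections.abc import Iterable


def dedup(lines: list[str], exclude: Iterable[str] = ()) -> list[str]:
    """Drop repeated lines, except lines that appear in the exclude list."""
    keep_always = set(exclude)
    seen: set[str] = set()
    out: list[str] = []
    for line in lines:
        # Excluded lines always survive; other lines survive only once.
        if line in keep_always or line not in seen:
            out.append(line)
        seen.add(line)
    return out


page = ["Home", "intro", "Home", "intro", "About", "About"]
result = dedup(page, exclude=["Home", "About"])
plain = dedup(page)
```

Here the repeated "intro" is dropped while every "Home" and "About" heading is kept.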

Result card

When something has been produced, the Result card surfaces:

Copy: one-click clipboard copy
Export: download as text-processed.txt
Format: clean up extra blank lines in the result
Result → Source: pipe the result back into the input for the next step (great for multi-stage pipelines)

Tips

Getting started

Try the presets first — Regex Engine card → click a tag
Test on a small slice before processing critical data
Stacked operations: filter → extract → affix solves most cleanups in three clicks
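
For readers who think in code, the three-step stack might look like the Python sketch below. It only illustrates the order of operations; the keyword "promo", the URL pattern, and the function name `pipeline` are made up for the example.

```python
import re


def pipeline(text: str) -> str:
    """filter -> extract -> affix, applied to trimmed, non-empty lines."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # 1. Filter: drop lines containing a blocked keyword.
    lines = [ln for ln in lines if "promo" not in ln]
    # 2. Extract: keep only the URLs found on the remaining lines.
    urls = [m.group(0) for ln in lines for m in re.finditer(r"https?://\S+", ln)]
    # 3. Affix: turn each URL into a Markdown list item.
    return "\n".join(f"- {u}" for u in urls)


raw = (
    "docs https://a.example/docs\n"
    "promo https://ads.example\n"
    "\n"
    "blog https://b.example/blog"
)
result = pipeline(raw)
```

In the tool, each numbered step corresponds to one click, with "Result → Source" feeding one step's output into the next.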

Power moves

Keep Smart Trim on for almost every workflow
Chain multi-step pipelines via "Result → Source"
When regex stumps you, describe the problem + sample input + expected output to ChatGPT/Claude

Runs entirely in your browser — no data is uploaded — safe for sensitive material.
