Anonymize sensitive data, generate synthetic datasets, clean & shape files, merge multiple sources, and compare files — all in your browser
Replace sensitive columns with realistic substitute values
Everything runs in your browser — no data is uploaded or sent anywhere.
Drop your file here or click to browse
Supports CSV, TSV, XLS, XLSX
Red columns were auto-flagged as likely sensitive. Toggle each on or off, and adjust the replacement type if needed.
Match the original, or specify a custom number.
Enter more than the original to generate extra rows using the same column patterns. Enter fewer to include only that many rows.
Original vs. anonymized — first 10 rows.
Download alongside your anonymized dataset.
Combine 2–4 columns to form a unique record identifier.
Build a realistic dataset from scratch
No real data required — everything runs in your browser.
Type your column names separated by commas — the tool will detect the right data type for each one automatically.
Load a pre-built column schema, then tweak as needed.
Edit names, change types, or tweak options. Sample values update live.
Choose the option that fits your situation.
Upload two or more files with the same columns — rows from all files will be combined into one, with duplicates removed.
Drop your files here or click to browse
You can select multiple files at once — CSV, TSV, XLS, XLSX
Compare two files and surface every difference
Budget vs. actuals, this month vs. last, CRM vs. survey — nothing leaves your browser.
Your reference file — the one you consider authoritative.
Drop File A here or click to browse
Supports CSV, TSV, XLS, XLSX
The file you want to check against File A.
Drop File B here or click to browse
Supports CSV, TSV, XLS, XLSX
Pick the column that uniquely identifies each row — an ID, email, or order number.
Select 2–4 shared columns to combine into a single unique identifier.
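As a rough illustration of how several columns can be combined into one identifier, here is a minimal Python sketch. The function name `composite_key`, the `|` separator, and the trim-and-lowercase step are assumptions made for this example, not a description of how the tool builds its key.

```python
def composite_key(row, key_columns, sep="|"):
    """Join the chosen columns into one identifier string.

    Trimming and lowercasing each part is an assumption for this sketch,
    so that "Ops " and "ops" produce the same key.
    """
    return sep.join(str(row.get(col, "")).strip().lower() for col in key_columns)


rows = [
    {"first": "Alice", "last": "Ng", "dept": "Ops"},
    {"first": "Alice", "last": "Ng", "dept": "Finance"},
]

# No single column is unique here, but the three together are.
keys = [composite_key(r, ["first", "last", "dept"]) for r in rows]
print(keys)  # ['alice|ng|ops', 'alice|ng|finance']
```

Pick a separator that cannot occur inside the values themselves. Otherwise two different column combinations could collapse into the same key.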
Choose which columns to check for differences. Columns found in both files are selected by default.
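A key-based comparison of this kind can be sketched in a few lines of Python. The function `diff_by_key` and the sample rows are hypothetical. The sketch assumes each key value appears once per file, and it is not the tool's actual comparison logic.

```python
def diff_by_key(rows_a, rows_b, key, columns):
    """Return keys only in A, keys only in B, and per-column differences."""
    index_a = {row[key]: row for row in rows_a}
    index_b = {row[key]: row for row in rows_b}

    only_in_a = sorted(index_a.keys() - index_b.keys())
    only_in_b = sorted(index_b.keys() - index_a.keys())

    changed = []
    for k in sorted(index_a.keys() & index_b.keys()):
        for col in columns:
            value_a, value_b = index_a[k].get(col), index_b[k].get(col)
            if value_a != value_b:
                changed.append((k, col, value_a, value_b))
    return only_in_a, only_in_b, changed


file_a = [{"id": "1", "amount": "100"}, {"id": "2", "amount": "250"}]
file_b = [{"id": "2", "amount": "275"}, {"id": "3", "amount": "90"}]

only_a, only_b, changed = diff_by_key(file_a, file_b, "id", ["amount"])
print(only_a)   # ['1']
print(only_b)   # ['3']
print(changed)  # [('2', 'amount', '250', '275')]
```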
By default, rows are matched by exact key values. Enable fuzzy matching to also catch near-matches — useful when names are formatted differently across systems.
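One possible way to implement near-matching is with Python's standard `difflib` module, as in the sketch below. The helper name `fuzzy_match` and the 0.85 similarity cutoff are assumptions chosen for illustration. The tool may use a different algorithm and threshold.

```python
import difflib


def fuzzy_match(value, candidates, cutoff=0.85):
    """Return the candidate most similar to value, or None if none is close enough."""
    # Compare on a trimmed, lowercased form but return the original spelling.
    normalised = {c.strip().lower(): c for c in candidates}
    hits = difflib.get_close_matches(
        value.strip().lower(), list(normalised), n=1, cutoff=cutoff
    )
    return normalised[hits[0]] if hits else None


names_in_file_a = ["John Smith", "Jane Doe"]

print(fuzzy_match("Jon Smith", names_in_file_a))  # John Smith
print(fuzzy_match("Zed", names_in_file_a))        # None
```

A lower cutoff catches more variants but raises the risk of pairing two different people, so near-matches are worth reviewing before they are accepted.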
Fix, reshape, or filter a single dataset — no formulas needed
Drop your file here or click to browse
Supports CSV, TSV, XLS, XLSX
Once you upload your file, you will be able to:
You can chain multiple tools in sequence — apply one, then run another on the result.
Pick a tool — after previewing the result you can apply it and run another tool on the same data.
Select which columns define a duplicate row. If two rows match on all selected columns, only the first is kept.
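The keep-the-first rule described above can be sketched as follows. The function name `drop_duplicates` and the sample data are illustrative only.

```python
def drop_duplicates(rows, key_columns):
    """Keep the first row for each distinct combination of the key columns."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row.get(col) for col in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


rows = [
    {"email": "a@example.com", "city": "Leeds"},
    {"email": "a@example.com", "city": "York"},   # same email, so dropped
    {"email": "b@example.com", "city": "Leeds"},
]

unique_rows = drop_duplicates(rows, ["email"])
print(len(unique_rows))  # 2
```

Selecting both `email` and `city` as the key would keep all three rows, because no two rows match on every selected column.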
Your data is scanned automatically. Review detected issues, queue fixes, and apply them all in one pass.
Convert columns that represent time periods, categories, or repeated measurements into individual rows — making the data ready for merging, filtering, or charting.
Before:

| Name | Jan_Hrs | Feb_Hrs | Mar_Hrs |
|---|---|---|---|
| Alice | 120 | 130 | 110 |
| Bob | 90 | 100 | 95 |

After:

| Name | Month | Hours |
|---|---|---|
| Alice | Jan_Hrs | 120 |
| Alice | Feb_Hrs | 130 |
| Alice | Mar_Hrs | 110 |
| Bob | Jan_Hrs | 90 |
| Bob | Feb_Hrs | 100 |
| Bob | Mar_Hrs | 95 |
After unpivoting, you can filter on Month = "Jan_Hrs", or join to a forecast file that also has a Month column.
Merge two or more columns into a new column. The originals are kept unless you choose to remove them.
Check the columns to include, drag to reorder.
Click the row that contains your real column headers. Any rows above it will be removed.
Any columns not listed here keep their names.
Build a new column from two existing columns. The result is appended as the last column.
Roll up rows by grouping on key columns, then sum, average, or count numeric columns. Useful for converting weekly rows to monthly totals.
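For readers who want to see the roll-up spelled out, here is a small Python sketch of grouping and aggregating. The function `roll_up`, the output field names, and the sample weekly rows are assumptions for this example.

```python
from collections import defaultdict


def roll_up(rows, group_by, value_column):
    """Group rows on the key columns and compute sum, count, and average."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        key = tuple(row[col] for col in group_by)
        totals[key] += float(row[value_column])
        counts[key] += 1
    return [
        {
            **dict(zip(group_by, key)),
            "sum": totals[key],
            "count": counts[key],
            "average": totals[key] / counts[key],
        }
        for key in totals
    ]


weekly = [
    {"month": "Jan", "week": "1", "hours": "30"},
    {"month": "Jan", "week": "2", "hours": "34"},
    {"month": "Feb", "week": "5", "hours": "28"},
]

monthly = roll_up(weekly, ["month"], "hours")
print(monthly[0])  # {'month': 'Jan', 'sum': 64.0, 'count': 2, 'average': 32.0}
```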
Keep only rows that match your conditions. Multiple conditions are combined with AND.
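The AND combination can be illustrated with a short Python sketch. The `(column, operator, value)` condition format and the `filter_rows` helper are hypothetical and exist only for this example.

```python
import operator

# Map the operator symbols used in this sketch to comparison functions.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def filter_rows(rows, conditions):
    """Keep rows for which every (column, op, value) condition holds."""
    return [
        row
        for row in rows
        if all(OPS[op](row[col], value) for col, op, value in conditions)
    ]


rows = [
    {"dept": "Ops", "hours": 120},
    {"dept": "Ops", "hours": 80},
    {"dept": "Finance", "hours": 130},
]

matches = filter_rows(rows, [("dept", "==", "Ops"), ("hours", ">", 100)])
print(matches)  # [{'dept': 'Ops', 'hours': 120}]
```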
Each row below represents one duplicate group where conflicting values were resolved. For each conflicting field the chosen value and its source row are shown.
These duplicate groups had no conflicting values — all copies agreed or only differed by blank vs. filled. The identifier values listed here are rows you could safely combine in your source system.
Include or exclude columns, rename them, or change their order using ↑↓. Changes apply only at download time; your working data is unchanged.
Common data quality issues and how to resolve them before merging or reconciling.
"Sarah Chen" and "Sarah Chen " are not equal to a computer. Extra spaces are invisible on screen but break every exact match — the most common cause of silently dropped rows in a merge.
"OPERATIONS", "Operations", and "operations" are three different values in an exact match. Common when data comes from multiple input systems or manual entry.
One system exports 01/15/2024, another exports 2024-01-15. Compared as text, the two never match, and sorting gives the wrong order: 01/15/2024 sorts before 12/31/2023. Numeric comparisons on text dates fail silently.
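A common remedy is to parse every date into one canonical form before comparing. The Python sketch below assumes only two input formats, ISO (`YYYY-MM-DD`) and US month-first (`MM/DD/YYYY`). The format list and the `to_iso` helper are assumptions for illustration, and a day-first source would need a different list.

```python
from datetime import datetime

# Assumed input formats for this sketch: ISO first, then US month/day/year.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def to_iso(text):
    """Return the date as YYYY-MM-DD, or None if no known format fits."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


print(to_iso("01/15/2024"))                          # 2024-01-15
print(to_iso("01/15/2024") == to_iso("2024-01-15"))  # True
print(to_iso("15 Jan 2024"))                         # None
```

ISO strings have the useful property that sorting them as text also sorts them chronologically.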
A column containing "1,200" or "$450.00" looks numeric but is a string. Sums return zero, comparisons fail, and the column sorts alphabetically instead of numerically.
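Converting such strings back to numbers usually means stripping the formatting characters first. This Python sketch handles only the dollar sign and comma thousands separator that appear in the examples above. The `parse_number` helper is hypothetical, and other currencies or decimal-comma locales would need extra rules.

```python
from decimal import Decimal, InvalidOperation


def parse_number(text):
    """Strip '$' and ',' and return a Decimal, or None if the text is not numeric."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


values = ["1,200", "$450.00"]
total = sum(parse_number(v) for v in values)
print(total)                 # 1650.00
print(parse_number("n/a"))   # None
```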
Data where time periods or categories are spread across columns, e.g., Jan, Feb, Mar as separate column headers. This wide format is readable but awkward to filter, group, or join by period. Most BI tools and merge operations expect long format.
Wide:

| Name | Jan | Feb |
|---|---|---|
| Alice | 120 | 130 |

Long:

| Name | Month | Hours |
|---|---|---|
| Alice | Jan | 120 |
| Alice | Feb | 130 |
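The wide-to-long conversion shown in the tables above can be sketched in plain Python. The function name `unpivot` and its parameter names are assumptions for this example.

```python
def unpivot(rows, id_columns, value_columns, var_name="Month", value_name="Hours"):
    """Turn each value column into its own row, repeating the id columns."""
    long_rows = []
    for row in rows:
        for col in value_columns:
            long_row = {c: row[c] for c in id_columns}
            long_row[var_name] = col          # the old header becomes a value
            long_row[value_name] = row[col]   # the old cell becomes the measure
            long_rows.append(long_row)
    return long_rows


wide = [{"Name": "Alice", "Jan": 120, "Feb": 130}]
long_rows = unpivot(wide, id_columns=["Name"], value_columns=["Jan", "Feb"])

for r in long_rows:
    print(r)
# {'Name': 'Alice', 'Month': 'Jan', 'Hours': 120}
# {'Name': 'Alice', 'Month': 'Feb', 'Hours': 130}
```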
Some exports include report titles, export dates, or blank rows above the actual column headers. When the header row is row 3 instead of row 1, every column name imports as Column1, Column2, etc.
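Skipping the rows above the real header is straightforward once the header row is known. The sketch below uses Python's `csv` module. The helper `read_with_header_row`, the 1-based row number, and the sample export are assumptions for illustration.

```python
import csv
import io


def read_with_header_row(text, header_row):
    """Parse CSV text, treating the given 1-based row as the header.

    Rows above the header are discarded. Blank rows below it are skipped.
    """
    lines = list(csv.reader(io.StringIO(text)))
    header = lines[header_row - 1]
    return [
        dict(zip(header, line))
        for line in lines[header_row:]
        if any(cell.strip() for cell in line)
    ]


export = "Quarterly Report\nExported 2024-01-15\nName,Hours\nAlice,120\nBob,90\n"

records = read_with_header_row(export, header_row=3)
print(records)  # [{'Name': 'Alice', 'Hours': '120'}, {'Name': 'Bob', 'Hours': '90'}]
```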
HR uses EMP-001. Workday uses W-001. Workfront uses the employee's full name. When systems use different ID schemes for the same entity, joins silently drop rows — no error, just missing data.
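One way to bridge two ID schemes is a crosswalk table that maps one system's IDs to the other's. The Python sketch below is hypothetical throughout: the mapping, the sample rows, and the helper `to_hr_id` are invented for illustration. The point is to collect unmatched IDs explicitly so they are reported and not silently dropped.

```python
# Hypothetical crosswalk from Workday-style IDs to HR-style IDs.
crosswalk = {"W-001": "EMP-001"}


def to_hr_id(workday_id):
    """Translate a Workday-style ID to the HR scheme, or None if unmapped."""
    return crosswalk.get(workday_id.strip().upper())


hr_rows = [{"id": "EMP-001", "name": "Alice"}]
workday_rows = [{"id": "W-001", "hours": 120}, {"id": "W-002", "hours": 90}]

hr_index = {row["id"]: row for row in hr_rows}
joined, unmatched = [], []
for row in workday_rows:
    hr_row = hr_index.get(to_hr_id(row["id"]))
    if hr_row:
        joined.append({**hr_row, "hours": row["hours"]})
    else:
        unmatched.append(row["id"])  # surface the miss

print(joined)     # [{'id': 'EMP-001', 'name': 'Alice', 'hours': 120}]
print(unmatched)  # ['W-002']
```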
An empty cell, the text "NULL", and the number 0 are three distinct values. Aggregating a column with blanks skews averages. Joining on a null key matches nothing. Most systems export nulls inconsistently.
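The effect on averages is easy to demonstrate. In the Python sketch below, the set of tokens treated as null and the helper names are assumptions for this example. Real exports may use other markers.

```python
# Tokens treated as "missing" in this sketch (compared case-insensitively).
NULL_TOKENS = {"", "null", "n/a", "na", "none"}


def to_number_or_none(text):
    """Return a float, or None when the cell is blank or a null marker."""
    if text is None or text.strip().lower() in NULL_TOKENS:
        return None
    return float(text)


def average(values):
    """Average only the values that are present. Return None if there are none."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


raw = ["10", "", "NULL", "0", "20"]
parsed = [to_number_or_none(v) for v in raw]

print(parsed)           # [10.0, None, None, 0.0, 20.0]
print(average(parsed))  # 10.0

# Treating the missing cells as zero drags the average down.
print(sum(v or 0.0 for v in parsed) / len(parsed))  # 6.0
```

Note that the genuine 0 is kept as a value, while the blank and "NULL" cells are excluded from the calculation.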
Duplicate records inflate counts and sums. They are often introduced by exports that include sub-total rows, by multi-sheet copies of the same data, or by incomplete deduplication upstream.
A merge target file needs a column that doesn't exist in the source — for example, a full name built from first and last, or a variance computed from budget and actuals. Attempting to join on a column that doesn't exist fails silently or throws an error.
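Both examples mentioned above, a full name and a budget variance, amount to deriving a new column from two existing ones. A minimal Python sketch, with hypothetical column names and sample values:

```python
rows = [
    {"first": "Alice", "last": "Ng", "budget": 1000, "actual": 1150},
    {"first": "Bob", "last": "", "budget": 800, "actual": 760},
]

for row in rows:
    # Text column built from two text columns. strip() removes the
    # trailing space left behind when the last name is blank.
    row["full_name"] = f"{row['first']} {row['last']}".strip()
    # Numeric column computed from two numeric columns.
    row["variance"] = row["actual"] - row["budget"]

print(rows[0]["full_name"], rows[0]["variance"])  # Alice Ng 150
print(rows[1]["full_name"], rows[1]["variance"])  # Bob -40
```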
Consider these items before running a merge or reconciliation.
Two values that look identical on screen, such as " Smith" and "Smith", can fail to match because one has a hidden space. Use Clean & Shape → Standardize → Clean up spaces to fix this. Capitalisation should also be consistent: "new york" and "New York" will not match unless normalised.