Show Menu
Cheatography

Cleaning data with MySQL Cheat Sheet (DRAFT) by

Cleaning your data with MySQL

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Remove duplicates

Create CTE with added row_num column
WITH dups AS
(SELECT *,
ROW_NUMBER() OVER (
PARTITION BY [colum­n_n­ames]) AS row_num
FROM [origi­nal­_table] )
SELECT *
FROM dups
WHERE row_num > 1;
Deleting duplicates
DELETE FROM
t USING t
INNER JOIN dups ON t.id = dups.id
WHERE
dups.row_num > 1;
 

Standa­rdise data

Inspect column
SELECT DISTINCT c
FROM t
ORDER BY 1;
Delete spaces
UPDATE t
SET c = TRIM(c)
Delete other character (.)
UPDATE t
SET c = TRIM(T­RAILING '.' FROM c)
WHERE c LIKE '[problem]

Use either
TRAILING
or
LEADING
depending on where the character is