How to remove duplicate rows in SQL (works with Postgres)

This snippet was handy for to remove duplicate rows in SQL. These had crept into a Postgres database table as a result of a dodgy join. This is sort of a note to self, really.

There are probably more than a couple of ways of doing this but the best example I came up with after reading around is:

DELETE FROM table_name
WHERE ctid NOT IN (
SELECT min(ctid)
FROM table_name
GROUP BY field1, field2);

The reason this works revolves around the ctid field. This is a default field you don’t have to specify that exists everywhere, so you can use it to remove duplicate rows (which are identified here by duplicates/groups created with the fields field1 and field2).

Useful references:

File:PostgreSQL logo.3colors.120x120.png
The excellent PostgreSQL logo!

It might well be that if you’re using MySQL/MariaDB/MS SQL or any of those other SQL variants, that this doesn’t apply… I’ll update this if I ever need to use those versions to remove duplicate lines from a table. Meanwhile you might be interested in a list of SQL tips and tricks that I’m putting together — not so much to do with syntax as approaches in the wild.

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top