Find duplicate rows in table with SQL
There are constructs where you shouldn't have duplicate column combinations in a table. Best protection against it, is a combined unique constraint. But depending on when you join the project, it might already be to late and you might already have those duplicates in your table.
It's possible to find them with SQL. In our example we have an event sourcing system with domain messages. Every message has a
aggregate_root_id and an
aggregate_root_version. The combination of them should be unique.
We find the duplicates by using a
GROUP BY. You have to make sure that all columns are included in
SELECT COUNT(*), aggregate_root_id, aggregate_root_version FROM domain_messages GROUP BY aggregate_root_id, aggregate_root_version HAVING COUNT(*) > 1;