""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
SQL TIP: DELETE DUPLICATES BASED ON A COLUMN LIST
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
PRODUCT : R:BASE VERSION : 3.1
CATEGORY : SQL SUBCATEGORY : DELETE DUPLICATES
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Replace programs, complex reports, and relational commands with SQL to
achieve maximum performance.
Using SQL, you can find duplicates based on a list of columns. All you
need is a SELECT command that does the equivalent of a TALLY and uses
a HAVING clause to select the groups that have counts greater than
one. To delete the duplicates, use the SELECT in a DECLARE CURSOR
structure to delete duplicates after keeping the first row in each
group.
For example, using CONCOMP, add a few duplicate rows to TRANSMASTER.
Then run SQLXDUPE (below) to delete duplicates based on the TRANSID,
CUSTID, and EMPID columns:
*( SQLXDUPE.CMD--Delete duplicates)
*( based on a set of columns.)
SET BELL OFF
DROP CURSOR c1
DECLARE c1 CURSOR FOR +
SELECT transid, custid, empid +
FROM transmaster WHERE transid IN +
(SELECT transid FROM transmaster +
GROUP BY transid, custid, empid +
HAVING COUNT(*) > 1 ) +
GROUP BY transid, custid, empid
OPEN c1
FETCH c1 INTO vtransid vitrans, +
vcustid vicust, vempid viemp
WHILE SQLCODE <> 100 THEN
DELETE ROW FROM transmaster +
WHERE transid=.vtransid AND +
custid=.vcustid AND empid = .vempid +
AND COUNT > 1
FETCH c1 INTO vtransid vitrans, +
vcustid vicust, vempid viemp
ENDWHILE
CLEAR VAR vtransid, vitrans, vcustid, +
vicust, vempid, viemp
Modify SQLXDUPE to use your columns and tables. This example also
shows you a good DECLARE CURSOR structure with INDICATOR variables for
each variable fetched by the FETCH command.