How to safely remove only duplicated rows?
I have a tables that contains 6.820.483 and between these rows there a lot of duplicates, I discovered that running this query:
SELECT player_id, match_id, team_id, count(*)
FROM fixtures
GROUP BY player_id, match_id, team_id
HAVING COUNT(*) > 1
structure example:
player_id | match_id | team_id
19014 2506172 12573
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573
19016 2506172 12573
19016 2506172 12573
how can I safely remove only the duplicates? In the example above the table should looks like:
player_id | match_id | team_id
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573
table structure:
CREATE TABLE IF NOT EXISTS `swp`.`fixtures` (
`player_id` INT NOT NULL,
`match_id` INT NOT NULL,
`team_id` INT NOT NULL,
INDEX `player_id_idx` (`player_id` ASC),
INDEX `match_id_idx` (`match_id` ASC),
INDEX `FK_team_fixtures_id_idx` (`team_id` ASC),
CONSTRAINT `FK_player_fixtures_id`
FOREIGN KEY (`player_id`)
REFERENCES `swp`.`player` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_match_fixtures_id`
FOREIGN KEY (`match_id`)
REFERENCES `swp`.`match` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_team_fixtures_id`
FOREIGN KEY (`team_id`)
REFERENCES `swp`.`team` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
mysql sql
|
show 6 more comments
I have a tables that contains 6.820.483 and between these rows there a lot of duplicates, I discovered that running this query:
SELECT player_id, match_id, team_id, count(*)
FROM fixtures
GROUP BY player_id, match_id, team_id
HAVING COUNT(*) > 1
structure example:
player_id | match_id | team_id
19014 2506172 12573
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573
19016 2506172 12573
19016 2506172 12573
how can I safely remove only the duplicates? In the example above the table should looks like:
player_id | match_id | team_id
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573
table structure:
CREATE TABLE IF NOT EXISTS `swp`.`fixtures` (
`player_id` INT NOT NULL,
`match_id` INT NOT NULL,
`team_id` INT NOT NULL,
INDEX `player_id_idx` (`player_id` ASC),
INDEX `match_id_idx` (`match_id` ASC),
INDEX `FK_team_fixtures_id_idx` (`team_id` ASC),
CONSTRAINT `FK_player_fixtures_id`
FOREIGN KEY (`player_id`)
REFERENCES `swp`.`player` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_match_fixtures_id`
FOREIGN KEY (`match_id`)
REFERENCES `swp`.`match` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_team_fixtures_id`
FOREIGN KEY (`team_id`)
REFERENCES `swp`.`team` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
mysql sql
1
What do you mean by remove? Do you want to delete them from the table, or just select only unique values?
– WillardSolutions
Nov 20 '18 at 15:24
@WillardSolutions I need to delete them because are duplicated check the example for understand
– Spartaok
Nov 20 '18 at 15:25
Is there a primary key on thefixtures
table?
– WillardSolutions
Nov 20 '18 at 15:25
Does your table have some PK that uniquely identifies a record?
– Robert Kock
Nov 20 '18 at 15:25
@WillardSolutions nope, the table contains FK of records available in other tables
– Spartaok
Nov 20 '18 at 15:26
|
show 6 more comments
I have a tables that contains 6.820.483 and between these rows there a lot of duplicates, I discovered that running this query:
SELECT player_id, match_id, team_id, count(*)
FROM fixtures
GROUP BY player_id, match_id, team_id
HAVING COUNT(*) > 1
structure example:
player_id | match_id | team_id
19014 2506172 12573
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573
19016 2506172 12573
19016 2506172 12573
how can I safely remove only the duplicates? In the example above the table should looks like:
player_id | match_id | team_id
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573
table structure:
CREATE TABLE IF NOT EXISTS `swp`.`fixtures` (
`player_id` INT NOT NULL,
`match_id` INT NOT NULL,
`team_id` INT NOT NULL,
INDEX `player_id_idx` (`player_id` ASC),
INDEX `match_id_idx` (`match_id` ASC),
INDEX `FK_team_fixtures_id_idx` (`team_id` ASC),
CONSTRAINT `FK_player_fixtures_id`
FOREIGN KEY (`player_id`)
REFERENCES `swp`.`player` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_match_fixtures_id`
FOREIGN KEY (`match_id`)
REFERENCES `swp`.`match` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_team_fixtures_id`
FOREIGN KEY (`team_id`)
REFERENCES `swp`.`team` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
mysql sql
I have a tables that contains 6.820.483 and between these rows there a lot of duplicates, I discovered that running this query:
SELECT player_id, match_id, team_id, count(*)
FROM fixtures
GROUP BY player_id, match_id, team_id
HAVING COUNT(*) > 1
structure example:
player_id | match_id | team_id
19014 2506172 12573
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573
19016 2506172 12573
19016 2506172 12573
how can I safely remove only the duplicates? In the example above the table should looks like:
player_id | match_id | team_id
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573
table structure:
CREATE TABLE IF NOT EXISTS `swp`.`fixtures` (
`player_id` INT NOT NULL,
`match_id` INT NOT NULL,
`team_id` INT NOT NULL,
INDEX `player_id_idx` (`player_id` ASC),
INDEX `match_id_idx` (`match_id` ASC),
INDEX `FK_team_fixtures_id_idx` (`team_id` ASC),
CONSTRAINT `FK_player_fixtures_id`
FOREIGN KEY (`player_id`)
REFERENCES `swp`.`player` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_match_fixtures_id`
FOREIGN KEY (`match_id`)
REFERENCES `swp`.`match` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_team_fixtures_id`
FOREIGN KEY (`team_id`)
REFERENCES `swp`.`team` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
mysql sql
mysql sql
edited Nov 20 '18 at 15:32
Spartaok
asked Nov 20 '18 at 15:20
SpartaokSpartaok
748
748
1
What do you mean by remove? Do you want to delete them from the table, or just select only unique values?
– WillardSolutions
Nov 20 '18 at 15:24
@WillardSolutions I need to delete them because are duplicated check the example for understand
– Spartaok
Nov 20 '18 at 15:25
Is there a primary key on thefixtures
table?
– WillardSolutions
Nov 20 '18 at 15:25
Does your table have some PK that uniquely identifies a record?
– Robert Kock
Nov 20 '18 at 15:25
@WillardSolutions nope, the table contains FK of records available in other tables
– Spartaok
Nov 20 '18 at 15:26
|
show 6 more comments
1
What do you mean by remove? Do you want to delete them from the table, or just select only unique values?
– WillardSolutions
Nov 20 '18 at 15:24
@WillardSolutions I need to delete them because are duplicated check the example for understand
– Spartaok
Nov 20 '18 at 15:25
Is there a primary key on thefixtures
table?
– WillardSolutions
Nov 20 '18 at 15:25
Does your table have some PK that uniquely identifies a record?
– Robert Kock
Nov 20 '18 at 15:25
@WillardSolutions nope, the table contains FK of records available in other tables
– Spartaok
Nov 20 '18 at 15:26
1
1
What do you mean by remove? Do you want to delete them from the table, or just select only unique values?
– WillardSolutions
Nov 20 '18 at 15:24
What do you mean by remove? Do you want to delete them from the table, or just select only unique values?
– WillardSolutions
Nov 20 '18 at 15:24
@WillardSolutions I need to delete them because are duplicated check the example for understand
– Spartaok
Nov 20 '18 at 15:25
@WillardSolutions I need to delete them because are duplicated check the example for understand
– Spartaok
Nov 20 '18 at 15:25
Is there a primary key on the
fixtures
table?– WillardSolutions
Nov 20 '18 at 15:25
Is there a primary key on the
fixtures
table?– WillardSolutions
Nov 20 '18 at 15:25
Does your table have some PK that uniquely identifies a record?
– Robert Kock
Nov 20 '18 at 15:25
Does your table have some PK that uniquely identifies a record?
– Robert Kock
Nov 20 '18 at 15:25
@WillardSolutions nope, the table contains FK of records available in other tables
– Spartaok
Nov 20 '18 at 15:26
@WillardSolutions nope, the table contains FK of records available in other tables
– Spartaok
Nov 20 '18 at 15:26
|
show 6 more comments
3 Answers
3
active
oldest
votes
I'm not a MySQL expect but you could try this (if you're sure no new records will be inserted in the meantime):
CREATE TABLE tmp_fixtures
(
player_id INT NOT NULL,
match_id INT NOT NULL,
team_id INT NOT NULL
);
SELECT DISTINCT
player_id,
match_id,
team_id
INTO tmp_fixtures
FROM fixtures;
TRUNCATE TABLE fixtures;
In order to make sure no duplicated records are created anymore, you could do the following:
ALTER TABLE fixtures ADD PRIMARY KEY (player_id, match_id, team_id);
After this, repopulate the table and clean up:
INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id,
match_id,
team_id
FROM tmp_fixtures;
DROP TABLE tmp_fixtures;
This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.
– afe
Nov 20 '18 at 15:47
this give me that error: undeclared tmp_fixtures
– Spartaok
Nov 20 '18 at 15:48
Right idea, wrong database. You should be usingcreate table as
.
– Gordon Linoff
Nov 20 '18 at 16:23
Updated the answer.
– Robert Kock
Nov 21 '18 at 8:16
add a comment |
Robert and forpas both provided much better answers, but technically I believe this could be done without creating a new table (at least in MSSQL). I've attempted to translate into MySQL. Again I would likely never do it this way, especally on large data sets, but it was an interesting exercise.
As with all solutions, if you do attempt this backup your table first.
DECLARE @i INT = 0
WHILE @i < 6820483
BEGIN
DELETE FROM f
FROM (
SELECT *
FROM fixtures
WHERE player_id IN (SELECT player_id FROM fixtures GROUP BY player_id HAVING COUNT(*) > 1)
LIMIT 1
) f
SET @i = @i + 1
END
Also as pointed out by the other answers, you will likely want to create a Primary Key
to prevent this in the future.
but why robert solution give me the error error: undeclared tmp_fixtures?
– Spartaok
Nov 20 '18 at 16:21
@Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and doINSERT INTO tmp_fixtures SELECT DISTINCT...
– Derrick Moeller
Nov 20 '18 at 16:25
add a comment |
There is no other solution but backup distinct rows of the table in a temporary table and restore it afterwards just like @Robert Kock proposed, but:
duplicates can appear again just like before.
So prior to restoring the table run this statement:
ALTER TABLE swp.fixtures ADD PRIMARY KEY(player_id, match_id, team_id);
to add a multi-column primary key so the problem will not appear again.
Edit1
From: https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-select-into-table.html
MySQL Server doesn't support the SELECT ... INTO TABLE Sybase SQL
extension. Instead, MySQL Server supports the INSERT INTO ... SELECT
standard SQL syntax, which is basically the same thing. See Section
13.2.6.1, “INSERT ... SELECT Syntax”. For example:
INSERT INTO tbl_temp2 (fld_id)
SELECT tbl_temp1.fld_order_id
FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;
Edit2 (after Gordon Linoff's suggestion)
So the whole code should be:
CREATE TABLE tmp_fixtures AS
SELECT DISTINCT player_id, match_id, team_id FROM fixtures;
TRUNCATE TABLE fixtures;
ALTER TABLE fixtures ADD PRIMARY KEY(player_id, match_id, team_id);
INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id, match_id, team_id FROM tmp_fixtures;
DROP TABLE tmp_fixtures;
Use with caution and only if you have a backup of your data.
the solution propose by robert return an error could you check that'
– Spartaok
Nov 20 '18 at 15:57
@LukStorms I don't. The documentation did...
– forpas
Nov 20 '18 at 16:07
Ah right you are. It was text copied from the source. My bad.
– LukStorms
Nov 20 '18 at 16:08
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53396170%2fhow-to-safely-remove-only-duplicated-rows%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
I'm not a MySQL expect but you could try this (if you're sure no new records will be inserted in the meantime):
CREATE TABLE tmp_fixtures
(
player_id INT NOT NULL,
match_id INT NOT NULL,
team_id INT NOT NULL
);
SELECT DISTINCT
player_id,
match_id,
team_id
INTO tmp_fixtures
FROM fixtures;
TRUNCATE TABLE fixtures;
In order to make sure no duplicated records are created anymore, you could do the following:
ALTER TABLE fixtures ADD PRIMARY KEY (player_id, match_id, team_id);
After this, repopulate the table and clean up:
INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id,
match_id,
team_id
FROM tmp_fixtures;
DROP TABLE tmp_fixtures;
This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.
– afe
Nov 20 '18 at 15:47
this give me that error: undeclared tmp_fixtures
– Spartaok
Nov 20 '18 at 15:48
Right idea, wrong database. You should be usingcreate table as
.
– Gordon Linoff
Nov 20 '18 at 16:23
Updated the answer.
– Robert Kock
Nov 21 '18 at 8:16
add a comment |
I'm not a MySQL expect but you could try this (if you're sure no new records will be inserted in the meantime):
CREATE TABLE tmp_fixtures
(
player_id INT NOT NULL,
match_id INT NOT NULL,
team_id INT NOT NULL
);
SELECT DISTINCT
player_id,
match_id,
team_id
INTO tmp_fixtures
FROM fixtures;
TRUNCATE TABLE fixtures;
In order to make sure no duplicated records are created anymore, you could do the following:
ALTER TABLE fixtures ADD PRIMARY KEY (player_id, match_id, team_id);
After this, repopulate the table and clean up:
INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id,
match_id,
team_id
FROM tmp_fixtures;
DROP TABLE tmp_fixtures;
This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.
– afe
Nov 20 '18 at 15:47
this give me that error: undeclared tmp_fixtures
– Spartaok
Nov 20 '18 at 15:48
Right idea, wrong database. You should be usingcreate table as
.
– Gordon Linoff
Nov 20 '18 at 16:23
Updated the answer.
– Robert Kock
Nov 21 '18 at 8:16
add a comment |
I'm not a MySQL expect but you could try this (if you're sure no new records will be inserted in the meantime):
CREATE TABLE tmp_fixtures
(
player_id INT NOT NULL,
match_id INT NOT NULL,
team_id INT NOT NULL
);
SELECT DISTINCT
player_id,
match_id,
team_id
INTO tmp_fixtures
FROM fixtures;
TRUNCATE TABLE fixtures;
In order to make sure no duplicated records are created anymore, you could do the following:
ALTER TABLE fixtures ADD PRIMARY KEY (player_id, match_id, team_id);
After this, repopulate the table and clean up:
INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id,
match_id,
team_id
FROM tmp_fixtures;
DROP TABLE tmp_fixtures;
I'm not a MySQL expect but you could try this (if you're sure no new records will be inserted in the meantime):
CREATE TABLE tmp_fixtures
(
player_id INT NOT NULL,
match_id INT NOT NULL,
team_id INT NOT NULL
);
SELECT DISTINCT
player_id,
match_id,
team_id
INTO tmp_fixtures
FROM fixtures;
TRUNCATE TABLE fixtures;
In order to make sure no duplicated records are created anymore, you could do the following:
ALTER TABLE fixtures ADD PRIMARY KEY (player_id, match_id, team_id);
After this, repopulate the table and clean up:
INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id,
match_id,
team_id
FROM tmp_fixtures;
DROP TABLE tmp_fixtures;
edited Nov 21 '18 at 8:15
answered Nov 20 '18 at 15:40
Robert KockRobert Kock
4,0841617
4,0841617
This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.
– afe
Nov 20 '18 at 15:47
this give me that error: undeclared tmp_fixtures
– Spartaok
Nov 20 '18 at 15:48
Right idea, wrong database. You should be usingcreate table as
.
– Gordon Linoff
Nov 20 '18 at 16:23
Updated the answer.
– Robert Kock
Nov 21 '18 at 8:16
add a comment |
This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.
– afe
Nov 20 '18 at 15:47
this give me that error: undeclared tmp_fixtures
– Spartaok
Nov 20 '18 at 15:48
Right idea, wrong database. You should be usingcreate table as
.
– Gordon Linoff
Nov 20 '18 at 16:23
Updated the answer.
– Robert Kock
Nov 21 '18 at 8:16
This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.
– afe
Nov 20 '18 at 15:47
This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.
– afe
Nov 20 '18 at 15:47
this give me that error: undeclared tmp_fixtures
– Spartaok
Nov 20 '18 at 15:48
this give me that error: undeclared tmp_fixtures
– Spartaok
Nov 20 '18 at 15:48
Right idea, wrong database. You should be using
create table as
.– Gordon Linoff
Nov 20 '18 at 16:23
Right idea, wrong database. You should be using
create table as
.– Gordon Linoff
Nov 20 '18 at 16:23
Updated the answer.
– Robert Kock
Nov 21 '18 at 8:16
Updated the answer.
– Robert Kock
Nov 21 '18 at 8:16
add a comment |
Robert and forpas both provided much better answers, but technically I believe this could be done without creating a new table (at least in MSSQL). I've attempted to translate into MySQL. Again I would likely never do it this way, especally on large data sets, but it was an interesting exercise.
As with all solutions, if you do attempt this backup your table first.
DECLARE @i INT = 0
WHILE @i < 6820483
BEGIN
DELETE FROM f
FROM (
SELECT *
FROM fixtures
WHERE player_id IN (SELECT player_id FROM fixtures GROUP BY player_id HAVING COUNT(*) > 1)
LIMIT 1
) f
SET @i = @i + 1
END
Also as pointed out by the other answers, you will likely want to create a Primary Key
to prevent this in the future.
but why robert solution give me the error error: undeclared tmp_fixtures?
– Spartaok
Nov 20 '18 at 16:21
@Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and doINSERT INTO tmp_fixtures SELECT DISTINCT...
– Derrick Moeller
Nov 20 '18 at 16:25
add a comment |
Robert and forpas both provided much better answers, but technically I believe this could be done without creating a new table (at least in MSSQL). I've attempted to translate into MySQL. Again I would likely never do it this way, especally on large data sets, but it was an interesting exercise.
As with all solutions, if you do attempt this backup your table first.
DECLARE @i INT = 0
WHILE @i < 6820483
BEGIN
DELETE FROM f
FROM (
SELECT *
FROM fixtures
WHERE player_id IN (SELECT player_id FROM fixtures GROUP BY player_id HAVING COUNT(*) > 1)
LIMIT 1
) f
SET @i = @i + 1
END
Also as pointed out by the other answers, you will likely want to create a Primary Key
to prevent this in the future.
but why robert solution give me the error error: undeclared tmp_fixtures?
– Spartaok
Nov 20 '18 at 16:21
@Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and doINSERT INTO tmp_fixtures SELECT DISTINCT...
– Derrick Moeller
Nov 20 '18 at 16:25
add a comment |
Robert and forpas both provided much better answers, but technically I believe this could be done without creating a new table (at least in MSSQL). I've attempted to translate into MySQL. Again I would likely never do it this way, especally on large data sets, but it was an interesting exercise.
As with all solutions, if you do attempt this backup your table first.
DECLARE @i INT = 0
WHILE @i < 6820483
BEGIN
DELETE FROM f
FROM (
SELECT *
FROM fixtures
WHERE player_id IN (SELECT player_id FROM fixtures GROUP BY player_id HAVING COUNT(*) > 1)
LIMIT 1
) f
SET @i = @i + 1
END
Also as pointed out by the other answers, you will likely want to create a Primary Key
to prevent this in the future.
Robert and forpas both provided much better answers, but technically I believe this could be done without creating a new table (at least in MSSQL). I've attempted to translate into MySQL. Again I would likely never do it this way, especally on large data sets, but it was an interesting exercise.
As with all solutions, if you do attempt this backup your table first.
DECLARE @i INT = 0
WHILE @i < 6820483
BEGIN
DELETE FROM f
FROM (
SELECT *
FROM fixtures
WHERE player_id IN (SELECT player_id FROM fixtures GROUP BY player_id HAVING COUNT(*) > 1)
LIMIT 1
) f
SET @i = @i + 1
END
Also as pointed out by the other answers, you will likely want to create a Primary Key
to prevent this in the future.
edited Nov 20 '18 at 16:29
answered Nov 20 '18 at 16:14
Derrick MoellerDerrick Moeller
2,72121433
2,72121433
but why robert solution give me the error error: undeclared tmp_fixtures?
– Spartaok
Nov 20 '18 at 16:21
@Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and doINSERT INTO tmp_fixtures SELECT DISTINCT...
– Derrick Moeller
Nov 20 '18 at 16:25
add a comment |
but why robert solution give me the error error: undeclared tmp_fixtures?
– Spartaok
Nov 20 '18 at 16:21
@Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and doINSERT INTO tmp_fixtures SELECT DISTINCT...
– Derrick Moeller
Nov 20 '18 at 16:25
but why robert solution give me the error error: undeclared tmp_fixtures?
– Spartaok
Nov 20 '18 at 16:21
but why robert solution give me the error error: undeclared tmp_fixtures?
– Spartaok
Nov 20 '18 at 16:21
@Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and do
INSERT INTO tmp_fixtures SELECT DISTINCT...
– Derrick Moeller
Nov 20 '18 at 16:25
@Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and do
INSERT INTO tmp_fixtures SELECT DISTINCT...
– Derrick Moeller
Nov 20 '18 at 16:25
add a comment |
There is no other solution but backup distinct rows of the table in a temporary table and restore it afterwards just like @Robert Kock proposed, but:
duplicates can appear again just like before.
So prior to restoring the table run this statement:
ALTER TABLE swp.fixtures ADD PRIMARY KEY(player_id, match_id, team_id);
to add a multi-column primary key so the problem will not appear again.
Edit1
From: https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-select-into-table.html
MySQL Server doesn't support the SELECT ... INTO TABLE Sybase SQL
extension. Instead, MySQL Server supports the INSERT INTO ... SELECT
standard SQL syntax, which is basically the same thing. See Section
13.2.6.1, “INSERT ... SELECT Syntax”. For example:
INSERT INTO tbl_temp2 (fld_id)
SELECT tbl_temp1.fld_order_id
FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;
Edit2 (after Gordon Linoff's suggestion)
So the whole code should be:
CREATE TABLE tmp_fixtures AS
SELECT DISTINCT player_id, match_id, team_id FROM fixtures;
TRUNCATE TABLE fixtures;
ALTER TABLE fixtures ADD PRIMARY KEY(player_id, match_id, team_id);
INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id, match_id, team_id FROM tmp_fixtures;
DROP TABLE tmp_fixtures;
Use with caution and only if you have a backup of your data.
the solution propose by robert return an error could you check that'
– Spartaok
Nov 20 '18 at 15:57
@LukStorms I don't. The documentation did...
– forpas
Nov 20 '18 at 16:07
Ah right you are. It was text copied from the source. My bad.
– LukStorms
Nov 20 '18 at 16:08
add a comment |
There is no other solution but backup distinct rows of the table in a temporary table and restore it afterwards just like @Robert Kock proposed, but:
duplicates can appear again just like before.
So prior to restoring the table run this statement:
ALTER TABLE swp.fixtures ADD PRIMARY KEY(player_id, match_id, team_id);
to add a multi-column primary key so the problem will not appear again.
Edit1
From: https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-select-into-table.html
MySQL Server doesn't support the SELECT ... INTO TABLE Sybase SQL
extension. Instead, MySQL Server supports the INSERT INTO ... SELECT
standard SQL syntax, which is basically the same thing. See Section
13.2.6.1, “INSERT ... SELECT Syntax”. For example:
INSERT INTO tbl_temp2 (fld_id)
SELECT tbl_temp1.fld_order_id
FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;
Edit2 (after Gordon Linoff's suggestion)
So the whole code should be:
CREATE TABLE tmp_fixtures AS
SELECT DISTINCT player_id, match_id, team_id FROM fixtures;
TRUNCATE TABLE fixtures;
ALTER TABLE fixtures ADD PRIMARY KEY(player_id, match_id, team_id);
INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id, match_id, team_id FROM tmp_fixtures;
DROP TABLE tmp_fixtures;
Use with caution and only if you have a backup of your data.
the solution propose by robert return an error could you check that'
– Spartaok
Nov 20 '18 at 15:57
@LukStorms I don't. The documentation did...
– forpas
Nov 20 '18 at 16:07
Ah right you are. It was text copied from the source. My bad.
– LukStorms
Nov 20 '18 at 16:08
add a comment |
There is no other solution but backup distinct rows of the table in a temporary table and restore it afterwards just like @Robert Kock proposed, but:
duplicates can appear again just like before.
So prior to restoring the table run this statement:
ALTER TABLE swp.fixtures ADD PRIMARY KEY(player_id, match_id, team_id);
to add a multi-column primary key so the problem will not appear again.
Edit1
From: https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-select-into-table.html
MySQL Server doesn't support the SELECT ... INTO TABLE Sybase SQL
extension. Instead, MySQL Server supports the INSERT INTO ... SELECT
standard SQL syntax, which is basically the same thing. See Section
13.2.6.1, “INSERT ... SELECT Syntax”. For example:
INSERT INTO tbl_temp2 (fld_id)
SELECT tbl_temp1.fld_order_id
FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;
Edit2 (after Gordon Linoff's suggestion)
So the whole code should be:
CREATE TABLE tmp_fixtures AS
SELECT DISTINCT player_id, match_id, team_id FROM fixtures;
TRUNCATE TABLE fixtures;
ALTER TABLE fixtures ADD PRIMARY KEY(player_id, match_id, team_id);
INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id, match_id, team_id FROM tmp_fixtures;
DROP TABLE tmp_fixtures;
Use with caution and only if you have a backup of your data.
There is no other solution but backup distinct rows of the table in a temporary table and restore it afterwards just like @Robert Kock proposed, but:
duplicates can appear again just like before.
So prior to restoring the table run this statement:
ALTER TABLE swp.fixtures ADD PRIMARY KEY(player_id, match_id, team_id);
to add a multi-column primary key so the problem will not appear again.
Edit1
From: https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-select-into-table.html
MySQL Server doesn't support the SELECT ... INTO TABLE Sybase SQL
extension. Instead, MySQL Server supports the INSERT INTO ... SELECT
standard SQL syntax, which is basically the same thing. See Section
13.2.6.1, “INSERT ... SELECT Syntax”. For example:
INSERT INTO tbl_temp2 (fld_id)
SELECT tbl_temp1.fld_order_id
FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;
Edit2 (after Gordon Linoff's suggestion)
So the whole code should be:
CREATE TABLE tmp_fixtures AS
SELECT DISTINCT player_id, match_id, team_id FROM fixtures;
TRUNCATE TABLE fixtures;
ALTER TABLE fixtures ADD PRIMARY KEY(player_id, match_id, team_id);
INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id, match_id, team_id FROM tmp_fixtures;
DROP TABLE tmp_fixtures;
Use with caution and only if you have a backup of your data.
edited Nov 20 '18 at 16:35
answered Nov 20 '18 at 15:53
forpasforpas
11k3423
11k3423
the solution propose by robert return an error could you check that'
– Spartaok
Nov 20 '18 at 15:57
@LukStorms I don't. The documentation did...
– forpas
Nov 20 '18 at 16:07
Ah right you are. It was text copied from the source. My bad.
– LukStorms
Nov 20 '18 at 16:08
add a comment |
the solution propose by robert return an error could you check that'
– Spartaok
Nov 20 '18 at 15:57
@LukStorms I don't. The documentation did...
– forpas
Nov 20 '18 at 16:07
Ah right you are. It was text copied from the source. My bad.
– LukStorms
Nov 20 '18 at 16:08
the solution propose by robert return an error could you check that'
– Spartaok
Nov 20 '18 at 15:57
the solution propose by robert return an error could you check that'
– Spartaok
Nov 20 '18 at 15:57
@LukStorms I don't. The documentation did...
– forpas
Nov 20 '18 at 16:07
@LukStorms I don't. The documentation did...
– forpas
Nov 20 '18 at 16:07
Ah right you are. It was text copied from the source. My bad.
– LukStorms
Nov 20 '18 at 16:08
Ah right you are. It was text copied from the source. My bad.
– LukStorms
Nov 20 '18 at 16:08
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53396170%2fhow-to-safely-remove-only-duplicated-rows%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
What do you mean by remove? Do you want to delete them from the table, or just select only unique values?
– WillardSolutions
Nov 20 '18 at 15:24
@WillardSolutions I need to delete them because are duplicated check the example for understand
– Spartaok
Nov 20 '18 at 15:25
Is there a primary key on the
fixtures
table?– WillardSolutions
Nov 20 '18 at 15:25
Does your table have some PK that uniquely identifies a record?
– Robert Kock
Nov 20 '18 at 15:25
@WillardSolutions nope, the table contains FK of records available in other tables
– Spartaok
Nov 20 '18 at 15:26