免费注册 查看新帖 |

Chinaunix

  平台 论坛 博客 文库
最近访问板块 发新帖
查看: 1029 | 回复: 0
打印 上一主题 下一主题

MYSQL中删除重复记录的方法 [复制链接]

论坛徽章:
0
跳转到指定楼层
1 [收藏(0)] [报告]
发表于 2008-03-15 14:46 |只看该作者 |倒序浏览

转载

1、
具体实现如下:

Table         Create Table                                            
------------  --------------------------------------------------------
users_groups  CREATE TABLE `users_groups` (                           
                `id` int(10) unsigned NOT NULL AUTO_INCREMENT,        
                `uid` int(11) NOT NULL,                              
                `gid` int(11) NOT NULL,                              
                PRIMARY KEY (`id`)                                    
              ) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=utf8  

users_groups.txt内容:
1,11,502
2,107,502
3,100,503
4,110,501
5,112,501
6,104,502
7,100,502
8,100,501
9,102,501
10,104,502
11,100,502
12,100,501
13,102,501
14,110,501

mysql> load data infile 'c:\\users_groups.txt' into table users_groups fieldsrminated by ',' lines terminated by '\n';
Query OK, 14 rows affected (0.05 sec)
Records: 14  Deleted: 0  Skipped: 0  Warnings: 0
mysql> select * from users_groups;
query result(14 records)
id
uid
gid
1
11
502
2
107
502
3
100
503
4
110
501
5
112
501
6
104
502
7
100
502
8
100
501
9
102
501
10
104
502
11
100
502
12
100
501
13
102
501
14
110
50114 rows in set (0.00 sec)
mysql> create temporary table tmp_wrap select * from users_groups group by uid having count(1) > 1 union all
select * from users_groups group by uid having count(1) = 1;
Query OK, 7 rows affected (0.11 sec)
Records: 7  Duplicates: 0  Warnings: 0
mysql> truncate table users_groups;
Query OK, 14 rows affected (0.03 sec)
mysql> insert into users_groups select * from tmp_wrap;
Query OK, 7 rows affected (0.03 sec)
Records: 7  Duplicates: 0  Warnings: 0
mysql> select * from users_groups;
query result(7 records)
id
uid
gid
1
11
502
2
107
502
3
100
503
4
110
501
5
112
501
6
104
502
9
102
501
mysql> drop table tmp_wrap;
Query OK, 0 rows affected (0.05 sec)
2、还有一个很精简的办法。
查找重复的,并且除掉最小的那个。
delete users_groups as a from users_groups as a,
(
select *,min(id) from users_groups group by uid having count(1) > 1
) as b
where a.uid = b.uid and a.id > b.id;

(7 row(s)affected)
(0 ms taken)

query result(7 records)
id
uid
gid
1
11
502
2
107
502
3
100
503
4
110
501
5
112
501
6
104
502
9
102
501
3、现在来看一下这两个办法的效率。
运行一下以下SQL 语句
create index f_uid on users_groups(uid);
explain select * from users_groups group by uid having count(1) > 1 union all
select * from users_groups group by uid having count(1) = 1;

explain select * from  users_groups as a,
(
select *,min(id) from users_groups group by uid having count(1) > 1
) as b
where a.uid = b.uid and a.id > b.id;

query result(3 records)
id
select_type
table
type
possible_keys
key
key_len
ref
rows
Extra
1
PRIMARY
users_groups
index
(NULL)
f_uid
4
(NULL)
14

2
UNION
users_groups
index
(NULL)
f_uid
4
(NULL)
14

(NULL)
UNION RESULT
ALL
(NULL)
(NULL)
(NULL)
(NULL)
(NULL)


query result(3 records)
id
select_type
table
type
possible_keys
key
key_len
ref
rows
Extra
1
PRIMARY
ALL
(NULL)
(NULL)
(NULL)
(NULL)
4

1
PRIMARY
a
ref
PRIMARY,f_uid
f_uid
4
b.uid
1
Using where
2
DERIVED
users_groups
index
(NULL)
f_uid
4
(NULL)
14



很明显的第二个比第一个扫描的函数要少。


本文来自ChinaUnix博客,如果查看原文请点:http://blog.chinaunix.net/u/25044/showart_497086.html
您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

  

北京盛拓优讯信息技术有限公司. 版权所有 京ICP备16024965号-6 北京市公安局海淀分局网监中心备案编号:11010802020122 niuxiaotong@pcpop.com 17352615567
未成年举报专区
中国互联网协会会员  联系我们:huangweiwei@itpub.net
感谢所有关心和支持过ChinaUnix的朋友们 转载本站内容请注明原作者名及出处

清除 Cookies - ChinaUnix - Archiver - WAP - TOP