《MySQL查询优化实践.ppt》由会员分享,可在线阅读,更多相关《MySQL查询优化实践.ppt(46页珍藏版)》请在三一办公上搜索。
1、,MySQL查询调优实践,David Jiang,Weibo:insidemysql,Gtalk:,DTCC2012,主题,B+树索引,简单查询优化,OLTP,复杂查询优化,OLAP,DTCC2012,关于我,近10年MySQL数据库使用经验,MySQL 3.23 MySQL 5.6,InnoSQL分支版本创始人,www.innomysql.org,独立数据库咨询顾问,MySQL技术内幕系列作者,MySQL技术内幕:InnoDB存储引擎(已出版)MySQL技术内幕:SQL编程(已出版)MySQL技术内幕:性能调优(待出版),DTCC2012,B+树索引,常用的索引 B+Tree Index T
2、 Tree Index Hash Index,什么是索引?,提高查询速度?It depends,减少IO次数DTCC2012,B+树索引,聚集索引(Clustered Index),叶子节点存放整行记录,辅助索引(Secondary Index),叶子节点存放row identifier,InnoDB:primary key,书签查找(bookmark lookup),查找代价大,MyISAM:物理位置(偏移量),更新代价大,B+树的高度=IO次数=随机IO,34层,DTCC2012,B+树索引-InnoDB存储引擎,DTCC2012,B+树索引-MyISAM存储引擎,DTCC2012,H,2
3、,3,DTCC2012,4,B+树索引聚集索引 VS 辅助索引Clustered index key=4 bytesSecondary index key=4 bytesKey pointer=6 bytesAverage row length=300 bytesPage size=16K=16384 bytesAverage node occupancy=70%(both for leaf and index page)Fan-out for clustered index=16384*70%/(4+6)=1000Fan-out for secondary index=16384*70%/(
4、4+6)=1000Average row per page for clustered index=16384*70%/300=35Average row per page for clustered index=16384*70%/(4+6)=1000,Clustered Index1000*35=35,000(1000)2*35=35,000,000(1000)3*35=35,000,000,000,Secondary Index1000*1000=1,000,000(1000)2*1000=1,000,000,000(1000)3*1000=1,000,000,000,000,B+树索引
5、,辅助索引的优势,树的高度较小=需要的IO次数少,树的大小较小=scan需要扫描的页较少 优化器倾向于使用辅助索引,辅助索引的劣势,查找完整记录还需查询,InnoDB:查询聚集索引,MyISAM:直接查找MYD物理位置,DTCC2012,B+树索引-InnoDB索引CREATE TABLE UserInfo(userid INT NOT NULL AUTO_INCREMENT,username VARCHAR(30),registdate DATETIME,email,VARCHAR(50),PRIMARY KEY(userid),UNIQUE KEY idx_username(usernam
6、e),KEY idx_registdate(registdate)Engine=InnoDB;,CREATE TABLE UserInfo(userid INT NOT NULL AUTO_INCREMENT,username VARCHAR(30),CREATE TABLE idx_username(userid INT NOT NULL,username VARCHAR(30),CREATE TABLE idx_registdate(userid INT NOT NULL,registdate DATETIME),registdate DATETIME,PRIMARY KEY(userna
7、me,userid)PRIMARY KEY(registdate,userid),email,VARCHAR(50),);,);,PRIMARY KEY(userid);,DTCC2012,B+树索引-InnoDB索引,DTCC2012,B+树索引-插入顺序问题,聚集索引插入,主键是自增长的 插入是顺序的,每页中的填充率高(15/16),顺序扫描(Scan)可以达到磁盘顺序读的速率 一般不推荐使用UUID,插入非顺序,DTCC2012,B+树索引-插入顺序问题,辅助索引插入,插入的顺序是乱序的,插入(David,Monty,Jimmy,Amy,Michael),插入的顺序是顺序的,插入时间,需
8、要产生B+树的分裂,需要较大的开销,每页的填充率较低(60%70%)顺序扫描不能达到磁盘顺序读的速率,若插入是乱序的,DTCC2012,B+树索引-插入顺序问题 辅助索引顺序扫描速度 select count(1)from stock KEY fkey_stock_2(s_i_id),INT Avg row length:355辅助索引:6分42秒4M/秒Logical_reads:12700001 Physical_reads:100057强制聚集索引:4分38秒120130M/秒,Logical_reads:14670405 Physical_reads:2170333,DTCC2012,
9、简单查询优化,简单查询,OLTP,简单查询特点 SQL语句较为简单 返回少部分数据 并发量大,优化原则,减少随机读=使用索引,High CardinalityDTCC2012,简单查询优化,SELECT FROM table where primary_key=?SELECT FROM table where key=?,DTCC2012,简单查询优化-复合索引 索引键值为多个列(a,b)a:1,1,2,2,3,3(a,b):(1,1),(1,2),(2,1),(2,4),(3,1),(3,2),b:1,2,1,4,1,2,DTCC2012,简单查询优化-复合索引,复合索引(a,b)可被使用于
10、:,SELECT*FROM t WHERE a=?,SELECT*FROM t WHERE a=?AND b=?,SELECT*FROM t WHERE a=?ORDER BY b索引覆盖,查询b也可以使用该索引 WHERE b=?,DTCC2012,简单查询优化-索引覆盖,从辅助索引直接得到结果,不需要书签查找,(primary key1,primary key2,key1,,key2,),SELECT key2 FROM table WHERE key1=xxx;SELECT primary key2,key2 FROM table WHERE,key1=xxx;,SELECT prima
11、ry key1,key2 FROM table WHERE,key1=xxx;,SELECT primary key1,primary key2,key2 FROM table,WHERE key1=xxx;,DTCC2012,简单查询优化-索引覆盖,CREATE TABLE ItemLog(,logId INT NOT NULL AUTO_INCREMENT,userId VARCHAR(100)NOT NULL,itemId INT NOT NULL,date DATETIME NOT NULL,PRIMARY KEY(logId),KEY idx_userId_date(userId,d
12、ate)ENGINE=INNODB;,SELECT COUNT(1)FROM ItemLog WHERE date=2012-04-01 AND date2012-05-01;,DTCC2012,简单查询优化-书签查找优化CREATE TABLE UserInfo(userid INT NOT NULL AUTO_INCREMENT,username VARCHAR(30),registdate DATETIME,email,VARCHAR(50),PRIMARY KEY(userid),UNIQUE KEY idx_username(username),KEY idx_registdate(
13、registdate)Engine=InnoDB;SELECT email FROM UserInfo WHERE username=DavidIf:聚集索引高度:4&辅助索引高度:3Then:一共需要7次逻辑IODTCC2012,);,简单查询优化-书签查找优化 分表 Like Index Coverage,CREATE TABLE UserInfo(userid INT NOT NULL AUTO_INCREMENT,username VARCHAR(30),CREATE TABLE UserInfoDetail(userid INT NOT NULL AUTO_INCREMENT,use
14、rname VARCHAR(30),registdate DATETIME,email,VARCHAR(50),PRIMARY KEY(userid),UNIQUE KEY idx_username(username),PRIMARY KEY(userid),UNIQUE KEY idx_username(username),KEY idex_registdate(registdate);SELECT email FROM UserInfoDetail WHERE username=DavidIf:辅助索引高度:3Then:逻辑IO减少为3DTCC2012,简单查询优化-总结,每个页填充率高,
15、包含的记录多,减少IO次数,IO=性能 OLTP only,利用索引覆盖技术避免书签查找 利用分表技术避免书签查找,DTCC2012,复杂查询优化 复杂查询 OLAP JOIN Subquery 复杂查询特点,数据量大并发少需访问较多的数据索引并不再是唯一的优化方向调优工作复杂,DTCC2012,复杂查询优化-JOIN,MySQL JOIN 类型,Simple Nested Loops Join Block Nested Loops Join,MySQL 5.5+,Classic Hash Join,MariaDB 5.3+,Block Hash Nested Loops Join,DTCC2
16、012,复杂查询优化-SNLJ,For each tuple r in R doFor each tuple s in S do,If r and s satisfy the join condition,Then output the tuple,For each tuple r in R do,lookup r join condition in S index,if found s=r,Then output the tuple,Scan cost(no index)=Rn+RnSn=O(RnSn)Scan cost(with index)=Rn+Rn SBH=O(Rn),DTCC201
17、2,复杂查询优化-SNLJ INNER JOIN with Index INNER JOIN联接顺序可更改 优化器喜欢选择较小的表作为外部表 Scan cost(with index)=Rn+Rn SBH=O(Rn)SELECT b.emp_no,a.title,a.from_date,a.to_dateFROM titles aINNER JOIN employees bon a.emp_no=b.emp_no;mysql SELECT COUNT(1)FROM employeesG;*1.row*count(1):300024,mysql SELECT COUNT(1)FROM title
18、sG;*1.row*count(1):443308,DTCC2012,复杂查询优化-SNLJ,SELECT b.emp_no,a.title,a.from_date,a.to_dateFROM titles a,STRAIGHT_JOIN employees bon a.emp_no=b.emp_no;,DTCC2012,复杂查询优化-BNLJFor each tuple r in R dostore used columns as p from R in join bufferFor each tuple s in S doIf p and s satisfy the join condit
19、ionThen output the tuple The join_buffer_size system variable determines the size of each join bufferJoin buffering can be used when the join is of type ALL or index or range.One buffer is allocated for each join that can be buffered,so a given query might beprocessed using multiple join buffers.Onl
20、y columns of interest to the join are stored in the join buffer,not whole rows.For join with no index,Reduce inner table scan times,DTCC2012,复杂查询优化-Example Outer table R(1,a),(2,b),(3,c),(4,d)Inner table S:(1,2010-01-01),(2,2010-01-01),(3,2010-01-01),NLJ,BNLJ,Outer table scanInner table scanCompare
21、times,1412,1112,DTCC2012,复杂查询优化-BNLJ,SELECT b.emp_no,a.title,a.from_date,a.to_dateFROM employees_noindex b,INNER JOIN titles_noindex a ON a.emp_no=b.emp_noWHERE b.birth_date=1965-01-01;,DTCC2012,复杂查询优化-BNLJ,SELECT b.emp_no,a.title,a.from_date,a.to_dateFROM titles_noindex a,LEFT OUTER JOIN employees_
22、noindex b ON a.emp_no=b.emp_noWHERE b.birth_date=1965-01-01;,MySQL 5.5 709.645 sec(MySQL 5.5不支持OUTER JOIN),MySQL 5.6 57.483 sec 12x,DTCC2012,复杂查询优化-BHNJ,For each tuple r in R do,store used columns as p from R in join bufferbuild hash table according join buffer,for each tuple s in S do,probe hash ta
23、bleif find,Then output the tuple,通过哈希表减少内部表的比较次数,Join Buffer Size的大小决定了Classic Hash Join的效率 只能用于等值联接,Scan cost=Rn+Sn=O(Rn+Sn),DTCC2012,复杂查询优化-Example Outer table(1,a),(2,b),(3,c),(4,d)Inner table:(1,2010-01-01),(2,2010-01-01),(3,2010-01-01),NLJ,BNLJ,BNLJH,Outer table scanInner table scanCompare time
24、s,1412,1112,113,DTCC2012,复杂查询优化-BHNJ,SELECT MAX(l_extendedprice)FROM orders,lineitem WHERE,o_orderdate BETWEEN 1995-01-01 AND 1995-01-31 ANDl_orderkey=o_orderkey;,MySQL 5.5 125.3sec,MariaDB 5.3 23.104sec 5x,DTCC2012,复杂查询优化-BHNJ,执行速度(秒),逻辑IO,MySQL 5.5MySQL 5.5(强制使用i_l_orderkey索引)MariaDB 5.3(BNLH),125
25、.3153.02123.104,1570614068796077083DTCC2012,复杂查询优化-JOIN总结,SNLJ,适合较小表之间的联接 联接的列需包含有索引,BHNLJ,大表之间的等值联接操作,大表与小表之间的等值联接操作,Join Buffer Size的大小决定了内部表的扫描次数 推荐MariaDB作为数据集市或者数据仓库,DTCC2012,查询优化-子查询,子查询,独立子查询 相关子查询,MySQL子查询,独立子查询转换为相关子查询(SQL重写 LAZY),SELECT.FROM t1 WHERE t1.a IN(SELECT b FROM t2);SELECT.FROM t
26、1 WHERE EXISTS(SELECT 1 FROM t2,WHERE t2.b=t1.a);,Scan cost:O(A+A*B),优化视情况而定,DTCC2012,);,查询优化-子查询SELECT orderid,customerid,employeeid,orderdateFROM ordersWHERE orderdate IN(SELECT MAX(orderdate)FROM ordersGROUP BY(DATE_FORMAT(orderdate,%Y%m)#Time:111227 23:49:16#UserHost:rootroot localhost 127.0.0.1
27、#Query_time:6.081214 Lock_time:0.046800 Rows_sent:42 Rows_examined:727558 Logical_reads:91584 Physical_reads:19use tpcc;SET timestamp=1325000956;SELECT orderid,customerid,employeeid,orderdateFROM ordersWHERE orderdate IN(SELECT MAX(orderdate)FROM orders,GROUP BY(DATE_FORMAT(orderdate,%Y%M),DTCC2012,
28、查询优化-子查询,SELECT orderid,customerid,employeeid,orderdateFROM orders AS AWHERE EXISTS,(SELECT*,FROM orders,GROUP BY(DATE_FORMAT(orderdate,%Y%M)HAVING MAX(orderdate)=A.OrderDate,);,DTCC2012,查询优化-子查询,SELECT orderid,customerid,employeeid,orderdateFROM orders AWHERE EXISTS,(SELECT*FROM(SELECT MAX(orderdat
29、e)AS orderdate,FROM orders,GROUP BY(DATE_FORMAT(orderdate,%Y%M)BWHERE A.orderdate=B.orderdate,);,#Time:111227 23:45:49,#UserHost:rootroot localhost 127.0.0.1,#Query_time:0.251133 Lock_time:0.052001 Rows_sent:42 Rows_examined:1729 Logical_reads:1923,Physical_reads:25 DTCC2012,查询优化-子查询SELECT orderid,c
30、ustomerid,employeeid,A.orderdateFROM orders AS A,(SELECT MAX(orderdate)AS orderdateFROM ordersGROUP BY(DATE_FORMAT(orderdate,%Y%m)AS BWHERE A.orderdate=B.orderdate;#UserHost:rootroot localhost 127.0.0.1#Thread_id:1 Schema:tpcc QC_hit:No#Query_time:0.296897 Lock_time:0.212167 Rows_sent:42 Rows_examin
31、ed:941 Logical_reads:1258,Physical_reads:28.,DTCC2012,查询优化-子查询,MySQL 5.6/MariaDB对子查询的优化,支持对于独立子查询的优化,SEMI JOIN,支持各种类型的子查询优化,SET optimizer_switch=semijoin=on,materialization=on;,#UserHost:rootroot localhost 127.0.0.1#Thread_id:1 Schema:tpcc QC_hit:No,#Query_time:0.176819 Lock_time:0.147888 Rows_sent:42 Rows_examined:1729 Logical_IO:1927,Physical_IO:29.,DTCC2012,查询优化-子查询总结,通过EXPLAIN分析执行计划,避免IN=EXISTS转换带来的高额开销,避免多次的关联查询操作,使用MySQL 5.6/MariaDB 5.3处理复杂子查询,操作,SQL优化器自动优化,DTCC2012,参考资料,http:/http:/kb.askmonty.org/en/MySQL技术内幕,MySQL技术内幕:InnoDB存储引擎 MySQL技术内幕:SQL编程,DTCC2012,Q&A,DTCC2012,
链接地址:https://www.31ppt.com/p-2277704.html