论坛徽章:: 0

电梯直达

1楼 [收藏(0)] [报告]

发表于 2005-03-11 10:32 |只看该作者 |倒序浏览

有知道EXPLAIN PLAN列出的NESTED LOOP,HASH JOIN,SORT MERGE,普通的连接等连接方式是如何实现以及什么时候会用什么样的连接方法的,或者举个例子来说明,或者告知小弟哪儿有这方面的资料不管英文中文的都好,期待。。。

文库|博客

sshd

家境小康

论坛徽章:: 0

2楼 [报告]

发表于 2005-03-11 14:37 |只看该作者

oracle优化器的连接方式如何实现的？

normally you do not need to worry about when these methods will be used. you only need to gather statistics of the database objects. oracle optimizer will choose proper methods for table joins.

you could tell oracle which method to use if you really know what you are doing, using HINT for each SQL

example:

select /*+ XXX */ YYY
from ZZZ
where ...

where XXX can be nested loop, hash join etc
check oracle manual for exact syntax.

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

欧阳不修

白手起家

论坛徽章:: 0

3楼 [报告]

发表于 2005-03-11 18:05 |只看该作者

oracle优化器的连接方式如何实现的？

多谢斑竹回复，我明白斑竹的意思，但我认为只有明确理解oracle优化器是如何进行优化的，也就是说提到的那几种连接是如何进行的，我们才能清楚自己写的sql是高效的还是低效的；因为我们总不能每次执行一个语句都要找执行计划看一下吧或者执行计划显示的hash join，我想大部分人都不明白这到底意味着什么。
因为我最近在研究sql优化的东西，对oracle的优化器和各种sql写法参考了历史上各位大侠留下的劳动成果自己也加入了自己的理解和试验，发现这方面的资料中文实在是少之又少，小弟也是入门oracle不久，metalink和asktom也刚加入，对里面内容还是不是很熟悉，所以萌发到此寻求各位有经验之士能否提供一些心得体会或者参考信息，非常感谢。
另外，我现在已经整理出二十多条sql优化的规则(类似高效C++书的风格)，只是有些规则知其然不知其所以然阿，请各位老大伸出援助之手，谢谢啦

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

sshd

家境小康

论坛徽章:: 0

4楼 [报告]

发表于 2005-03-12 02:42 |只看该作者

oracle优化器的连接方式如何实现的？

take
select ENAME, DNAME from emp, dept where emp.DEPTNO=dept.DEPTNO;
for example.

assume emp has E rows, and dept has D rows

0) worst case

for each deptno in emp table, find the corresponding row in dept table through full table scan

this will do full table scan of table dept E times.

1) nested loop
for each deptno in emp table, find the corresponding row in dept table through index dept(deptno)

this will probe index dept(deptno) E times

2) sort merge

sort table emp by deptno and sort table dept by deptno

them joint the corresponding rows using two pointers. you just advance the two pinters a bit each time, and basically you will only need to scan the sorted tables once for each table

3) hash join

you build a hash table using one table and probe the hash table using the other table. for example

build a hash table using the dept table

assume the hash function is mod(DEPTNO, 7)

SQL>; select rowid, DEPTNO, DNAME, LOC from dept;

ROWID                DEPTNO DNAME       LOC
------------------ ---------- -------------- -------------
AAAD0AAAEAAAAAVAAA       10 ACCOUNTING    NEW YORK
AAAD0AAAEAAAAAVAAB       20 RESEARCH    DALLAS
AAAD0AAAEAAAAAVAAC       30 SALES       CHICAGO
AAAD0AAAEAAAAAVAAD       40 OPERATIONS    BOSTON

The hash table created looks like

      HV ROWID
---------- ------------------
      2 AAAD0AAAEAAAAAVAAC
      3 AAAD0AAAEAAAAAVAAA
      5 AAAD0AAAEAAAAAVAAD
      6 AAAD0AAAEAAAAAVAAB

How do you get it?
deptno = 10  ->; mod(10, 7) ->; 3.
so hash value 3 corresponds to rowid AAAD0AAAEAAAAAVAAA

Now you can start to do the table join.

SQL>; select * from emp;

   EMPNO ENAME    JOB             MGR HIREDATE       SAL    COMM    DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
   7369 SMITH    CLERK          7902 17-DEC-80       800                   20
...............

SMITH works in deptno 20, and it's hash value is mod(20,7) = 6,
so we use 6 as the index to probe the hash table, and find the the row in dept has physical address AAAD0AAAEAAAAAVAAB, and then we go that physical address to the corresponding row in the dept table.

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

sshd

家境小康

论坛徽章:: 0

5楼 [报告]

发表于 2005-03-12 02:53 |只看该作者

oracle优化器的连接方式如何实现的？

for hash join, after you build a hash table, you will only probe the hash table E times to complete the table join.

note:
1) nested loop and hash join are not symmetric, so you have a choice of using different table as the driving table
2) hash join: normally oracle will choose a proper hash function. It's not as simple as the one in the example
3) different deptno can have the same hash value. in this case you will have a hash collision. (example: deptno = 10 and deptno = 80 in above example)

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

sshd

家境小康

论坛徽章:: 0

6楼 [报告]

发表于 2005-03-12 03:31 |只看该作者

oracle优化器的连接方式如何实现的？

in 9i, oracle's optimizer (optimizer_mode) either is rule based or cost based.

If you have all statistics of the object involved in a SQL, oracle will choose the plan that has the lowest cost (how much disk I/O involved + how much CPU)

If you do not have the statistics or it incomplete, oracle will use some prdefined rules to execute the SQL, and normally the plan is not optimal comparing to cost based execution plans.

It unlikely the understand exactly how the optimizer will work.