论坛徽章:: 0

电梯直达

1楼 [收藏(0)] [报告]

发表于 2007-11-01 10:11 |只看该作者 |倒序浏览

本文引用到的metalink-note，请下载：
http://blogimg.chinaunix.net/blog/upfile2/071101103553.rar

目的：应用DBMS_STATS包，收集schema的statistics
结果： dbms_stats.gather_schema_stats(ownname       => 'OWNERNAME',
                              estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
                              method_opt => 'FOR ALL COLUMNS SIZE REPEAT',
                              degree          => 6,
                              cascade       => TRUE);
此时会报错哦
SQL>EXEC DBMS_STATS.gather_schema_stats(ownname => 'TEST',estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,method_opt => 'for all columns size repeat',degree => 4,cascade => true);
报错：
ORA-00909: 参数个数无效
ORA-06512: 在"SYS.DBMS_STATS", line 10070
ORA-06512: 在"SYS.DBMS_STATS", line 10564
ORA-06512: 在"SYS.DBMS_STATS", line 10751
ORA-06512: 在"SYS.DBMS_STATS", line 10805
ORA-06512: 在"SYS.DBMS_STATS", line 10782
ORA-06512: 在line 2
原因是Oracle的一个BUG Note:311360.1

The problem is a side affect of  the fix to Bug 3150750,
which triggers histogram collection.

As we have SIZE repeat  and no existing stats , we fail during construction
Sql for height histograms , as number of buckets is NULL

Solution
Run method_opt=>'for all columns size 200' first, and the run
for all columns repeat. These will ensure that number of buckets
is not null.
得先
Run exec dbms_stats.gather_schema_stats with method_opt => 'for all columns size 200'
然后再
Run exec dbms_stats.gather_schema_stats with method_opt => 'for all columns size repeat'

问题：
1.       刚刚IMP过后，schema还需要再分析收集吗？
2.       如何得知当前schema的statitics信息？
3.       analyze 命令、dbms_utility. analyze_schema和dbms_stats包之间的区别
4.       dbms_stats.gather_table_stats()方法中的参数解释
5.       Histogram是什么？

回答：
Q：刚刚IMP过后，schema还需要再分析收集吗？
A：要
Q：如何得知当前schema的statitics信息？
A：Metalink-Note：130768.1
Q：analyze 命令、dbms_utility. analyze_schema和dbms_stats包之间的区别
A：Metalink-Note：237397.1
Q：dbms_stats.gather_table_stats()方法中的参数解释
A：Metalink-Note：406475.1
请将1和2结合起来看
1
DBMS_STATS.GATHER_TABLE_STATS(ownname VARCHAR2,
                              tabname VARCHAR2,
                              partname       VARCHAR2 DEFAULT NULL, --> means ALL partitions
                              estimate_percent NUMBER DEFAULT NULL, --> means compute
                              block_sample    BOOLEAN DEFAULT FALSE,
                              method_opt    VARCHAR2 DEFAULT 'FOR ALL COLUMNS SIZE 1',--> 1 bucket
/*where size_clause is defined as size_clause := SIZE {integer | REPEAT | AUTO | SKEWONLY}
integer--Number of histogram buckets. Must be in the range [1,254].
REPEAT--Collects histograms only on the columns that already have histograms.
AUTO--Oracle determines the columns to collect histograms based on data distribution and the workload of the columns.
SKEWONLY--Oracle determines the columns to collect histograms based on the data distribution of the columns*/
                              degree          NUMBER DEFAULT NULL, --> 1
                              granularity    VARCHAR2 DEFAULT 'DEFAULT', --> means level (PARTITION + GLOBAL)
                              cascade       BOOLEAN DEFAULT FALSE, --> no cascading index by default
                              no_invalidate BOOLEAN DEFAULT FALSE);
2
dbms_stats.gather_table_stats(ownname       =>,
                              tabname       =>,
                              partname       =>,
                              estimate_percent =>, http://www.dba-oracle.com/art_otn_cbo_p2.htm
                              block_sample    =>,
                              method_opt    =>,
                              degree          =>,parallel degree(并行收集维度)　看CPU个数
                              granularity    =>,
                              cascade       =>,true is also gather columns and index’s statistics;
                              no_invalidate =>);

Q：Histogram是什么
A：
http://www.dbasupport.com/oracle/ora9i/CBO5_1.shtml
Histograms
Histogram tells the optimizer how the data is distributed for a column. This information is used in determining the selectivity of the column for a given query and arriving at an optimal execution plan.
Column statistics in the form of histograms are appropriate for columns whose data distribution deviates from the expected uniform distribution. For uniformly distributed data, the optimizer can do proper costing for executing a particular statement. When data is not uniformly distributed, also know as highly skewed data distribution, the optimizer may not be in a position to accurately estimate the selectivity of a query. This option provides statistics at a very low level and its use would be rare, though it could prove very beneficial in certain scenarios.
Columns not eligible for histogram
Please note that histograms should not be used when any of the following is true.
1.    The column data is uniformly distributed.
For example, if we have a column in a table that holds around 100 distinct values. If the number of records that each value holds (or a range of values) is somewhat similar, the data is more or less uniformly distributed.
For instance, the table may have 100000 records, 20% of these are between values 1-15, 15% are between 16-30, 25% are between 31-50 and so on. If we design a data distribution graph (histogram chart) on these figures, the height of each value or range will be more or less balanced.
2.    The column is not at all used in query predicates.
There is no need for providing histogram statistics on columns that are not being used in query conditions. Histogram statistics are stored in the dictionary and do take up space and analysis time of the optimizer.
3.    All query predicates or criteria for the column use bind variables!
Yes that's right, here Oracle requires that hard coded values be provided for use of the histogram statistics. Bind variables will not make use of the same.
4.    The column is unique and used only with equality predicates.
Columns eligible for histogram
If the above rules are not satisfying for a column, it could be considered for distribution statistics. For example, if we again consider a column that holds around 100 distinct statuses in a 100000 records table, 80% of the values lie between 15-30 and the remaining 20% is distributed in the other ranges. If we design a data distribution graph (histogram chart) on these figures, the height of few values or ranges will be very high where as others will be very low. This shows an oblique line for the data distribution.
A histogram is good for number columns. For character columns only the first 32 bytes (as of 8.1.7.4) of the string are used for building the histograms, this may sometimes result in incorrect information being created if the size of the column data exceeds this limitation.
User-specific histogram values can also be stored in the dictionary using the DBMS_STATS.PREPARE_COLUMN_VALUES and DBMS_STATS.SET_COLUMN_VALUES routines.
Dictionary tables
Histogram information is stored in the following dictionary tables.
Histogram values for columns in tables:
  DBA_TAB_HISTOGRAMS
endpoint_number - End point number
endpoint_value - Normalized end point value for the buckets.
endpoint_actual_value - Actual data value, only shows non-numeric value for the column.
For partition table histograms values:
  DBA_PART_HISTOGRAMS
  DBA_SUBPART_HISTOGRAMS
For evaluating histograms on indexed columns:
  INDEX_HISTOGRAM
repeat_count - Number of times one or more index key is repeated in the table.
keys_with_repeat_count - Number of index keys that are repeated.
Other Views that give similar data:
  DBA_TAB_COL_STATISTICS
  DBA_PART_COL_STATISTICS DBA_SUPPART_COL_STATISTICS
Columns in the above tables are self-explanatory.
Buckets in Histograms
Histogram statistics are stored in the form of buckets. Buckets represent the partitioning of data values, depending on the range. By default, 75 buckets are created. A maximum of 254 buckets can be specified for a column. How many buckets are required for a column will depend on the occurrences of distinct values. The default number of buckets is appropriate, but you will have to experiment with various bucket sizes to find the most suitable size.
If the number of distinct column values is less than the number of buckets specified, the individual column values and the count of these values is directly stored as histogram statistics. If the number of distinct column values is more than the buckets specified, Oracle uses an algorithm to store these values in ranges. If a series of continuous ranges have the same value, they may not be shown in the histogram table to save on space.
You may find columns with one-bucket histograms, these are as good as no histogram statistics and the optimizer ignores them.

本文来自ChinaUnix博客，如果查看原文请点：http://blog.chinaunix.net/u1/46451/showart_412028.html

文库|博客

返回列表

Chinaunix › 论坛 › 操作系统 › Solaris › Solaris文档中心 › DBMS_STATS包应用案例

DBMS_STATS包应用案例 [复制链接]

浏览过的版块