TOMSYAN 发表于 2011-12-23 03:09

Tracing Parallel Execution with _px_trace. Part I [ID 444164.1]

<DIV>
<TABLE cellSpacing=6 cellPadding=0 width="100%" border=0>
<TBODY>
<TR>
<TD vAlign=top noWrap width="50%" colSpan=3><EM>改时间</EM> 10-OCT-2011&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<EM>类型</EM> BULLETIN&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<EM>状态</EM> PUBLISHED</TD>
<TD vAlign=top>&nbsp;</TD></TR></TBODY></TABLE>

<FONT face="宋体, MS Song">***Checked for relevance on 29-April-2010*** <BR>***Checked for relevance on 10-Oct-2011*** <BR><B><U><FONT size=+2>Tracing Parallel Execution with _px_trace. Part I</FONT></U></B> <BR></FONT>
<P><B><FONT size=+1>Introduction</FONT></B>
<P>Since 9.0.1 a new mechanism has been introduced to obtain Parallel Execution (PX) trace information. This trace mechanism is keyword and parameter based. The existing event-based tracing is still available, but for some areas in _px_trace like compilation there is no event. Because each component of PX controls its own tracing, getting an event-based trace usually means setting one or more events and then deciding at what level they should be set; this can be cumbersome and there may not be any consistency of trace format across the different events. Keyword-based tracing is simpler to set than event-based tracing and is consistent across PX components.
<P>This note gives an overview of some parts of the new trace facility and can also be used to help diagnose PX problems. This note is written for people with some background and experience in PX.
<P>This first part gives some examples of how to trace the compilation of a PX query. This helps for question "Why does my query (not) run in parallel?". It also gives some examples of how to trace the allocation of PX server ( slaves ) and some idea why they are maybe not can be allocated. <BR>
<P>There are three sections to this note : <BR>
<OL>
<LI><a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=444164.1#Important PX phrases" target="_blank">Important PX phrases </A>Explain the main phrases/terms that are used in document.
<LI><a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=444164.1#Usage" target="_blank">Usage </A>- gives some examples with annotated trace illustrating various scenarios :
<UL>
<LI><a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=444164.1#Example1" target="_blank">Example#1 </A>select statement with one slave set, degree of parallelism from the dictionary.
<LI><a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=444164.1#Example2" target="_blank">Example#2 </A>select statement with two slave sets, degree of parallelism from the dictionary.
<LI><a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=444164.1#Example#3" target="_blank">Example#3 </A>select statement, too few slaves available to run query in parallel.
<LI><a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=444164.1#Example #4" target="_blank">Example#4 </A>requested Degree Of Parallelism (DOP) rounded down dynamically due to CPU load
<LI><a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=444164.1#Example#5" target="_blank">Example#5 </A>Degree Of Parallelism taken from a hint.
<LI><a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=444164.1#Example#6" target="_blank">Example#6 </A>Join of two tables each with different DOP setting, which DOP do we choose ?
<LI><a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=444164.1#Example#7" target="_blank">Example#7 </A>Parallel execution in RAC. </LI></UL>
<LI><a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=444164.1#Current" target="_blank">Current SQL cursor</A> How to find the current SQL command for what the trace information is written.
<LI><a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=444164.1#Reference" target="_blank">Reference</A> - gives a full list of the options. </LI></OL>
<P><BR><A name="Important PX phrases"></A><B><U><FONT size=+1>Important PX phrases</FONT></U></B>
<OL>
<LI><B>QC</B> The user background process that coordinates the slaves and that does all the work that runs in serial.
<LI><B>Instance</B> All the active processes and the memory of the database belong to one Oracle SID. A RAC system is made of more than one instance.
<LI><B>set/threads</B>Parallel query use a divide and conquer strategy. This means if for example a large table needs to be scan, multiple processes are used to scan parts of the table. This processes is a single step are called thread and the number is the DOP. Oracle use for complex query not only 1 set of process it use 2 or more. This is necessary to avoid skewed data.
<LI><B>DOP</B> This is the number of threads in 1 set.
<LI><B>slave</B> This is a synonym for a thread ( single process in a set).
<LI><B>height</B> This is number of thread in a set ( synonym for DOP).
<LI><B>server/Svrs</B> This is a synonym for thread or slave. </LI></OL>
<P><BR><A name=Usage></A><B><U><FONT size=+1>Usage</FONT></U></B>
<P>All of these examples were generated with the following trace setting:
<P>SQL&gt; alter session set "_px_trace"="compilation","execution","messaging";
<P><BR>Trace files are created in the user_dump_dest and for the slaves in the background_dump_dest. On some platforms they are written in the user_dump_dest. On RAC systemes trace files are written on every node, where a slave or the qc is started. Interesting is here the QC trace files.
<P><A name=Example1></A><B><U>Example #1</U></B>
<P>This example should give an overview over the most important functions in the area of compilation and allocation of parallel execution servers.<BR>Here's a simple query which will use one slave set with two slaves:
<P>SQL&gt; alter session set "_px_trace"="compilation","execution","messaging"; <BR>Create a large table for the next examples with: CREATE TABLE bigemp AS SELECT * FROM all_objects; <BR>SQL&gt; alter table bigemp parallel 2; <BR>SQL&gt; select count(*) from bigemp;
<P>------------ start of trace -------------- <BR>--- cut some lines--- kkfdapdml <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; pgadep:2 pdml mode:0 PQ allowed DML allowed not autonomous =&gt; not allowed <BR><B>--&gt;This are checks if PDML is enabled. This lines can be ignored. </B><BR>--- cut some lines--- <BR>The table : 1166 ---&gt; <B>This is Table object id. from all_objects.</B><BR>kkfdPaPrm ---&gt; <B>This shows the functions from the source code.</B> <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP = 2 (computed from hint/dictionary) <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; use dictionary DOP on table ---&gt; <B>Use <U>D</U>egree <U>O</U>f <U>P</U>arallelism from the dictionary</B> <BR>kkfdtsc <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP = 2 (from kkfdPaPrm()) <BR>kkfdPiAlo <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP = 2 (kkfdpideg) <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Invalid Part. No. ---&gt; <B>Not a partitioned table</B> <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; No Pruning Info ---&gt; <B>Ditto</B> <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Pararllel Operation Type = 2 (kkfdpiOpType) <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Flags = 41 (kkfdpiflg) <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; min. tranx free list = 0 (kkfdpimgi) <BR>kkfdPaPrm <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP = 1 (computed from hint/dictionary) <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; use dictionary DOP on table <BR>kkfdtsc <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; serial - DOP = 1 (from kkfdPaPrm()) or 10383 level 1 not set <BR>kxfrSysInfo <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP trace -- compute default DOP from system info <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # instance alive = 1 (kxfrsnins) <BR>kxfrDefaultDOP <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP Trace -- compute default DOP <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # CPU = 1 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Threads/CPU = 2 ("parallel_threads_per_cpu") <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; default DOP = 2 (# CPU * Threads/CPU) <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; default DOP = 2 (DOP * # instance) <BR>kxfrSysInfo <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; system default DOP = 2 (from kxfrDefaultDOP()) ---&gt; <B>This is the computed System default DOP.</B> <BR>kxfralo <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP trace -- requested thread from best ref obj = 2 (from kxfrIsBestRef()) ---&gt; <B>This is the DOP after all objects are checked. The DOP can be adjusted in the next functions, because there are restrictions like adaptive load algorithm.</B> <BR>kxfralo <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; threads requested = 2 (from kxfrComputeThread()) <BR>kxfralo <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; adjusted no. threads = 2 (from kxfrAdjustDOP()) <BR>kxfralo <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; about to allocate 2 slaves <BR>kxfrAllocSlaves <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP trace -- call kxfpgsg to get 2 slaves <BR>kxfpgsg <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; num server requested = 2 <BR>kxfpiinfo <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; inst <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1m ---&gt; <B>On this system we have: 1 instance</B> <BR>kxfpgsg <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; getting 1 sets of 2 threads, client parallel query execution flg=0x30 ---&gt; <B>Get 1 slave set of 4 slaves</B> <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Height=2, Affinity List Size=0, inst_total=1, coord=1 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Insts 1 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Threads 2 <BR>kxfpg1srv <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; trying to get slave P000 on instance 1 ---&gt; <B>Let's go and grab the slaves ( repeated for each slave). We see also the name of the slave process.</B> <BR>kxfpg1sg <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Got It. 1 so far. <BR>kxfpg1srv<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; trying to get slave P001 on instance 1<BR>kxfpg1sg<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Got It. 2 so far. <BR>kxfrAllocSlaves <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; actual num slaves alloc'd = 2 (kxfpqcthr) ---&gt; <B>Then display the number of slaves allocated</B> <BR>kxfxcPutSession ---&gt; <B>This function is only visible in version 10g or higher. This shows the CBO parameter that are used to compile the PX query.</B> <BR>Send Compilation environment:<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; optimizer_mode_hinted = false<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; optimizer_features_hinted = 0.0.0<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; parallel_execution_enabled = true<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; parallel_query_forced_dop = 0<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; parallel_dml_forced_dop = 0<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; parallel_ddl_forced_degree = 0<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; parallel_ddl_forced_instances = 0<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; _query_rewrite_fudge = 90<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; optimizer_features_enable = 10.2.0.3<BR>...... <BR>------------end of trace -------------- <BR>For questions like "Why does my query run in parallel?", the DOP setting in function kkfdPaPrm should be checked. Is there one with a DOP &gt; 1? If so, this part of the trace will also show the object id from which we got the DOP setting and this id can be used to find the object name. Then you can check if there is are parallel setting in the data dictonary or if there is a hint. <BR>
<P><A name=Example2></A><B><U>Example #2</U></B>
<P>A slightly more complex query which will use 2 slaves sets each of 2 slaves and therefore require 4 slaves in total
<P>SQL&gt; alter table bigemp parallel 2; <BR>SQL&gt; select ename, count(*) from bigemp group by ename;
<P>To make it easier to read, only extracts from the trace file will be shown:
<P>------------ start trace --------------
<P>kkfdPaPrm <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP = 2 (computed from hint/dictionary) <BR>use dictionary DOP on table ---&gt; <B>From ALTER TABLE BIGEMP PARALLEL 2</B>
<P>--- cut some trace ---
<P>kxfpiinfo <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; inst <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1 ---&gt; <B>Instance 1 </B>
<P>--- cut some trace ---
<P>kxfpgsg <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; getting 2 sets of 2 threads, client parallel query execution flg=0x30 <BR>---&gt; <B>Try to get 2</B> <B>slave sets, each of 2 slaves, i.e. 4 PQ slaves.</B> <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Height=2, Affinity List Size=0, inst_total=1, coord=1 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Insts 1 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Threads 2
<P>--- cut some trace ---
<P>kxfpgsg <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; got 2 threads on 1 instance(s), avg height=2 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Insts 1 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Svrs 4 ---&gt; <B>We got the 4 slaves.</B>
<P>----- end trace --------- <BR>
<P><A name="Example #3"></A><B><U>Example #3</U></B>
<P>SQL&gt; alter table bigemp parallel 8; <BR>SQL&gt; select count(*) from bigemp; <BR>
<P>In this example the &lt;&gt; was set to TRUE, which in turns forces &lt;&gt; to be TRUE. So the requested degree of parallelism can be reduced dynamically based on the current machine load.
<P>---- start ----
<P>kxfrAllocSlaves <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP trace -- call kxfpgsg to get 8 slaves ---&gt; <B>Try for 8 slaves</B>
<P>kxfpAdaptDOP <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Requested=8 Granted=4 Target=4 Load=1 Default=2 users=0 sets=1 ---&gt; <B>Adaptive load algorithm rounds down to 4</B> <BR>kxfpgsg <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; load adapt num servers requested to = 4 (from kxfpAdaptDOP()) <BR>
<P>---- End ---- <BR>
<P><A name=Example#4></A><B><U>Example #4</U></B>
<P>For this example:
<P>- parallel_max_servers = 10 <BR>- parallel_min_percent = 0 ( we can still parallelize even if we can't grab 100% of requested slaves) <BR>- Query requires 12 slaves
<P>SQL&gt; alter table bigemp parallel 12; <BR>SQL&gt; select count(*) from bigemp;
<P>------ start --------
<P>kxfpgsg <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; getting 1 sets of 12 threads, client parallel query execution flg=0x30 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Height=12, Affinity List Size=0, inst_total=1, coord=1 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Insts 1 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Threads 12 ---&gt; <B>We try to get 12 slaves.</B>
<P>--- cut some lines---
<P>kxfpgsg <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; got 10 threads on 1 instance(s), avg height=10 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Insts 1 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Svrs 10 ---&gt; <B>We actually allocate 10 slaves.</B> <BR>
<P>----- end -------
<P>If parallel_min_percent had been = 100, this would indicate we must grab all requested slaves for the query to be able to run, otherwise the user sees:
<P>- ORA-12827: insufficient parallel query slaves available
<P>and in the trace file:
<P>kxfrAllocSlaves <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; actual num slaves alloc'd = 10 (kxfpqcthr) <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP trace -- num slaves alloc'd &lt; min required(min=%d) ---&gt; <B>We didn't find 100% of requested slaves, return error.</B> <BR>
<P><A name=Example#5></A><B><U>Example#5</U></B>
<P>In this example the trace illustrates how the degree of parallelism was derived from a query hint:
<P>SQL&gt; select /*+PARALLEL(BIGEMP,3) */ count(*) from bigemp;
<P>------ start ------ <BR>kkfdPaPrm <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP = 3 (computed from hint/dictionary) <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; use hint DOP on table ---&gt; <B>We used a hint to derive DOP</B> <BR>---- end -----
<P><A name=Example#6></A><B><U>Example#6</U></B>
<P>In this example, two tables are joined, each has a different DOP, which DOP do we use :
<P>SQL&gt;alter table bigemp parallel 4; <BR>SQL&gt;alter table bigdept parallel 2; <BR>SQL&gt;select dname, avg(sal) from bigemp e, bigdept d where e.deptno=d.deptno group by dname;
<P>---- start ----
<P>kkfdPaPrm <BR><B>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP = 4</B> (computed from hint/dictionary) ---&gt; <B>DOP for BIGEMP</B> <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; use dictionary DOP on table <BR>--- cut some lines--- <BR>kkfdPaPrm <BR><B>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP = 2</B> (computed from hint/dictionary) ---&gt; <B>DOP for BIGDEPT</B> <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; use dictionary DOP on table <BR>--- cut some lines--- <BR>kxfralo <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DOP trace -- requested thread from <B>best ref obj = 4</B> (from kxfrIsBestRef ---&gt; <B>Use BIGEMP's DOP</B>
<P>---- end ----
<P><A name=Example#7></A><B><U>Example#7</U></B>
<P>The following query was run on a 2 instance RAC database.
<P>(note Parallel_automatic_tuning = TRUE ).
<P>SQL&gt;alter table bigemp parallel 4; <BR>SQL&gt;select count(*) from bigemp; <BR><BR>
<P>---- start ----
<P>kxfpiinfo <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; inst <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1 2 ---&gt; <B>Instance# [ CPU_COUNT : PARALLEL_MAX_SERVERS ]</B>
<P>kxfpclinfo <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; inst(load:user:pct:fact)aff <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1(4 :0 :100:50 ) 2(0:0:100:200) ---&gt; <B>Load information for each instance, so instance #1 :</B> <BR><B>---&gt; load=4</B> <BR><B>---&gt; 0 PX users</B> <BR><B>---&gt; User has no resource restriction, can use 100% of system</B> <BR><B>---&gt; Load factor represents how much more load the machine can accept</B> <BR><B>the bigger the number, the better.</B> <BR>kxfpAdaptDOP <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Requested=4 Granted=4 <B>Target=16</B> Load=4 Default=8 users=0 sets=1 <BR>---&gt; <B>Target of 16 is estimate of best load for the cluster.</B> <BR>---&gt; <B>DOP of queries may be reduced to maintain this target as best we can.</B> <BR>kxfpsori <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Sorted: 2(0:200) 1(4:50) ---&gt; <B>Sort instances by load and factor.</B> <BR><B>---&gt; Put "unit" more slaves on instance that can best afford it.</B> <BR><B></B>---&gt; <B>More slaves will go to Instance 2 as it has lower load.</B>
<P>kxfpunit <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; var=4 limit=8 use_aff=0 aff_num=0 <B>unit=2</B> ---&gt; <B>Unit is a function of CPU_COUNT on the QC node.</B>
<P>kxfpgsg <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; getting 1 sets of 4 threads, client parallel query execution flg=0x30 ---&gt; <B>try for 1 slave set of 4 slaves</B> <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Height=0, Affinity List Size=0, <B>inst_total=2</B>, coord=1 ---&gt; <B>We have two instances, QC is on instance #1.</B> <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Insts 2 1 ---&gt; <B>Instances are 2 and 1</B> <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Threads 4 0 ---&gt; <B>We will try to get 4 slaves on instance 2. The numbers are # of slaves corresponding to the Instances in previous line in the order they are printed.</B> <BR>kxfpg1srv <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; trying to get slave P000 on instance 2 ---&gt; <B>get slaves from instance #2</B> <BR>kxfpg1sg <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Got It. 1 so far. <BR>kxfpg1srv <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; trying to get slave P001 on instance 2 <BR>kxfpg1sg <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Got It. 2 so far.
<P>--- cut ----
<P>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; got 4 threads on 1 instance(s), avg height=4 ---&gt; <B>We managed to get 4 slaves on instance #2. Thread means here slaves in 1 slaves set</B> <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Insts 2 1 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Svrs 4 0
<P>---- end ---- <BR>
<P><A name=Current></A><B><U><FONT size=+1>Current SQL cursor</FONT></U></B> <BR>How to find the SQL sent to Slaves: In 9i this is not so easy, because every slave set get a new SQL command that is part of the original query. AS example:<BR>
<P>kxfxcp1 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; parsing at slave set 1:<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp; SELECT /*+ CIV_GB */ A1.C0,COUNT(SYS_OP_CSR(A1.C1,0)) FROM :Q2753000 A1 GROUP BY A1.C0<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; TQ: 1 nbgra: 0 flg: 0x1<BR><BR>kxfxcp1<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; parsing at slave set 2:<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; SELECT /*+ PIV_GB */ A1.C0 C0,SYS_OP_MSR(COUNT(*)) C1 FROM (SELECT /*+<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; NO_EXPAND ROWID(A2) */ A2."DEPTNO" C0 FROM "SCOTT"."EMP" PX_GRANULE(0<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; , BLOCK_RANGE, DYNAMIC) A2) A1 GROUP BY A1.C0<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; TQ: 0 nbgra: 1 flg: 0x2002<BR><BR>In 10g you have to search for the function kxfxcp1.<BR>
<P>kxfxcp1 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Sending parse to slave set 1: <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; User sqllen sent from QC = 74 <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;; select /*+ parallel(t,5) */ owner# , count(*) from test t group by owner# <BR>In 10g all slaves in all sets work on the same SQL command, so in every function kxfxcp1 you wil the same SQL command. It is not longer possible to see o what part of the SQL teh current set work. <BR>
<P><A name=Reference></A><B><U><FONT size=+1>Reference</FONT></U></B>
<P>This section documents describes teh syntax of _px_trace. In practice only a subset as documented above will be required.
<P>There are up to three components that can be specified when starting the trace
<P>- Area in which tracing is required. <BR>- Verbosity. <BR>- Do we want timing information.
<P>alter session set "_px_trace"=[area],[area],..,; <BR>as example<BR>alter session set "_px_trace"=high,execution,medium,execution,time;
<P>The possible values for each of these is
<P><B><U>Verbosity:</U></B>
<UL>
<LI>high
<LI>medium
<LI>low </LI></UL><B><U>Area:</U></B>
<UL>
<LI>scheduling - ( equivalent to some of event 10384 and some of 10390)
<LI>execution - (equivalent to some of event 10390)
<LI>granule - (equivalent to some of event 10390 and some of 10391)
<LI>messaging - (equivalent to event 10392 and event 10393)
<LI>buffer - (equivalent to event 10399)
<LI>compilation - ( no equivalent event)
<LI>all - all of the above
<LI>none - none of the above. </LI></UL><B><U>Timing</U></B>
<UL>
<LI>time </LI></UL>Here are some examples :
<P>In the init.ora file:
<BLOCKQUOTE>_px_trace="high","execution","medium","buffer","time"</BLOCKQUOTE>Or, more usually, at the session level:
<BLOCKQUOTE>SQL&gt; alter session set "_px_trace"="high","execution","medium","buffer","time";</BLOCKQUOTE>To get the full trace:
<BLOCKQUOTE>SQL&gt;alter session set "_px_trace"="all";</BLOCKQUOTE>Note : Put the verbosity level followed by the areas to be traced at that level of verbosity.
<P>To switch off the trace :
<BLOCKQUOTE>SQL&gt; alter session set "_px_trace"="none"; <BR><BR></BLOCKQUOTE></DIV>
页: [1]
查看完整版本: Tracing Parallel Execution with _px_trace. Part I [ID 444164.1]