(c)2011, OraPub, Inc. Warranty: There is none. Use at your own risk. Use at your own risk. OraPub is not responsible for the research related script. File : SqlInterArrivalCollection1a.txt Author : Craig Shallahamer, craig@orapub.com Original : 12-Jan-2011 Last : 01-Feb-2011 The purpose of this tool kit is to determine the interarrival distribution of SQL statements. It could be modified to support procedures, but I did not do this. This tool kit will help answer the question, "Do SQL statements arrive perfectly spaced, that is with a constant gap between them?" or "Do SQL statements arrive in a normal distribution pattern" or "Do SQL statements arrive randomly, with no apparent pattern?" Most DBAs assume statements arrive in a normal distribution fashion. That is, there is an average gap between arrivals and there are just as many arrivals less than the average gap as there are greater than the average gap. My hunch this is incorrect, but I want to prove it to myself. But to demonstrate this, I need to set up an experiment that not only I can run, but others as well. This is what this little tool kit is all about. Enjoy! --------------------- General Experimental Strategy --------------------- The general experimental strategy is there is a monitoring Oracle user that looks for a predetermined SQL statement to begin running. If it finds it, it logs the time. The results may look like this: Over a 5 second sample period: Arrival Time Stamp (HH24:MI:SS) ------------------------------ 00:10:10 00:10:11 00:10:15 00:10:35 During the 5 second period, 4 arrivals occurred and therefore the average arrival rate is 4/5 arrivals/second, which is 0.80 arrivals/second. While the average arrival rate is nice to know, I'm also interested in the distribution of the arrivals. For this I need the arrival gaps, that is, the inter-arrival times. From this raw data we then determine the time between each arrival which, for our sample will be: Arrival Gap (second/arrival) ---------------------------- - 01 04 20 Once I get a few of these, I can do some pretty cool mathematical analysis, like create a histogram. --------------------- Data Collection Strategy --------------------- If the Oracle system is performing well and transactions are not backed up, then I can adequately determine when a SQL statement arrives by looking at it's execution start time, which will also be close to when v$sqlstats inserts the first row for the execution or updates its existing row. Based on a previous blog entry, I demonstrated that for SQL statements, the execution column in v$sqlstats is incremented when the statement begins. Therefore, when detecting an execution count change, we know the SQL statement has begun and therefore arrived. The arrival time is logged in the op_results_raw table. The monitoring occurs in a very tight pl/sql loop with, by default has no delay. You can set a delay other than zero by changing the define troll_delay_in. I am using v$sqlstats instead of its underlying x$kkssqlstat table because it's more likely YOU will be able to run this in a production environment. All you would need to do to use x$kkssqlstat is simply substitute the object name in the collection script and of course, connect as SYS when you collect the data. One of the keys to make this work is to clearly identify the application SQL. To do this we need its SQL_ID and PLAN_HASH_VALUE columns. If you look closely at the "Identify SQL Statements" section below, you'll see one way to gather this information from your system. --------------------- Identifying SQL Statements --------------------- Here is one way to identify active SQL statements and display their sql_id and plan_hash_value. You must determine the sql_id and plan_hash_value for the SQL you want to monitor. If you do not already know this, use this script to help identify some active SQL candidates for monitoring. Just copy and past the SQL into the monitoring user's sqlplus session. set tab off alter session set commit_write="batch,nowait"; drop table bogus; create table bogus as select sql_id,plan_hash_value,executions,disk_reads,buffer_gets from v$sqlstats where executions > 10; exec dbms_lock.sleep(60); select cur.sql_id, cur.plan_hash_value, cur.executions-prev.executions execs, cur.disk_reads-prev.disk_reads pios, cur.buffer_gets-prev.buffer_gets lios, substr(cur.sql_text,1,50) the_sql from v$sqlstats cur, bogus prev where cur.sql_id = prev.sql_id and cur.plan_hash_value = prev.plan_hash_value and cur.executions-prev.executions > 1 order by 3; Here's an example from my test system. SQL_ID PLAN_HASH_VALUE EXECS PIOS LIOS THE_SQL ------------- --------------- ---------- ---------- ---------- -------------------------------------------------- bfujkg8dw1aax 2890534904 2 0 2 SELECT UPPER(PARAMETER_VALUE) FROM MGMT_PARAMETERS 5n1fs4m2n2y0r 299250003 2 0 12 select pos#,intcol#,col#,spare1,bo#,spare2,spare3 0xqn4sx1ytghr 2797331186 2 2 12 select /*+ first_rows(1) no_expand */ tab.msgid 3ktacv9r56b51 4184428695 2 5 14 select owner#,name,namespace,remoteowner,linkname, 089dbukv1aanh 1388734953 2 0 0 SELECT SYS_EXTRACT_UTC(SYSTIMESTAMP) FROM DUAL 3g7sxtj9d6zd3 742841275 2 1 4 select privilege#,nvl(col#,0),max(mod(nvl(option$, 8swypbbr0m372 893970548 2 3 6 select order#,columns,types from access$ where d_o c8h3jdwaa532q 2890534904 2 0 2 SELECT TO_NUMBER(PARAMETER_VALUE) FROM MGMT_PARAME fsbqktj5vw6n9 1443566277 3 2 36 select next_run_date, obj#, run_job, sch_job from gx4mv66pvj3xz 2570921597 3 0 13 select con#,type#,condlength,intcols,robj#,rcon#,m 3w4qs0tbpmxr6 1224215794 3 0 6 select con#,obj#,rcon#,enabled,nvl(defer,0),spare2 83taa7kaw59c1 3765558045 4 4 13 select name,intcol#,segcol#,type#,length,nvl(preci 08bqjmf8490s2 2890534904 4 1 4 SELECT PARAMETER_VALUE FROM MGMT_PARAMETERS WHERE 60uw2vh6q9vn2 0 5 6 45 insert into col$(obj#,name,intcol#,segcol#,type#,l 3nkd3g3ju5ph1 2853959010 5 4 18 select obj#,type#,ctime,mtime,stime, status, datao 53saa2zkr6wc3 3954488388 6 2 24 select intcol#,nvl(pos#,0),col#,nvl(spare1,0) from grwydz59pu6mc 3684871272 6 3 12 select text from view$ where rowid=:1 96g93hntrzjtr 2239883476 7 2 14 select /*+ rule */ bucket_cnt, row_cnt, cache_cnt, 5ms6rbzdnq16t 1846751226 12 0 36 select job, nvl2(last_date, 1, 0) from sys.job$ wh cvb1dphpubtfw 2786546476 25 36348 3234086 SELECT SUM(OBJECT_ID) FROM ALL_OBJECTS cm5vu20fhtnq1 3332340200 49 1 166 select /*+ connect_by_filtering */ privilege#,leve fm7q0unp5a0by 3703260412 55 2778 383175 UPDATE CUSTOMERS10 SET STATUS = TO_CHAR(SYSDATE,'M 9gm93sw7hq936 2506895180 55 2774 444768 UPDATE CUSTOMERS11 SET STATUS = TO_CHAR(SYSDATE,'M acz1t53gkwa12 4269646525 56 2885 474490 UPDATE CUSTOMERS12 SET STATUS = TO_CHAR(SYSDATE,'M bsa0wjtftg3uw 2020579421 165 0 495 select file# from file$ where ts#=:1 018x8jv0c9900 2600030034 1377 698 34452 select count(*) from user$ u, xdb.xdb$schema s 0tkzr5pu8kv2u 4281053736 1377 42 35802 select count(*) from xdb.xdb$schema s where bit I'm going to use: acz1t53gkwa12 4269646525 56 2885 474490 UPDATE CUSTOMERS12 SET STATUS = TO_CHAR(SYSDATE,'M --------------------- Create Data Collection Objects --------------------- Review the below SQL. You should be able to simply copy and paste most of the below SQL directly into the your monitoring user's SQL*Plus session. Make sure the collection Oracle user has select privileges on a few key views. -- You may need to connect as SYS to do this: -- def monitoruser=system grant select on v_$sqlstats to &monitoruser; grant select on v_$session to &monitoruser; grant execute on dbms_lock to &monitoruser; connect &monitoruser drop table op_results_raw; create table op_results_raw ( sql_id varchar2(20), plan_hash_value number, hit timestamp, note varchar2(10) ); create or replace function get_interval_s(i_intrvl interval day to second) return number is begin return extract(day from i_intrvl) * 86400 + extract(hour from i_intrvl) * 3600 + extract(minute from i_intrvl) * 60 + extract(second from i_intrvl); exception when others then begin return null; end; end; / CREATE or REPLACE PROCEDURE get_sql_interarrival_prc ( sample_secs_in number, -- The number of seconds to sample/collect data. delay_secs_in number, -- This reduces the impact of the collection but reduces the accuracy of data. sql_id_in varchar2, -- The statement to instrument. plan_hash_value_in number -- The statement to instrument. ) as BEGIN DECLARE time_mark timestamp; loop_no number; sample_no number; last_executions number; curr_executions number; BEGIN delete from op_results_raw; commit; -- current_timestamp is a reserved word, like sysdate. select current_timestamp into time_mark from dual; loop_no := 0; sample_no := 0; insert into op_results_raw values (sql_id_in,plan_hash_value_in,current_timestamp,'START'); select executions into last_executions from v$sqlstats where plan_hash_value = plan_hash_value_in and sql_id = sql_id_in; WHILE ( get_interval_s( current_timestamp-time_mark ) < sample_secs_in ) LOOP loop_no := loop_no + 1; select executions into curr_executions from v$sqlstats where plan_hash_value = plan_hash_value_in and sql_id = sql_id_in; -- If executions has changed, we know a new execution has just BEGUN. IF (curr_executions > last_executions) THEN insert into op_results_raw values (sql_id_in,plan_hash_value_in,current_timestamp,'DATA'); last_executions := curr_executions; sample_no := sample_no + 1; END IF; dbms_lock.sleep(delay_secs_in); END LOOP; insert into op_results_raw values (sql_id_in,plan_hash_value_in,current_timestamp,'STOP'); dbms_output.put_line('loop_no='||loop_no||' sample_no='||sample_no); END; END; / --------------------- Data Collection --------------------- Finally, we are ready to collect some data...well almost. -- Set the parameters for what we're looking for and the gathering constraints -- def sql_id_in=acz1t53gkwa12 def plan_hash_value_in=4269646525 def sample_duration_secs_in=60 def delay_secs_in=0 alter session set commit_write="batch,nowait"; set serveroutput on exec get_sql_interarrival_prc(&sample_duration_secs_in,&delay_secs_in,'&sql_id_in',&plan_hash_value_in); -- Now view the just collected raw data (I didn't show any of my results.): set tab off set linesize 150 select * from op_results_raw order by hit; --------------------- Data Analysis Prep --------------------- set tab off pagesize 50 -- The true sample duration is: select get_interval_s(t1.hit-t0.hit) Duration_secs from (select sql_id,plan_hash_value,hit from op_results_raw where note='START') t0, (select sql_id,plan_hash_value,hit from op_results_raw where note='STOP') t1 where t0.sql_id=t1.sql_id and t0.plan_hash_value=t1.plan_hash_value; SQL> l 1 select get_interval_s(t1.hit-t0.hit) Duration_secs 2 from (select sql_id,plan_hash_value,hit from op_results_raw where note='START') t0, 3 (select sql_id,plan_hash_value,hit from op_results_raw where note='STOP') t1 4 where t0.sql_id=t1.sql_id 5* and t0.plan_hash_value=t1.plan_hash_value SQL> / DURATION_SECS ------------- 59.999985 -- The number of arrivals select count(*) Arrivals from op_results_raw where note='DATA'; SQL> select count(*) Arrivals from op_results_raw where note='DATA'; ARRIVALS ---------- 58 -- Average arrival rate (per second) select x.arrivals/get_interval_s(t1.hit-t0.hit) Avg_Arrival_Rate_per_Sec from (select sql_id,plan_hash_value,hit from op_results_raw where note='START') t0, (select sql_id,plan_hash_value,hit from op_results_raw where note='STOP') t1, (select sql_id,plan_hash_value,count(*) arrivals from op_results_raw where note='DATA' group by sql_id,plan_hash_value) x where t0.sql_id=t1.sql_id and t0.sql_id=x.sql_id and t0.plan_hash_value=t1.plan_hash_value and t0.plan_hash_value=x.plan_hash_value; SQL> l 1 select x.arrivals/get_interval_s(t1.hit-t0.hit) Avg_Arrival_Rate_per_Sec 2 from (select sql_id,plan_hash_value,hit from op_results_raw where note='START') t0, 3 (select sql_id,plan_hash_value,hit from op_results_raw where note='STOP') t1, 4 (select sql_id,plan_hash_value,count(*) arrivals from op_results_raw where note='DATA' group by sql_id,plan_hash_value) x 5 where t0.sql_id=t1.sql_id 6 and t0.sql_id=x.sql_id 7 and t0.plan_hash_value=t1.plan_hash_value 8* and t0.plan_hash_value=x.plan_hash_value SQL> / AVG_ARRIVAL_RATE_PER_SEC ------------------------ .966666908 -- -- To get a list of the inter-arrival times -- -- Detail listing -- set tab off pagesize 50 col hit print col PrevHit print col ArrivalGapSec format 999990.000000 select hit, lag(hit) over (order by hit) PrevHit, get_interval_s(hit - lag(hit) over (order by hit)) ArrivalGapSec from op_results_raw where note='DATA' order by 1; -- I'm only going to show the first few lines. HIT PREVHIT ARRIVALGAPSEC ------------------------------ ------------------------------ -------------- 12-JAN-11 02.23.38.015668 PM 12-JAN-11 02.23.39.033214 PM 12-JAN-11 02.23.38.015668 PM 1.017546 12-JAN-11 02.23.40.052424 PM 12-JAN-11 02.23.39.033214 PM 1.019210 12-JAN-11 02.23.41.071279 PM 12-JAN-11 02.23.40.052424 PM 1.018855 12-JAN-11 02.23.42.177730 PM 12-JAN-11 02.23.41.071279 PM 1.106451 12-JAN-11 02.23.43.196310 PM 12-JAN-11 02.23.42.177730 PM 1.018580 12-JAN-11 02.23.44.214545 PM 12-JAN-11 02.23.43.196310 PM 1.018235 12-JAN-11 02.23.45.232070 PM 12-JAN-11 02.23.44.214545 PM 1.017525 12-JAN-11 02.23.46.257132 PM 12-JAN-11 02.23.45.232070 PM 1.025062 ... -- Summary, used for input into analysis -- -- (uses sample select as above) col hit noprint col PrevHit noprint set tab off pagesize 50 col ArrivalGapSec format 999990.000000 / ARRIVALGAPSEC -------------- 1.017546 1.019210 1.018855 1.106451 1.018580 1.018235 1.017525 1.025062 1.023676 -- For input into WolframAlpha and perhaps Mathematica, I want a comma delimited list. -- col hit noprint col PrevHit noprint set tab off pagesize 0 col ArrivalGapSec format 999990.000000 select hit, lag(hit) over (order by hit) PrevHit, get_interval_s(hit - lag(hit) over (order by hit)) ||',' ArrivalGapSec from op_results_raw where note='DATA' order by 1; 1.017546, 1.01921, 1.018855, 1.106451, 1.01858, 1.018235, 1.017525, 1.025062, 1.023676, 1.02267, 1.026292, 1.026526, 1.023412, 1.019862, 1.018121, 1.018232, 1.107351, 1.045038, 1.01754, ... To get a quick histogram, go to http://www.wolframalpha.com, enclose the above list in curly braces (removing the last comma), and submit! Look down the page and you'll see the histogram! --------------------- Additional Data To Play With --------------------- -- Here is an example based on one of my test environments. If you wanted, just change the initial four defines and you could copy and paste all the SQL directly into your monitoring user's SQL*Plus session! That's essentially what I did. def sql_id_in=acz1t53gkwa12 def plan_hash_value_in=4269646525 def sample_duration_secs_in=120 def delay_secs_in=0 alter session set commit_write="batch,nowait"; set serveroutput on exec get_sql_interarrival_prc(&sample_duration_secs_in,&delay_secs_in,'&sql_id_in',&plan_hash_value_in); col hit noprint col PrevHit noprint set tab off pagesize 0 col ArrivalGapSec format 999990.000000 select hit, lag(hit) over (order by hit) PrevHit, get_interval_s(hit - lag(hit) over (order by hit)) ||',' ArrivalGapSec from op_results_raw where note='DATA' order by 1; , 1.022105, 1.018354, 1.028192, 1.025623, 1.018903, 1.018764, 1.018035, 1.018967, 1.018399, 1.020803, 1.050645, .986606, 1.018642, 1.038674, 1.026601, 1.019091, 1.027014, 1.024511, 1.282889, 1.017959, 1.022251, 1.017911, 1.262761, .774376, 1.023408, 1.023244, 1.026507, 1.025558, 1.023352, 1.021937, 1.0252, 1.01929, 1.016789, 1.017846, 1.018105, 1.018548, 1.017709, 1.023534, 1.022685, 1.026959, 1.025238, 1.01838, 1.026235, 1.022591, 1.027411, 1.025829, 1.025032, 1.023549, 1.017612, 1.018886, 1.019038, 1.018614, 1.01895, 1.019194, 1.018716, 1.023809, 1.113276, 1.017926, 1.017645, 1.018225, 1.020574, 1.018207, 1.018075, 1.017816, 1.017743, 1.017952, 1.018053, 1.128741, 1.026853, 1.026018, 1.018271, 1.02485, 1.02246, 1.019628, 1.022861, 1.020588, 1.104826, 1.018694, 1.018318, 1.018813, 1.058905, 1.018913, 1.026336, 1.025684, 1.094094, 1.027325, 1.023476, 1.017369, 1.024076, 1.026887, 1.019939, 1.018971, 1.01988, 1.018736, 1.025788, 1.025404, 1.024762, 1.026208, 1.027537, 1.02386, 1.112024, 1.021383, 1.028775, 1.02233, 1.214973, 1.017639, 1.018568, 1.018437, 1.02713, 1.026199, 1.026203, 1.025023, 1.023517, 1.022998, 1.029126, 116 rows selected. --END