Friday, 13 February 2015

Join Elimination and potential risk of Wrong Results

Join Elimination is a query transformation feature introduced in 10gR2. The feature lives up to its name: the optimizer now has the ability to skip a join even when the SQL asks for it.

Let's look at this example (these tests were done on 12.1.0.2).

SQL> create table m(a number not null,n number not null, constraint m_pk primary key(n));

Table created.

SQL> create table d(b number not null,n number not null, constraint m_fk foreign key(n) references m(n));

Table created.

M and D have a parent-child relationship: every row in table D references exactly one row in table M.

Now Consider this query.

SQL>  select d.b from m inner join d on m.n = d.n;

Plan hash value: 3387653791

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|   1 |  TABLE ACCESS FULL| D    |
----------------------------------

In the SQL you explicitly asked for a join (m.n = d.n), but the execution plan shows that the optimizer has not executed it: it does a full table scan on table D. This is legitimate, because the optimizer knows that every row in table D references exactly one row in table M, so there is no need to perform the join.
For join elimination to happen, certain conditions must be met:

1. There must be a PK-FK relationship.
2. The SQL must select columns ONLY from the child table.

If you select columns from the parent table as well, JE does not kick in.

SQL>  select m.a, d.b from m inner join d on m.n = d.n;

---------------------------------------------
| Id  | Operation                    | Name |
---------------------------------------------
|   0 | SELECT STATEMENT             |      |
|   1 |  NESTED LOOPS                |      |
|   2 |   NESTED LOOPS               |      |
|   3 |    TABLE ACCESS FULL         | D    |
|   4 |    INDEX UNIQUE SCAN         | M_PK |
|   5 |   TABLE ACCESS BY INDEX ROWID| M    |
---------------------------------------------

A nested loops join is expected here: the optimizer executes the join because both tables must be accessed to satisfy the query.

Interestingly, you see two NESTED LOOPS operations in the execution plan even though the query has just one join condition. This is another 11g optimization feature, Nested Loop Join Batching (controlled by the parameter _nlj_batching_enabled). I will cover it in another blog post.
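
If you ever want to see the plain nested loops shape, the feature can be switched off at session level. A quick sketch (this is a hidden, unsupported parameter; use it only for testing):

SQL> alter session set "_nlj_batching_enabled" = 0;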

We now talk about certain parameters and constraint attributes that can potentially cause wrong results due to JE.

Consider this example

SQL> create table m(a number not null, n number not null, constraint m_pk primary key(n) rely enable validate);

Table created.

SQL> create table d(b number not null, n number not null, constraint m_fk foreign key(n) references m(n) rely disable novalidate);

Table created.

SQL> insert into d values(1,1);

1 row created.

SQL> commit;

Commit complete.

Note that I am able to insert into table D even though the value '1' does not exist in the parent table M. This is because the foreign key constraint is in the state RELY DISABLE NOVALIDATE.

SQL> select d.b from m inner join d on m.n = d.n;

no rows selected

This is the correct result: the join condition is never met, because the value '1' does not exist in table M.

Looking at the execution plan

Plan hash value: 299975660

-----------------------------------
| Id  | Operation          | Name |
-----------------------------------
|   0 | SELECT STATEMENT   |      |
|   1 |  NESTED LOOPS      |      |
|   2 |   TABLE ACCESS FULL| D    |
|   3 |   INDEX UNIQUE SCAN| M_PK |
-----------------------------------

Join elimination does not kick in here, because the foreign key is a RELY NOVALIDATE constraint and QUERY_REWRITE_INTEGRITY is still at its default value of ENFORCED.

Now alter the value of the parameter query_rewrite_integrity to TRUSTED.

SQL> alter session set query_rewrite_integrity=TRUSTED;

SQL> select d.b from m inner join d on m.n = d.n;

         B
----------
         1
 
( WRONG RESULTS)

Plan hash value: 3387653791

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|   1 |  TABLE ACCESS FULL| D    |
----------------------------------

When query_rewrite_integrity is set to TRUSTED, join elimination kicks in: the plan is a full table scan on D and the query returns one row. This is a wrong result.

CONCLUSION

Join elimination does not consider RELY NOVALIDATE constraints when QUERY_REWRITE_INTEGRITY is set to ENFORCED, but it does consider them when QUERY_REWRITE_INTEGRITY is set to TRUSTED or STALE_TOLERATED. This can cause queries to return wrong results. Therefore, to avoid wrong results, be careful in choosing the right value for QUERY_REWRITE_INTEGRITY, especially when you have RELY NOVALIDATE constraints.
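
Before switching to TRUSTED, it is worth checking which constraints the optimizer would then be allowed to trust. A minimal dictionary query (a sketch, using the two tables from the example above):

select table_name, constraint_name, constraint_type, status, validated, rely
  from user_constraints
 where table_name in ('M', 'D');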

Thursday, 5 February 2015

Using Result Cache and SQL Patch together

Result caching is an 11g feature that caches complete result sets (not database blocks). To use the result cache, you must set the initialization parameter RESULT_CACHE_MODE to FORCE or use the hint /*+ RESULT_CACHE */ in the query. What if you are using a packaged application and cannot edit the query? This is where the SQL Patch feature comes to the rescue: we can combine the two features to use the result cache without having to edit the query.

SQL> drop table test;
SQL> create table test (a number); 
SQL> insert into test values(1); 
SQL> commit; 
SQL> exec sys.dbms_sqldiag_internal.i_create_patch 
    (sql_text  => 'select count(*) from test', 
     hint_text=> 'OPT_PARAM(''result_cache_mode'' ''FORCE'')',  
     name => 'MYPATCH');
 
SQL> select count(*) from test;
SQL> select * from table(dbms_xplan.display_cursor);

Plan hash value: 1950795681

--------------------------------------------------------------------------------
| Id  | Operation           | Name                       | Rows  | Cost (%CPU)|
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                            |       |     3 (100)|
|   1 |  RESULT CACHE       | 6wgaf3vqwzvzx1sfmk7cw30v0v |       |            |
|   2 |   SORT AGGREGATE    |                            |     1 |            |
|   3 |    TABLE ACCESS FULL| TEST                       |     1 |     3   (0)|
--------------------------------------------------------------------------------

The execution plan confirms that a result cache ID has been generated; subsequent executions of this SQL will use the result cache.

NOTE: If your database version is lower than 12.1.0.2, you must install Patch 16974854 to use this feature. Without this fix, the optimizer ignores the result cache hint.
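
To confirm that the result set is actually cached, and to remove the patch when you no longer need it, something like the following should work (a sketch; DROP_SQL_PATCH drops the patch created above):

select id, type, status, name
  from v$result_cache_objects
 where type = 'Result';

SQL> exec dbms_sqldiag.drop_sql_patch(name => 'MYPATCH');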

How to find the bind variable value of a SQL that is currently executing

There are two reliable ways to get the bind variable values of a SQL that is currently executing.

1. SQL Monitoring (V$SQL_MONITOR.BIND_XML)
2. Errorstack Trace

We look at both of these methods in detail.

I have not included a 10046 trace (level 8 or 12) here because it may not report the bind variable values if the trace was enabled after the SQL had already passed the parse phase.

SQL Monitoring (V$SQL_MONITOR.BIND_XML)

SQL Monitoring was introduced in 11.1, but the BIND_XML column was added only in 11.2.

We run a query with bind variables that takes a little longer to complete.

SQL> VAR BIND1 NUMBER;
SQL> EXEC :BIND1:=999;
SQL> select count(*) from DBA_OBJECTS A,DBA_OBJECTS B WHERE A.OBJECT_ID>:BIND1;

While this SQL is still running, query v$sql_monitor from a different session.

SQL> select xmltype(binds_xml) from v$sql_monitor where sid =74 and status = 'EXECUTING';

XMLTYPE(BINDS_XML)
------------------
<binds>
<bind name=":BIND1" pos="1" dty="2" dtystr="NUMBER" maxlen="22" len="3">999</bind>
</binds>

Note that SQL Monitoring kicks in for SQL statements that run for 5 seconds or more, or that execute in parallel.
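
If you prefer the binds as rows rather than raw XML, the BIND_XML document can be shredded with XMLTABLE. A minimal sketch (the attribute names follow the output above; sid 74 is my session):

select sm.sid,
       b.bind_name,
       b.bind_type,
       b.bind_value
  from v$sql_monitor sm,
       xmltable('/binds/bind'
                passing xmltype(sm.binds_xml)
                columns bind_name  varchar2(30)   path '@name',
                        bind_type  varchar2(30)   path '@dtystr',
                        bind_value varchar2(4000) path '.') b
 where sm.sid = 74
   and sm.status = 'EXECUTING'
   and sm.binds_xml is not null;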

If you have a bind variable in the SELECT list of the query, BIND_XML is not populated. For example:

select :bind1 AS BIND, ENAME from SCOTT.EMP ;

ERRORSTACK

An errorstack trace contains a lot of information; here I focus only on the bind variable section of the trace file.

This is how you take an errorstack trace; 12280 is the OS process ID of the session that runs the SELECT.

connect / as sysdba
oradebug setospid 12280
oradebug unlimit
oradebug dump errorstack 3
oradebug tracefile_name /* This tells you the file name  */
exit

The header section of the trace file shows you the current SQL statement text and its SQL ID.

----- Error Stack Dump -----

----- Current SQL Statement for this session (sql_id=9z7qgrmf5at7b) 

select count(*) from DBA_OBJECTS A,DBA_OBJECTS B WHERE A.OBJECT_ID>:BIND1

Now search the trace file for the keyword "Dump Cursor sql_id=9z7qgrmf5at7b". Scroll down a bit and you will see the bind information.

----- Bind Info (kkscoacd) -----
Bind#0
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1000000 frm=00 csi=00 siz=24 off=0
kxsbbbfp=2b479dde1cc8 bln=22 avl=03 flg=05
value=999

oacdty is the data type of the bind variable; 02 indicates a NUMBER. These are some of the other datatype codes:

01 - Varchar2
02 - Number
08 - Long
09 - Varchar
12 - Date
112- CLOB
113- BLOB
114- BFILE

Saturday, 15 February 2014

Evolving SQL Plan Baselines and Adjusting the Acceptance Threshold

Non-accepted plans can be verified by executing the EVOLVE_SQL_PLAN_BASELINE function. This function executes the non-accepted plan and compares its performance to the best accepted plan; the execution is performed using the peeked bind values of the non-accepted plan.
If the non-accepted plan performs better by at least a threshold factor (1.5 by default), the function makes it accepted, adding it to the SQL plan baseline.

Consider the following. 

There is one non-accepted plan for SQL handle SQL_0f8c8f355212ed9a:

SQL> select sql_handle,plan_name,enabled,accepted from dba_sql_plan_baselines;

SQL_HANDLE                     PLAN_NAME                      ENA ACC
------------------------------ ------------------------------ --- ---
SQL_0f8c8f355212ed9a           SQL_PLAN_0z34g6p915vcu22368d7d YES NO
SQL_0f8c8f355212ed9a           SQL_PLAN_0z34g6p915vcu6ddd367b YES YES

We will now evolve this non-accepted plan.

SQL> var report clob;  
SQL> exec :report := DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE( sql_handle=> 'SQL_0f8c8f355212ed9a', verify=>'YES', commit=>'YES' );

PL/SQL procedure successfully completed

SQL> print :report


-------------------------------------------------------------------------------
                        Evolve SQL Plan Baseline Report
-------------------------------------------------------------------------------

Inputs:
-------
  SQL_HANDLE = SQL_0f8c8f355212ed9a
  PLAN_NAME  = SQL_PLAN_0z34g6p915vcu22368d7d
  TIME_LIMIT = DBMS_SPM.AUTO_LIMIT
  VERIFY     = YES
  COMMIT     = YES

Plan: SQL_PLAN_0z34g6p915vcu22368d7d
-----------------------------------
  Plan was verified: Time used .06 seconds.
  Plan failed performance criterion: 1.29 times better than baseline plan.
  
                      Baseline Plan      Test Plan     Stats Ratio
                      -------------      ---------     -------------
  Execution Status:        COMPLETE       COMPLETE
  Rows Processed:               10            10
  Elapsed Time(ms):             3.434         .101            34
  CPU Time(ms):                 3.555         .1111           32.03
  Buffer Gets:                  19            15              1.27
  Disk Reads:                   0             0
  Direct Writes:                0             0
  Fetches:                      0             0
  Executions:                   1             1

-------------------------------------------------------------------------------
                                 Report Summary
-------------------------------------------------------------------------------
Number of SQL plan baselines verified: 1
Number of plans accepted: 0

The new plan is 1.29 times better than the baseline plan (1.29 is computed as a function of buffer gets and CPU time).
The plan was not accepted because it did not meet the threshold criterion of 1.5: a plan has to be at least 1.5 times better for SPM to accept it.
This threshold is controlled by the hidden parameter "_plan_verify_improvement_margin", whose default value is 150.
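
You can check the current value of this hidden parameter with a query like the following (a sketch; X$ tables are only visible to a SYSDBA connection):

select i.ksppinm  parameter_name,
       v.ksppstvl current_value
  from x$ksppi i, x$ksppcv v
 where i.indx = v.indx
   and i.ksppinm = '_plan_verify_improvement_margin';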

Now let us set this parameter to 120 and evolve the plan again.

SQL> ALTER SYSTEM SET "_plan_verify_improvement_margin"=120;

SQL> var report clob;  
SQL> exec :report := DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE( sql_handle=> 'SQL_0f8c8f355212ed9a', verify=>'YES', commit=>'YES' );

PL/SQL procedure successfully completed

SQL> print :report


-------------------------------------------------------------------------------
                        Evolve SQL Plan Baseline Report
-------------------------------------------------------------------------------

Inputs:
-------
  SQL_HANDLE = SQL_0f8c8f355212ed9a
  PLAN_NAME  = SQL_PLAN_0z34g6p915vcu22368d7d
  TIME_LIMIT = DBMS_SPM.AUTO_LIMIT
  VERIFY     = YES
  COMMIT     = YES

Plan: SQL_PLAN_0z34g6p915vcu22368d7d
-----------------------------------
  Plan was verified: Time used .06 seconds.
  Plan passed performance criterion: 1.29 times better than baseline plan.
  Plan was changed to an accepted plan.
  
                       Baseline Plan      Test Plan     Stats Ratio
                      -------------      ---------     -------------
  Execution Status:        COMPLETE       COMPLETE
  Rows Processed:               10            10
  Elapsed Time(ms):             3.434         .101            34
  CPU Time(ms):                 3.555         .1111           32.03
  Buffer Gets:                  19            15              1.27
  Disk Reads:                   0             0
  Direct Writes:                0             0
  Fetches:                      0             0
  Executions:                   1             1

-------------------------------------------------------------------------------
                                 Report Summary
-------------------------------------------------------------------------------
Number of SQL plan baselines verified: 1
Number of plans accepted: 1

SPM evolution now automatically accepts SQL_PLAN_0z34g6p915vcu22368d7d.

SQL> select sql_handle,plan_name,enabled,accepted from dba_sql_plan_baselines;

SQL_HANDLE                     PLAN_NAME                      ENA ACC
------------------------------ ------------------------------ --- ---
SQL_0f8c8f355212ed9a           SQL_PLAN_0z34g6p915vcu22368d7d YES YES
SQL_0f8c8f355212ed9a           SQL_PLAN_0z34g6p915vcu6ddd367b YES YES

Sunday, 28 July 2013

V$SQL.ELAPSED_TIME FOR PARALLEL QUERIES

Recently I came across an issue where the actual response time of a SQL statement was 50 seconds, but V$SQL.ELAPSED_TIME reported 2400 seconds. AWR, too, reported 2400 seconds. So what was going on? I looked at the execution plan: the query was running in parallel.
This is actually expected behavior. When a query runs in parallel, V$SQL.ELAPSED_TIME reports the total elapsed time accumulated by the query coordinator (QC) and all of the parallel slaves. My query was running with 48 parallel slaves, which is why the elapsed time in V$SQL was ~48x higher.
Interestingly, ELAPSED_TIME in V$SQL_MONITOR displays the correct response time, because that view reports the time for the query coordinator process alone.
Takeaway: do not panic if you see a high elapsed time for a SQL statement in an AWR report, because elapsed time is not necessarily the actual response time.
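
To see the discrepancy for yourself, you can put the two views side by side. A minimal sketch (illustrative only: V$SQL can have several child cursors, and the SQL_ID is the one from my case):

select s.sql_id,
       round(s.elapsed_time / 1e6, 1) secs_qc_plus_slaves,
       round(m.elapsed_time / 1e6, 1) secs_qc_wall_clock
  from v$sql s, v$sql_monitor m
 where m.sql_id = s.sql_id
   and m.process_name = 'ora'  -- the query coordinator row in v$sql_monitor
   and s.sql_id = 'dthn77sj0wpdk';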

You can query ASH data to get the wall clock time of a parallel query. ASH also has a PX_FLAGS column, which is currently undocumented but can tell you the degree of parallelism your SQL used in the past:


SELECT sql_id,
       sql_exec_start,
       sql_plan_hash_value phv,
       MIN (sample_time) sql_start_time,
       MAX (sample_time) sql_end_time,
       -- wall clock seconds between execution start and last sample
       ROUND (  (CAST (MAX (sample_time) AS DATE) - CAST (sql_exec_start AS DATE))
              * 3600 * 24) etime_secs,
       ROUND (SUM (delta_read_io_bytes) / 1024 / 1024 / 1024, 2) physical_read_gb,
       ROUND (SUM (delta_interconnect_io_bytes) / 1024 / 1024 / 1024, 2) interconnect_io_gb,
       -- the actual DOP is encoded in the high-order bits of PX_FLAGS
       MAX (TRUNC (px_flags / 2097152)) actual_degree,
       MIN (NULLIF (TRUNC (px_flags / 2097152), 0)) min_actual_degree
  FROM dba_hist_active_sess_history
 WHERE sql_id = 'dthn77sj0wpdk'
   AND (sql_exec_start IS NOT NULL OR sql_exec_id IS NOT NULL)
GROUP BY sql_id,
         sql_exec_start,
         sql_plan_hash_value,
         sql_exec_id,
         qc_instance_id,
         qc_session_id,
         qc_session_serial#
ORDER BY sql_exec_start DESC;

Saturday, 22 June 2013

Use of Non-Deterministic functions in a SQL may give Wrong Results

Here is an interesting little test case to demonstrate the issue.

create table dept (deptno number, dname varchar2(100));

insert into dept values (1,'HR');
insert into dept values (2,'AZ');
insert into dept values (3,'XL');

SELECT deptno, dname
  FROM dept
 WHERE deptno in (SELECT round(dbms_random.value(1, 3)) FROM dual);

I would expect the subquery to return one row (1, 2, or 3), and therefore the main query to always return one record.
Now let us run this query a few times and look at the result set.

/

2 AZ

/

no rows selected

/

1 HR
3 XL

Here is the execution plan of the statement.

 ---------------------------------------------------------------------------
 | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
 ---------------------------------------------------------------------------
 |   0 | SELECT STATEMENT   |      |     1 |    22 |     4   (0)| 00:00:01 |
 |*  1 |  FILTER            |      |       |       |            |          |
 |   2 |   TABLE ACCESS FULL| DEPT |     4 |    88 |     2   (0)| 00:00:01 |
 |*  3 |   FILTER           |      |       |       |            |          |
 |   4 |    FAST DUAL       |      |     1 |       |     2   (0)| 00:00:01 |
 ---------------------------------------------------------------------------
  
 Predicate Information (identified by operation id):
 ---------------------------------------------------
    1 - filter( EXISTS (SELECT 0 FROM "SYS"."DUAL" "DUAL" WHERE
               ROUND("DBMS_RANDOM"."VALUE"(1,3))=:B1))
    3 - filter(ROUND("DBMS_RANDOM"."VALUE"(1,3))=:B1)
  
Strange: different executions return a different number of records. Wrong results? No. The problem here is the non-deterministic function (dbms_random.value). The function is evaluated once for each row returned by DEPT, rather than once at the start. This behavior is expected, as the optimizer does not guarantee when a non-deterministic function will be evaluated during statement execution. You have to be careful when using such non-deterministic functions in SQL statements.
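
If the intent is to pick one random department per execution, one workaround is to force the subquery to be evaluated exactly once by materializing it. A sketch (MATERIALIZE is an undocumented hint, so treat this as illustrative):

with pick as (
  select /*+ materialize */ round(dbms_random.value(1, 3)) deptno
    from dual
)
select d.deptno, d.dname
  from dept d, pick p
 where d.deptno = p.deptno;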

Interesting Case where a full table scan is chosen over an index scan - Out of Range Selectivity ?

I was working on a real-life case where the CBO was choosing a full table scan over an index scan. I created a small test case to demonstrate the issue: a table TEST with two columns, ID and ORG.

create table test ( id number, org varchar2(100) );

begin
  for i in 1..300000 loop
    insert into test values (1,'XXXXXXXXXXXXXX');
  end loop;
end;
/

commit;

- Inserted 300000 records for ID=1


begin
  for i in 1..100000 loop
    insert into test values (17,'YYYYYYYYYYYYYYYYY');
  end loop;
end;
/

commit;

- Inserted 100000 records for ID=17
- Therefore ID has two distinct values: 1 and 17.

create index idx_1 on test(id);

- Created an index on TEST.ID
- Gathering 100% statistics with a histogram on ID:

exec dbms_stats.gather_table_stats(user,'TEST',estimate_percent=>100,method_opt=>'for columns ID size 2');
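
You can confirm the histogram that was just gathered with a quick dictionary query (a sketch; two buckets are expected, one per distinct value of ID):

select endpoint_value, endpoint_number
  from user_tab_histograms
 where table_name = 'TEST'
   and column_name = 'ID'
 order by endpoint_number;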

There are no rows with ID=0, so if I run "select * from test where id=0" I would expect an index range scan: with no matching rows, the scan would finish in the blink of an eye regardless of the size of the table. Let us look at the execution statistics.


SQL> set autotrace on
SQL> select * from test where id=0;

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 1357081020

--------------------------------------------------------------------------

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 46875 |   869K|   230   (3)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| TEST | 46875 |   869K|   230   (3)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   1 - filter("ID"=0)


Statistics

----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1313  consistent gets
          0  physical reads
          0  redo size
        339  bytes sent via SQL*Net to client
        404  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

Strangely, the optimizer chooses a FULL TABLE SCAN over the index IDX_1.

Notice that we are doing 1313 consistent gets here.

Now let us force the index and see the execution statistics.


SQL> select /*+ index(test idx_1) */ * from test where id=0;

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 3519455734

-------------------------------------------------------------------------------------

| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       | 46875 |   869K|   249   (1)| 00:00:03 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TEST  | 46875 |   869K|   249   (1)| 00:00:03 |
|*  2 |   INDEX RANGE SCAN          | IDX_1 | 46875 |       |    95   (2)| 00:00:02 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   2 - access("ID"=0)



Statistics

----------------------------------------------------------
          8  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        339  bytes sent via SQL*Net to client
        404  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
          0  rows processed

OK, with the index hint the query runs much faster, and the number of consistent gets comes down from 1313 to just 3!


Now this is interesting: I have 100% statistics and the histogram is in place. Why can't the optimizer choose the index by itself?

In the execution plan, we see that the optimizer expects our query to return 46875 rows. This number looks suspicious: because of this large estimate, the optimizer considers the full table scan less expensive than the index scan. Let us work out how the optimizer arrived at this number.


These are the values known to the optimizer (before parsing the statement in question):

count(*) for ID=1  : 300000
count(*) for ID=17 : 100000   --> Min(BucketSize)
NumRows (number of rows in the table) : 400000
low_value  = 1
high_value = 17
predicate value = 0
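
All of these inputs are visible in the dictionary. For example (a sketch; UTL_RAW.CAST_TO_NUMBER decodes the raw low/high values of a NUMBER column):

select num_distinct,
       utl_raw.cast_to_number(low_value)  low_val,
       utl_raw.cast_to_number(high_value) high_val
  from user_tab_col_statistics
 where table_name = 'TEST'
   and column_name = 'ID';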

The optimizer uses the following formula to determine the selectivity for predicate value 0, which is out of range (not between the min value 1 and the max value 17):

Sel = (100000 / (2 * 400000)) * (1 - ((1 - 0) / (17 - 1)))
    = 0.125 * 0.9375
    = 0.1171875

Therefore,

cardinality = Sel * NumRows
            = 0.1171875 * 400000
            = 46875

CONCLUSION


When the predicate value is close to the min/max value, the selectivity is higher (in our case, 0 is close to the low value 1). When the predicate value is far beyond the min/max values, the selectivity is lower.

SOLUTION


As explained, this is expected behavior; this is how the optimizer determines the selectivity of an out-of-range predicate.

As a workaround (each is sketched below):

- Use an index hint.
- Adjust optimizer_index_cost_adj to a lower value, say 50.
- Use plan stability features such as SQL plan baselines.
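
The first two workarounds, sketched (the value 50 is illustrative; test any change to optimizer_index_cost_adj before making it permanent):

alter session set optimizer_index_cost_adj = 50;

select /*+ index(test idx_1) */ * from test where id = 0;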
