Tuesday, 7 January 2020

ORA-942 despite having the SELECT privilege

Here is an intriguing issue: a user was able to run a SQL statement directly and from within an anonymous PL/SQL block, but the same SQL statement fails with ORA-00942 ("table or view does not exist") when it is run from within a procedure.

User A was able to execute this
CREATE TABLE A.TEST AS SELECT * FROM B.TEST ;
User A was also able to execute this
BEGIN
EXECUTE IMMEDIATE 'CREATE TABLE A.TEST AS SELECT * FROM B.TEST';
END;
/
But the same user A was not able to execute this procedure.
CREATE PROCEDURE A.TEST_PROC
AS
BEGIN
EXECUTE IMMEDIATE 'CREATE TABLE A.TEST AS SELECT * FROM B.TEST';
END;
/
-- Procedure created
EXECUTE TEST_PROC;

This fails with ORA-00942 ("table or view does not exist"). User A has the SELECT privilege on B.TEST via a role!
Oracle allows two modes of operation for executing named PL/SQL: definer rights and invoker rights. By default, all procedures and functions are created with definer rights, and in this model all roles are disabled during procedure compilation and execution. We were able to compile the above procedure because B.TEST sits inside the EXECUTE IMMEDIATE string. But during the execution of this procedure the role is disabled, and therefore user A loses access to B.TEST.
Hence the procedure execution fails with ORA-00942. Even a call to SET ROLE is not allowed within the procedure. You must remove the role dependency and give direct grants to the user to be able to run the above procedure.
With the invoker-rights model, roles are enabled during PL/SQL execution but not during compilation.
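A hedged sketch of the two possible fixes, using the same A/B objects from above: either convert the procedure to invoker rights with AUTHID CURRENT_USER (roles are enabled at execution time for invoker-rights units), or keep definer rights and replace the role with a direct grant.

```sql
-- Sketch 1: invoker-rights variant. Roles are enabled when this runs,
-- so user A's role-based SELECT grant on B.TEST is honoured.
CREATE OR REPLACE PROCEDURE A.TEST_PROC
AUTHID CURRENT_USER
AS
BEGIN
  EXECUTE IMMEDIATE 'CREATE TABLE A.TEST AS SELECT * FROM B.TEST';
END;
/

-- Sketch 2: keep definer rights, but remove the role dependency
-- with a direct grant (run as B or as a DBA).
GRANT SELECT ON B.TEST TO A;
```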


Saturday, 7 September 2019

Curious case of SQLs restarting by themselves and ORA-12842

We might have seen lots of ORA-12842 ("cursor invalidated during parallel execution").
Usually a select query running in parallel fails with this error. It can happen when the select runs concurrently with other sessions doing some sort of partition maintenance (like exchanging partitions or adding/dropping partitions). This is expected behaviour. Document 1322894.1 tells you to ignore the error and re-execute the query. Or you can just "suppress" the error message by setting a fix control.

SQL> alter system set "_fix_control"='7170213:OFF' ;


But the note doesn't tell you how Oracle actually suppresses the error. And is it really a good idea to set this fix control?

Let's try to understand the current database behaviour. If a parallel query encounters ORA-12842 midway through execution, the error is not immediately reported back to the user. Instead, Oracle silently restarts the SQL on its own, hoping that it will go through this time. If this "automated" second run also fails with the same error, Oracle restarts the SQL again, and so on. Only if the 10th execution is unsuccessful is ORA-12842 reported back to the user.

Now suppose I have a select query that should complete in 15 minutes, but it fails after running for the first 10 minutes and Oracle restarts it all over again. Assuming that all retry attempts are unsuccessful, this query would effectively run for 100 minutes and then report ORA-12842.
And if the query succeeds on the 10th attempt, the total run time for this query would appear to be 105 minutes! So the user might complain that his query is running slow while Oracle is actually restarting it silently in the background. In the ASH data you would see the sql_exec_id change after every failure.
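A hedged sketch of how you might spot these silent restarts in ASH data (&sql_id is a placeholder for the affected query): each restart appears as a new SQL_EXEC_ID for the same SQL_ID.

```sql
-- Several SQL_EXEC_IDs for what the user perceives as a single execution
-- indicate that Oracle restarted the query behind the scenes.
SELECT sql_exec_id,
       MIN(sample_time) AS first_seen,
       MAX(sample_time) AS last_seen
  FROM v$active_session_history
 WHERE sql_id = '&sql_id'
 GROUP BY sql_exec_id
 ORDER BY first_seen;
```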

Now let's talk about how Oracle suppresses this error. When we turn off this fix control, ORA-12842 is never reported. But there is a catch. With the fix control turned off, Oracle does not retry the SQL 10 times (as explained above). Instead, it downgrades the SQL to serial on the second attempt itself, and serial SQLs never report ORA-12842. So suppressing the error effectively means that your SQL will be downgraded to serial, and that again can have huge performance implications.

Now that we know how Oracle handles this kind of situation, we can make a careful choice on whether this fix control should be set. I'd personally advise against it. There are better ways to handle this error: for example, you could run the partition maintenance on weekends (outside of your normal OLTP/batch reporting jobs).







Friday, 9 August 2019

Adaptive Cursor Sharing - Detecting the problem a little too late

Adaptive Cursor Sharing (ACS) was introduced in 11g to deal with the issues related to bind variable peeking.
This post assumes that you have a basic understanding of bind peeking and how ACS works.
The goal of this post is to demonstrate a key problem with ACS: it detects the problem a little too late.

CREATE TABLE t1 (c1 NUMBER, c2 VARCHAR2(10));

-- Insert 1 million rows, c2 is skewed

INSERT INTO t1
SELECT LEVEL c1,
       CHR(ROUND(DBMS_RANDOM.NORMAL) + 77) c2
  FROM DUAL
CONNECT BY LEVEL <= 1e6;

COMMIT;


CREATE INDEX t1_i1 ON t1(c1);

CREATE INDEX t1_i2 ON t1(c2);

-- gather stats at 100% with histograms
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 't1', estimate_percent => 100, method_opt => 'FOR ALL COLUMNS SIZE 254');


-- data distribution
SELECT c2, COUNT(*) FROM t1 GROUP BY c2 ORDER BY c2;

 C2           COUNT(*)
---------- ----------
H                   3         
I                 234
J                5950
K               60656
L              242076
M              382945
N              241556
O               60508
P                5838
Q                 230
R                   4

VAR b2 varchar2(100)

EXEC :b2 :='M'

SELECT COUNT(DISTINCT c1) FROM t1 WHERE c2 = :b2;


COUNT(DISTINCTC1)

-----------------
           382945
  
/
/
..
-- I execute this statement 20 times.

select * from table(dbms_xplan.display_cursor);

SQL_ID  1cvnvg9f3yrb0, child number 0

-------------------------------------
SELECT COUNT(DISTINCT c1) FROM t1 WHERE c2 = :b2

Plan hash value: 405047221


-----------------------------------------------------------------------------------------

| Id  | Operation            | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |       |  1814 (100)|          |
|   1 |  SORT AGGREGATE      |          |     1 |    13 |       |            |          |
|   2 |   VIEW               | VW_DAG_0 |   382K|  4861K|       |  1814   (2)| 00:00:22 |
|   3 |    HASH GROUP BY     |          |   382K|  2617K|  6016K|  1814   (2)| 00:00:22 |
|*  4 |     TABLE ACCESS FULL| T1       |   382K|  2617K|       |   491   (3)| 00:00:06 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   4 - filter("C2"=:B2)


A full table scan looks good because the bind value we passed ('M') is highly unselective (selectivity 0.38).
Now I pass the most selective bind value, which is 'H' (selectivity 0.000003).
I am expecting ACS to kick in so that index t1_i2 can be used.
An index range scan on t1_i2 would be the optimal plan here because the current bind value 'H' is highly selective.
 
EXEC :b2 :='H'

SELECT COUNT(DISTINCT c1) FROM t1 WHERE c2 = :b2;

COUNT(DISTINCTC1)
-----------------
                3

select * from table(dbms_xplan.display_cursor);


SQL_ID  1cvnvg9f3yrb0, child number 0

-------------------------------------
SELECT COUNT(DISTINCT c1) FROM t1 WHERE c2 = :b2

Plan hash value: 405047221


-----------------------------------------------------------------------------------------

| Id  | Operation            | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |       |  1814 (100)|          |
|   1 |  SORT AGGREGATE      |          |     1 |    13 |       |            |          |
|   2 |   VIEW               | VW_DAG_0 |   382K|  4861K|       |  1814   (2)| 00:00:22 |
|   3 |    HASH GROUP BY     |          |   382K|  2617K|  6016K|  1814   (2)| 00:00:22 |
|*  4 |     TABLE ACCESS FULL| T1       |   382K|  2617K|       |   491   (3)| 00:00:06 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   4 - filter("C2"=:B2)


Note that ACS has not detected the problem yet (on the first execution); you still see a full table scan. I execute this statement a few more times, hoping that ACS will kick in and the index will be used.

SELECT COUNT(DISTINCT c1) FROM t1 WHERE c2 = :b2;
/
/
/

select * from table(dbms_xplan.display_cursor);


SQL_ID  1cvnvg9f3yrb0, child number 0

-------------------------------------
SELECT COUNT(DISTINCT c1) FROM t1 WHERE c2 = :b2

Plan hash value: 405047221


-----------------------------------------------------------------------------------------

| Id  | Operation            | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |       |  1814 (100)|          |
|   1 |  SORT AGGREGATE      |          |     1 |    13 |       |            |          |
|   2 |   VIEW               | VW_DAG_0 |   382K|  4861K|       |  1814   (2)| 00:00:22 |
|   3 |    HASH GROUP BY     |          |   382K|  2617K|  6016K|  1814   (2)| 00:00:22 |
|*  4 |     TABLE ACCESS FULL| T1       |   382K|  2617K|       |   491   (3)| 00:00:06 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   4 - filter("C2"=:B2)


ACS didn't kick in yet !

/
/
<< 8th execution >>
SELECT COUNT(DISTINCT c1) FROM t1 WHERE c2 = :b2;

SQL_ID  1cvnvg9f3yrb0, child number 1

-------------------------------------
SELECT COUNT(DISTINCT c1) FROM t1 WHERE c2 = :b2

Plan hash value: 1964376958


-------------------------------------------------------------------------------------------

| Id  | Operation                      | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |       |       |     5 (100)|          |
|   1 |  SORT AGGREGATE                |          |     1 |    13 |            |          |
|   2 |   VIEW                         | VW_DAG_0 |     3 |    39 |     5  (20)| 00:00:01 |
|   3 |    HASH GROUP BY               |          |     3 |    21 |     5  (20)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID| T1       |     3 |    21 |     4   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN          | T1_I2    |     3 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   5 - access("C2"=:B2)


 
Finally, I see an index range scan. It seems ACS was working behind the scenes and finally detected the problem on the 8th execution, which I think is a little too late.

Now a new child cursor has been created which is bind aware.


SQL> select is_bind_sensitive,is_bind_aware,executions,child_number from v$sql where sql_id='1cvnvg9f3yrb0';

I I EXECUTIONS CHILD_NUMBER

- - ---------- ------------
Y N         27            0
Y Y          1            1


SQL> select SQL_ID,CHILD_NUMBER,PREDICATE,RANGE_ID,LOW,HIGH FROM v$sql_cs_selectivity where sql_id='1cvnvg9f3yrb0';

SQL_ID        CHILD_NUMBER PREDICATE                                  RANGE_ID LOW        HIGH

------------- ------------ ---------------------------------------- ---------- ---------- ----------
1cvnvg9f3yrb0            1 =B2                                               0 0.000003   0.000003
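As a side note, v$sql_cs_histogram is also worth a look (a sketch, for the same sql_id): it shows the per-bucket execution counts that ACS consults when deciding to mark a cursor bind aware.

```sql
-- Buckets roughly classify executions by rows processed (low/medium/high);
-- ACS watches this distribution before making the cursor bind aware.
SELECT child_number, bucket_id, count
  FROM v$sql_cs_histogram
 WHERE sql_id = '1cvnvg9f3yrb0'
 ORDER BY child_number, bucket_id;
```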

CONCLUSION

Let me summarise some of the limitations of ACS.

1. It takes a while before ACS kicks in. A certain number of executions is needed before ACS identifies a problem and corrects it. This depends on the number of child 0 executions and perhaps some other factors.
I did some tests and this is what I found.

-----------------------------------------------------------------
CHILD 0 (FTS)                  CHILD 1 (Index Range Scan)
Executions                     Created after n executions
-----------------------------------------------------------------
1                      |                          2
10                     |                          5
20                     |                          8
50                     |                         18
100                    |                         35
500                    |                        168
1000                   |                  ACS NEVER KICKS IN
-----------------------------------------------------------------

As you can see, the number of executions that ACS needs before it can make a cursor bind aware and create a new child cursor grows roughly linearly. Even more interesting: when I executed the child 0 cursor 1000 times and then changed my bind value to the most selective one, ACS never corrected the problem. I do not know whether that is a software bug or expected behaviour.

2. ACS does not work for SQL statements called from inside a PL/SQL block. However, it does seem to work after you set
session_cached_cursors to 0.

3. A cursor is not marked bind sensitive if the number of bind variables exceeds 14.


4. ACS does not work for predicates of the form col <op> func(:bind). For example, the statement below will not be marked bind sensitive.

SELECT COUNT(DISTINCT c1) FROM t1 WHERE c2 = UPPER(:b2)

Despite all these limitations, ACS is still a wonderful feature and a big step forward in dealing with bind peeking problems. ACS does not solve all your problems, but it certainly solves some of them!

Saturday, 16 March 2019

JPPD does not occur for a view with UNION-ALL

Here is a small test case to demonstrate the problem (on 11.2.0.4).


create table t1(a number);

create table t2(a number);

create index idx_t2 on t2(a);

create view v as
(
select * from t1
UNION ALL
select * from t2
);

create table t3(a number);

insert into t1 values(1);

insert into t2 values(2);

insert into t3 values(1);

commit;

exec dbms_stats.gather_table_stats(user,'T1')
exec dbms_stats.gather_table_stats(user,'T2')
exec dbms_stats.gather_table_stats(user,'T3')

explain plan for
select * from v where a in (select a from t3);

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |     1 |    16 |     6   (0)| 00:00:01 |
|*  1 |  HASH JOIN SEMI              |      |     1 |    16 |     6   (0)| 00:00:01 |
|   2 |   VIEW                       | V    |     2 |    26 |     4   (0)| 00:00:01 |
|   3 |    UNION-ALL                 |      |       |       |            |          |
|   4 |     TABLE ACCESS STORAGE FULL| T1   |     1 |     3 |     2   (0)| 00:00:01 |
|   5 |     TABLE ACCESS STORAGE FULL| T2   |     1 |     3 |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS STORAGE FULL  | T3   |     1 |     3 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


   1 - access("A"="A")

Looking at the predicates, the view is evaluated without any filters or predicates. If my view returns billions of rows, all those rows are produced first and then a semi join is performed. This is very inefficient. I expect the sub-query to be executed first, transformed into a view, joined with view V, and the join predicate pushed down into the view. If the predicate is pushed, you would see smart scans (on Exadata) or an index range scan.

There is a bug on 11.2.0.4 that should have been fixed in 12.2.

Bug 18798414: JPPD NOT HONOURED ON VIEW WITH UNION, WHEN EXECUTION STARTS WITH SUBQUERY

I tested on 12.2 but the issue still remains. I will be testing this on 18c and 19c soon.

For now, you can rewrite the query as shown below, which provides a good workaround.

select * from T1 where a in (select a from t3)
UNION ALL
select * from T2 where a in (select a from t3);

This query takes advantage of JPPD.

Update on 7-SEP-2019: I followed this up with Oracle Support. There are certain heuristics that disable JPPD when one of the tables does not have indexes; in the example shown above, T1 does not have an index. But JPPD would still be useful without indexes, so that one branch can take advantage of smart scans (on Exadata) and the other branch can benefit from indexes.

Here is the solution to get this working on 11.2.0.4

1) get backport of Bug:23246817 for 11.2.0.4.0
-> this gets heuristic from Bug:19803410 in its latest improved version
2) get backport of Bug:18798414 for 11.2.0.4.0
3) get merge patch of 23246817 with 18798414 and install it
-> these two fixes have common files so we need a merge
4) set the following in the session where the JPPD is needed

alter session set "_fix_control"='4569940:0','19803410:0','16053273:1';





Saturday, 30 April 2016

Multi-threading with PL/SQL code to improve performance

The purpose of this post is to achieve multi-threading within PL/SQL code to improve performance.

Consider this scenario

FOR i IN 1..100
LOOP
    f(i);
END LOOP;

The function f is executed serially, 100 times, one after another. Starting with 11gR2 we have the dbms_parallel_execute utility, which divides the work into smaller chunks so that different sessions (possibly on different RAC instances) execute the function concurrently! Isn't this a cool new feature?

Here is a quick little demo on how to get this working

SQL> create table T1 ( ID NUMBER);

-- Insert 4 rows into T1

SQL> insert into T1 values(1);
SQL> insert into T1 values(2);
SQL> insert into T1 values(3);
SQL> insert into T1 values(4);
SQL> commit;

SQL> create table T2 (A NUMBER);

-- Create procedure F that inserts rows into T2. This is the procedure we will run concurrently.

CREATE OR REPLACE PROCEDURE F ( start_id IN number, end_id IN number)
AS
BEGIN
FOR sql_rec IN (SELECT ID FROM T1 WHERE ID BETWEEN start_id and end_id)
LOOP
INSERT INTO T2 VALUES(9);
END LOOP;
END;
/

-- This is the real code. 

BEGIN 
    
-- Create a parallel execute task   
      
dbms_parallel_execute.create_task('MULTI_THREAD_TASK');

-- Create chunks by number col.  

DBMS_PARALLEL_EXECUTE.create_chunks_by_number_col
                             (task_name => 'MULTI_THREAD_TASK',
                              table_owner => 'VISHAL',
                              table_name => 'T1',
                              table_column => 'ID',
                              chunk_size => 1);

-- Run the task with parallel level 4.

dbms_parallel_execute.run_task
                        ( task_name => 'MULTI_THREAD_TASK',
                          sql_stmt  => 'begin  F( :start_id, :end_id ); end;',
                          language_flag  => DBMS_SQL.NATIVE,
                          parallel_level => 4);

-- Just in case you want to drop the task.                                   

-- dbms_parallel_execute.drop_task('MULTI_THREAD_TASK' );

END;
/

You can query the DBA_PARALLEL_EXECUTE_CHUNKS view to get the job names and then map them to dba_scheduler_job_run_details, which gives you the instance ID and session ID details. By default, Oracle distributes the work uniformly across all the RAC instances.

If you want all the concurrent sessions to run on a single node, create a service that runs only on that specific node, then create a new job class and assign the service to it, and run the task with the job class you just created.
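A hedged sketch of that setup; the service and job class names are made up for illustration, and the run_task call mirrors the one above with its job_class parameter pointed at the new class.

```sql
-- Hypothetical names: svc_node1 (a service started only on the target node)
-- and NODE1_CLASS are illustrative.
EXEC DBMS_SERVICE.CREATE_SERVICE(service_name => 'svc_node1', network_name => 'svc_node1');
-- Start svc_node1 only on the desired instance (e.g. via srvctl on RAC).

EXEC DBMS_SCHEDULER.CREATE_JOB_CLASS(job_class_name => 'NODE1_CLASS', service => 'svc_node1');

BEGIN
  dbms_parallel_execute.run_task(
      task_name      => 'MULTI_THREAD_TASK',
      sql_stmt       => 'begin F( :start_id, :end_id ); end;',
      language_flag  => DBMS_SQL.NATIVE,
      parallel_level => 4,
      job_class      => 'NODE1_CLASS');  -- chunk jobs run under this class
END;
/
```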

Sunday, 10 May 2015

How on earth can a single row insert take an hour to complete ?

I came across this very intriguing issue recently. A customer reported that a single row insert into a cluster table was taking more than an hour to complete.

This post assumes that you have a basic understanding of table clusters. Let's understand how Oracle handles an insert into cluster tables, and see that there is a serious design flaw here.

Here is a small test case to demonstrate the problem. 

Version : 11.2.0.4

Additional Info: the table resides in a non-ASSM managed tablespace (similar issues occur in ASSM managed tablespaces as well)


-- Create a Cluster. No size clause specified. This implies that we can have just one cluster key value per block


SQL> CREATE CLUSTER ABC_C1 (C1 number(5));

-- Create Cluster Index

SQL> CREATE INDEX ABC_C1_IX ON CLUSTER ABC_C1 ;

-- Create table

SQL> create table abc1 (c1 number(5),C2 CHAR(1000), C3 CHAR(1000)) cluster ABC_C1(c1);


-- Now insert 1000 rows (2 times) with cluster key values 1-1000

( ** This PL/SQL block is executed two times ** )


BEGIN
FOR i in 1..1000
LOOP
insert into abc1 values(i,'A','A');
END LOOP;
END;
/

/


SQL> commit;


-- Analyze the table

SQL> ANALYZE TABLE ABC1 COMPUTE STATISTICS;

SQL> SELECT BLOCKS,NUM_FREELIST_BLOCKS FROM USER_TABLES;


BLOCKS             NUM_FREELIST_BLOCKS

----------           ---------------------------
1000                             1000

Now we have 2,000 rows spread over 1000 blocks, each block holding 2 rows for the same cluster key value. Each block has some free space in it because the average row length is ~2K and the block size is 8K. Therefore we have 1000 blocks on the FREELIST.

-- 

SQL> alter system flush buffer_cache;


-- Now insert a new cluster key Value

SQL> set autotrace traceonly exp stat
SQL>
SQL>
SQL> insert into abc1 values (12000,'H','H');

1 row created.


Statistics

----------------------------------------------------------
2 recursive calls
2019 db block gets
4 consistent gets
1021 physical reads
1961796 redo size
839 bytes sent via SQL*Net to client
796 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed

As you can see, a single row insert has done 1021 physical reads via db file sequential read. This is insane! Oracle reads 1021 data blocks for a simple insert. In real-world scenarios, big tables may have millions of blocks on the FREELIST, and this can cause major performance issues.

How does Oracle handle this insert?

To insert a new cluster key, Oracle must find a free block in the segment. To find the free space, the current behaviour is to scan through the ENTIRE free list. In this case, we have 1000 free blocks. Each of these blocks is read via "db file sequential read", but the insert cannot be performed in any of them because we can have only one distinct cluster key value per block. So we have now scanned through 1000 blocks and none of them is eligible for the insert. This is where performance takes a hit. Moreover, Oracle also unlinks all these 1000 blocks from the free list because none of them could accommodate the new insert.

How should Oracle actually handle this?

For a new cluster key, look into the index to figure out whether the value being inserted is a new key. If it is (as in our case), do not bother looking into the free list, because those free blocks are not eligible for the insert anyway. For new cluster key values, Oracle should directly allocate a new block.

Update

Oracle has fixed this issue. 


Bug 18938517 - Single row insert into a cluster table does a lot of "db file sequential read"

This fix is included in 12.2.0.1 and above. For lower versions, you need to request a backport.


This fix is not enabled by default. You need to set event 45051 to enable this fix.


SQL> alter system set events "45051 trace name context forever, level 50";


Level N is the number of blocks that Oracle probes in the free list before allocating a new block. Basically, this fix forces Oracle not to scan the entire free list: it reads the first 50 blocks to see whether any of them is eligible for the insert, and otherwise allocates a new block.


You can choose any level between 1 and 10000.

Friday, 13 February 2015

Join Elimination and potential risk of Wrong Results

Join Elimination is a query transformation feature introduced in 10gR2. The feature lives up to its name: the Optimizer now has the ability to skip a join even if the SQL asks for it.

Let's look at this example. (These tests were done on 12.1.0.2.)

SQL> create table m(a number not null,n number not null, constraint m_pk primary key(n));

Table created.

SQL> create table d(b number not null,n number not null, constraint m_fk foreign key(n) references m(n));

Table created.

M-D has a parent-child relationship. All rows in table D reference exactly one row in table M

Now Consider this query.

SQL>  select d.b from m inner join d on m.n = d.n;

Plan hash value: 3387653791

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|   1 |  TABLE ACCESS FULL| D    |
----------------------------------

In the SQL you explicitly asked for a join (m.n = d.n), but if you look at the execution plan the Optimizer has not executed it. It does a full table scan on table D. This is legitimate: the Optimizer knows that every row in table D references exactly one row in table M, so there is no need to perform the join.
For join elimination to happen certain conditions must be met.

1. There must be a PK-FK relationship
2. SQL must select columns ONLY from the child table.

If you select columns from the parent table as well, JE does not kick in.

SQL>  select m.a, d.b from m inner join d on m.n = d.n;

---------------------------------------------
| Id  | Operation                    | Name |
---------------------------------------------
|   0 | SELECT STATEMENT             |      |
|   1 |  NESTED LOOPS                |      |
|   2 |   NESTED LOOPS               |      |
|   3 |    TABLE ACCESS FULL         | D    |
|   4 |    INDEX UNIQUE SCAN         | M_PK |
|   5 |   TABLE ACCESS BY INDEX ROWID| M    |
---------------------------------------------

A nested loop join is expected here. The join is executed by the Optimizer because you have to access both tables to satisfy the query.

Interestingly, you see two NESTED LOOPS in the execution plan although our query has just one join condition. This is another 11g optimization feature called Nested Loop Join Batching (controlled by the parameter _nlj_batching_enabled). I will talk more about this in some other blog post.

We now talk about certain parameters and constraint attributes that can potentially cause wrong results due to JE.

Consider this example

SQL> create table m(a number not null, n number not null, constraint m_pk primary key(n) rely enable validate);

Table created.

SQL> create table d(b number not null, n number not null, constraint m_fk foreign key(n) references m(n) rely disable novalidate);

Table created.

SQL> insert into d values(1,1);

1 row created.

SQL> commit;

Commit Complete

Note that I am able to insert into table D even though the value '1' does not exist in the parent table M. This is because my foreign key constraint is in the state RELY DISABLE NOVALIDATE.

SQL> select d.b from m inner join d on m.n = d.n;

no rows selected

This is the correct result. The join condition is never met because the value '1' does not exist in table M.

Looking at the execution plan

Plan hash value: 299975660

-----------------------------------
| Id  | Operation          | Name |
-----------------------------------
|   0 | SELECT STATEMENT   |      |
|   1 |  NESTED LOOPS      |      |
|   2 |   TABLE ACCESS FULL| D    |
|   3 |   INDEX UNIQUE SCAN| M_PK |
-----------------------------------

JE does not kick in. This is because we have a RELY NOVALIDATE constraint.

Now alter the value of the parameter query_rewrite_integrity to TRUSTED.

SQL> alter session set query_rewrite_integrity=TRUSTED;

SQL> select d.b from m inner join d on m.n = d.n;

         B
----------
         1
 
( WRONG RESULTS)

Plan hash value: 3387653791

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|   1 |  TABLE ACCESS FULL| D    |
----------------------------------

When query_rewrite_integrity is set to TRUSTED, join elimination kicks in. It does a full table scan on D and the query returns 1 row. This is a wrong result.

CONCLUSION

Join elimination does not kick in with RELY NOVALIDATE constraints when QUERY_REWRITE_INTEGRITY is set to ENFORCED. It does consider RELY NOVALIDATE constraints when QUERY_REWRITE_INTEGRITY is set to TRUSTED or STALE_TOLERATED, and this can cause the query to return wrong results. Therefore, to avoid wrong results, you should be careful in choosing the right value for QUERY_REWRITE_INTEGRITY, especially when you have RELY NOVALIDATE constraints.
