Friday, 13 February 2015

Join Elimination and potential risk of Wrong Results

Join Elimination is a query transformation feature introduced in 10gR2. This feature lives up to its name, Optimizer now has the ability to not to execute a join even if the SQL asks for it.

Let's look at this example. ( These tests were done on 12.1.0.2)

SQL> create table m(a number not null,n number not null, constraint m_pk primary key(n));

Table created.

SQL> create table d(b number not null,n number not null, constraint m_fk foreign key(n) references m(n));

Table created.

M-D has a parent-child relationship. All rows in table D reference exactly one row in table M

Now Consider this query.

SQL>  select d.b from m inner join d on m.n = d.n;

Plan hash value: 3387653791

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|   1 |  TABLE ACCESS FULL| D    |
- - - - - - - - - - - - - - - - - 

In the SQL - you explicitly asked for a join (m.n = d.n ). But if you look at the execution plan Optimizer has not executed the join. It does a FULL TABLE SCAN on table D. This is legitimate - Optimizer is aware that all rows in table D reference exactly one row in table M. So there is no need to perform a join here.
For join elimination to happen certain conditions must be met.

1. There must be a PK-FK relationship
2. SQL must select columns ONLY from the child table.

If you select columns from parent table as well, JE does not kick in. 

SQL>  select m.a, d.b from m inner join d on m.n = d.n;

---------------------------------------------
| Id  | Operation                    | Name |
---------------------------------------------
|   0 | SELECT STATEMENT             |      |
|   1 |  NESTED LOOPS                |      |
|   2 |   NESTED LOOPS               |      |
|   3 |    TABLE ACCESS FULL         | D    |
|   4 |    INDEX UNIQUE SCAN         | M_PK |
|   5 |   TABLE ACCESS BY INDEX ROWID| M    |
---------------------------------------------

It is expected to see a nested loop join here. Join is executed by the Optimizer because you have to access both the tables to satisfy the query.

Interestingly, you see two NESTED LOOPS in the execution plan but our query has just one join condition.This is another 11g Optimization feature called Nested Loop Join Batching ( Controlled by the parameter_nlj_batching_enabled). Will talk more about this in some other blog post.

We now talk about certain parameters and constraint attributes that can potentially cause wrong results due to JE.

Consider this example

SQL> create table m(a number not null, n number not null, constraint m_pk primary key(n) rely enable validate);

Table created.

SQL> create table d(b number not null, n number not null, constraint m_fk foreign key(n) references m(n) rely disable novalidate);

Table created.

SQL> insert into d values(1,1);

1 row created.

SQL> commit;

Commit Complete

Note that I'm able to insert into table D even though value '1' does not exist in the parent table M. This is because my foreign key constraint is in a state of RELY DISABLE NOVALIDATE

SQL> select d.b from m inner join d on m.n = d.n;

no rows selected

This is correct result. Join condition is never met because value '1' does not exist in table M.

Looking at the execution plan

Plan hash value: 299975660

-----------------------------------
| Id  | Operation          | Name |
-----------------------------------
|   0 | SELECT STATEMENT   |      |
|   1 |  NESTED LOOPS      |      |
|   2 |   TABLE ACCESS FULL| D    |
|   3 |   INDEX UNIQUE SCAN| M_PK |
-----------------------------------

JE elimination does not kick in. This is because we have a RELY NOVALIDATE constraint. 

Now alter the value of the parameter query_rewrite_integrity to TRUSTED.

SQL> alter session set query_rewrite_integrity=TRUSTED;

SQL> select d.b from m inner join d on m.n = d.n;

         B
----------
         1
 
( WRONG RESULTS)

Plan hash value: 3387653791

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|   1 |  TABLE ACCESS FULL| D    |
----------------------------------

When query_rewrite_integrity is set to TRUSTED - Join Elimination kicks in. It does a full table scan on D and the query returns 1 row. This is a wrong result.

CONCLUSION

Join elimination does not kick in with RELY NOVALIDATE constraints when QUERY_REWRITE_INTEGRITY is set to ENFORCED. Join elimination continues to consider RELY NOVALIDATE constraints when QUERY_REWRITE_INTEGRITY is set to TRUSTED or STALE_TOLERATED.This can cause the query to return wrong results. Therefore to avoid wrong results, you should be careful in choosing the right value for the parameter QUERY_REWRITE_INTEGRITY specially when you have RELY NOVALIDATE constraints.

No comments:

Post a Comment

19c Multitenant RAC- Default service disables parallelism

We recently upgraded one of our RAC databases from 11.2 to 19.10 on Exadata. And to everyone's surprise, almost everything on the system...