Use join filters and post-join filters
The database server provides support for using a subset of the ANSI join syntax.
- ON keyword to specify the join condition and any optional join filters
- LEFT OUTER JOIN keywords to specify which table is the dominant table (also referred to as outer table)
For more information about this ANSI join syntax, see the HCL Informix® Guide to SQL: Syntax.
- Applies the join condition in the ON clause to determine which rows of the subordinate table (also referred to as inner table) to join to the outer table
- Applies optional join filters in the ON clause before and during
the join
If you specify a join filter on a base inner table in the ON clause, the database server can apply it prior to the join, during the scan of the data from the inner table. Filters on a base subordinate table in the ON clause can provide the following additional performance benefits:
- Fewer rows to scan from the inner table prior to the join
- Use of index to retrieve rows from the inner table prior to the join
- Fewer rows to join
- Fewer rows to evaluate for filters in the WHERE clause
For information about what occurs when you specify a join filter on an outer table in the ON clause, see the HCL Informix Guide to SQL: Syntax.
- Applies filters in the WHERE clause after the join
Filters in the WHERE clause can reduce the number of rows that the database server needs to scan and reduce the number of rows returned to the user.
The term post-join filters refers to these WHERE clause filters.
When distributed queries that use ANSI-compliant LEFT OUTER syntax for specifying joined tables and nested loop joins are executed, the query is sent to each participating database server for operations on local tables of those servers.
For example, the demonstration database has the customer table and the cust_calls table, which tracks customer calls to the service department. Suppose a certain call code had many occurrences in the past, and you want to see if calls of this kind have decreased. To see if customers no longer have this call code, use an outer join to list all customers.
QUERY:
------
SELECT c.customer_num, c.lname, c.company,
c.phone, u.call_dtime, u.call_code, u.call_descr
FROM customer c
LEFT JOIN cust_calls u ON c.customer_num = u.customer_num
ORDER BY u.call_dtime
Estimated Cost: 14
Estimated # of Rows Returned: 29
Temporary Files Required For: Order By
1) virginia.c: SEQUENTIAL SCAN
2) virginia.u: INDEX PATH
(1) Index Keys: customer_num call_dtime (Serial, fragments: ALL)
Lower Index Filter: virginia.c.customer_num = virginia.u.customer_num
ON-Filters:virginia.c.customer_num = virginia.u.customer_num
NESTED LOOP JOIN(LEFT OUTER JOIN)
- The ON-Filters: line lists the join condition that was specified in the ON clause.
- The last line of the SET EXPLAIN ON output shows all three keywords (LEFT OUTER JOIN) for the ANSI join even though this query specifies only the LEFT JOIN keywords in the FROM clause. The OUTER keyword is optional.
QUERY:
------
SELECT c.customer_num, c.lname, c.company,
c.phone, u.call_dtime, u.call_code, u.call_descr
FROM customer c LEFT JOIN cust_calls u
ON c.customer_num = u.customer_num
AND u.call_code = 'I'
ORDER BY u.call_dtime
Estimated Cost: 13
Estimated # of Rows Returned: 25
Temporary Files Required For: Order By
1) virginia.c: SEQUENTIAL SCAN
2) virginia.u: INDEX PATH
Filters: virginia.u.call_code = 'I'
(1) Index Keys: customer_num call_dtime (Serial, fragments: ALL)
Lower Index Filter: virginia.c.customer_num = virginia.u.customer_num
ON-Filters:(virginia.c.customer_num = virginia.u.customer_num
AND virginia.u.call_code = 'I' )
NESTED LOOP JOIN(LEFT OUTER JOIN)
- The optimizer chooses a different index to scan the inner table.
This new index exploits more filters and retrieves a smaller number of rows. Consequently, the join operates on fewer rows.
- The ON clause join filter contains an additional filter.
The value in the Estimated # of Rows Returned line is only an estimate and does not always reflect the actual number of rows returned. The sample query in Figure 2 returns fewer rows than the query in Figure 1 because of the additional filter.
QUERY:
------
SELECT c.customer_num, c.lname, c.company,
c.phone, u.call_dtime, u.call_code, u.call_descr
FROM customer c LEFT JOIN cust_calls u
ON c.customer_num = u.customer_num
AND u.call_code = 'I'
WHERE c.zipcode = "94040"
ORDER BY u.call_dtime
Estimated Cost: 3
Estimated # of Rows Returned: 1
Temporary Files Required For: Order By
1) virginia.c: INDEX PATH
(1) Index Keys: zipcode (Serial, fragments: ALL)
Lower Index Filter: virginia.c.zipcode = '94040'
2) virginia.u: INDEX PATH
Filters: virginia.u.call_code = 'I'
(1) Index Keys: customer_num call_dtime (Serial, fragments: ALL)
Lower Index Filter: virginia.c.customer_num = virginia.u.customer_num
ON-Filters:(virginia.c.customer_num = virginia.u.customer_num
AND virginia.u.call_code = 'I' )
NESTED LOOP JOIN(LEFT OUTER JOIN)
PostJoin-Filters:virginia.c.zipcode = '94040'