User-defined statistics in a query
For SQL statements that use user-defined data types, the optimizer can call custom selectivity and cost functions. Selectivity and cost functions might need to use statistics about the nature of the data in a column. When you create the statcollect() function that collects statistics for a UDT, the database server executes this function automatically when a user runs the UPDATE STATISTICS statement with the MEDIUM or HIGH keyword.
The statistics that the database server collects might require a smart large object for storage. The configuration parameter SBSSPACENAME specifies an sbspace for storing this information. If SBSSPACENAME is not set, the database server might not be able to collect the specified statistics.
- Relational-operator functions (lessthan(), ...)
- Boolean built-in operator functions: like(), matches()
- Write a user-defined function to implement the appropriate operator
function.
For user-defined types, these built-in operator functions do not automatically exist. You must write versions of these functions that handle your user-defined type.
- Write a selectivity function for the operator function to provide
the optimizer with a selectivity value. Selectivity and cost functions might need to use statistics about the nature of the data in a column. If you want these selectivity functions to use data distributions, take the following actions:
- Provide user-defined statistics so that the UPDATE STATISTICS statement saves the data distributions in the sysdistrib system catalog table.
- Access the sysdistrib table from within the selectivity function to obtain the data distributions for the column.
For more information about how to write and register selectivity functions, see Write selectivity and cost functions.