Sort-time costs

A sort requires in-memory work as well as disk work. The in-memory work depends on the number of columns that are sorted, the width of the combined sort key, and the number of row combinations that pass the query filter. You can reduce the cost of sorting.

You can use the following formula to calculate the in-memory work that a sort operation requires:

W_m = (c * N_fr)+ (w * N_frlog₂(N_fr))

W_m: is the in-memory work.
c: is the number of columns to order and represents the costs to extract column values from the row and concatenate them into a sort key.
w: is proportional to the width of the combined sort key in bytes and stands for the work to copy or compare one sort key. A numeric value for w depends strongly on the computer hardware in use.
N_fr: is the number of rows that pass the query filter.

Sorting can involve writing information temporarily to disk if the amount of data to sort is large. You can direct the disk writes to occur in the operating-system file space or in a dbspace that the database server manages. For details, see Configure dbspaces for temporary tables and sort files.

The disk work depends on the number of disk pages where rows appear, the number of rows that meet the conditions of the query predicate, the number of rows that can be placed on a sorted page, and the number of merge operations that must be performed. Use the following formula to calculate the disk work that a sort operation requires:

W_d = p + (N_fr/N_rp) * 2 * (m - 1))

W_d: is the disk work.
p: is the number of disk pages.
N_fr: is the number of rows that pass the filters.
N_rp: is the number of rows that can be placed on a page.
m: represents the number of levels of merge that the sort must use.

The factor m depends on the number of sort keys that can be held in memory. If there are no filters, N_fr/N_rp is equivalent to p.

When all the keys can be held in memory, m=1 and the disk work is equivalent to p. In other words, the rows are read and sorted in memory.

For moderate to large tables, rows are sorted in batches that fit in memory, and then the batches are merged. When m=2, the rows are read, sorted, and written in batches. Then the batches are read again and merged, resulting in disk work proportional to the following value:

W_d = p + (2 * (N_fr/N_rp))

The more specific the filters, the fewer the rows that are sorted. As the number of rows increases, and the amount of memory decreases, the amount of disk work increases.

To reduce the cost of sorting, use the following methods:

Make your filters as specific (selective) as possible.
Limit the projection list to the columns that are relevant to your problem.