Tuesday, 28 August 2012

COLLECT STATs quick reference guide

COLLECT STATS are always done at Column/Multi-column/Index level. Table level stats only refresh the earlier defined column/index level stats. It doesn't mean that it will do collect stats on all columns unless it was earlier done for all columns individually once.


Collect stats just derives the data demographics of the table. These demographics are useful for optimizer to decide the execution of given query which in turn improves performance. Collected statistics are stored in DBC.TVFields or DBC.Indexes tables. However, these two tables cannot be queried .
For Perm tables at column or index level:
COLLECT STATS ON TABLE_NAME COLUMN/INDEX (COLUMN_NAME);

For GT tables:
COLLECT STATS ON TEMPORARY TABLE_NAME COLUMN/INDEX (COLUMN_NAME)

For Table level collect stats refresh:
COLLECT STATS ON TABLE_NAME;


It collects the information like:
  • total row counts of the table,
  • how many distinct values are there in the column,
  • how many rows per value, is the column indexed,
  • if so unique or non unique etc.
To know if collect stats done on table:
HELP STATISTICS TABLE_NAME ;
When to refresh COLLECT STATS:
1.    Typical guideline is  roughly 10% of the data has changed. (By measuring delta in perm space since last collected.) 
2.    Recollect based on stats that have aged 60-90 days. (say last time stats collected was 2 months ago) .
Note:
1. If you delete the table , the old stats still exists as there is no automatic stats updation in Teradata.
2. If you drop the table, then stats will also be dropped

No comments:

Post a Comment