Archive for the ‘TeradataTipInfo’ Category

TeradataTip6 – Let the database recommend which statistics to collect

Tuesday, June 2nd, 2009

Teradata uses statistics to determine the most efficient query plan. Sometimes it is difficult to decide which statistics to collect. Collecting statistics requires a full table scan and you don’t want to do it unnecessarily, but not collecting the required statistics can lead to inefficient queries and increased resource consumption.

The following technique lets Teradata tell you which statistics it could use to improve the query plan for a specific query. To demonstrate, first, we set up some test tables.

CREATE TABLE A(k INTEGER, fk1 INTEGER, fk2 INTEGER)
PRIMARY INDEX(k);
 
CREATE TABLE B(kb1 INTEGER, kb2 INTEGER)
PRIMARY INDEX(kb1);

Now we turn on requesting statistics recommendations like this. The following can be entered as SQL using BTEQ or Teradata SQL Assistant.

diagnostic helpstats on for session;

Then when we ask for an EXPLAIN, we get statistics recommendations after the explain plan text.

EXPLAIN
SELECT * FROM a JOIN b
ON a.fk1 = b.kb1 AND a.fk2 = b.kb2;

Note the recommended COLLECT STATISTICS commands at the end.

Explanation
---------------------------------------------------------------------------
  1) First, we lock a distinct TEST_DB."pseudo table" for read on
     a RowHash to prevent global deadlock for TEST_DB.b.
  2) Next, we lock a distinct TEST_DB."pseudo table" for read on a
     RowHash to prevent global deadlock for TEST_DB.a.
  3) We lock TEST_DB.b for read, and we lock TEST_DB.a for
     read.
  4) We do an all-AMPs RETRIEVE step from TEST_DB.a by way of an
     all-rows scan with a condition of ("(NOT (TEST_DB.a.fk2 IS
     NULL )) AND (NOT (TEST_DB.a.fk1 IS NULL ))") into Spool 2
     (all_amps), which is redistributed by hash code to all AMPs.  Then
     we do a SORT to order Spool 2 by row hash.  The size of Spool 2 is
     estimated with no confidence to be 96 rows.  The estimated time
     for this step is 0.00 seconds.
  5) We do an all-AMPs JOIN step from TEST_DB.b by way of a
     RowHash match scan with a condition of ("NOT (TEST_DB.b.kb1
     IS NULL)"), which is joined to Spool 2 (Last Use) by way of a
     RowHash match scan.  TEST_DB.b and Spool 2 are joined using a
     merge join, with a join condition of ("(fk1 = TEST_DB.b.kb1)
     AND (fk2 = TEST_DB.b.kb2)").  The result goes into Spool 1
     (group_amps), which is built locally on the AMPs.  The size of
     Spool 1 is estimated with no confidence to be 9,216 rows.  The
     estimated time for this step is 0.03 seconds.
  6) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 0.03 seconds.
     BEGIN RECOMMENDED STATS ->
  7) "COLLECT STATISTICS TEST_DB.b COLUMN KB1".  (HighConf)
  8) "COLLECT STATISTICS TEST_DB.b COLUMN (KB2 ,KB1)".  (HighConf)
  9) "COLLECT STATISTICS TEST_DB.b COLUMN KB2".  (HighConf)
 10) "COLLECT STATISTICS TEST_DB.a COLUMN K".  (HighConf)
 11) "COLLECT STATISTICS TEST_DB.a COLUMN (FK2 ,FK1)".  (HighConf)
 12) "COLLECT STATISTICS TEST_DB.a COLUMN FK2".  (HighConf)
 13) "COLLECT STATISTICS TEST_DB.a COLUMN FK1".  (HighConf)
     <- END RECOMMENDED STATS

You can run the COLLECT STATISTICS commands that appear in double-quotes in order to provide Teradata with the data demographic information that it needs for an efficient plan.

TeradataTip5 – How to find join indexes defined on your table

Monday, June 1st, 2009

If you try to drop or mload a table that has a join index defined on it, Teradata will give you an error:

5467:  Cannot drop or MLOAD a table with join or hash indexes. 

To find the join indexes that are defined on your table, run the following query:

SELECT  * FROM dbc.indices 
WHERE indextype='J'
     AND databasename='your_database_name'
     AND tablename='your_table_name'
ORDER BY indexname,columnposition;

TeradataTip4 – convert an integer-based time to a time(0), and convert a DATE,INTEGER to a timestamp(0)

Saturday, May 30th, 2009

Teradata has two types of time types, the old one that is simply an integer format ’99:99:99′ and the new ANSI TIME type. The integer-based one can be used for comparisons but not for time calculations. The new one can be used for everything. This is how you can convert to the new type if you have the old type. Assume you have a table called T with an integer-based column UPDATE_TM like:

CREATE TABLE T
(
   UPDATE_DT DATE FORMAT 'YYYY-MM-DD',
   UPDATE_TM INTEGER FORMAT '99:99:99' )
);

To convert the integer time UPDATE_TM to a TIME:

SELECT
  TIME '00:00:00'
  + CAST(update_tm / 10000 AS INTERVAL HOUR)
  + CAST((update_tm / 100) MOD 100 AS INTERVAL MINUTE)
  + CAST((update_tm MOD 100) AS INTERVAL SECOND)
FROM T

To convert the UPDATE_DT, UPDATE_TM to a timestamp(0):

SELECT 
  CAST(update_dt AS TIMESTAMP(0))
  + CAST(update_tm / 10000 AS INTERVAL HOUR)
  + CAST((update_tm / 100) MOD 100 AS INTERVAL MINUTE)
  + CAST((update_tm MOD 100) AS INTERVAL SECOND)
FROM T

TeradataTip3 – Combine date and time into a timestamp

Thursday, May 28th, 2009
CREATE TABLE T( dt DATE, tm TIME(0));
 
INSERT INTO T VALUES (DATE '2009-05-28', TIME '14:53:45');
 
SELECT dt, tm,
    CAST(  dt AS TIMESTAMP(0)) +
            (tm - TIME '00:00:00' HOUR TO SECOND) ts
 FROM T;
        dt        tm                   ts
----------  --------  -------------------
2009/05/28  14:53:45  2009-05-28 14:53:45

TeradataTip2 – How much space are my tables using?

Wednesday, May 27th, 2009

Use this to find out how much perm space the tables in a specific database are using in megabytes:

SELECT tablename,SUM(currentperm)/1024/1024 MB
FROM dbc.allspace
WHERE databasename='your_database_name'
GROUP BY 1
ORDER BY 2 DESC;

TeradataTip1 – Casting decimal(n) to varchar without extra characters

Tuesday, May 26th, 2009

To convert a decimal(n) value to a simple string containing only digits, use this:

SELECT
  CAST( CAST( d AS format 'Z(I)' ) AS VARCHAR(30) )
FROM T;

Why do you need the double cast? Well, if you try to convert from a column that has a decimal(n) to a numeric string and you simply cast to a varchar, Teradata will add a decimal point at the end of your string.

CREATE TABLE T(d DECIMAL(18));
INSERT INTO T VALUES(42);
INSERT INTO T VALUES(3);
 
SELECT '['||CAST(d AS VARCHAR(20))||']' FROM T;
(('['||d)||']')
----------------------
[3.]
[42.]
SELECT
  '['||CAST(CAST(d AS format 'Z(I)') AS VARCHAR(20)) ||']'
FROM T;
(('['||d)||']')
----------------------
[3]
[42]