Saturday, 26 January 2013

Ad-hoc Queries - How to handle them in the Teradata environment

Ad-hoc queries are often difficult, resource-consuming processes. They are generally fired by business users and used for strategic decision making.

As the enterprise data warehouse grows, handling these queries becomes more costly.

For the Teradata Database administrator, the following options are available:


  1. Understand the Workload mix (using Workload Monitor) and balance the Workload (using Teradata Active Workload Manager)
  2. Increase or decrease the priority of the Query
  3. Monitor the resource consumption of the Query (using Viewpoint)
  4. Grant "Privileged User" permissions to the reporting users
Administrators can refer to the Teradata e-brochure for more details.
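For option 2, one common way to change a running session's priority is to switch its account string, which maps to a performance group under your workload rules. This is a minimal sketch only: the account string '$M_ACCT', the file name, and the logon values are assumptions, not site-specific values.

```shell
# Sketch (assumption: '$M_ACCT' maps to a performance group defined by
# your site's workload rules). Write a small BTEQ command file that
# switches the session's account string, then run it with:
#   bteq < /tmp/set_priority.btq
cat > /tmp/set_priority.btq <<'EOF'
.LOGON DWTEST1/USER1,PWD1;
SET SESSION ACCOUNT = '$M_ACCT' FOR SESSION;
.LOGOFF;
EOF
```

The actual account strings and the performance groups they map to are defined by your DBA team, so check with them before trying this.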

As for the end users, there are many ad-hoc query and reporting tools available.

These tools have a few drawbacks, which you can read about in the article below:
Ad Hoc Query Tools Shootout (this article was written in 1995, yet some of the drawbacks Dr. Paul Dorsey mentioned remain to this day)


Friday, 25 January 2013

Sample UNIX Shell script to Archive and Delete Data files

Most data-warehousing and OLTP systems procure data from flat files.
Storing, archiving, and purging these files is necessary to maintain free space on your file system.

Below is a sample script that will help you with these activities:


$> vi ArchiveScript.ksh

Then, in the vi editor, enter the script shown below:

#! /bin/ksh

#Set Local variables

Date1=`date +%Y%m%d`
echo "$Date1 is Current Date"

#User-defined function to display an error message and abort on failure.
#On success (exit status 0) it simply returns, so the script continues.
ErrorCapture()
{
if [ $1 -ne 0 ]
then
echo "$2"
exit $1
fi
}


#Delete users temporary files

if [ -s /tmp/users.tmp1 ]
then
echo "Deleting temporary file"
rm /tmp/users.tmp1
RC=$?
ErrorCapture $RC "Temporary file removal Failed "
else
echo "Temp file not found"
fi

# Archive users data file

if [  -s /fs/fs01/data/users ]
then
echo "Rename file with datestamp"
if [ -d /fs/fs01/data/SYWmbrVisitArch ]
then
mv /fs/fs01/data/users /fs/fs01/data/SYWmbrVisitArch/users_$Date1
else
echo "Creating directory /fs/fs01/data/SYWmbrVisitArch"
mkdir /fs/fs01/data/SYWmbrVisitArch
mv /fs/fs01/data/users /fs/fs01/data/SYWmbrVisitArch/users_$Date1
fi
RC=$?
ErrorCapture $RC "Renaming User Failed"
else
echo "File users not found in /fs/fs01/data"
fi

#Purge files older than 7 days

find /fs/fs01/data/SYWmbrVisitArch -type f -mtime +7 -exec rm {} \;
RC=$?
ErrorCapture $RC "Purge of datafile users older than 7 days Failed"

exit 0


Step 3 - Save and exit vi by pressing 'Esc', then typing ':wq!' and pressing Enter.
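Before running ArchiveScript.ksh for the first time, make it executable. The fragment below also rehearses the same archive-and-purge pattern against a temporary directory, so you can verify the logic without touching the real /fs/fs01 paths (all paths here are stand-ins):

```shell
# Make the script executable and run it:
#   chmod +x ArchiveScript.ksh
#   ./ArchiveScript.ksh
#
# Self-contained rehearsal of the same archive-and-purge pattern,
# using a temporary directory instead of the real /fs/fs01/data paths:
BASE=$(mktemp -d)
ARCH="$BASE/SYWmbrVisitArch"
Date1=$(date +%Y%m%d)

echo "sample row" > "$BASE/users"              # stand-in data file

mkdir -p "$ARCH"                               # create archive directory
mv "$BASE/users" "$ARCH/users_$Date1"          # archive with datestamp
find "$ARCH" -type f -mtime +7 -exec rm {} \;  # purge files older than 7 days
```

A freshly archived file has a current modification time, so the purge step leaves it alone until it is more than 7 days old.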


Let us know if this was helpful. Subscribe using the button on the right !!

Thursday, 24 January 2013

Multiload UPSERT in Teradata with Example

UPSERT is a DBMS operation in which rows from an incoming delta that match existing rows are updated first, and the remaining new rows in the delta file are inserted.

Multiload is well-equipped to handle such requests.


The following example illustrates how you can use Multiload to perform an UPSERT operation:


Note: This is different from the SQL UPSERT (UPDATE ... ELSE INSERT...)


Sample script:


.LOGTABLE WORK_TBLS.USER_INTERACT_ID_LOG;
.LOGON DWTEST1/USER1,PWD1;
.BEGIN IMPORT MLOAD
TABLES DW_TBLS.USER_INTERACT_ID
WORKTABLES WORK_TBLS.USER_INTERACT_ID_WK
ERRORTABLES WORK_TBLS.USER_INTERACT_ID_ET
WORK_TBLS.USER_INTERACT_ID_UV;

.LAYOUT DATAIN_LAYOUT;
.FIELD INTERACT_DESC 1 VARCHAR(30);
.FIELD INTERACT_TYP * VARCHAR(10);

.DML LABEL UPDATE_DML
DO INSERT FOR MISSING UPDATE ROWS;
UPDATE DW_TBLS.USER_INTERACT_ID
SET
INTERACT_DESC = :INTERACT_DESC
WHERE
INTERACT_TYP = :INTERACT_TYP
;
INSERT INTO DW_TBLS.USER_INTERACT_ID
(
INTERACT_TYP
,INTERACT_DESC
)
values(
:INTERACT_TYP
,:INTERACT_DESC
);
.IMPORT INFILE /tmp/InteractionTypes.csv.tmp1
FORMAT vartext ','
LAYOUT DATAIN_LAYOUT
APPLY UPDATE_DML
;
.END MLOAD;
.LOGOFF;

DO INSERT FOR MISSING UPDATE ROWS - this clause tells Multiload that an UPSERT is to be performed: each input row is applied as an UPDATE first, and if no target row matches, the paired INSERT is applied instead.
Otherwise, every row would qualify for both the UPDATE and the INSERT, and rows whose UPDATE finds no match would land in the _UV table.
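To try the script end to end, you first need the vartext input file. The fields must appear in LAYOUT order, INTERACT_DESC then INTERACT_TYP; the sample values and the script file name below are made up for illustration:

```shell
# Create a sample comma-separated vartext input file; fields must be in
# LAYOUT order: INTERACT_DESC, then INTERACT_TYP (values are made up).
cat > /tmp/InteractionTypes.csv.tmp1 <<'EOF'
Customer phone call,CALL
Customer e-mail,EMAIL
EOF

# Run the UPSERT script (saved here as upsert_users.mld -- the file
# name is an assumption) and check MultiLoad's return code:
#   mload < upsert_users.mld
#   echo "mload return code: $?"
```

On the second run with the same file, the CALL and EMAIL rows already exist, so both are applied as UPDATEs rather than INSERTs.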

Keep reading to learn how to run a Multiload, restart it, and more!!

Subscribe if you like our blog, and help us understand which topics we should cover.


Multiload in Teradata using a Variable-length, comma-separated file

How do you load a table from a comma-separated, variable-length file?


Multiload offers five FORMAT options for loading data:




  • FASTLOAD
  • BINARY
  • TEXT
  • UNFORMAT
  • VARTEXT


The following is a Teradata Multiload example:

    .LOGTABLE WORK_TBLS.USER_INTERACT_ID_LOG;

    .LOGON DWTEST1/USER1,PWD;

    .BEGIN IMPORT MLOAD
    TABLES DW_TBLS.USER_INTERACT_ID
    WORKTABLES WORK_TBLS.USER_INTERACT_ID_WK
    ERRORTABLES WORK_TBLS.USER_INTERACT_ID_ET
    WORK_TBLS.USER_INTERACT_ID_UV;

    .LAYOUT DATAIN_LAYOUT;
    .FIELD INTERACT_DESC 1 VARCHAR(30);
    .FIELD INTERACT_TYP * VARCHAR(10);

    .DML LABEL INSERT_DML;
    INSERT INTO DW_TBLS.USER_INTERACT_ID
    (
    INTERACT_TYP
    ,INTERACT_DESC
    )
    VALUES (
    :INTERACT_TYP
    ,:INTERACT_DESC
    );
    .IMPORT INFILE /tmp/InteractionTypes.csv.tmp1
    FORMAT vartext ','
    LAYOUT DATAIN_LAYOUT
    APPLY INSERT_DML;

    .END MLOAD;

    .LOGOFF;


    Important Points to Note:

    1. The error tables are optional. If you do not specify them, they are created as ET_USER_INTERACT_ID and UV_USER_INTERACT_ID.

    However, if they are specified, they are position-dependent: the ET (acquisition-error) table is listed first and the UV (uniqueness-violation) table second, so the two may get interchanged by mistake.

    Hence, if you write

    ERRORTABLES
    UV_USER_INTERACT_ID
    ET_USER_INTERACT_ID

    This will result in UV_USER_INTERACT_ID storing error rows and ET_USER_INTERACT_ID storing UPI-violation rows.

    In case you need any help, just write your queries in the comments section. Let us know what topics you want covered in upcoming posts!!

    Subscribe with us if you like our Blog!!