Friday 2 August 2013

Effective design of BI jobs - reducing manual activity while doing retrospective loading in teradata. Using Python to pass date as a parameter to Control-M

Many a times, I come across designers/developers who use "Current_Date" keyword in their queries.
So, their typical queries look like 

Where TransactionDate > Current_Date - 1

Now even though this is a valid Teradata syntax, but that's all.

When you move this code to production and then it fails for a few days, we will have to manually calculate the date-range and run the queries. A moving window will need multiple iterations of manual tasks, especially if we have UPSERT or MERGE statements involved.

Alternative Solution:

We can use Python and Teradata SQL to overcome this problem.

Let us parameterize the date-range in Teradata SQL as :

Where TransactionDate > Current_Date - 1

replaced by

Where TransactionDate > ?ODATE - 1

The below Python script is generic. It can be used to generate date-range files, that be run from a Scheduling tool like Control-M or cronjob.

Copy the below script in a file and name it as ReplaceDate.py

Calling the script:

$>  python ReplaceDate.py ODATE /analytics/advanced/sql/Test.Btq

where ODATE = Date passed from Control-M in format 'yyyy-mm-dd'

Generic Python Script:


#! /usr/bin/python
#--------------------------------------------------------------------------------------#
# Please ensure that you pass 2 parameters                       #
# Sample call : python ImportDate.py 2013-07-25 Test.Btq   #
# Instruction : Ensure that the Test.Btq is in the same         #
#              path as this script                                             #
# Author : Senior Bigdata Engineer                                    #
#------------------------------------------------------------------------------------#
# Call necessary libraries
import re
import sys
import os.path

SysArray = sys.argv
#print (SysArray[1])
print "No. of parameters", (len(SysArray))

Value = "'" + SysArray[1] + "'"
ODATE1 =(Value, '?ODATE')
ParamCount = len(SysArray)
ScriptName = SysArray[2]
ScriptName_tmp = ScriptName + "_tmp"
#print "The temp script is",ScriptName_tmp
if ( ParamCount != 3 ):
print ("Enter a valid Date and BTEQ scriptname")
exit

#----------------------------------------------------#
# This block of code keeps track of the current date #
#----------------------------------------------------#
from time import gmtime,strftime
ODATE2=strftime("%Y-%m-%d",gmtime())
print "ODATE for the run is", ODATE1[0]
print "Current Date is", ODATE2

#------------------------------------------------#
# Define function to replace ODATE in the BTEQ   #
#------------------------------------------------#
if os.path.exists(ScriptName):
print "File " +ScriptName +" exists"
print "Continue with replacement"
else:
print "File " +ScriptName +" does not exist"
sys.exit(1)
#----------------------------------------------------#
# This block creates a list of values to be replaced #
#----------------------------------------------------#

#import re
print "ODATE variable to be replaced", ODATE1[1]
print "Value to be replaced with", ODATE1[0]
sourcefile = open(ScriptName, "r")
runfile = open(ScriptName_tmp, "wb")
#------------------------------------------------------#
# Read each line of sourcefile and replace the pattern #
#------------------------------------------------------#

for Pattern1 in sourcefile:
#print Pattern1.replace(ODATE1[1],ODATE1[0])
runfile.write(Pattern1.replace(ODATE1[1],ODATE1[0]))
#print Pattern1
sourcefile.close()
runfile.close()
#ReplaceOdate(ScriptName)
#import os.path
if os.path.exists(ScriptName_tmp):
print "The temp script is",ScriptName_tmp
sys.exit(0)
else:
print "Please check the process. Temp file not created"
sys.exit(1)

Output of the Script:

Import multiple jobs with different ODATE in the AJF of Control-M. So, each time the ReplaceDate.py script generates files for particular ODATE.
A Test.Btq_tmp will be created where the ?ODATE variable will be replaced by '2013-08-02'

Now run the Test.Btq_tmp file using Control-M in the desired sequence.

Like us if you find the blog helpful. Also, please leave a comment about what else can be improved.




2 comments:

  1. It’s really nice and meaningful. It’s really cool blog. You have really helped lots of people who visit Blog and provide them useful information. Thanks for sharing.

    Python in-house training for employees in Nigeria

    ReplyDelete

Please share your thoughts and let us know the topics you want covered