Tylogix - iSeries SQL Tips and Techniques |
On how to find duplicate records on iSeries using SQL, James from Kentucky asked :
I am trying to identify duplicate key fields within a file. What is the best way to do this?
Answer:
Here is an easy to follow technique.
Assuming a table TABLE_X with column_1, column_2, and column_3, and assuming the aim is to find all columns who happen to have exact duplicate combination of data for these three columns: You would key, in the SQL command line the following command:
SELECT COLUMN_1, COLUMN_2, COLUMN_3, COUNT(*)
FROM TABLE_X
GROUP BY COLUMN_1, COLUMN_2, COLUMN_3
Go Back to the Top |
Technique to Remove Duplicates from a Table
To remove duplicates from a table, assuming the table is keyed and that only the greatest key value is the one to keep, this is the technique:
DELETE FROM SAMPLETABLE
WHERE ID_NUMBER <
(
SELECT MAX(ID_NUMBER)
FROM SAMPLETABLE MAXVALUES
WHERE MAXVALUES.NAME = SAMPLETABLE.NAME
)
An other method, using the Relative Record Number (RRN, equivalent of Row ID in Oracle) is as follows:
DELETE FROM FILENAMEX H1
WHERE RRN(H1) <
(
SELECT MAX( RRN(H2) ) FROM FILENAMEX H2
WHERE H1.TRANSACTION_ID = H2.TRANSACTION_ID
)
Go Back to the Top |
Technique to extract ONLY n Rows from a table
To limit the amount of results that a query may retrieve, the “FETCH FIRST N ROWS ONLY” is a useful technique:
SELECT * FROM TABLE_NAME
FETCH FIRST 10 ROWS ONLY
The last clause ensures that only 10 rows are fetched.
Go Back to the Top |
On how to manipulate data on a remote system, Dan from California wrote:
I work with for a company who owns several systems, amongst which there are AS/400. I can FTP from/to these systems but no DDM is allowed. Would there be a way to manipulate data on a remote system using FTP?
Answer:
To reach a remote site with SQL, you need to first connect and ensure the remote site is in the system directory. This is explained in detail in the Tylogix Remote SQL Technique page
The best answer is perhaps not using FTP, even if a program can be remotely triggered on a remote system using the FTP instruction "QUOTE" - (more on this in the FTP Q&A Page).
Go Back to the Top |
This SQL tip is the formula to find a job name in IBM's Job Scheduler, based on the contents of its command field:
This SQL statement will find any JS job which has the string '583550'
imbedded in the command line.
You can substitute '583550' with anything such as 'PGMABC' for example,
and find which JS Job runs that command.
SELECT T1.JMJOB, T1.JMGRP, T1.JMSEQ, T1.JMTEXT, T2.CMCSEQ, T2.CMCMD
FROM QUSRIJS/QAIJSMST T1
JOIN QUSRIJS/QAIJS1CM T2 ON
(T1.JMJOB = T2.CMJOB AND T1.JMGRP = T2.CMGRP
AND T1.JMSEQ = T2.CMSEQ)
WHERE T2.CMCMD like '%583550%'
ORDER BY T1.JMGRP, T1.JMSEQ, T2.CMCSEQ
Go Back to the Top
|
This SQL tip is the formula to find jobs that fail most frequently over time in IBM's Job Scheduler:
This query shows JS failures from the JS history file.
This is a method to look for areas for improvement by bringing scheduled jobs that fail frequently over time.
SELECT JMGRP,JMJOB, JMSEQ, COUNT(*) FROM QUSRIJS/QAIJSHST WHERE
SUBSTR(DIGITS((1900*10000)+JMLRD),7,2) >=
SUBSTR(DIGITS(YEAR(CURDATE() )-1),9,2) AND (JMLCC = 'A' OR JMLCC =
'G' OR JMLCC = 'T' OR JMLCC = 'L') AND (JMAPPL = 'MI' OR JMAPPL =
'DW') GROUP BY JMGRP, JMJOB , JMSEQ ORDER BY 4 DESC
Go Back to the Top |
SQL Join Basics
Joining data from more than one table
Sometimes the information you want to see is not in a single table. To form a row of the result table, you might want to retrieve some column values from one table and some column values from another table. You can retrieve and join column values from two or more tables into a single row.
Several different types of joins are supported by DB2 UDB for iSeries: inner join, left outer join, right outer join, left exception join, right exception join, and cross join.
- An Inner Join returns only the rows from each table that have matching values in the join columns. Any rows that do not have a match between the tables will not appear in the result table.
- A Left Outer Join returns values for all of the rows from the first table (the table on the left) and the values from the second table for the rows that match. Any rows that do not have a match in the second table will return the null value for all columns from the second table. Nulls can be monitored and be defaulted with an
IFNULL. For example: IFNULL(TABLEX.OBSERVATION_HR , 0) will replace the value of TABLEX.OBSERVATION_HR with a 0 if a NULL was detected.
- A Right Outer Join return values for all of the rows from the second table (the table on the right) and the values from the first table for the rows that match. Any rows that do not have a match in the first table will return the null value for all columns from the first table.
- A Left Exception Join returns only the rows from the left table that do not have a match in the right table. Columns in the result table that come from the right table have the null value.
- A Right Exception Join returns only the rows from the right table that do not have a match in the left table. Columns in the result table that come from the left table have the null value.
- A Cross Join returns a row in the result table for each combination of rows from the tables being joined (a Cartesian Product).
- A Simulating a Full Outer Join like the left and right outer joins, a full outer join returns matching rows from both tables. However, a full outer join also returns non-matching rows from both tables; left and right.
- Using Multiple Join Types in One Statement There are times when more than two tables need to be joined to produce the desired result. This is an example of how to proceed.
You can simulate a Full Outer Join using a Left Outer join and a Right Exception Join. See Simulating a Full Outer Join for details. Additionally, you can use multiple join types in a single statement. See Multiple join types in one statement for details.
Notes on joins
When you join two or more tables:
- If there are common column names, you must qualify each common name with the name of the table (or a correlation name). Column names that are unique do not need to be qualified.
- If you do not list the column names you want, but instead use SELECT *, SQL returns rows that consist of all the columns of the first table, followed by all the columns of the second table, and so on.
Go Back to the Top |
Casting Data Types with SQL
Sometimes, it may be convenient to compare data which is stored in different data types. This is where casting comes in:
To cast alpha data as numeric, use the following formula. Note that this example also has a substring built-in:
SELECT @iNumValue = INT( SUBSTRING(CharColumnName, StartChar, Length)) from filename
To cast numeric data as alpha, use the following formula:
SELECT @icharValue = CAST( IntegerColumn, as char(5)) from filename
(5 in this example is the length of the character string to convert to)
Sometimes the information you want to see is not in a single table. To form a row of the result table, you might want to retrieve some column values from one table and some column values from another table. You can retrieve and join column values from
An other useful data manipulator I like is the SQL keyword STRIP, which takes only the non-blank characters out of a character column. This is especially useful if you are concatenating values like FIRST_NAME and LAST_NAME. If each of these columns have 50 charaters, you end up with something that does not look very readable.
Using STRIP(FIRST_NAME) || ' ' || STRIP(LAST_NAME), you can get something relatively normal to read.
Go Back to the Top
|
Techniques to convert JD Edwards Julian Dates to MDY and Back using SQL
JD Edwards stores dates in a particular format:
- first digit is a 0 for dates prior to Y2K and 1 for dates after Y2K
- second and third digits are the last two numbers of a year i.e. for 2003 this value will be "03"
- third, fourth and fifth digits are the julian day (nth day of the year).
While this works well for a computer, it is not very readable by people. This can be easily changed for reading and reporting with SQL by using the following SQL formula, here is an example that can be used interactively with SQL: SELECT date(char(1900000+sdupmj)) from f4211
-------------------------------------------------------------------------------------------
Here is an SQL function to convert a JDE Julian Date to a MDY date:
Calling example:
SELECT XJDETOMDY(105144) FROM SYSIBM/SYSDUMMY1
SQL Code:
CREATE FUNCTION XJDETOMDY (IN_JDE_DATE INT)
RETURNS DATE
LANGUAGE SQL
BEGIN
DECLARE OUT_YMD DATE ;
SET OUT_YMD = DATE(CHAR(1900000+IN_JDE_DATE)) ;
RETURN (OUT_YMD) ;
END
For reference and comparison, in Excel, the Julian to Gregorian conversion code is as follows:
A B
====== ===========
1 Julian Gregorian
2 108100 04/09/2008
The contents of the cells A2 and B2 are:
108100 =DATE(1900+INT(A2/1000),1,MOD(A2,1000))
Here is an SQL function to convert a MDY Date to a JDE Date
Calling example:
SELECT XMDYTOJDE(DATE('05/24/05')) FROM SYSIBM/SYSDUMMY1
SQL Code:
CREATE FUNCTION XMDYTOJDE (IN_YMD_DATE DATE)
RETURNS INT
LANGUAGE SQL
BEGIN
DECLARE OUT_JDE_DATE INT ;
DECLARE OUT_JDE_PART1 CHAR(1) ;
DECLARE OUT_JDE_PART2 CHAR(2) ;
DECLARE OUT_JDE_PART3I INT ;
DECLARE OUT_JDE_PART3C CHAR(3) ;
CASE
WHEN IN_YMD_DATE < DATE('01/01/2000')
THEN SET OUT_JDE_PART1 = '0' ;
ELSE
SET OUT_JDE_PART1 = '1' ;
END CASE ;
SET OUT_JDE_PART2 = SUBSTR(CHAR(IN_YMD_DATE), 7,2) ;
SET OUT_JDE_PART3I = DAYS(IN_YMD_DATE) -
DAYS( DATE('01/01/' || OUT_JDE_PART2 ) )Ư ;
CASE
WHEN OUT_JDE_PART3I < 10
THEN SET OUT_JDE_PART3C = '00' || CHAR(OUT_JDE_PART3I) ;
WHEN OUT_JDE_PART3I < 100
THEN SET OUT_JDE_PART3C = '0' || CHAR(OUT_JDE_PART3I) ;
ELSE
SET OUT_JDE_PART3C = CHAR(OUT_JDE_PART3I) ;
END CASE ;
SET OUT_JDE_DATE = INT(OUT_JDE_PART1 || OUT_JDE_PART2 || OUT_JDE_PART3C ) ;
RETURN (OUT_JDE_DATE) ;
END
To do the reverse, in effect to convert a conventional Gregorian date to the "JDE" Julian format, The following SQL code will do the trick:
CASE WHEN SUBSTR( CHAR( YEAR( DATE('2009-03-19') ) ) , 1, 1) = '1'
THEN '0' ELSE '1' END ||
SUBSTR( CHAR( YEAR(DATE('2009-03-19')) * 1000
+ DAYOFYEAR(DATE('2009-03-19')) ) , 3, 5)
|
What is the DB2 equivalent of Oracle's DUAL?
The name of the file on DB2 is sysibm/sysdummy1
A bit less handy than "DUAL", but it works the same. Here are two examples with dates.
SELECT char( DATE(timestamp( '2005-12-31-23.59.59.000000'))) FROM sysibm/sysdummy1
12/31/05
SELECT char( DATE(timestamp( '2005-12-31-23.59.59.000000') +7 days )) FROM sysibm/sysdummy1
01/07/06
Go Back to the Top
|
What are the SQL data type equivalents in C, RPG and COBOL?
The data type equivalence tables can be found in the following pages:
SQL and C Types
SQL and RPG Types
SQL and COBOL Types
Go Back to the Top
|
Defining an iSeries table with long field names and 10-character field references
Here is an example:
CREATE TABLE TESTPF
(
DOCUMENT_COMPANY FOR GLKCO CHAR(10) ,
DOCUMENT_TYPE FOR GLDCT CHAR(2) ,
DOCUMENT_NUMBER FOR GLDOC DEC(8,0)
);
--
LABEL ON TESTPF (DOCUMENT_COMPANY TEXT IS
'Document Company ');
LABEL ON TESTPF (DOCUMENT_TYPE TEXT IS
'Document Type ');
LABEL ON TESTPF (DOCUMENT_NUMBER TEXT IS
'Document Number ');
--
LABEL ON TABLE TESTPF IS 'Example Table';
Go Back to the Top |
How to run an SQL Statement from a CL Program
1) First Enter your SQL Statement in a source member
2) Use the command RUNSQLSTM with the member name
3) An other solution may be to implement and use the EXCSQL command, authored by Dan Riehl, posted on this site.
Go Back to the Top
|
Creating an Integrity Check between two Tables Using SQL
This code shows how two files can be scanned for differences using SQL.
The equivalent can be achived with CMPPFM, but the SQL code is portable.
------------------------------------------------------------------------------
-- Delete ALL rows from Integrity Result Table INTRESULT
------------------------------------------------------------------------------
DELETE FROM INTRESULT ;
------------------------------------------------------------------------------
-- Insert the rows that are different.
------------------------------------------------------------------------------
INSERT INTO INTRESULT
(
CAT_CODE_2 ,
COIND ,
OBJECT_ACCOUNT ,
RECORD_COUNT_DIFFERENCE ,
SUM_DIFFERENCE ,
COMMENT
)
SELECT
BASETABLE.CAT_CODE_2 ,
BASETABLE.COIND ,
BASETABLE.OBJECT_ACCOUNT ,
BASETABLE.SOURCE_RECORD_COUNT - NEWTABLE.TARGET_RECORD_COUNT ,
BASETABLE.SOURCE_SUM - NEWTABLE.TARGET_SUM ,
'Differences between Base Table and Source and New Table Totals'
FROM BASETABLE
JOIN NEWTABLE
ON BASETABLE.CAT_CODE_2 = NEWTABLE.CAT_CODE_2
AND BASETABLE.COIND = NEWTABLE.COIND
AND BASETABLE.OBJECT_ACCOUNT = NEWTABLE.OBJECT_ACCOUNT
where BASETABLE.SOURCE_RECORD_COUNT <> NEWTABLE.TARGET_RECORD_COUNT
or BASETABLE.SOURCE_SUM <> NEWTABLE.TARGET_SUM ;
------------------------------------------------------------------------------
-- Insert the rows that are in BASETABLE but are not in NEWTABLE
------------------------------------------------------------------------------
INSERT INTO INTRESULT
(
CAT_CODE_2 ,
COIND ,
OBJECT_ACCOUNT ,
RECORD_COUNT_DIFFERENCE ,
SUM_DIFFERENCE ,
COMMENT
)
SELECT
BASETABLE.CAT_CODE_2 ,
BASETABLE.COIND ,
BASETABLE.OBJECT_ACCOUNT ,
BASETABLE.SOURCE_RECORD_COUNT ,
BASETABLE.SOURCE_SUM ,
'Records in BASETABLE (Source) but not in NEWTABLE(Target)'
FROM BASETABLE
LEFT EXCEPTION JOIN NEWTABLE
ON BASETABLE.CAT_CODE_2 = NEWTABLE.CAT_CODE_2
AND BASETABLE.COIND = NEWTABLE.COIND
AND BASETABLE.OBJECT_ACCOUNT = NEWTABLE.OBJECT_ACCOUNT ;
------------------------------------------------------------------------------
-- Insert the rows that are in NEWTABLE but are not in BASETABLE
------------------------------------------------------------------------------
INSERT INTO INTRESULT
(
CAT_CODE_2 ,
COIND ,
OBJECT_ACCOUNT ,
RECORD_COUNT_DIFFERENCE ,
SUM_DIFFERENCE ,
COMMENT
)
SELECT
NEWTABLE.CAT_CODE_2 ,
NEWTABLE.COIND ,
NEWTABLE.OBJECT_ACCOUNT ,
NEWTABLE.TARGET_RECORD_COUNT ,
NEWTABLE.TARGET_SUM ,
'Records in NEWTABLE (Data Warehouse summary) but not in BASETABLE (Source)'
FROM NEWTABLE
LEFT EXCEPTION JOIN BASETABLE
ON NEWTABLE.CAT_CODE_2 = BASETABLE.CAT_CODE_2
AND NEWTABLE.COIND = BASETABLE.COIND
AND NEWTABLE.OBJECT_ACCOUNT = BASETABLE.OBJECT_ACCOUNT ;
Go Back to the Top
|
Example of a simple text parsing Stored Procedure
This link points to the source for a simple text parser, created as an SQL Stored Procedure.
The challenge was to get item description data that was relatively well structured, but in Microsoft Excel format, into a table.
There was more than one description sentences per item and we needed to have each item description sentence split in discrete rows.
It all sounds simple, but we had to
- replace manual line feeds with a special stand-in character (we chose "~") to avoid un-wanted line breaks during FTP transfers (we had to use MS Word for that)
- FTP the data
- Parse out the data so that it could be accepted in that particular format at the other end.
The procedure uses two nested cursors. Here is the link to the Text-Parsing Stored Procedure Example.
Go Back to the Top
|
Debugging an SQL Stored Procedure
Debugging a Stored Procedure is very much like debugging any other ILE C program:
Consider a simple SQL Stored Procedure that uses a cursor to update each row in a table with a unique time stamp (The example code is supplied below). The way to create the procedure in a way that will allow it to be debugged is as follows:
RUNSQLSTM SRCFILE(OBJLIB/PROCEDURES) SRCMBR(UPDT_TIMEz) DBGVIEW(*SOURCE)
Note: DBGVIEW(*SOURCE) will step through the SQL code that you wrote. Easier to debug that way.
DBGVIEW(*LIST) will step though the ILE C program generated by the system to wrap around your SQL code. Interesting to visit but harder to debug.
To start the debugging process, use the following expression:
STRDBG PGM(OBJLIB/UPDT_TIMEZ) UPDPROD(*YES) OPMSRC(*YES) DSPMODSRC(*YES)
This will bring up the source of the C program created by the compiler to run the SQL procedure. Hit F10 without entering anything else.
Now, start SQL with the command STRSQL
Within the SQL command line, call the stored procedure, exactly as if you were calling it from a normal command line. The syntax is the same. Note however that if you called a stored procedure directly from the command line, it would crash. Stored procedures have to be called either from an STRSQL screen, or from a member executed with a STRSQLSTM or even from a QMQRY member.
CALL OBJLIB/SAMPLEPGM
Now, the program will start in debug mode and you can step (F10) through it, like any other C program. In addition, if you look at your joblog, you will see what choices the SQL Optimizer has made. (see the SQL Optimizer section in this page for the details)
Here is a sample procedure, using a cursor. It updates each record with a distinct timestamp.
--------------------------------------------------------------------------------
-- TEST SQL PROCEDURE
--------------------------------------------------------------------------------
CREATE PROCEDURE UPDT_TIMEZ
LANGUAGE SQL
-- START PROCEDURE
BEGIN
-- DECLARE CURSOR VARIABLES
DECLARE PUBLISH_TMS TIMESTAMP ;
DECLARE WORK_TIMESTAMP TIMESTAMP ;
DECLARE SQLSTATE CHAR(5) DEFAULT '00000' ;
DECLARE AT_END INT DEFAULT 0 ;
DECLARE SQLCODE INT DEFAULT 0 ;
DECLARE CURSOR_UPD CURSOR FOR
SELECT PUBLISH_TMS FROM ER400SX MAIN;
SET AT_END = 0;
OPEN CURSOR_UPD ;
FETCH_LOOP:
LOOP
FETCH CURSOR_UPD INTO WORK_TIMESTAMP ;
IF SQLCODE = 0 AND SQLSTATE = '00000' THEN
UPDATE ER400SX
SET PUBLISH_TMS = CURRENT TIMESTAMP, TIME_ELAPSED = DAY(CURRENT_TIME_STAMP – WORK_TIMESTAMP)
WHERE CURRENT OF CURSOR_UPD ;
ELSE
SET AT_END = 1 ;
LEAVE FETCH_LOOP ;
END IF ;
END LOOP ;
CLOSE CURSOR_UPD ;
END
Go Back to the Top |
Debugging Stored Procedures: SQLCODE, Variables and DIAGNOSTICS
The variable SQLCODE is actually a sub-field of a data structure named SQLCA.
Here are the steps to debug it:
1) Start your debugger with the following instruction:
STRDBG PGM(PGMNAME) UPDPROD(*YES)
The debugger will then show the source.
Hit F10.
The debugger will leave the source instantly.
2) Call your program:
CALL PGM(PGMNAME)
- The debugger will now start for real. Hit
- F10 to step through a function
- F22 to Step INTO a function and see each line or component going through
3)To see the value of SQLCODE in V5R2, use the following command:
EVAL sqlca (Make sure that "sqlca" is in lower case)
This instruction will display the entire SQLCA data structure, SQLCODE is actually a sub-field of the SQLCA data structure. To interpret the code, go to the IBM SQL Website.Within that page is a search field. Search for the string "SQL messages and codes". A search result box will appear. Hit the first link, "SQL messages and codes". Within the resulting page, look for the link saying "SQL Message Finder" Hit that link. This is a good place to start investigating.
4) Getting Diagnostic information from SQL
DECLARE SQLCODE INT DEFAULT 0 ;
DECLARE SQL_EXCEPTION INT ;
DECLARE RECORD_COUNT INT DEFAULT 0 ;
DECLARE CONTINUE HANDLER FOR SQLEXCEPTION SET SQL_EXCEPTION = 1 ;
GET DIAGNOSTICS RECORD_COUNT = ROW_COUNT ;
SET INSERT_COUNT_CM999 = INSERT_COUNT_CM999 + RECORD_COUNT ;
To get all the possible debugging and diagnostic values available through the GET DIAGNOSTIC SQL verb, go to : http://publib.boulder.ibm.com/infocenter/iseries/v5r3/ic2924/index.htm?info/db2/rbafzmstincludesqlca.htm
5)To see the value of a character string in a C/400 debug session, use the following formula:
EVAL *variablename :s 12 (where 12 is the length of the string you wish to see)
6)To see the value of ALL LOCAL VARIABLES in your C/400 debug session, use the following formula:
EVAL %LOCALVARS
Go Back to the Top |
Deleting Duplicate Records from a Table Using SQL
A simple Correlated Delete Example Using SQL. The value of "select 1" in this situation is really there to verify that the "where exist" condition is true.
DELETE FROM MAIN_TABLE MAIN
WHERE EXISTS
(SELECT 1 FROM UPDATE_TABLE UPDT
WHERE MAIN.KEY_ID=UPDT.KEY_ID)
A good way to test the number of rows to be affected by this delete would be to replace the first line with
SELECT COUNT(1)FROM MAIN_TABLE MAIN
In most cases using 1 will produce the same result as using *. The only time it may differ is the situation where all fields in a record are NULL. A better illustration of the utility of using a 1 is to compare the results with a query that uses a specific field that sometimes contains NULL values.
A Correlated Delete Example Using other conditions with the correlation
DELETE FROM CONTACT_TABLE AD1
WHERE AD1.ID_NUMBER <
(
SELECT MAX(AD2.ID_NUMBER)
FROM CONTACT_TABLE AD2
WHERE (
AD1.ADDRESS_1 = AD2.ADDRESS_1 AND
AD1.ADDRESS_2 = AD2.ADDRESS_2 AND
AD1.ADDRESS_3 = AD2.ADDRESS_3
)
Go Back to the Top |
Joining with Unreliable Numeric Data Stored in a Character Variable
Up until now (V5R3), SQL on the iSeries has no built-in method to determine if a character column contains valid numeric data. Here is a simple way to do this, using the keyword LOCATE. LOCATE determines the position of a string within an other string. If it is not found a zero is returned. Using this technique, one can compare each individual position of the column and figure out if the data is numeric or not, and in the latter case assing a default numeric value. The result of this case can be used to do a reliable join with a numeric key.
Example 1 - strictly numeric data is acceptable:
LEFT OUTER JOIN FAMRFCUS FCUS ON
CASE
WHEN LOCATE(SUBSTR(WEBLOG_RAW_DATA , 22, 1), '0123456789') = 0
OR LOCATE(SUBSTR(WEBLOG_RAW_DATA , 23, 1), '0123456789') = 0
OR LOCATE(SUBSTR(WEBLOG_RAW_DATA , 24, 1), '0123456789') = 0
OR LOCATE(SUBSTR(WEBLOG_RAW_DATA , 25, 1), '0123456789') = 0
OR LOCATE(SUBSTR(WEBLOG_RAW_DATA , 26, 1), '0123456789') = 0
OR LOCATE(SUBSTR(WEBLOG_RAW_DATA , 27, 1), '0123456789') = 0
OR LOCATE(SUBSTR(WEBLOG_RAW_DATA , 28, 1), '0123456789') = 0
OR LOCATE(SUBSTR(WEBLOG_RAW_DATA , 29, 1), '0123456789') = 0
OR LOCATE(SUBSTR(WEBLOG_RAW_DATA , 30, 1), '0123456789') = 0
THEN 0
ELSE DEC( SUBSTR(WEBLOG_RAW_DATA, 22, 9) )
END
= FCUS.RFID_CUSTOMER_ID
Example 2 - Allows for blanks :
SELECT * FROM FILEAA AA
LEFT OUTER JOIN FILEBB BB
ON
CASE WHEN
(
LOCATE(SUBSTR(AA.PURCHASE_ORDER_NUMBER,1, 1),' 0123456789')<>0
AND LOCATE(SUBSTR(AA.PURCHASE_ORDER_NUMBER,2, 1),' 0123456789')<>0
AND LOCATE(SUBSTR(AA.PURCHASE_ORDER_NUMBER,3, 1),' 0123456789')<>0
AND LOCATE(SUBSTR(AA.PURCHASE_ORDER_NUMBER,4, 1),' 0123456789')<>0
AND LOCATE(SUBSTR(AA.PURCHASE_ORDER_NUMBER,5, 1),' 0123456789')<>0
AND LOCATE(SUBSTR(AA.PURCHASE_ORDER_NUMBER,6, 1),' 0123456789')<>0
AND LOCATE(SUBSTR(AA.PURCHASE_ORDER_NUMBER,7, 1),' 0123456789')<>0
AND LOCATE(SUBSTR(AA.PURCHASE_ORDER_NUMBER,8, 1),' 0123456789')<>0
)
THEN INT(AA.PURCHASE_ORDER_NUMBER)
ELSE 0
END
= BB.NUMERIC_PO_NUMBER
Go Back to the Top |
DB2 SQLCaveat! Left Outer Join with a Where clause on the Outer File Causes Inner Join
This is a "DB2 FEATURE": When using Left Outer Join with a WHERE on the outer joined table, the join actually turns itself into an INNER JOIN. Here is an example of this "feature":
This is the left table:
==============================================================================================
SELECT * FROM LEFT_TABLE
LEFT_PK RIGHT_FK DESC
1 3 left row 1
2 4 left row 2
3 0 left row 3
** End of data ********
This is the right table:
==============================================================================================
SELECT * FROM RIGHT_TABLE
RIGHT_PK DESC
1 right row 1
2 right row 2
3 right row 3
4 right row 4
*** End of data ******
This is a select from the Right table that yields 100% of the results
==============================================================================================
SELECT * FROM RIGHT_TABLE
WHERE RIGHT_TABLE.DESC != 'HELLO'
RIGHT_PK DESC
1 right row 1
2 right row 2
3 right row 3
4 right row 4
*** End of data ******
Query that produces a proper left outer join - note the nulls on the third row result:
==============================================================================================
SELECT * FROM LEFT_TABLE LEFT OUTER JOIN RIGHT_TABLE
ON RIGHT_FK = RIGHT_PK
LEFT_PK RIGHT_FK DESC RIGHT_PK DESC
1 3 left row 1 3 right row 3
2 4 left row 2 4 right row 4
3 0 left row 3 - -
*** End of data ********
Identical query but contains a condition using the right table:
==============================================================================================
SELECT * FROM LEFT_TABLE LEFT OUTER JOIN RIGHT_TABLE
ON RIGHT_FK = RIGHT_PK WHERE RIGHT_TABLE.DESC != 'HELLO'
LEFT_PK RIGHT_FK DESC RIGHT_PK DESC
1 3 left row 1 3 right row 3
2 4 left row 2 4 right row 4
*** End of data ********
Go Back to the Top |
Method to simulate a "right-justify" for character data using SQL
There is no "right-justify" string function built into SQL, but it can still be done: Taking a character string of variable size and right-justifying within a target column
containing a maximum of 12 characters. The example below does a bit more but it illustrates well what can be done, effectively concatenating blanks (up to 12) for a length of 12 - the length of the source column.
INSERT INTO TARGET_TABLE
SITE_CODE_RIGHT_JUSTIFED
SELECT
SUBSTR(' ', 1, 12 -
LENGTH( TRIM( CHAR_COLUMN_NAME)))
|| TRIM( CHAR_COLUMN_NAME )
FROM TABLE
FROM SOURCE_DATA
Go Back to the Top |
Executing Correlated Deletes using SQL and a correlated sub-select
Using SQL to do Correlated deletes - Take note of the two WHERE clauses:
(DELETE FROM EMPLOYEE_TABLE EM
WHERE EXISTS
(SELECT * FROM UPDATE_TABLE UPDT WHERE UPDT.ID = EM.ID);
(Note: this technique - using a join as opposed to a correlated sub-select - is available in some SQL flavors, such as MYSQL but NOT YET in DB2 for iSeries)
Go Back to the Top
|
Executing Correlated Updates using SQL on the iSeries – you have to use a correlated sub-select
Using SQL to do Correlated updates - Take note of the two WHERE clauses:
UPDATE EMPLOYEE_TABLE EM
SET (EM.PAY_SCALE, EM.SALARY) =
(SELECT NPAY.PAY_SCALE, NPAY.SALARY FROM NEW_PAY NPAY WHERE NPAY.ID = EM.ID )
WHERE EXISTS
(SELECT *
FROM NEW_NEWPAY NPAY WHERE NPAY.ID = EM.ID )
The following method is not yet available on DB2/400 but it is pretty neat and works on other databases:
Using SQL to do Correlated updates can be done in two ways: The first one is using an inner join, the second one can be done using a correlated sub-select.
(Note: this technique - using a join as opposed to a correlated sub-select - is available in some SQL flavors, such as MYSQL but NOT YET in DB2 for iSeries)
The inner join technique effectively says something like update the set of data as it come in from two sources joined by a key:
UPDATE
EMP_TABLE1 INNER JOIN EMP_TABLE2
ON EMP_TABLE1.EMP_NBR = EMP_TABLE2.EMP_NBR
SET EMP_TABLE1.BENEFIT_NBR = EMP_TABLE2.BENEFIT_NBR
WHERE EMP_TABLE1.BENEFIT_NBR <> EMP_TABLE2.BENEFIT_NBR
Go Back to the Top
|
Finding where (in which library) a DDL-built table with a long name may be
To find the library where a long-name (SQL naming convention) table may be on the system, use the following query in the Catalog Table SYSTABLES:
SELECT
TABLE_NAME, TABLE_SCHEMA FROM QSYS2/SYSTABLES
WHERE TABLE_NAME = 'MONTH_TO_DATE_SALES';
Go Back to the Top
|
Targeting one or more members in a multi-member file using SQL using ALIAS and UNION
To gather the data from multiple members in an iSeries table, here is a technique that uses ALIAS to target individual members and UNION to retrieve the distinct rows:
CREATE ALIAS LIBRARY1/SALES_HIST_1999
FOR LIBRARY1/SALESHIST(MBR_HST_99)
CREATE ALIAS LIBRARY1/SALES_HIST_2000
FOR LIBRARY1/SALESHIST(MBR_HST_00)
SELECT * FROM LIBRARY1/SALES_HIST_1999
UNION
SELECT * FROM LIBRARY1/SALES_HIST_2000
Go Back to the Top |
Converting local TIMESTAMP value to and from a GMT TIMESTAMP
Here are some handy timestamp routines, including a Greenwitch Mean Time (GMT) conversion:
TIME Retrieval using CURTIME function
SELECT curtime() FROM sysibm/sysdummy1
DATE Retrieval using CURDATE function
SELECT curdate() FROM sysibm/sysdummy1
CURRENT TIMESTAMP Retrieval using NOW function
SELECT now() FROM sysibm/sysdummy1
GMT TIMESTAMP using NOW and TIMEZONE
select now()- current timezone from sysibm/sysdummy1
Go Back to the Top |
Exploring the DB2 Catalog
All the Database Objects in DB2 are classified and organized with a link in the DB2 Catalog. The DB2 Catalog tables all start with SYS* and they all reside in library QS2. Examples of the most well known DB2 Catalog tables are:
Catalog Table Description
SYSCOLUMNS: Columns
SYSCST: Constraints
SYSFUNCS: Functions
SYSINDEXES: Indexes
SYSKEYS: Keys
SYSPROCS: Procedures
SYSTABLES: Tables
SYSTRIGGER: Triggers
SYSVIEWS: Views
Example of the use of the DB2 Catalog: "Where is this column (field) name used?" :
SELECT * FROM QSYS2/SYSCOLUMNS
WHERE COLUMN_NAME LIKE '%MCU%'
Go Back to the Top |
Global Character or String Replace in SQL
Global character or string replacement is very simple with SQL using the keyword REPLACE.
In this example this technique is used to replace individual HEX Characters in a table.
Specificaly, a carriage-return special character is replaced with a "|" character.
UPDATE DATA_TABLE SET DESCRIPTION = REPLACE( DESCRIPTION , X'25' , X'4F' )
The same technique can be used to replace strings:
UPDATE DATA_TABLE SET DESCRIPTION = REPLACE( DESCRIPTION , 'THIBAULT' , 'Thibault' )
Go Back to the Top |
Creating Characters by Specifying the Hexadecimal Value in SQL
If necessary, one can enter HEX characters directly while using SQL. This may be practical when dealing with accents and foreign languages for example: HEX(43) is Ñ.
INSERT INTO SAMPLE_TABLE
SAMPLE_COLUMN
SELECT HEX(43)
FROM SYSIBM/SYSDUMMY1
Go Back to the Top
|
Finding (locating) Hexadecimal Values in SQL
Locating a special character, like a carriage return or any other un-printable character is relatively easy with SQL.
This is especially handy when sub-stringing or parsing unstructured data. Here is an example:
SELECT LOCATE(X'25', GDTXVC, 1), -- show the position
SUBSTR(GDTXVC, 1, LOCATE(X'25', GDTXVC, 1)) -- substring from 1 to location
FROM F00165
WHERE LOCATE(X'25', GDTXVC, 1) < 50 -- only if location is < 50 positions
FROM SYSIBM/SYSDUMMY1
Go Back to the Top |
Technique to Call Programs or Execute CL Commands from SQL
The technique is simple: Simply call QCMDEXC, add the command as a parameter and the length of the command as a subsequent parameter, with the lenght in the numeric format shown below. For example, here is an override, followed by a call:
CALL QCMDEXC('OVRDBF FROMFILE(FILEA) TOFILE(FILEB)',0000000036.00000);
or
CALL QCMDEXC('CALL TESTLIB/TESTCLPGM',0000000022.00000);
Go Back to the Top |
Difference between a Stored Procedure and a Function
Stored Procedures have the following charactetistics:
- They can accept n parameters
- They compile as an object of type *PGM, which has to be called from an SQL environment to work
- They can return n parameter values
- Best use for Stored Procedures: when one needs to do record-by-record processing - which can be done using a cursor
Functions have the following charactetistics:
- They can accept n parameters
- They compile as an object of type *SRVPGM which has to be called from an SQL environment to work
- WILL RETURN 1 AND ONLY 1 VALUE
- Best use for functions: For simple, often repeated, formula-intensive calculations
Go Back to the Top |
Method to find the SQL Optimizer suggestions to improve program or stored procedure SQL performance
If you find your SQL is sluggish and you don't know why, you may want to know what the Operating System - or more specifically the SQL optimizer - is thinking. This is possible by using the debugger. For a given SQL routine that is too slow and that you wish to analyze, do the following:
Step 1: If this is an individual SQL statement, cut and paste your SQL into a QMQRY member. If this is a stored procedure, go to step 2.
Step 2: Start Debug using STRDBG UPDPROD(*YES) CHGJOB LOG(4 4 *SECLVL)- this will change the job to record all activities and second level text. With *SECLVL, both the message text and the message help (cause and recovery) of the error message are written to the job log.
Step 3: If we are dealing now with a QM member, run the freshly created QM (STRQM and hit 9 on your new member), otherwise if it is a stored procedure, call the stored procedure from within an SQL environment (STRSQL and do "CALL LIBNAM/PROC_NAME" from the SQL command line or RUNSQLSTM of a member containing "CALL LIBNAM/PROC_NAME")
Step 4: Do a WRKJOB and look at the joblog. Look for
**** Starting optimizer debug message for query and Access path suggestion for file
(Note: DB2 Optimizer suggestions may or may not appear)
These messages will show you what the SQL optimizer is looking for and the suggestions it may have made for you to improve the performance of your SQL code.
Typically the Optimizer will suggest indexes, with precise key orders.
Go Back to the Top |
Technique to generate a Surrogate Key using the Current Time Stamp as a base
Of all possible key formats, the fastest ones are single-column, numeric keys. When a natural key is too long, to improve performance, using a unique numeric surrogate key to represent a unique natural key is a common technique. The question is: how does one reliably generate surrogate keys on the fly when writing a new row in a table? The solution below is both simple and easy. It relies on the fact that a timestamp on iSeries is accurate to one millionth of a second. It is therefore a reliable source for a unique surrogate key. The code below shows how one can use the timetamp to get a unique number, essentially converting the time stamp to a numeric value:
SELECT
SUBSTR(CHAR(NOW()), 1, 4) || SUBSTR(CHAR(NOW()), 6, 2) ||
SUBSTR(CHAR(NOW()), 9, 2) || SUBSTR(CHAR(NOW()),12, 2) ||
SUBSTR(CHAR(NOW()),15, 2) || SUBSTR(CHAR(NOW()),18, 2) ||
SUBSTR(CHAR(NOW()),21, 6)
FROM SYSIBM/SYSDUMMY1
Go Back to the Top |
Aggregating Aggregates SQL Technique
Aggregate the result of multiple count(*) records: Note how the “TOTALDUPS” variable is used to name the COUNT(*) result, and further used to total up all duplicates.
Note also the “COUNT(*) -1” notation. The result of this operation only counts the duplicates (i.e. if there were two identical records, one would be deemed valid and the second one would be deemed to be the duplicate).
SELECT SUM( TOTALDUPS )
FROM (
SELECT POS.COIND, POS.ILITM, POS.ILMCU, COUNT(*) -1 TOTALDUPS
FROM POS_TABLE POS
GROUP BY POS.COIND, POS.ILITM, POS.ILMCU
HAVING COUNT(*) >2
) TABLEX
SUM ( TOTALDUPS )
249
Combine multiple techniques using UNION ALL: Notice the use of “zero and ONHAND” and then “zero and ORDERED”, which enables the UNION – as in the two sets are identical. Now we can get the sum of orders and the matching sum of on-hands (two figures from two different tables).
SELECT SKU, SUM(ORDERED) UNION_ORD ,SUM(ONHAND) UNION_OH
FROM
(
SELECT ORDSKU SKU, ORDQTY ORDERED, 0 ONHAND
FROM ORDERS
UNION ALL
SELECT INVSKU SKU, 0 ORDERED, INVOHQ ONHAND
FROM INVENTORY
)
TABLEX
GROUP BY SKU
HAVING SUM(UNION_ORD) > SUM(UNION_OH)
SKU UNION_ORD UNION_OH
215996AA 274.0000 0.0000
215996LB 12.0000 0.0000
51789764 720.0000 242.0000
Go Back to the Top |
Technique to insert NULL FROM SELECT data statement as a source:
Technique to insert NULL data when a SELECT is used for the source of the insert: Notice the cast of the NULL, this is the feature that enables this technique. The other method is to completely omit the content of TARGET_COL2 in the SELECT, but this method is more explicit, deliberate and easier to read.
INSERT INTO TARGET_TABLE
(
TARGET_COL1,
TARGET_COL2
)
SELECT SRCTBL.START_DATE,
CAST(NULL AS NUMERIC)
FROM SOURCE_TABLE SRCTBL
Go Back to the Top |
Technique to enable self-referential, previous record updating current record:
This technique enables a self-referential update that is sequential – i.e. record x with a key of SITE_CODE, PRODUCT, START_TIMESTAMP. In this process, each row in table SITE_PROD_DAT_TABLE needs to be updated with a value contained in the immediate preceding row with the same SITE_CODE and PRODUCT. To do this, the SITE_PROD_DAT_TABLE is correlated three times in the same SQL statement:
- The MAIN correlation will be the target of the UPDATE
- The UPDTDTA correlation will get the value which will be used to update MAIN (actually, this is a subtraction update, if the value is not found, the IFNULL will subtract itself and leave a zero value in MAIN)
- The MAXBELOWKEY will get the maximum date below the current row with a key that matches SITE_CODE and PRODUCT
Note that it is CRITICAL to have good indexes to get good performance on this type of join. For this statement, the look-back period is only 21 days, which also helps limit the amount of searching. The index for SITE_PROD_DAT_TABLE is by order of cardinality: site_code, product, start_timestamp.
UPDATE SITE_PROD_DAT_TABLE MAIN
SET MAIN.price_delta =
MAIN.value - IFNULL
(
( SELECT UPDTDATA.value
FROM SITE_PROD_DAT_TABLE UPDTDATA
WHERE UPDTDATA.site_code = MAIN.site_code
AND UPDTDATA.product = MAIN.product
AND UPDTDATA.start_timestamp =
(
SELECT max(MAXBELOWKEY.start_timestamp)
FROM SITE_PROD_DAT_TABLE MAXBELOWKEY
WHERE MAXBELOWKEY.site_code = MAIN.site_code
AND MAXBELOWKEY.product = MAIN.product
AND MAXBELOWKEY.start_timestamp >= current date - 21 days
AND MAXBELOWKEY.start_timestamp < MAIN.start_timestamp
)
)
, MAIN.value
)
WHERE MAIN.start_timestamp >= current date - 21 days
Go Back to the Top |
My SQL Catalog appears to be out of sync – what can I do?
I stumbled across a situation where I had a unique key on a table but could not find any constraint for this table in the QSYS2/SYSCST.
Answer: There can be more than one cause:
1) If the unique key was created using a logical file with a unique key (built with a DDS), the constraint will be there but it will not appear in the catalog file QSYS2/SYSCST.
2) If the catalog did go out of sync, the only way to repair it is with a reclaim storage operation, as shown below:
RCLSTG SELECT(*DBXREF)
Go Back to the Top
|
Using Decimal Formatting in SQL
This is a simple trick to format decimal data in SQL. Particularly useful when creating new tables with a simple select statement.
SELECT DEC(MONEY_TABLE.DOLLARS_COLUMN/10000 , 15 , 4) from TABLE
Note that the division must be inside and the Decimal formatting outside. This technique is especially useful for pulling JDE data with accurate decimal information.
To round down the figure above from 4 to 2 decimals only, you can wrap it again using the following technique:
SELECT DEC( DEC(MONEY_TABLE.DOLLARS_COLUMN/10000 , 15 , 4), 15, 2)
To round down the figure above from 4 to 2 decimals only, you can wrap it again using the following technique:
SELECT DEC( CASE WHEN
STRIP(PDCRCD) = 'CAD'
THEN PDAEXP/100
ELSE PDFEA/100
END
, 15,2) EXTENDED_PRICE
Note: Be careful when formatting aggregate values:
==================================================
SUM( DEC(DOMESTIC_OPEN_AMT/100, 14,2)) - Works
DEC( SUM(DOMESTIC_OPEN_AMT) /100, 14,2) - Does not Work
Go Back to the Top |
SQL Horizontal Pivoting Technique Coding Example
The following links point to a real example, with each item belonging to up to 16 business units,
arranged in up to 16 rows. The pivoting spreads the business units the item belongs to sideways.
Pivot_SQL_Code_001
Pivot_SQL_Code_002
Pivot_SQL_Code_003
Pivot_SQL_Code_004
Go Back to the Top
|
A simple "Count Distinct" Technique
Count Distinct sounds obvious, but often, I have to think of the syntax again to make it work. Here is a working example:
SELECT COUNT ( DISTINCT MATERIAL_NUMBER) FROM MATERIAL_TABLE
Go Back to the Top |
A Universal or "case insensitive" Search and Replace Technique
Search for a string and find it, no matter in which case it is written in, and replace with a new string - the trick is the UCASE keyword:
REPLACE (ITEM_DESCRIPTION, UPPER('oem') , 'MANUFACTURER' )
Yields an upper-case result only.
The code below keeps the old case.
------------------------------------------------------------------------------
-- Case Independent String Replacer
-- This code was created to replace non-standard descriptions
-- such as SN, S/N, SerNo, Serial#, Serial NO to at standard "Serial Number"
--
-- This replacer is different from most, as it
-- - finds the string in any possible case
-- - preserves the original on both side of what is found
-- - most practical for search/replace which have to leave the
-- original string in the original case.
------------------------------------------------------------------------------
SELECT
SUBSTR(TEMPPF , 1, 20) ORIGINAL,
CASE WHEN LOCATE('OEM', UPPER(TEMPPF) ) <> 0 -- Yes there!
THEN SUBSTR(TEMPPF, 1, LOCATE('OEM', UCASE(TEMPPF)) - 1)
|| 'MANUFACTURER'
|| CASE WHEN SUBSTR( TEMPPF, -- SUBSTR strg
LOCATE('OEM', UPPER(TEMPPF) ) + LENGTH('OEM'), -- SUBSTR from
LENGTH(TEMPPF) -
LOCATE('OEM', UPPER(TEMPPF))+ LENGTH('OEM')-- SUBSTR to
)= '' -- NOTHING MORE TO ADD.
THEN ''
ELSE -- THERE IS SOMETHING AFTER THE SEARCH STRING.
SUBSTR( TEMPPF, -- SUBSTR strg
LOCATE('OEM', UPPER(TEMPPF)) + LENGTH('OEM') , -- SUBSTR from
LENGTH(TEMPPF) -
LOCATE('OEM', UPPER(TEMPPF))+ LENGTH('OEM') +1) -- SUBSTR to
END
ELSE
TEMPPF -- Nothing to replace, just yield the original string
END
REPLACEMENT_RESULT
FROM CATDA7/TEMPPF
--------------------------------------------------------------------
ORIGINAL REPLACEMENT_RESULT
-------------------- ----------------------------------------------
OEm test replacement MANUFACTURER test replacement
test OEM replacement test MANUFACTURER replacement
test replacement OEM test replacement MANUFACTURER
test replacement OeM test replacement MANUFACTURER with mixed CaSe
oem MANUFACTURER
oem MANUFACTURER
Go Back to the Top |