Wednesday, December 31, 2008

SELECT with first n duplicates

ICETOOL's SELECT operator now allows you to select the first n records with each key or the first n duplicate records with each key.

As example, following SELECT operator could be used to list the top 3 students in each class.
//TOOLIN DD *
SELECT FROM(IN) TO(OUT) ON(1,10,CH) FIRST(3) USING(CTL1)
//CTL1CNTL DD *
SORT FIELDS=(1,10,CH,A,21,2,ZD,D)

We sort on the class name in ascending order, and on the average in descending order, to get the records in order by class name and highest to lowest average. We SELECT on the class name and use FIRST(3) to get the first three records for each class which gives us up to three students with the highest average.

I/P records :
Geometry Fred 85
Geometry Janis 71
Geometry Leonard 78
Geometry Michael 91
Geometry Susan 83
Geometry William 92
Algebra Fred 83
Algebra Janis 90
Algebra Leonard 85
Algebra Michael 94
Algebra Susan 92
Algebra William 87

O/P records:
Algebra Michael 94
Algebra Susan 92
Algebra Janis 90
Geometry William 92
Geometry Michael 91
Geometry Fred 85

Syntax :- The syntax for the new operands of SELECT is as follows:
SELECT FROM(DDIN) TO(DDOUT) ON(1,5,CH) FIRST(x)FIRSTDUP(x)
Detailed Description
:
  • FIRST(x) - Limits the records selected to those with ON values that occur only once (value count = 1) and the first x records of those with ON values that occur more than once (value count > 1).
  • FIRSTDUP(x) - Limits the records selected to the first x records of those with ON values that occur more than once (value count > 1).

Hope you find this post useful. Please do provide your comments or reactions. Any sort of feedback is appreciated. You may post your queries in comments as well. For further detailed reference on SELECT and other new ICETOOL functions,Click Here.

Tuesday, December 30, 2008

SUBSET

SUBSET is a new ICETOOL operator that allows you to create a subset of the input or output records with specific header, trailer, and relative records, or without specific header, trailer, and relative records. SUBSET gives you new capabilities for keeping or removing the first n records of your data set, the last n records of your data set, and/or specific relative records in your data set. SUBSET does not require an "identifier" in the records to be keptor removed; it keeps track of the first n records, relative record numbers, and the last n records automatically.

Various options of SUBSET allow you to define the criteria for keeping or removing records.

As an example, you could use the following SUBSET operator to keep the first two input records, the fifth, sixth and seventh input records, and the last input record.

SUBSET FROM(IN) TO(OUT) KEEP INPUT FIRST(2) RRN(5,7) LAST
I/P records:
HEADER 001
HEADER 002
Vicky
Frank
Regina
Viet
David
Dave
Carrie
Sam
Sri Hari
Martin
LAST 999

O/P records:
HEADER 001
HEADER 002
Regina
Viet
David
LAST 999

Syntax - The syntax for the SUBSET operator is as follows:
SUBSET FROM(indd) TO(outdd) DISCARD(savedd) KEEPREMOVE INPUTOUTPUT HEADERFIRSTHEADER(x)FIRST(x)
RRN(x)RRN(x,y)RRN(x,*) ...
TRAILERLASTTRAILER(x)LAST(x)
USING(xxxx) VSAMTYPE(x)

Detailed Description :
  • FROM(indd) - Specifies the ddname of the input data set.
  • TO(outdd) - Specifies the ddname of the output data set , TO and DISCARD can both be specified.
  • DISCARD(savedd) - DFSORT will write the records it does not select for the operation.
  • KEEP - Specifies that the records that meet the criteria are to be kept.
  • REMOVE - Specifies that the records that meet the criteria are to be removed.
  • INPUT - The criteria of records selection takes place before INREC or SORT.
  • OUTPUT - The criteria of records selection takes place after INREC or SORT.
  • RRN(x) - point to exact single records,
  • RRN(x,y) - specifies a range of records,
  • RRN(x,*) - Specifies relative record numbers x through the last record.

Hope you find it useful. Please do provide your comments or reactions. For further reference click here.