AB INITIO TUTORIALS



Filter by expression

Filter by Expression filters data records according to a specified DML expression.
It is comparable to the WHERE clause of an SQL SELECT statement.
The select expression of the Filter by Expression component can use various functions, including lookups.
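
To make the WHERE-clause comparison concrete, here is a minimal Python sketch of the idea. It is not DML and not the component itself; the record fields, the country_lookup dictionary (standing in for a lookup) and the function names are all made up for illustration.

```python
# Hypothetical records; the field names are illustrative only.
records = [
    {"id": 1, "country": "US", "amount": 120.0},
    {"id": 2, "country": "IN", "amount": 40.0},
    {"id": 3, "country": "US", "amount": 15.0},
]

# Stand-in for a lookup keyed by country code (assumed, not from Ab Initio).
country_lookup = {"US": "United States", "IN": "India"}


def select_expr(rec):
    """Plays the role of the select expression: keep the record when true."""
    return rec["amount"] > 20 and rec["country"] in country_lookup


def filter_by_expression(rows, predicate):
    """Return only the rows for which the predicate holds, like SQL WHERE."""
    return [r for r in rows if predicate(r)]


selected = filter_by_expression(records, select_expr)
print([r["id"] for r in selected])  # prints [1, 2]
```

Record 3 is dropped because its amount is not greater than 20, just as a WHERE clause would exclude it.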



Filter by Expression has a reject-threshold parameter.
The value of this parameter specifies the component's tolerance for reject events. Choose one of the following:
• Abort on first reject: the component stops the execution of the graph at the first reject event it generates.

• Never abort: the component does not stop the execution of the graph, no matter how many reject events it generates.


• Use ramp/limit: the component uses the settings in the ramp and limit parameters to determine how many reject events to allow before it stops the execution of the graph.
The default is Abort on first reject.
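
The three settings can be sketched in Python as below. This is only an illustration of the behaviour described above, not Ab Initio code. It assumes that in ramp/limit mode the tolerated number of rejects after n processed records is limit + ramp * n; the function, mode and exception names are hypothetical.

```python
class RejectLimitExceeded(Exception):
    """Raised when the sketch decides the run must stop."""


def run_with_threshold(records, is_valid, mode="abort_first", limit=0, ramp=0.0):
    """Process records, counting rejects and stopping according to mode.

    mode is one of "abort_first", "never_abort" or "ramp_limit".
    Assumption: in "ramp_limit" mode the tolerance after n records
    is limit + ramp * n.
    """
    accepted, rejected = [], []
    for n, rec in enumerate(records, start=1):
        if is_valid(rec):
            accepted.append(rec)
            continue
        rejected.append(rec)
        if mode == "abort_first":
            raise RejectLimitExceeded(f"reject at record {n}")
        if mode == "ramp_limit" and len(rejected) > limit + ramp * n:
            raise RejectLimitExceeded(f"{len(rejected)} rejects after {n} records")
    return accepted, rejected


data = [1, -1, 2, -2, 3]  # negative values count as rejects here


def positive(x):
    return x > 0


print(run_with_threshold(data, positive, mode="never_abort"))
# prints ([1, 2, 3], [-1, -2])

try:
    run_with_threshold(data, positive, mode="ramp_limit", limit=1, ramp=0.0)
except RejectLimitExceeded as exc:
    print("stopped:", exc)  # stops at the second reject, since 2 > 1
```

With the default mode ("abort_first") the same data would stop at the second record.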


Join

Join reads records from multiple input ports, operates on the records with matching keys using a multi-input transform function, and writes the results to its output ports.




In Join, the key parameter must be specified from the fields of one of the input flows, in ascending or descending order.
If the input flows do not all share a common key field, override-key must be specified to map that flow's fields to the key.
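
The key matching and the role of override-key can be pictured with a small Python sketch. It is only an approximation: it does an inner join with an in-memory lookup instead of working on sorted flows, and the names inner_join, override_key1 and the sample fields are hypothetical.

```python
from collections import defaultdict


def inner_join(in0, in1, key, transform, override_key1=None):
    """Inner-join two lists of dicts on the fields named in key.

    override_key1 maps a key field name to the name used on in1,
    loosely mirroring what override-key is described as doing.
    """
    mapping = override_key1 or {}
    key1 = [mapping.get(k, k) for k in key]

    # Index the second input by its (possibly renamed) key fields.
    index = defaultdict(list)
    for right in in1:
        index[tuple(right[k] for k in key1)].append(right)

    out = []
    for left in in0:
        for right in index.get(tuple(left[k] for k in key), []):
            out.append(transform(left, right))  # multi-input transform
    return out


customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
orders = [
    {"customer": 1, "total": 50},
    {"customer": 1, "total": 70},
    {"customer": 3, "total": 10},
]


def combine(cust, order):
    return {"cust_id": cust["cust_id"], "name": cust["name"], "total": order["total"]}


joined = inner_join(customers, orders, ["cust_id"], combine,
                    override_key1={"cust_id": "customer"})
print(joined)
```

Here the two flows have no field with the same name, so the mapping from cust_id to customer is what lets the keys be compared. Only customer 1 has matching orders, so two records come out.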


Reformat

Reformat changes the record format of data records by dropping fields, or by using DML expressions to add fields, combine fields, or transform the data in the records.
By default Reformat has one output port, but the number of output ports can be increased by incrementing the value of the count parameter. In that case a separate transform function has to be written for each output port.
If records need to be selected from the input, the select parameter can be used instead of placing a Filter by Expression component before the Reformat.
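
A rough Python sketch of these ideas follows: one transform per output port, plus an optional select applied to the input first. It assumes every selected record is passed to every output port, and all names in it are hypothetical rather than Ab Initio's own.

```python
def reformat(records, transforms, select=None):
    """Apply each transform to every selected record.

    Returns one output list per transform, standing in for one
    output port per transform function.
    """
    outputs = [[] for _ in transforms]
    for rec in records:
        if select is not None and not select(rec):
            continue  # the select expression rejected this record
        for port, fn in enumerate(transforms):
            outputs[port].append(fn(rec))
    return outputs


people = [
    {"first": "Ada", "last": "Lovelace", "age": 36},
    {"first": "Tim", "last": "Lee", "age": 17},
]


def to_full_name(rec):
    # Combines two fields and drops the rest.
    return {"full_name": rec["first"] + " " + rec["last"]}


def to_initials(rec):
    # Adds a derived field and keeps age.
    return {"initials": rec["first"][0] + rec["last"][0], "age": rec["age"]}


out0, out1 = reformat(people, [to_full_name, to_initials],
                      select=lambda r: r["age"] >= 18)
print(out0)  # prints [{'full_name': 'Ada Lovelace'}]
print(out1)  # prints [{'initials': 'AL', 'age': 36}]
```

Passing two transforms plays the role of setting count to 2, and the select argument replaces a separate filter step in front.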


Rollup

Rollup generates data records that summarize groups of data records on the basis of key specified.

Parts of Rollup
• Input select (optional)
• Initialize
• Temporary variable declaration
• Rollup (Computation)
• Finalize
• Output select (optional)

Input select: If the input_select transform function is defined, it filters the input records.

Initialize: Rollup passes the first record in each group to the initialize transform function.

Temporary variable declaration: The initialize transform function creates a temporary record for the group, with record type temporary_type.

Rollup (Computation): Rollup calls the rollup transform function for each record in a group, using that record and the temporary record for the group as arguments. The rollup transform function returns a new temporary record.

Finalize: If you leave sorted-input set to its default (Input must be sorted or grouped):

• Rollup calls the finalize transform function after it processes all the input records in a group.
• Rollup passes the temporary record for the group and the last input record in the group to the finalize transform function.
• The finalize transform function produces an output record for the group.
• Rollup repeats this procedure with each group.

Output select: If you have defined the output_select transform function, it filters the output records.
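
The stages above can be sketched in Python as follows. This is a plain emulation of the flow (input select, initialize, rollup, finalize, output select), not DML; the function names and sample fields are hypothetical, and the input is assumed to be already sorted or grouped by the key.

```python
from itertools import groupby


def rollup(records, key, initialize, rollup_fn, finalize,
           input_select=None, output_select=None):
    """Summarize each group of records into one output record."""
    if input_select is not None:
        records = [r for r in records if input_select(r)]
    results = []
    # groupby only merges adjacent records, so input must be grouped by key.
    for _, group in groupby(records, key=key):
        group = list(group)
        temp = initialize(group[0])         # first record creates the temporary
        for rec in group:
            temp = rollup_fn(temp, rec)     # returns a new temporary record
        out = finalize(temp, group[-1])     # temporary plus last input record
        if output_select is None or output_select(out):
            results.append(out)
    return results


sales = [
    {"region": "E", "amt": 10},
    {"region": "E", "amt": 5},
    {"region": "W", "amt": 7},
]


def init(first):
    return {"total": 0, "count": 0}


def accumulate(temp, rec):
    return {"total": temp["total"] + rec["amt"], "count": temp["count"] + 1}


def finish(temp, last):
    return {"region": last["region"], "total": temp["total"], "count": temp["count"]}


summary = rollup(sales, lambda r: r["region"], init, accumulate, finish)
print(summary)
# prints [{'region': 'E', 'total': 15, 'count': 2}, {'region': 'W', 'total': 7, 'count': 1}]
```

One output record is produced per group. Passing output_select=lambda o: o["total"] > 8 would keep only the E group.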


Aggregates

Aggregate generates data records that summarize groups of data records (similar to Rollup), but it offers less control over the data.


Scan

Scan generates a series of cumulative summary records for groups of data records.
If the input_select and output_select parameters are not specified, the Scan transform functions generate one output record for each input record, carrying the cumulative result for its group up to that record.

Like Rollup, Scan can be used for several kinds of summary functionality.

The main difference between Scan and Rollup is that Scan generates intermediate (cumulative) results, while Rollup only summarizes.
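
The difference is easy to see with a running total. The Python sketch below is only an illustration with made-up field and function names, and it assumes the input is already grouped by the key.

```python
from itertools import groupby


def scan_running_total(records, key_field, value_field):
    """Emit one record per input record with the cumulative sum so far."""
    out = []
    for key, group in groupby(records, key=lambda r: r[key_field]):
        running = 0
        for rec in group:
            running += rec[value_field]
            out.append({key_field: key, "running_total": running})
    return out


def rollup_total(records, key_field, value_field):
    """Emit one record per group with the final sum only."""
    return [
        {key_field: key, "total": sum(r[value_field] for r in group)}
        for key, group in groupby(records, key=lambda r: r[key_field])
    ]


sales = [
    {"region": "E", "amt": 10},
    {"region": "E", "amt": 5},
    {"region": "W", "amt": 7},
]

print(scan_running_total(sales, "region", "amt"))
# prints [{'region': 'E', 'running_total': 10}, {'region': 'E', 'running_total': 15},
#         {'region': 'W', 'running_total': 7}]  (on one line)
print(rollup_total(sales, "region", "amt"))
# prints [{'region': 'E', 'total': 15}, {'region': 'W', 'total': 7}]
```

The scan output has three records for three inputs, while the rollup output has one record per region.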
