AB INITIO TUTORIALS

Best online resource for Ab Initio Tutorials

Ab Initio
10:50 PM

Advanced Concepts

The advanced concepts in Ab Initio are listed below:

1. Parallelism
2. Continuous Flow
3. Ab Initio Queue
4. Advanced Components
5. Sandbox Parameter Editor
6. Graph Parameter Editor
7. Flow

10:34 PM

Advanced Components

Ab Initio provides three facilities for developing graphs beyond simply inserting pre-built components into the workspace and connecting them to each other. These are:

• Custom components

• Subgraphs

• Macros

When to use a Custom Component

If the solution to the task is a single executable, use a custom component.
Typically, you have a program or script that you created in the past to perform some type of data transformation, and you now want to use it in an Ab Initio graph. Alternatively, you can write a new program or script with a specific purpose in mind.

A custom component lets you integrate your program or script into an Ab Initio graph. You can use a custom component in the same way that you would use a pre-built Ab Initio component.

When to use a Sub-graph

If you can construct the solution to the task from pre-built Ab Initio components, and the number and arrangement of those components stay static from one run of the graph to the next, you can use a sub-graph. Of the three facilities mentioned above, the sub-graph is the easiest to use.
When you use a sub-graph, you can define component parameters at runtime. You can change the value of a parameter from one graph to another, but the number of components remains static.

Conditional Components

The GDE supports Conditional Components where a shell expression determines, at runtime, whether or not to include certain components.

To turn on this feature:
 File -> Preferences -> “Parameters” section of the dialog
 Check “Conditional Components”
 The “Condition” tab then appears on all components
[For reference, see the picture below.]
The condition can be provided as an argument at run time. These arguments are accessed through the variables listed in the graph or sub-graph parameter list.
The Condition expression is a shell expression. If it evaluates to “0” or “false”, the component does not exist in the graph; any other value means the component exists.
Make sure the shell expression returns the string “0” or “false”, not the numeric value 0.
Components that are conditioned out can be replaced by a flow or removed completely. When removing a component completely, make sure you don’t leave any required ports unconnected.
To apply the same condition to more than one component, combine them into a subgraph and apply the condition to the subgraph.
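
The rule above can be sketched in plain shell. This is only an illustration of the "string 0 or false" rule, not how the GDE evaluates conditions internally. The parameter LOAD_TYPE and the function names dedup_condition and component_exists are hypothetical and do not come from Ab Initio.

```shell
#!/bin/sh
# Hypothetical condition expression: prints the STRING "1" or "0"
# depending on a made-up graph parameter value (LOAD_TYPE).
dedup_condition() {
  if [ "$1" = "full" ]; then
    echo "1"
  else
    echo "0"
  fi
}

# Models the rule from the text: "0" or "false" means the component
# does not exist; any other value means it exists.
component_exists() {
  case "$1" in
    0|false) return 1 ;;
    *)       return 0 ;;
  esac
}

LOAD_TYPE="delta"   # hypothetical run-time argument
if component_exists "$(dedup_condition "$LOAD_TYPE")"; then
  echo "component included"
else
  echo "component removed"
fi
```

Note that dedup_condition prints its result with echo. The value that matters is the text the expression produces, which is why the text insists on the string “0” or “false”.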

10:34 PM

Ab Initio Queue

Ab Initio queues are an adaptation of the first-in/first-out queue concept:
• They provide record-based persistence.
• Publishers write data to the queue.
• Subscribers to the queue read the data in the order it was written.
In many ways, Ab Initio queues are analogous to multifiles in ordinary Ab Initio graphs. They provide a method for storing records in an ordered sequence of files. However, they also hold additional information necessary to allow both the removal of data from disk when it is no longer needed, and the recovery of graphs that stop running for any reason.
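
The publisher/subscriber behaviour described above can be pictured with a toy model in plain shell. This is a sketch under loud assumptions: the directory layout (records, cursors, next_seq) and the function names are invented for illustration and are not the real Ab Initio queue format or API. It only shows records stored as an ordered sequence of files, with each subscriber reading them in the order they were written.

```shell
#!/bin/sh
# Toy model only: NOT the real Ab Initio queue layout.

queue_create() {    # queue_create DIR SUBSCRIBER...
  dir="$1"; shift
  mkdir -p "$dir/records" "$dir/cursors"
  echo 0 > "$dir/next_seq"
  # Each subscriber gets its own read position, starting at record 0.
  for sub in "$@"; do
    echo 0 > "$dir/cursors/$sub"
  done
}

queue_publish() {   # queue_publish DIR RECORD
  seq=$(cat "$1/next_seq")
  printf '%s\n' "$2" > "$1/records/$seq"
  echo $((seq + 1)) > "$1/next_seq"
}

queue_read() {      # queue_read DIR SUBSCRIBER; fails if nothing is unread
  pos=$(cat "$1/cursors/$2")
  [ -f "$1/records/$pos" ] || return 1
  cat "$1/records/$pos"
  echo $((pos + 1)) > "$1/cursors/$2"
}

q="${TMPDIR:-/tmp}/toy_queue.$$"
queue_create "$q" sub1 sub2
queue_publish "$q" "record A"
queue_publish "$q" "record B"
queue_read "$q" sub1    # record A
queue_read "$q" sub1    # record B
queue_read "$q" sub2    # record A (sub2 keeps its own position)
rm -rf "$q"
```

Keeping a separate position per subscriber is what would let a real queue know when a record has been read by everyone and can be removed from disk. The toy model does not implement that cleanup.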
Queues are the most reliable method for storing continuous flow data. We recommend that you get your data into this format as early as possible in the data processing stream.
You use the m_queue command to create an Ab Initio queue. When you execute this command, you specify the name of the directory or multidirectory that you want to contain the new queue. This is the value of the queue_path argument to the m_queue command. The Co>Operating System creates the directory you specify and sets up a queue inside it. A directory or multidirectory can contain only one queue. Consequently, although the directory is not the queue, the name of the directory that contains the queue identifies the queue.
Each Ab Initio queue directory contains a number of files, which contain the queue data and the queue infrastructure. If you look at the queue in the file system, you can see these files. However, you never directly operate on them.

m_queue command operation:
m_queue create -f <queue_path> <subscriber1> <subscriber2> <subscriber3> ...

Example
m_queue create -f /u05/cromwell/puranika/Projects/Cromwell/Registartion/data/input/queue sub1 sub2

Stopping Continuous Flow Graphs

Use one of the following two commands to stop a continuous flow graph:
• m_shutdown [-f | -status] job_name: Waits until the graph commits the next checkpoint so that the job ends cleanly, then stops the execution of the graph and deletes all checkpoint temporary files.
• m_kill job_name: Stops the execution of the graph immediately.
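
A wrapper script might try the clean shutdown first and fall back to an immediate stop only if that fails. The following is a minimal sketch, assuming that m_shutdown exits with a non-zero status when it cannot stop the job. That exit-status behaviour is an assumption and is not confirmed by the description above. The function name stop_graph is hypothetical.

```shell
#!/bin/sh
# stop_graph JOB_NAME
# Try a clean stop at the next checkpoint; fall back to an immediate stop.
# ASSUMPTION: m_shutdown returns non-zero on failure (not confirmed here).
stop_graph() {
  job_name="$1"
  if m_shutdown "$job_name"; then
    echo "stopped cleanly: $job_name" >&2
  else
    echo "clean shutdown failed, killing: $job_name" >&2
    m_kill "$job_name"
  fi
}
```

It would be called as stop_graph followed by the job name. Prefer the clean path where possible, since m_kill stops the graph without waiting for the next checkpoint.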

10:33 PM

Continuous Flow

A continuous job is a job that produces usable output before it ends. A continuous flow graph, unlike a regular Ab Initio graph, is intended to run for an indefinite period of time, continually taking in new input and producing new, usable output while the graph keeps running. A continuous flow graph might or might not go on forever.
The advantages of continuous flow graphs include better performance and lower latency. There is no overhead for starting the job each time a new batch of data arrives, and results of the job are available sooner than for a non-continuous graph.

A continuous flow graph includes:
• One or more subscribers. A subscriber is the only allowed data source.
• A publisher at the end of every data flow.
• Any continuous or continuously enabled components in the middle, between a subscriber and a publisher.


Restrictions for continuous flow:

• All components in the graph must be continuous components or they must be continuously enabled.
• All components with no output flows must be publishers. There must be at least one publisher in the graph for the graph to determine when a checkpoint can be committed.
• All subscribers must issue checkpoints and compute points in the same sequence.
• The graph must execute in a single phase. More than one phase is not allowed.
• All data in a continuous flow graph must come from a subscriber component. The source of data cannot be an Input File or Input Table component.
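
Some of these restrictions can be expressed as a simple checklist. The sketch below is a toy validator, not an Ab Initio tool: the function name check_continuous_graph and the component labels (subscriber, publisher, continuous, input_file, input_table) are hypothetical stand-ins for a description of the graph. It checks only the phase count, the presence of a subscriber and a publisher, and the kinds of components used.

```shell
#!/bin/sh
# check_continuous_graph PHASES KIND...
# Prints "ok" if the described graph passes the checks, otherwise a reason.
check_continuous_graph() {
  phases="$1"; shift
  if [ "$phases" -ne 1 ]; then
    echo "graph must execute in a single phase"; return 1
  fi
  subscribers=0
  publishers=0
  for kind in "$@"; do
    case "$kind" in
      subscriber) subscribers=$((subscribers + 1)) ;;
      publisher)  publishers=$((publishers + 1)) ;;
      continuous) ;;   # continuous or continuously enabled component
      input_file|input_table)
        echo "data must come from a subscriber, not $kind"; return 1 ;;
      *)
        echo "not a continuous component: $kind"; return 1 ;;
    esac
  done
  if [ "$subscribers" -lt 1 ]; then echo "no subscriber"; return 1; fi
  if [ "$publishers" -lt 1 ]; then echo "no publisher"; return 1; fi
  echo "ok"
}

check_continuous_graph 1 subscriber continuous publisher
```

The checkpoint-ordering restriction on subscribers is a run-time property and is not covered by this kind of static check.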
