AB INITIO TUTORIALS

Ab Initio
2:29 AM

Tuning Tips

Avoid Sorts
All sort components hold records in memory while the graph runs, and this memory usage slows the process down. Avoid “sort” components wherever the graph does not strictly need sorted data.

Use Lookups
A Lookup File represents one or more serial files, or a multifile, of data records small enough to be held in main memory, so that a transform function can retrieve records much more quickly than it could if they were stored on disk.
A Lookup File associates key values with their corresponding data values in order to index records and retrieve them.
Use catalogs to share lookup files across multiple graphs. Use lookup files to quickly retrieve values from a small dataset; this is often the best way to merge a large dataset with a small one.
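
The idea behind a lookup can be sketched outside Ab Initio. The Python below is only an analogy, not Ab Initio code: the small dataset is indexed in memory by key, and the large dataset is streamed through once, probing that index for each record. The names build_lookup and enrich and the sample records are made up for illustration.

```python
def build_lookup(records, key):
    """Index a small dataset in memory by key (on duplicate keys the last record wins)."""
    return {rec[key]: rec for rec in records}


def enrich(large_records, lookup, key):
    """Stream the large dataset once, probing the in-memory index for each record."""
    for rec in large_records:
        match = lookup.get(rec[key])
        # Records without a match are kept, with the looked-up field left empty.
        yield {**rec, "country_name": match["name"] if match else None}


# Hypothetical sample data: a small reference set and a (notionally) large set.
countries = [
    {"code": "IN", "name": "India"},
    {"code": "US", "name": "United States"},
]
orders = [
    {"order_id": 1, "code": "US"},
    {"order_id": 2, "code": "IN"},
    {"order_id": 3, "code": "XX"},
]

country_lookup = build_lookup(countries, "code")
enriched = list(enrich(orders, country_lookup, "code"))
```

Neither dataset has to be sorted here, which is why this pattern often replaces a sort followed by a join when one side is small.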

Allocate Memory Correctly
Before coding a graph, try to get an idea of the data volumes (volumetrics) it will process. This gives a rough estimate of the memory to allocate.
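
A rough estimate of this kind is simple arithmetic: record count times average record size, plus some headroom. The Python sketch below shows the calculation. The 1.5 overhead factor, the 512 MiB budget and the record counts are assumptions chosen for the example, not Ab Initio defaults.

```python
def estimate_memory_bytes(record_count, avg_record_bytes, overhead_factor=1.5):
    """Rough in-memory size of a dataset; overhead_factor is an assumed allowance."""
    return int(record_count * avg_record_bytes * overhead_factor)


def fits_in_memory(record_count, avg_record_bytes, budget_bytes, overhead_factor=1.5):
    """True if the estimated size stays within the memory budget."""
    estimate = estimate_memory_bytes(record_count, avg_record_bytes, overhead_factor)
    return estimate <= budget_bytes


BUDGET = 512 * 1024 * 1024  # hypothetical 512 MiB allowance for one component

# 2,000,000 records * 120 bytes * 1.5 = 360,000,000 bytes
small_estimate = estimate_memory_bytes(2_000_000, 120)
small_fits = fits_in_memory(2_000_000, 120, BUDGET)

# 10,000,000 records * 120 bytes * 1.5 = 1,800,000,000 bytes
large_fits = fits_in_memory(10_000_000, 120, BUDGET)
```

With these numbers the smaller dataset fits in the budget and the larger one does not, which is the kind of answer needed before choosing between an in-memory approach and one that works from disk.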

Phasing
A phase is a stage of a graph that runs to completion before the next stage starts. Dividing a graph into phases can save resources and avoid deadlock (discussed later).

Checkpoint
A checkpoint is a special kind of phase break that saves status information, so that the graph can be restarted from the point of failure.
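
The restart behaviour can be illustrated with a small Python analogy. This is not how Ab Initio implements checkpoints; it only shows the principle. Phases run in order, progress is recorded after each one, and a rerun skips the phases that already completed. The function run_with_checkpoints, the JSON state file and the three phase names are hypothetical, and a real checkpoint would also have to preserve the data produced so far, which this sketch leaves out.

```python
import json
import os
import tempfile


def run_with_checkpoints(phases, state_path):
    """Run (name, action) phases in order, skipping those already recorded as done."""
    done = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            done = json.load(f)["completed_phases"]
    executed = []
    for index, (name, action) in enumerate(phases):
        if index < done:
            continue  # completed in an earlier run
        action()  # may raise; the state file then still names the last good phase
        executed.append(name)
        with open(state_path, "w") as f:
            json.dump({"completed_phases": index + 1}, f)
    os.remove(state_path)  # clean finish: nothing left to recover
    return executed


attempts = {"load": 0}


def extract():
    pass


def load():
    # Fails on the first attempt only, to simulate a mid-run failure.
    attempts["load"] += 1
    if attempts["load"] == 1:
        raise RuntimeError("simulated failure in load phase")


def report():
    pass


phases = [("extract", extract), ("load", load), ("report", report)]
state_path = os.path.join(tempfile.mkdtemp(), "graph_state.json")

try:
    run_with_checkpoints(phases, state_path)
except RuntimeError:
    pass  # first run stops in the "load" phase

rerun = run_with_checkpoints(phases, state_path)
```

On the second call only "load" and "report" run, because the state file already records "extract" as complete.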



Avoid Deadlock
Deadlock occurs when a graph cannot make progress. It depends on the patterns in your data and typically occurs in graphs with data flows that split and then join.
A graph carries a potential deadlock when flows diverge and converge within the same phase. If the flows converge at a component that reads its input flows in a particular order, that component may wait for records to arrive on one flow while unread data accumulates on the others, because components have limited buffering capacity. Once those buffers are full, the upstream component can no longer write, and neither side can move.

Tip: To avoid deadlock, put components in separate phases, or set flow buffering.
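
The split-then-join situation can be reproduced with a small deterministic simulation. The Python below is a toy model, not Ab Initio behaviour: a splitter routes each record to one of two flows with a fixed buffer capacity, and a reader drains flow 0 to its end before it touches flow 1. The function name, the routing rule and the capacities are all assumptions made for the example.

```python
from collections import deque


def run_split_then_concat(records, route, capacity):
    """Simulate a splitter feeding two bounded flows into an ordered reader.

    Returns ("ok", output) or ("deadlock", output_so_far).
    """
    flows = [deque(), deque()]
    pending = deque(records)
    output = []
    reading = 0  # flow the reader is currently draining
    while True:
        progressed = False
        # Splitter: write the next record only if its target flow has room.
        if pending:
            target = route(pending[0])
            if len(flows[target]) < capacity:
                flows[target].append(pending.popleft())
                progressed = True
        # Reader: consume from the current flow; switch only at end-of-flow.
        if flows[reading]:
            output.append(flows[reading].popleft())
            progressed = True
        elif not pending and reading == 0:
            reading = 1  # the splitter has finished, so flow 0 is closed
            progressed = True
        if not pending and reading == 1 and not flows[1]:
            return "ok", output
        if not progressed:
            return "deadlock", output


records = list(range(10))


def route(rec):
    # The data pattern matters: the first eight records all go to flow 1.
    return 1 if rec < 8 else 0


# Small buffers: flow 1 fills while the reader is still waiting on flow 0.
status_small, out_small = run_split_then_concat(records, route, capacity=4)

# Larger buffers (the flow buffering option): flow 1 can hold the backlog.
status_buffered, out_buffered = run_split_then_concat(records, route, capacity=8)

# A phase break is modeled here as a buffer large enough for the whole flow.
status_phased, out_phased = run_split_then_concat(records, route, capacity=len(records))
```

With a capacity of 4 the run stalls after four records are written and none are read. With enough buffering, or with the whole flow allowed to land before the reader starts, the same data goes through.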


Minimize Components
Every component in a graph carries overhead, because each component creates its own sub-process at run time. It is therefore recommended to minimize the number of components. For example, if matching is to be done only on selected records, it is better to use the select parameter of the “join” component than to add an extra “filter by expression” component.
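
The same idea in a Python analogy (again, not Ab Initio code): instead of running a separate filtering step and passing its output to a join, the selection is applied inside the join while it reads its input. The function join_with_select and the sample records are hypothetical.

```python
def join_with_select(left, right, key, select=None):
    """Inner join of left and right on key, filtering left records in the same pass."""
    right_by_key = {}
    for rec in right:
        right_by_key.setdefault(rec[key], []).append(rec)
    for rec in left:
        if select is not None and not select(rec):
            continue  # dropped here, so no separate filter step is needed
        for match in right_by_key.get(rec[key], []):
            yield {**match, **rec}


customers = [
    {"id": 1, "name": "A", "active": True},
    {"id": 2, "name": "B", "active": False},
]
payments = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 20},
]

joined = list(join_with_select(customers, payments, "id", select=lambda c: c["active"]))
```

Only the active customer reaches the match, and the work is done by one step instead of two.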
