This course introduces advanced parallel job development techniques in DataStage V11.7. You will develop a deeper understanding of the DataStage architecture, including the DataStage development and runtime environments. This will enable you to design parallel jobs that are robust, less error-prone, reusable, and optimized for better performance.
Basic knowledge of DataStage and some experience developing DataStage jobs.
4 Days/Lecture & Lab
This advanced course is designed for experienced DataStage developers who want training in more advanced DataStage job techniques, an understanding of the parallel framework architecture, and an overview of the new features and differences between V8.X and V11.7.
- Describe the parallel processing architecture
- Describe pipeline and partition parallelism, data partitioning and collecting
- Describe the role and elements of the DataStage configuration file
- Describe sorting in the parallel framework
- Describe optimization techniques for buffering
- Describe and work with parallel framework data types and elements, including virtual data sets and schemas
- Create reusable job components using shared containers
- Describe the function and use of Balanced Optimization
- Work with the Data Masking and Data Rule stages
- Process Excel files (unstructured stages) and XML files (structured stages)
- Schedule jobs and create and schedule batches in Director
- Describe the differences between DEV and PROD architectures