Pentaho Kettle Fundamentals & Advanced

Beginner

Duration: 5 days

Certification: Certificate of attendance

Course overview

Pentaho Kettle Fundamentals (3 days) & Advanced (additional 2 days)

Topics covered

Day 1: Introduction & Core Concepts

• Course overview, objectives, and environment setup
• Introduction to PDI: architecture and components (Spoon, Pan, Kitchen, Carte)
• Navigating the Spoon UI: connections, repositories, and project structure
• Transformations fundamentals: steps, hops, and data flow
• Core input/output steps: CSV Input, Excel Input, Table Input, Text File Output, Table Output
• Lab: Build a simple CSV → Database transformation
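A transformation like the Day 1 lab result is typically executed from the command line with Pan. The sketch below only builds and prints the command; the install path and the `.ktr` file name are illustrative assumptions, not part of the course materials:

```shell
#!/bin/sh
# Sketch: running a saved transformation with Pan.
# PDI_HOME and csv_to_db.ktr are assumptions for illustration.
PDI_HOME="${PDI_HOME:-/opt/pentaho/data-integration}"
TRANS_FILE="csv_to_db.ktr"   # hypothetical lab transformation

# -file points at the transformation; -level controls log verbosity
build_pan_cmd() {
  echo "$PDI_HOME/pan.sh -file=$1 -level=Basic"
}

CMD=$(build_pan_cmd "$TRANS_FILE")
echo "$CMD"
# eval "$CMD"   # execute only where PDI is actually installed
```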

Day 2: Data Transformation & Jobs

• Data transformation steps: Filter Rows, Sort Rows, Merge Join, Lookup steps
• String manipulation, data type conversion, and value mapping
• Introduction to Jobs: job entries, flows, success/failure routing
• Variables, parameters, and environment configuration
• Lab: Build a job that orchestrates multiple transformations with error handling
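Named parameters from the variables/parameters topic above can be supplied when a job is launched with Kitchen. This is a sketch only; the job file and parameter names are assumptions:

```shell
#!/bin/sh
# Sketch: launching a job with Kitchen, passing named parameters.
# nightly_load.kjb, INPUT_DIR, and RUN_DATE are hypothetical names.
PDI_HOME="${PDI_HOME:-/opt/pentaho/data-integration}"
JOB_FILE="nightly_load.kjb"

# -param:NAME=VALUE sets a named parameter defined in the job;
# steps inside the job reference it as ${INPUT_DIR}.
CMD="$PDI_HOME/kitchen.sh -file=$JOB_FILE \
  -param:INPUT_DIR=/data/incoming \
  -param:RUN_DATE=$(date +%Y-%m-%d) \
  -level=Basic"
echo "$CMD"
# eval "$CMD"   # run only where PDI is installed
```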

Day 3: Advanced Topics & Real-World Patterns

• Working with databases: connections, bulk loading, slowly changing dimensions (SCD)
• Executing PDI from the command line: Pan and Kitchen, scheduling with cron
• Logging, monitoring, and error handling best practices
• Performance tuning: row buffering, parallelism, partitioning
• Lab: End-to-end ETL pipeline (file ingestion → transform → load → job orchestration)
• Q&A and recap
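For the scheduling and error-handling topics above, a cron wrapper usually inspects Kitchen's exit code. The mapping below follows PDI's documented exit codes (0 success, 1 processing errors, 2 unexpected error, 7 load failure); the paths in the cron line are assumptions:

```shell
#!/bin/sh
# Sketch: interpreting Kitchen exit codes in a scheduling wrapper.
kitchen_status() {
  case "$1" in
    0) echo "success" ;;
    1) echo "errors during processing" ;;
    2) echo "unexpected error" ;;
    7) echo "couldn't load job from XML or repository" ;;
    *) echo "other failure ($1)" ;;
  esac
}

# Illustrative crontab entry: run the nightly job at 02:00 and append a log.
# 0 2 * * * /opt/pentaho/data-integration/kitchen.sh -file=/etc/pdi/nightly_load.kjb >> /var/log/pdi/nightly.log 2>&1

kitchen_status 0
kitchen_status 1
```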

Day 4: Intermediate–Advanced Transformations

• Stream Lookup vs. Database Lookup: performance trade-offs and use cases
• Advanced join patterns: Sorted Merge Join, fuzzy matching, and multi-stream merging
• Working with JSON and XML: parsing, generating, and transforming semi-structured data
• Dynamic SQL and parameterized queries in Table Input
• Using the JavaScript step and Formula step for complex business logic
• Handling large datasets: lazy conversion, compression, and streaming optimizations
• Lab: Build a transformation that processes nested JSON and loads a normalized relational model
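The dynamic-SQL topic above combines with command-line parameters: with "Replace variables in script" enabled, a Table Input query may contain a placeholder such as `${P_REGION}` that is substituted at run time. A sketch of supplying that value via Pan; the transformation name and parameter are assumptions:

```shell
#!/bin/sh
# Sketch: passing a parameter that Table Input substitutes into its SQL.
# With variable replacement enabled, the query can read e.g.:
#   SELECT * FROM sales WHERE region = '${P_REGION}'
# regional_extract.ktr and P_REGION are hypothetical names.
PDI_HOME="${PDI_HOME:-/opt/pentaho/data-integration}"
CMD="$PDI_HOME/pan.sh -file=regional_extract.ktr -param:P_REGION=EMEA"
echo "$CMD"
# eval "$CMD"   # run only where PDI is installed
```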

Day 5: Orchestration, Deployment & Integration

• Advanced job design: sub-jobs, parallel execution, and dynamic file handling
• Metadata injection: building dynamic, reusable transformation templates
• Connecting to REST APIs and web services: HTTP Client, REST Client steps
• Integrating PDI with messaging systems (Kafka, JMS) and cloud storage (S3, SFTP)
• Deploying PDI on Carte server: remote execution and clustering basics
• CI/CD for PDI: version control, automated testing with PDI unit test framework
• Lab: Design and deploy a fully parameterized, scheduled pipeline with API ingestion, transformation logic, and database output
• Q&A and recap
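Carte, covered above for remote execution, exposes HTTP endpoints such as `/kettle/status`. A minimal status check could look like the sketch below; host, port, and credentials are assumptions (Carte ships with a default `cluster`/`cluster` account that should be changed in production):

```shell
#!/bin/sh
# Sketch: querying a Carte server's status endpoint before remote execution.
# Host, port, and credentials are illustrative assumptions.
CARTE_URL="http://carte-host:8080/kettle/status/?xml=Y"
CMD="curl -s -u cluster:cluster $CARTE_URL"
echo "$CMD"
# eval "$CMD"   # returns XML listing running transformations and jobs
```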