

Workshop 1
14:00 - 15:30

Creating Workflows with Airflow


Learning Outcomes

Understand the fundamentals of Apache Airflow

Explain what Airflow is, its purpose, core concepts, architecture, and user interface

Design and implement workflows

Create a Directed Acyclic Graph (DAG) to define workflows, set up tasks, and establish dependencies between them
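As a rough preview of what this looks like in practice, the sketch below shows a minimal DAG with two dependent tasks, assuming a recent Airflow 2.x install; the DAG id, task ids, and schedule are illustrative placeholders, not the workshop's materials.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator
from airflow.operators.python import PythonOperator


def say_hello():
    print("Hello from Airflow!")


# Hypothetical example DAG: one EmptyOperator followed by one PythonOperator.
with DAG(
    dag_id="hello_world_dag",           # placeholder name
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    start = EmptyOperator(task_id="start")
    hello = PythonOperator(task_id="say_hello", python_callable=say_hello)

    # The >> operator declares the dependency: start runs before say_hello.
    start >> hello
```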

Utilize operators and advanced features

Use BashOperator, PythonOperator, and SQLExecuteQueryOperator; leverage XComs for inter-task communication; and work with Variables, conditional branching, and TaskGroups
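To illustrate the XCom mechanism mentioned above (a sketch with made-up DAG and task names, not workshop code): a PythonOperator callable's return value is pushed to XCom by default, and a downstream task can pull it through the task instance.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def push_value():
    # The return value is automatically pushed to XCom under "return_value".
    return 42


def pull_value(ti):
    # "ti" (the TaskInstance) is injected from the task context by name.
    value = ti.xcom_pull(task_ids="push_value")
    print(f"Pulled {value} from XCom")


with DAG(
    dag_id="xcom_demo",                 # placeholder name
    start_date=datetime(2025, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    push = PythonOperator(task_id="push_value", python_callable=push_value)
    pull = PythonOperator(task_id="pull_value", python_callable=pull_value)

    push >> pull
```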

Schedule and manage workflows

Implement catchup, backfill, and CRON expressions to schedule tasks effectively
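These scheduling knobs fit together roughly as in the sketch below (DAG name, dates, and cron expression are illustrative): schedule accepts a cron expression, catchup=True tells the scheduler to create runs for every missed interval since start_date, and a date range can also be backfilled from the CLI.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="nightly_report",            # placeholder name
    start_date=datetime(2025, 1, 1),
    schedule="30 2 * * *",              # cron expression: every day at 02:30
    catchup=True,                       # create runs for missed intervals
) as dag:
    BashOperator(task_id="build_report", bash_command="echo 'building report'")

# Missed intervals can also be backfilled explicitly from the CLI, e.g.:
#   airflow dags backfill nightly_report --start-date 2025-01-01 --end-date 2025-01-07
```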

Gain hands-on experience

Successfully complete practical exercises to reinforce theoretical knowledge and apply it to real-world scenarios

Modules

1. Introducing Apache Airflow

   • What is Airflow?
   • Why use Airflow?
   • Core Concepts, Architecture, and UI

2. Running Workflows on Airflow

   • Creating your first DAG
   • Defining tasks and dependencies

3. Using Operators, XComs, and Variables

   • BashOperator, PythonOperator, and SQLExecuteQueryOperator
   • Using XComs to pass values between tasks
   • Variables
   • Conditional Branching and TaskGroups

4. Catchup, Backfill, and CRON Expressions

5. Advanced Concepts

   • Parallelism
   • Pools and Sensors
   • Custom Operators (a brief sketch follows this list)

6. Hands-on Exercise
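To give a flavour of the custom operators covered in Module 5, the sketch below subclasses BaseOperator and overrides execute(); the operator name and behaviour are made up for illustration and are not taken from the workshop.

```python
from airflow.models.baseoperator import BaseOperator


class GreetOperator(BaseOperator):
    """Toy custom operator: logs a greeting and returns it (pushed to XCom)."""

    def __init__(self, name: str, **kwargs):
        super().__init__(**kwargs)
        self.name = name

    def execute(self, context):
        message = f"Hello, {self.name}!"
        self.log.info(message)
        return message
```

Inside a DAG it is used like any built-in operator, e.g. GreetOperator(task_id="greet", name="SOSCON").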

Requirements

• Basic knowledge of Python
• Basic knowledge of SQL
• Basic knowledge of command-line usage

Alyssa Fae Ocfemia

Lead Engineer, Samsung R&D Institute Philippines

Alyssa is a Lead Data Engineer with over eight years of experience at Samsung R&D Institute Philippines (SRPH). A BS Computer Science graduate of the University of the Philippines Los Baños, she initially aspired to be a Frontend Developer. Upon joining SRPH, she was accepted as a Backend Developer and gained proficiency in Spring and Java programming. She was later moved to a new Big Data project, which sparked her exploration of various areas of Big Data Analytics and ultimately led her to fall in love with Data Engineering. This experience allowed her to work with numerous Big Data technologies, with Python and the Hadoop framework as her primary tools, and to gain expertise in cloud services like AWS and GCP for building robust data platforms. With five years of experience using Apache Airflow, she has excelled in workflow orchestration and automation. Today, she is part of a team at SRPH driving innovation and delivering impact through data.
