
Workflow Basics

Learn the fundamentals of writing Dagu workflows.

Your First Workflow

Create hello.yaml:

yaml
steps:
  - echo "Hello from Dagu!"

Run it:

bash
dagu start hello.yaml
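
Dagu runs the workflow and records the result. To check the most recent run (assuming the standard status subcommand is available in your installation):

bash
dagu status hello.yaml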

Workflow Structure

A complete workflow definition combines metadata, configuration, steps, and handlers:

yaml
# Metadata
name: data-pipeline
description: Process daily data
tags: [etl, production]

# Configuration  
schedule: "0 2 * * *"
params:
  - DATE: ${DATE:-today}

# Steps
steps:
  - name: process
    command: python process.py ${DATE}

# Handlers
handlerOn:
  failure:
    command: notify-error.sh
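
Values declared under params can be overridden when you start the workflow; named parameters go after -- on the command line (a sketch assuming the standard CLI syntax):

bash
dagu start data-pipeline.yaml -- DATE=2024-06-01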

Steps

Steps are the basic unit of execution: each one runs a command, a script, or an executor.

Step Names

Step names are optional. When omitted, Dagu automatically generates names based on the step type:

yaml
steps:
  - echo "First step"              # Auto-named: cmd_1
  - script: |                      # Auto-named: script_2
      echo "Multi-line"
      echo "Script"
  - name: explicit-name            # Explicit name
    command: echo "Third step"
  - executor:                      # Auto-named: http_4
      type: http
      config:
        url: https://api.example.com
  - run: child-workflow            # Auto-named: dag_5

Auto-generated names follow the pattern {type}_{number}:

  • cmd_N - Command steps
  • script_N - Script steps
  • http_N - HTTP executor steps
  • dag_N - DAG executor steps
  • container_N - Docker/container steps
  • ssh_N - SSH executor steps
  • mail_N - Mail executor steps
  • jq_N - JQ executor steps

For parallel steps (see below), the pattern is parallel_{group}_{type}_{index}.

Shorthand Command Syntax

For simple commands, you can use an even more concise syntax:

yaml
steps:
  - echo "Hello World"
  - ls -la
  - python script.py

This is equivalent to:

yaml
steps:
  - name: cmd_1
    command: echo "Hello World"
  - name: cmd_2
    command: ls -la
    depends: cmd_1
  - name: cmd_3
    command: python script.py
    depends: cmd_2

Multi-line Scripts

Use the script field to run several commands as a single script:

yaml
steps:
  - script: |
      #!/bin/bash
      set -e
      
      echo "Processing..."
      python analyze.py data.csv
      echo "Complete"

Shell Selection

Pick the interpreter a step runs under with the shell field:

yaml
steps:
  - name: bash-task
    shell: bash
    command: echo $BASH_VERSION
    
  - name: python-task
    shell: python3
    script: |
      import pandas as pd
      df = pd.read_csv('data.csv')
      print(df.head())

Dependencies

Steps run sequentially by default, each waiting for the previous one to finish. Use depends to control ordering explicitly or to run steps in parallel.

yaml
steps:
  - name: download
    command: wget data.csv
    
  - name: process
    command: python process.py
    
  - name: upload
    command: aws s3 cp output.csv s3://bucket/

Parallel Execution

You can run steps in parallel using explicit dependencies:

yaml
steps:
  - name: setup
    command: echo "Setup"
    
  - name: task1
    command: echo "Task 1"
    depends: setup
    
  - name: task2
    command: echo "Task 2"
    depends: setup
    
  - name: finish
    command: echo "All tasks complete"
    depends: [task1, task2]
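
Both task1 and task2 become runnable as soon as setup finishes. If you need to cap how many steps run at once, a DAG-level concurrency limit can help; a minimal sketch assuming the maxActiveSteps field (verify against your Dagu version):

yaml
maxActiveSteps: 1  # task1 and task2 would run one at a time

steps:
  - name: setup
    command: echo "Setup"
  - name: task1
    command: echo "Task 1"
    depends: setup
  - name: task2
    command: echo "Task 2"
    depends: setup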

Shorthand Parallel Syntax

For simpler cases, you can use nested arrays to define parallel steps with automatic dependency management:

yaml
steps:
  - echo "Step 1: Sequential"
  - 
    - echo "Step 2a: Parallel"
    - echo "Step 2b: Parallel"
    - echo "Step 2c: Parallel"
  - echo "Step 3: Sequential"

In this syntax:

  • Steps in a nested array run in parallel
  • They automatically depend on the previous sequential step
  • The next sequential step automatically depends on all parallel steps

This is equivalent to:

yaml
steps:
  - name: cmd_1
    command: echo "Step 1: Sequential"
    
  - name: parallel_2_echo_1
    command: echo "Step 2a: Parallel"
    depends: cmd_1
    
  - name: parallel_2_echo_2
    command: echo "Step 2b: Parallel"
    depends: cmd_1
    
  - name: parallel_2_echo_3
    command: echo "Step 2c: Parallel"
    depends: cmd_1
    
  - name: cmd_3
    command: echo "Step 3: Sequential"
    depends: [parallel_2_echo_1, parallel_2_echo_2, parallel_2_echo_3]

You can have multiple parallel groups:

yaml
steps:
  - echo "Start"
  - 
    - echo "First parallel group - task 1"
    - echo "First parallel group - task 2"
  - echo "Middle sequential step"
  -
    - echo "Second parallel group - task 1"
    - echo "Second parallel group - task 2"
    - echo "Second parallel group - task 3"
  - echo "End"

You can also mix shorthand and standard syntax:

yaml
steps:
  - name: setup
    command: ./setup.sh
  - 
    - echo "Parallel task 1"
    - name: test
      command: npm test
      env:
        - NODE_ENV: test
  - name: cleanup
    command: ./cleanup.sh

Working Directory

Set where commands execute:

yaml
steps:
  - name: in-project
    workingDir: /home/user/project
    command: python main.py
    
  - name: in-data
    workingDir: /data/input
    command: ls -la

Environment Variables

Global Environment

yaml
env:
  - API_KEY: secret123
  - ENV: production

steps:
  - name: use-env
    command: echo "Running in $ENV"

Step-Level Environment

Steps can have their own environment variables that override DAG-level ones:

yaml
env:
  - ENV: production

steps:
  - name: dev-test
    command: echo "Running in $ENV"
    env:
      - ENV: development  # Overrides DAG-level
      - TEST_FLAG: true
    # Output: Running in development

Load from .env Files

yaml
dotenv:
  - .env
  - .env.production

steps:
  - name: use-dotenv
    command: echo $DATABASE_URL
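
A .env file is plain KEY=VALUE lines; for example (hypothetical values):

bash
DATABASE_URL=postgres://localhost:5432/mydb
API_TIMEOUT=30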

Capturing Output

Store a command's standard output in a variable and reference it in later steps:

yaml
steps:
  - name: get-version
    command: git rev-parse --short HEAD
    output: VERSION
    
  - name: build
    command: docker build -t app:${VERSION} .

Basic Error Handling

Continue on Failure

yaml
steps:
  - name: optional-step
    command: maybe-fails.sh
    continueOn:
      failure: true
      
  - name: always-runs
    command: cleanup.sh
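
continueOn can also be scoped to specific exit codes; a sketch assuming the exitCode field (check your version's reference):

yaml
steps:
  - name: search
    command: grep pattern file.txt
    continueOn:
      exitCode: [1]  # grep exits 1 on "no match"; treat that as non-fatal

  - name: report
    command: echo "Runs even when nothing matched"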

Simple Retry

yaml
steps:
  - name: flaky-api
    command: curl https://unstable-api.com
    retryPolicy:
      limit: 3
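
To space out attempts, the retry policy can include a wait between tries; a sketch assuming the intervalSec field:

yaml
steps:
  - name: flaky-api
    command: curl https://unstable-api.com
    retryPolicy:
      limit: 3
      intervalSec: 10  # wait 10 seconds between attempts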

Timeouts

Prevent steps from running forever:

yaml
steps:
  - name: long-task
    command: echo "Processing data"
    timeoutSec: 300  # 5 minutes

Step Descriptions

Document your steps:

yaml
steps:
  - name: etl-process
    description: |
      Extract data from API, transform to CSV,
      and load into data warehouse
    command: python etl.py

Tags and Organization

Group related workflows:

yaml
name: customer-report
tags: 
  - reports
  - customer
  - daily

group: Analytics  # UI grouping
