November 27, 2025
7 min

LAO: Building AI Workflows That Never Leave Your Machine

Cloud AI is powerful, but it comes with a cost: your data leaves your control, you pay per token, and you're at the mercy of API rate limits and downtime. For many use cases - security research, private data processing, or any time you want full control - that isn't acceptable.

I built LAO (Local AI Orchestrator) with one philosophy: developers should bend AI to their will - no cloud, no compromise.

The Problem: AI Workflow Fragmentation

Building complex AI workflows usually means:

  • Juggling multiple API keys and services
  • Writing custom glue code to chain models
  • Paying for every request
  • Trusting cloud providers with sensitive data
  • Dealing with rate limits and quotas

Want to run a workflow that:

  1. Extracts text from PDFs
  2. Summarizes each section
  3. Generates questions from summaries
  4. Validates answers against source material

With cloud APIs, that's 4 different service calls, authentication handling, error recovery, and a monthly bill. And your proprietary PDFs just went through someone else's servers.

My Solution: DAG-Based Local Orchestration

LAO lets you build visual workflows as Directed Acyclic Graphs (DAGs) where each node is a step:

  • Run local LLMs (Ollama, llama.cpp)
  • Execute shell commands
  • Process files
  • Transform data
  • Conditional branching based on outputs

All running on your hardware, with your models, processing your data locally.

Architecture Overview

LAO is built with three core components:

1. Core DAG Engine (/core)

The workflow execution engine, written in Rust:

use std::collections::HashMap;

pub struct DAG {
    nodes: HashMap<NodeId, WorkflowNode>,
    edges: Vec<Edge>,
    execution_order: Vec<NodeId>,
}

impl DAG {
    pub async fn execute(&mut self, context: &mut Context) -> Result<Output> {
        // `execution_order` is a topological sort, so every node runs only
        // after all of its upstream dependencies have produced output.
        for node_id in &self.execution_order {
            let node = self
                .nodes
                .get_mut(node_id)
                .expect("execution_order only references known nodes");
            let output = node.execute(context).await?;

            // Pass the output to downstream nodes
            for edge in self.outgoing_edges(node_id) {
                context.set_input(edge.target, output.clone());
            }
        }
        Ok(context.final_output())
    }
}
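
The execute loop above leans on execution_order already being a valid topological ordering of the graph. As a rough sketch of how such an order can be computed - illustrative only, with a string NodeId as a stand-in for LAO's real id type - here is Kahn's algorithm:

use std::collections::{HashMap, VecDeque};

type NodeId = String; // stand-in for LAO's real node id type

/// Illustrative Kahn's algorithm: every node appears after all of its
/// upstream dependencies. Returns None if the graph contains a cycle.
fn topological_order(nodes: &[NodeId], edges: &[(NodeId, NodeId)]) -> Option<Vec<NodeId>> {
    let mut indegree: HashMap<&NodeId, usize> = nodes.iter().map(|n| (n, 0)).collect();
    let mut downstream: HashMap<&NodeId, Vec<&NodeId>> = HashMap::new();
    for (from, to) in edges {
        *indegree.get_mut(to)? += 1;
        downstream.entry(from).or_default().push(to);
    }

    // Start with nodes that have no incoming edges.
    let mut ready: VecDeque<&NodeId> = indegree
        .iter()
        .filter(|(_, &d)| d == 0)
        .map(|(&n, _)| n)
        .collect();

    let mut order = Vec::with_capacity(nodes.len());
    while let Some(node) = ready.pop_front() {
        order.push(node.clone());
        for &next in downstream.get(node).into_iter().flatten() {
            let d = indegree.get_mut(next)?;
            *d -= 1;
            if *d == 0 {
                ready.push_back(next);
            }
        }
    }

    // Fewer emitted nodes than we started with means the graph had a cycle.
    (order.len() == nodes.len()).then_some(order)
}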

2. Plugin System

Modular Rust crates that extend LAO's capabilities:

  • LLM plugins (Ollama, llama.cpp, transformers)
  • File processors (PDF, image, video)
  • Data transformations (JSON, CSV, XML)
  • External integrations (Git, databases)

Plugins are dynamically loaded at runtime and can be developed independently.
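
As a rough sketch of what runtime loading can look like - illustrative only; the `_plugin_create` symbol name and signature are assumptions, not LAO's actual convention - the host can open a plugin library with the libloading crate and call an exported constructor:

use libloading::{Library, Symbol};

/// Keeps the Library alive for as long as the plugin it produced.
pub struct LoadedPlugin {
    pub plugin: Box<dyn WorkflowPlugin>,
    _lib: Library,
}

/// Illustrative only: assumes each plugin exports
/// `fn _plugin_create() -> Box<dyn WorkflowPlugin>` and is built with the
/// same compiler as the host, since Rust's ABI is not otherwise stable.
pub fn load_plugin(path: &str) -> Result<LoadedPlugin, libloading::Error> {
    unsafe {
        let lib = Library::new(path)?;
        let constructor: Symbol<fn() -> Box<dyn WorkflowPlugin>> =
            lib.get(b"_plugin_create")?;
        let plugin = constructor();
        Ok(LoadedPlugin { plugin, _lib: lib })
    }
}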

3. Visual UI (/ui/lao-ui)

A native GUI built with egui (an immediate-mode Rust GUI library):

  • Drag-and-drop workflow builder
  • Real-time execution visualization
  • Node configuration panels
  • Debug output for each step

Key Features I Built

Prompt-Driven Workflow Generation

This is where it gets interesting. Instead of manually building workflows, describe what you want:

lao generate "Extract key points from markdown files,
              generate a summary for each,
              then create a combined report"

LAO uses a local LLM to:

  1. Parse your natural language description
  2. Generate a DAG structure
  3. Configure appropriate plugins
  4. Validate the workflow is executable

The generated workflow is saved as YAML:

nodes:
  - id: extract
    plugin: markdown-parser
    config:
      input_dir: "./docs"

  - id: summarize
    plugin: ollama-llm
    config:
      model: "llama3"
      prompt: "Summarize key points: {input}"

  - id: combine
    plugin: template-renderer
    config:
      template: "report.md.j2"

edges:
  - from: extract
    to: summarize
  - from: summarize
    to: combine

Conditional Logic & Branching

Workflows can make decisions based on step outputs:

nodes:
  - id: analyze-code
    plugin: static-analyzer

  - id: check-security
    plugin: conditional
    config:
      condition: "output.vulnerabilities > 0"
      on_true: "generate-report"
      on_false: "mark-safe"
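
A conditional node ultimately just picks the id of the next node from a step's output. A minimal sketch of that branching for the example above, assuming step outputs are JSON-like values (the field access and function are illustrative):

use serde_json::Value;

/// Illustrative only: mirrors `output.vulnerabilities > 0` from the
/// workflow above and returns the id of the branch to follow.
fn next_node(output: &Value, on_true: &str, on_false: &str) -> String {
    let vulnerabilities = output
        .get("vulnerabilities")
        .and_then(Value::as_u64)
        .unwrap_or(0);
    if vulnerabilities > 0 {
        on_true.to_string()
    } else {
        on_false.to_string()
    }
}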

This enables complex automation like:

  • "Only summarize if document is > 1000 words"
  • "Route to different models based on language detected"
  • "Retry with different parameters on failure"

CLI for Automation

The command-line interface makes LAO perfect for CI/CD:

# Run a workflow
lao run workflow.yaml

# Validate before running
lao validate workflow.yaml

# List available plugins
lao plugins list

# Watch a directory and auto-run
lao watch ./docs --workflow process.yaml

I use this in my security research to automatically:

  • Analyze code samples for vulnerabilities
  • Generate CTF writeup drafts from solution scripts
  • Process and summarize security advisories

Cross-Platform Packaging

LAO packages for every major platform:

  • Linux: .deb, .rpm, AppImage
  • macOS: .dmg
  • Windows: .msi, .zip

One command builds all packages:

cargo xtask package

Technical Deep Dive

Why Rust for AI Orchestration?

You might wonder: why not Python, the AI lingua franca? Several reasons:

1. Performance

DAG execution involves scheduling, parallelization, and state management (see the sketch after the list below). Rust's zero-cost abstractions mean:

  • Fast workflow execution
  • Minimal memory overhead
  • Efficient plugin loading
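
One way to exploit that parallelism is to group nodes into levels whose members have no edges between each other and run each level concurrently. A minimal sketch with Tokio - node lookup and execution are stubbed out, and this is not LAO's actual scheduler:

use tokio::task::JoinSet;

type NodeId = String; // stand-in for LAO's real node id type

/// Illustrative only: run one level of mutually independent nodes in parallel
/// and wait for all of them before the next level starts.
async fn run_level(level: Vec<NodeId>) -> anyhow::Result<Vec<NodeId>> {
    let mut tasks = JoinSet::new();
    for id in level {
        tasks.spawn(async move {
            // A real scheduler would look up the node here, await
            // node.execute(context), and record its output.
            id
        });
    }

    let mut finished = Vec::new();
    while let Some(result) = tasks.join_next().await {
        finished.push(result?); // surfaces panics from any node task
    }
    Ok(finished)
}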

2. Safety

Orchestrating multiple plugins and concurrent LLM calls needs:

  • Thread safety (Rust's ownership prevents data races)
  • Memory safety (no garbage collection pauses during LLM inference)
  • Error handling (Result types force explicit error paths)

3. Native UIs

egui compiles to native code - no Electron bloat:

  • LAO's GUI is ~10MB vs 100MB+ for Electron apps
  • Instant startup, minimal RAM usage
  • Works great on older hardware

4. Single Binary Distribution

Cargo builds a single executable with all dependencies:

cargo build --release
# One binary, no Python environment, no pip install

The Plugin Architecture

Plugins implement a simple trait:

use async_trait::async_trait; // the trait has async methods, so it uses the async-trait crate

#[async_trait]
pub trait WorkflowPlugin: Send + Sync {
    fn name(&self) -> &str;

    async fn execute(
        &self,
        input: PluginInput,
        config: PluginConfig,
    ) -> Result<PluginOutput>;

    fn validate_config(&self, config: &PluginConfig) -> Result<()>;
}

This makes adding new capabilities trivial. Example Ollama plugin:

pub struct OllamaPlugin {
    client: OllamaClient,
}

#[async_trait]
impl WorkflowPlugin for OllamaPlugin {
    fn name(&self) -> &str { "ollama-llm" }

    async fn execute(&self, input: PluginInput, config: PluginConfig)
        -> Result<PluginOutput>
    {
        // Prompt templating ({input} substitution) is elided here for brevity.
        let prompt = config.get("prompt")?;
        let model = config.get("model")?;

        let response = self.client
            .generate(model, prompt)
            .await?;

        Ok(PluginOutput::text(response))
    }

    fn validate_config(&self, config: &PluginConfig) -> Result<()> {
        // Fail fast before the workflow runs if required keys are missing.
        config.get("prompt")?;
        config.get("model")?;
        Ok(())
    }
}
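
Because every plugin is handled as a Box<dyn WorkflowPlugin>, dispatching on the plugin: name from a workflow file comes down to a map lookup. A minimal registry sketch - illustrative, not LAO's actual registry:

use std::collections::HashMap;

#[derive(Default)]
pub struct PluginRegistry {
    plugins: HashMap<String, Box<dyn WorkflowPlugin>>,
}

impl PluginRegistry {
    /// Registers a plugin under the name it reports for itself.
    pub fn register(&mut self, plugin: Box<dyn WorkflowPlugin>) {
        self.plugins.insert(plugin.name().to_string(), plugin);
    }

    /// Looks up the plugin a workflow node refers to, e.g. "ollama-llm".
    pub fn get(&self, name: &str) -> Option<&dyn WorkflowPlugin> {
        self.plugins.get(name).map(|p| p.as_ref())
    }
}

A registry like this is also a natural place to call each plugin's validate_config before the workflow runs.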

Offline-First Design

LAO never makes network requests unless explicitly configured:

  • All models run locally (Ollama, llama.cpp)
  • Plugin registry is local
  • Workflow definitions are local files
  • No telemetry, no analytics, no phone-home

Your data stays on your machine. Always.

Real-World Use Cases

Security Research

I use LAO to analyze malware samples:

  1. Extract strings from binary
  2. Classify suspicious patterns (local LLM)
  3. Generate YARA rules
  4. Cross-reference with CVE database
  5. Output structured report

All without sending samples to cloud APIs.

Document Processing

Batch process research papers:

  1. Extract text from PDFs
  2. Identify methodology sections
  3. Summarize findings
  4. Generate comparison table
  5. Export to markdown

Code Analysis

Audit codebases for security issues:

  1. Run static analysis
  2. LLM reviews flagged code
  3. Generate remediation suggestions
  4. Create GitHub issues automatically

Current Status & Roadmap

LAO is in active development with a working GUI, CLI, and plugin system.

Completed:

  • Core DAG engine with parallel execution
  • Plugin architecture with dynamic loading
  • egui-based visual workflow builder
  • CLI tools for automation
  • Cross-platform packaging

In Progress:

  • More built-in plugins (Git, Docker, HTTP)
  • Workflow marketplace/sharing
  • Advanced scheduling (cron-like triggers)
  • Better error recovery and retry logic

Planned:

  • VSCode extension for editing workflows
  • Remote execution (distributed across your own machines, still self-hosted)
  • Workflow versioning and rollback
  • Performance profiling and optimization

What I Learned

Building LAO taught me:

  • Async Rust: Tokio runtime, async traits, concurrent DAG execution
  • GUI Programming: egui immediate-mode patterns, native rendering
  • Plugin Systems: Dynamic loading, ABI stability, trait objects
  • AI Integration: Running local models efficiently, prompt engineering
  • Cross-Platform Builds: Packaging for multiple OSes, dependency management

The hardest part? Making the UI intuitive. DAGs are powerful but complex - balancing power and usability took many iterations.

Why Local-First Matters

Cloud AI is convenient, but local-first is:

  • Private: Your data never leaves your machine
  • Cost-effective: No per-token pricing
  • Reliable: No API downtime or rate limits
  • Flexible: Use any model, any hardware
  • Sovereign: You control the entire stack

For security research, privacy-sensitive work, or any time you need full control - local is the only option.

Try It Yourself

LAO is open source under MIT license.

# Clone and build
git clone https://github.com/abendrothj/lao
cd lao
cargo build --release

# Run the GUI
./target/release/lao-ui

# Or use the CLI
./target/release/lao run examples/summarize.yaml

Check out /workflows for example workflows and /docs for architecture details.

If you build workflows with LAO, I'd love to see what you create. The plugin system is designed for community extensions - contribute your own!


The future of AI doesn't have to run in someone else's datacenter. Sometimes the best cloud is the one sitting on your desk.