Season 5 · 11 episodes · 39 min · 2026

OpenClaw Gateway

v2026.3 — 2026 Edition. A comprehensive technical deep dive into the OpenClaw Gateway architecture, agent routing, and tools.

LLM Infrastructure · Multi-Agent Systems
1
The Personal AI Gateway Architecture
We explore what OpenClaw is and why it exists. Listeners will learn how a single self-hosted Gateway process connects various chat apps to an AI agent runtime.
3m 35s
2
Multi-Agent Routing
Learn how multiple agent identities coexist on a single Gateway. We cover channel bindings, routing rules, and isolating workspaces.
3m 12s
3
The Agent Workspace and System Prompts
Discover how OpenClaw shapes agent behavior. Listeners will learn about the system prompt assembly and bootstrap files like SOUL.md and AGENTS.md.
4m 05s
4
Session Management and DM Isolation
A deep dive into conversation routing and privacy. Listeners will understand DM scopes, session isolation, and lifecycle resets.
3m 56s
5
Managing Context Limits with Compaction
Learn how OpenClaw handles infinite conversations within finite LLM context windows using the Context Engine and auto-compaction.
3m 24s
6
Security and Trust Boundaries
Understand the OpenClaw trust model. Listeners will learn why it's a personal assistant gateway rather than a multi-tenant sandbox, and how to harden it.
3m 12s
7
The Exec Tool and Runtime Approvals
Explore how the agent interacts with your filesystem and shell safely. We cover the exec tool, safe binaries, and explicit approval flows.
3m 39s
8
Teaching Agents with Skills
Learn how to expand your agent's capabilities without writing code. We explore AgentSkills formatting, load-time gating, and ClawHub.
3m 43s
9
The Managed Browser Tool
Discover how OpenClaw gives agents eyes on the web. Listeners will learn about isolated Chromium profiles and existing-session MCPs.
3m 13s
10
Ephemeral Sub-Agents
Take orchestration to the next level by spawning background workers. We cover the sessions_spawn tool, nesting depth, and result announcements.
3m 20s
11
Proactive Automation Workflows
Turn your reactive bot into a proactive assistant. Listeners will learn how to combine Heartbeats, Cron jobs, and Hooks for powerful automation.
3m 58s

Episodes

1

The Personal AI Gateway Architecture

3m 35s

We explore what OpenClaw is and why it exists. Listeners will learn how a single self-hosted Gateway process connects various chat apps to an AI agent runtime.

Hi, this is Alex from DEV STORIES DOT EU. OpenClaw Gateway, episode 1 of 11. You are away from your desk, but you need a quick code review. You send a message from your phone, and seconds later, a detailed analysis arrives back, generated entirely by a private server running on your local machine. The Personal AI Gateway Architecture is what makes this completely self-hosted setup possible. Most AI chatbots force your personal data through a third-party orchestration layer. If you want to connect a messaging app to a language model, you typically rely on a cloud service to glue the APIs together. This exposes your private queries and introduces latency. The OpenClaw Gateway architecture removes the intermediary service by running everything inside a single self-hosted process. This process serves as a dedicated bridge. It sits directly between your everyday messaging surfaces and your chosen AI runtime. Let us trace the exact path of a message to see how the logic flows. You send a code snippet from your phone using WhatsApp. On your local machine, the gateway process is already running. It maintains persistent connections to your messaging apps using specific libraries. For WhatsApp, the gateway uses the Baileys library to manage the direct socket connection. For Telegram, it relies on the grammY framework. When your message hits the local server, it arrives wrapped in a protocol-specific data structure. A WhatsApp event has a totally different payload shape than a Telegram event. The gateway immediately parses these incoming messages. It strips away the platform-specific wrappers, extracts the raw text and the sender identifier, and packs them into a standardized internal object. Here is the key insight. By the time your message reaches the AI runtime, the engine does not know where the text originated. The runtime operates completely independently of Baileys or grammY. It only sees a clean, uniform request. 
The AI processes your code snippet, generates the review, and hands a plain text response back to the gateway. The gateway then reverses the flow. It checks the origin marker attached to the initial request. If you asked the question via WhatsApp, the gateway formats the AI response into a Baileys-compatible structure and pushes it over the socket directly to your phone. If the request came from Telegram, it uses grammY to dispatch the reply. Keeping all of this within a single self-hosted process drastically reduces operational complexity. You do not need to deploy multiple microservices, configure message queues, or expose local endpoints to external webhooks just to route a text. One isolated application manages the network sockets, executes the normalization logic, and invokes the AI engine. Because the gateway unifies multiple channels internally, your conversation context remains centralized. You can start troubleshooting a bug on Telegram while walking, and ask a follow-up question later on WhatsApp. The gateway ensures the AI runtime retains the complete history, regardless of which mobile app you open. The most significant advantage of this architecture is absolute control over your inputs and outputs across any messaging interface, entirely on your own hardware. If you enjoy the podcast and want to support the show, you can search for DevStoriesEU on Patreon. That is all for this one. Thanks for listening, and keep building!
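The normalization step described in this episode can be sketched as follows. This is a Python illustration, not OpenClaw's actual code (the project sits in the TypeScript ecosystem with Baileys and grammY); the payload field names are simplified stand-ins for the real protocol shapes.

```python
def normalize(channel: str, raw: dict) -> dict:
    """Strip platform-specific wrappers into one uniform internal object.
    Field names are illustrative; real Baileys/grammY payloads differ."""
    if channel == "whatsapp":
        # Baileys-style event: nested message object keyed by sender JID
        return {
            "channel": "whatsapp",
            "sender": raw["key"]["remoteJid"],
            "text": raw["message"]["conversation"],
        }
    if channel == "telegram":
        # grammY-style update: flat message with a `from` object
        return {
            "channel": "telegram",
            "sender": str(raw["from"]["id"]),
            "text": raw["text"],
        }
    raise ValueError(f"unknown channel: {channel}")
```

The origin marker (`channel`) stays attached to the object, which is what lets the gateway reverse the flow and pick the right outbound library later.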
2

Multi-Agent Routing

3m 12s

Learn how multiple agent identities coexist on a single Gateway. We cover channel bindings, routing rules, and isolating workspaces.

Hi, this is Alex from DEV STORIES DOT EU. OpenClaw Gateway, episode 2 of 11. Your work bot and your personal assistant bot should never share the same memories, yet spinning up a separate server instance for every new bot persona is an operational nightmare. This is the exact tension resolved by Multi-Agent Routing. Listeners often confuse this setup with multi-tenant SaaS architecture. Let us clear that up immediately. Multi-Agent Routing is not designed for hosting disparate bots for thousands of external customers. Instead, it exists to organize one owner's various personas, or to link specific group chats to specific, purpose-built bots, all running efficiently under one roof. To make this work, the system strictly separates two concepts. You need to understand the difference between an account ID and an agent ID. The account ID is the human. It is the identifier of the person sending the text. The agent ID is the bot. It defines the specific persona, model, and instructions the human is talking to. A single human with one account ID will routinely talk to multiple different agent IDs throughout the day. On the OpenClaw Gateway, multiple isolated agents run side-by-side. You do not share memory states across them. Every agent ID gets its own dedicated workspace and its own distinct session store. When an inbound message hits the Gateway, the system must figure out exactly which isolated workspace should receive it. It does this using routing rules known as bindings. Bindings are deterministic mappings. They look at the exact metadata attached to an incoming message and route it accordingly. Every inbound message carries a payload of connection data. This includes the channel, such as WhatsApp or Telegram. It includes the account ID of the sender. It can also include a peer identifier, which might dictate a specific group chat room. You configure bindings to evaluate this metadata. 
For example, you can create a binding that dictates any message arriving on the WhatsApp channel routes directly to a fast, everyday agent ID. This agent handles quick tasks, grocery lists, or simple web searches. In the very same configuration, you set another binding stating that any message arriving via Telegram routes to a heavy-duty agent ID running a larger model like Opus for deep coding work. The logic flow is straightforward. The Gateway receives a Telegram message. It reads the channel metadata and your account ID. It checks the bindings, finds the rule matching Telegram, and forwards the payload to the Opus agent ID. The Opus agent wakes up in its isolated workspace. It queries its own dedicated session store to retrieve the conversation history. It has absolutely no access to the grocery list you just sent to the WhatsApp agent. Here is the key insight. Multi-Agent Routing turns a single Gateway into a deterministic switchboard, using channel and user metadata to guarantee the right persona always fields the right request without ever cross-contaminating their memories. As always, thanks for listening. See you in the next episode.
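The deterministic switchboard behavior can be sketched like this. A minimal Python illustration under assumed names (`match`, `agent_id`, the `route` helper); OpenClaw's real binding configuration schema will differ.

```python
def route(bindings: list[dict], msg: dict) -> str:
    """Return the agent ID of the first binding whose fields all match the
    message metadata. A binding may constrain channel, account_id, or peer;
    keys it omits match anything."""
    for b in bindings:
        if all(msg.get(k) == v for k, v in b["match"].items()):
            return b["agent_id"]
    return "default"

# Illustrative bindings mirroring the episode's example
bindings = [
    {"match": {"channel": "telegram"}, "agent_id": "opus-coder"},
    {"match": {"channel": "whatsapp"}, "agent_id": "fast-everyday"},
]
```

Because matching is a pure function of message metadata, the same message always lands in the same isolated workspace — there is no scoring or fallback heuristic involved.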
3

The Agent Workspace and System Prompts

4m 05s

Discover how OpenClaw shapes agent behavior. Listeners will learn about the system prompt assembly and bootstrap files like SOUL.md and AGENTS.md.

Hi, this is Alex from DEV STORIES DOT EU. OpenClaw Gateway, episode 3 of 11. Ever wish you could literally write a soul for your AI, perhaps making it respond exclusively as a grumpy space lobster? In OpenClaw, achieving that level of character control requires zero code. It happens entirely through plain text, which brings us to the Agent Workspace and System Prompts. An agent in OpenClaw is defined by its workspace. This workspace is simply a file directory containing a specific set of Markdown bootstrap files. Instead of burying your system prompt inside application logic or complex database fields, OpenClaw exposes the core instructions as readable, version-controllable documents. When the application runs, OpenClaw reads these files to construct the monolithic system prompt that guides the large language model. The construction relies on four primary documents. The first is the agents markdown file. This document establishes the core identity, primary objectives, and strict operational boundaries of the agent. Think of it as the foundational job description. It tells the model what it is supposed to achieve and what topics it must avoid entirely. The second document is the soul markdown file. This is where personality, tone, and conversational style live. This is exactly where you instruct the model to act like a grumpy space lobster. You write explicit directions telling the agent to complain about the freezing vacuum of space, use crustacean metaphors, and act generally annoyed by human inquiries. By isolating personality from the core logic, you can swap the tone of your agent without risking its functional reliability. The third component is the tools markdown file. This text explains the external capabilities available to the agent. It describes which functions the model can trigger, the required parameters for those functions, and how to logically interpret the results. 
It bridges the gap between the internal reasoning of the model and your actual codebase. The final document is the user markdown file. This file injects context about the person interacting with the agent. It can hold user preferences, technical skill levels, or account constraints. This ensures the agent tailors its responses to the specific human on the other end of the chat, rather than offering generic advice. Here is the key insight. OpenClaw takes the contents of these four files and concatenates them together. This combined string becomes the final system prompt. The crucial detail is that this prompt is injected into the context window on every single conversational turn. The model does not read these files once at startup and somehow hold them in a separate memory bank. It reads the entire concatenated block every time the user sends a new message. This architectural choice dictates how you must write your workspace files. Because the entire text of the workspace is prepended to every single interaction, your token count will add up aggressively. If you write a three-page backstory in the soul file, you pay the processing cost for those three pages every time the user simply says hello. More importantly, large system prompts consume the available context window limits. A bloated workspace crowds out the actual conversation history, causing the model to forget earlier parts of the chat much faster. You must be ruthless when editing your workspace documents. Remove redundant instructions. Use precise language. If a rule in the agents file is never triggered, delete it. The system prompt is not a one-time configuration step. It is a recurring tax on your context window and your API budget, paid on every single turn of the conversation. Keep it lean, and your agent will remain focused. Thanks for listening. Take care, everyone.
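The assembly step — concatenating the four bootstrap files into one system prompt on every turn — can be sketched as below. A Python illustration; the helper names and the exact join behavior are assumptions, though the file names match the episode.

```python
from pathlib import Path

# The four bootstrap documents, in assembly order
BOOTSTRAP_FILES = ["AGENTS.md", "SOUL.md", "TOOLS.md", "USER.md"]

def concat_parts(parts: list[str]) -> str:
    """Join non-empty sections with blank lines between them."""
    cleaned = [p.strip() for p in parts]
    return "\n\n".join(p for p in cleaned if p)

def assemble_system_prompt(workspace: Path) -> str:
    """Read each bootstrap file from the workspace directory and concatenate
    them. Crucially, this combined string is re-sent on every turn, so every
    line you keep is a recurring token cost."""
    parts = []
    for name in BOOTSTRAP_FILES:
        f = workspace / name
        parts.append(f.read_text() if f.exists() else "")
    return concat_parts(parts)
```

The per-turn re-injection is why the episode's advice to keep these files lean matters: a 1,000-token SOUL.md costs 1,000 tokens on every single message, not once.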
4

Session Management and DM Isolation

3m 56s

A deep dive into conversation routing and privacy. Listeners will understand DM scopes, session isolation, and lifecycle resets.

Hi, this is Alex from DEV STORIES DOT EU. OpenClaw Gateway, episode 4 of 11. You deploy a new chat bot to your company workspace. Alice asks it to summarize her private meeting notes, and five minutes later, it accidentally quotes those notes back to Bob. Your agent just leaked data because it treats everyone in the workspace as the exact same person. Fixing this requires understanding Session Management and DM Isolation. Before fixing the overlap, we need to address a common misconception. Engineers often confuse session keys with authentication tokens. They are not the same thing. Session keys are not security barriers. They are simply routing selectors. They tell the OpenClaw system which block of conversation history to pull from the database and inject into the prompt. If you need to restrict who can talk to your agent, you use proper authentication. Session keys just keep the text separated. Every interaction with an OpenClaw agent happens inside a session. The session holds the conversation history and the active short-term context. By default, OpenClaw routes all traffic through a single shared session key called main. If you run a local terminal script or a personal assistant just for yourself, this default behavior works perfectly. All your context stays in one continuous thread. But if you connect that exact same agent to a multi-user platform, the default setting breaks down. Every user talking to the bot writes to the exact same main history. The agent reads the prompt from Alice, generates an answer, and saves it. When Bob sends a message ten seconds later, the agent reads the input from Bob alongside the previous input from Alice. This is where it gets interesting. You prevent this overlap using DM Isolation settings. When you configure your platform integration, you change the session routing strategy from the default to per-channel-peer. When you enable per-channel-peer, OpenClaw stops routing traffic to the main session. 
Instead, it generates a unique session key dynamically for every incoming message. It does this by combining the platform channel identifier with the user identifier. Now, when Alice messages the bot in a specific channel, OpenClaw builds a session key unique to her and that channel. When Bob messages the bot, his user identifier generates a completely different session key. The system loads a clean, empty state for Bob. Their contexts are entirely isolated. If Alice talks to the bot in a completely different channel, she gets a fresh session there, too. These sessions do not hold state forever. OpenClaw handles session cleanup through two specific lifecycle events. The first is an idle reset. If a particular session receives no new messages for a configured duration, the system drops the context. The next time the user sends a message, they start with a blank slate. The second cleanup mechanism is a hard daily reset. Regardless of how active a conversation is, OpenClaw forcibly purges all session contexts at exactly 4:00 AM server time. This daily reset acts as an automated garbage collection step. It ensures that memory is freed up and that long-running conversations do not silently consume massive amounts of context tokens over weeks of use. When you deploy agents to group environments, never assume the platform handles user separation for you. Explicitly mapping your session keys to the correct user boundary is the only way to prevent accidental context leaks. That is all for this one. Thanks for listening, and keep building!
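The routing-selector idea is small enough to show directly. A hedged Python sketch — the strategy names mirror the episode (`main`, `per-channel-peer`) but the key format is an assumption.

```python
def session_key(strategy: str, channel: str, peer: str) -> str:
    """Derive the session routing key. 'main' collapses all traffic into one
    shared history; 'per-channel-peer' gives every (channel, user) pair its
    own isolated history. Keys select history; they are NOT auth tokens."""
    if strategy == "per-channel-peer":
        return f"{channel}:{peer}"
    return "main"
```

With the default strategy Alice and Bob produce the same key and therefore share one transcript; with per-channel-peer their keys differ, so each loads a clean, separate state.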
5

Managing Context Limits with Compaction

3m 24s

Learn how OpenClaw handles infinite conversations within finite LLM context windows using the Context Engine and auto-compaction.

Hi, this is Alex from DEV STORIES DOT EU. OpenClaw Gateway, episode 5 of 11. AI context windows are not infinite, but your conversation often needs to be. When a long-running session hits a wall, the standard fix is to start dropping older messages entirely, which makes the agent suddenly forget critical setup steps. Managing Context Limits with Compaction solves this by gracefully folding older chats into dense summaries before the limit is ever breached. Before looking at the mechanics, do not confuse this with session pruning. Session pruning is a separate operation that only trims excess tool results. Compaction operates directly on the core conversation history. Think about a long-running coding session. The agent has been generating boilerplate, reading local files, and debugging logic errors for an hour. Every interaction adds to the token count. If the system hits the hard limit of the underlying language model, the API rejects the request and the session crashes. You need a way to reclaim space without breaking the logic of the assistant. OpenClaw handles this through the Context Engine. The Context Engine manages the entire flow of messages between the user and the model. Inside this engine, there is a specific lifecycle point called compact. This phase acts as an automated safety valve for context overflow. The engine actively monitors the token usage of the current conversation. You define a maximum token threshold in your configuration. As long as the conversation stays below this threshold, messages pass through normally. When the token count approaches the limit, the engine automatically triggers a memory flush via the compact lifecycle point. When the flush triggers, the system splits the message history into two sections. It separates the most recent messages from the older, historical messages. The recent messages remain completely intact. 
The engine preserves the exact wording of the immediate back-and-forth so the agent does not lose its current train of thought or the exact syntax of the function you are actively working on. Here is the key insight. The older messages are not discarded. Instead, they are routed to a secondary summarization process. This process reads the bulk of the early conversation and condenses it into a short summary text. This text captures the original goals, the architectural decisions made early on, and any established rules, while stripping out conversational filler and obsolete iterations of the code. The engine then restructures the active memory. It replaces the large block of raw older messages with this single summary block. The newly structured prompt contains the summary block first, followed by the verbatim recent messages. The total token count drops drastically. The agent still understands the historical context by reading the summary, and it can continue executing the active task by reading the recent messages. The conversation continues smoothly without any manual intervention. Effective context management is not about retaining every exact word you typed, but about systematically compressing the past so the agent has maximum room to reason about the present. That is all for this one. Thanks for listening, and keep building!
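The split-summarize-restructure cycle can be sketched in a few lines. In OpenClaw the `summarize` step would be an LLM call; here any callable stands in, and the thresholds and bracket format are illustrative, not the real Context Engine's.

```python
def compact(history: list[str], max_items: int, keep_recent: int, summarize) -> list[str]:
    """If history exceeds max_items, fold everything except the last
    keep_recent messages into a single summary entry produced by
    `summarize`, keeping the recent messages verbatim."""
    if len(history) <= max_items:
        return history
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [f"[summary] {summarize(older)}"] + recent
```

Note the asymmetry: recent messages are preserved word-for-word so the active task is undisturbed, while only the older bulk is compressed.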
6

Security and Trust Boundaries

3m 12s

Understand the OpenClaw trust model. Listeners will learn why it's a personal assistant gateway rather than a multi-tenant sandbox, and how to harden it.

Hi, this is Alex from DEV STORIES DOT EU. OpenClaw Gateway, episode 6 of 11. Putting an AI in a shared Slack workspace with terminal access sounds like a guaranteed security incident, unless you strictly define who is allowed to tell it what to do. Today we cover Security and Trust Boundaries. The fundamental rule of the OpenClaw Gateway is that it operates on a personal assistant trust model. Many developers assume AI gateways come with complex user-level authorization built in. OpenClaw does not. OpenClaw is not designed for hostile multi-tenant isolation. It does not try to safely separate malicious users from each other on a single shared instance. Instead, the entire Gateway acts as a single boundary representing one trusted operator. The system assumes that anyone who can communicate with the Gateway is authorized to act on behalf of the owner. Think of it like an unlocked workstation. If your OpenClaw instance is configured with a tool that modifies infrastructure, anyone who can talk to that instance can trigger that modification. The AI model itself has no concept of user roles or access tokens. It only sees a stream of incoming text. This brings us to the serious risk of shared channels. If you connect your Gateway to a public Telegram group or a large team Slack channel, every single user in that channel is now inside your trust boundary. The AI treats every message it reads as a valid instruction. If an external user types a prompt injection attack into the chat, they are hijacking the delegated tool authority of your bot. The Gateway cannot distinguish between you asking for a system status and an attacker tricking the model into running a destructive shell script. The authority belongs to the bot, but the control belongs to whoever provides the prompt. If you have an exposed connection, like a Telegram bot, you must lock it down. First, turn off elevated tools for that specific Gateway profile. 
Do not give a publicly accessible bot access to your local file system or sensitive internal APIs. Keep its toolset limited to read-only or harmless actions. Second, restrict the communication layer. Configure the connection to only accept direct messages from specific paired users, effectively ignoring group chats and strangers entirely. By limiting who can input text and what tools the bot can execute, you secure the boundary. To verify you have not left a door open, use the built-in command line utility. Run the command openclaw security audit. This tool scans your active Gateway configuration and checks for two primary risks. First, it checks your network exposure. It will warn you if your instance is listening on public interfaces rather than safely bound to local host. Second, it flags permissive tools. The audit will alert you if you have high-risk capabilities, like arbitrary code execution, enabled at the same time as public chat integrations. Here is the key insight. The boundary of your system security is exactly the boundary of who can submit text to your models. If you cannot limit the audience, you must limit the tools. Thanks for spending a few minutes with me. Until next time, take it easy.
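The two checks the audit performs map naturally onto a small function. This is a hedged sketch of the logic, not the real `openclaw security audit` implementation, and the config keys (`bind_host`, `exec_enabled`, `public_channels`) are invented for illustration.

```python
def audit(config: dict) -> list[str]:
    """Flag the two risk classes from this episode: public network
    exposure, and high-risk tools combined with public chat surfaces."""
    warnings = []
    # Risk 1: listening beyond localhost
    if config.get("bind_host", "127.0.0.1") not in ("127.0.0.1", "localhost"):
        warnings.append("gateway listening on a public interface")
    # Risk 2: elevated tools reachable from public channels
    if config.get("exec_enabled") and config.get("public_channels"):
        warnings.append("arbitrary execution enabled alongside public chat integrations")
    return warnings
```

A clean result means both doors are closed: the process is bound to loopback, and no public audience can reach elevated tools.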
7

The Exec Tool and Runtime Approvals

3m 39s

Explore how the agent interacts with your filesystem and shell safely. We cover the exec tool, safe binaries, and explicit approval flows.

Hi, this is Alex from DEV STORIES DOT EU. OpenClaw Gateway, episode 7 of 11. Giving an AI model raw command-line access sounds like a fast track to a ruined system. But what if the agent could run harmless data-parsing commands automatically, while explicitly asking your permission before touching anything destructive? The Exec Tool and Runtime Approvals handle exactly this balance. The Exec tool allows an OpenClaw agent to run shell commands to interact with the system. When the agent decides it needs to execute a command, it targets either the host machine directly or a designated sandbox container. Using a sandbox container limits the blast radius for general execution. However, sometimes the agent genuinely needs to interact with your local file system, read your local logs, or start local processes. Running commands on the host is powerful, which is exactly why the authorization model exists. OpenClaw uses a two-tier system for authorization to keep you in control without slowing down the agent unnecessarily. The first tier relies on safe bins. These are specific binaries you explicitly whitelist as harmless, such as jq for parsing JSON or grep for searching text. If the agent calls a command that only uses these safe bins, the Gateway executes it immediately. There are no prompts, and the agent maintains its momentum. The second tier catches everything else. If the agent attempts a full shell execution or tries to use a binary not on the safe list, the Gateway intercepts the request. It halts the agent and triggers the runtime approvals system. A prompt appears in your Gateway UI or your companion app. You get to review the exact command string the agent wants to run. If you approve, the Gateway executes the command and returns the output to the agent. If you reject it, the command never runs. Instead, the agent receives an execution denied error, and it must figure out another way to proceed or ask you for clarification. 
Here is the key insight into how this plays out in practice. Say the agent needs to analyze a massive log file. It calls grep to extract the errors. That runs instantly. Next, it needs to compile the project, so it attempts to run npm run build as a background process. The Gateway stops the agent and pings your companion app. You read the command, realize it makes sense, and hit approve. The build starts in the background. Later, the agent decides to clean up by attempting to delete a source file. The Gateway pings you again. You deny the request. Your file remains untouched, and the agent learns it lacks permission for that action. There is a strict security constraint you need to know about when executing on the host. The Gateway explicitly rejects any attempt to override the environment path variable. This prevents hijacking. Without this block, a malicious prompt could trick the agent into redefining the path, causing a safe bin name like grep to execute a destructive script hidden in a different folder. Because the path is locked, the safe bins list remains absolute. The real power of the Exec tool is not just that the AI can run commands, but that the tiered security model forces a human into the loop only when the stakes are high, leaving the agent completely autonomous for the routine work. If you want to help keep the show going, you can search for DevStoriesEU on Patreon. That is all for this one. Thanks for listening, and keep building!
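The tiering and the PATH-override block can be condensed into one decision function. A Python sketch with an assumed safe-bins list; the real Gateway's whitelist and decision plumbing are more involved.

```python
import shlex

# Binaries the operator has explicitly whitelisted as harmless (illustrative set)
SAFE_BINS = {"jq", "grep", "cat", "wc"}

def exec_decision(command: str, overrides_path: bool = False) -> str:
    """Classify an exec request: reject PATH overrides unconditionally,
    run safe-bin commands immediately, escalate everything else to a
    runtime approval prompt."""
    if overrides_path:
        return "reject"  # blocks hijacking a safe-bin name with a lookalike script
    binary = shlex.split(command)[0]
    return "run" if binary in SAFE_BINS else "ask-approval"
```

The PATH check comes first deliberately: without it, `grep` could resolve to an attacker-controlled script, and the whitelist would mean nothing.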
8

Teaching Agents with Skills

3m 43s

Learn how to expand your agent's capabilities without writing code. We explore AgentSkills formatting, load-time gating, and ClawHub.

Hi, this is Alex from DEV STORIES DOT EU. OpenClaw Gateway, episode 8 of 11. You want your agent to manipulate images using a command-line tool. Usually, that means writing Python wrappers, defining schemas, and hoping the language model understands the parameters. But you do not actually need code to teach an AI a new capability. You just need a text file. Today, we are focusing on Teaching Agents with Skills in OpenClaw. A Skill in OpenClaw is essentially an instruction manual. It is a plain text file named SKILL dot md, formatted using the AgentSkills standard. It is not a compiled binary and it is not a Python script. It is a Markdown document that tells the agent exactly how to orchestrate existing tools. Inside this file, you write step-by-step instructions. You define the purpose of the skill, the tools it uses, and the sequence of actions the agent must take. If you are building an image processing skill called image-lab, your SKILL dot md file will explain how to format the command line arguments to resize or crop a picture. The agent reads this file and translates your English instructions into precise command-line executions. A skill is useless if the underlying tool is missing from the system. OpenClaw prevents failures here by using load-time gating. This allows you to define prerequisites so your agent never attempts to use software that is not installed. You handle this by declaring requirements in the skill configuration. For the image-lab skill, you might need a specific package manager to run the commands. You specify this using the requires dot bins property, listing the executable name, such as uv. You can also require specific environment variables using the requires dot env property, which ensures an API key or configuration path is present before the skill activates. When OpenClaw starts, it evaluates these gates. It checks the local environment for the uv binary and any requested variables. 
If they are missing, OpenClaw silently skips the skill. The system will not crash and the agent will not hallucinate unsupported commands. It simply operates without the image-lab capabilities. Here is the key insight. OpenClaw needs to deliver these valid skills to the language model efficiently. It takes all the skills that passed the load-time checks and compiles them into a compact XML list. This XML block is injected directly into the agent system prompt. Language models are highly optimized for parsing XML tags. By formatting the instruction manual this way, the agent cleanly separates its core persona from the strict, step-by-step logic defined in your skills. You do not have to write every skill yourself. OpenClaw integrates with ClawHub, the official registry for community-built skills. If you need your agent to operate a specific database or cloud utility, you can search ClawHub and install an existing skill. It downloads into your environment, passes through the same load-time checks, and automatically injects into the system prompt. The most valuable aspect of the Skills architecture is decoupling capability from code. You can completely rewire how your agent solves complex multi-step problems just by editing a Markdown file, without ever modifying your application logic or compiling a new build. Thanks for spending a few minutes with me. Until next time, take it easy.
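Load-time gating reduces to two existence checks, which can be sketched as follows. The `requires` dict mirrors the `requires.bins` / `requires.env` idea from the episode, but the exact schema here is an assumption.

```python
import os
import shutil

def skill_enabled(requires: dict) -> bool:
    """A skill activates only if every required binary is on PATH and
    every required environment variable is set; otherwise it is silently
    skipped so the agent never sees unsupported instructions."""
    bins_ok = all(shutil.which(b) for b in requires.get("bins", []))
    env_ok = all(os.environ.get(v) for v in requires.get("env", []))
    return bins_ok and env_ok
```

For the episode's image-lab example, the gate might be `{"bins": ["uv"]}`: if `uv` is not installed, the skill never enters the system prompt at all.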
9

The Managed Browser Tool

3m 13s

Discover how OpenClaw gives agents eyes on the web. Listeners will learn about isolated Chromium profiles and existing-session MCPs.

Hi, this is Alex from DEV STORIES DOT EU. OpenClaw Gateway, episode 9 of 11. Your agent is trying to extract data from a web page, but it is not just parsing raw HTML. It needs to render a dynamic React dashboard, click a button, and wait for a chart to load. You need to give the agent a fully functional web interface, but you absolutely do not want it messing up your own open tabs or hijacking your mouse. The Managed Browser Tool handles exactly this. This tool gives your agent the ability to click, type, navigate, and capture visually exactly what a human would see. It runs a real browser environment to interact with client-side rendered applications, bypassing the limitations of simple HTTP requests. To keep your workspace safe, the Managed Browser Tool uses different operational profiles. The default is the isolated openclaw profile. When the agent uses this profile, the tool spins up a completely separate, dedicated Chromium instance. It has its own cookies, its own local storage, and its own blank history. The agent navigates its own dedicated browser. It can fill out forms and click through menus without ever touching your personal Chrome session. However, there are times when an agent needs access to internal tools where you are already authenticated. For this, the tool provides the user profile. Instead of launching a blank slate, the user profile attaches to your existing, signed-in Chrome session. It connects through the DevTools Protocol via the Model Context Protocol. This allows the agent to leverage your active login tokens for that specific task without requiring you to pass credentials directly to the AI. Here is the key insight. Giving an AI agent an automated browser inside your environment introduces immediate security risks. To mitigate this, control of the Managed Browser Tool is strictly loopback-only. The agent communicates with the browser entirely over the local loopback interface. 
More importantly, every navigation request is guarded by the Server-Side Request Forgery policy. This policy ensures the agent cannot use its browser instance as a proxy to silently scan your local network ports or hit unauthorized internal services.

Think about the React dashboard scenario. First, the agent issues a command to launch the browser using the default isolated profile. It navigates to the dashboard URL and actively waits for the JavaScript components to mount and the DOM to settle. Next, it locates the specific chart element using a CSS selector and triggers a click event to expand the view. Finally, it issues a screenshot command. The browser captures the rendered frame and returns the image buffer directly back to the gateway.

Giving an agent a browser should never mean handing over the keys to your internal network or your personal Chrome session. The Managed Browser Tool keeps the agent highly capable, but strictly contained. That is your lot for this one. Catch you next time!
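As an appendix to this episode: the SSRF policy described above boils down to "resolve the target and refuse anything non-public". Here is a hedged Python sketch of such a guard; `navigation_allowed` is a hypothetical name, and OpenClaw's real policy may check more than this.

```python
import ipaddress
import socket
from urllib.parse import urlparse

def navigation_allowed(url: str) -> bool:
    """Illustrative SSRF guard: refuse to navigate the agent's browser to
    loopback, private, link-local, or reserved addresses, so it cannot be
    used as a proxy to scan the local network. Not OpenClaw's exact code."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        # Resolve the hostname and vet every address it maps to.
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        if addr.is_loopback or addr.is_private or addr.is_link_local or addr.is_reserved:
            return False
    return True
```

A real policy would also have to pin the resolved address when the request is actually made, so a DNS rebind between check and navigation cannot slip through.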
10

Ephemeral Sub-Agents

3m 20s

Take orchestration to the next level by spawning background workers. We cover the sessions_spawn tool, nesting depth, and result announcements.

Hi, this is Alex from DEV STORIES DOT EU. OpenClaw Gateway, episode 10 of 11.

You ask your AI to scrape a complex website or crunch thousands of log lines, and then you just sit there. You stare at a spinning indicator for ten minutes, unable to ask another question until the task finishes. To fix this blocking problem, you use Ephemeral Sub-Agents.

An ephemeral sub-agent is a temporary, isolated background worker. Instead of doing the heavy computation itself, your main chat agent delegates the work. It does this using a specific system tool called sessions spawn. When the main agent encounters a massive task, it triggers this tool. It passes along a clear prompt, any required context files, and the specific instructions for the job. This action creates a completely new, invisible chat session running entirely in the background. Because this session is isolated, your main agent is immediately freed up. You can keep talking to your primary assistant, ask unrelated questions, or assign more tasks while the sub-agent grinds away at the heavy data out of sight.

Let us look at a concrete scenario. You drop a massive server error log into your main chat and ask for a security audit. Processing that log line by line with your primary, heavy-duty LLM takes a long time and burns a lot of expensive tokens. Instead, your main agent delegates the job.

Here is the key insight. When calling sessions spawn, the main agent can specify a completely different model for the background task. It can assign a cheaper, faster LLM to the sub-agent. This background worker uses the faster model to chew through the repetitive log analysis. The main agent stays responsive using the smart model, while the sub-agent does the grunt work using the fast model.

When the sub-agent finally finishes parsing those logs, it needs a way to hand the data back. It does this by announcing its final result up the chain.
The sub-agent takes its compiled summary and injects a message directly back into the original requester's chat. You just see a new message pop up from the sub-agent with your completed log analysis, ready for you and the main agent to discuss.

This architecture is known as the orchestrator pattern, and it relies on rules around nesting depth. The scenario we just covered is nesting depth one: the user talks to the main agent, and the main agent spawns a sub-agent. OpenClaw also supports nesting depth two. In a depth-two scenario, your log-parsing sub-agent might encounter a heavily encoded payload in the logs. It can then spawn its own sub-agent just to decode that specific payload. That second-level agent decodes the text and passes it back to the first sub-agent, which then completes the log analysis and reports back up to your main chat.

The system strictly caps this at depth two. This hard limit prevents runaway recursive loops where agents continuously spawn other agents forever, draining your compute resources.

Ephemeral sub-agents fundamentally change how you interact with a prompt interface. You stop treating your LLM like a single blocked thread and start treating it like an asynchronous task manager. That is your lot for this one. Catch you next time!
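As an appendix: the whole episode can be compressed into a few lines of Python. This is a toy sketch of the orchestrator pattern under stated assumptions; `spawn_session`, `run_model`, and the `announcements` queue are hypothetical stand-ins, not OpenClaw's real sessions_spawn signature.

```python
import queue
import threading

MAX_DEPTH = 2                  # hard cap from the episode: no deeper nesting
announcements = queue.Queue()  # stands in for "inject a message into the chat"

def run_model(model, prompt):
    # Stand-in for an actual LLM call.
    return f"[{model}] {prompt}"

def spawn_session(prompt, model="fast-cheap-model", depth=1):
    """Delegate a task to an isolated background worker, optionally on a
    cheaper model; the worker announces its result back when done."""
    if depth > MAX_DEPTH:
        raise RuntimeError(f"nesting depth {depth} exceeds cap of {MAX_DEPTH}")

    def worker():
        if "encoded payload" in prompt and depth < MAX_DEPTH:
            # Depth-two case: the log parser spawns its own decoder.
            spawn_session("decode payload", model=model, depth=depth + 1).join()
            decoded = announcements.get()
            announcements.put(run_model(model, f"{prompt} + {decoded}"))
        else:
            announcements.put(run_model(model, prompt))

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t

# The main agent stays free; here we join only to read the final answer.
spawn_session("audit logs with encoded payload").join()
final_report = announcements.get()
```

Note where the depth check sits: before any work starts, so an over-deep spawn request is rejected immediately instead of burning compute first.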
11

Proactive Automation Workflows

3m 58s

Turn your reactive bot into a proactive assistant. Listeners will learn how to combine Heartbeats, Cron jobs, and Hooks for powerful automation.

Hi, this is Alex from DEV STORIES DOT EU. OpenClaw Gateway, episode 11 of 11.

A true assistant does not just sit idle waiting for you to type a command. It proactively checks your systems and alerts you when something actually needs your attention. Moving from a reactive prompt-and-response loop to a self-directing agent requires specific scheduling and execution mechanisms. This brings us to Proactive Automation Workflows.

OpenClaw handles time-based automation using two distinct mechanisms. The first is the Heartbeat. The second is Cron. Engineers often confuse them because they both trigger actions automatically based on time, but they have completely different architectural roles regarding state and session isolation.

The Heartbeat is a periodic loop that runs continuously within your main, active session. It is designed for routine, ongoing checks where your current context matters. Here is the key insight. Because the Heartbeat operates inside your current session, it has direct access to your active interface. Think about a scenario where you want to monitor your inbox for urgent messages. You configure a Heartbeat to execute a check every thirty minutes. If it detects a critical email, the Heartbeat can immediately push a natural language alert straight into your active conversation stream. It acts as an ongoing background thread attached to your current user state, allowing for immediate, contextual interruptions.

Cron operates entirely differently. It is built for precise, scheduled jobs that require complete isolation. If you want the system to compile a comprehensive daily morning briefing from various data sources at exactly six in the morning, you use Cron. When a Cron schedule triggers, OpenClaw does not use your active chat. Instead, it spins up a completely isolated background session. It pulls the necessary data, handles the analytical heavy lifting quietly, and tracks the entire job internally as a Background Task.
This means the heavy processing does not pollute the context window of your active desktop chat. Once the job completes, the finished briefing is stored and ready for you to retrieve when you log in. The Heartbeat shares state with you, while Cron runs headless and isolated.

Time-based triggers are only part of the workflow. OpenClaw relies on Hooks and Standing Orders as complementary tools to complete the automation loop. While Heartbeat and Cron dictate when an action happens based on a clock, Hooks handle external, event-driven triggers. A Hook exposes a listening endpoint that outside systems can call. If a critical server log indicates a failure, an external system can hit an OpenClaw Hook, waking the assistant to analyze the error immediately without waiting for the next scheduled Heartbeat pulse.

Standing Orders provide the persistent operational rules for all these autonomous runs. When that isolated Cron job wakes up at six in the morning, there is no user present to guide its output. The Standing Orders define the exact format, analytical depth, and priority rules the assistant must adhere to while it works completely independently.

By combining periodic Heartbeats for active monitoring, isolated Cron jobs for heavy scheduled tasks, and persistent Standing Orders to govern unguided behavior, you fundamentally change the nature of the application. You stop building a simple chat interface and start deploying a true autonomous assistant.

Since this is the final episode of our OpenClaw series, I highly encourage you to explore the official documentation, try configuring these background tasks hands-on, or visit devstories dot eu to suggest topics for our next series. Thanks for spending a few minutes with me. Until next time, take it easy.
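One last appendix for the series: a toy Python model of the three trigger types and their state boundaries. All names here (`heartbeat_tick`, `cron_run`, `fire_hook`, the session dicts) are illustrative assumptions, not OpenClaw's real configuration; what matters is which code can see the live session and which cannot.

```python
# Standing Orders: the persistent rules that govern runs with no user present.
STANDING_ORDERS = "Daily briefing: bullet points, severity-ranked, no fluff."

live_session = {"channel": "desktop-chat", "history": []}

def heartbeat_tick(session):
    # Heartbeat: shares state with you, so it can push an alert
    # straight into the open conversation stream.
    session["history"].append("ALERT: urgent email found")
    return session["history"][-1]

def cron_run(job_name):
    # Cron: a brand-new isolated session every time it fires; nothing
    # from live_session leaks in, and the Standing Orders stand in
    # for the absent user.
    session = {"channel": f"background:{job_name}", "history": [STANDING_ORDERS]}
    session["history"].append("morning briefing compiled")
    return session

hooks = {}

def fire_hook(name, payload):
    # Hook: event-driven, not clock-driven; an outside system hits the
    # listening endpoint and wakes the assistant immediately.
    return hooks[name](payload)

hooks["server-failure"] = lambda p: f"analyzing failure: {p['log']}"

alert = heartbeat_tick(live_session)      # contextual interruption in your chat
briefing = cron_run("daily-briefing")     # tracked as a Background Task
report = fire_hook("server-failure", {"log": "disk full on /var"})
```

The Heartbeat mutates the session you are looking at; the Cron job builds and returns its own; the Hook runs whenever the outside world says so. That is the whole architectural distinction in miniature.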