
AI Integration Guide

This guide covers the comprehensive AI integration capabilities provided by the OPENAI_COMPLETION, HUB_COMPLETION, and AZURE_OPENAI_COMPLETION steps. These steps provide a powerful interface to Large Language Models (LLMs) with support for text, images, audio, tool calling, and structured outputs.

Overview

ProActions provides three completion step types that share the same underlying implementation:

  • HUB_COMPLETION - Routes through ProActions Hub service
  • OPENAI_COMPLETION - Direct OpenAI API integration
  • AZURE_OPENAI_COMPLETION - Azure OpenAI service integration

All three steps support the same features and configuration options, differing only in their default service names.

Basic Usage

Simple Text Completion

The most basic usage involves sending a prompt and receiving a text response:

- step: HUB_COMPLETION
  behavior: 'You are a helpful assistant.'
  instruction: 'Write a short summary of the benefits of unit testing. Answer in JSON format.'
  options:
    temperature: 0.2
    response_format: 'json_object'

Using Stored Prompts

You can reference pre-configured prompts from the Prompt Management UI:

- step: OPENAI_COMPLETION
  promptId: d112a306-8f7a-43bc-8f38-94161b6a91cd

Configuration Options

Core Prompts

Option        Type     Description
behavior      string   System prompt that defines the AI's behavior and role
instruction   string   User prompt containing the actual task or question
promptId      string   ID of a stored prompt configuration (overrides behavior/instruction)

Both behavior and instruction support template variables:

- step: HUB_COMPLETION
  behavior: 'You are an expert newsroom assistant.'
  instruction: 'Summarize this article: {textContent}'

Model Selection

Option   Type     Description
model    string   Model ID to use (e.g., gpt-4o, gpt-4-turbo)

For Azure deployments, the deploymentId from the service configuration is used instead.

API Options

Configure model behavior through the options object:

- step: OPENAI_COMPLETION
  instruction: 'Generate creative headlines'
  options:
    temperature: 0.9
    top_p: 0.95
    max_tokens: 150
    n: 3 # Generate 3 completions

Common options include:

  • temperature - Controls randomness (0.0 = deterministic, 2.0 = very creative)
  • top_p - Nucleus sampling parameter
  • max_tokens - Maximum tokens in the response
  • frequency_penalty - Reduces repetition of tokens
  • presence_penalty - Encourages topic diversity

Multimodal Inputs

Image Inputs

Send images along with text prompts for vision tasks:

Single Image

- step: HUB_COMPLETION
  instruction: "Describe what's in this image"
  image: '{{ flowContext.blob }}' # Blob or data URL
  imageDetail: 'high' # low, high, or auto

Multiple Images

- step: HUB_COMPLETION
  instruction: 'Compare these images'
  images:
    - '{{ flowContext.image1 }}'
    - '{{ flowContext.image2 }}'
  imageDetail: 'auto'

Images can be provided as:

  • Blob objects
  • Data URLs (base64-encoded)
  • Objects with url and detail properties
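For example, the object form lets each image carry its own detail setting. A minimal sketch, assuming per-image detail takes effect alongside the step-wide imageDetail:

```yaml
- step: HUB_COMPLETION
  instruction: 'Compare these screenshots'
  images:
    - url: 'data:image/png;base64,...'
      detail: 'low'
    - url: '{{ flowContext.screenshot }}'
      detail: 'high'
```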

Audio Inputs

Send audio files for transcription or analysis:

Single Audio File

- step: HUB_COMPLETION
  instruction: 'Analyze the tone and content'
  audio: '{{ flowContext.audioBlob }}'
  audioFormat: 'wav' # or mp3, etc.

Multiple Audio Files

- step: HUB_COMPLETION
  instruction: 'Compare these recordings'
  audios:
    - '{{ flowContext.audio1 }}'
    - '{{ flowContext.audio2 }}'

Audio can be provided as:

  • Blob objects
  • Data URLs
  • Objects with data and format properties
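Similarly, the object form pairs raw audio data with its format. A minimal sketch (flowContext.audioBase64 is a placeholder for your own data):

```yaml
- step: HUB_COMPLETION
  instruction: 'Transcribe this recording'
  audio:
    data: '{{ flowContext.audioBase64 }}'
    format: 'mp3'
```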

Generic Attachments

For more flexible multimodal inputs:

- step: HUB_COMPLETION
  instruction: 'Process these attachments'
  attachments:
    - type: text
      text: 'Additional context'
    - type: image_url
      image_url:
        url: 'data:image/png;base64,...'
    - '{{ flowContext.someBlob }}' # Auto-detected type

Conversation History

Prior Messages

Include conversation context:

- step: HUB_COMPLETION
  instruction: 'Continue our discussion'
  messages:
    - role: system
      content: 'You are a helpful assistant'
    - role: user
      content: 'What is AI?'
    - role: assistant
      content: 'AI stands for Artificial Intelligence...'
    - role: user
      content: 'Tell me more about neural networks'

Structured Outputs

JSON Object Mode

Request responses in JSON format:

- step: HUB_COMPLETION
  instruction: 'List the top 3 benefits of automation. Answer in JSON format.'
  response_format: 'json_object'

The LLM will return valid JSON. Access the parsed object via flowContext.object.

JSON Schema Mode

Define an exact schema for the response:

- step: HUB_COMPLETION
  instruction: 'Extract person information from the text'
  response_format:
    type: json_schema
    json_schema:
      name: person
      schema:
        type: object
        properties:
          name:
            type: string
          age:
            type: number
          occupation:
            type: string
        required: ['name']
        additionalProperties: false
      strict: true

The parsed response is available in flowContext.object.
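A follow-up step can then read the extracted fields directly from flowContext.object. A sketch using the SCRIPTING step that appears elsewhere in this guide:

```yaml
- step: SCRIPTING
  script: |
    const person = flowContext.object;
    flowContext.summaryLine = person.name +
      (person.occupation ? ' (' + person.occupation + ')' : '');
```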

List Mode

A convenient shorthand for generating lists:

- step: HUB_COMPLETION
  instruction: 'Summarize in 3-5 key points: {textContent}'
  response_format: 'list'

- step: INSERT_LIST
  at: CURSOR

The list is available in flowContext.list as an array of strings.

Tool Calling (Function Calling)

Enable the AI to call functions and interact with your system:

Basic Tool Definition

- step: HUB_COMPLETION
  behavior: 'You are a helpful assistant'
  instruction: 'Ask the user for their favorite color using askUser'
  tools:
    - name: askUser
      description: 'Ask the user a question'
      required_params: ['question']
      steps:
        - step: USER_PROMPT
          promptText: '{{ flowContext.question }}'

Full Tool Schema

For more complex parameter validation:

- step: HUB_COMPLETION
  instruction: 'Update the document metadata'
  tools:
    - name: updateMetadata
      description: 'Updates document metadata fields'
      parameters:
        type: object
        properties:
          title:
            type: string
            description: 'Document title'
          category:
            type: string
            enum: ['news', 'opinion', 'feature']
          tags:
            type: array
            items:
              type: string
        required: ['title']
        additionalProperties: false
      steps:
        - step: SET
          metadata:
            title: '{{ flowContext.title }}'
            category: '{{ flowContext.category }}'

Tool Implementation Options

Tools can be implemented in three ways:

1. Steps (Workflow)

tools:
  - name: getArticleText
    description: 'Retrieves the full article text'
    steps:
      - step: SET
        articleContent: '{{ client.getTextContent() }}'

2. Template (Simple Expression)

tools:
  - name: getHeadline
    description: 'Gets the current headline'
    template: "{{ client.getTextAtXpath('/doc/story/headline') }}"

3. Script (JavaScript)

tools:
  - name: setHeadline
    description: 'Updates the document headline'
    required_params: ['newHeadline']
    script: |
      console.log("Updating headline to:", params.newHeadline);
      if (params?.newHeadline) {
        client.replaceTextAtXpath(
          params.newHeadline,
          '/doc/story/headline'
        );
      }
      return { success: true };

Model Context Protocol (MCP) Tools

You can enable built-in tools from Model Context Protocol (MCP) servers connected via the Hub.

- step: HUB_COMPLETION
  builtinTools:
    # Enable all tools from 'filesystem' MCP server
    - mcp: 'filesystem'

    # Enable specific tools only
    - mcp: 'github'
      only: ['create_issue', 'list_repos']

    # Create a semantic alias with preset arguments
    - alias: 'saveReport'
      target: 'mcp__filesystem__write_file'
      description: 'Save the generated report'
      args:
        path: '/documents/report.md'

The MCP tools are automatically fetched from the configured Hub server and exposed to the LLM. The mcp property specifies the server name as registered in the Hub.

Shorthand names for MCP tools

The only list and alias targets accept several shorthand formats:

  • <tool> (e.g. read_file)
  • mcp__<server>__<tool>
  • mcp.<server>.<tool>
  • <server>__<tool>
  • <server>.<tool>
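Each shorthand resolves to the same underlying tool. A sketch restricting a hypothetical filesystem server to its read_file tool:

```yaml
builtinTools:
  - mcp: 'filesystem'
    only: ['read_file'] # equivalent to 'mcp__filesystem__read_file' or 'filesystem.read_file'
```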

Calling MCP Tools Directly

You can also call MCP tools directly as a step in your workflow, without going through an LLM. This is useful for deterministic operations or when you know exactly which tool to call.

- step: HUB_MCP_INVOKE
  server: filesystem
  tool: read_file
  arguments:
    path: '/documents/report.md'

The result of the tool call is stored in flowContext.object (or flowContext.text depending on the result type).

Listing Available MCP Tools

To list all available tools for a specific MCP server:

- step: HUB_MCP_TOOLS
  server: filesystem

The result is stored in flowContext.object as a list of tool definitions.

Tool Aliases

You can define custom "virtual tools" that wrap built-in tools with preset arguments. This simplifies the tool definition for the AI and ensures consistent parameter usage.

For a complete list of available tools, see the Built-in AI Tools reference.

builtinTools:
  - alias: getHeadline
    target: getTextAtXpath
    description: 'Returns the main headline'
    args:
      xpath: '/doc/story/headline'

Alias argument behavior:

  • Arguments defined in args are fixed defaults.
  • Fixed top-level arguments are removed from the tool schema sent to the LLM.
  • Remaining parameters are still exposed and must be supplied by the agent.
  • With toolStrict: true, attempts to override fixed top-level arguments are rejected.

Tool Alias Templates

If you need multiple aliases with shared defaults, define a template and extend it:

builtinTools:
  - aliasTemplate: markSegments
    target: highlightTextSegments
    description: 'Mark segments with consistent matching behavior'
    args:
      tag: 'mark'
      virtual: true
      match:
        normalizeCharacters: true
        normalizeWhitespace: true
        caseSensitive: false
        mode: 'first'

  - alias: markClaims
    extend: markSegments
    description: 'Mark claims for fact checking'
    args:
      attributes:
        class: 'fact-check'

Dynamic Built-in Tool Discovery

Use autoDiscoverTools: true when you want the model to discover and activate built-in tool providers on demand.

When enabled, two meta-tools are exposed:

  • listAvailableTools - Returns discoverable providers and their tool schemas.
  • activateTools - Activates selected providers for the current completion loop.

- step: HUB_COMPLETION
  autoDiscoverTools: true
  instruction: |
    First call listAvailableTools to inspect what is available.
    Then call activateTools for only the providers you need.
    After activation, call the provider tools directly.

You can combine auto-discovery with explicit builtinTools for aliases, MCP tools, or always-on tools:

- step: HUB_COMPLETION
  autoDiscoverTools: true
  builtinTools:
    - alias: getHeadline
      target: getTextAtXpath
      args:
        xpath: '/doc/story/headline'

Auto-discovery respects provider-specific conditions. For example, FeedbackTools and UserInteractionTools are discoverable only when the Flow Execution Monitor is enabled.

User Interaction Tools

ProActions provides specialized built-in tools for prompting users during AI workflows. These tools display interactive UI in the Flow Execution Monitor and pause workflow execution until the user responds.

When autoDiscoverTools is enabled, these tools are listed and activatable only if the monitor is active.

Available Tools

Tool                  Purpose                                      Returns
askSingleChoice       User selects ONE option from buttons         { answerId, answerLabel, customInput? }
askMultipleChoice     User checks MULTIPLE options (checkboxes)    { selectedIds[], selectedLabels[] }
askFreeformQuestion   User types free text                         { answer }
askConfirmation       Simple Yes/No question                       { confirmed }

Basic Usage Examples

- step: HUB_COMPLETION
  behavior: 'You are a newsroom assistant'
  instruction: 'Help classify and improve this article'
  builtinTools:
    - askSingleChoice
    - askConfirmation
  # The AI can now call these tools directly

AI Agent Example Calls:

// Single choice
askSingleChoice("Select article category", [
  { id: "news", label: "📰 News" },
  { id: "sports", label: "⚽ Sports" },
  { id: "opinion", label: "💭 Opinion" }
])

// Multiple choice with checkboxes
askMultipleChoice("Select relevant tags", [
  { id: "breaking", label: "Breaking News" },
  { id: "politics", label: "Politics" },
  { id: "local", label: "Local" }
], 1, 3) // Min 1, Max 3 selections

// Simple confirmation
askConfirmation("Publish this article to production?")

Advanced Features: askSingleChoice

The askSingleChoice tool supports advanced options for custom input and skip functionality:

Parameter          Type      Description
allowCustomInput   boolean   Shows text field below buttons for custom user input
allowSkip          boolean   Automatically adds "Skip" button
skipLabel          string    Custom label for skip button (default: "Skip")

Examples:

// With skip option
askSingleChoice("Choose template", templates, false, true, "No template")
// Returns: { answerId: "__skip__", answerLabel: "No template" }

// With custom input
askSingleChoice("Select author", suggestedAuthors, true)
// User can click button AND type custom text
// Returns: { answerId: "john", answerLabel: "John", customInput: "Associate Editor" }

Alias Override Patterns

Use aliases to create reusable, preconfigured tools with controlled parameter flexibility:

Pattern 1: Full Override (All Fixed)

AI cannot change any parameters:

builtinTools:
  - alias: confirm_delete
    target: askConfirmation
    args:
      question: "Are you sure you want to permanently delete this?"

Usage: confirm_delete() - No arguments allowed

Pattern 2: Partial Override (Mixed Control)

Fix some parameters, allow AI to customize others:

builtinTools:
  - alias: select_priority
    target: askSingleChoice
    args:
      # question NOT specified - AI can customize
      answers: # Fixed choices
        - id: "urgent"
          label: "🔴 Urgent"
        - id: "high"
          label: "🟠 High Priority"
        - id: "normal"
          label: "🟢 Normal"
      allowSkip: true # Fixed
      skipLabel: "Keep current priority"

Usage: select_priority("What priority?") - AI provides question, rest is fixed

Pattern 3: Smart Defaults with Flexibility

Provide sensible defaults but allow complete customization:

builtinTools:
  - alias: choose_author_with_suggestions
    target: askSingleChoice
    args:
      question: "Select or enter author name"
      answers: # Fixed suggestions
        - id: "john_smith"
          label: "John Smith"
        - id: "jane_doe"
          label: "Jane Doe"
      allowCustomInput: true # Always allow custom
      # AI can override question if needed

Pattern 4: Contextual Question Templates

Let AI inject context into questions:

builtinTools:
  - alias: confirm_publish_action
    target: askSingleChoice
    args:
      # AI provides full dynamic question
      answers:
        - id: "proceed"
          label: "✓ Proceed"
          description: "Continue with this action"
        - id: "cancel"
          label: "✗ Cancel"
      allowSkip: false # Force decision

Usage: confirm_publish_action("Publish to Production site?")

Pattern 5: Category/Tag Selection with Custom Input

builtinTools:
  - alias: tag_article
    target: askMultipleChoice
    args:
      question: "Select article tags (choose multiple)"
      answers:
        - id: "breaking"
          label: "Breaking News"
          description: "Time-sensitive content"
        - id: "analysis"
          label: "Analysis"
        - id: "opinion"
          label: "Opinion Piece"
        - id: "local"
          label: "Local News"
      minSelect: 1
      maxSelect: 3
      # AI cannot override these constraints

Alias Override Rules

What you specify in args   Effect
Parameter present          Fixed value - AI cannot override
Parameter absent           Dynamic - AI can provide value
With toolStrict: true      Attempts to override fixed params are rejected

Best Practice: Only fix parameters you absolutely need to control. Leave others unspecified for maximum AI flexibility.

Return Value Handling

All user interaction tools return JSON strings. Parse them in workflows:

- step: HUB_COMPLETION
  instruction: 'Ask user to select priority and update metadata'
  builtinTools:
    - alias: choose_priority
      target: askSingleChoice
      args:
        answers:
          - id: "high"
            label: "High"
          - id: "low"
            label: "Low"

# AI calls: choose_priority("Select priority")
# Returns: '{"answerId":"high","answerLabel":"High"}'

- step: SCRIPTING
  script: |
    const result = JSON.parse(flowContext.text);
    flowContext.priority = result.answerId;

Tool Calling Options

Option                      Type      Description
autoDiscoverTools           boolean   Enable dynamic discovery via listAvailableTools and activateTools meta-tools
toolChoice                  string    Control tool selection policy: auto, required, none (mapped to tool_choice)
parallelToolCalls           boolean   Allow multiple tool calls in one assistant turn (mapped to parallel_tool_calls)
maxToolIterations           number    Maximum rounds of tool calls (default: 5)
afterToolResultToolChoice   string    Behavior after tool execution: auto or none
toolsReuseContext           boolean   Whether to pass full flowContext to tool steps/templates (preferred)
functionsReuseContext       boolean   Deprecated alias of toolsReuseContext
toolErrorBehavior           string    How to handle tool/function errors: retry (default), fail, or once
functionErrorBehavior       string    Deprecated alias of toolErrorBehavior
toolGuidance                object    Optional tool schema guidance (e.g., table columns or metric keys)
toolStrict                  boolean   Enforce strict tool argument validation (missing/extra args fail)

toolChoice and parallelToolCalls are camelCase in YAML and are translated to API fields.

Prefer toolErrorBehavior and toolsReuseContext in new flows. The function* variants remain for backward compatibility.

For advanced forcing of a specific tool object, pass tool_choice via options.

Tool Guidance (Tables & Metrics)

Use toolGuidance to tighten tool schemas so the LLM sends the right structure:

- step: HUB_COMPLETION
  toolGuidance:
    addTableFeedback:
      columns: ['Claim', 'Verdict', 'Source']
      description: 'Fact-check results'
    addMetricsFeedback:
      metrics: ['engagement', 'clarity', 'specificity', 'seo']
      min: 0
      max: 100

This updates the tool schema the model sees, ensuring it sends the exact column and metric keys.

Tool Guardrails (Strict Mode)

Enable toolStrict to enforce required fields and reject extra arguments:

- step: HUB_COMPLETION
  toolStrict: true

This prevents silent failures when the model sends incorrect parameters.

Managing Tool Call Errors

Control how the system handles tool execution errors to prevent infinite retry loops:

- step: HUB_COMPLETION
  behavior: 'You are a helpful assistant with access to various tools.'
  instruction: 'Help the user complete their task.'
  toolErrorBehavior: 'once' # Prevent retrying failed tools
  maxToolIterations: 3
  tools:
    - name: processData
      description: 'Process user data'
      required_params: ['data']
      steps:
        - step: SCRIPTING
          script: |
            if (!flowContext.data) {
              throw new Error("Data parameter is required");
            }
            flowContext.processed = flowContext.data.toUpperCase();
            return flowContext;

Error Behavior Modes:

  • retry (default) - Allows the LLM to retry failed tools. Use for recoverable errors where different parameters might succeed.
  • fail - Stops the entire flow immediately when any tool fails. Use for critical operations where errors should halt execution.
  • once - Prevents retrying the same tool after it fails once. Recommended to prevent infinite loops when a tool has persistent errors.

When a tool fails:

  • With retry: The LLM receives the error and may retry with different parameters
  • With fail: The flow stops with a clear error message
  • With once: The LLM is informed the tool cannot be retried and should try a different approach

Best Practice: Use toolErrorBehavior: "once" for production workflows to prevent wasted API calls from repeatedly failing tools.

Example: Interactive Workflow

- step: HUB_COMPLETION
  behavior: |
    You are an expert newsroom assistant.
    You help editors refine articles by analyzing content and improving headlines.
  instruction: |
    1. Use getTextContent to retrieve the article.
    2. Ask the user for their preferred headline style using askUser.
    3. Create an improved headline based on the article and user feedback.
    4. Use setHeadline to update the document.
    5. Confirm completion.
  tools:
    - name: askUser
      description: 'Ask the user a question'
      required_params: ['question']
      steps:
        - step: USER_PROMPT
          promptText: '{{ flowContext.question }}'

    - name: getTextContent
      description: 'Returns the full document text'
      template: '{{ client.getTextContent() }}'

    - name: setHeadline
      description: 'Updates the document headline'
      required_params: ['newHeadline']
      script: |
        if (params?.newHeadline) {
          client.replaceTextAtXpath(
            params.newHeadline,
            '/doc/story/headline'
          );
        }

Audio Output

Request audio responses from the model:

- step: HUB_COMPLETION
  instruction: 'Explain quantum computing in simple terms'
  outputAudio:
    voice: 'alloy' # alloy, echo, fable, onyx, nova, shimmer
    format: 'mp3' # mp3, opus, aac, flac, wav, pcm
    model: 'gpt-4o-audio-preview'

Outputs:

  • flowContext.audio - Audio data URL
  • flowContext.audio_transcript - Text transcript (if provided)

Combined Text and Audio Output

- step: HUB_COMPLETION
  behavior: 'Create an audio description of the image'
  instruction: '{{ flowContext.text }}'
  image: '{{ flowContext.blob }}'
  outputAudio:
    voice: alloy
    format: mp3
    model: gpt-4o-mini-audio-preview-2024-12-17

- step: PLAY_AUDIO
  in: '{{ flowContext.audio }}'

Reasoning Models

For models that support reasoning (e.g., o1, o3):

- step: HUB_COMPLETION
  instruction: 'Solve this complex problem: ...'
  options:
    reasoning:
      effort: 'low' # Model-specific reasoning configuration
  model: 'o1'

The reasoning output (if provided) is available in flowContext.reasoning.

Streaming

Enable streaming mode to receive tokens incrementally as they are generated, rather than waiting for the complete response. Streamed tokens are displayed in real time in the Flow Execution Monitor.

Basic Streaming

- step: HUB_COMPLETION
  behavior: 'You are a creative writing assistant.'
  instruction: 'Write a short story about a robot discovering nature.'
  stream: true
  options:
    temperature: 0.8

Streaming Options

Option     Type      Default   Description
stream     boolean   false     Enable streaming mode for incremental token delivery
feedback   boolean   true      Show streaming tokens in the Flow Execution Monitor. Set to false for silent streaming

Streaming with Tool Calls

Streaming works seamlessly with tool calling. During tool-call-only iterations (where the model invokes tools without producing text), no feedback entry is created in the monitor — only text responses are streamed to the UI.

- step: HUB_COMPLETION
  behavior: 'You are a headline optimization assistant.'
  instruction: |
    1. Read the current headline using getHeadline.
    2. Propose an improved headline and confirm with the user.
    3. If confirmed, update it with setHeadline.
  stream: true
  toolChoice: required
  toolStrict: true
  maxToolIterations: 10
  builtinTools:
    - addFeedback
    - askConfirmation
    - alias: getHeadline
      target: getTextAtXpath
      description: 'Returns the article headline'
      args:
        xpath: '/doc/story/grouphead/headline/p'
    - alias: setHeadline
      extend: getHeadline
      target: replaceTextAtXpath
      description: 'Changes the headline'

Silent Streaming (Performance Only)

Use feedback: false to benefit from streaming's faster time-to-first-token without displaying tokens in the monitor:

- step: HUB_COMPLETION
  instruction: 'Summarize the article: {textContent}'
  stream: true
  feedback: false

Automatic Fallback

Streaming is automatically disabled for configurations that are incompatible with it. No changes are needed — the step silently falls back to non-streaming:

  • response_format: list or json_schema — requires the SDK's .parse() method
  • outputAudio — audio output is binary, not streamable as text
  • n > 1 — multiple choices interleave chunks from different completions

Proxy Configuration for Streaming

Streaming requires that all proxies between the browser and the API pass Server-Sent Events (SSE) without buffering. If tokens arrive all at once instead of incrementally, check your proxy configuration:

location /swing/proactions/ {
    proxy_buffering off;
    proxy_cache off;
    gzip off;
    brotli off; # or: proxy_set_header Accept-Encoding "";

    # existing proxy_pass configuration...
}

The critical settings are proxy_buffering off and disabling compression (gzip off, brotli off) on the SSE location. Compression algorithms buffer the response body before sending, which defeats streaming.

Agent Skills

Skills are reusable instruction sets stored on the server that can be dynamically loaded into AI completion steps at runtime. Instead of embedding lengthy guidelines directly in every behavior prompt, you write them once as a Skill and reference them by name. This keeps your YAML flows clean and makes it easy to update shared instructions in one place.

How Skills Work

When a completion step runs, ProActions:

  1. Loads each named skill's SKILL.md from the server
  2. Injects the skill content at the beginning of the system prompt (before behavior)
  3. Exposes a read_skill_reference tool so the AI can fetch reference documents on demand
  4. Falls back gracefully if a skill fails to load — the completion continues without it

Directory Structure

Skills are stored under a configurable base path (default: /SysConfig/ProActions/Skills/):

/SysConfig/ProActions/Skills/
├── editorial-style/
│   ├── SKILL.md                 # Required: main instructions
│   └── references/
│       ├── style-guide.md       # Optional reference documents
│       └── forbidden-words.txt
├── seo-guidelines/
│   └── SKILL.md
└── fact-checking/
    ├── SKILL.md
    └── references/
        └── source-list.md

SKILL.md Format

Each SKILL.md supports optional YAML front-matter to control skill behavior:

---
name: Editorial Style Guide
description: House style rules for all published content
loading_mode: immediate # 'immediate' (default) or 'on_demand'
---

Always use active voice. Avoid passive constructions.
Use Oxford commas in all lists.
Headlines must be in title case.
Quotes require attribution in the same paragraph.

Front-matter fields:

Field          Type     Description
name           string   Human-readable name (used in tool descriptions)
description    string   Brief purpose of this skill
loading_mode   string   immediate (inject on load) or on_demand (AI fetches via tool)

Loading modes:

  • immediate (default) — the skill body is injected directly into the system prompt when the step starts. Best for short, critical instructions the AI always needs.
  • on_demand — only the skill's name and description are listed in the system prompt. The AI can call the read_skill_instruction tool to fetch the full body when needed. Best for large reference documents that aren't always required.

Basic Usage

- step: HUB_COMPLETION
  behavior: 'You are a writing assistant.'
  instruction: 'Improve this article: {textContent}'
  skills:
    - editorial-style
    - seo-guidelines

Multiple skills are injected in the order listed and concatenated before the behavior system prompt.

Reference Documents

Skills can include reference documents (style sheets, glossaries, source lists, etc.) stored in a references/ subdirectory. When at least one skill is loaded, ProActions automatically exposes a read_skill_reference tool to the AI:

read_skill_reference(skillName, filename)

The AI can call this tool to fetch the content of any reference file it needs during the completion. You do not need to configure the tool manually — it is registered automatically.

Example SKILL.md referencing documents:

---
name: Fact Checking Workflow
description: Guidelines and sources for fact-checking articles
---

Before publishing, verify all factual claims against approved sources.
Use read_skill_reference("fact-checking", "source-list.md") to get the
approved source directory.

Configuring the Base Path

The default skills location /SysConfig/ProActions/Skills/ can be changed in your pro-actions-config.js:

window.SwingProActionsConfig = {
  skillsBasePath: '/SysConfig/MyOrg/AISkills/'
};

Trailing slashes are optional — ProActions normalizes the path automatically.

Skill Name Validation

Skill names must match the pattern [A-Za-z0-9_-]+ (letters, numbers, hyphens, underscores only). Invalid names are skipped with a warning logged to the console — they do not abort the flow.
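The rule above can be sketched as a regular-expression check (an assumption for illustration; the shipped implementation may differ in details):

```javascript
// Skill names: letters, digits, hyphens, and underscores only.
const SKILL_NAME = /^[A-Za-z0-9_-]+$/;

function isValidSkillName(name) {
  return SKILL_NAME.test(name);
}

console.log(isValidSkillName('editorial-style')); // true
console.log(isValidSkillName('bad/name'));        // false
```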

Combining Skills with Other Configuration

Skills compose naturally with behavior, builtinTools, tools, and all other completion options:

- step: HUB_COMPLETION
  behavior: 'You are a senior editor.'
  instruction: 'Review and improve the article: {textContent}'
  skills:
    - editorial-style # Loaded immediately into system prompt
    - source-guidelines # on_demand: AI fetches when needed
  builtinTools:
    - ContentTools
    - alias: getHeadline
      target: getTextAtXpath
      args:
        xpath: '/doc/story/grouphead/headline/p'
  stream: true

Error Handling

  • If a skill file is missing or unreadable, a warning is logged and the skill is silently skipped
  • The completion always proceeds — skill loading failures are non-fatal
  • Very large skills (> 4,000 characters) produce a warning recommending the use of on_demand mode or reference documents instead

Outputs

The completion steps produce various outputs in flowContext:

Output             Type     Description
text               string   Main text response (always available)
object             object   Parsed JSON object (when using response_format)
list               array    Array of strings (when using response_format: "list")
audio              string   Audio data URL (when outputAudio is configured)
audio_transcript   string   Transcript of audio response
response           object   Full API response object
reasoning          object   Reasoning data from models that support it
choices            array    All response choices (if n > 1)
usage              object   Token usage statistics
tool_results       array    Results from tool/function calls
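For example, a follow-up SCRIPTING step can inspect token usage after a completion (a sketch; the exact shape of usage follows the provider's API response):

```yaml
- step: SCRIPTING
  script: |
    const usage = flowContext.usage;
    if (usage) {
      console.log('Total tokens:', usage.total_tokens);
    }
```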

Service Configuration

Configure services in your services.yaml:

HUB:
  type: completion
  endpoint: '{BASE_URL}/swing/proactions'
  target: openai
  defaultBehavior: 'You are a helpful assistant for authors.'
  options:
    temperature: 0.7
    max_tokens: 800

OPENAI_COMPLETION:
  type: completion
  apiKey: sk-...
  endpoint: 'https://api.openai.com/v1'
  model: gpt-4o
  options:
    temperature: 0.7

AZURE_OPENAI_COMPLETION:
  type: completion
  endpoint: 'https://your-resource.openai.azure.com'
  apiKey: '...'
  deploymentId: 'gpt-4o'
  apiVersion: '2024-02-15-preview'

Practical Examples

Vision: Image Caption Generation

- step: SELECTED_OBJECT

- step: EDAPI_OBJECT_CONTENT
  format: lowres

- step: HUB_COMPLETION
  behavior: |
    You are a professional photo editor. Create short, precise,
    factual captions suitable for news articles.
  instruction: 'Create a caption for this image. Article: {textContent}'
  image: '{{ flowContext.blob }}'

- step: REPLACE_TEXT
  at: XPATH
  xpath: '/doc/story/text'

Generate Structured Data

- step: HUB_COMPLETION
  instruction: 'Extract key information from: {textContent}'
  response_format:
    type: json_schema
    json_schema:
      name: article_metadata
      schema:
        type: object
        properties:
          title:
            type: string
          keywords:
            type: array
            items:
              type: string
          summary:
            type: string
        required: ['title', 'keywords']
        additionalProperties: false
      strict: true

Interactive Table Generation

- step: HUB_COMPLETION
  behavior: |
    Transform plain text into structured HTML tables.
    Use only: table, tr, td, b, i, u tags.
    Output valid XML with no blank lines.
  instruction: 'Convert to table: {selectedText}'

- step: SANITIZE
  stripMarkdown: true
  validateAndRepairXml: true

- step: INSERT_XML
  at: CURSOR
  position: insertBefore

Multi-Modal Analysis

- step: SELECTED_OBJECT

- step: EDAPI_OBJECT_CONTENT
  format: lowres

- step: HUB_COMPLETION
  behavior: 'You are an accessibility expert.'
  instruction: |
    Create a detailed description for visually impaired users.
    Include all visible elements, text, context, and meaning.
  image: '{{ flowContext.blob }}'

- step: HUB_COMPLETION
  behavior: 'Create an audio version of this description.'
  instruction: '{{ flowContext.text }}'
  outputAudio:
    voice: alloy
    format: mp3
    model: gpt-4o-mini-audio-preview-2024-12-17

- step: PLAY_AUDIO
  in: '{{ flowContext.audio }}'

Best Practices

1. Use Appropriate Models

  • Use smaller/faster models (gpt-4o-mini) for simple tasks
  • Use larger models (gpt-4o, gpt-4-turbo) for complex reasoning
  • Use specialized models for audio (gpt-4o-audio-preview) or vision tasks

2. Optimize Prompts

  • Be specific and clear in instructions
  • Provide context in the behavior (system prompt)
  • Use examples when needed
  • Request specific output formats

3. Handle Errors

Always wrap completion calls in error handling:

- step: TRY
  steps:
    - step: HUB_COMPLETION
      instruction: '...'
  catch:
    - step: SHOW_NOTIFICATION
      notificationType: error
      message: 'AI processing failed. Please try again.'

4. Manage Costs

  • Set max_tokens to limit response length
  • Use temperature: 0 for deterministic outputs
  • Store frequently used prompts with promptId
  • Use tool calling to reduce back-and-forth iterations

5. Structured Outputs

  • Always use response_format when you need JSON
  • Use strict: true in schemas for guaranteed adherence
  • Validate outputs - especially XML - with the SANITIZE step

6. Image Processing

  • Use imageDetail: "low" for faster processing when detail isn't critical
  • Use imageDetail: "high" for detailed analysis
  • Compress images before sending to reduce costs

7. Conversation Context

  • Limit message history to recent exchanges
  • Clean up old messages to reduce token usage
  • Store important context in flowContext variables

Security Considerations

API Key Protection

Never hardcode API keys in YAML files:

# Bad
OPENAI_COMPLETION:
  apiKey: sk-actual-key-here

# Good - use environment variables or proxy
HUB:
  endpoint: '{BASE_URL}/swing/proactions'
  target: openai

Content Filtering

Consider implementing content filters for user-facing applications:

- step: HUB_COMPLETION
  safetyIdentifier: 'user-generated-content'
  instruction: '...'

Troubleshooting

Common Issues

Issue: "No completion message returned"

  • Check that the service configuration is correct
  • Verify API credentials
  • Check network connectivity

Issue: Tool calls exceed max iterations

  • Increase maxToolIterations if legitimate iterations are needed
  • Use toolErrorBehavior: "once" to prevent retrying failed tools
  • Use toolErrorBehavior: "fail" to stop immediately on errors
  • Simplify tool descriptions to help the LLM choose correctly
  • Use afterToolResultToolChoice: "none" to stop after first tool call

Issue: JSON parsing errors

  • Use response_format: "json_object" or structured schemas
  • Validate output with SANITIZE step
  • Check that instructions explicitly request JSON

Issue: Image not recognized

  • Verify image is in supported format (JPEG, PNG, GIF, WebP)
  • Check image size limits
  • Ensure data URL is properly formatted

Performance Tips

  1. Enable Streaming: Use stream: true for long responses — users see output immediately instead of waiting for the full generation
  2. Batch Operations: Process multiple items in one call when possible
  3. Parallel Processing: Use PARALLEL step for independent completions
  4. Model Selection: Choose the smallest model that meets your needs

Limits and Constraints

  • Maximum Tokens: Varies by model (typically 4K-128K input tokens)
  • Image Size: Maximum ~20MB per image
  • Audio Size: Maximum ~25MB per audio file
  • Tool Iterations: Default 5, configurable up to reasonable limits
  • Rate Limits: Depend on service tier and provider

Further Reading