Text-to-Speech Audio Generation

Generate audio versions of articles using AI text-to-speech services.

Overview

This example demonstrates how to:

Extract text content while excluding specific elements (excludeTags)
Create dynamic filenames based on article metadata
Generate audio files using AI speech synthesis (HUB_SPEECH)
Upload binary files to Méthode

Requirements

ProActions Hub configured for content extraction
OpenAI TTS API access (or compatible service)
Write permissions to Méthode folders

Use Case

Modern digital publications often need audio versions of articles for:

Accessibility (screen reader alternatives)
Multi-platform distribution (podcasts, smart speakers)
Mobile consumption (listen while driving)

This workflow automates the creation of high-quality audio narrations.

Configuration

# /SysConfig/ProActions/sections/tts.ai-kit.section.yaml
section: Speech / TTS
keywords: 'Voice'

actions:
  - title: 'Text to Speech (OpenAI)'
    flow:
      # Extract text content, excluding image captions and credits
      - step: SET
        text: '{{ client.getTextContent({ excludeTags: ["web-image-caption", "caption", "credit"] }) | safe }}'
        targetName: "{{ client.getMetadata().metadata.general.title | replace(r/[\\/,\\?]/g, '') | truncate(50) }}.mp3"

      # Generate speech audio
      - step: HUB_SPEECH

      # Upload to Méthode
      - step: UPLOAD
        options:
          name: "{{ flowContext.targetName | safe }}"
          workFolder: "{{ client.getDocumentWorkfolder() }}"
          folderPath: "/Production/Product/Audio"
          type: "Audio"
          attributes: ""
          systemAttributes: "<props></props>"
          application: 'Swing ProActions'
          options:
            showPath: true
            showSystemAttributes: true
            showAttributes: true
            createMode: 'AUTO_RENAME'

      # Notify user of success
      - step: SHOW_NOTIFICATION
        notificationType: success
        message: "Successfully created audio version of your object."
        title: "Text to Speech"

Main Configuration

# pro-actions.ai-kit.yaml
AI_KIT:
  SERVICES: !include /SysConfig/ProActions/services.ai-kit.services.yaml
  SECTIONS:
    - !include /SysConfig/ProActions/sections/tts.ai-kit.section.yaml

How It Works

Step 1: Extract and Prepare Content

The SET step prepares both the text content and the target filename:

- step: SET
  text: '{{ client.getTextContent({ excludeTags: ["web-image-caption", "caption", "credit"] }) | safe }}'
  targetName: "{{ client.getMetadata().metadata.general.title | replace(r/[\\/,\\?]/g, '') | truncate(50) }}.mp3"

Key features:

Excluding tags:

client.getTextContent({
  excludeTags: ["web-image-caption", "caption", "credit"]
})

This retrieves text content but excludes specified XML elements. Useful for removing:

Image captions
Photo credits
Pull quotes
Sidebars
Any non-narrative content

Dynamic filename:

targetName: "{{ client.getMetadata().metadata.general.title | replace(r/[\\/,\\?]/g, '') | truncate(50) }}.mp3"

Creates a safe filename:

Gets article title from metadata
Removes invalid filename characters (/, \, ?)
Truncates to 50 characters
Adds .mp3 extension

Step 2: Generate Audio

The HUB_SPEECH step sends text to the speech synthesis service:

- step: HUB_SPEECH

This step:

Uses the text variable from the previous SET step
Calls the configured TTS service (e.g., OpenAI TTS)
Returns binary audio data
Stores the result in flowContext.data by default

Step 3: Upload to Méthode

The UPLOAD step saves the audio file:

- step: UPLOAD
  options:
    name: "{{ flowContext.targetName | safe }}"
    workFolder: "{{ client.getDocumentWorkfolder() }}"
    folderPath: "/Production/Product/Audio"
    type: "Audio"
    options:
      createMode: 'AUTO_RENAME'

Upload options:

name - Filename for the uploaded file
workFolder - Base Méthode work folder
folderPath - Relative path within work folder
type - Méthode object type
createMode: 'AUTO_RENAME' - Automatically rename if file exists

Customization

Configure Voice and Speed

Customize the speech synthesis parameters:

- step: HUB_SPEECH
  voice: "alloy"  # OpenAI TTS voices: alloy, echo, fable, onyx, nova, shimmer
  speed: 1.0      # Speed from 0.25 to 4.0

Exclude Different Elements

Customize which elements to exclude:

# Exclude more elements
text: '{{ client.getTextContent({
  excludeTags: ["web-image-caption", "caption", "credit", "sidebar", "pullquote", "infobox"]
}) | safe }}'

# Include everything
text: '{{ client.getTextContent() | safe }}'

# Only specific sections
text: '{{ client.getTextContent({
  includeTags: ["text", "headline", "summary"]
}) | safe }}'

Best Practices

Exclude Non-Narrative Content: Always exclude captions, credits, and other elements that don't belong in narration
Safe Filenames: Use replace() to remove invalid characters and truncate() to limit length
User Feedback: Show notifications on success and provide clear error messages
Auto-Rename: Use createMode: 'AUTO_RENAME' to avoid overwrite conflicts

Troubleshooting

Audio quality is poor:

Check the TTS service configuration
Try different voices
Adjust speed parameter
Ensure text is clean (no XML tags, special characters)

Upload fails:

Verify folder path exists in Méthode
Check write permissions
Ensure workFolder is correct
Check filename for invalid characters

Wrong content in audio:

Verify excludeTags list is correct
Check document structure with client.getTextContent() in console
Test with smaller text selections first

File already exists error:

Use createMode: 'AUTO_RENAME' to automatically rename
Or increment filename manually
Or use timestamp in filename

Content Import and Rewrite - Import and transform content
Progress Bar - Show progress for long operations

Overview​

Requirements​

Use Case​

Configuration​

Main Configuration​

How It Works​

Step 1: Extract and Prepare Content​

Step 2: Generate Audio​

Step 3: Upload to Méthode​

Customization​

Configure Voice and Speed​

Exclude Different Elements​

Best Practices​

Troubleshooting​

Related Examples​