Hume AI API

Analyze emotions in text, audio, and video, generate expressive text-to-speech, and build speech-to-speech conversational agents via Hume's REST API.

Official docs: https://dev.hume.ai/reference

When to Use

Use this skill when you need to:

Analyze emotions and expressions in text, audio, images, or video (batch processing)
Generate expressive text-to-speech audio with emotional control
Manage EVI (Empathic Voice Interface) configurations, prompts, and tools
Retrieve chat transcripts and events from EVI conversations

Prerequisites

Sign up at https://app.hume.ai
Navigate to the API Keys page
Copy your API key

Set environment variable:

bash

export HUME_TOKEN="your-api-key"

Placeholders: Values in {curly-braces} like {job-id} are placeholders. Replace them with actual values when executing.

Expression Measurement (Batch)

Start Inference Job from URLs

Write to /tmp/hume_request.json:

json

{
  "urls": ["https://example.com/media-file.mp4"],
  "models": {
    "face": {},
    "prosody": {},
    "language": {}
  },
  "notify": true
}

Then run:

bash

curl -s -X POST "https://api.hume.ai/v0/batch/jobs" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .

Start Inference Job with Text

Write to /tmp/hume_request.json:

json

{
  "text": ["I am so excited about this!", "This is really disappointing."],
  "models": {
    "language": {}
  }
}

Then run:

bash

curl -s -X POST "https://api.hume.ai/v0/batch/jobs" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .

List Jobs

bash

curl -s "https://api.hume.ai/v0/batch/jobs?limit=10" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .

List Jobs by Status

bash

curl -s "https://api.hume.ai/v0/batch/jobs?limit=10&status=COMPLETED" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .

Get Job Details

bash

curl -s "https://api.hume.ai/v0/batch/jobs/{job-id}" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .

Get Job Predictions

bash

curl -s "https://api.hume.ai/v0/batch/jobs/{job-id}/predictions" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .

Download Job Artifacts

bash

curl -s "https://api.hume.ai/v0/batch/jobs/{job-id}/artifacts" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" --output /tmp/hume_artifacts.zip

Text-to-Speech

Synthesize Speech (JSON Response)

Write to /tmp/hume_request.json:

json

{
  "utterances": [
    {
      "text": "Hello, how are you today?",
      "description": "A warm and friendly greeting"
    }
  ],
  "format": {
    "type": "mp3"
  }
}

Then run:

bash

curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .

The response contains base64-encoded audio in .generations[].audio.

Synthesize Speech with Specific Voice

Write to /tmp/hume_request.json:

json

{
  "utterances": [
    {
      "text": "Welcome to our platform!",
      "speed": 1.0
    }
  ],
  "voice": {
    "name": "{voice-name}"
  },
  "format": {
    "type": "mp3"
  }
}

Then run:

bash

curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .

Synthesize and Save Audio File

Write to /tmp/hume_request.json:

json

{
  "utterances": [
    {
      "text": "This is a test of text-to-speech synthesis."
    }
  ],
  "format": {
    "type": "mp3"
  }
}

Then extract and decode the audio:

bash

curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq -r '.generations[0].audio' | base64 -d > /tmp/hume_output.mp3

Synthesize Multiple Utterances

Write to /tmp/hume_request.json:

json

{
  "utterances": [
    {
      "text": "First sentence.",
      "description": "calm and measured"
    },
    {
      "text": "Second sentence!",
      "description": "excited and energetic",
      "trailing_silence": 0.5
    }
  ],
  "format": {
    "type": "mp3"
  }
}

Then run:

bash

curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .

EVI Configs

List Configs

bash

curl -s "https://api.hume.ai/v0/evi/configs?page_size=20" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .

Create Config

Write to /tmp/hume_request.json:

json

{
  "name": "My EVI Config"
}

Then run:

bash

curl -s -X POST "https://api.hume.ai/v0/evi/configs" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .

EVI Prompts

List Prompts

bash

curl -s "https://api.hume.ai/v0/evi/prompts?page_size=20" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .

Create Prompt

Write to /tmp/hume_request.json:

json

{
  "name": "Customer Support Agent",
  "text": "You are a friendly and empathetic customer support agent. Listen carefully and help resolve issues."
}

Then run:

bash

curl -s -X POST "https://api.hume.ai/v0/evi/prompts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .

EVI Chats

List Chat Events

bash

curl -s "https://api.hume.ai/v0/evi/chats/{chat-id}?page_size=50&ascending_order=true" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .

Available Models for Expression Measurement

Model	Description
`face`	Facial expression analysis from images/video
`prosody`	Vocal expression analysis from audio
`language`	Emotion analysis from text
`ner`	Named entity recognition from text
`burst`	Non-speech vocal sounds (laughter, sighs)
`facemesh`	Detailed facial landmark detection

TTS Utterance Parameters

Parameter	Type	Description
`text`	string	Content to convert to speech (required)
`description`	string	Acting directions or voice style prompt
`speed`	number	Speech rate multiplier (default: 1.0)
`trailing_silence`	number	Silence after utterance in seconds (default: 0)

Response Codes

Status	Description
`200`	Success
`201`	Resource created
`400`	Invalid request parameters
`401`	Missing or invalid API key
`404`	Resource not found
`429`	Rate limit exceeded
`5xx`	Server error

Guidelines

Authentication: Use X-Hume-Api-Key header (not Bearer token) for all requests
Batch Jobs: Jobs are asynchronous; poll the job details endpoint until status is COMPLETED
Models: Specify only the models you need in batch requests to reduce processing time
TTS Description: Use the description field to control emotional tone and style of generated speech
Pagination: EVI endpoints use zero-based page numbering with configurable page size (1-100)
API Key Types: Organization API Key for TTS and EVI; Personal API Key for Expression Measurement

API Reference

Documentation: https://dev.hume.ai/reference
Portal: https://app.hume.ai
API Keys: https://app.hume.ai/keys

Search AI Tools

hume

Install this agent skill to your Project

SKILL.md

Hume AI API

When to Use

Prerequisites

Expression Measurement (Batch)

Start Inference Job from URLs

Start Inference Job with Text

List Jobs

List Jobs by Status

Get Job Details

Get Job Predictions

Download Job Artifacts

Text-to-Speech

Synthesize Speech (JSON Response)

Synthesize Speech with Specific Voice

Synthesize and Save Audio File

Synthesize Multiple Utterances

EVI Configs

List Configs

Create Config

EVI Prompts

List Prompts

Create Prompt

EVI Chats

List Chat Events

Available Models for Expression Measurement

TTS Utterance Parameters

Response Codes

Guidelines

API Reference