CAPABILITY PLUGIN

AbstractMusic

Model-agnostic text-to-music and text-to-audio generation for the Abstract ecosystem. Remote-first with ACE Music and ElevenLabs, plus local generation via ACE-Step and Stable Audio 3 on Apple Silicon and GPU.

from abstractmusic import MusicManager

mm = MusicManager(backend="acemusic")

# Generate music from a text prompt
audio = mm.t2m("ambient lo-fi study music", duration=30)

# Save to file (the context manager closes the handle)
with open("out.wav", "wb") as f:
    f.write(audio)

Music Generation for the Abstract Ecosystem

AbstractMusic is a model-agnostic text-to-music and text-to-audio library. The base install is import-light and ships the remote backends (ACE Music, ElevenLabs). Local inference stacks for ACE-Step, Stable Audio 3, and MusicGen live behind optional extras.

Remote-First Default

The acemusic backend calls a hosted API out of the box. The elevenlabs backend targets ElevenLabs Music endpoints. Both are stdlib-only with no heavy dependencies in the base install.
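
To make "stdlib-only" concrete, here is a minimal sketch of building an authenticated JSON request with nothing beyond urllib and json. It only builds the request and never sends it. The endpoint URL, the payload field names, the Bearer scheme, and the build_request helper are all assumptions for illustration, not AbstractMusic's actual wire format. Only the ACEMUSIC_API_KEY variable name comes from this page.

```python
import json
import os
import urllib.request

# Hypothetical endpoint, for illustration only (not a real AbstractMusic URL).
EXAMPLE_URL = "https://example.invalid/v1/music"


def build_request(prompt, duration, api_key=None, url=EXAMPLE_URL):
    """Build a JSON POST request with the stdlib only; nothing is sent."""
    key = api_key or os.environ.get("ACEMUSIC_API_KEY")
    if not key:
        raise ValueError("no API key: pass api_key or set ACEMUSIC_API_KEY")
    # Payload field names are assumptions, not the provider's schema.
    body = json.dumps({"prompt": prompt, "duration": duration}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            # Bearer auth is an assumed scheme for this sketch.
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("ambient lo-fi study music", 30, api_key="test-key")
print(req.get_method(), req.full_url)
```

A client built this way needs no third-party HTTP library, which is what keeps the base install light.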

Local ACE-Step Generation

The acestep backend uses AbstractMusic-owned orchestration around Diffusers AceStepPipeline with HuggingFace model weights. Runs locally on Apple Silicon (MPS) and CUDA GPUs, with automatic CPU fallback.

Text Planning Layer

AbstractMusic separates text planning from audio synthesis. The built-in deterministic planner is dependency-free. Host applications can inject smarter planners through MusicManager or plugin configuration for prompt enhancement and lyrics generation.

Multiple Backends, One Interface

Generate music from text prompts using remote APIs or local models. The manager stays thin and model-agnostic while backends handle provider-specific behavior.

ACE Music (Remote)

Default remote backend. Calls a hosted ACE Music API with an API key. Lightweight stdlib-only client. Supports prompt, lyrics, duration, BPM, seed, and format parameters.

ElevenLabs Music (Remote)

Stdlib-only remote backend for ElevenLabs Music endpoints. Supports provider-neutral MusicCompositionPlan for structured music requests. Requires ELEVENLABS_API_KEY.

ACE-Step (Local)

Local generation via Diffusers AceStepPipeline. MIT-licensed XL Turbo checkpoint. Supports lyrics, vocal language, BPM, and seed. MPS bfloat16 preferred on Apple Silicon with automatic fallbacks.

Stable Audio 3 (Local)

Internal text-to-music path for stabilityai/stable-audio-3-small-music. AbstractMusic-owned runtime code with HuggingFace weights. Small Music validated up to 120 seconds.

Apple Silicon Optimized

ACE-Step prefers PyTorch MPS with bfloat16 when supported, then float32, with CPU float32 as the final fallback. Automatic MPS memory capping and watermark validation for 18 GB+ machines.
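
The preference order above can be sketched as an ordered candidate list walked with a probe. This is a pure-Python illustration, not the backend's actual selection code. The CANDIDATES list, pick_device_and_dtype, and the probe callback are hypothetical names, and the sketch does not cover memory capping.

```python
from typing import Callable, List, Tuple

# Preference order from the text: MPS bfloat16, then MPS float32, then CPU float32.
CANDIDATES: List[Tuple[str, str]] = [
    ("mps", "bfloat16"),
    ("mps", "float32"),
    ("cpu", "float32"),
]


def pick_device_and_dtype(probe: Callable[[str, str], bool]) -> Tuple[str, str]:
    """Return the first (device, dtype) pair that the probe accepts."""
    for device, dtype in CANDIDATES:
        if probe(device, dtype):
            return device, dtype
    # CPU float32 is treated as always available in this sketch.
    return CANDIDATES[-1]


def fake_probe(device: str, dtype: str) -> bool:
    """Stand-in for a machine that has MPS but no bfloat16 support."""
    return device == "cpu" or (device == "mps" and dtype == "float32")


print(pick_device_and_dtype(fake_probe))  # ('mps', 'float32')
```

In a real setup the probe might check torch.backends.mps.is_available() and attempt a small tensor allocation in the requested dtype. Taking the probe as a parameter lets the ordering logic be tested without PyTorch.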

Prompt Enhancement

Optional --enhance-prompt and --auto-lyrics flags. Structure prompting enabled by default for 45+ second generations. Planner results include a provider-neutral composition_plan.

Quality Screening

Artifact screening with WAV/music-likeness inspection and repetition/novelty quality gates. Validation state is explicit through smoke metrics, tests, and registry status.
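
As a rough illustration of the WAV inspection part only, the sketch below parses WAV bytes with the stdlib wave module and applies two simple gates: minimum duration and minimum loudness. The thresholds, the inspect_wav and make_sine_wav helpers, and the report fields are hypothetical. The library's own screening, including the repetition/novelty gates, is not shown here.

```python
import io
import math
import struct
import wave


def make_sine_wav(seconds=1.0, rate=8000, freq=440.0, amplitude=12000):
    """Create a mono 16-bit PCM WAV in memory, used here as test input."""
    n = int(seconds * rate)
    frames = b"".join(
        struct.pack("<h", int(amplitude * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(n)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(frames)
    return buf.getvalue()


def inspect_wav(data, min_seconds=0.5, min_rms=100.0):
    """Parse WAV bytes and apply two gates: duration and loudness (RMS)."""
    with wave.open(io.BytesIO(data), "rb") as w:
        rate = w.getframerate()
        channels = w.getnchannels()
        width = w.getsampwidth()
        n = w.getnframes()
        raw = w.readframes(n)
    if width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    rms = math.sqrt(sum(s * s for s in samples) / max(len(samples), 1))
    seconds = n / rate
    return {
        "seconds": seconds,
        "channels": channels,
        "rms": rms,
        "passed": seconds >= min_seconds and rms >= min_rms,
    }


report = inspect_wav(make_sine_wav())
print(report["passed"], round(report["seconds"], 2))
```

A silent or truncated file fails these gates, which is the cheapest class of artifact to catch before any music-likeness analysis runs.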

Model Registry

Packaged music_model_capabilities.json declares model metadata, license, and precision policy. Discovery via available_providers(), list_models(), and capability_catalog() without loading weights.
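
Discovery without loading weights comes down to reading declared metadata. The sketch below shows the idea against a small in-memory registry. The JSON schema, the field names, and the providers_in and models_for helpers are assumptions and may not match the packaged music_model_capabilities.json. The two model identifiers and the MIT license for the ACE-Step checkpoint are taken from this page.

```python
import json

# Hypothetical registry content; the real file's schema may differ.
EXAMPLE_REGISTRY = json.loads("""
{
  "models": [
    {"id": "ACE-Step/acestep-v15-xl-turbo-diffusers",
     "provider": "acestep",
     "license": "MIT",
     "precision": ["bfloat16", "float32"]},
    {"id": "stabilityai/stable-audio-3-small-music",
     "provider": "stable-audio-3"}
  ]
}
""")


def providers_in(registry):
    """List provider names declared in the registry, sorted for stable output."""
    return sorted({m["provider"] for m in registry["models"]})


def models_for(registry, provider=None):
    """List model ids, optionally filtered by provider; no weights are touched."""
    return [
        m["id"]
        for m in registry["models"]
        if provider is None or m["provider"] == provider
    ]


print(providers_in(EXAMPLE_REGISTRY))
print(models_for(EXAMPLE_REGISTRY, provider="acestep"))
```

Because everything is answered from a small JSON document, a host application can build a model picker without importing any inference stack.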

Install & First Track

Generate music in minutes. The base install is lightweight with remote backends; local inference is an explicit extra.

Installation

# Base install (remote ACE Music + ElevenLabs backends)
pip install abstractmusic

# Local ACE-Step generation
pip install "abstractmusic[acestep]"

# Local Stable Audio 3
pip install "abstractmusic[stable-audio-3]"

# Full local stack (Apple Silicon)
pip install "abstractmusic[all-apple]"

# Full local stack (GPU)
pip install "abstractmusic[all-gpu]"

CLI Quick Start

# Remote generation (default ACE Music backend)
abstractmusic t2m "ambient lo-fi study music" --out out.wav --duration 30

# ElevenLabs backend
abstractmusic --backend elevenlabs t2m "cinematic instrumental synth cue" \
  --format mp3 --out out.mp3 --duration 30

# Local ACE-Step generation
abstractmusic --backend acestep t2m "ambient lo-fi study music" \
  --out out.wav --duration 10

# Enhanced prompt with auto-lyrics
abstractmusic --backend acestep t2m "heroic fantasy epic music" \
  --enhance-prompt --auto-lyrics --print-plan --out out.wav --duration 30

# Local Stable Audio 3
abstractmusic --backend stable-audio-3 t2m "rhythmic space shooter game music" \
  --out out.wav --duration 30 --steps 16

Quick Start (Python)

from abstractmusic import MusicManager

# Remote generation (reads ACEMUSIC_API_KEY from env)
mm = MusicManager(backend="acemusic")
audio = mm.t2m("ambient lo-fi study music", duration=30)

# Local ACE-Step generation
mm_local = MusicManager(backend="acestep")
audio = mm_local.t2m(
    "epic orchestral film score",
    duration=30,
    seed=42,
)

# Artifact-based generation
asset = mm.generate_audio(
    prompt="jazz piano trio",
    duration=30,
    format="wav",
)

Interactive REPL

# Start the interactive REPL
abstractmusic cli

# Inside the REPL:
# /engines          — list available backends
# /models           — list known models by engine
# /engine acestep   — switch to local ACE-Step
# /model ACE-Step/acestep-v15-xl-turbo-diffusers
# /status           — show active engine, model, params
# /download on      — enable HuggingFace downloads

Key Classes & Methods

The public API is built around MusicManager for direct usage and the AbstractCore capability plugin for ecosystem integration.

MusicManager — Core API

from abstractmusic import MusicManager

mm = MusicManager(backend="acemusic")

# Generate audio bytes directly
audio_bytes = mm.t2m(
    prompt="ambient lo-fi study music",
    duration=30,
    format="wav",
    seed=42,
)

# Generate with artifact metadata
asset = mm.generate_audio(
    prompt="cinematic orchestral cue",
    duration=60,
    lyrics="Rise above the shadows...",
    vocal_language="en",
    bpm=120,
)

Text Planning Layer

mm = MusicManager(
    backend="acestep",
    text_planner_mode="auto",  # "auto" | "required" | "off"
)

# Auto mode: uses injected provider when present,
# falls back to the deterministic planner
audio = mm.t2m(
    "epic battle music with choir",
    duration=45,  # structure-prompt enabled by default at 45s+
)
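
The three modes can be read as a small resolution rule. The sketch below is one plausible interpretation, not the library's code. The "auto" behavior follows the comment above, while the behavior shown for "required" (fail when nothing is injected) and "off" (skip planning) is an assumption. resolve_planner, deterministic_planner, and the plan dictionary shape are hypothetical names.

```python
from typing import Callable, Optional

Planner = Callable[[str], dict]


def deterministic_planner(prompt: str) -> dict:
    """Dependency-free stand-in: normalizes whitespace and adds no lyrics."""
    return {
        "prompt": " ".join(prompt.split()),
        "lyrics": None,
        "planner": "deterministic",
    }


def resolve_planner(mode: str, injected: Optional[Planner] = None) -> Optional[Planner]:
    """Pick the planner to use for a given mode (assumed semantics)."""
    if mode == "off":
        # Assumption: planning is skipped and the prompt passes through as-is.
        return None
    if mode == "required":
        # Assumption: a host-provided planner must be present.
        if injected is None:
            raise RuntimeError("text_planner_mode='required' but no planner was injected")
        return injected
    if mode == "auto":
        # From the text: injected provider when present, else deterministic.
        return injected or deterministic_planner
    raise ValueError(f"unknown text_planner_mode: {mode!r}")


planner = resolve_planner("auto")
print(planner("  epic   battle music with choir ")["prompt"])
```

Keeping the fallback deterministic means a generation request still works on a base install where no smarter planner is available.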

AbstractCore Plugin Integration

from abstractcore import create_llm

llm = create_llm("openai")

# Generate music via capability plugin
audio = llm.music.t2m("chill ambient study beats", duration=30)

# Provider and model discovery (no weight loading)
providers = llm.music.available_providers()
models = llm.music.list_models(provider="acestep")
catalog = llm.music.capability_catalog()

# Local engine residency (acestep, stable-audio-3)
llm.music.load_resident_model(provider="acestep")
llm.music.list_resident_models()
llm.music.unload_resident_model(provider="acestep")

Available Backends

ACE Music

Default remote backend. Hosted API with ACEMUSIC_API_KEY. Stdlib-only client. Supports tagged <prompt> mode for strict duration enforcement.

ElevenLabs Music

Stdlib-only remote backend for ElevenLabs Music endpoints. Supports MusicCompositionPlan for structured requests. Requires ELEVENLABS_API_KEY.

ACE-Step

Local generation via AceStepPipeline. MIT XL Turbo checkpoint. Lyrics, vocal language, BPM, seed. MPS bfloat16 → float32 → CPU fallback chain.

Stable Audio 3

Internal stable-audio-3-small-music runtime. HuggingFace weights only, no upstream package. Validated at 30s and 120s. Requires HF gate approval.