Model-agnostic text-to-music and text-to-audio generation for the Abstract ecosystem. Remote-first with ACE Music and ElevenLabs, with local generation via ACE-Step and Stable Audio 3 on Apple Silicon and GPU.
from abstractmusic import MusicManager
mm = MusicManager(backend="acemusic")
# Generate music from a text prompt
audio = mm.t2m("ambient lo-fi study music", duration=30)
# Save to file
with open("out.wav", "wb") as f:
    f.write(audio)
AbstractMusic is a model-agnostic text-to-music and text-to-audio library. The base install is import-light with remote backends (ACE Music, ElevenLabs). Local inference stacks for ACE-Step, Stable Audio 3, and MusicGen live behind optional extras.
The acemusic backend calls a hosted API out of the box. The elevenlabs backend targets ElevenLabs Music endpoints. Both are stdlib-only with no heavy dependencies in the base install.
The acestep backend wraps the Diffusers AceStepPipeline in AbstractMusic-owned orchestration and loads model weights from HuggingFace. It runs locally on Apple Silicon (MPS) and CUDA GPUs, with automatic CPU fallback.
AbstractMusic separates text planning from audio synthesis. The built-in deterministic planner is dependency-free. Host applications can inject smarter planners through MusicManager or plugin configuration for prompt enhancement and lyrics generation.
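As a rough illustration of the planning/synthesis split, the sketch below shows a planner interface with a deterministic fallback and an injected replacement. The names `PlanResult`, `TextPlanner`, `DeterministicPlanner`, `TaggingPlanner`, and `resolve_planner` are hypothetical and chosen for this example; they are not taken from AbstractMusic's API.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class PlanResult:
    """Hypothetical planner output: the prompt to synthesize plus optional lyrics."""
    prompt: str
    lyrics: Optional[str] = None


class TextPlanner(Protocol):
    """Hypothetical interface a host application could implement."""

    def plan(self, prompt: str) -> PlanResult: ...


class DeterministicPlanner:
    """Dependency-free fallback: the same input always yields the same plan."""

    def plan(self, prompt: str) -> PlanResult:
        # Normalize whitespace only; no model calls, no randomness.
        return PlanResult(prompt=" ".join(prompt.split()))


class TaggingPlanner:
    """Stand-in for a smarter injected planner (for example, one backed by an LLM)."""

    def plan(self, prompt: str) -> PlanResult:
        return PlanResult(
            prompt=f"{prompt.strip()}, cinematic, wide stereo",
            lyrics="[instrumental]",
        )


def resolve_planner(injected: Optional[TextPlanner] = None) -> TextPlanner:
    """Use the injected planner when present, else the deterministic one."""
    return injected if injected is not None else DeterministicPlanner()


default_plan = resolve_planner().plan("  epic   battle music ")
custom_plan = resolve_planner(TaggingPlanner()).plan("epic battle music")
print(default_plan)
print(custom_plan)
```

Because the synthesis side would only consume a `PlanResult`, swapping planners in this sketch does not touch any audio code.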
Generate music from text prompts using remote APIs or local models. The manager stays thin and model-agnostic while backends handle provider-specific behavior.
Default remote backend. Calls a hosted ACE Music API with an API key. Lightweight stdlib-only client. Supports prompt, lyrics, duration, BPM, seed, and format parameters.
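To make "stdlib-only client" concrete, here is a minimal sketch of how such a request could be assembled with `urllib.request` and `json` alone. The endpoint URL, the JSON field names, and the bearer-token header are assumptions made for illustration; only the parameter list and the `ACEMUSIC_API_KEY` variable come from this page. The request is built but not sent.

```python
import json
import os
import urllib.request

# Placeholder endpoint: the real ACE Music URL is not given in this document.
API_URL = "https://api.example.invalid/v1/music"


def build_request(prompt, duration, fmt="wav", bpm=None, seed=None, lyrics=None):
    """Build (but do not send) a JSON POST request using only the stdlib."""
    payload = {"prompt": prompt, "duration": duration, "format": fmt}
    # Only include optional parameters the caller actually set.
    for key, value in (("bpm", bpm), ("seed", seed), ("lyrics", lyrics)):
        if value is not None:
            payload[key] = value
    api_key = os.environ.get("ACEMUSIC_API_KEY", "")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


req = build_request("ambient lo-fi study music", duration=30, seed=42)
print(req.get_method(), json.loads(req.data))
```

Sending the request would then be a call to `urllib.request.urlopen(req)` followed by reading the response bytes, which keeps the base install free of third-party HTTP libraries.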
Stdlib-only remote backend for ElevenLabs Music endpoints. Supports provider-neutral MusicCompositionPlan for structured music requests. Requires ELEVENLABS_API_KEY.
Local generation via Diffusers AceStepPipeline. MIT-licensed XL Turbo checkpoint. Supports lyrics, vocal language, BPM, and seed. MPS bfloat16 preferred on Apple Silicon with automatic fallbacks.
Internal text-to-music path for stabilityai/stable-audio-3-small-music. AbstractMusic-owned runtime code with HuggingFace weights. The Small Music checkpoint has been validated for generations up to 120 seconds.
ACE-Step prefers PyTorch MPS with bfloat16 when supported, then float32, with CPU float32 as the final fallback. Automatic MPS memory capping and watermark validation for 18 GB+ machines.
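The preference order described above can be sketched as a small pure function. The function name and its two boolean inputs are hypothetical; in a real setup the first flag would typically come from `torch.backends.mps.is_available()`, and the second from whatever bfloat16 probe the runtime performs, which this page does not specify.

```python
def pick_device_and_dtype(mps_available: bool, mps_bf16_ok: bool) -> tuple:
    """Return (device, dtype) following MPS bf16 -> MPS fp32 -> CPU fp32."""
    if mps_available and mps_bf16_ok:
        return ("mps", "bfloat16")
    if mps_available:
        return ("mps", "float32")
    # Final fallback when no MPS device is usable.
    return ("cpu", "float32")


# Walk through the three cases in preference order.
for flags in [(True, True), (True, False), (False, False)]:
    print(flags, "->", pick_device_and_dtype(*flags))
```

Keeping the decision separate from any tensor code makes the policy easy to test without a GPU or even a PyTorch install.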
Optional --enhance-prompt and --auto-lyrics flags. Structure prompting enabled by default for 45+ second generations. Planner results include a provider-neutral composition_plan.
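A minimal sketch of the "on by default at 45 seconds and above" rule follows. The function name and the explicit `override` argument are assumptions for illustration; only the 45-second threshold comes from the text.

```python
from typing import Optional

STRUCTURE_THRESHOLD_SECONDS = 45  # from the text: default-on at 45+ seconds


def use_structure_prompt(duration: float, override: Optional[bool] = None) -> bool:
    """Decide whether to add structure hints to the prompt."""
    # An explicit caller choice (hypothetical) wins over the default rule.
    if override is not None:
        return override
    return duration >= STRUCTURE_THRESHOLD_SECONDS


print(use_structure_prompt(30))
print(use_structure_prompt(45))
print(use_structure_prompt(60, override=False))
```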
Artifact screening with WAV/music-likeness inspection and repetition/novelty quality gates. Validation state is explicit through smoke metrics, tests, and registry status.
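As a simplified stand-in for the WAV inspection step, the sketch below reads basic properties from in-memory WAV bytes with the stdlib `wave` module and applies a crude usability check. It does not implement the music-likeness or repetition/novelty gates; the function names, the thresholds, and the 16-bit PCM assumption are all illustrative.

```python
import io
import math
import struct
import wave


def inspect_wav(data: bytes) -> dict:
    """Read channel count, sample rate, duration, and peak level from WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        n_frames = wav.getnframes()
        rate = wav.getframerate()
        channels = wav.getnchannels()
        width = wav.getsampwidth()
        raw = wav.readframes(n_frames)
    peak = 0
    if width == 2 and raw:  # this sketch only measures 16-bit PCM
        samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
        peak = max(abs(s) for s in samples)
    return {"channels": channels, "sample_rate": rate,
            "duration": n_frames / rate, "peak": peak}


def looks_usable(info: dict, min_seconds: float = 1.0, min_peak: int = 100) -> bool:
    """Reject clips that are too short or effectively silent (assumed thresholds)."""
    return info["duration"] >= min_seconds and info["peak"] >= min_peak


def make_test_tone(seconds: float, rate: int = 8000) -> bytes:
    """Build a mono 16-bit 440 Hz sine wave so the example is self-contained."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        frames = b"".join(
            struct.pack("<h", int(12000 * math.sin(2 * math.pi * 440 * i / rate)))
            for i in range(int(seconds * rate))
        )
        wav.writeframes(frames)
    return buf.getvalue()


info = inspect_wav(make_test_tone(2.0))
print(info, looks_usable(info))
```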
Packaged music_model_capabilities.json declares model metadata, license, and precision policy. Discovery via available_providers(), list_models(), and capability_catalog() without loading weights.
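Discovery without loading weights amounts to reading metadata. The sketch below assumes a simple JSON layout with a top-level `models` list; the real schema of music_model_capabilities.json is not shown on this page, so the field names here are hypothetical. The two model identifiers and the MIT license for the ACE-Step checkpoint are the ones mentioned elsewhere in this document.

```python
import json

# Hypothetical catalog layout; a null license means "not stated in this example".
CATALOG_JSON = """
{
  "models": [
    {"provider": "acestep",
     "id": "ACE-Step/acestep-v15-xl-turbo-diffusers",
     "license": "MIT"},
    {"provider": "stable-audio-3",
     "id": "stabilityai/stable-audio-3-small-music",
     "license": null}
  ]
}
"""


def available_providers(catalog: dict) -> list:
    """Return sorted provider names found in the catalog."""
    return sorted({m["provider"] for m in catalog["models"]})


def list_models(catalog: dict, provider=None) -> list:
    """Return model ids, optionally filtered by provider."""
    return [m["id"] for m in catalog["models"]
            if provider is None or m["provider"] == provider]


catalog = json.loads(CATALOG_JSON)
print(available_providers(catalog))
print(list_models(catalog, provider="acestep"))
```

Nothing here imports an inference library, which is the point: listing providers and models stays cheap even when the local extras are not installed.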
Generate music in minutes. The base install is lightweight with remote backends; local inference is an explicit extra.
# Base install (remote ACE Music + ElevenLabs backends)
pip install abstractmusic
# Local ACE-Step generation
pip install "abstractmusic[acestep]"
# Local Stable Audio 3
pip install "abstractmusic[stable-audio-3]"
# Full local stack (Apple Silicon)
pip install "abstractmusic[all-apple]"
# Full local stack (GPU)
pip install "abstractmusic[all-gpu]"
# Remote generation (default ACE Music backend)
abstractmusic t2m "ambient lo-fi study music" --out out.wav --duration 30
# ElevenLabs backend
abstractmusic --backend elevenlabs t2m "cinematic instrumental synth cue" \
    --format mp3 --out out.mp3 --duration 30
# Local ACE-Step generation
abstractmusic --backend acestep t2m "ambient lo-fi study music" \
    --out out.wav --duration 10
# Enhanced prompt with auto-lyrics
abstractmusic --backend acestep t2m "heroic fantasy epic music" \
    --enhance-prompt --auto-lyrics --print-plan --out out.wav --duration 30
# Local Stable Audio 3
abstractmusic --backend stable-audio-3 t2m "rhythmic space shooter game music" \
    --out out.wav --duration 30 --steps 16
from abstractmusic import MusicManager
# Remote generation (reads ACEMUSIC_API_KEY from env)
mm = MusicManager(backend="acemusic")
audio = mm.t2m("ambient lo-fi study music", duration=30)
# Local ACE-Step generation
mm_local = MusicManager(backend="acestep")
audio = mm_local.t2m(
    "epic orchestral film score",
    duration=30,
    seed=42,
)
# Artifact-based generation
asset = mm.generate_audio(
    prompt="jazz piano trio",
    duration=30,
    format="wav",
)
# Start the interactive REPL
abstractmusic cli
# Inside the REPL:
# /engines — list available backends
# /models — list known models by engine
# /engine acestep — switch to local ACE-Step
# /model ACE-Step/acestep-v15-xl-turbo-diffusers
# /status — show active engine, model, params
# /download on — enable HuggingFace downloads
The public API is built around MusicManager for direct usage and the AbstractCore capability plugin for ecosystem integration.
from abstractmusic import MusicManager
mm = MusicManager(backend="acemusic")
# Generate audio bytes directly
audio_bytes = mm.t2m(
    prompt="ambient lo-fi study music",
    duration=30,
    format="wav",
    seed=42,
)
# Generate with artifact metadata
asset = mm.generate_audio(
    prompt="cinematic orchestral cue",
    duration=60,
    lyrics="Rise above the shadows...",
    vocal_language="en",
    bpm=120,
)
mm = MusicManager(
    backend="acestep",
    text_planner_mode="auto",  # "auto" | "required" | "off"
)
# Auto mode: uses injected provider when present,
# falls back to the deterministic planner
audio = mm.t2m(
    "epic battle music with choir",
    duration=45,  # structure-prompt enabled by default at 45s+
)
from abstractcore import create_llm
llm = create_llm("openai")
# Generate music via capability plugin
audio = llm.music.t2m("chill ambient study beats", duration=30)
# Provider and model discovery (no weight loading)
providers = llm.music.available_providers()
models = llm.music.list_models(provider="acestep")
catalog = llm.music.capability_catalog()
# Local engine residency (acestep, stable-audio-3)
llm.music.load_resident_model(provider="acestep")
llm.music.list_resident_models()
llm.music.unload_resident_model(provider="acestep")
Default remote backend. Hosted API with ACEMUSIC_API_KEY. Stdlib-only client. Supports tagged <prompt> mode for strict duration enforcement.
Stdlib-only remote backend for ElevenLabs Music endpoints. Supports MusicCompositionPlan for structured requests. Requires ELEVENLABS_API_KEY.
Local generation via AceStepPipeline. MIT XL Turbo checkpoint. Lyrics, vocal language, BPM, seed. MPS bfloat16 → float32 → CPU fallback chain.
Internal stable-audio-3-small-music runtime. HuggingFace weights only, no upstream package. Validated at 30s and 120s. Requires HF gate approval.