A unified Python interface over multiple LLM backends with consistent support for streaming, tool calling, structured output, media handling, embeddings, and an optional OpenAI-compatible HTTP server.
Write your code once and run it across cloud providers, local servers, and in-process engines. AbstractCore normalizes provider differences so you can switch backends without rewriting your application.
from abstractcore import create_llm
# Cloud provider
llm = create_llm("openai", model="gpt-4o-mini")
# Local server — same API
llm = create_llm("ollama", model="qwen3:4b")
# In-process on Apple Silicon
llm = create_llm("mlx", model="mlx-community/Qwen3-4B")
resp = llm.generate("Say hello in French.")
print(resp.content)

AbstractCore is designed to be lightweight by default. The core install is small and fast. Heavy dependencies stay behind optional extras and are imported lazily.
# Lightweight core (HTTP-only providers)
pip install abstractcore
# Add only what you need
pip install "abstractcore[openai]"
pip install "abstractcore[anthropic]"
pip install "abstractcore[media]"
pip install "abstractcore[server]"
# Turnkey Apple Silicon install
pip install "abstractcore[all-apple]"
# Turnkey NVIDIA GPU install
pip install "abstractcore[all-gpu]"

Everything you need to build production AI applications, from basic text generation to multimodal pipelines.
OpenAI, Anthropic, OpenRouter, Portkey, Ollama, LM Studio, vLLM, MLX, HuggingFace — and any generic OpenAI-compatible endpoint. Same create_llm() interface across all.
Pass stream=True to any provider and iterate over chunks. Works identically across cloud, local HTTP, and in-process backends.
Decorate functions with @tool for a universal tool representation. Tool calls are pass-through by default: your host or runtime executes them, while AbstractCore normalizes the call format across providers.
Pass a Pydantic model as response_model=... and get typed objects back. AbstractCore uses native provider structured output when available and falls back to prompted strategies plus validation otherwise.
Policy-driven image, audio, video, and document input. Configurable vision/audio/video fallback pipelines. Capability plugins for TTS, STT, image gen, video gen, and music.
Install abstractcore[embeddings] and use EmbeddingManager for text embeddings with local models. Clean API for RAG pipelines and semantic search.
Turn AbstractCore into an API gateway with /v1/chat/completions, optional image/audio/video endpoints, Swagger UI, and configurable server auth.
Persistent configuration at ~/.abstractcore/config/. Set API keys, default models, vision/audio/video strategies, logging, and server settings from the command line.
BasicSession for multi-turn state. CachedSession adds provider prompt caching with stable prefix reuse. Attach files as context — cache them across turns.
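
The structured-output item above mentions a fallback of prompted strategies plus validation. The following is a minimal, library-independent sketch of that general idea, not AbstractCore's actual implementation: ask for JSON, validate the reply, and retry with the error fed back to the model. The `ask` callable, `parse_answer`, and `generate_validated` are hypothetical names introduced for illustration, and a dataclass stands in for the Pydantic model to keep the sketch dependency-free.

```python
import json
from dataclasses import dataclass, fields
from typing import Callable, List


@dataclass
class Answer:
    title: str
    bullets: List[str]


def parse_answer(raw: str) -> Answer:
    """Parse raw model text as JSON and check it against Answer's fields."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {f.name for f in fields(Answer)}
    if set(data) != expected:
        raise ValueError(f"expected keys {sorted(expected)}, got {sorted(data)}")
    if not isinstance(data["title"], str):
        raise ValueError("title must be a string")
    bullets = data["bullets"]
    if not isinstance(bullets, list) or not all(isinstance(b, str) for b in bullets):
        raise ValueError("bullets must be a list of strings")
    return Answer(title=data["title"], bullets=bullets)


def generate_validated(ask: Callable[[str], str], prompt: str, max_attempts: int = 3) -> Answer:
    """Call ask() until the reply validates, feeding each error back into the prompt."""
    instructions = prompt + '\nReply with JSON only: {"title": str, "bullets": [str, ...]}'
    last_error = None
    for _ in range(max_attempts):
        raw = ask(instructions)
        try:
            return parse_answer(raw)
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            last_error = exc
            instructions += f"\nYour previous reply was invalid ({exc}). Try again."
    raise ValueError(f"no valid reply after {max_attempts} attempts") from last_error
```

Under these assumptions, `ask` could be any function that sends a prompt and returns text, for example `lambda p: llm.generate(p).content` with an `llm` created by `create_llm`.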
From zero to generating in under a minute.
pip install abstractcore
# Add provider SDK extras as needed
pip install "abstractcore[openai]" # OpenAI SDK
pip install "abstractcore[anthropic]" # Anthropic SDK

from abstractcore import create_llm
llm = create_llm("openai", model="gpt-4o-mini")
resp = llm.generate("Say hello in French.")
print(resp.content)

for chunk in llm.generate("Write a short poem.", stream=True):
    print(chunk.content or "", end="", flush=True)

from abstractcore import create_llm, tool

@tool
def get_weather(city: str) -> str:
    """Return weather for a city."""
    return f"{city}: 22C and sunny"
llm = create_llm("openai", model="gpt-4o-mini")
resp = llm.generate(
    "What's the weather in Paris?",
    tools=[get_weather]
)
print(resp.tool_calls)  # structured calls for your host to execute

from pydantic import BaseModel
from abstractcore import create_llm
class Answer(BaseModel):
    title: str
    bullets: list[str]
llm = create_llm("openai", model="gpt-4o-mini")
result = llm.generate(
    "Summarize HTTP/3 in 3 bullets.",
    response_model=Answer
)
print(result.title)
print(result.bullets)

from abstractcore import BasicSession, create_llm
session = BasicSession(
    create_llm("anthropic", model="claude-haiku-4-5"),
    temperature=0.3
)
print(session.generate("Give me 3 startup ideas.").content)
print(session.generate("Pick the best one and explain why.").content)

Key patterns and provider configuration.
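
One pattern the tool-calling example above leaves open is what the host does with `resp.tool_calls`, since execution is pass-through by default. Below is a minimal sketch of a host-side dispatcher. It assumes each call can be represented as a dict with `name` and `arguments` keys; that shape is an assumption made for illustration, and the real objects in `resp.tool_calls` may differ. `REGISTRY` and `execute_tool_calls` are hypothetical helpers, not part of AbstractCore.

```python
import json
from typing import Any, Callable, Dict, List


def get_weather(city: str) -> str:
    """Toy tool, mirroring the quickstart example."""
    return f"{city}: 22C and sunny"


# Host-side registry: tool name -> plain Python callable.
REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def execute_tool_calls(calls: List[dict], registry: Dict[str, Callable[..., Any]]) -> List[dict]:
    """Run each requested call and report success or failure per call."""
    results = []
    for call in calls:
        name = call.get("name")
        func = registry.get(name)
        if func is None:
            results.append({"name": name, "ok": False, "output": f"unknown tool: {name}"})
            continue
        try:
            args = call.get("arguments") or {}
            if isinstance(args, str):
                # Some providers deliver arguments as a JSON-encoded string.
                args = json.loads(args)
            results.append({"name": name, "ok": True, "output": func(**args)})
        except Exception as exc:  # bad JSON, wrong arguments, or a tool failure
            results.append({"name": name, "ok": False, "output": f"error: {exc}"})
    return results
```

Only tools present in the registry are ever executed, so the model cannot trigger arbitrary functions by name.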
Media input is policy-driven. Audio and video require explicit policy configuration, so that a fallback such as transcription or captioning never silently changes what the model receives.
from abstractcore import create_llm
llm = create_llm("anthropic", model="claude-haiku-4-5")
resp = llm.generate("Describe this image.", media=["./image.png"])
print(resp.content)
# Audio/video policies: native_only | speech_to_text | auto | caption
# Configure via CLI:
# abstractcore --set-audio-strategy auto
# abstractcore --set-video-strategy auto

pip install "abstractcore[embeddings]"
from abstractcore.embeddings import EmbeddingManager
em = EmbeddingManager()
vec = em.embed_text("hello world")
print(len(vec))

pip install "abstractcore[server]"
python -m abstractcore.server.app
# Health check
curl http://localhost:8000/health
# Chat completions (provider/model format)
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-4o-mini", "messages": [{"role": "user", "content": "Hello!"}]}'
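
The same chat request can be sent from Python using only the standard library. This is a sketch under two assumptions: the server is reachable at the `localhost:8000` address used in the curl commands, and the response follows the usual OpenAI chat-completions shape (`choices[0].message.content`). The helper names `build_chat_request` and `chat` are introduced here for illustration, and server auth, if configured, is not handled.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # address taken from the curl examples


def build_chat_request(model: str, user_message: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,  # "provider/model" format, e.g. "openai/gpt-4o-mini"
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model: str, user_message: str, base_url: str = BASE_URL, timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(model, user_message, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: OpenAI-style response layout.
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("openai/gpt-4o-mini", "Hello!")` would return the reply text. Building the request separately makes the payload easy to inspect without a live server.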
# Swagger UI: http://localhost:8000/docs

AbstractCore discovers optional modality backends via Python entry points:
TTS & STT via llm.voice and llm.audio. Local and server-backed speech pipelines.
Image & video generation via llm.vision. Text-to-image, image-to-image, text-to-video, image-to-video.
Text-to-music via llm.music. Local ACE-Step backend for in-process generation.
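
For readers unfamiliar with entry-point discovery, the general mechanism can be sketched with the standard library's `importlib.metadata`. The group name below is hypothetical and chosen only for illustration; the group AbstractCore actually reads is not stated here. The helper lists what installed packages have registered under a group without importing any of them.

```python
from importlib.metadata import entry_points

# Hypothetical group name, for illustration only.
GROUP = "example.capability_plugins"


def discover_plugins(group: str) -> dict:
    """Map entry-point name -> 'module:attr' target for one group, without importing."""
    try:
        selected = entry_points(group=group)  # Python 3.10+
    except TypeError:
        selected = entry_points().get(group, [])  # Python 3.8 / 3.9
    return {ep.name: ep.value for ep in selected}


plugins = discover_plugins(GROUP)
print(plugins)
```

Calling `.load()` on an entry point is what actually imports the plugin, so a host can list backends cheaply and import a heavy one only when it is first used.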
# Interactive config wizard
abstractcore --config
# Set API keys
abstractcore --set-api-key openai sk-...
abstractcore --set-api-key anthropic sk-ant-...
# Set default models
abstractcore --set-chat-model openai/gpt-4o-mini
abstractcore --set-code-model anthropic/claude-haiku-4-5
# Media strategies
abstractcore --set-video-strategy auto
abstractcore --set-audio-strategy auto
# Check status
abstractcore --status

| Extra | What it adds |
|---|---|
| openai | OpenAI Python SDK |
| anthropic | Anthropic Python SDK |
| remote | OpenAI + Anthropic SDKs |
| huggingface | Transformers + Torch (heavy) |
| mlx / apple | Apple Silicon local LLM (heavy) |
| vllm / gpu | NVIDIA GPU inference (heavy) |
| tools | Built-in web search & filesystem tools |
| media | Image + PDF/Office document extraction |
| embeddings | EmbeddingManager + local models |
| server | OpenAI-compatible /v1 HTTP server (FastAPI) |
| tokens | Precise token counting (tiktoken) |
| all-apple / all-gpu | Full local dev environment (turnkey) |
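
To see which of these extras look usable in the current environment, one option is to probe for the third-party modules they bring in. The sketch below uses `importlib.util.find_spec`, which locates a top-level module without importing it. The mapping covers only the rows whose underlying package is named in the table, and treating "module is importable" as "extra is installed" is an assumption. `EXTRA_MODULES` and `installed_extras` are illustrative names, not part of AbstractCore.

```python
from importlib.util import find_spec

# Extra name -> top-level modules it is expected to provide (derived from the table).
EXTRA_MODULES = {
    "openai": ["openai"],
    "anthropic": ["anthropic"],
    "huggingface": ["transformers", "torch"],
    "server": ["fastapi"],
    "tokens": ["tiktoken"],
}


def installed_extras(extra_modules: dict = EXTRA_MODULES) -> dict:
    """Return {extra: True/False} based on whether all its modules can be found."""
    return {
        extra: all(find_spec(module) is not None for module in modules)
        for extra, modules in extra_modules.items()
    }


for extra, available in installed_extras().items():
    print(f"{extra:12} {'installed' if available else 'missing'}")
```

Because nothing is imported, the check stays fast even when heavy packages such as Torch are present.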