
AbstractSemantics

Central, editable semantics registry for AbstractFramework. Defines predicates and entity types as CURIE-based allowlists, with helpers to build compact JSON Schema for knowledge-graph assertion structured outputs.

from abstractsemantics import load_semantics_registry, build_kg_assertion_schema_v0

reg = load_semantics_registry()
schema = build_kg_assertion_schema_v0(registry=reg)

print(len(reg.predicates), len(reg.entity_types))

Definitions, Not Storage

AbstractSemantics intentionally contains definitions only — prefix mappings, predicate allowlists, and entity-type allowlists. It makes no network calls and does not store or query data. Storage and querying belong in downstream systems like AbstractMemory.

YAML Registry

The semantics registry is a YAML file that defines the allowed predicate IDs and entity-type IDs. It ships as package data and can be overridden via the ABSTRACTSEMANTICS_REGISTRY_PATH environment variable.

JSON Schema Builder

Build a compact, bounded JSON Schema dict from the registry for LLM structured outputs. Enforces predicate and entity-type enums, evidence fields, and output size limits.

Stable $ref Resolution

Workflows can reference the schema by a stable $ref string and resolve it at runtime. No need to embed the full schema dict in flow definitions.
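A minimal sketch of how stable-reference resolution can work, assuming a plain lookup table from reference strings to builder functions. Only the reference string comes from the API listing on this page; `_BUILDERS`, `_build_placeholder_schema`, `resolve_ref_sketch`, and the `response_schema` key are hypothetical names for illustration, not the library's internals.

```python
from typing import Callable, Dict, Optional

# Matches the documented constant; everything else here is illustrative.
SCHEMA_REF = "abstractsemantics:kg_assertion_schema_v0"


def _build_placeholder_schema() -> dict:
    # Stand-in for a real schema builder.
    return {"type": "object", "properties": {"assertions": {"type": "array"}}}


# Hypothetical lookup table: reference string -> zero-argument builder.
_BUILDERS: Dict[str, Callable[[], dict]] = {SCHEMA_REF: _build_placeholder_schema}


def resolve_ref_sketch(schema: dict) -> Optional[dict]:
    """Return a concrete schema for a known {"$ref": ...} dict, else None."""
    ref = schema.get("$ref") if isinstance(schema, dict) else None
    builder = _BUILDERS.get(ref) if isinstance(ref, str) else None
    return builder() if builder else None


# A flow definition only needs to carry the short reference.
flow_node = {"response_schema": {"$ref": SCHEMA_REF}}
resolved = resolve_ref_sketch(flow_node["response_schema"])
```

Because the schema is built at resolution time, a flow definition stored this way would pick up registry changes without being rewritten.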

Dependency-Light

Only depends on PyYAML. Tiny public API with immutable frozen dataclasses. Designed to be embedded in downstream systems with minimal overhead.

Registry & Schema

A shared vocabulary for knowledge-graph assertions used by runtimes, UIs, and ingestion pipelines across the AbstractFramework ecosystem.

Predicate Allowlist

35+ predicates from standard vocabularies (RDF, Dublin Core, Schema.org, SKOS, CiTO). Covers structure, metadata, context, actions, concepts, and evidence relationships.

Entity Type Allowlist

17+ entity types spanning persons, organizations, events, places, concepts, datasets, software, media, and generic things. Hierarchical via optional parent pointers.
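The parent pointers can be used to answer "is this type a kind of that type?" questions. The sketch below assumes a simplified stand-in dataclass (`EntityTypeSketch`) and made-up hierarchy data; the shipped registry's actual parent assignments may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class EntityTypeSketch:
    # Simplified stand-in for an entity-type entry (hypothetical).
    id: str
    parent: Optional[str] = None


# Illustrative data only.
TYPES: Dict[str, EntityTypeSketch] = {
    t.id: t
    for t in [
        EntityTypeSketch("schema:Thing"),
        EntityTypeSketch("schema:Person", parent="schema:Thing"),
        EntityTypeSketch("schema:Organization", parent="schema:Thing"),
    ]
}


def ancestors(type_id: str, types: Dict[str, EntityTypeSketch]) -> List[str]:
    """Walk parent pointers upward; stop at a root, an unknown ID, or a cycle."""
    chain: List[str] = []
    seen = {type_id}
    current = types.get(type_id)
    while current is not None and current.parent is not None:
        if current.parent in seen:
            break  # defensive: a malformed registry could contain a loop
        chain.append(current.parent)
        seen.add(current.parent)
        current = types.get(current.parent)
    return chain


def is_a(type_id: str, candidate: str, types: Dict[str, EntityTypeSketch]) -> bool:
    """True if type_id equals candidate or inherits from it."""
    return type_id == candidate or candidate in ancestors(type_id, types)
```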

CURIE Prefix Mapping

Standard namespace prefixes (dcterms, schema, skos, cito, rdf) for compact, human-readable identifiers. Loaded but not expanded — IDs are treated as opaque strings.
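Since the registry leaves IDs unexpanded, a downstream system that needs full IRIs would have to expand them itself. A minimal sketch of that step, using the prefix URIs from the registry YAML example on this page; `split_curie` and `expand_curie` are hypothetical helpers, not part of the package.

```python
from typing import Dict, Optional, Tuple

# Prefix URIs taken from the registry YAML example in this document.
PREFIXES: Dict[str, str] = {
    "schema": "https://schema.org/",
    "dcterms": "http://purl.org/dc/terms/",
}


def split_curie(curie: str) -> Optional[Tuple[str, str]]:
    """Split 'prefix:local' into its two parts; return None if malformed."""
    prefix, sep, local = curie.partition(":")
    if not sep or not prefix or not local:
        return None
    return prefix, local


def expand_curie(curie: str, prefixes: Dict[str, str]) -> Optional[str]:
    """Expand to a full IRI when the prefix is known, else None."""
    parts = split_curie(curie)
    if parts is None or parts[0] not in prefixes:
        return None
    return prefixes[parts[0]] + parts[1]


print(expand_curie("dcterms:isPartOf", PREFIXES))
# http://purl.org/dc/terms/isPartOf
```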

Predicate Aliases

A small, deterministic alias set for predicate strings that LLMs tend to emit by default. Optional — aliases are not part of the canonical registry.
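To illustrate what a deterministic alias step can look like, the sketch below maps a few loosely formatted predicate strings onto canonical IDs with a fixed lookup table. The alias entries, the allowlist, and `canonical_predicate` are all invented for this example; the package's real alias set is not listed on this page.

```python
from typing import Dict, Optional, Set

# Hypothetical alias table and allowlist, for illustration only.
ALIASES: Dict[str, str] = {
    "is_part_of": "dcterms:isPartOf",
    "has_part": "dcterms:hasPart",
    "name": "schema:name",
}
ALLOWED: Set[str] = {"dcterms:isPartOf", "dcterms:hasPart", "schema:name"}


def canonical_predicate(raw: str) -> Optional[str]:
    """Return the canonical ID for an allowed ID or known alias, else None."""
    key = raw.strip()
    if key in ALLOWED:
        return key  # already canonical
    # Fixed table lookup: the same input always yields the same output.
    return ALIASES.get(key.lower())
```

A plain dictionary keeps the mapping auditable: there is no fuzzy matching, so an unrecognized string is rejected rather than guessed at.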

Bounded Output Schema

The v0 schema caps output size with maxItems, requires a minimum number of assertions when the output is non-empty, and bounds evidence fields with maxLength constraints.
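One possible way to express these bounds in JSON Schema is sketched below. The default values mirror the documented builder signature, but the field names (`subject`, `subject_type`, `object`, `evidence_quote`) and the `anyOf` encoding of "empty, or at least N" are assumptions; the schema the package actually emits may be shaped differently.

```python
from typing import List


def bounded_schema_sketch(
    predicate_ids: List[str],
    entity_type_ids: List[str],
    max_assertions: int = 12,
    min_assertions_when_nonempty: int = 3,
    max_evidence_quote_len: int = 160,
) -> dict:
    """Build an illustrative bounded schema; field names are hypothetical."""
    assertion = {
        "type": "object",
        "properties": {
            "subject": {"type": "string"},
            "subject_type": {"enum": sorted(entity_type_ids)},  # entity-type allowlist
            "predicate": {"enum": sorted(predicate_ids)},  # predicate allowlist
            "object": {"type": "string"},
            "evidence_quote": {"type": "string", "maxLength": max_evidence_quote_len},
        },
        "required": ["subject", "predicate", "object"],
        "additionalProperties": False,
    }
    return {
        "type": "object",
        "properties": {
            "assertions": {
                "type": "array",
                "items": assertion,
                "maxItems": max_assertions,  # hard cap on output size
                # Either an empty list, or at least the minimum count.
                "anyOf": [
                    {"maxItems": 0},
                    {"minItems": min_assertions_when_nonempty},
                ],
            }
        },
        "required": ["assertions"],
    }


schema = bounded_schema_sketch(
    ["schema:name", "dcterms:hasPart"], ["schema:Thing", "schema:Person"]
)
```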

Tolerant Loader

The registry loader skips invalid items, ignores unknown keys, and requires only that at least one valid predicate is present. Designed for forward compatibility as the registry evolves.
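The tolerant behavior described here can be sketched as a parser over already-loaded YAML data. `PredicateSketch` and `parse_predicates` are hypothetical stand-ins, and the exact validity rules (here: an entry needs a non-empty string `id`) are an assumption about what "invalid" means.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class PredicateSketch:
    # Simplified stand-in for a predicate entry (hypothetical).
    id: str
    label: Optional[str] = None
    inverse: Optional[str] = None


def parse_predicates(raw: Any) -> List[PredicateSketch]:
    """Keep well-formed entries, drop the rest, fail only if none survive."""
    items = raw.get("predicates") if isinstance(raw, dict) else None
    result: List[PredicateSketch] = []
    for item in items if isinstance(items, list) else []:
        if not isinstance(item, dict):
            continue  # skip entries that are not mappings
        pid = item.get("id")
        if not isinstance(pid, str) or not pid.strip():
            continue  # skip entries without a usable ID
        label = item.get("label")
        inverse = item.get("inverse")
        # Only known keys are read; any other keys on the item are ignored.
        result.append(
            PredicateSketch(
                id=pid.strip(),
                label=label if isinstance(label, str) else None,
                inverse=inverse if isinstance(inverse, str) else None,
            )
        )
    if not result:
        raise ValueError("registry must define at least one valid predicate")
    return result


parsed = parse_predicates(
    {
        "version": 0,
        "future_section": {"anything": True},  # unknown top-level key, ignored
        "predicates": [
            {"id": "schema:name", "label": "name", "weight": 3},  # extra key ignored
            {"label": "missing id"},  # skipped
            "not-a-mapping",  # skipped
        ],
    }
)
```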

Inverse Pointers

Predicates can declare inverse relationships (e.g. dcterms:hasPart ↔ dcterms:isPartOf) as navigational hints for graph exploration UIs.
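As a sketch of how a graph UI might use such a hint, the code below derives the reverse edge for an assertion. The inverse pair comes from the registry YAML example on this page; the triple layout, the `doc:` identifiers, and the choice to treat a one-way declaration as symmetric are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Declared in one direction only, as in the registry YAML example.
INVERSE_OF: Dict[str, str] = {"dcterms:hasPart": "dcterms:isPartOf"}


def inverse_lookup(declared: Dict[str, str]) -> Dict[str, str]:
    """Make the mapping symmetric so either direction can be navigated."""
    both = dict(declared)
    for pred, inv in declared.items():
        both.setdefault(inv, pred)
    return both


def inverse_edge(triple: Triple, inverses: Dict[str, str]) -> Optional[Triple]:
    """Return the reversed edge if the predicate has a known inverse."""
    subject, predicate, obj = triple
    inv = inverses.get(predicate)
    return (obj, inv, subject) if inv else None


edges: List[Triple] = [("doc:report", "dcterms:hasPart", "doc:chapter-1")]
lookup = inverse_lookup(INVERSE_OF)
print(inverse_edge(edges[0], lookup))
# ('doc:chapter-1', 'dcterms:isPartOf', 'doc:report')
```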

Extraction Prompt

Ships an optimized prompt template for LLM-based knowledge extraction aligned to the registry, with guidance for evidence quoting and confidence scoring.

Install and Use

AbstractSemantics requires Python 3.10+ and depends only on PyYAML.

Install

# From PyPI
pip install abstractsemantics

# From source (editable, with test deps)
pip install -e ".[dev]"

Load the Registry

from abstractsemantics import load_semantics_registry

reg = load_semantics_registry()
print(len(reg.predicates), len(reg.entity_types))

# List allowed IDs
print(sorted(reg.predicate_ids())[:10])
print(sorted(reg.entity_type_ids())[:10])
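The ID sets are convenient for checking extracted assertions before they reach storage. The sketch below uses literal sets as stand-ins for `reg.predicate_ids()` and `reg.entity_type_ids()` so it runs without the package installed; the assertion field names (`subject_type`, `object_type`, `predicate`) and `check_assertion` are hypothetical.

```python
from typing import Dict, List, Set

# Stand-ins for reg.predicate_ids() and reg.entity_type_ids().
predicate_ids: Set[str] = {"schema:name", "dcterms:hasPart"}
entity_type_ids: Set[str] = {"schema:Thing", "schema:Person"}


def check_assertion(assertion: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means all IDs are allowed."""
    problems: List[str] = []
    predicate = assertion.get("predicate")
    if predicate not in predicate_ids:
        problems.append(f"unknown predicate: {predicate!r}")
    # Type fields are treated as optional in this sketch.
    for field in ("subject_type", "object_type"):
        value = assertion.get(field)
        if value is not None and value not in entity_type_ids:
            problems.append(f"unknown {field}: {value!r}")
    return problems


ok = check_assertion(
    {
        "subject": "Ada Lovelace",
        "subject_type": "schema:Person",
        "predicate": "schema:name",
        "object": "Ada",
    }
)
bad = check_assertion({"subject": "x", "predicate": "relatedTo", "object": "y"})
```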

Build the JSON Schema

from abstractsemantics import build_kg_assertion_schema_v0, load_semantics_registry

reg = load_semantics_registry()
schema = build_kg_assertion_schema_v0(
    registry=reg,
    include_predicate_aliases=True,
    max_assertions=20,
    min_assertions_when_nonempty=1,
)

Resolve a $ref

from abstractsemantics import KG_ASSERTION_SCHEMA_REF_V0, resolve_schema_ref

# Resolve a stable reference to a concrete schema
schema = resolve_schema_ref({"$ref": KG_ASSERTION_SCHEMA_REF_V0})
assert isinstance(schema, dict)

Custom Registry

# Option A: environment variable (run in your shell)
export ABSTRACTSEMANTICS_REGISTRY_PATH=/path/to/semantics.yaml

# Option B: explicit path (Python)
from pathlib import Path
from abstractsemantics import load_semantics_registry

reg = load_semantics_registry(Path("/path/to/semantics.yaml"))
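A sketch of how the two override options might be prioritized, assuming an explicit argument wins over the environment variable and the packaged file is the fallback. This precedence is an assumption, not documented behavior, and `resolve_path_sketch` and `PACKAGED_DEFAULT` are hypothetical names.

```python
import os
from pathlib import Path
from typing import Optional

ENV_VAR = "ABSTRACTSEMANTICS_REGISTRY_PATH"
# Hypothetical location of the packaged registry file.
PACKAGED_DEFAULT = Path("semantics.yaml")


def resolve_path_sketch(explicit: Optional[Path] = None) -> Path:
    """Pick a registry path: explicit argument, then env var, then default."""
    if explicit is not None:
        return Path(explicit)
    from_env = os.environ.get(ENV_VAR, "").strip()
    if from_env:
        return Path(from_env).expanduser()
    return PACKAGED_DEFAULT
```

Treating an empty or whitespace-only variable as unset avoids accidentally resolving to the current directory.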

Public API Surface

Import from abstractsemantics (re-exported from __init__.py). Direct submodule imports may change.

Registry API

# Load the registry
load_semantics_registry(path=None) -> SemanticsRegistry
resolve_semantics_registry_path() -> Path

# SemanticsRegistry (frozen dataclass)
.version: int
.prefixes: Dict[str, str]
.predicates: List[PredicateDef]
.entity_types: List[EntityTypeDef]
.predicate_ids() -> set[str]
.entity_type_ids() -> set[str]

# Entry shapes
PredicateDef(id, label?, inverse?, description?)
EntityTypeDef(id, label?, parent?, description?)

Schema API

# Build the structured-output JSON Schema
build_kg_assertion_schema_v0(
    registry=None,
    include_predicate_aliases=False,
    max_assertions=12,
    min_assertions_when_nonempty=3,
    max_evidence_quote_len=160,
    max_original_context_len=280,
) -> dict

# Stable reference constant
KG_ASSERTION_SCHEMA_REF_V0 = "abstractsemantics:kg_assertion_schema_v0"

# Resolve a $ref to a concrete schema dict
resolve_schema_ref(schema: dict) -> dict | None

Registry YAML Format

# semantics.yaml
version: 0
prefixes:
  schema: "https://schema.org/"
  dcterms: "http://purl.org/dc/terms/"
predicates:
  - id: "schema:name"
    label: "name"
  - id: "dcterms:hasPart"
    inverse: "dcterms:isPartOf"
entity_types:
  - id: "schema:Thing"
    label: "Thing"
  - id: "schema:Person"
    label: "Person"