Skip to content

About the DSPy LinkML Schema

DSPy is the framework for programming, not prompting, language models. This project provides a LinkML schema that formalises DSPy's data model - signatures, modules, predictors, adapters, the typed LM request/response contract, custom types, clients, providers, the cache, teleprompters, evaluation, retrievers, datasets, streaming events, and the public error hierarchy.

The schema is the single source of truth for the shape of every DSPy artefact that crosses a boundary (disk, wire, provider API, persisted program, evaluation report, RDF/JSON-LD export).


What this project ships

Artefact Path Purpose
LinkML schema src/dspy/schema/dspy.yaml Canonical model: classes, enums, custom types, subsets
SSSOM cross-walks src/dspy/mappings/ 12 mapping files, 170 mappings to external vocabularies (AIRO/Nexus, MCP, Mellea, ISO 22989, ISO 42001, EU AI Act, NIST AI 100-1, UCO, UCS, AI/DPV)
Generated Python data model src/dspy/datamodel/dspy.py gen-python output (256 dataclasses)
Schema generator scripts/build_schema.py Authors dspy.yaml from the Python source of truth
Mapping generator scripts/build_sssom_mappings.py Authors the SSSOM/TSV cross-walks
Overlay tool scripts/apply_sssom_overlay.py Merges mappings back into the schema (idempotent, layout-preserving)
Validation tests tests/test_data.py 24 fixtures covering valid/invalid instances
Upstream notes ISSUE.md Documented LinkML metamodel / generator quirks

Build pipeline

just build-schema   # gen-linkml -> gen-mappings -> apply-mappings
just gen-project    # build-schema + downstream LinkML generators (Pydantic, Java, TypeScript, OWL, ...)
just test           # LinkML-validator round-trip tests

Why LinkML

DSPy ships a rich, evolving runtime contract: typed LMRequest / LMResponse, heterogeneous LMPart content (text, image, audio, tool call, citation, reasoning, refusal), Signature field metadata, Tool schemas, optimiser state, evaluation rows, streaming events. Today this contract is encoded in Pydantic classes scattered across dspy/core, dspy/adapters, dspy/clients, and dspy/predict. The Pydantic classes are excellent runtime guard-rails but they are not a publishable schema:

  • They do not survive a language boundary (no first-class TS/Java/Rust types).
  • They have no stable URI for cross-vocabulary alignment (RDF, OWL, JSON-LD).
  • They do not version cleanly between releases of dspy.Module.dump_state().
  • They cannot be queried, diffed, or rendered as ontology docs.

LinkML solves all of these problems at once: one YAML schema generates Pydantic, dataclasses, JSON Schema, JSON-LD context, SHACL shapes, OWL, GraphQL, Java, TypeScript, Markdown docs, and Mermaid ER diagrams, and provides a runtime validator (linkml.validator.validate_file) that DSPy can call on any persisted artefact.

The schema in this repository is intentionally a mirror of the runtime Pydantic types, not a replacement, so it can be regenerated from Python and overlaid with curated SSSOM mappings without disturbing DSPy itself.


Engineering notes

Single source of truth in Python

scripts/build_schema.py is a generator. It enumerates DSPy's classes, slots, and enums, and emits canonical YAML using two small helpers (_emit_scalar, _emit_section) that preserve the formatting that gen-doc and human readers expect (blank lines between entries, double-blank between top-level blocks, wrapped long strings). The same helpers are reused by apply_sssom_overlay.py so that overlaying mappings never reverts formatting improvements.

SSSOM as cross-walk format

External alignments live in src/dspy/mappings/ as SSSOM/TSV files, one per target vocabulary. SSSOM gives us provenance (mapping_justification, mapping_date, creator_id), a controlled predicate vocabulary (skos:exactMatch, skos:closeMatch, skos:relatedMatch, skos:broadMatch), and machine-checkable round-tripping. The 170 mappings cover 74 schema elements.

Documented LinkML quirks

Two schema generation issues raised upstream:

  1. equals_string / equals_string_in reject enum-ranged slots. Worked around by declaring discriminators as range: string with designates_type: true, keeping the enum as documentation.
  2. Former slot_usage warning (adapter_kind) is now fixed locally. We promoted adapter_kind to a real top-level slot and made Adapter use it directly, which removes the generator error.
  3. Former subset/slot overlap warning (tools) is now fixed locally. We renamed the subset to tooling while retaining the tools slot, eliminating the name collision warning.

How a LinkML schema could enhance DSPy itself

This section is forward-looking. It catalogues integration points where the schema in this repo could move from mirror to engine inside DSPy. Each item is framed from the point of view of a real DSPy user - a researcher, an SRE, a frontend developer, a compliance officer, a contributor - and only then names the code path that would change.

1. Compiled programs that survive an upgrade

A practitioner who spent compute optimising a DSPy program today has no guarantee that Module.load() will reproduce the exact same predictor tomorrow: dump_state() is an untyped dict, and the only version metadata recorded is metadata.dependency_versions (Python, DSPy, cloudpickle). When the team upgrades DSPy or swaps an LM provider, the program can deserialise into something subtly different - or fail with a KeyError on a renamed slot. Today the relevant code lives in BaseModule.dump_state, Predict.dump_state, Signature.dump_state, and BaseLM.dump_state; each emits its own ad-hoc shape, and Predict.serialize_object() flattens custom types to plain dicts. Promoting these shapes to a LinkML Module / Predictor / Signature model gives dspy.Module.load(...) a one-line validator (linkml.validator.validate_file), a versioned schema URI to record alongside the dependency table, and an upgrade path through linkml-transformer migrations so a 2024 program can be opened in 2026 with explicit, reviewable rewrites.

2. Non-Python teams who want to consume DSPy outputs

A TypeScript frontend or a Rust pipeline that wants to read a Prediction, render a Tool call, or replay a trace currently has to reverse-engineer DSPy's Pydantic classes by reading Python source. There is no published machine-readable contract for LMRequest, LMResponse, Tool, Signature, or the LMPart hierarchy, even though every one of these crosses a language boundary in practice (browser, mobile, MCP client). LinkML treats this as a single command: the schema in src/dspy/schema/dspy.yaml already produces TypeScript via gen-typescript, Java via gen-java, JSON Schema via gen-json-schema, and OWL/JSON-LD via gen-owl - all run by just gen-project and emitted under project/. A future DSPy.js or DSPy.rs reuses those types instead of hand-writing them.

3. Operators who need structured traces, not coloured text

An SRE wiring DSPy into OpenTelemetry, Datadog, or a homegrown trace store today has two options: scrape the coloured output of inspect_history, or register a BaseCallback whose hooks receive inputs: dict[str, Any] and outputs: dict[str, Any] with no documented shape. Trace data goes into observability pipelines as free-form JSON, and downstream consumers re-implement the same field names every project. With this schema, every LM call is already an LMHistoryEntry wrapping an LMRequest, LMResponse, LMUsage, and a list of LMMessage / LMPart; emitting it as JSON-LD with the class URIs in the dspy: namespace plus the prov:Entity mapping on NamedThing gives operators a stable, queryable trace format that SPARQL, OTLP exporters, and warehouse loaders all understand without custom parsers.

4. Tool authors who only want to write the function once

A DSPy developer who declares a dspy.Tool today has its schema assembled three different ways for three different consumers: Tool.parameters builds an object schema from per-arg Pydantic model_json_schema() calls and overrides __get_pydantic_json_schema__(); tool_to_openai reshapes LMToolSpec into OpenAI's envelope; and convert_mcp_tool goes the other way for MCP servers. Each path can drift independently, and a renamed parameter only shows up when a provider call fails. Driving all three paths from the LinkML Tool / LMToolSpec classes (already mapped to mcp:Tool and nexus:Capability) collapses tool-schema production to one gen-json-schema step, with stable URIs so the same definition fans out to OpenAI, Anthropic, LiteLLM, MCP, and JSON-Schema consumers from a single source.

5. Researchers and JSONAdapter users who want predictable outputs

When a JSONAdapter user asks an OpenAI model for structured output, _get_structured_outputs_response_format manufactures a one-off Pydantic model from the Signature for that specific call. The resulting JSON Schema is therefore opaque, not cacheable, and not visible to TypeScript clients trying to validate the same payload on the wire. A LinkML-backed Signature makes the same JSON Schema a deterministic, cacheable artefact produced by gen-json-schema on the slot tree - identical between Python, the provider, and the browser - which removes the most common class of "the model returned a slightly wrong shape" bug.

6. Researchers comparing adapter formats

A researcher choosing between ChatAdapter, JSONAdapter, XMLAdapter, BAMLAdapter, and TwoStepAdapter today has only docstrings and example notebooks. Each adapter implements its own format_user_message_content() and parsing path (see chat_adapter.py, json_adapter.py, xml_adapter.py, baml_adapter.py, two_step_adapter.py), which makes it hard to see whether a given signature is supported by all of them, and easy for the implementations to drift. The AdapterKind enum and the Adapter / subclass tree in this schema are the hook to publish each format as a machine-readable spec (grammar + parse rules), so a researcher can ask "does my signature work in XML?" and get a yes/no from a validator instead of from a failed run.

7. Cache hits that survive a field rename

A team with a long-running cache today is one PR away from silent invalidation: Cache.cache_key hashes the result of _transform_value, which calls model_dump() / model_json_schema() on a dict whose shape moves whenever a request field is added or renamed. A cached response from last quarter may no longer hit, or worse, may hit for the wrong reason. Defining the cache key as a typed projection of the LinkML LMRequest (the canonical JSON-LD frame of a declared subset - everything except metadata, provider_data, extensions) means a schema diff becomes the audit log of cache compatibility, and DSPy can promote the projection itself to a public, versioned slot in the schema.

8. SREs who want unified error handling

When a DSPy program fails in production today, the operator gets a LMRateLimitError (or one of its siblings) carrying machine-relevant fields - code, http_status, retryable, model, provider - from dspy/utils/exceptions.py. Each provider populates them differently, and there is no published enum of code values, so dashboards group errors by class name. Using those same fields, observability tools can normalise errors across providers without an SDK by aligning to the matching LinkML hierarchy (DSPyError, LMTransportError, LMRateLimitError, …, mapped to mcp:Error / mcp:ParseError / mcp:InvalidRequestError), which also gives an SRE a stable retry-policy slot and a fixed vocabulary of error codes to alert on.

9. Frontend developers consuming DSPy streams

A web developer wiring a DSPy stream into a browser today reads dspy/streaming/messages.py to discover that StreamResponse(predict_name, signature_field_name, chunk, is_last_chunk) and StatusMessage(message) are the on-wire shapes, and then hand-rolls TypeScript types to match. The StreamEvent hierarchy plus StreamEventKind enum in this schema already covers both events; running gen-typescript (already part of just gen-project) produces the corresponding .ts types and an AsyncAPI-friendly description, so a browser client can subscribe to the same stream the Python process emits without a parallel type declaration.

10. Compliance teams asking "what model produced this?"

A governance reviewer who has to answer EU AI Act, ISO 42001, or NIST AI 100-1 questions today has to follow Python references from a Prediction back through Settings.lm, Provider, and TrainingJob / utils_finetune.py to piece together model identity, provider, fine-tuning lineage, and dataset provenance - none of which are emitted as machine-readable metadata by BaseLM.dump_state or Dataset. Persisting these entities against the LinkML classes (already cross-walked in src/dspy/mappings/ to nexus:AiModel, nexus:AiProvider, ai:ModelTraining, nexus:Dataset, legal_eu_aiact:AIProvider, iso42001:AISystemLifecycleStage, nist_ai_100_1:AiActor) gives compliance tooling a single graph it can query for "every Prediction served by this model on this dataset under this fine-tuning job", without DSPy itself growing a governance subsystem.

11. MCP server authors who want bidirectional safety

An MCP server author today wires DSPy in one direction: convert_mcp_tool reads an MCP tool.inputSchema and produces a Python dspy.Tool via convert_input_schema_to_tool_args. The reverse direction (publish a DSPy Tool as an MCP tool) does not exist as a first-class path, and there is no validator that catches when an upstream MCP tool renames a parameter. With a LinkML LMToolSpec (mapped to mcp:Tool) the conversion becomes schema-to-schema in both directions: gen-json-schema produces the MCP inputSchema, and a linkml.validator check on the round-trip guarantees parameter drift is caught at registration time, not at the first failed call.

12. Researchers who want their signature constraints documented once

A dspy.Signature author who writes Pydantic constraints (min_length, pattern, etc.) on an InputField ends up re-explaining them three times: the constraints flow into the provider via __get_pydantic_json_schema__, into the prompt via PYDANTIC_CONSTRAINT_MAP which hand-rolls human-readable text, and into the model in signature.py which loses them on serialisation. LinkML covers the same surface with pattern, minimum_value, maximum_value, required, and equals_string slots that render to docs (gen-doc), validators (linkml.validator), and provider JSON-Schema (gen-json-schema) in a single declaration - so the constraint a researcher writes is also the constraint a reviewer reads and a provider enforces.

13. Researchers publishing reproducible evaluations

A paper author today reports a number from Evaluate but cannot publish a self-describing artefact that another team can re-load: EvaluationResult carries score and a list of (example, prediction, score) rows, but not the dataset version, metric identifier, LM config, seed, or adapter that produced them. The matching EvaluationResult / EvaluationRow / Evaluate classes in this schema map to nexus:AiEval, nexus:AiEvalResult, and nexus:EvaluationResultRecord, so writing the result out as schema-validated JSON-LD gives benchmark hubs, audit logs, and collaborators a manifest they can verify and rerun without a separate companion document.

14. Users who persist Predictions with custom types

A user who saves a Prediction containing a dspy.Image, dspy.Audio, dspy.Code, or dspy.Document today loses the type tag on the way out: Predict.serialize_object calls model_dump(mode="json") and produces a plain dict, while the custom-type modules (image.py, audio.py, code.py, document.py, history.py, citation.py, reasoning.py) each have their own field set. Reloading next week dispatches on shape, not on identity. With this schema each custom type already has a class URI (dspy:Image, dspy:Audio, …) and inherits from Type / NamedThing, so JSON-LD output carries an explicit @type and polymorphic deserialisation becomes a one-line dispatch instead of heuristic sniffing.

15. Researchers citing the dataset they actually used

A practitioner citing HotPotQA, MATH, GSM8K, Colors, or AlfWorld today has to track the exact version, split, and seed out-of-band - the Dataset base class stores sizes and seeds but no URI, checksum, or commit hash, and DataLoader.from_huggingface does not record the HF dataset id or revision. The LinkML Dataset class is already mapped to nexus:Dataset and ucs_core:InformationContentEntity; populating its slots and emitting JSON-LD on Dataset.save() gives papers a stable URI plus a checksum a reviewer can verify, without forcing DSPy to ship a citation manager.

16. Reproducing a paper's dspy.settings

A reader trying to reproduce a result needs more than the code: the runtime Settings singleton holds the active lm, adapter, rm, callbacks, threading limits, history flags, and tracing toggles, and there is no Settings.to_yaml() to share. The LinkML Settings class (tree_root: true) already enumerates every public knob; serialising it on program start - and reading it back via the schema validator - turns the unspecified "we used DSPy 2.x with default settings" of today's papers into a one-file settings.yaml that any reader can load before rerunning the experiment.


Status

  • Schema: complete for the public DSPy surface as of the version in src/dspy/schema/dspy.yaml.
  • Mappings: 170 mappings across 12 external vocabularies.
  • Tests: 24/24 passing.
  • Generator pipeline: just build-schema and just gen-project both green.
  • Upstream integration: this repository does not yet hook into DSPy at runtime; the integration ideas above are a roadmap, not a promise.