to_json_schema
deep_dataclasses.to_json_schema(cls, strict=False, allow_additional_properties=False)
Generate a JSON Schema object for a dataclass.

Recursively converts the field types of cls to their JSON Schema equivalents. The resulting schema can be used directly with any JSON Schema validator (e.g. jsonschema.validate).

The following Python types are supported:

- Primitives: bool, int, float, str, None / type(None)
- Collections: List[T], Tuple[T, ...], Tuple[T1, T2, ...], Set[T], FrozenSet[T], Dict[K, V]
- Composites: Optional[T], Union[A, B, ...], Literal[...]
- Nested dataclasses (recursed automatically)
- typing.Any (no constraint)
- Parameters:
  cls (type) – A @dataclass or @deep_dataclass class.
  strict (bool, default False) –
    When False (default), only fields without a default value and not typed as Optional are listed under "required". This allows partial dicts to validate successfully as long as omitted fields have defaults.
    When True, every field, even those with defaults, is added to "required" unless explicitly typed as Optional. Use this when you want to enforce that all fields are always present.
  allow_additional_properties (bool, default False) –
    When False (default), "additionalProperties": false is added to the schema, rejecting any keys not declared as fields. This is the right choice for closed schemas such as Union variant discrimination.
    When True, extra keys are silently accepted. Useful when the dataclass represents a partial view of a larger document.
- Returns:
  A JSON Schema object with the following keys:
    "type" – Always "object".
    "properties" – A dict mapping each field name to its type schema, including a "default" key when the field has a default value or factory.
    "required" – List of field names that must be present (only included when at least one field is required).
    "additionalProperties" – False unless allow_additional_properties is True.
- Return type:
  dict
Notes

The validate-then-construct pattern works end-to-end with @deep_dataclass because construction coerces nested dicts to their declared types. With plain @dataclass, validation succeeds but nested dicts are not coerced, so the constructed object may contain raw dicts in place of typed fields.

Examples

Basic usage (validate a raw dict before constructing):

>>> import jsonschema
>>> from dataclasses import dataclass
>>> from deep_dataclasses import deep_dataclass, to_json_schema
>>>
>>> @deep_dataclass
... class Config:
...     class Optimizer:
...         lr: float = 1e-3
...         momentum: float = 0.9
...     epochs: int = 100
>>>
>>> schema = to_json_schema(Config)
>>> jsonschema.validate({"Optimizer": {"lr": 0.01}, "epochs": 50}, schema)
>>> cfg = Config(**{"Optimizer": {"lr": 0.01}, "epochs": 50})
>>> cfg.Optimizer.lr
0.01

strict=True requires every field to be present, even those with defaults:

>>> strict = to_json_schema(Config, strict=True)
>>> jsonschema.validate({"epochs": 50}, strict)  # raises: Optimizer missing
Traceback (most recent call last):
...
jsonschema.exceptions.ValidationError: 'Optimizer' is a required property

allow_additional_properties=True accepts extra keys:

>>> open_schema = to_json_schema(Config, allow_additional_properties=True)
>>> jsonschema.validate({"epochs": 10, "unknown_key": 42}, open_schema)

Literal fields are enforced via "enum":

>>> from typing import Literal
>>>
>>> @dataclass
... class Run:
...     device: Literal["cpu", "cuda"] = "cpu"
>>>
>>> jsonschema.validate({"device": "tpu"}, to_json_schema(Run))
Traceback (most recent call last):
...
jsonschema.exceptions.ValidationError: 'tpu' is not valid under any of the given schemas
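The Literal-to-"enum" step can be illustrated with the standard typing helpers. This is a minimal sketch that uses only the standard library; literal_enum_schema is a hypothetical name introduced here and is not part of deep_dataclasses, whose actual implementation may differ.

```python
from dataclasses import dataclass
from typing import Literal, get_args, get_origin, get_type_hints


def literal_enum_schema(tp):
    """Return {"enum": [...]} for a Literal type, or None for anything else.

    Hypothetical helper for illustration; not part of deep_dataclasses.
    """
    if get_origin(tp) is Literal:
        return {"enum": list(get_args(tp))}
    return None


@dataclass
class Run:
    device: Literal["cpu", "cuda"] = "cpu"


# Resolve the annotation of the "device" field and convert it.
hints = get_type_hints(Run)
device_schema = literal_enum_schema(hints["device"])
print(device_schema)                   # {'enum': ['cpu', 'cuda']}
print("tpu" in device_schema["enum"])  # False
```

A validator that sees an "enum" keyword accepts a value only if it equals one of the listed members, which is why "tpu" is rejected in the example above.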
Overview
to_json_schema converts any @dataclass or @deep_dataclass class into a
JSON Schema object descriptor. The resulting dict
can be passed directly to any JSON Schema validator.
to_json_schema works with any standard dataclass — not only classes produced
by @deep_dataclass. If you already have plain @dataclass classes, you can
export and validate their schemas without any changes.
Basic usage
import jsonschema
from deep_dataclasses import deep_dataclass, to_json_schema

@deep_dataclass
class Config:
    class Optimizer:
        lr: float = 1e-3
        momentum: float = 0.9
    epochs: int = 100

schema = to_json_schema(Config)
# {'type': 'object',
#  'properties': {
#      'Optimizer': {'type': 'object', 'properties': {...}, ...},
#      'epochs': {'type': 'integer', 'default': 100}
#  },
#  'additionalProperties': False}

jsonschema.validate({"epochs": 50}, schema)       # passes: Optimizer has a default
jsonschema.validate({"epochs": "fifty"}, schema)  # raises: wrong type
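Supported types
| Python type | JSON Schema |
|---|---|
| bool | boolean |
| int | integer |
| float | number |
| str | string |
| None / type(None) | null |
| List[T] | |
| Tuple[T, ...] | |
| Tuple[T1, T2, ...] | fixed-length array with … |
| Set[T] | |
| FrozenSet[T] | |
| Dict[K, V] | |
| Optional[T] | anyOf: [<T schema>, null] |
| Union[A, B, ...] | anyOf over the variant schemas |
| Literal[...] | enum |
| typing.Any | no constraint |
| Nested dataclass | Recursed: full object schema inline |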
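The recursive conversion described by the table can be pictured as a small function that dispatches on the annotation. The following is a simplified sketch under stated assumptions, not the library's implementation: sketch_type_schema is a hypothetical name, only a few types are handled, and the encoding of List[T] as an "array" with "items" is an assumption based on common JSON Schema practice rather than something this page confirms.

```python
import dataclasses
from dataclasses import dataclass
from typing import Any, List, get_args, get_origin, get_type_hints

# Primitive mapping. Lookups are by exact type, so bool never
# falls through to "integer" even though bool subclasses int.
_PRIMITIVES = {
    bool: {"type": "boolean"},
    int: {"type": "integer"},
    float: {"type": "number"},
    str: {"type": "string"},
    type(None): {"type": "null"},
}


def sketch_type_schema(tp):
    """Hypothetical, simplified converter from an annotation to a schema."""
    if tp is Any:
        return {}  # no constraint
    if tp in _PRIMITIVES:
        return dict(_PRIMITIVES[tp])
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        # Assumed encoding for List[T]; not confirmed by this page.
        return {"type": "array", "items": sketch_type_schema(item_type)}
    if dataclasses.is_dataclass(tp):
        # Nested dataclass: recurse and inline the full object schema.
        hints = get_type_hints(tp)
        return {
            "type": "object",
            "properties": {name: sketch_type_schema(t) for name, t in hints.items()},
        }
    raise TypeError(f"unsupported type in this sketch: {tp!r}")


@dataclass
class Point:
    x: float
    y: float


@dataclass
class Path:
    name: str
    points: List[Point]


path_schema = sketch_type_schema(Path)
print(path_schema["properties"]["points"]["items"]["type"])  # object
```

The sketch omits defaults, "required", and "additionalProperties" so that the type dispatch stays visible on its own.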
strict mode
By default (strict=False), only fields that have no default value and are not
Optional are listed under "required". This means a partial dict validates
successfully as long as the missing fields have defaults:
schema = to_json_schema(Config) # strict=False
jsonschema.validate({"epochs": 50}, schema) # passes — Optimizer has a default
With strict=True, every field is required unless it is Optional[T]:
strict = to_json_schema(Config, strict=True)
jsonschema.validate({"epochs": 50}, strict)
# ValidationError: 'Optimizer' is a required property
Use strict=True when callers must always provide every field explicitly, for
example at an API boundary or in a config file validator.
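The rule for populating "required" can be expressed with dataclasses.fields and the MISSING sentinel. This is a sketch of the rule as stated on this page, using only the standard library; sketch_required and is_optional are hypothetical names and do not reflect how deep_dataclasses is written internally.

```python
from dataclasses import MISSING, dataclass, field, fields
from typing import List, Optional, Union, get_args, get_origin, get_type_hints


def is_optional(tp):
    """True for Optional[T], i.e. a Union that includes NoneType."""
    return get_origin(tp) is Union and type(None) in get_args(tp)


def sketch_required(cls, strict=False):
    """Hypothetical helper: field names that would go under "required"."""
    hints = get_type_hints(cls)
    required = []
    for f in fields(cls):
        if is_optional(hints[f.name]):
            continue  # Optional fields are never required
        has_default = f.default is not MISSING or f.default_factory is not MISSING
        # Non-strict: only fields without a default. Strict: every field.
        if strict or not has_default:
            required.append(f.name)
    return required


@dataclass
class Job:
    name: str
    retries: int = 3
    tags: List[str] = field(default_factory=list)
    note: Optional[str] = None


print(sketch_required(Job))               # ['name']
print(sketch_required(Job, strict=True))  # ['name', 'retries', 'tags']
```

Note that the Optional field stays out of "required" in both modes, matching the description above.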
allow_additional_properties
By default "additionalProperties": false is added to the schema, so any key not
declared as a field is rejected. Pass allow_additional_properties=True to
accept extra keys silently — useful when the dataclass represents a partial view
of a larger document:
open_schema = to_json_schema(Config, allow_additional_properties=True)
jsonschema.validate({"epochs": 10, "experiment_id": "run-42"}, open_schema)
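To make the effect of "additionalProperties" concrete, the check a validator performs at the top level can be approximated in a few lines. This is an illustrative sketch using plain dicts; unexpected_keys is a hypothetical helper, and it ignores nested objects and keywords such as "patternProperties" that a real validator would also consider.

```python
def unexpected_keys(data, schema):
    """Hypothetical helper: top-level keys that would be rejected
    when "additionalProperties" is False."""
    # When the keyword is absent, JSON Schema allows extra keys.
    if schema.get("additionalProperties", True):
        return []
    declared = schema.get("properties", {})
    return sorted(key for key in data if key not in declared)


closed_schema = {
    "type": "object",
    "properties": {"epochs": {"type": "integer", "default": 100}},
    "additionalProperties": False,
}
# Same schema, but open to undeclared keys.
open_schema = dict(closed_schema, additionalProperties=True)

payload = {"epochs": 10, "experiment_id": "run-42"}
print(unexpected_keys(payload, closed_schema))  # ['experiment_id']
print(unexpected_keys(payload, open_schema))    # []
```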
Validate-then-construct: a poor man’s Pydantic
The combination of to_json_schema and @deep_dataclass’s dict coercion gives
you a lightweight validate-then-construct pattern that covers most use cases where
you might otherwise reach for Pydantic:
import json
import jsonschema
from dataclasses import field
from typing import List, Literal

from deep_dataclasses import deep_dataclass, to_json_schema

@deep_dataclass
class TrainingRun:
    class Optimizer:
        lr: float = 1e-3
        momentum: float = 0.9
    epochs: int = 100
    device: Literal["cpu", "cuda"] = "cpu"
    tags: List[str] = field(default_factory=list)

schema = to_json_schema(TrainingRun)

# --- load and validate raw data (e.g. from a config file) ---
with open("run.json") as f:  # the context manager closes the file
    raw = json.load(f)
jsonschema.validate(raw, schema)  # raises on bad data, before any construction

# --- construct fully-typed object ---
run = TrainingRun(**raw)
isinstance(run.Optimizer, TrainingRun.Optimizer)  # True: dicts coerced recursively
Compared to Pydantic:
| | @deep_dataclass + to_json_schema | Pydantic |
|---|---|---|
| Validation | Before construction (jsonschema) | At construction |
| Type coercion | Dataclass-typed fields only | All fields |
| Schema export | to_json_schema | |
| Dependencies | | |
| stdlib compatibility | Full | Separate |
| Nested dict coercion | ✅ | ✅ |
| Field validators | ❌ (use … | ✅ |
The key trade-off: @deep_dataclass stays in the stdlib dataclass world — you get
full compatibility with everything that consumes dataclasses, at the cost of no
per-field validators.
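Within the stdlib dataclass world, one common way to add a value check is __post_init__. The sketch below uses a plain @dataclass only; whether and how @deep_dataclass interacts with __post_init__ is not stated on this page, so treat that combination as an assumption to verify.

```python
from dataclasses import dataclass


@dataclass
class Optimizer:
    lr: float = 1e-3
    momentum: float = 0.9

    def __post_init__(self):
        # Value-range checks run after the generated __init__ assigns fields.
        if self.lr <= 0:
            raise ValueError(f"lr must be positive, got {self.lr}")
        if not 0.0 <= self.momentum < 1.0:
            raise ValueError(f"momentum must be in [0, 1), got {self.momentum}")


ok = Optimizer(lr=0.01)  # passes both checks

try:
    Optimizer(lr=-1.0)
except ValueError as exc:
    print(exc)  # lr must be positive, got -1.0
```

The schema check and the __post_init__ check are complementary: the first rejects wrongly shaped input before construction, the second rejects out-of-range values during construction.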
Optional and Union schemas
Optional[T] becomes anyOf: [<T schema>, null]:
from typing import Optional
from dataclasses import dataclass

@dataclass
class Record:
    name: str
    score: Optional[float] = None

to_json_schema(Record)
# {'type': 'object',
#  'properties': {
#      'name': {'type': 'string'},
#      'score': {'anyOf': [{'type': 'number'}, {'type': 'null'}], 'default': None}
#  },
#  'required': ['name'],
#  'additionalProperties': False}
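The anyOf shape follows from how the typing module represents Optional[T]: it is Union[T, None], so each member of the union contributes one entry. A minimal sketch, restricted to primitive members and using the hypothetical helper name sketch_union_schema (not a deep_dataclasses function):

```python
from typing import Optional, Union, get_args, get_origin

# Primitive names as they appear in the schema output shown above.
_PRIMITIVE_NAMES = {
    int: "integer",
    float: "number",
    str: "string",
    type(None): "null",
}


def sketch_union_schema(tp):
    """Hypothetical helper: Optional[T] / Union[...] of primitives -> anyOf."""
    if get_origin(tp) is Union:
        # One sub-schema per union member, in declaration order.
        return {"anyOf": [sketch_union_schema(arg) for arg in get_args(tp)]}
    return {"type": _PRIMITIVE_NAMES[tp]}


optional_float = sketch_union_schema(Optional[float])
print(optional_float)
# {'anyOf': [{'type': 'number'}, {'type': 'null'}]}

int_or_str = sketch_union_schema(Union[int, str])
print(int_or_str)
# {'anyOf': [{'type': 'integer'}, {'type': 'string'}]}
```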
Union[A, B] where both variants are dataclasses becomes anyOf with their full
inline schemas, which enables union best-match coercion
at construction time:
from typing import Union
from dataclasses import field
from deep_dataclasses import deep_dataclass, auxiliary, to_json_schema

@deep_dataclass
class Config:
    @auxiliary
    class TrainMode:
        lr: float = 1e-3

    @auxiliary
    class TestMode:
        metric: str = "accuracy"

    mode: Union[TrainMode, TestMode] = field(default_factory=TrainMode)

schema = to_json_schema(Config)
# mode schema: {'anyOf': [{'type': 'object', ...TrainMode...},
#                         {'type': 'object', ...TestMode...}]}
Using with plain @dataclass
to_json_schema works on any standard dataclass. The only limitation is that
construction will not coerce nested dicts — that feature requires
@deep_dataclass. Validation still works correctly:
from dataclasses import dataclass
from deep_dataclasses import to_json_schema
import jsonschema

@dataclass
class Point:
    x: float
    y: float

schema = to_json_schema(Point)
jsonschema.validate({"x": 1.0, "y": 2.0}, schema)    # passes
jsonschema.validate({"x": "bad", "y": 2.0}, schema)  # raises
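The coercion limitation is easy to observe with the standard library alone. In this sketch, Segment is a hypothetical class added for illustration; it shows that a plain @dataclass stores a nested dict unchanged, so the conversion has to be done by hand.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float


@dataclass
class Segment:
    start: Point
    end: Point


raw = {"start": {"x": 0.0, "y": 0.0}, "end": {"x": 1.0, "y": 2.0}}

# Plain dataclasses do not inspect annotations at construction time.
uncoerced = Segment(**raw)
print(type(uncoerced.start))  # <class 'dict'>

# Manual conversion of the nested dicts restores the declared types.
coerced = Segment(start=Point(**raw["start"]), end=Point(**raw["end"]))
print(coerced.end.y)  # 2.0
```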
Integration with validate_defaults
The validate_defaults decorator from
deep_dataclasses.extras uses to_json_schema internally to check that a
class’s own default instance satisfies its schema. This is useful as a
module-load-time sanity check on configuration classes:
from typing import Literal

from deep_dataclasses import deep_dataclass
from deep_dataclasses.extras import validate_defaults

@validate_defaults
@deep_dataclass
class Config:
    device: Literal["cpu", "cuda"] = "cpu"  # checked at import time
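The idea of a definition-time check can be sketched with a small class decorator. This is not how validate_defaults is implemented (the page only says it uses to_json_schema internally); check_literal_defaults is a hypothetical, much narrower stand-in that inspects Literal fields only and relies on the standard library.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Literal, get_args, get_origin, get_type_hints


def check_literal_defaults(cls):
    """Hypothetical decorator: fail at class-definition time if a Literal
    field's default is not one of its allowed values."""
    hints = get_type_hints(cls)
    for f in fields(cls):
        tp = hints[f.name]
        if get_origin(tp) is Literal and f.default is not MISSING:
            if f.default not in get_args(tp):
                raise ValueError(
                    f"{cls.__name__}.{f.name}: default {f.default!r} "
                    f"not in {get_args(tp)!r}"
                )
    return cls


@check_literal_defaults
@dataclass
class Config:
    device: Literal["cpu", "cuda"] = "cpu"  # accepted


try:
    @check_literal_defaults
    @dataclass
    class BadConfig:
        device: Literal["cpu", "cuda"] = "tpu"  # rejected when the class is defined
except ValueError as exc:
    print(exc)  # BadConfig.device: default 'tpu' not in ('cpu', 'cuda')
```

Because the decorator runs when the class statement executes, a bad default surfaces as soon as the module is imported rather than when the first instance is created.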