Inference Documentation Maintenance Guide#
For developers: How to keep documentation in sync with code
Overview#
The JESTER inference documentation uses a hybrid approach:
Auto-generated: YAML configuration reference (from Pydantic schemas)
Manual: Narrative documentation, guides, examples
Generating the reference from the schemas keeps it accurate, while hand-written guides keep the documentation readable.
Auto-Generated Documentation#
YAML Configuration Reference#
File: docs/yaml_reference.md
Source: jesterTOV/inference/config/schema.py (Pydantic models)
Generator: jesterTOV/inference/config/generate_yaml_reference.py
When to Regenerate#
Regenerate whenever you modify:
config/schema.py - Any changes to Pydantic models:
- Field names, types, or defaults
- Validation rules
- Documentation strings in Field(...)
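To illustrate why each of these changes affects the reference, here is a minimal sketch of how a generator can read field names, types, defaults, and descriptions from a schema. It uses the standard library's `dataclasses` as a stand-in for Pydantic so that it runs without dependencies. `SamplerSketch`, its fields, and `render_reference` are hypothetical and are not the actual JESTER generator.

```python
from dataclasses import dataclass, field, fields


@dataclass
class SamplerSketch:
    """Hypothetical schema; stands in for a Pydantic model."""

    n_chains: int = field(default=20, metadata={"description": "Number of chains"})
    output_dir: str = field(default="./outdir/", metadata={"description": "Output directory"})


def render_reference(schema: type) -> str:
    """Render one markdown bullet per field: name, type, default, description."""
    lines = []
    for f in fields(schema):
        # f.type is a class here, but a string under postponed annotations.
        type_name = f.type if isinstance(f.type, str) else f.type.__name__
        description = f.metadata.get("description", "")
        lines.append(f"- `{f.name}` ({type_name}, default `{f.default!r}`): {description}")
    return "\n".join(lines)


print(render_reference(SamplerSketch))
```

Renaming a field, changing its default, or editing its description changes the rendered text, which is why the committed reference goes stale until it is regenerated.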
How to Regenerate#
# From repository root
uv run python -m jesterTOV.inference.config.generate_yaml_reference
# This creates/updates:
# docs/yaml_reference.md
Automated Reminder#
The config/schema.py file has a docstring reminder:
"""
IMPORTANT: When you modify these schemas, regenerate the YAML reference:
uv run python -m jesterTOV.inference.config.generate_yaml_reference
"""
Pre-Commit Hook (Recommended)#
Add to .pre-commit-config.yaml:
- repo: local
  hooks:
    - id: regenerate-yaml-reference
      name: Regenerate YAML reference
      entry: uv run python -m jesterTOV.inference.config.generate_yaml_reference
      language: system
      files: ^jesterTOV/inference/config/(schema\.py|schemas/.*\.py)$
      pass_filenames: false
This automatically regenerates the reference when schema.py or any file under schemas/ changes.
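The `files` entry is a Python regular expression that pre-commit searches for in each staged path. The following sketch checks which paths would trigger the hook. The pattern is copied from the hook above, while `triggers_hook` and the candidate paths are only illustrative.

```python
import re

# Same pattern as the hook's `files` entry.
HOOK_PATTERN = re.compile(r"^jesterTOV/inference/config/(schema\.py|schemas/.*\.py)$")


def triggers_hook(path: str) -> bool:
    """Return True if a staged file path would trigger the hook."""
    return HOOK_PATTERN.search(path) is not None


candidates = [
    "jesterTOV/inference/config/schema.py",
    "jesterTOV/inference/config/schemas/eos.py",
    "jesterTOV/inference/config/generate_yaml_reference.py",
    "docs/yaml_reference.md",
]
for path in candidates:
    print(f"{path}: {triggers_hook(path)}")
```

Note that a change to the generator script itself does not match the pattern, so the reference has to be regenerated by hand in that case.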
Manual Documentation#
Files to Update Manually#
| File | Update When | Contents |
|---|---|---|
| | User workflow changes | Quick start guide, examples |
| | Architecture changes | Module structure, data flow |
| | Module structure changes | Module overview |
Update Checklist#
When adding a new likelihood type:
[ ] Add to likelihoods/my_likelihood.py
[ ] Update likelihoods/factory.py
[ ] Update config/schema.py (add to LikelihoodConfig.type Literal)
[ ] Regenerate YAML reference (auto)
[ ] Add example to quickstart.md (manual)
[ ] Update yaml_reference.md "Likelihood-Specific Parameters" (manual - generator doesn't know about parameters dict contents)
When adding a new prior type:
[ ] Add to priors/simple_priors.py
[ ] Update priors/parser.py namespace
[ ] Add example to quickstart.md (manual)
When adding a new EOS type:
[ ] Add config class to config/schemas/eos.py (extend BaseEOSConfig) and add to EOSConfig union
[ ] Register in transforms/transform.py (_create_eos with isinstance check)
[ ] Regenerate YAML reference (auto)
[ ] Add example configuration (manual)
When adding a new TOV solver:
[ ] Add config class to config/schemas/tov.py (extend BaseTOVConfig) and switch TOVConfig to a discriminated union
[ ] Register in transforms/transform.py (_create_tov_solver with isinstance check)
[ ] Regenerate YAML reference (auto)
[ ] Add example configuration (manual)
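The EOS and TOV checklists both register the new config class through an `isinstance` check. A minimal sketch of that pattern follows, using plain dataclasses in place of the Pydantic config classes. Every class name, field, and return value is hypothetical, and the real `_create_eos` in transforms/transform.py may be organized differently.

```python
from dataclasses import dataclass


@dataclass
class BaseEOSConfigSketch:
    """Hypothetical stand-in for BaseEOSConfig."""


@dataclass
class MetamodelSketch(BaseEOSConfigSketch):
    ndat_metamodel: int = 100


@dataclass
class MyNewEOSSketch(BaseEOSConfigSketch):
    n_points: int = 50


def create_eos_sketch(config: BaseEOSConfigSketch) -> str:
    """Dispatch on the config class; each new EOS type needs its own branch."""
    if isinstance(config, MetamodelSketch):
        return f"metamodel EOS with {config.ndat_metamodel} points"
    if isinstance(config, MyNewEOSSketch):
        return f"new EOS with {config.n_points} points"
    # An unregistered config class fails loudly instead of being ignored.
    raise TypeError(f"Unsupported EOS config: {type(config).__name__}")


print(create_eos_sketch(MyNewEOSSketch()))
```

With this layout, forgetting the registration step surfaces as a `TypeError` at construction time, even if the config itself validates.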
When modifying configuration fields:
[ ] Modify the relevant file under config/schemas/
[ ] Regenerate YAML reference (auto)
[ ] Update quickstart.md if it affects quick start (manual)
Documentation Structure#
Source of Truth Hierarchy#
1. Code (ultimate truth)
   - config/schema.py - Configuration validation
   - Module docstrings - API documentation
   - Type hints - Function signatures
2. Auto-generated docs (in sync with code whenever regenerated)
   - docs/yaml_reference.md - All YAML options
3. Manual docs (require human updates)
   - docs/quickstart.md - Quick start
   - docs/inference_architecture.md - Architecture
Avoiding Duplication#
Don’t duplicate information that can be auto-generated:
❌ List all YAML fields manually in documentation
✅ Link to yaml_reference.md for complete list
✅ Show key examples and explain concepts
Example:
<!-- Good: Explain concept, link to reference -->
## Configuration System
JESTER uses YAML files for configuration. For a complete list of all
available options, see the [YAML Reference](yaml_reference.md).
Key configuration sections:
- `transform`: How to convert parameters to observables
- `prior`: Prior distributions
- `likelihoods`: Observational constraints
...
<!-- Bad: Duplicate auto-generated content -->
## Configuration System
All available fields:
- seed: int, default 43
- eos: EOSConfig (required, discriminated by type)
  - type: "metamodel" | "metamodel_cse" | "spectral" (required)
  - ndat_metamodel: int, default 100
- tov: TOVConfig (required, discriminated by type)
... (this will get out of sync!)
Testing Documentation#
Manual Testing Checklist#
Before committing documentation changes:
[ ] Links work: Check all internal links resolve
[ ] Code examples run: Test YAML configs and Python snippets
[ ] Formatting renders: Check markdown renders correctly
[ ] Examples are current: Verify examples match latest code
Link Checking#
# Check for broken links (requires markdown-link-check)
npm install -g markdown-link-check
markdown-link-check docs/inference*.md
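If installing the npm tool is not convenient, relative links between documentation files can be checked with a short standard-library script. The sketch below is a rough fallback for local file links only, not a replacement for markdown-link-check. `broken_relative_links` is a hypothetical helper, and it skips external URLs and same-page anchors.

```python
import re
from pathlib import Path

# Inline markdown links: [text](target)
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_relative_links(markdown_file: Path) -> list[str]:
    """Return relative link targets in markdown_file that do not exist on disk."""
    broken = []
    text = markdown_file.read_text(encoding="utf-8")
    for target in LINK_PATTERN.findall(text):
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue  # external links and same-page anchors are out of scope here
        path_part = target.split("#", 1)[0]  # drop any anchor suffix
        if not (markdown_file.parent / path_part).exists():
            broken.append(target)
    return broken


if __name__ == "__main__":
    for doc in sorted(Path("docs").glob("inference*.md")):
        for target in broken_relative_links(doc):
            print(f"{doc}: broken link -> {target}")
```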
Code Example Testing#
Extract and test code examples:
# Test YAML examples
cat docs/quickstart.md | \
  sed -n '/```yaml/,/```/p' | \
  sed '/```/d' > /tmp/test_config.yaml
uv run python -m jesterTOV.inference.run_inference \
  --config /tmp/test_config.yaml \
  --validate-only
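The shell pipeline above concatenates every YAML block in the file into one config, which only works when the page contains a single example. A minimal Python sketch that writes each block to its own file is shown below. `extract_yaml_blocks` and `write_blocks` are hypothetical helper names, and each written file would still be checked with the `--validate-only` command above.

```python
import re
from pathlib import Path

FENCE = "`" * 3  # built indirectly so this example can itself sit inside a fenced block
YAML_BLOCK = re.compile(
    rf"^{FENCE}yaml[ \t]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_yaml_blocks(markdown: str) -> list[str]:
    """Return the body of every fenced yaml block, in document order."""
    return YAML_BLOCK.findall(markdown)


def write_blocks(markdown_file: Path, out_dir: Path) -> list[Path]:
    """Write each YAML block to its own numbered file and return the paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    blocks = extract_yaml_blocks(markdown_file.read_text(encoding="utf-8"))
    for index, block in enumerate(blocks):
        target = out_dir / f"{markdown_file.stem}_{index}.yaml"
        target.write_text(block, encoding="utf-8")
        written.append(target)
    return written
```

Keeping one file per block means a failing example can be traced back to its position in the page.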
Version Control#
Documentation Commits#
Follow these conventions:
# When regenerating auto-docs
git commit -m "docs: regenerate YAML reference after schema changes"
# When updating manual docs
git commit -m "docs: add likelihood type XYZ to inference guide"
# When fixing docs issues
git commit -m "docs: fix broken link in inference quickstart"
Pull Request Checklist#
When your PR changes inference code:
[ ] Updated/regenerated auto-generated docs (if schema changed)
[ ] Updated manual docs (if user-facing features changed)
[ ] Added/updated examples (if workflow changed)
[ ] Tested documentation links
[ ] Checked code examples still work
Documentation Review#
Self-Review Checklist#
Before requesting review:
Accuracy: Does the doc match the code?
Completeness: Are all new features documented?
Clarity: Can a new user understand it?
Examples: Are there working examples?
Links: Do all references work?
Reviewer Checklist#
When reviewing documentation PRs:
Verify auto-generation: If schema.py changed, was reference regenerated?
Check examples: Do code examples run?
Test links: Click through major navigation paths
Assess clarity: Is the explanation understandable?
Look for duplication: Is info duplicated that could be auto-generated?
Common Pitfalls#
❌ Don’t Do This#
Pitfall 1: Forgetting to regenerate
# You modify config/schema.py
from pydantic import BaseModel

class SamplerConfig(BaseModel):
    n_chains: int = 20
    new_field: int = 100  # Added
# ❌ Commit without regenerating reference
# → docs/yaml_reference.md is now out of sync!
Solution: Always run the generator after modifying schemas.
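One way to catch this pitfall mechanically is to compare the committed reference with freshly generated output and fail when they differ. The sketch below shows only the comparison step, using the standard library's `difflib`. How the fresh text is produced is left out, `reference_diff` is a hypothetical helper, and the two sample strings are made up for illustration.

```python
import difflib


def reference_diff(committed: str, regenerated: str) -> list[str]:
    """Return unified-diff lines; an empty list means the reference is current."""
    return list(
        difflib.unified_diff(
            committed.splitlines(),
            regenerated.splitlines(),
            fromfile="docs/yaml_reference.md (committed)",
            tofile="docs/yaml_reference.md (regenerated)",
            lineterm="",
        )
    )


# Illustrative inputs: the regenerated text contains the newly added field.
committed = "- n_chains: int, default 20"
regenerated = "- n_chains: int, default 20\n- new_field: int, default 100"
for line in reference_diff(committed, regenerated):
    print(line)
```

A CI job could exit with a non-zero status whenever the returned list is non-empty.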
Pitfall 2: Duplicating auto-generated content
<!-- ❌ Don't copy-paste from yaml_reference.md -->
All sampler fields:
- n_chains: int, default 20
- n_loop_training: int, default 3
...
<!-- ✅ Instead, link and explain -->
See [YAML Reference](yaml_reference.md#sampler-configuration)
for all sampler fields. Key parameters to tune:
- `n_chains`: More chains = better convergence but slower
- `learning_rate`: Controls NF training speed
Pitfall 3: Hardcoding examples that will break
<!-- ❌ Example that will break if defaults change -->
Run with default settings (20 chains, 3 training loops):
sampler:
  n_chains: 20
  n_loop_training: 3
<!-- ✅ Instead, point to the defaults and show only overrides -->
Run with default settings (see defaults):
sampler:
  output_dir: "./outdir/" # Only override what you need
Future Improvements#
Potential Automation#
1. API documentation from docstrings
   - Use Sphinx autodoc for module reference
   - Generate from jesterTOV.inference package docstrings
2. Example validation in CI
   - Extract code examples from markdown
   - Run --validate-only on all YAML examples
   - Fail CI if examples are broken
3. Link checking in CI
   - Automated broken link detection
   - Run on every PR that touches docs
4. Documentation coverage
   - Track what % of public APIs are documented
   - Alert when new public functions lack docstrings
Suggested Pre-Commit Hook (Complete)#
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      # Regenerate YAML reference when schema changes
      - id: regenerate-yaml-reference
        name: Regenerate YAML reference
        entry: uv run python -m jesterTOV.inference.config.generate_yaml_reference
        language: system
        files: ^jesterTOV/inference/config/(schema\.py|schemas/.*\.py)$
        pass_filenames: false
      # Check documentation links
      - id: markdown-link-check
        name: Check markdown links
        entry: markdown-link-check
        language: node
        files: \.md$
        additional_dependencies: ['markdown-link-check']
      # Validate YAML examples in documentation
      - id: validate-yaml-examples
        name: Validate YAML examples
        entry: scripts/validate_yaml_examples.sh
        language: system
        files: ^docs/inference.*\.md$
        pass_filenames: false
Summary#
Quick Reference#
| Task | Command |
|---|---|
| Regenerate YAML reference | |
| Check links | |
| Validate config example | |
Golden Rule#
If the code changes, the documentation must change. If the schema changes, regenerate the auto-docs.
Documentation Workflow#
Code Change
    ↓
Modify schema.py? → Yes → Regenerate YAML reference
    ↓                          ↓
    ↓                     (auto-updated)
    ↓
User-facing change? → Yes → Update manual docs
    ↓                          ↓
    ↓                     (quickstart.md, etc.)
    ↓
Commit
    ↓
PR Review
    ↓
Merge
Maintainer Note: Keep this guide updated as the documentation system evolves!
Last Updated: February 2026