Operator Onboarding Kit

Add one new graph operator with deterministic validation, planner visibility, and CI coverage.

1) Edit Checklist (Template)

  1. Schema registry:
     • file: include/lightning_core/core/graph.hpp
     • add OpKind enum + opKindName()
     • register schema in OperatorRegistry (rank/dtype/layout/attrs)
  2. Validation contract:
     • file: include/lightning_core/core/graph.hpp
     • add shape/layout/attribute checks in graph validation path
     • return deterministic reason codes on violations
  3. Execution path:
     • file: include/lightning_core/core/graph.hpp
     • add executeNodeTyped switch case
     • keep CPU baseline behavior deterministic
  4. Python name mapping:
     • file: python/bindings/bind_graph.cpp
     • map Python op string to the new OpKind
  5. Tests:
     • file: tests/test_graph_ir.cpp (C++ correctness + planner/validation checks)
     • file: tests/test_python_operator_onboarding_smoke.py (copy-paste smoke)
  6. Bench evidence:
     • file: benchmarks/python/graph_eager_ab_bench.py (if op participates in chain dispatch)
     • include fallback reason visibility in CSV/JSON/MD row fields
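
The registry, validation, and name-mapping steps can be pictured with a small pure-Python model. This is only a sketch of the shape of the work, not the C++ code in graph.hpp: OpSchema, REGISTRY, PY_NAME_TO_KIND, the relu6 operator, and every reason-code string below are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class OpKind(Enum):
    # Hypothetical stand-in for the C++ OpKind enum.
    VECTOR_ADD = "vector_add"
    RELU6 = "relu6"  # the "new" operator in this sketch


@dataclass(frozen=True)
class OpSchema:
    # Hypothetical schema record: the constraints an operator declares.
    num_inputs: int
    ranks: tuple
    dtypes: tuple


# Schema registry: one entry per operator kind.
REGISTRY = {
    OpKind.VECTOR_ADD: OpSchema(num_inputs=2, ranks=(1, 2), dtypes=("float32",)),
    OpKind.RELU6: OpSchema(num_inputs=1, ranks=(1, 2), dtypes=("float32",)),
}

# Python name mapping: op string -> kind.
PY_NAME_TO_KIND = {kind.value: kind for kind in OpKind}


def validate(op_name, input_shapes, dtype):
    """Return "ok" or a fixed reason code; checks always run in the same order."""
    kind = PY_NAME_TO_KIND.get(op_name)
    if kind is None:
        return "unknown_op"
    schema = REGISTRY[kind]
    if len(input_shapes) != schema.num_inputs:
        return "bad_input_count"
    if dtype not in schema.dtypes:
        return "unsupported_dtype"
    if any(len(shape) not in schema.ranks for shape in input_shapes):
        return "unsupported_rank"
    return "ok"


print(validate("relu6", [[4, 4]], "float32"))     # ok
print(validate("relu6", [[4, 4]], "int8"))        # unsupported_dtype
print(validate("relu6", [[2, 3, 4]], "float32"))  # unsupported_rank
```

Running the checks in a fixed order is what makes the reason code deterministic in this model: an input that violates several constraints always reports the same first violation.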

2) Minimal Contract Rules

  • Unsupported shape/layout/dtype must fail with deterministic reason codes.
  • When a graph request fails, the deterministic fallback must produce eager-equivalent numerics.
  • Add at least one positive (supported) and one negative (unsupported) test case.
  • Keep the plan summary report fields stable (planned_dispatch_groups, fallback_reason_code).
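
As a rough illustration of these rules, the sketch below pairs a toy "graph" path with an eager reference and checks one supported and one unsupported case. Only the two summary field names come from the rules above; the functions, the layout strings, the reason-code values, and the numbers assigned to planned_dispatch_groups are assumptions made for the example.

```python
def eager_add(a, b):
    # Reference (eager) result that the fallback must reproduce.
    return [x + y for x, y in zip(a, b)]


def try_graph_add(a, b, layout):
    """Hypothetical graph path: returns (result, reason_code)."""
    if layout != "contiguous":
        # Unsupported input: fail with a fixed, predictable reason code.
        return None, "unsupported_layout"
    return [x + y for x, y in zip(a, b)], "none"


def run_with_fallback(a, b, layout):
    """Try the graph path; on failure, fall back to eager and record why."""
    result, reason = try_graph_add(a, b, layout)
    used_fallback = result is None
    if used_fallback:
        result = eager_add(a, b)
    summary = {
        # Illustrative values only; real semantics are defined by the planner.
        "planned_dispatch_groups": 0 if used_fallback else 1,
        "fallback_reason_code": reason,
    }
    return result, summary


a, b = [1.0, 2.0], [3.0, 4.0]

# Positive case: supported layout, no fallback.
ok_out, ok_summary = run_with_fallback(a, b, "contiguous")
assert ok_out == eager_add(a, b)
assert ok_summary["fallback_reason_code"] == "none"

# Negative case: unsupported layout, same numerics via fallback.
bad_out, bad_summary = run_with_fallback(a, b, "strided")
assert bad_out == eager_add(a, b)
assert bad_summary["fallback_reason_code"] == "unsupported_layout"
```

Both cases return a summary with the same keys, so a consumer of the report never has to branch on whether the fallback fired.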

3) Copy-Paste Smoke Example

Use this exact snippet to validate the local onboarding flow:

import numpy as np
import lightning_core as lc

# Build a one-node graph: out = vector_add(a, b)
g = lc.GraphIR()
ta = g.add_tensor([4, 4], dtype="float32", name="a", constant=True)
tb = g.add_tensor([4, 4], dtype="float32", name="b", constant=True)
tout = g.add_tensor([4, 4], dtype="float32", name="out")
g.add_node("vector_add", [ta, tb], [tout])

# Feed concrete inputs and run on the CPU baseline
a = np.arange(16, dtype=np.float32).reshape(4, 4)
b = np.ones((4, 4), dtype=np.float32)
result = g.execute_f32({ta: a, tb: b}, preferred_device="cpu")
out = np.asarray(result["values"][tout], dtype=np.float32).reshape(4, 4)

# Compare against the NumPy reference
assert np.allclose(out, a + b, atol=1e-5, rtol=1e-5)
print("operator onboarding smoke: ok")

4) CI Gate Targets

  • ci-contract-tests.yml:
    • C++ contract subset (test_graph_ir) must pass
    • Python onboarding smoke must pass
  • benchmark-artifacts.yml:
    • graph/eager artifact must include fixed planner/fallback evidence columns
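
A column check of this kind could look like the sketch below. It assumes the CSV artifact uses the plan summary field names (planned_dispatch_groups, fallback_reason_code) as column headers; the actual artifact schema and the helper name missing_columns are assumptions, so adjust REQUIRED_COLUMNS to whatever the benchmark really emits.

```python
import csv
import io

# Assumed column names, taken from the plan summary fields.
REQUIRED_COLUMNS = ("planned_dispatch_groups", "fallback_reason_code")


def missing_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    return [name for name in required if name not in header]


# Made-up sample rows, for illustration only.
good = "op,planned_dispatch_groups,fallback_reason_code\nvector_add,1,none\n"
bad = "op,latency_ms\nvector_add,0.1\n"

print(missing_columns(good))  # []
print(missing_columns(bad))   # ['planned_dispatch_groups', 'fallback_reason_code']
```

A CI step could fail the job whenever the returned list is non-empty, which keeps the evidence columns from silently disappearing.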

5) Fast Validation Commands

cmake --build build-v021 -j 8
ctest --test-dir build-v021 --output-on-failure -R test_graph_ir
python tests/test_python_operator_onboarding_smoke.py
python benchmarks/python/graph_eager_ab_bench.py --device auto --warmup 2 --iters 8 --trace-iters 4