Rewrite GPU kernels across frameworks
metal-candle-free-flip · workflow · setup · L3 · ★1
ericflo/kiln
What it does
Rewrite GPU kernels: candle-core → kt-native substrate
Best for
Mechanical candle-core → kt-native GPU kernel substrate swaps where logic must be preserved byte-for-byte (#1082).
Inputs
- · Kernel subsystem (tm_conv1d, tm_gdn_chunk, etc.)
- · Function list per subsystem
- · Source metal.rs boundary kernel code
Outputs
- · Candle-free kt-native rewrites (JSON per subsystem)
- · Preserved logic (byte-for-byte kernel calls)
Requires
- · Claude agents (1 per kernel subsystem)
- · Metal GPU framework (preserved)
- · kt-native substitutes (new)
Preconditions
- · 6 disjoint subsystem groups (GROUPS array)
- · Source functions readable in metal.rs
- · Multi-phase: Draft
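The disjointness precondition can be checked mechanically before any agent starts. The following is a minimal sketch, assuming GROUPS is a list of (subsystem, function names) pairs; that layout, the function names, and the helper `overlapping_functions` are hypothetical. Only `tm_conv1d` and `tm_gdn_chunk` are taken from the inputs above, and only two of the six groups are shown.

```rust
use std::collections::HashMap;

/// Hypothetical layout: each group pairs a subsystem name with its functions.
/// The function names are placeholders, not real kernels.
pub const GROUPS: &[(&str, &[&str])] = &[
    ("tm_conv1d", &["conv1d_fn_a", "conv1d_fn_b"]),
    ("tm_gdn_chunk", &["gdn_chunk_fn_a"]),
];

/// Returns every function name claimed by more than one group,
/// together with the groups that claim it (sorted by function name).
pub fn overlapping_functions<'a>(
    groups: &[(&'a str, &[&'a str])],
) -> Vec<(&'a str, Vec<&'a str>)> {
    let mut owners: HashMap<&str, Vec<&str>> = HashMap::new();
    for (group, funcs) in groups {
        for f in funcs.iter() {
            owners.entry(*f).or_default().push(*group);
        }
    }
    let mut out: Vec<_> = owners
        .into_iter()
        .filter(|(_, claimed_by)| claimed_by.len() > 1)
        .collect();
    out.sort();
    out
}
```

An empty result from `overlapping_functions(GROUPS)` means no two agents would write the same function, which is the property the per-group isolation relies on.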
Failure modes
- · Mechanical rewrite only: must preserve kernel logic exactly
- · If logic is 'improved', the substrate swap is broken
- · Storage.Metal → kt equivalents must use identical buffer indices
- · pipeline/encoder/device APIs must match exactly
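One way to guard against the buffer-index failure mode is to record the bindings of each kernel call before and after the rewrite and compare them. This is a sketch under stated assumptions: `Binding` and `first_binding_mismatch` are hypothetical names for illustration, not candle, kt-native, or Metal types, and extracting the bindings from the source is left out.

```rust
/// One argument bound to a compute encoder slot.
/// Hypothetical model used only for comparison.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Binding {
    pub index: u64,
    pub name: &'static str,
}

/// Returns a description of the first difference between the original
/// and rewritten binding lists, or None if they are identical.
pub fn first_binding_mismatch(
    original: &[Binding],
    rewritten: &[Binding],
) -> Option<String> {
    if original.len() != rewritten.len() {
        return Some(format!(
            "binding count changed: {} -> {}",
            original.len(),
            rewritten.len()
        ));
    }
    original
        .iter()
        .zip(rewritten)
        .find(|(before, after)| before != after)
        .map(|(before, after)| format!("expected {:?}, found {:?}", before, after))
}
```

A rewrite that swaps two slots or drops an argument yields `Some(..)`. A pure substrate swap should always yield `None`.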
Trust signals
- · 6 disjoint subsystem groups (no write conflicts)
- · Per-group agent isolation
- · Critical rule: PURE SUBSTRATE SWAP (logic unchanged)
- · 7-step rewrite pattern (signatures, literals, allocation, device, storage, buffer_o_kt, existing methods)
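The seven step names can be kept as an ordered checklist so that an agent cannot skip or reorder them. The sketch below models only the names listed above, because the source does not spell out what each step entails. `RewriteStep`, `STEPS`, and `next_step` are hypothetical names.

```rust
/// The seven step names from the rewrite pattern, in the order listed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RewriteStep {
    Signatures,
    Literals,
    Allocation,
    Device,
    Storage,
    BufferOKt,
    ExistingMethods,
}

pub const STEPS: [RewriteStep; 7] = [
    RewriteStep::Signatures,
    RewriteStep::Literals,
    RewriteStep::Allocation,
    RewriteStep::Device,
    RewriteStep::Storage,
    RewriteStep::BufferOKt,
    RewriteStep::ExistingMethods,
];

/// Returns the next step to perform, or Ok(None) once all seven are done.
/// Returns Err if `done` is not a prefix of STEPS, meaning a step was
/// skipped or performed out of order.
pub fn next_step(done: &[RewriteStep]) -> Result<Option<RewriteStep>, String> {
    if done.len() > STEPS.len() || done != &STEPS[..done.len()] {
        return Err(format!("steps out of order: {:?}", done));
    }
    Ok(STEPS.get(done.len()).copied())
}
```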