JAY ZENITH

[github] [huggingface] [x]

I build post-training and evaluation systems for tool-using agents: data, environments, verifiable rewards, and RL training infrastructure. My work is below.

predict
Toward a coding agent with a working world model. The agent predicts what the environment will do and stakes a KEEP/REVISE decision on that prediction before seeing the real result. The prediction is trained with its own cross-entropy loss against the verified outcome. Extends ECHO from prediction-as-training-signal to prediction-as-behavior. [full write-up] [code]
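
A minimal sketch of the predict-then-commit loop described above, not the project's actual code. It assumes the outcome is a binary pass/fail check and that the agent exposes a single predicted pass probability. The names `decide`, `prediction_loss`, `step`, and the 0.5 threshold are hypothetical, chosen only for illustration.

```python
import math
from typing import Callable, Tuple


def decide(p_pass: float, threshold: float = 0.5) -> str:
    """Commit to KEEP or REVISE using only the predicted pass probability.

    The threshold is an assumed value, not one taken from the project.
    """
    return "KEEP" if p_pass >= threshold else "REVISE"


def prediction_loss(p_pass: float, passed: bool) -> float:
    """Binary cross-entropy of the prediction against the verified outcome."""
    eps = 1e-12  # clamp so log() never sees 0
    p = min(max(p_pass, eps), 1.0 - eps)
    return -math.log(p) if passed else -math.log(1.0 - p)


def step(p_pass: float, run_environment: Callable[[], bool]) -> Tuple[str, bool, float]:
    """One predict -> commit -> verify -> score cycle.

    The decision is fixed before run_environment() is called, so it
    cannot depend on the real result.
    """
    decision = decide(p_pass)
    passed = run_environment()  # verified outcome, e.g. a test run
    loss = prediction_loss(p_pass, passed)
    return decision, passed, loss


if __name__ == "__main__":
    # Confident, correct prediction: KEEP with a small loss.
    print(step(0.9, lambda: True))
    # Confident, wrong prediction: KEEP with a large loss.
    print(step(0.9, lambda: False))
```

In this sketch the environment runs on every step so that the prediction always receives a label. Whether a REVISE decision should skip the run entirely is a design choice the sketch does not settle.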