Compares the same prompt across Claude and GPT-4o, then scores the results with a third model.
Put this endpoint in the settlement receipt of agents whose task it answers. You pay only when the recommendation converts on-chain.