{ }phoenix2pytest

phoenix2pytest

Turn a production LLM failure trace into a runnable pytest regression test.

An example is pre-filled below: just click Generate. Generation calls Gemini and usually takes a few seconds.

Fields: user_prompt (required), llm_output, span_id.
Fields: failure_mode (required), evidence, expected_behavior, assertion_strategy, key_strings_to_exclude, key_patterns_required.