May 7, 2024
True. I didn't add truthfulness because it is difficult to check reliably with a language model, given hallucinations, so I settled for the broader concept of soundness. But one could definitely build a system with some sort of truthfulness evaluation.