Conflict-Aware RAG Answer Quality Classification: Detecting Evident Conflict and Baseless Information with Lightweight ML Models
DOI: https://doi.org/10.63575/CIA.2026.40201
Keywords: retrieval-augmented generation, hallucination detection, evident conflict, baseless information, RAGTruth, XGBoost, BiLSTM, DeBERTa, calibration, answer quality classification
Abstract
Retrieval-augmented generation (RAG) improves factual reliability by conditioning a language model on retrieved evidence, yet grounded answers can still fail in two distinct ways: they may contradict the available evidence or introduce claims that the evidence does not support. This study formulates conflict-aware answer-quality classification on RAGTruth-processed, a 17,790-row benchmark derived from RAGTruth, and distinguishes four mutually exclusive row-level states: no hallucination, evident-conflict only, baseless-information only, and both error types together. The empirical analysis verifies the target construction, split composition, label overlap, quality labels, task variation, and generator variation, while the modeling framework specifies three deployment-oriented classifiers: XGBoost with TF-IDF and overlap features, a compact bidirectional LSTM, and DeBERTa-small with structured query-context-output input. The train and test splits differ in four-class composition (chi-square = 108.381, p = 2.447 × 10⁻²³, Cramér’s V = 0.078) and in overall hallucination prevalence, whereas their quality-label distributions remain stable (chi-square = 3.348, p = 0.188, Cramér’s V = 0.014). Baseless-information-only examples are more common than evident-conflict-only examples in both splits, and mixed errors form the smallest test class. These properties show why overall accuracy and binary hallucination detection alone are insufficient for operational monitoring. Macro-F1, mechanism-specific recall, confusion matrices, probability calibration, and routing-aware error analysis are therefore treated as the central evaluation criteria for lightweight conflict-aware safeguards.
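The four-state target and the split-comparison statistics described above can be illustrated with a minimal Python sketch. It is not the authors' code: the label strings, the function names `four_class_label` and `chi_square_and_cramers_v`, and the two boolean input flags are assumptions introduced here for illustration. The sketch uses the standard Pearson chi-square statistic and the usual definition of Cramér's V, sqrt(chi2 / (n * min(r - 1, c - 1))). With two splits, min(r - 1, c - 1) = 1, so the reported values can be checked as V = sqrt(chi2 / n) with n = 17,790.

```python
import math

def four_class_label(has_conflict: bool, has_baseless: bool) -> str:
    """Map two per-row error flags to one of four mutually exclusive states.

    The flag names and label strings are hypothetical, not taken from the dataset.
    """
    if has_conflict and has_baseless:
        return "both"
    if has_conflict:
        return "evident_conflict_only"
    if has_baseless:
        return "baseless_info_only"
    return "no_hallucination"

def chi_square_and_cramers_v(table):
    """Pearson chi-square statistic and Cramér's V for an r x c count table.

    `table` is a list of rows (e.g. one row per split, one column per class).
    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return chi2, math.sqrt(chi2 / (n * k))

# Consistency check of the reported effect sizes (two splits, so k = 1).
N_ROWS = 17790
v_four_class = math.sqrt(108.381 / N_ROWS)  # rounds to 0.078
v_quality = math.sqrt(3.348 / N_ROWS)       # rounds to 0.014
```

Both effect sizes are small, which matches the abstract's reading: the four-class shift between splits is statistically detectable at this sample size but modest in magnitude, while the quality-label distributions are essentially unchanged.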


