Sanchari Biswas conducts research on the reliability, safety, and robustness of large language models (LLMs) under the supervision of Prof. Chiu C. Tan at Temple University. Her work investigates the structural weaknesses of modern generative AI systems and develops principled methodologies to evaluate, detect, and mitigate reasoning failures in LLM outputs.
As LLMs are increasingly integrated into real-world systems, their outputs must be reliable, logically consistent, and aligned with user intent. However, these models frequently produce hallucinated information, inconsistencies, or policy-violating responses. Sanchari’s research addresses these challenges by developing systematic frameworks for analyzing and improving the behavior of LLMs across diverse domains.
Hallucination Analysis and Detection
A central focus of her work is developing systematic approaches to identify and characterize hallucinations in large language models. Rather than treating hallucinations as isolated errors, her research examines how these failures arise from the probabilistic generation mechanisms that underlie LLMs.
Her work explores how hallucinations manifest across multiple tasks, including question answering, document understanding, security analysis, and policy-driven decision systems. By studying model behavior under adversarial prompts, ambiguous queries, and cross-domain contexts, she aims to develop frameworks that can reliably identify failure modes in LLM-generated outputs.
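One common building block for this kind of detection is a self-consistency check: if independently sampled answers to the same question disagree with one another, the output is treated as a candidate hallucination. The sketch below is a minimal, illustrative version of that idea; the sampling callback, the lexical agreement measure, and the threshold are placeholders rather than components of any specific system described here.

```python
from itertools import combinations
from typing import Callable, List


def _token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercased tokens; a crude proxy for agreement."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def consistency_score(answers: List[str]) -> float:
    """Mean pairwise agreement among independently sampled answers."""
    pairs = list(combinations(answers, 2))
    if not pairs:
        return 1.0
    return sum(_token_overlap(a, b) for a, b in pairs) / len(pairs)


def flag_possible_hallucination(
    prompt: str,
    sample_answer: Callable[[str], str],  # hypothetical sampler, e.g. an LLM API call
    n_samples: int = 5,
    threshold: float = 0.5,
) -> bool:
    """Sample the model several times and flag low cross-response agreement."""
    answers = [sample_answer(prompt) for _ in range(n_samples)]
    return consistency_score(answers) < threshold
```

In practice the lexical overlap would typically be replaced by an entailment- or embedding-based agreement measure, but the structure of the check stays the same: disagreement across samples is treated as a symptom of unsupported generation.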
Evaluation Frameworks and Benchmarking
Sanchari also investigates scalable evaluation methodologies for assessing the reliability of LLM systems. This includes designing domain-specific benchmarks that test model consistency, factual grounding, and policy compliance across different application environments.
Her research emphasizes building evaluation pipelines that move beyond simple accuracy metrics and instead measure deeper properties of model behavior such as logical consistency, cross-response agreement, and alignment with external knowledge sources.
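As a rough illustration of what such a pipeline might record, the sketch below scores one benchmark item on accuracy, factual grounding against source passages, and agreement across paraphrased prompts. The item schema, field names, and scoring heuristics are assumptions made for the example, not a description of an existing benchmark.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BenchmarkItem:
    prompts: List[str]   # paraphrases of the same underlying question
    reference: str       # gold answer
    sources: List[str]   # passages the answer should be grounded in


def exact_match(pred: str, gold: str) -> float:
    return float(pred.strip().lower() == gold.strip().lower())


def grounding_rate(pred: str, sources: List[str]) -> float:
    """Fraction of predicted content words that appear in any source passage."""
    words = [w for w in pred.lower().split() if len(w) > 3]
    if not words:
        return 0.0
    pool = " ".join(sources).lower()
    return sum(w in pool for w in words) / len(words)


def evaluate_item(item: BenchmarkItem, answers: List[str]) -> Dict[str, float]:
    """Report accuracy alongside grounding and cross-paraphrase agreement."""
    assert answers and len(answers) == len(item.prompts)
    acc = sum(exact_match(a, item.reference) for a in answers) / len(answers)
    grounding = sum(grounding_rate(a, item.sources) for a in answers) / len(answers)
    distinct = len({a.strip().lower() for a in answers})
    agreement = 1.0 / distinct  # 1.0 when every paraphrase yields the same answer
    return {"accuracy": acc, "grounding": grounding, "agreement": agreement}
```

The point of the extra metrics is that a model can be "accurate" on a canonical phrasing of a question while still answering inconsistently across paraphrases or asserting content that no source supports.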
Formal Verification–Inspired Reliability Methods
Inspired by techniques from formal verification and program analysis, her work explores structured reasoning frameworks for improving LLM reliability. These approaches incorporate constraint checking, structured intermediate representations, and consistency validation mechanisms to detect contradictions and unsupported claims within model outputs.
By combining structured reasoning with multi-source validation mechanisms, her research seeks to develop scalable and interpretable solutions for ensuring trustworthy AI behavior.
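The constraint-checking idea can be made concrete with a toy structured intermediate representation: model outputs are reduced to (entity, attribute, value) claims, which are then checked for internal contradictions and for support in an external knowledge source. The claim format, example entities, and knowledge table below are hypothetical placeholders chosen only to illustrate the mechanism.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class Claim(NamedTuple):
    entity: str
    attribute: str
    value: str


def find_contradictions(claims: List[Claim]) -> List[Tuple[Claim, Claim]]:
    """Two claims conflict if they assign different values to the same (entity, attribute)."""
    seen: Dict[Tuple[str, str], Claim] = {}
    conflicts: List[Tuple[Claim, Claim]] = []
    for c in claims:
        key = (c.entity, c.attribute)
        if key in seen and seen[key].value != c.value:
            conflicts.append((seen[key], c))
        else:
            seen.setdefault(key, c)
    return conflicts


def unsupported_claims(claims: List[Claim], knowledge: Set[Claim]) -> List[Claim]:
    """Claims that no trusted source corroborates are flagged, not silently accepted."""
    return [c for c in claims if c not in knowledge]


# Toy usage with hypothetical extracted claims and a hypothetical knowledge base.
claims = [
    Claim("Course CS101", "credit_hours", "3"),
    Claim("Course CS101", "credit_hours", "4"),   # contradicts the claim above
    Claim("Course CS101", "prerequisite", "None"),
]
knowledge = {Claim("Course CS101", "credit_hours", "3")}
print(find_contradictions(claims))
print(unsupported_claims(claims, knowledge))
```

Real systems would need a far richer claim extractor and logic than exact-match lookups, but the sketch shows the general shape of verification-inspired validation: convert free text into a checkable representation, then apply explicit consistency and support constraints to it.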
Applications in High-Stakes Domains
Many of the systems studied in her research operate in domains where incorrect or misleading information can have significant consequences. Her work therefore explores the application of reliability-focused AI evaluation in domains such as education, accreditation systems, healthcare decision support, and cybersecurity.
Through interdisciplinary collaborations and domain-aware evaluation methods, she aims to design AI systems that are both technically robust and practically deployable in real-world environments.
Broader Research Vision
More broadly, Sanchari’s research lies at the intersection of trustworthy AI, security, and formal reasoning. She is particularly interested in developing scalable methodologies for evaluating and improving the behavior of generative AI systems in complex and heterogeneous environments.
Her long-term goal is to contribute to the development of AI systems whose reasoning processes can be systematically analyzed, verified, and trusted, enabling the safe deployment of language models in critical real-world applications.