CHEMLIT-QA: A HUMAN EVALUATED DATASET FOR CHEMISTRY RAG TASKS

Retrieval-Augmented Generation (RAG) is a widely used strategy for letting Large Language Models (LLMs) extrapolate beyond their inherent pre-trained knowledge. Hence, RAG is crucial when working in data-sparse fields such as chemistry. RAG systems are commonly evaluated using specialized datasets.

However, existing datasets, typically in the form of scientific Question-Answer-Context (QAC) triplets or QA pairs, are often limited in size due to the labor-intensive nature of manual curation or require further quality assessment when generated through automated processes. This highlights a critical need for large, high-quality datasets tailored to scientific applications. We introduce ChemLit-QA, a comprehensive, expert-validated, open-source dataset comprising over 1,000 entries specifically designed for chemistry.
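To make the QAC format concrete, here is a minimal Python sketch of how a single Question-Answer-Context triplet might be represented. The class name, field names, and the helper function are illustrative assumptions, not the actual ChemLit-QA schema, and the example entry is a made-up placeholder rather than a record from the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QACTriplet:
    """Hypothetical container for one Question-Answer-Context entry."""

    question: str
    answer: str
    context: str  # passage the answer is expected to be grounded in

    def as_qa_pair(self) -> tuple[str, str]:
        """Drop the context to obtain a plain QA pair."""
        return (self.question, self.answer)


def has_all_fields(entry: QACTriplet) -> bool:
    """Structural check only: every field must contain non-whitespace text."""
    return all(
        text.strip() for text in (entry.question, entry.answer, entry.context)
    )


# Placeholder entry for illustration; not taken from ChemLit-QA.
example = QACTriplet(
    question="Which element has the atomic number 6?",
    answer="Carbon.",
    context="Carbon is the chemical element with atomic number 6.",
)
```

The context field is what separates a triplet from a QA pair: a RAG evaluation can compare the retrieved passage against it, whereas a QA pair only allows the final answer to be checked.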

Our approach involves the initial generation and filtering of a QAC dataset using an automated framework based on GPT-4 Turbo, followed by rigorous evaluation by chemistry experts. Additionally, we provide two supplementary datasets: ChemLit-QA-neg, focused on negative data, and ChemLit-QA-multi, focused on multihop reasoning tasks for LLMs, which complement the main dataset on hallucination detection and more reasoning-intensive tasks.
