Agentic Retrieval for Financial Documents: AI Grand Challenge on FinAgentBench

Co-hosted with the AI for Finance Symposium '25

Competition at ICAIF '25

Start Date: September 10th
Submission Deadline: October 20th
Winner Announcement: October 25th
Presentation at ICAIF '25: November 15th
Location: Singapore

To participate in the challenge, please join via Kaggle.

Program Schedule

Introduction to Challenge: 1:00 PM - 1:15 PM (15 minutes, tentative)

Winning Presentation: 1:15 PM - 1:30 PM (15 minutes, tentative)

The goal of the competition is to advance financial AI systems that can perform multi-step, evidence-grounded retrieval from SEC filings to support accurate and interpretable investment research.

Overview

Start: September 10th, 2025

Close: October 20th, 2025

The AI Agentic Retrieval Grand Challenge at ICAIF ’25 is motivated by the need for financial AI systems that move beyond the accuracy of traditional methods to deliver interpretable, evidence-grounded answers from complex SEC filings. Participants are tasked with building retrieval systems that can support complex institutional finance questions with character-level evidence attribution.


Unlike traditional retrieval tasks, this competition is built on FinAgentBench, a benchmark purpose-built to evaluate multi-step, agentic retrieval. Each query requires reasoning across two stages: first, selecting the most relevant document type (Document-Level Ranking), and second, identifying the most relevant passages within that document (Chunk-Level Ranking). The dataset contains over 2,400 examples curated by expert financial analysts from real SEC filings (2023–2024).

Participants must return ranked results for both the document-level and chunk-level ranking tasks. Participants whose solutions stand out (and who explain how they achieved them) will be invited to present their work at ICAIF ’25.

Description

In the AI Agentic Retrieval Grand Challenge, participants are tasked with developing systems capable of retrieving the most relevant evidence from raw SEC filings to answer institutional finance questions. The task is structured as a two-stage agentic retrieval problem: Document-Level Ranking and Chunk-Level Ranking. This two-stage design prioritizes accuracy and the interpretability essential for deploying AI systems in investment research. Participants’ systems will be evaluated on their capacity to reason across both the document-type ranking and chunk-level retrieval tasks. All submissions must be made via Databricks Free Edition workspaces as the standardized evaluation interface; optional integration with the Upstage API (Solar LLM) is allowed.

Task Overview

For both the Document-Level Ranking and Chunk-Level Ranking tasks, you are required to return ranked results (a minimal baseline sketch follows the list below):

  1. Document-Level Ranking – Given a financial question, systems must rank candidate document types (10-K, 10-Q, 8-K, DEF-14A, earnings transcripts) by their likelihood of containing the information needed to answer the question.



  2. Chunk-Level Ranking – Given a single document, systems must identify and rank the most relevant passages within it, returning a ranked list of candidate chunks that contain the information needed to answer the question.
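
As a concrete illustration of the two stages, the sketch below ranks candidate document types and then passages with a simple TF-IDF cosine-similarity scorer. The document-type descriptions, example question, chunk texts, and output structure are illustrative assumptions rather than the official FinAgentBench schema or submission format.

```python
# Minimal two-stage retrieval sketch; TF-IDF cosine similarity stands in for a
# real scorer. All texts below are illustrative placeholders, not FinAgentBench
# data or the required submission format.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_by_similarity(query, candidates):
    """Return candidate indices sorted by cosine similarity to the query, highest first."""
    vectorizer = TfidfVectorizer().fit([query] + candidates)
    scores = cosine_similarity(vectorizer.transform([query]),
                               vectorizer.transform(candidates))[0]
    return scores.argsort()[::-1].tolist()

question = "What drove the year-over-year change in operating margin?"

# Stage 1: Document-Level Ranking, scoring a short description of each document type.
doc_type_descriptions = {
    "10-K": "annual report with audited financial statements and risk factors",
    "10-Q": "quarterly report with unaudited interim financial statements",
    "8-K": "current report disclosing material corporate events",
    "DEF-14A": "proxy statement covering governance and executive compensation",
    "earnings transcript": "management commentary and analyst Q&A from the earnings call",
}
doc_types = list(doc_type_descriptions)
ranked_doc_types = [doc_types[i] for i in
                    rank_by_similarity(question, list(doc_type_descriptions.values()))]

# Stage 2: Chunk-Level Ranking within the top-ranked document, keeping the top 5 chunks.
chunks = [
    "Operating margin declined 120 bps year over year due to higher input and freight costs.",
    "The Board approved a new $2 billion share repurchase program.",
    "Services revenue grew 8%, partially offsetting the decline in hardware sales.",
]
ranked_chunks = [chunks[i] for i in rank_by_similarity(question, chunks)][:5]

print(ranked_doc_types)
print(ranked_chunks)
```

In practice, a stronger baseline would swap the TF-IDF scorer for dense embeddings or an LLM re-ranker, but the two-stage structure stays the same.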

Task Evaluation

The challenge consists of two sequential retrieval tasks. In the document-type ranking task, participants must rank all candidate document types based on their relevance to the input question. In the chunk-level ranking task, given the selected document, participants must rank the top 5 most relevant paragraph-level passages that contain the information necessary for answering the question.

Both tasks will be evaluated using standard ranking metrics: MRR@5 (Mean Reciprocal Rank), MAP@5 (Mean Average Precision), and nDCG@5 (Normalized Discounted Cumulative Gain).
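
For reference, below is a minimal sketch of these three metrics as they are commonly defined, assuming binary relevance labels and a cutoff of 5; the official evaluation script may differ in details such as tie-breaking or how MAP@5 is normalized.

```python
# Common definitions of MRR@k, AP@k, and nDCG@k for binary relevance labels.
from math import log2

def mrr_at_k(ranked_relevance, k=5):
    """Reciprocal rank of the first relevant item within the top k (0 if none)."""
    for i, rel in enumerate(ranked_relevance[:k], start=1):
        if rel > 0:
            return 1.0 / i
    return 0.0

def ap_at_k(ranked_relevance, k=5):
    """Average precision over the top k, normalized by the number of relevant items retrieved."""
    hits, precision_sum = 0, 0.0
    for i, rel in enumerate(ranked_relevance[:k], start=1):
        if rel > 0:
            hits += 1
            precision_sum += hits / i
    return precision_sum / hits if hits else 0.0

def ndcg_at_k(ranked_relevance, k=5):
    """DCG over the top k divided by the DCG of the ideally ordered list."""
    dcg = sum(rel / log2(i + 1) for i, rel in enumerate(ranked_relevance[:k], start=1))
    ideal = sorted(ranked_relevance, reverse=True)[:k]
    idcg = sum(rel / log2(i + 1) for i, rel in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0

# Example: relevance labels of one query's submitted ranking (1 = relevant chunk).
labels = [0, 1, 0, 1, 0]
print(mrr_at_k(labels))   # 0.5
print(ap_at_k(labels))    # 0.5
print(ndcg_at_k(labels))  # ~0.65
```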

Sponsored by

Contact

For further details and guidelines, please contact the workshop organizers via the Kaggle website.