First Published: First Break - February 2026, by B. Lasscock, M. Gajula, K. Gonzalez, B. Michell, S. Namasivayam, A. Sansal, A. Valenciano (TGS), D. Arunabha, L. Chen, C. Liu, V.S. Ravipati, M. Sujitha, G. Suren (AWS GenAI Innovation Center)


Abstract

This article presents a practical framework for AI-assisted subsurface data access based on explicit data representations, agent-based workflows, and efficient information retrieval. We demonstrate large-scale conversion of SEG-Y archives into self-describing MDIO v1 datasets and present a case study on agent-driven reconstruction of seismic metadata from legacy text headers. A second case study evaluates embedding-based retrieval across acquisition and processing reports, showing that vector quantisation and graph-based indexing enable low-latency, relevance-driven search.

from mdio import open_mdio
from mdio.builder.schemas.v1.stats import SummaryStatistics
from upath import UPath

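# Open the public Poseidon volume anonymously from S3.
uri = UPath("s3://tgs-opendata-poseidon/full_stack_agc.mdio", anon=True)
ds = open_mdio(uri)  # open_mdio is imported directly; the mdio module name is not bound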
# Validate the stored summary statistics against the MDIO v1 schema.
stats_dict = ds.seismic.attrs["statsV1"]
stats = SummaryStatistics.model_validate(stats_dict)
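# Select one inline and plot it with the vertical axis increasing downwards.
line = ds.sel(inline=2468)
cmap_kw = dict(cmap="gray_r")
fig_kw = dict(aspect=2, size=6, yincrease=False)
# imshow is requested explicitly because interpolation is an imshow option.
line.seismic.T.plot.imshow(**cmap_kw, **fig_kw, interpolation="lanczos")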

An inline slice sampled from the Poseidon dataset using the code above.

These capabilities are integrated into an interactive, multi-agent system that supports natural-language analysis and coordinated access to structured and unstructured subsurface information.
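
To make the legacy text headers mentioned above concrete, here is a minimal sketch of decoding one. It assumes the standard SEG-Y textual file header layout: 3200 bytes holding 40 card images of 80 characters each, encoded as EBCDIC or ASCII. The function names, the regular expression, and the "KEY: VALUE" layout of the demo header are hypothetical and chosen for illustration only. Real headers are free-form and far less regular, which is why the article turns to agents. This is a plain rule-based baseline, not the authors' workflow.

```python
import re

HEADER_BYTES = 3200   # size of the SEG-Y textual file header
CARD_WIDTH = 80       # characters per card image (40 cards in total)


def decode_text_header(raw: bytes) -> list[str]:
    """Split a textual header into 40 card images (assumes EBCDIC cp037 or ASCII)."""
    if len(raw) != HEADER_BYTES:
        raise ValueError("expected a 3200-byte textual header")
    # Heuristic: cards start with 'C', which is 0xC3 in EBCDIC and 0x43 in ASCII.
    encoding = "cp037" if raw[0] == 0xC3 else "ascii"
    text = raw.decode(encoding, errors="replace")
    return [text[i:i + CARD_WIDTH].rstrip() for i in range(0, HEADER_BYTES, CARD_WIDTH)]


def extract_fields(lines: list[str]) -> dict[str, str]:
    """Collect 'KEY: VALUE' pairs from card images (hypothetical, idealised layout)."""
    pattern = re.compile(r"^C\s*\d+\s+([A-Za-z ]+?):\s*(.+)$")
    fields = {}
    for line in lines:
        match = pattern.match(line)
        if match:
            fields[match.group(1).strip().upper()] = match.group(2).strip()
    return fields


def make_demo_header() -> bytes:
    """Build a synthetic EBCDIC header so the sketch runs without a SEG-Y file."""
    cards = ["C 1 CLIENT: EXAMPLE SURVEY", "C 2 SAMPLE INTERVAL: 4 MS"]
    cards += [f"C{n:2d}" for n in range(3, 41)]
    return "".join(card.ljust(CARD_WIDTH) for card in cards).encode("cp037")


if __name__ == "__main__":
    lines = decode_text_header(make_demo_header())
    print(extract_fields(lines))
```

With a real file, the first 3200 bytes would be read in binary mode and passed to `decode_text_header`. A fixed pattern like this one misses anything that does not follow the assumed layout.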

Read the full article here.