Astronomy · Niche
advanced

APBench and benchmarking large language model performance in fundamental astrodynamics problems for space engineering

Di Wu et al. (2025)

Published
Mar 7, 2025
Journal
Scientific Reports · Vol. 15 · No. 1
DOI
10.1038/s41598-025-91150-5

At a Glance

How well can LLMs solve university-level space science problems?

Summary

The authors created a dataset of astrodynamics questions, tested a variety of LLMs on it (including open-source ones), and evaluated their performance. The paper is a good example of how to conduct a benchmark study. Helpful for anyone doing benchmarking work in astronomy.

Method Snapshot

LLM (benchmark)

Background

Deep knowledge of LLM benchmarking and the state of the art

A nice example of the usefulness of LLMs in astronomy, and a good example of how to run a benchmark study that combines astronomy and LLMs.
