Design & Reuse

Dnotitia Releases "Dnotitia NIAH," an Open-Source Benchmark for Long-Context LLM Performance

Oct. 02, 2025

[Seoul, South Korea – October 2nd, 2025] Dnotitia Inc. (Dnotitia), a leader in long-term memory AI and semiconductor-integrated solutions, today announced the open-source release of Dnotitia NIAH (Needles in a Haystack) on GitHub, a benchmarking tool designed to quantitatively evaluate how well large language models (LLMs) handle long-context reasoning.

Many next-generation LLMs claim to support context windows exceeding one million tokens, but their ability to retrieve information accurately across extended sequences remains largely unproven. In practice, models often exhibit sharp performance drops when answers appear near the end of the text, demonstrating that a longer context window alone does not ensure reliable reasoning.

<Image: Configuration file example auto-detected by Dnotitia NIAH>

‘Dnotitia NIAH’ closes this gap. Inspired by the phrase “finding a needle in a haystack,” it tests whether a model can locate and extract specific information hidden within thousands of lines of context. For instance, when asked “What ingredient is needed to make delicious kimchi?” the model must retrieve “cabbage” from a buried reference sentence like “Cabbage is required to make delicious kimchi.”
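The idea behind such a test can be sketched in a few lines of Python. This is only an illustrative sketch and not Dnotitia NIAH's actual code or API: the function names (`build_haystack`, `is_correct`, `run_depth_sweep`), the `depth` parameter, and the `toy_model` stand-in are all assumptions made for this example. It buries the kimchi sentence at a chosen relative position in filler text, asks the question, and checks whether the answer contains "cabbage."

```python
from typing import Callable, Dict, List

# Needle, question, and expected answer taken from the kimchi example above.
NEEDLE = "Cabbage is required to make delicious kimchi."
QUESTION = "What ingredient is needed to make delicious kimchi?"
EXPECTED = "cabbage"


def build_haystack(filler: List[str], needle: str, depth: float) -> str:
    """Insert the needle among filler lines (depth 0.0 = start, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    position = round(depth * len(filler))
    lines = filler[:position] + [needle] + filler[position:]
    return "\n".join(lines)


def is_correct(answer: str, expected: str) -> bool:
    """Simple scoring rule (an assumption): case-insensitive substring match."""
    return expected.lower() in answer.lower()


def run_depth_sweep(
    ask_model: Callable[[str, str], str],
    filler: List[str],
    depths: List[float],
) -> Dict[float, bool]:
    """Ask the same question with the needle placed at each depth."""
    results = {}
    for depth in depths:
        context = build_haystack(filler, NEEDLE, depth)
        answer = ask_model(context, QUESTION)
        results[depth] = is_correct(answer, EXPECTED)
    return results


def toy_model(context: str, question: str) -> str:
    """Hypothetical stand-in for an LLM that only 'reads' the first half
    of its context, mimicking a failure near the end of the text."""
    lines = context.split("\n")
    visible = lines[: len(lines) // 2]
    for line in visible:
        if "kimchi" in line.lower():
            return line
    return "I don't know."


if __name__ == "__main__":
    filler_lines = [f"Filler sentence number {i}." for i in range(1000)]
    print(run_depth_sweep(toy_model, filler_lines, [0.0, 0.25, 0.5, 1.0]))
```

In a real evaluation, `toy_model` would be replaced by a call to the LLM under test, and the sweep would be repeated over many context lengths and needle positions to show where retrieval starts to fail.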

Dnotitia has already applied ‘Dnotitia NIAH’ to evaluate multiple open-source LLMs. Early versions frequently failed when the answer was located near the end of a passage, while improved models showed consistent performance throughout. These results highlight how systematic benchmarking can directly accelerate model improvement.

With ‘Dnotitia NIAH’, researchers and developers can move beyond token-count claims and quantitatively verify whether LLMs maintain consistent performance across long documents, giving them a practical tool to measure and enhance large-context capabilities.

“Dnotitia has released a wide range of open-source resources, from LLM models and training datasets to automation frameworks, helping to advance the AI ecosystem,” said Moo-Kyoung (MK) Chung, CEO of Dnotitia. “The release of Dnotitia NIAH continues that effort. By opening not only models but also evaluation tools, we aim to contribute to the progress of the global AI community.”