Design & Reuse

Dnotitia and Hanyang University Launch Open-Source Platform Benchmarking AI Quantization

'QLLM-INFER' enables AI developers and researchers to evaluate state-of-the-art quantization algorithms under standardized conditions

Seoul, South Korea, Apr. 08, 2025 – 

 

Dnotitia Inc. (Dnotitia), a leading AI and semiconductor company, today announced the release of an open-source platform for evaluating AI quantization techniques. Jointly developed through an industry-academia research collaboration with the AIHA Lab at Hanyang University, led by Professor Jungwook Choi, the platform, ‘QLLM-INFER’, is now publicly available on GitHub under the Apache 2.0 license.

 

As large language models (LLMs) like ChatGPT continue to gain attention, the scope of AI applications is rapidly expanding. However, deploying these models in real-world scenarios remains a major challenge due to their high computational and memory demands. Quantization, a technique that reduces the precision of the numerical representations in an AI model (for example, storing 16-bit floating-point weights as 4- or 8-bit integers), offers a powerful solution. It enables models to largely preserve accuracy while significantly improving inference speed and reducing memory consumption.
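The core idea can be illustrated with a minimal sketch of symmetric 8-bit quantization. This is a generic textbook example, not code from QLLM-INFER: floats are mapped to integers via a single scale factor, then recovered (approximately) by multiplying back.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization: map floats to INT8 via one scale."""
    scale = np.max(np.abs(x)) / 127.0          # largest magnitude maps to 127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from the INT8 representation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
# INT8 storage uses 1/4 the memory of FP32; per-element rounding
# error is bounded by scale / 2.
```

Real quantization algorithms refine this basic recipe, for example with per-channel scales, non-uniform grids, or calibration passes that minimize the rounding error on representative data.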

 

Despite the growing importance of quantization for optimizing AI models, previous benchmarking efforts have been fragmented. Algorithms have often been evaluated using inconsistent experimental setups and metrics, making objective comparisons difficult. In response, Dnotitia and Hanyang University introduced a unified, open-source platform designed to standardize the evaluation of quantization algorithms. ‘QLLM-INFER’ offers consistent benchmarking conditions and has already been used to assess eight of the most influential quantization methods published between 2022 and 2024.

 

The platform categorizes algorithm performance into three core evaluation types:

  1. Weight and Activation Quantization: reducing both model parameters and intermediate computation values

  2. Weight-only Quantization: compressing model parameters while keeping activations in full precision

  3. KV Cache Quantization: optimizing temporary memory usage for long-context processing in LLMs
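The difference between the first two categories can be sketched with a toy linear layer. This is an illustrative example under simple assumptions (symmetric per-tensor scales), not the platform's code; KV cache quantization applies the same idea to the cached attention key/value tensors instead of weights.

```python
import numpy as np

def quantize(x, bits=8):
    """Symmetric per-tensor quantization helper."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    return np.round(x / scale).astype(np.int32), scale

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8)).astype(np.float32)   # layer weights
a = rng.standard_normal(8).astype(np.float32)        # input activations

# 1. Weight-and-activation quantization: both operands are integers,
#    so the matmul itself can run on integer hardware.
qW, sW = quantize(W)
qa, sa = quantize(a)
y_wa = (qW @ qa) * (sW * sa)

# 2. Weight-only quantization: weights are stored in low precision but
#    dequantized to float before the matmul; activations stay FP32.
y_wo = (qW * sW) @ a

y_ref = W @ a  # full-precision reference for comparison
```

Weight-only methods mainly save memory and bandwidth, while weight-and-activation methods additionally allow low-precision arithmetic, which is why the platform evaluates the two families separately.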

 

“As LLM services become more widely commercialized, model compression through quantization is no longer optional – it’s essential,” said Moo-Kyoung Chung, CEO of Dnotitia. “However, selecting the most suitable quantization approach for specific deployment environments remains a complex challenge. ‘QLLM-INFER’ was designed to address this issue - offering a transparent and reproducible benchmarking platform that enables stakeholders to objectively compare algorithm performance. We expect it will significantly support both the selection of optimal solutions and the innovation of new quantization techniques.”

 

“Until now, there was no consistent framework for evaluating quantization methods,” said Professor Jungwook Choi of Hanyang University. “This platform establishes the first standardized benchmark for quantization, which is academically significant in its own right. We believe it will help AI researchers produce more objective and reproducible results, ultimately advancing the quality and reliability of research in this field.”

 

About Dnotitia

 

Dnotitia is an AI and semiconductor company that creates innovative value through the convergence of artificial intelligence (AI) and data, providing high-performance, low-cost LLM solutions. Leveraging its Vector Data Processing Unit (VDPU), the world’s first of its kind, the company offers ▲Seahorse, a high-performance vector database that supports Retrieval-Augmented Generation (RAG), a key technology for generative AI. Dnotitia also offers ▲Mnemos, a personal/edge LLM device based on its proprietary LLM foundation model.

  1. Seahorse indexes various types of multi-modal data, such as text, images, and videos, into vector form, providing semantic search that retrieves information based on the meaning and context of user queries. Seahorse can be used not only in RAG systems but also to implement semantic search across all digital data stored globally.

  2. Mnemos is a solution designed to address the high costs and resource consumption of AI. It is a compact edge device capable of running high-performance LLMs without the need for a data center. Leveraging Dnotitia’s RAG and LLM optimization technology, Mnemos delivers high-performance LLM services using minimal GPU/NPU resources.

Founded in 2023, Dnotitia has grown to a team of over 90 employees in a short period of time and has established strategic partnerships across various industries. By integrating specialized semiconductors and optimized algorithms, Dnotitia aims to usher in a new era of AI. By fusing data with AI to develop AI with long-term memory, Dnotitia envisions realizing low-cost artificial general intelligence (AGI) accessible to everyone, creating a future where the benefits of AI can be enjoyed by all.

For more information about Dnotitia, please visit: www.dnotitia.com.