An Empirical Study on the Accuracy of Large Language Models in API Documentation Understanding: A Cross-Programming Language Analysis

Authors

  • Pengfei Li, Electrical and Computer Engineering, Duke University, NC, USA
  • Qichang Zheng, Computational Social Science, University of Chicago, IL, USA
  • Ziyi Jiang, Computer Information Technology, Northern Arizona University, AZ, USA

DOI:

https://doi.org/10.63575/

Keywords:

Large Language Models, API Documentation, Code Understanding, Cross-Language Analysis

Abstract

This study presents a comprehensive empirical evaluation of Large Language Models (LLMs) in understanding API documentation across multiple programming languages. We systematically assess the accuracy and consistency of five prominent LLMs—GPT-4, GPT-3.5, Claude-3, Llama-2, and CodeT5—in interpreting API documentation for Java, Python, JavaScript, and C++. Our evaluation framework employs both automated metrics and human evaluation protocols to measure understanding accuracy, completeness, and cross-language consistency. Results indicate significant variations in LLM performance across different programming languages, with accuracy scores ranging from 67.3% to 89.7%. The study reveals that syntax complexity, documentation structure, and linguistic patterns substantially influence LLM comprehension capabilities. These findings provide critical insights for improving LLM-based code assistance tools and establishing guidelines for effective API documentation design in multi-language development environments.

Author Biography

  • Ziyi Jiang, Computer Information Technology, Northern Arizona University, AZ, USA

Published

2025-07-05

How to Cite

[1] Pengfei Li, Qichang Zheng, and Ziyi Jiang, “An Empirical Study on the Accuracy of Large Language Models in API Documentation Understanding: A Cross-Programming Language Analysis”, Journal of Computing Innovations and Applications, vol. 3, no. 2, pp. 1–14, Jul. 2025, doi: 10.63575/.