Evaluating Tool Selection and Usage Efficiency of LLM-based Agents in Domain-Specific Tasks: A Comparative Analysis
DOI: https://doi.org/10.63575/

Keywords: LLM agents, tool selection, usage efficiency, domain-specific evaluation

Abstract
Large language model-based agents demonstrate increasing sophistication in autonomous task execution across diverse domains, yet their tool selection mechanisms and usage efficiency remain underexplored. This study develops a comprehensive evaluation framework for assessing tool selection patterns and usage efficiency in domain-specific environments. We implement a probabilistic assessment methodology that quantifies agent performance across multiple dimensions, including selection accuracy, execution latency, and resource optimization. Our experimental protocol encompasses financial analysis, scientific computation, and data processing domains, evaluating six distinct LLM architectures under controlled conditions. Results indicate significant variance in tool selection strategies, with transformer-based agents achieving 23.4% higher efficiency scores than retrieval-augmented baselines. The framework reveals systematic patterns in tool invocation sequences, demonstrating domain-specific adaptation capabilities while highlighting critical limitations in cross-domain generalization. Our analysis contributes quantitative insights into agent behavior patterns and establishes baseline metrics for future tool usage optimization research. These findings inform architectural decisions for production deployments, where tool efficiency directly impacts computational costs and response latency.
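To make the abstract's three evaluation dimensions concrete, the sketch below shows one plausible way to aggregate selection accuracy, execution latency, and resource usage into a single efficiency score. This is a minimal illustration, not the paper's actual methodology: the `ToolCallRecord` structure, the weights, and the latency decay constant are all hypothetical assumptions.

```python
import math
from dataclasses import dataclass

@dataclass
class ToolCallRecord:
    """One tool invocation by an agent (hypothetical schema)."""
    correct_selection: bool  # did the agent pick the tool a reference oracle would pick?
    latency_s: float         # wall-clock execution time in seconds
    resource_cost: float     # normalized resource usage in [0, 1]

def efficiency_score(records, w_acc=0.5, w_lat=0.3, w_res=0.2, latency_scale=2.0):
    """Aggregate the three dimensions into a score in [0, 1].
    Weights and latency_scale are illustrative, not taken from the paper."""
    if not records:
        return 0.0
    # Selection accuracy: fraction of calls where the correct tool was chosen.
    acc = sum(r.correct_selection for r in records) / len(records)
    # Latency term: exponential decay maps mean latency to (0, 1]; faster -> closer to 1.
    mean_lat = sum(r.latency_s for r in records) / len(records)
    lat = math.exp(-mean_lat / latency_scale)
    # Resource term: lower normalized cost -> higher score.
    res = 1.0 - sum(r.resource_cost for r in records) / len(records)
    return w_acc * acc + w_lat * lat + w_res * res

# Example: scoring three tool calls from a single episode.
episode = [
    ToolCallRecord(True, 0.8, 0.2),
    ToolCallRecord(True, 1.5, 0.4),
    ToolCallRecord(False, 0.6, 0.1),
]
print(f"efficiency = {efficiency_score(episode):.3f}")
```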


