Journal of Software Engineering and Applications

Volume 17, Issue 1 (January 2024)

ISSN Print: 1945-3116   ISSN Online: 1945-3124

Google-based Impact Factor: 1.22

Smaller & Smarter: Score-Driven Network Chaining of Smaller Language Models

PP. 23-42
DOI: 10.4236/jsea.2024.171002

ABSTRACT

With the continuous evolution and expanding applications of Large Language Models (LLMs), there has been a noticeable surge in the size of emerging models. The growth is not only in model size, primarily measured by the number of parameters, but also in the accompanying computational demands and the hardware and software prerequisites for training, all culminating in a substantial financial investment. In this paper, we present novel techniques such as supervision, parallelization, and scoring functions to obtain better results from chains of smaller language models, rather than relying solely on scaling up model size. First, we propose an approach to quantify the performance of a Smaller Language Model (SLM) by introducing a corresponding supervisor model that incrementally corrects the errors it encounters. Second, we propose an approach in which two smaller language models (in a network) perform the same task and the more relevant of the two outputs is selected, ensuring peak performance for the task at hand. Experimental evaluations establish the quantitative accuracy improvements on financial reasoning and arithmetic calculation tasks obtained from techniques such as supervisor models (in a network-of-models scenario), threshold scoring, and parallel processing over a baseline study.
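
The following is a minimal Python sketch of the score-driven chaining idea summarized above, assuming hypothetical callables slm_a, slm_b, and supervisor that stand in for smaller language models, and a task-specific score function; it illustrates the combination of parallelization, threshold scoring, and supervision in general terms and is not the authors' implementation.

from concurrent.futures import ThreadPoolExecutor

def run_chain(prompt, slm_a, slm_b, supervisor, score, threshold=0.8):
    # Run the two smaller language models on the same task in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(model, prompt) for model in (slm_a, slm_b)]
        candidates = [f.result() for f in futures]

    # Score each candidate output and keep the better of the two.
    best = max(candidates, key=lambda output: score(prompt, output))

    # If the best output falls below the score threshold, pass it to the
    # supervisor model for incremental correction.
    if score(prompt, best) < threshold:
        best = supervisor(prompt, best)

    return best

Here slm_a, slm_b, and supervisor may be any callables that map a prompt (and, for the supervisor, a draft answer) to text, and score returns a relevance value in [0, 1]; the threshold value of 0.8 is illustrative only.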

Share and Cite:

Dhingra, G., Chawla, S., Madisetti, V. and Bahga, A. (2024) Smaller & Smarter: Score-Driven Network Chaining of Smaller Language Models. Journal of Software Engineering and Applications, 17, 23-42. doi: 10.4236/jsea.2024.171002.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.