This is an overview page with metadata for this scientific work. The full article is available from the publisher.
MITRA AI Chatbot Assistant: Using 8-Bit Quantized Large Language Models on Consumer-Grade GPUs
Citations: 0
Authors: 1
Year: 2026
Abstract
This paper presents the MITRA AI Chatbot Assistant, a locally hosted AI chatbot designed to run on consumer-grade GPUs using 8-bit quantized Large Language Models (LLMs). The project can also access proprietary cloud-based LLMs such as ChatGPT, Gemini, Grok, and DeepSeek through their APIs, but to address privacy concerns the system primarily relies on local models such as Mistral 7B and Llama 3. Because these models run locally, constant internet connectivity is not required. MITRA enables private, low-latency inference using quantization techniques such as GPTQ and GGUF. Quantization reduces the model size to 7-8 GB, enabling deployment on consumer-grade hardware without depending on expensive commercial hardware. MITRA can assist in multiple domains, including education, medicine, therapy, and coding.
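As a rough illustration of the memory savings the abstract describes (the arithmetic and the quantization routine below are illustrative sketches, not taken from the paper): a 7-billion-parameter model stored at 16-bit precision occupies about 14 GB, while 8-bit quantization halves that to roughly 7 GB, consistent with the 7-8 GB figure above. A minimal sketch of symmetric per-tensor 8-bit quantization:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric 8-bit quantization: map floats to int8 via a per-tensor scale."""
    scale = np.abs(weights).max() / 127.0  # largest magnitude maps to +/-127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 values and scale."""
    return q.astype(np.float32) * scale

# Memory footprint of a 7B-parameter model (illustrative arithmetic)
params = 7_000_000_000
fp16_gb = params * 2 / 1e9   # 2 bytes per weight -> 14.0 GB
int8_gb = params * 1 / 1e9   # 1 byte per weight  ->  7.0 GB

# Round-trip a small weight tensor through quantization
w = np.array([0.5, -1.27, 0.01, 1.0], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s)  # approximately equal to w
```

Production systems like GPTQ refine this idea with calibration data and per-group scales to limit accuracy loss, but the storage saving comes from the same float-to-int8 mapping shown here.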
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,557 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,447 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,944 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,797 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations