OpenAlex · Updated hourly · Last updated: 03.05.2026, 16:48

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

MITRA AI Chatbot Assistant: Using 8-Bit Quantized Large Language Models on Consumer-Grade GPUs

2026 · 0 citations · International Journal for Research in Applied Science and Engineering Technology · Open Access
Open full text at publisher

Citations: 0

Authors: 1

Year: 2026

Abstract

This paper presents the MITRA AI Chatbot Assistant, a locally hosted AI chatbot designed to run 8-bit quantized Large Language Models (LLMs) on consumer-grade GPUs. The project can also access proprietary cloud-based LLMs such as ChatGPT, Gemini, Grok, and DeepSeek through their APIs, but for privacy reasons it relies primarily on local models such as Mistral 7B and Llama 3, which also remove the need for constant internet connectivity. MITRA enables private, low-latency inference using quantization formats such as GPTQ and GGUF. Quantization reduces the model size to 7-8 GB, enabling deployment on consumer-grade hardware without depending on expensive commercial hardware. MITRA can assist in multiple domains, including education, medicine, therapy, and coding.
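The size reduction the abstract describes follows from storing each weight in one byte instead of two (fp16) or four (fp32). The sketch below illustrates the basic idea with a minimal symmetric 8-bit quantizer in pure Python; it is only an illustration of the principle, not the per-group schemes that GPTQ or GGUF actually use, and the function names are our own.

```python
# Minimal sketch of symmetric 8-bit weight quantization (illustrative only;
# real GPTQ/GGUF quantizers use per-group scales and calibration data).

def quantize_8bit(weights):
    """Map float weights to int8 values in [-127, 127] with one scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_8bit(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.02, -0.13, 0.45, -0.31, 0.08]
q, scale = quantize_8bit(weights)
restored = dequantize_8bit(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"max round-trip error: {max_err:.5f}")

# Back-of-the-envelope memory estimate for a 7B-parameter model:
# 1 byte/weight (int8) vs 2 bytes/weight (fp16).
params = 7_000_000_000
print(f"int8 ~ {params / 1e9:.0f} GB, fp16 ~ {params * 2 / 1e9:.0f} GB")
```

At one byte per weight, a 7B-parameter model occupies roughly 7 GB, which matches the 7-8 GB footprint reported in the abstract and fits in the VRAM of a consumer-grade GPU.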

Topics

Artificial Intelligence in Healthcare and Education · AI in Service Interactions · Big Data and Digital Economy