This is an overview page with metadata for this scientific work. The full article is available from the publisher.
MITRA AI Chatbot Assistant: Using 8-Bit Quantized Large Language Models on Consumer-Grade GPUs
Citations: 0
Authors: 1
Year: 2026
Abstract
This paper presents the MITRA AI Chatbot Assistant, a locally hosted AI chatbot designed to run on consumer-grade GPUs using 8-bit quantized Large Language Models (LLMs). The project can also access proprietary cloud-based LLMs such as ChatGPT, Gemini, Grok, and DeepSeek through their APIs, but to address privacy concerns the system primarily relies on local models such as Mistral 7B and Llama 3. Because these models run locally, constant internet connectivity is not required. MITRA enables private, low-latency inference using quantization techniques such as GPTQ and GGUF. Quantization reduces the model size to 7-8 GB, enabling deployment on consumer-grade hardware without depending on expensive commercial hardware. MITRA can assist in multiple domains, including education, medicine, therapy, and coding.
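As a rough illustration of the memory savings the abstract describes (the arithmetic and the quantization routine below are illustrative sketches, not taken from the paper): a 7-billion-parameter model stored at 16-bit precision occupies about 14 GB, while 8-bit quantization halves that to roughly 7 GB, consistent with the 7-8 GB figure above. A minimal sketch of symmetric per-tensor 8-bit quantization:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric 8-bit quantization: map floats to int8 via a per-tensor scale."""
    scale = np.abs(weights).max() / 127.0  # largest magnitude maps to +/-127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 values and scale."""
    return q.astype(np.float32) * scale

# Memory footprint of a 7B-parameter model (illustrative arithmetic)
params = 7_000_000_000
fp16_gb = params * 2 / 1e9   # 2 bytes per weight -> 14.0 GB
int8_gb = params * 1 / 1e9   # 1 byte per weight  ->  7.0 GB

# Round-trip a small weight tensor through quantization
w = np.array([0.5, -1.27, 0.01, 1.0], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s)  # approximately equal to w
```

Production systems like GPTQ refine this idea with calibration data and per-group scales to limit accuracy loss, but the storage saving comes from the same float-to-int8 mapping shown here.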
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,557 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,447 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,944 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,797 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations