AI Medical Voicebot

March 2025

A multimodal AI-powered voicebot for healthcare that combines speech recognition, image analysis, and TTS for real-time medical insights.

Project Demo

Technologies Used

PythonGradioWhisper (GROQ)LLaMA-3.2 VisionElevenLabsGTTSdotenv

Project Links

Project Overview

This project features a voice-activated AI medical assistant that listens to voice input, transcribes it using Whisper-large-v3 via GROQ API, analyzes uploaded medical images with LLaMA-3.2 Vision, generates short professional diagnoses, and speaks the result using ElevenLabs or GTTS. Built in Python with a Gradio UI, it blends audio, vision, and NLP to create a seamless diagnostic experience.

Key Features

Voice Input & Transcription

Real-time transcription using Whisper-large-v3 via GROQ API.

Image-Based Diagnosis

Upload and analyze medical images using a powerful vision model.

Concise AI Insights

Generates short, professional responses tailored for healthcare.

Text-to-Speech Output

Converts AI responses into natural voice with ElevenLabs or GTTS.

Challenges

•Combining multimodal AI (voice, vision, text) into a real-time flow
•Ensuring quick and accurate transcriptions and image analysis
•Integrating various APIs smoothly into a single UI

Key Learnings

•Built a multimodal pipeline with real-time AI inference
•Handled audio/image data and coordinated multiple services
•Enhanced understanding of medical AI safety and clarity in generation

Project Gallery

AI Medical Voicebot screenshot 1

1 / 4

AI Medical Voicebot screenshot 2

2 / 4

AI Medical Voicebot screenshot 3

3 / 4

AI Medical Voicebot screenshot 4

4 / 4

Related Projects

You might also be interested in these projects.

Fragments Trails – Educational Q&A Platform

AI-Driven Applications

Fragments Trails – Educational Q&A Platform

July 2025

A full-featured educational Q&A platform with AI moderation, personalized feeds, Trails, and admin/moderator control.

MongoDBExpressReact+8 more

Scalable E-Commerce Platform

MERN Stack Applications

Scalable E-Commerce Platform

December 2024

A scalable full-stack e-commerce platform built with the MERN stack, TypeScript, Docker, Redis, Stripe, Firebase, Cloudinary, and more.

React.jsNext.jsNode.js+9 more

Agentic AI Researcher App

AI & Multimodal

Agentic AI Researcher App

September 2025

An AI-powered research assistant that searches arXiv, reads papers, generates insights, and writes new research in LaTeX PDF format.

LangGraphLangChainGoogle GenAI+4 more