March 2025
A multimodal AI-powered voicebot for healthcare that combines speech recognition, image analysis, and TTS for real-time medical insights.
This project features a voice-activated AI medical assistant that listens to voice input, transcribes it using Whisper-large-v3 via GROQ API, analyzes uploaded medical images with LLaMA-3.2 Vision, generates short professional diagnoses, and speaks the result using ElevenLabs or GTTS. Built in Python with a Gradio UI, it blends audio, vision, and NLP to create a seamless diagnostic experience.
Real-time transcription using Whisper-large-v3 via GROQ API.
Upload and analyze medical images using a powerful vision model.
Generates short, professional responses tailored for healthcare.
Converts AI responses into natural voice with ElevenLabs or GTTS.
You might also be interested in these projects.
July 2025
A full-featured educational Q&A platform with AI moderation, personalized feeds, Trails, and admin/moderator control.
December 2024
A scalable full-stack e-commerce platform built with the MERN stack, TypeScript, Docker, Redis, Stripe, Firebase, Cloudinary, and more.
September 2025
An AI-powered research assistant that searches arXiv, reads papers, generates insights, and writes new research in LaTeX PDF format.