Lab2: AI Food Analysis App with Gemini
Introduction
In this lab, you will build an AI-powered food analysis application using Google Gemini and the Mini Pupper’s camera. The app captures images of food and uses Gemini’s vision capabilities to analyze nutritional content, estimate calories, and provide health recommendations.
Prerequisites
- Completed Lab1 (Gemini API Setup)
- Camera connected to Mini Pupper
- Jupyter Notebook installed
Part 1: Quick Start - Test Gemini Vision
Before building the full app, let’s test Gemini Vision with a simple notebook.
Step 1: Clone Mangdang Repository
git clone https://github.com/lbaitemple/mangdang
cd mangdang/gemini
Step 2: Open food.ipynb
- Open Jupyter Lab: http://<robot-ip>:8888
- Navigate to mangdang/gemini/
- Open food.ipynb
Step 3: Test Vision
- Place a food item in front of the camera
- Run the notebook cells
- Modify the prompt to test different analyses

Part 2: Setup for Custom App
Install Required Libraries
Create a requirements.txt file:
google-generativeai
python-dotenv
opencv-python
ipywidgets
Pillow
langchain-google-vertexai
Install the dependencies:
pip install -r requirements.txt
Configure Credentials
Create a .env file with your API credentials:
# Copy the sample env file
cp env.sample .env
# Edit the .env file
nano .env
Add your API key path:
API_KEY_PATH=/path/to/your/credentials.json
Understanding the Code
Load Credentials
from dotenv import load_dotenv
import os
import google_api
from PIL import Image

load_dotenv(dotenv_path='./.env')
api_path = os.environ.get('API_KEY_PATH', '')
if os.path.exists(api_path):
    google_api.init_credentials(api_path)
This code:
- Loads environment variables from the .env file
- Gets the API key path
- Initializes Google API credentials
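The `google_api` module comes from the Mangdang repository, so its internals aren't shown here. For intuition, a minimal stand-in for `init_credentials` might simply point Google's client libraries at the service-account file via the standard `GOOGLE_APPLICATION_CREDENTIALS` environment variable (a hypothetical sketch, not the repo's actual implementation):

```python
# Hypothetical stand-in for google_api.init_credentials -- the real helper
# lives in the Mangdang repo. Vertex AI client libraries read this
# environment variable to locate service-account credentials.
import os

def init_credentials(api_path):
    """Point Google client libraries at a service-account JSON file."""
    os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = api_path

init_credentials('/path/to/your/credentials.json')
```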
Part 3: Complete Food Analysis App
Full Jupyter Notebook Code
# Import Required Libraries
import os, io
import json
import cv2
from google_api import ai_image_response
import ipywidgets as widgets
from IPython.display import display, clear_output
from threading import Thread
from dotenv import load_dotenv
from langchain_google_vertexai import ChatVertexAI
from PIL import Image

def get_gemini_response(input_prompt, image):
    """Send image and prompt to Gemini for analysis"""
    model = ChatVertexAI(
        model_name='gemini-2.0-flash',
        convert_system_message_to_human=True,
    )
    response = ai_image_response(model, image=image, text=input_prompt)
    return response

# Create Widgets
input_prompt_widget = widgets.Textarea(
    value="""You are an expert nutritionist. Identify the food items in the image
and calculate the total calories. Provide the details of every food item, with its
calorie count, in the format below:
1. Item 1 - no of calories
2. Item 2 - no of calories
----
----
Finally, mention whether the food is healthy or not, and give the percentage split
of carbohydrates, fats, fibers, sugar, and other things required in our diet.""",
    placeholder='Type your input prompt here',
    description='Prompt:',
    layout={'width': '600px', 'height': '200px'}
)

analysis_button = widgets.Button(description='Analyze')
stop_camera_button = widgets.Button(description='Stop Camera')
clear_button = widgets.Button(description='Clear')

output_label = widgets.Textarea(
    value='',
    placeholder='The analysis result will be displayed here...',
    description='Result:',
    layout={'width': '600px', 'height': '200px'},
    disabled=True  # Make the output label read-only
)

camera_view = widgets.Image()

# Initialize camera
cap = cv2.VideoCapture(0)  # Use 0 for the default camera

def update_camera_view():
    """Continuously update camera feed in widget"""
    while True:
        ret, frame = cap.read()
        if ret:
            _, buffer = cv2.imencode('.jpg', frame)
            camera_view.value = buffer.tobytes()

def analyze_image(b):
    """Capture image and send to Gemini for analysis"""
    input_prompt = input_prompt_widget.value
    ret, frame = cap.read()
    if not ret:
        output_label.value = "Failed to capture image"
        return
    # Convert the captured frame from BGR (OpenCV) to RGB (PIL)
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    pil_image = Image.fromarray(frame_rgb)
    output_label.value = 'Analyzing... Please wait.'
    response = get_gemini_response(input_prompt, pil_image)
    # Display response in output label
    output_label.value = f"Food Details:\n{response}"

def clear_analyze(b):
    """Clear the output"""
    output_label.value = ""

# Attach Button Handlers
analysis_button.on_click(analyze_image)
clear_button.on_click(clear_analyze)

# Display Layout
display(widgets.HBox([camera_view, input_prompt_widget]))
display(widgets.HBox([analysis_button, stop_camera_button, clear_button]))
display(output_label)

# Start the camera feed in a separate thread
camera_thread = Thread(target=update_camera_view)
camera_thread.daemon = True
camera_thread.start()
Part 4: How It Works
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Jupyter Notebook │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │
│ │ Camera │───►│ OpenCV │───►│ Camera Widget │ │
│ │ Thread │ │ Capture │ │ (Live View) │ │
│ └──────────────┘ └──────────────┘ └──────────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │
│ │ Analyze │───►│ Gemini API │───►│ Output Widget │ │
│ │ Button │ │ (Vision) │ │ (Results) │ │
│ └──────────────┘ └──────────────┘ └──────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Key Components
| Component | Purpose |
|---|---|
| `camera_view` | Displays live camera feed |
| `input_prompt_widget` | Customizable prompt for Gemini |
| `analysis_button` | Triggers image capture and analysis |
| `output_label` | Shows Gemini's response |
| `camera_thread` | Background thread for camera updates |
Part 5: Running the App
Step 1: Start Jupyter Notebook
cd ~/mangdang/gemini
jupyter notebook --ip=0.0.0.0 --no-browser
Step 2: Open the Notebook
Navigate to http://<minipupper-ip>:8888 and open food.ipynb
Step 3: Run the Cells
- Run the first cell to install dependencies (only needed once)
- Run the second cell to load credentials
- Run the third cell to start the app
Step 4: Analyze Food
- Point the camera at food
- Click “Analyze” button
- Wait for Gemini’s response
- View nutritional analysis in the output

Part 6: Customizing the Prompt
You can modify the prompt for different analysis types:
Calorie Counter
prompt = """Analyze this food image and provide:
1. List of food items
2. Estimated calories for each item
3. Total calories
4. Recommended portion size"""
Ingredient Identifier
prompt = """Identify all ingredients visible in this food image.
List them with estimated quantities."""
Diet Compatibility
prompt = """Analyze this food for dietary restrictions:
- Is it vegetarian/vegan?
- Is it gluten-free?
- Is it dairy-free?
- Allergen warnings"""
Recipe Suggestion
prompt = """Based on the ingredients visible in this image,
suggest a healthy recipe that could be made."""
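A convenient way to switch between these variants in the notebook is a small preset dictionary; assigning an entry to `input_prompt_widget.value` avoids retyping prompts between runs (a sketch; the preset names and shortened prompt texts are illustrative):

```python
# Prompt presets for the analysis app; pick one and assign it to
# input_prompt_widget.value before clicking Analyze.
PROMPTS = {
    'calories': 'Analyze this food image: list the food items, estimated '
                'calories per item, total calories, and portion advice.',
    'ingredients': 'Identify all ingredients visible in this food image, '
                   'with estimated quantities.',
    'diet': 'Analyze this food for dietary restrictions: vegetarian/vegan, '
            'gluten-free, dairy-free, and allergen warnings.',
    'recipe': 'Based on the ingredients visible, suggest a healthy recipe.',
}

def get_prompt(preset):
    """Return the chosen preset, falling back to the calorie counter."""
    return PROMPTS.get(preset, PROMPTS['calories'])

print(get_prompt('diet'))
```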
Part 7: Alternative Implementation (Simple Version)
If you don’t have the google_api module, use this simpler version:
import google.generativeai as genai
import cv2
from PIL import Image
import ipywidgets as widgets
from IPython.display import display
from threading import Thread
import os

# Configure Gemini
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
model = genai.GenerativeModel('gemini-2.0-flash')

# Widgets
camera_view = widgets.Image(format='jpeg', width=320, height=240)
output = widgets.Textarea(layout={'width': '600px', 'height': '200px'}, disabled=True)
analyze_btn = widgets.Button(description='Analyze Food')

cap = cv2.VideoCapture(0)
running = True

def update_camera():
    global running
    while running:
        ret, frame = cap.read()
        if ret:
            _, buffer = cv2.imencode('.jpg', frame)
            camera_view.value = buffer.tobytes()

def analyze(b):
    ret, frame = cap.read()
    if ret:
        output.value = "Analyzing..."
        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        pil_image = Image.fromarray(frame_rgb)
        prompt = """Analyze this food image and provide:
1. Food items identified
2. Estimated calories per item
3. Total calories
4. Health assessment"""
        response = model.generate_content([prompt, pil_image])
        output.value = response.text

analyze_btn.on_click(analyze)
display(camera_view, analyze_btn, output)
Thread(target=update_camera, daemon=True).start()
Exercises
Exercise 1: Add Stop Camera Button
Implement the stop camera functionality to properly release the camera resource.
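As a starting hint (a sketch, not the full solution): give the camera loop a flag to check, and have the stop handler clear it and release the device. The fake capture class below only stands in for `cv2.VideoCapture` so the sketch runs without a camera:

```python
# Exercise 1 hint: a stop flag the camera loop checks, plus cap.release().
class FakeCapture:
    """Stand-in for cv2.VideoCapture so this sketch runs anywhere."""
    def release(self):
        self.released = True

cap = FakeCapture()
running = True

def stop_camera(button=None):
    """Handler to wire up with stop_camera_button.on_click(stop_camera)."""
    global running
    running = False   # update_camera_view's while-loop should test this flag
    cap.release()     # frees the camera device for other processes

stop_camera()
print(running, cap.released)  # False True
```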
Exercise 2: Save Analysis History
Store each analysis result with timestamp and image for later review.
Exercise 3: Voice Output
Use text-to-speech to read the analysis results aloud.
Exercise 4: Multi-Language Support
Modify the prompt to get responses in different languages.
Troubleshooting
| Issue | Solution |
|---|---|
| Camera not found | Check `ls /dev/video*`, try a different index |
| API error | Verify credentials in the `.env` file |
| Slow response | Reduce image size before sending |
| Widget not displaying | Ensure `ipywidgets` is installed |
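For the "Slow response" row, downscaling the captured frame before handing it to Gemini cuts upload time with little effect on recognition; `Image.thumbnail` resizes in place while preserving aspect ratio (a sketch using a synthetic image in place of the captured frame):

```python
# Shrink the frame before sending it to Gemini to speed up the request.
from PIL import Image

img = Image.new('RGB', (1920, 1080))  # stand-in for the captured frame
img.thumbnail((640, 640))             # in-place resize, keeps aspect ratio
print(img.size)                       # (640, 360)
```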
Summary
In this lab, you learned:
- How to build an AI-powered food analysis app
- How to integrate camera capture with Gemini Vision
- How to use ipywidgets for interactive Jupyter apps
- How to customize prompts for different analysis types
- How to use threading for live camera feeds