How to Run and Deploy LLM Apps on Your Windows Laptop (With or Without GPU) using Ollama and Docker
- Chandan Kumar
- Apr 10
- 2 min read
In this guide, we will walk you through how to easily deploy a simple chatbot application using Ollama on your Windows laptop. Whether or not you have a GPU, this step-by-step process will help you set up a local environment and start experimenting with Large Language Models (LLMs) like Llama 3.2.
Features:
Interactive Chatbot Interface: A simple UI built with OpenWebUI for easy user interaction.
Ollama Backend Integration: Connects to the Ollama backend server for generating LLM responses.
Easy Configuration: Set up the Ollama backend server IP using environment variables for seamless deployment.
Supports both GPU and CPU: Instructions for running the application on laptops with or without GPU support.
Step-by-Step Guide for Running LLM Apps on Your Laptop (Windows)
Install Docker Desktop and Check GPU Support
Before we dive into setting up the application, ensure that Docker Desktop is installed and running on your Windows laptop. Docker will be used to containerize the application, making the deployment process much simpler.
Download Docker Desktop: Install Docker Desktop from the official Docker website if you haven't already.
Check Docker Status: Open a terminal (PowerShell or Command Prompt) and run:
docker ps
(No sudo is needed on Windows; on a fresh install the container list will simply be empty.)
For GPU users:
Ensure the NVIDIA driver is installed (nvidia-smi ships with it). You can check GPU status by running:
nvidia-smi.exe
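To confirm that Docker itself can see the GPU (not just Windows), a quick sanity check is to run nvidia-smi inside a throwaway container. This assumes Docker Desktop's WSL 2 backend with NVIDIA GPU support enabled:
docker run --rm --gpus=all ubuntu nvidia-smi
If this prints the same GPU table you saw on the host, the --gpus flag will work for the Ollama container as well.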
Download and Run Ollama with GPU Support
Now, let's pull the Ollama Docker image with GPU support.
Run the following command to launch the Ollama container:
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
This will download the Ollama image and start the container with GPU support.
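If your laptop has no supported GPU, the same image runs on the CPU; a minimal sketch is the same command with the --gpus flag dropped:
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
Responses will be slower, but every other step in this guide stays the same.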
Download the Llama 3.2 Model
Once the container is running, you can download the Llama 3.2 model:
docker exec -it ollama ollama run llama3.2
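If you would rather fetch the model first and chat later, Ollama's pull and list subcommands also work inside the container:
docker exec -it ollama ollama pull llama3.2
docker exec -it ollama ollama list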
Now, interact with your Llama model by typing:
>>> Hello how are you?
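When you are done chatting, type Ollama's /bye command at the prompt to leave the interactive session:
>>> /bye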
Run OpenWebUI with Ollama Container IP
You’re now ready to run OpenWebUI and interact with your LLM in a browser. OpenWebUI needs the Ollama container's IP address so it can reach the backend:
Find the container IP using:
docker inspect ollama | findstr "IP"
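findstr may match several unrelated lines; if you only want the address, docker inspect's format flag prints it directly (assuming the container is on Docker's default bridge network):
docker inspect -f "{{ .NetworkSettings.IPAddress }}" ollama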
Use the IP found to set up OpenWebUI:
docker run -d -p 3000:8080 -e WEBUI_AUTH=False -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://<ollama-container-ip>:11434 --name open-webui ghcr.io/open-webui/open-webui:main
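Alternatively, because the Ollama container already publishes port 11434 to the host, on Docker Desktop you can skip the IP lookup and point OpenWebUI at the host.docker.internal alias instead. A sketch assuming Docker Desktop's default networking:
docker run -d -p 3000:8080 -e WEBUI_AUTH=False -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://host.docker.internal:11434 --name open-webui ghcr.io/open-webui/open-webui:main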
Once running, access your chatbot at http://localhost:3000 (host port 3000 is mapped to OpenWebUI's internal port 8080).
Troubleshooting
Environment Variables: Ensure that OLLAMA_BASE_URL points at the Ollama container's IP address and port 11434.
Port Issues: Make sure port 11434 is open and not blocked by any firewall.
GPU Not Detected: If your GPU is not being used, check whether nvidia-smi works inside the container (see the command below).
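A quick way to run that check from the host, assuming the container is named ollama as above:
docker exec -it ollama nvidia-smi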
Conclusion
Congratulations! You've successfully set up a personal chatbot using Ollama and Llama 3.2 on your Windows laptop. Whether you're running this with a GPU or not, this tutorial helps you deploy LLM apps easily, making it a great starting point for anyone looking to explore the power of large language models on their personal machine.