This is an automated archive made by the Lemmit Bot.

The original was posted on /r/selfhosted by /u/yoracale on 2025-01-31 18:03:16+00:00.


Hey guys! We previously wrote that you can run R1 locally, but many of you were asking how. Our guide was a bit technical, so we at Unsloth collaborated with Open WebUI (a lovely chat UI interface) to create this beginner-friendly, step-by-step guide for running the full DeepSeek-R1 Dynamic 1.58-bit model locally.

This guide is a summary, so I highly recommend you read the full guide (with pics) here:

  • You don't need a GPU to run this model, but having one will make it faster, especially one with at least 24GB of VRAM.
  • Aim for RAM + VRAM totaling 80GB+ to get decent tokens/s.

To Run DeepSeek-R1:

1. Install Llama.cpp

  • Download prebuilt binaries or build from source following this guide.
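
If you do build from source, the build itself is just a CMake clone-and-compile. Here's a minimal sketch (assuming git and CMake are installed; the -DGGML_CUDA=ON flag is only needed if you want NVIDIA GPU offloading, and the linked guide covers other backends):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build                          # add -DGGML_CUDA=ON here for CUDA builds
cmake --build build --config Release
# llama-server and the other binaries end up in llama.cpp/build/bin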

2. Download the Model (1.58-bit, 131GB) from Unsloth

  • Get the model from Hugging Face.
  • Use Python to download it programmatically:
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="unsloth/DeepSeek-R1-GGUF",   # Unsloth's DeepSeek-R1 GGUF repo on Hugging Face
    local_dir="DeepSeek-R1-GGUF",         # where to save the files locally
    allow_patterns=["*UD-IQ1_S*"],        # only download the 1.58-bit (UD-IQ1_S) shards
)

  • Once the download completes, you’ll find the model files in a directory structure like this:
DeepSeek-R1-GGUF/
├── DeepSeek-R1-UD-IQ1_S/
│   ├── DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf
│   ├── DeepSeek-R1-UD-IQ1_S-00002-of-00003.gguf
│   ├── DeepSeek-R1-UD-IQ1_S-00003-of-00003.gguf

  • Ensure you know the path where the files are stored.
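
If you prefer the command line over Python, the huggingface_hub CLI can do roughly the same download (a sketch, assuming you've installed it with pip install huggingface_hub):

huggingface-cli download unsloth/DeepSeek-R1-GGUF \
    --include "*UD-IQ1_S*" \
    --local-dir DeepSeek-R1-GGUF
# Only the 1.58-bit (UD-IQ1_S) shards are downloaded into ./DeepSeek-R1-GGUF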

3. Install and Run Open WebUI

  • This is what Open WebUI looks like running R1 (see the screenshot in the full guide).

  • If you don’t already have it installed, no worries! It’s a simple setup. Just follow the Open WebUI docs here:

  • Once installed, start the application - we’ll connect it in a later step to interact with the DeepSeek-R1 model.
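
For reference, here's a rough sketch of the two common install routes from the Open WebUI docs (ports are the usual defaults; double-check against the docs if yours differ):

# Option A: pip (the docs recommend Python 3.11)
pip install open-webui
open-webui serve                      # UI is then typically at http://localhost:8080

# Option B: Docker
docker run -d -p 3000:8080 \
    --add-host=host.docker.internal:host-gateway \
    -v open-webui:/app/backend/data \
    --name open-webui \
    ghcr.io/open-webui/open-webui:main
# UI at http://localhost:3000; from inside the container, a llama.cpp server on the
# host is reachable at http://host.docker.internal:10000 rather than 127.0.0.1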

4. Start the Model Server with Llama.cpp

Now that the model is downloaded, the next step is to run it using Llama.cpp’s server mode.

🛠️ Before You Begin:

  1. Locate the llama-server binary. If you built Llama.cpp from source, the llama-server executable is located in llama.cpp/build/bin. Navigate to this directory using:

     cd [path-to-llama-cpp]/llama.cpp/build/bin

     Replace [path-to-llama-cpp] with your actual Llama.cpp directory. For example:

     cd ~/Documents/workspace/llama.cpp/build/bin

  2. Point to your model folder. Use the full path to the downloaded GGUF files. When starting the server, specify only the first file of the split GGUF set (e.g., DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf); Llama.cpp will pick up the remaining shards automatically.

🚀 Start the Server

Run the following command:

./llama-server \
    --model /[your-directory]/DeepSeek-R1-GGUF/DeepSeek-R1-UD-IQ1_S/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \
    --port 10000 \
    --ctx-size 1024 \
    --n-gpu-layers 40

Example (If Your Model is in /Users/tim/Documents/workspace):

./llama-server \
    --model /Users/tim/Documents/workspace/DeepSeek-R1-GGUF/DeepSeek-R1-UD-IQ1_S/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \
    --port 10000 \
    --ctx-size 1024 \
    --n-gpu-layers 40

✅ Once running, the server will be available at:

http://127.0.0.1:10000/

🖥️ Llama.cpp Server Running

After running the command, you should see a message confirming the server is active and listening on port 10000.
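
As an optional sanity check before wiring up the UI, you can poke the server directly with curl - this is just a sketch using llama.cpp's built-in health and OpenAI-compatible endpoints:

# Should return a small JSON status once the model has finished loading
curl http://127.0.0.1:10000/health

# One-off chat completion against the same server
curl http://127.0.0.1:10000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages": [{"role": "user", "content": "Hello!"}]}'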

5. Connect Llama.cpp to Open WebUI

  1. Open Admin Settings in Open WebUI.
  2. Go to Connections > OpenAI Connections.
  3. Add the following details:
     • URL → http://127.0.0.1:10000/v1 (the llama.cpp server's OpenAI-compatible endpoint from step 4)
     • Key → none

(Screenshot in the full guide: adding the connection in Open WebUI.)
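
If no model shows up after adding the connection, it's worth confirming that the URL you entered really serves the OpenAI-compatible API. A quick check (again, just a sketch) is to list the models at the same base URL:

# Should list the loaded DeepSeek-R1 GGUF under "data"
curl http://127.0.0.1:10000/v1/models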

If you have any questions please let us know - and any suggestions are welcome too! Happy running folks! :)
