DeepSeek Installer

Newest version: May 3rd, 2026

Run DeepSeek on Your Mac

One-click offline install via Ollama. Your data never leaves your Mac.

Selected model: DeepSeek-R1 7B (~4.7 GB  ·  16 GB RAM)

Download install script
How to run the script
  1. Click the download button above to save the .sh file.
  2. Open Terminal — press ⌘ Space, type "Terminal", press Return.
  3. Run: chmod +x ~/Downloads/install-deepseek-*.sh && ~/Downloads/install-deepseek-*.sh
  4. The script installs Ollama and downloads your model automatically.
  5. Once done, start chatting: ollama run deepseek-r1:7b

What is DeepSeek?

DeepSeek is a family of open-weight large language models developed by DeepSeek AI. The R1 series specialises in chain-of-thought reasoning and is reported to outperform many closed models on benchmarks such as MATH and AIME. The Coder series is fine-tuned for code generation, completion, and debugging.

Running DeepSeek locally via Ollama means your data never leaves your machine, there are no API costs, and the model works completely offline once downloaded.

System requirements

Model                  Minimum RAM  Storage  Chip
DeepSeek-R1 1.5B       8 GB         2 GB     Any Mac (2019+)
DeepSeek-R1 7B         16 GB        6 GB     M1 / Intel i7+
DeepSeek-R1 14B        16 GB        11 GB    M1 Pro / M2
DeepSeek Coder 6.7B    8 GB         5 GB     Any Mac (2019+)
DeepSeek Coder V2 16B  16 GB        11 GB    M1 Pro / M2
DeepSeek-R1 32B        32 GB        22 GB    M2 Max / M3 Pro+

macOS version: 11 (Big Sur) or later. Apple Silicon Macs run models natively via Metal GPU acceleration.
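
The table above can also be read as a simple lookup. The sketch below is illustrative only: `MIN_RAM_GB` copies the Minimum RAM column, and `models_that_fit` is a hypothetical helper name, not part of Ollama or the install script.

```python
# Minimum RAM in GB, copied from the system requirements table above.
MIN_RAM_GB = {
    "DeepSeek-R1 1.5B": 8,
    "DeepSeek-R1 7B": 16,
    "DeepSeek-R1 14B": 16,
    "DeepSeek Coder 6.7B": 8,
    "DeepSeek Coder V2 16B": 16,
    "DeepSeek-R1 32B": 32,
}


def models_that_fit(ram_gb):
    """Return the models whose minimum RAM is at or below ram_gb."""
    return [name for name, need in MIN_RAM_GB.items() if need <= ram_gb]


# Example: list the models a 16 GB Mac meets the minimum for.
for name in models_that_fit(16):
    print(name)
```

Meeting the minimum does not guarantee comfortable performance; other running applications also compete for memory.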

Quick start

After running the install script, open Terminal and start a chat session:

ollama run deepseek-r1:7b

Type your message and press Return. To quit, type /bye or press Ctrl+D.

To list all downloaded models:

ollama list

To remove a model and free disk space:

ollama rm deepseek-r1:7b
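
The same model list is available programmatically from the local server. The following is a minimal sketch using only the Python standard library, assuming the server is reachable at its default address and that the `/api/tags` endpoint returns a `models` array with `name` and `size` fields. `parse_models` and `list_models` are hypothetical helper names, and the sample payload (including its size) is made up for illustration.

```python
import json
import urllib.request

TAGS_URL = "http://localhost:11434/api/tags"  # assumed default address


def parse_models(body):
    """Return (name, size in GB) pairs from an /api/tags JSON body."""
    models = json.loads(body).get("models", [])
    return [(m["name"], round(m["size"] / 1e9, 1)) for m in models]


def list_models():
    """Fetch and parse the model list; requires a running Ollama server."""
    with urllib.request.urlopen(TAGS_URL, timeout=10) as resp:
        return parse_models(resp.read())


# Illustrative payload shaped like the server's response (size is invented).
sample = '{"models": [{"name": "deepseek-r1:7b", "size": 4700000000}]}'
print(parse_models(sample))
```

With Ollama running, calling `list_models()` should return the same kind of list for the models actually on disk.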

Using from Python

Install the Ollama Python library:

pip install ollama

Basic chat example:

import ollama

response = ollama.chat(
    model="deepseek-r1:7b",
    messages=[
        {"role": "user", "content": "Explain quantum entanglement simply."}
    ]
)
print(response["message"]["content"])
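
R1 models usually write out their reasoning before the final answer. Assuming the reply text wraps that reasoning in `<think>…</think>` tags (this can vary with the model and the Ollama version), a small helper can separate the two. `split_reasoning` is a hypothetical name and the sample reply is invented for illustration.

```python
import re

# Assumption: reasoning appears inside <think>...</think> in the reply text.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer) from a reply that may contain <think> blocks."""
    reasoning = "\n".join(part.strip() for part in THINK_BLOCK.findall(text))
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer


# Invented sample reply; in practice pass response["message"]["content"].
sample = (
    "<think>Two particles share one state.</think>\n"
    "Entangled particles behave as a single system."
)
reasoning, answer = split_reasoning(sample)
print(answer)
```

If the reply contains no such tags, the helper returns an empty reasoning string and the text unchanged.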

Streaming responses:

import ollama

stream = ollama.chat(
    model="deepseek-r1:7b",
    messages=[{"role": "user", "content": "Write a haiku about rain."}],
    stream=True,
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)

Multi-turn conversation:

import ollama

history = []

def chat(user_msg):
    history.append({"role": "user", "content": user_msg})
    resp = ollama.chat(model="deepseek-r1:7b", messages=history)
    reply = resp["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

print(chat("What is the capital of Japan?"))
print(chat("What language do they speak there?"))
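
In the example above, `history` grows with every turn, while the model's context window is finite (`num_ctx`, 2048 tokens by default according to the parameter table in this guide). One simple way to bound it is to keep only the most recent messages. The sketch below counts messages rather than tokens, which is only a rough stand-in; `trim_history` and `max_messages` are hypothetical names.

```python
def trim_history(history, max_messages=20):
    """Keep a leading system message (if any) plus the newest max_messages others."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    system = [m for m in history[:1] if m["role"] == "system"]
    rest = history[len(system):]
    return system + rest[-max_messages:]


# Example: a system prompt followed by five user messages, trimmed to two.
demo = [{"role": "system", "content": "Be brief."}]
demo += [{"role": "user", "content": str(i)} for i in range(5)]
for message in trim_history(demo, max_messages=2):
    print(message["role"], message["content"])
```

Passing `trim_history(history)` as the `messages` argument instead of `history` would apply the cap on each call.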

OpenAI-compatible REST API

Ollama exposes an OpenAI-compatible API under http://localhost:11434/v1. Any tool that supports the OpenAI API can point to it.

Chat completion with curl:

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1:7b",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'

With the official OpenAI Python SDK:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # required but ignored
)

response = client.chat.completions.create(
    model="deepseek-r1:7b",
    messages=[{"role": "user", "content": "Summarise the Apollo 11 mission."}],
)
print(response.choices[0].message.content)
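
If you would rather not install an SDK, the same endpoint can be called with the Python standard library. This is a minimal sketch that mirrors the curl request above; `build_chat_request`, `extract_reply`, and `ask` are hypothetical helper names, and the 120-second timeout is an arbitrary choice.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11434/v1"  # same base URL as the examples above


def build_chat_request(prompt, model="deepseek-r1:7b"):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Pull the assistant text out of a chat completions JSON body."""
    return json.loads(response_body)["choices"][0]["message"]["content"]


def ask(prompt):
    """Send the request and return the reply; requires a running Ollama server."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=120) as resp:
        return extract_reply(resp.read())
```

With the server running, `print(ask("Hello, how are you?"))` should behave like the curl command.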

Using from Node.js

Install the Ollama JS library:

npm install ollama

Basic usage:

import ollama from "ollama";

const response = await ollama.chat({
  model: "deepseek-r1:7b",
  messages: [{ role: "user", content: "What is 17 × 34?" }],
});

console.log(response.message.content);

Streaming in Node.js:

import ollama from "ollama";

const stream = await ollama.chat({
  model: "deepseek-r1:7b",
  messages: [{ role: "user", content: "Count to 10 slowly." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.message.content);
}

Custom system prompts (Modelfile)

Create a Modelfile to give your model a persistent system prompt or tweak parameters:

FROM deepseek-r1:7b

SYSTEM """
You are a senior software engineer. Always respond with clean,
commented code. Prefer Python unless asked otherwise.
"""

PARAMETER temperature 0.2
PARAMETER top_p 0.9

Build and run your custom model:

ollama create my-coder -f ./Modelfile
ollama run my-coder

Useful parameters:

Parameter       Default  Effect
temperature     0.8      Lower = more focused, higher = more creative
top_p           0.9      Nucleus sampling threshold
num_ctx         2048     Context window size (tokens)
num_predict     -1       Max tokens to generate (-1 = unlimited)
repeat_penalty  1.1      Penalise repetition
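
If you maintain several custom models, generating the Modelfile text from code can keep them consistent. The sketch below only builds the text in the FROM / SYSTEM / PARAMETER layout shown earlier; `render_modelfile` is a hypothetical helper, and the parameter values are arbitrary examples rather than recommendations.

```python
def render_modelfile(base, system_prompt, params):
    """Return Modelfile text with FROM, SYSTEM and PARAMETER lines."""
    lines = [f"FROM {base}", "", 'SYSTEM """', system_prompt.strip(), '"""', ""]
    lines += [f"PARAMETER {name} {value}" for name, value in params.items()]
    return "\n".join(lines) + "\n"


# Example values only; adjust to taste.
text = render_modelfile(
    "deepseek-r1:7b",
    "You are a senior software engineer.",
    {"temperature": 0.2, "num_ctx": 4096},
)
print(text)
```

Saving the printed text as `Modelfile` and running `ollama create my-coder -f ./Modelfile` would then build the model as described above.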

Troubleshooting

Ollama command not found after install
Close Terminal and reopen it, or run source ~/.zshrc. The installer adds Ollama to /usr/local/bin.
Model pull fails or stalls
Check your internet connection, then run ollama pull MODEL_ID again. The download resumes from where it stopped.
"Error: model requires more system memory"
Your Mac does not have enough RAM. Choose a smaller model (e.g. 1.5B or 7B) or close other applications to free memory.
Responses are very slow
On Intel Macs without a GPU, inference is CPU-only. Try a smaller model. Apple Silicon Macs use Metal GPU automatically for much faster speeds.
Port 11434 already in use
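Another process is using the Ollama port, often an Ollama instance that is already running. Run lsof -i :11434 to find it, or start the server on a different port: OLLAMA_HOST=127.0.0.1:11435 ollama serve. Binding to 127.0.0.1 keeps the server reachable only from your Mac, whereas 0.0.0.0 would expose it to other devices on your network.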
How do I update Ollama?
Re-run the install script or download the latest version from ollama.com. Your downloaded models are preserved.
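
Two of the checks above (whether the `ollama` command is on your PATH, and whether port 11434 is taken) can be scripted. The following is a rough diagnostic sketch using only the Python standard library; `ollama_on_path` and `port_in_use` are hypothetical helper names, and the one-second timeout is an arbitrary choice.

```python
import shutil
import socket


def ollama_on_path():
    """Return the full path of the ollama executable, or None if it is not on PATH."""
    return shutil.which("ollama")


def port_in_use(port=11434, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        return sock.connect_ex((host, port)) == 0


print("ollama binary:", ollama_on_path() or "not found on PATH")
print("port 11434 in use:", port_in_use())
```

A port that is in use is expected when Ollama itself is running; it only signals a conflict if `ollama serve` refuses to start.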
