DeepSeek Installer

Run DeepSeek on Your Mac
One-click offline install via Ollama. Your data never leaves your Mac.
Selected model: DeepSeek-R1 7B
Newest version: May 3rd 2026

How to run the script
chmod +x ~/Downloads/install-deepseek-*.sh && ~/Downloads/install-deepseek-*.sh
ollama run deepseek-r1:7b

What is DeepSeek?
DeepSeek is a family of open-weight large language models developed by DeepSeek AI. The R1 series specialises in chain-of-thought reasoning and outperforms many closed models on benchmarks like MATH and AIME. The Coder series is fine-tuned for code generation, completion, and debugging.

Running DeepSeek locally via Ollama means your data never leaves your machine, there are no API costs, and the model works completely offline once downloaded.

System requirements
Model                  Minimum RAM  Storage  Chip
DeepSeek-R1 1.5B       8 GB         2 GB     Any Mac (2019+)
DeepSeek-R1 7B         16 GB        6 GB     M1 / Intel i7+
DeepSeek-R1 14B        16 GB        11 GB    M1 Pro / M2
DeepSeek Coder 6.7B    8 GB         5 GB     Any Mac (2019+)
DeepSeek Coder V2 16B  16 GB        11 GB    M1 Pro / M2
DeepSeek-R1 32B        32 GB        22 GB    M2 Max / M3 Pro+

macOS version: 11 (Big Sur) or later. Apple Silicon Macs run models natively via Metal GPU acceleration.
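
The requirements table can also be read programmatically. The sketch below is only an illustration: it copies the RAM and storage figures from the table and picks the largest R1 model that fits a given machine. The names ModelSpec, models_that_fit and largest_r1_that_fits are hypothetical helpers, not part of Ollama or the installer, and the chip column is ignored.

```python
from typing import List, NamedTuple, Optional


class ModelSpec(NamedTuple):
    name: str         # "Model" column
    min_ram_gb: int   # "Minimum RAM" column
    storage_gb: int   # "Storage" column


# Rows copied from the system requirements table (chip column omitted).
MODELS = [
    ModelSpec("DeepSeek-R1 1.5B", 8, 2),
    ModelSpec("DeepSeek-R1 7B", 16, 6),
    ModelSpec("DeepSeek-R1 14B", 16, 11),
    ModelSpec("DeepSeek Coder 6.7B", 8, 5),
    ModelSpec("DeepSeek Coder V2 16B", 16, 11),
    ModelSpec("DeepSeek-R1 32B", 32, 22),
]


def models_that_fit(ram_gb: int, free_disk_gb: int) -> List[ModelSpec]:
    """Return every model whose RAM and storage needs are met."""
    return [
        m for m in MODELS
        if m.min_ram_gb <= ram_gb and m.storage_gb <= free_disk_gb
    ]


def largest_r1_that_fits(ram_gb: int, free_disk_gb: int) -> Optional[ModelSpec]:
    """Pick the biggest R1 model (by download size) that fits, or None."""
    candidates = [
        m for m in models_that_fit(ram_gb, free_disk_gb)
        if m.name.startswith("DeepSeek-R1")
    ]
    return max(candidates, key=lambda m: m.storage_gb, default=None)
```

For example, largest_r1_that_fits(16, 100) selects the 14B row, because both the 7B and 14B models list 16 GB as their minimum and the 14B download is larger.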
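
Quick start
After running the install script, open Terminal and start a chat session:

ollama run deepseek-r1:7b

Type your message and press Return. To quit, type /bye or press Ctrl+D.

To list all downloaded models:

ollama list

To remove a model and free disk space:

ollama rm deepseek-r1:7b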

Using from Python
Install the Ollama Python library:

pip install ollama

Basic chat example:

import ollama

response = ollama.chat(
    model="deepseek-r1:7b",
    messages=[
        {"role": "user", "content": "Explain quantum entanglement simply."}
    ]
)
print(response["message"]["content"])

Streaming responses:

import ollama

stream = ollama.chat(
    model="deepseek-r1:7b",
    messages=[{"role": "user", "content": "Write a haiku about rain."}],
    stream=True,
)
# Print each chunk as soon as it arrives.
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)

Multi-turn conversation:

import ollama

# The full message history is sent on every call so the model keeps context.
history = []

def chat(user_msg):
    history.append({"role": "user", "content": user_msg})
    resp = ollama.chat(model="deepseek-r1:7b", messages=history)
    reply = resp["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

print(chat("What is the capital of Japan?"))
print(chat("What language do they speak there?"))
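
Because the R1 series reasons step by step, a reply may carry its chain of thought alongside the final answer. The sketch below assumes the reasoning arrives inline in the message content, wrapped in <think>...</think> tags. Whether it does depends on the model and the Ollama version, so treat that as an assumption to check against your own output. split_reasoning is a hypothetical helper, not part of the ollama library.

```python
import re
from typing import Tuple

# Assumption: reasoning is wrapped in <think>...</think> inside the content.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(content: str) -> Tuple[str, str]:
    """Split a reply into (reasoning, answer).

    If no <think> block is present, reasoning is an empty string and
    the answer is the content with surrounding whitespace removed.
    """
    reasoning = "\n".join(part.strip() for part in _THINK_RE.findall(content))
    answer = _THINK_RE.sub("", content).strip()
    return reasoning, answer
```

With the chat helper above, this could be used as reasoning, answer = split_reasoning(reply), storing only answer in the history to keep later prompts short.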

OpenAI-compatible REST API
Ollama exposes an OpenAI-compatible API on http://localhost:11434. Any tool that supports the OpenAI API can point to it.

Chat completion with curl:

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1:7b",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'

With the official OpenAI Python SDK:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # required but ignored
)
response = client.chat.completions.create(
    model="deepseek-r1:7b",
    messages=[{"role": "user", "content": "Summarise the Apollo 11 mission."}],
)
print(response.choices[0].message.content)
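
If you would rather not install an SDK, the same endpoint can be reached with the Python standard library alone. This is a minimal sketch mirroring the curl call above. build_chat_request, extract_reply and send_chat are hypothetical helper names, and the response layout is assumed to match the OpenAI chat completions format that the SDK example reads.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11434/v1"  # endpoint used in the examples above


def build_chat_request(prompt: str, model: str = "deepseek-r1:7b",
                       base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for /chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(body: dict) -> str:
    """Pull the assistant text out of an OpenAI-style response body."""
    return body["choices"][0]["message"]["content"]


def send_chat(prompt: str, **kwargs) -> str:
    """Send the request; needs a running Ollama server to succeed."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        return extract_reply(json.load(resp))
```

Calling send_chat("Hello, how are you?") only works while Ollama is running locally. The request building and reply parsing are kept separate so they can be checked without a server.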

Using from Node.js
Install the Ollama JS library:

npm install ollama

Basic usage:

import ollama from "ollama";

const response = await ollama.chat({
  model: "deepseek-r1:7b",
  messages: [{ role: "user", content: "What is 17 × 34?" }],
});
console.log(response.message.content);

Streaming in Node.js:

import ollama from "ollama";

const stream = await ollama.chat({
  model: "deepseek-r1:7b",
  messages: [{ role: "user", content: "Count to 10 slowly." }],
  stream: true,
});
// Write each chunk to stdout as it arrives.
for await (const chunk of stream) {
  process.stdout.write(chunk.message.content);
}

Custom system prompts (Modelfile)
Create a Modelfile to give your model a persistent system prompt or tweak parameters:

FROM deepseek-r1:7b
SYSTEM """
You are a senior software engineer. Always respond with clean,
commented code. Prefer Python unless asked otherwise.
"""
PARAMETER temperature 0.2
PARAMETER top_p 0.9

Build and run your custom model:

ollama create my-coder -f ./Modelfile
ollama run my-coder

Useful parameters:

Parameter       Default  Effect
temperature     0.8      Lower = more focused, higher = more creative
top_p           0.9      Nucleus sampling threshold
num_ctx         2048     Context window size (tokens)
num_predict     -1       Max tokens to generate (-1 = unlimited)
repeat_penalty  1.1      Penalise repetition
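
If you maintain several custom models, generating the Modelfile from a script can keep them consistent. The sketch below is one possible approach, using only the FROM, SYSTEM and PARAMETER instructions shown above. render_modelfile is a hypothetical helper, and it does not validate parameter names or values against Ollama.

```python
from typing import Mapping


def render_modelfile(base_model: str, system_prompt: str,
                     parameters: Mapping[str, object]) -> str:
    """Render Modelfile text from a base model, system prompt and parameters."""
    if '"""' in system_prompt:
        # A triple quote inside the prompt would end the SYSTEM block early.
        raise ValueError('system prompt must not contain """')
    lines = [f"FROM {base_model}", 'SYSTEM """', system_prompt.strip(), '"""']
    # One PARAMETER line per entry, in the order given.
    lines += [f"PARAMETER {name} {value}" for name, value in parameters.items()]
    return "\n".join(lines) + "\n"
```

Writing the result to ./Modelfile and then running ollama create my-coder -f ./Modelfile reproduces the manual steps above.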
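
Troubleshooting
If the ollama command is not found: run source ~/.zshrc. The installer adds Ollama to /usr/local/bin.

If a model download is interrupted: run ollama pull MODEL_ID again; it resumes from where it stopped.

If port 11434 is already in use: run lsof -i :11434 to find the process using it, or set a custom port: OLLAMA_HOST=0.0.0.0:11435 ollama serve.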
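
A quick way to tell whether anything is listening on port 11434 from a script is to attempt a TCP connection. This is a rough sketch using only the standard library. port_in_use is a hypothetical helper, and a successful connection shows only that some process holds the port, not that the process is Ollama.

```python
import socket


def port_in_use(port: int = 11434, host: str = "127.0.0.1",
                timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

If port_in_use() returns True before you have started Ollama yourself, lsof -i :11434 shows which process holds the port.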