Ollama Run Command Options
ollama run <model> [prompt] [flags]
What: Run a model, either interactively or with a one-shot prompt.
Why: The starting point for chatting with a model and for tuning its behavior, such as temperature, output length, and stop sequences.
How: Pass the model name after run. Note that sampling parameters are not CLI flags (there is no --temperature flag); set them inside the interactive session with /set commands, in a Modelfile, or through the API. Run ollama run --help to see the flags the CLI itself accepts.
Example:
ollama run llama2
>>> /set parameter temperature 0.5
>>> /set parameter num_predict 100
temperature
What: Sets the randomness of model output (typically between 0 and 1).
Why: Lower values make output more deterministic; higher values increase creativity.
How: Set it inside an ollama run session with /set parameter, or persist it with a PARAMETER line in a Modelfile.
Example:
ollama run llama2
>>> /set parameter temperature 0.7
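Changes made with /set last only for the current session. Inside the REPL, the /save command can persist them as a new model (the name mymodel here is illustrative):

```
>>> /set parameter temperature 0.7
>>> /save mymodel
```

Afterward, ollama run mymodel starts with that temperature already applied.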
num_predict
What: Limits the maximum number of tokens generated (Ollama's equivalent of max_tokens).
Why: Controls the length of model output and prevents overly long answers.
How: Set it with /set parameter num_predict in a session, or PARAMETER num_predict in a Modelfile.
Example:
ollama run llama2
>>> /set parameter num_predict 150
stop
What: Defines a stop sequence that ends generation.
Why: Useful for controlling output boundaries, e.g. cutting off at a delimiter.
How: Provide the string at which the model should stop generating.
Example:
ollama run llama2
>>> /set parameter stop "###"
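To make these parameters stick across sessions, they can be baked into a custom model with a Modelfile. A minimal sketch (the file name Modelfile and model name mymodel are arbitrary):

```
FROM llama2
PARAMETER temperature 0.7
PARAMETER num_predict 150
PARAMETER stop "###"
```

Build and run it with ollama create mymodel -f Modelfile, then ollama run mymodel.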
system
What: Overrides the model's system prompt for the current session.
Why: Customize the assistant's behavior for a single run without editing the Modelfile.
How: Use /set system inside the session and pass a string describing the assistant's role.
Example:
ollama run llama2
>>> /set system "You are an expert assistant."
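The same parameters are exposed through Ollama's REST API (POST /api/generate on localhost:11434), where they go in the options object. A sketch of building that request body in Python; the prompt text is made up for illustration:

```python
import json

# Request body for Ollama's /api/generate endpoint.
payload = {
    "model": "llama2",
    "prompt": "Summarize the water cycle.",
    "system": "You are an expert assistant.",  # overrides the system prompt
    "stream": False,                           # return one complete response
    "options": {
        "temperature": 0.7,   # randomness
        "num_predict": 150,   # max tokens to generate
        "stop": ["###"],      # stop sequence(s)
    },
}

body = json.dumps(payload)
print(body)
# To send it (requires a running Ollama server), POST `body` to
# http://localhost:11434/api/generate
```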
Model argument
What: The model to run is a positional argument; there is no --model flag.
Why: ollama run needs a model name, and an optional :tag selects a specific variant.
How: Put the name directly after run, adding :tag to override the default (latest) tag.
Example:
ollama run llama2:13b