Ollama Run Command Options

ollama run <model> [prompt] [flags]

What: Run a model, optionally tuning its generation parameters for the session.
Why: Customize model behavior such as temperature, maximum output length, and stop sequences.
How: The stock Ollama CLI does not accept sampling flags on the command line; set these parameters with /set parameter inside the interactive session, with PARAMETER lines in a Modelfile, or through the HTTP API. ollama run itself takes only a few flags such as --verbose and --format.

Example:

ollama run llama2
>>> /set parameter temperature 0.5
>>> /set parameter num_predict 100
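
The same parameters can be set in one non-interactive call through the HTTP API that the Ollama server exposes. A minimal sketch, assuming a local server on the default port 11434 and that llama2 has already been pulled; the prompt text is illustrative:

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false,
  "options": {
    "temperature": 0.5,
    "num_predict": 100
  }
}'
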
temperature <value>

What: Controls the randomness of the model's output.
Why: Lower values make output more deterministic; higher values increase creativity.
How: Set it with /set parameter temperature <value> in the interactive session, or with a PARAMETER temperature line in a Modelfile.

Example:

>>> /set parameter temperature 0.7
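
To make the setting persistent, bake it into a derived model with a Modelfile. A sketch; the file name Modelfile and the new model name llama2-creative are illustrative:

# Modelfile
FROM llama2
PARAMETER temperature 0.7

# build and run the derived model
ollama create llama2-creative -f Modelfile
ollama run llama2-creative
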
num_predict <number>

What: Limits the maximum number of tokens generated.
Why: Controls the length of the model's output and prevents overly long answers.
How: In Ollama this parameter is named num_predict rather than max_tokens; set it with /set parameter num_predict <number> in the session, or a PARAMETER num_predict line in a Modelfile.

Example:

>>> /set parameter num_predict 150
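
Over the HTTP API the limit goes in the options object. A sketch, again assuming the default local server; the prompt is illustrative:

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Summarize the plot of Hamlet.",
  "stream": false,
  "options": { "num_predict": 150 }
}'
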
stop <string>

What: Defines a stop sequence; generation ends as soon as the model emits it.
Why: Useful for controlling output boundaries, for example cutting generation at a delimiter.
How: Set it with /set parameter stop "<string>" in the session; a Modelfile can declare several PARAMETER stop lines to register multiple sequences.

Example:

>>> /set parameter stop "###"
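
In the HTTP API, stop takes an array, so several sequences can be supplied at once. A sketch assuming the default local server; the prompt and both stop strings are illustrative:

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "List three colors, then print ###",
  "stream": false,
  "options": { "stop": ["###", "END"] }
}'
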
system <string>

What: Overrides the model's system prompt for the current session.
Why: Customize the assistant's role or behavior for a single run without editing the Modelfile.
How: Use /set system "<prompt>" inside the interactive session; a SYSTEM instruction in a Modelfile makes the override permanent.

Example:

>>> /set system "You are an expert assistant."
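
The HTTP API accepts the override directly through the top-level system field of the request. A sketch assuming the default local server; the prompt is illustrative:

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "system": "You are an expert assistant.",
  "prompt": "Explain tokenization in one paragraph.",
  "stream": false
}'
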
<model>

What: The model to run, passed as the first positional argument; the stock CLI has no --model flag.
Why: Lets you pick a specific model, or a specific tag of it, instead of a default.
How: Put the model name, optionally with a :tag suffix, directly after run.

Example:

ollama run llama2:13b
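
If the model is not present locally, ollama run pulls it first; it can also be fetched and inspected explicitly. A short sketch using standard subcommands; the prompt text is illustrative:

# download a specific tag ahead of time
ollama pull llama2:13b
# show the models available locally
ollama list
# one-shot, non-interactive run with the prompt as an argument
ollama run llama2:13b "What is the capital of France?"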