DeepSeek
DeepSeek provides an OpenAI-compatible API for their language models, with specialized models for both general chat and advanced reasoning tasks. The DeepSeek provider is compatible with all the options provided by the OpenAI provider.
Setup
- Get an API key from the DeepSeek Platform
- Set the `DEEPSEEK_API_KEY` environment variable or specify `apiKey` in your config
Configuration
Basic configuration example:
```yaml
providers:
  - id: deepseek:deepseek-chat
    config:
      temperature: 0.7
      max_tokens: 4000
      apiKey: YOUR_DEEPSEEK_API_KEY
  - id: deepseek:deepseek-reasoner # Legacy alias for V4 Flash thinking mode
    config:
      max_tokens: 8000
```
Configuration Options
- `temperature`
- `max_tokens`
- `cost`, `inputCost`, `outputCost` - Override promptfoo's pricing estimates (`inputCost` and `outputCost` take precedence over `cost`)
- `top_p`, `presence_penalty`, `frequency_penalty`
- `stream`
- `showThinking` - Control whether reasoning content is included in the output (default: `true`; applies to the deepseek-reasoner model)
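A provider entry exercising several of these options might look like the following sketch (the values shown are illustrative, not recommendations):

```yaml
providers:
  - id: deepseek:deepseek-chat
    config:
      temperature: 0.7
      top_p: 0.9
      max_tokens: 4000
      stream: false
      # Override promptfoo's built-in pricing estimates (USD per token);
      # inputCost/outputCost take precedence over cost
      inputCost: 0.00000014
      outputCost: 0.00000028
```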
Available Models
The current primary API model names are deepseek-v4-flash and deepseek-v4-pro. The legacy aliases deepseek-chat and deepseek-reasoner remain available until July 24, 2026 and currently map to the non-thinking and thinking modes of deepseek-v4-flash, respectively.
deepseek-v4-flash
- General purpose V4 model for conversations and reasoning
- 1M context window, up to 384K output tokens
- Input: $0.0028/1M (cache hit), $0.14/1M (cache miss)
- Output: $0.28/1M
deepseek-v4-pro
- Higher-capability V4 model with thinking and non-thinking modes
- 1M context window, up to 384K output tokens
- Input: $0.003625/1M (cache hit), $0.435/1M (cache miss)
- Output: $0.87/1M
- Promotional pricing is documented through May 31, 2026
Legacy aliases
deepseek-chat
- Legacy alias that currently maps to non-thinking deepseek-v4-flash
- Scheduled for retirement on July 24, 2026
deepseek-reasoner
- Legacy alias that currently maps to thinking deepseek-v4-flash
- Scheduled for retirement on July 24, 2026
- Supports showing or hiding reasoning content through the `showThinking` parameter
Thinking mode does not support temperature, top_p, presence_penalty, frequency_penalty, logprobs, or top_logprobs parameters. Setting these parameters will not trigger an error but will have no effect.
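In practice, this means you can omit sampling parameters from thinking-mode provider entries entirely; a minimal sketch:

```yaml
providers:
  - id: deepseek:deepseek-reasoner
    config:
      max_tokens: 8000
      # temperature, top_p, presence_penalty, and frequency_penalty are
      # silently ignored in thinking mode, so there is no point setting them
```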
Example Usage
Here's an example comparing DeepSeek with OpenAI on reasoning tasks:
```yaml
providers:
  - id: deepseek:deepseek-reasoner
    config:
      max_tokens: 8000
      showThinking: true # Include reasoning content in output (default)
  - id: openai:o1
    config:
      temperature: 0.0
prompts:
  - 'Solve this step by step: {{math_problem}}'
tests:
  - vars:
      math_problem: 'What is the derivative of x^3 + 2x with respect to x?'
```
Controlling Reasoning Output
The legacy deepseek-reasoner alias uses V4 Flash thinking mode and includes detailed
reasoning steps in its output. You can control whether this reasoning content is shown
using the showThinking parameter:
```yaml
providers:
  - id: deepseek:deepseek-reasoner
    config:
      showThinking: false # Hide reasoning content from output
```
When `showThinking` is set to `true` (the default), the output includes both the reasoning and the final answer in a standardized format:

```
Thinking: <reasoning content>

<final answer>
```
When set to `false`, only the final answer is included in the output. This is useful when you want the answer quality of the reasoning model without exposing the reasoning process to end users or to your assertions.
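With reasoning hidden, assertions run against the final answer only. A sketch using promptfoo's `contains` assertion (the prompt variable and expected value here are illustrative):

```yaml
providers:
  - id: deepseek:deepseek-reasoner
    config:
      showThinking: false # assertions see only the final answer
tests:
  - vars:
      math_problem: 'What is the derivative of x^3 + 2x with respect to x?'
    assert:
      - type: contains
        value: '3x^2'
```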
See our complete example that benchmarks it against OpenAI's o1 model on the MMLU reasoning tasks.
API Details
- Base URL: `https://api.deepseek.com/v1`
- OpenAI-compatible API format
- Full API documentation
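Because the API is OpenAI-compatible, you may also be able to reach it through the generic OpenAI provider by overriding the base URL. This sketch assumes the OpenAI provider supports the `apiBaseUrl` and `apiKeyEnvar` config options for this purpose:

```yaml
providers:
  - id: openai:chat:deepseek-chat
    config:
      apiBaseUrl: https://api.deepseek.com/v1
      apiKeyEnvar: DEEPSEEK_API_KEY
```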
See Also
- OpenAI Provider - Compatible configuration options
- Complete example - Benchmark against OpenAI's o1 model