DeepSeek V4: a cheaper, larger LLM that narrows the gap with frontier models

[Image: servers with modular expert modules and a long context ribbon]

Chinese lab DeepSeek has released preview details for DeepSeek V4, a major update the company says brings its models much closer to so-called frontier systems. The announcement introduces two variants — V4 Flash and V4 Pro — and highlights big increases in scale, a 1 million-token context window, and aggressive pricing that positions the models as lower-cost alternatives to high-end proprietary offerings. While DeepSeek touts strong reasoning and coding results, the preview also reveals limits in knowledge benchmarks and fresh scrutiny over the company’s practices.

What DeepSeek V4 brings

DeepSeek shipped two preview builds: V4 Flash and V4 Pro. Both are mixture-of-experts (MoE) architectures that activate only a subset of their parameters for each token, reducing inference cost. Crucially, DeepSeek says both variants support a 1,000,000-token context window, large enough to ingest entire codebases or long technical documents in a single prompt. The company positions V4 as a generational step up from its V3.2/R1 releases, emphasizing efficiency and improved task performance.
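To make the 1M-token figure concrete, here is a minimal back-of-envelope sketch for checking whether a codebase would plausibly fit in such a window. The ~4 characters-per-token ratio is a common heuristic for English text and code, not a published property of DeepSeek's tokenizer, and the file extensions are arbitrary placeholders.

```python
# Rough estimate of whether a codebase fits in a 1M-token context window.
# Assumes ~4 characters per token, a common heuristic; the model's actual
# tokenizer may differ, so treat this as an order-of-magnitude check.
import os

CONTEXT_WINDOW = 1_000_000   # tokens, per DeepSeek's V4 preview
CHARS_PER_TOKEN = 4          # heuristic assumption, not a published figure

def estimate_tokens(root: str, exts: tuple[str, ...] = (".py", ".md")) -> int:
    """Walk a directory and estimate total tokens in matching source files."""
    total_chars = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(exts):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as f:
                    total_chars += len(f.read())
    return total_chars // CHARS_PER_TOKEN

if __name__ == "__main__":
    tokens = estimate_tokens(".")
    print(f"~{tokens:,} tokens; fits in 1M window: {tokens <= CONTEXT_WINDOW}")
```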

Architecture and scale

V4 Pro is the headline model on scale: DeepSeek reports 1.6 trillion total parameters with roughly 49 billion active per token, and describes it as the largest open-weight model released to date. V4 Flash is a smaller option at 284 billion total parameters with about 13 billion active. The MoE approach enables these very large total parameter counts while keeping compute and cost per request lower than a fully dense model of comparable size.
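To illustrate why an MoE model's inference cost tracks active rather than total parameters, here is a toy top-k routing sketch in plain NumPy. The expert count, dimensions, and ReLU feed-forward blocks are illustrative assumptions, not DeepSeek's architecture; the point is only that each token's compute touches K experts while total parameters grow with all E.

```python
# Generic top-k mixture-of-experts routing sketch (NOT DeepSeek's actual
# design): each token is routed to only K of E experts, so compute scales
# with the active experts rather than the full parameter count.
import numpy as np

rng = np.random.default_rng(0)

E, K = 16, 2            # 16 experts, 2 active per token (illustrative numbers)
D_MODEL, D_FF = 64, 256

# Each expert is a small feed-forward block; total params grow with E,
# but each token only touches K of them.
experts_w1 = rng.standard_normal((E, D_MODEL, D_FF)) * 0.02
experts_w2 = rng.standard_normal((E, D_FF, D_MODEL)) * 0.02
router_w = rng.standard_normal((D_MODEL, E)) * 0.02

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token (row of x) to its top-K experts and mix the outputs."""
    logits = x @ router_w                       # (tokens, E) routing scores
    topk = np.argsort(logits, axis=-1)[:, -K:]  # indices of the K best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        idx = topk[t]
        weights = np.exp(logits[t, idx])
        weights /= weights.sum()                # softmax over the chosen K
        for w, e in zip(weights, idx):
            h = np.maximum(x[t] @ experts_w1[e], 0.0)  # ReLU feed-forward
            out[t] += w * (h @ experts_w2[e])
    return out

tokens = rng.standard_normal((8, D_MODEL))
y = moe_layer(tokens)

total = E * (D_MODEL * D_FF + D_FF * D_MODEL)
active = K * (D_MODEL * D_FF + D_FF * D_MODEL)
print(f"expert params total: {total:,}; active per token: {active:,}")
```

A production router would batch tokens by expert and balance load across devices; the per-token loop here is only for clarity.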

Performance claims

DeepSeek asserts V4 closes much of the gap with both open and closed frontier models on reasoning benchmarks, and that a V4-Pro-Max configuration outperforms many open-source peers and even edges out OpenAI's GPT-5.2 and Google's Gemini 3.0 Pro on some tasks. In coding competitions, the company says the V4 models are "comparable to GPT-5.4." These are preview claims and, like all vendor benchmark statements, will require independent verification across a range of tasks and prompt distributions.

Limitations and concerns

Despite the gains, DeepSeek's preview acknowledges weaker performance on knowledge tests, trailing leading frontier models such as GPT-5.4 and Gemini 3.1 Pro. The firm characterizes that gap as roughly three to six months on a "developmental trajectory," implying the strongest closed-source systems still hold an edge in up-to-date factual recall and other knowledge-intensive tasks. Separately, DeepSeek faces reputational and legal scrutiny: U.S. authorities have alleged widespread IP theft tied to Chinese actors, and industry competitors including Anthropic and OpenAI have publicly accused DeepSeek of distilling their models. Those accusations, along with broader geopolitical headwinds, add nontechnical risk to adoption.

Pricing and market impact

DeepSeek’s pricing is a key differentiator. The company lists V4 Flash at $0.14 per million input tokens and $0.28 per million output tokens, and V4 Pro at $0.145 per million input and $3.48 per million output. DeepSeek claims these rates undercut multiple frontier offerings and smaller open models, a strategy that could accelerate usage among cost-sensitive developers and enterprises that need long-context processing without frontier model prices.
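To see what those rates mean in practice, here is a small worked example at the quoted list prices. Only the per-million-token rates come from the announcement; the workload sizes are hypothetical.

```python
# Worked cost comparison at the preview list prices quoted above.
# The per-token rates come from DeepSeek's announcement; the workload
# numbers below are hypothetical.
PRICES = {  # USD per million tokens: (input, output)
    "V4 Flash": (0.14, 0.28),
    "V4 Pro":   (0.145, 3.48),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the quoted list rates."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens / 1e6) * in_rate + (output_tokens / 1e6) * out_rate

# Hypothetical long-context job: an 800k-token codebase prompt, 4k-token answer.
for model in PRICES:
    cost = request_cost(model, input_tokens=800_000, output_tokens=4_000)
    print(f"{model}: ${cost:.4f} per request")
```

Run on this hypothetical workload, Flash comes out around $0.11 per request and Pro around $0.13: for input-heavy, short-answer jobs the two variants stay close, while Pro's much higher output rate dominates generation-heavy workloads.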

What to watch next

The preview raises several follow-ups to monitor: independent benchmark replication (reasoning, coding, and knowledge tasks); how well the 1M-token context performs in real-world applications; whether DeepSeek expands beyond text-only capabilities; responses from regulators and industry peers regarding IP and model provenance; and actual availability and terms for commercial use. If the company’s claims hold up, V4 could alter cost-to-performance tradeoffs for large-context and code-heavy workloads — but adoption will hinge on proven accuracy, trust, and compliance.
