In recent years, large language models (LLMs) have become the backbone of AI innovation, driving massive global investments in advanced systems.
China has entered this race with Deepseek, a model claiming to deliver mainstream AI performance at a fraction of the cost, sparking both interest and controversy.
However, as Deepseek gains traction, it faces growing scrutiny over potentially misleading cost claims, alleged distillation of OpenAI's models, unauthorized acquisition of Nvidia's high-end AI chips, security risks, and intellectual property concerns, prompting a growing number of investigations and restrictions worldwide.
While Deepseek improves cost efficiency in LLM deployment, many experts argue it offers optimization rather than true AI innovation.
This article breaks down the key facts and controversies to provide a clearer perspective on the ongoing debate.
Table of Contents
What is an LLM? Why is the Barrier to Entry So High?
The Basics of Language Models
Training Requirements and Costs
What Qualifies as a True AI Breakthrough?
Understanding Deepseek
Definition and Key Features
Deepseek vs. Mainstream AI Models
Deepseek's Controversies and Concerns
Which Global Bans is Deepseek Currently Facing?
![](https://static.wixstatic.com/media/a083ae_f979f9f0827b41e4aaae37907b29a5eb~mv2.png/v1/fill/w_980,h_551,al_c,q_90,usm_0.66_1.00_0.01,enc_auto/a083ae_f979f9f0827b41e4aaae37907b29a5eb~mv2.png)
What is an LLM? Why Is the Barrier to Entry So High?
Understanding Large Language Models (LLMs)
A Large Language Model (LLM) is a neural network built on deep learning and the transformer architecture, enabling it to understand and generate human-like natural language.
These models rely on vast amounts of training data, including:
Large-scale text datasets sourced from the internet
Self-supervised learning to develop a deep understanding of language structures
Advanced reasoning and information retrieval capabilities
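The self-supervised objective behind most LLMs is simply next-token prediction: the training labels come from the text itself. A minimal toy sketch of how training pairs are derived (this illustrates the idea only, not any real model's data pipeline):

```python
# Toy illustration of self-supervised next-token prediction:
# the labels are just the input sequence shifted by one token,
# so no human annotation is required (hence "self-supervised").
def make_lm_examples(tokens):
    """Pair each left context with the token that follows it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

sentence = ["the", "cat", "sat", "on", "the", "mat"]
for context, target in make_lm_examples(sentence):
    print(context, "->", target)
```

At scale, the same shift-by-one trick turns trillions of scraped tokens into training examples for free, which is why data volume, not labeling, dominates the cost.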
Prominent LLMs include OpenAI’s GPT-4, Google’s Gemini, and Meta’s Llama series.
The High Cost of Training an LLM
Training an LLM is incredibly expensive, making it nearly impossible for smaller companies to develop their own models.
For reference, GPT-4’s training costs are estimated in the hundreds of millions of dollars.
The major cost factors include:
Data Processing
Collecting, cleaning, and filtering high-quality training datasets.
Computational Infrastructure
Thousands of GPUs (Graphics Processing Units) or TPUs (Tensor Processing Units) are required for training.
Electricity & Cooling
Running these high-performance computing clusters consumes vast amounts of energy and requires dedicated cooling systems.
Human Expertise
AI researchers, engineers, and data scientists fine-tune hyperparameters and optimize model architectures.
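To see why the totals climb so fast, here is a rough back-of-envelope calculation for the computational-infrastructure line alone. Every number below is an illustrative assumption, not a figure reported by any vendor:

```python
# Back-of-envelope estimate of one cost line: GPU-hours x hourly rate.
# All numbers are illustrative assumptions, not reported figures.
gpu_count = 2048           # hypothetical cluster size
training_days = 60         # hypothetical wall-clock training time
rate_per_gpu_hour = 2.00   # hypothetical cloud price in USD

gpu_hours = gpu_count * training_days * 24
compute_cost = gpu_hours * rate_per_gpu_hour
print(f"{gpu_hours:,} GPU-hours -> ${compute_cost:,.0f}")
```

Even these modest assumptions put raw compute in the millions of dollars, before data processing, energy, cooling, and staffing are counted.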
What Defines a True AI Breakthrough?
When discussing revolutionary AI advancements, true breakthroughs go beyond simply optimizing efficiency or reducing costs.
Key indicators of innovation include:
Architectural Innovations
Like Google’s Transformer architecture, which reshaped AI development.
Training Methodology Improvements
Advancements such as few-shot learning and self-supervised learning.
Enhanced Reasoning Capabilities
AI models should be able to understand complex contexts, answer nuanced questions, and perform logical reasoning.
Resource Optimization & Performance Gains
While cost-efficiency is important, it should be achieved without sacrificing model capabilities.
Introducing Deepseek: AI Innovation or Just Cost-Cutting?
What is Deepseek?
Deepseek (深度求索) is a Chinese AI system that includes models such as DeepSeek-V3 and DeepSeek-R1, marketed as "high-performance yet low-cost language models advancing Artificial General Intelligence (AGI)."
Key Technical Features
Initially, Deepseek claimed to be a revolutionary AI development. However, its primary focus appears to be reducing the training and operational costs of LLMs rather than innovating new model architectures.
Hardware & Cost Efficiency
Deepseek’s parent company, High-Flyer Quant, reportedly stockpiled large quantities of Nvidia A100 GPUs before the U.S. export controls of October 2022 barred sales of high-end AI chips, including the A100 and H100, to China. This allowed it to keep hardware costs lower.
Deepseek also claims that its model training costs were under $6 million, far lower than competitors who require over $100 million.
| Feature | Deepseek AI | GPT-4 (OpenAI) | Gemini (Google) | Llama 3 (Meta) |
| --- | --- | --- | --- | --- |
| Model Type | Training-optimized AI | General-purpose LLM | Multimodal (text + image) | Open-weight LLM |
| Core Technology | Training acceleration & compression | Transformer architecture | Multimodal learning | High efficiency, low resource demand |
| Primary Use Case | Cost reduction & faster inference | High-level dialogue & content generation | Vision + text comprehension | Enterprise adoption & developer community |
| Data Sources | Not disclosed | Multilingual web content & books | Image + text datasets | Open-source data |
| Privacy & Security | Highly controversial | Well-defined privacy policies | Emphasizes data security | Community moderation |
The Controversies and Risks of Deepseek
Issue 1: Not a True AI Innovation, Just "Technical Compression"
Experts argue that Deepseek is not a breakthrough LLM; rather, it relies on compression techniques to speed up training and inference.
While this reduces costs, it doesn’t enhance reasoning capabilities or model intelligence, contradicting claims of being a "revolutionary AI."
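As a concrete example of what "technical compression" can mean in practice, here is a toy sketch of weight quantization, one common technique for shrinking models and accelerating inference. Whether Deepseek uses this specific method is not disclosed; the function names and numbers here are purely illustrative:

```python
def quantize_int8(weights):
    """Map float weights to int8-range integers plus a scale (lossy compression)."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # fall back to 1.0 if all zero
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the compact representation."""
    return [q * scale for q in quantized]

weights = [0.12, -0.5, 0.33]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)  # close to, but not exactly, the originals
```

Storing one byte per weight instead of four cuts memory and bandwidth roughly 4x, but the rounding error illustrates the critics' point: the model gets cheaper to run, not smarter.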
Issue 2: Security Concerns and Data Transmission Risks
Deepseek’s privacy policy explicitly states that user data is stored on servers in China, raising concerns over data security and government surveillance risks.
Issue 3: Lack of Transparency in Content Moderation
Unlike OpenAI and Google, Deepseek does not disclose its training data sources or content moderation mechanisms. This has led to concerns that it could be used for information manipulation or propaganda.
Issue 4: OpenAI Accuses Deepseek of IP Theft via Model Distillation
Model distillation is a technique that enables smaller models to mimic larger models. If misused, it can become a tool for intellectual property theft.
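In its simplest form, distillation trains the small "student" model to match the large "teacher" model's output probabilities, typically by minimizing a KL-divergence loss. A minimal illustration of that loss (the probabilities are hypothetical, not drawn from any real model):

```python
import math

def kl_divergence(teacher_probs, student_probs):
    """KL(teacher || student): the loss a distilled student minimizes."""
    return sum(t * math.log(t / s)
               for t, s in zip(teacher_probs, student_probs) if t > 0)

teacher = [0.7, 0.2, 0.1]    # soft targets produced by the large "teacher" model
student = [0.6, 0.25, 0.15]  # the small "student" model's current prediction
loss = kl_divergence(teacher, student)  # gradient descent pushes this toward zero
```

The loss is zero only when the student reproduces the teacher's distribution exactly, which is why distilling against another vendor's API outputs is seen as copying that model's behavior.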
OpenAI’s terms prohibit users from replicating its models or using its outputs to build competing AI. While OpenAI has accused Deepseek of leveraging its models, no direct evidence has been made public.
Issue 5: Allegations of Using Nvidia H100 Chips Banned for China
U.S. company Scale AI’s CEO, Alexandr Wang, alleges that Deepseek has acquired up to 50,000 Nvidia H100 chips—which are banned for export to China. If true, this would violate U.S. export control laws.
Issue 6: Potentially Overstated Cost Efficiency Claims
Investigations suggest that Deepseek’s parent company spent over $139 million on AI infrastructure, far exceeding its claimed $6 million training cost.
Which Global Bans Is Deepseek Currently Facing?
Italy
On January 28, 2025, Italy’s Data Protection Authority (Garante) warned that Deepseek mishandles user data and confirmed that user information is stored on Chinese servers. Following this, Deepseek was removed from Apple’s App Store and Google Play in Italy.
Taiwan
Due to cybersecurity and data leakage concerns, Taiwan’s Ministry of Digital Affairs proposed restrictions on Deepseek in public institutions on January 31, 2025 and fully banned the service on February 3, 2025.
United States
While no nationwide ban exists, several government bodies have taken action:
U.S. Congress: Banned Deepseek from all official devices.
Texas: Prohibited Deepseek use in government offices.
Department of Defense & U.S. Navy: Blocked Deepseek from military networks.
NASA: Completely banned employees from using Deepseek.
Japan
On February 1, 2025, the Japanese government followed suit, advising public officials to "avoid using" Deepseek to safeguard data security.
South Korea
On February 5, 2025, the South Korean government and major corporations joined the ban on Deepseek.
Ministry of Foreign Affairs & Ministry of Trade, Industry, and Energy: Blocked Deepseek and restricted employees from accessing the platform.
Ministry of the Interior and Safety: Issued a directive to the central government and all 17 local governments, urging public officials to use AI tools cautiously and avoid inputting personal data.
Personal Information Protection Commission (PIPC): Sent an official request to Deepseek's Beijing headquarters seeking clarification on its data collection, processing, and storage practices—but has yet to receive a response.
Kakao: After announcing a partnership with OpenAI, issued an internal directive prohibiting employees from using Deepseek for business purposes, making it the first South Korean tech company to do so.
LG U+ (Telecom Provider): Released a cybersecurity notice banning employees from using Deepseek.
Samsung, SK, LG: Implemented internal policies prohibiting employees from using unauthorized AI tools.
Naver: Restricted employees from using generative AI tools in external environments.