🤖 AI Core Vocabulary

Raymond's AI Knowledge System · 2026.04

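🇺🇸 US Stocks: AI

NVIDIA (英伟达) · AMD (超威) · Broadcom (博通) · Google (谷歌) · Microsoft (微软) · Meta · Amazon (亚马逊) · Tesla (特斯拉) · IBM · Apple (苹果)

🇨🇳 China A-Shares: AI Compute

Cambricon (寒武纪) · iFlytek (科大讯飞) · SMIC (中芯国际) · Montage Technology (澜起科技)

💡 Optical Communications

Zhongji Innolight (中际旭创) · Hengtong Optic-Electric (亨通光电) · Zhongtian Technology (中天科技) · Taclink (德科立)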
1
Programming Fundamentals
Artificial Intelligence (AI)
The simulation of human intelligence by machines, enabling them to perform tasks requiring reasoning, learning, and problem-solving — the broadest concept in the field.
"AI is transforming healthcare diagnostics, financial trading, and autonomous transportation."
Computer Science · Google · Microsoft
Machine Learning (ML)
A subset of AI where computers learn patterns and make decisions from data without being explicitly programmed, improving performance through accumulated experience.
"Machine learning models now detect diseases from medical images with accuracy rivaling radiologists."
Deep Learning
A branch of machine learning using multi-layered neural networks to automatically learn hierarchical representations directly from raw data such as images, audio, and text.
"Deep learning powers image recognition, voice assistants, and machine translation systems."
Neural Networks · Representation Learning · NVIDIA
Neural Network
A computing system inspired by biological neural networks, consisting of interconnected layers of nodes (neurons) that process data through weighted connections and non-linear activation functions.
"A deep neural network with hundreds of layers can learn extremely complex patterns."
Perceptron · Weights · Activation Function
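The layered structure described in this entry (weighted connections followed by a non-linear activation) can be sketched in plain Python. The layer sizes, the weights, and the choice of a sigmoid activation are illustrative assumptions, not values from any real model.

```python
import math

def sigmoid(z):
    """Non-linear activation that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weights, biases):
    """One fully connected layer: a weighted sum per neuron, then the activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output, made-up weights.
hidden = dense_layer([1.0, 0.5], [[0.4, -0.6], [0.3, 0.8]], [0.0, -0.1])
output = dense_layer(hidden, [[1.2, -0.7]], [0.05])
print(hidden, output)
```

Stacking more calls to `dense_layer` is what makes a network "deep"; training would then adjust the weights and biases rather than leaving them fixed as here.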
Training
The iterative process of feeding data (labeled examples, in the supervised case) through a neural network and adjusting its weights so that the model's output increasingly matches the desired ground-truth results.
"Training a frontier large language model can cost hundreds of millions of dollars in compute."
Supervised Learning · Gradient Descent · NVIDIA
Inference
The phase where a trained model is used to make predictions or generate outputs on new, unseen data — typically faster and more cost-effective than training, often heavily optimized for deployment.
"On-device model inference enables real-time AI on smartphones without cloud connectivity."
Inference Engine · Deployment · NVIDIA
Overfitting
When a model learns the training data too well — including its noise and quirks — resulting in poor performance on new data and failing to generalize outside the training set.
"Adding dropout layers and monitoring validation loss are common techniques to combat overfitting."
Regularization · Generalization · Validation Set
Generalization
A model's ability to perform accurately on new, unseen data not part of its training set — the fundamental goal that separates useful models from mere memorization.
"Good generalization is what separates a useful production model from one that memorizes training data."
Test Set · Bias-Variance Tradeoff
2
Algorithms and Data Structures
Loss Function
A function measuring the discrepancy between a model's predictions and the actual ground-truth values, providing the gradient signal used to iteratively update weights during training.
"Cross-entropy loss is the standard choice for multi-class classification tasks."
Mean Squared Error · Cross-Entropy · Optimization Objective
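The two losses named in the tags can be written out directly. This is a minimal sketch in plain Python; the function names and the `eps` clamp that guards against `log(0)` are choices made for this example.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average squared difference, typical for regression targets."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def cross_entropy(true_class, probs, eps=1e-12):
    """Negative log of the probability the model assigned to the correct class."""
    return -math.log(max(probs[true_class], eps))

print(mean_squared_error([1.0, 2.0], [1.0, 4.0]))
print(cross_entropy(2, [0.1, 0.2, 0.7]))  # confident and correct: small loss
print(cross_entropy(2, [0.7, 0.2, 0.1]))  # confident and wrong: large loss
```

Cross-entropy grows without bound as the probability of the correct class approaches zero, which is why it penalizes confident mistakes so heavily.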
Backpropagation
An algorithm computing gradients of the loss with respect to every weight in a neural network by applying the chain rule of calculus layer by layer from output back to input.
"Backpropagation enables efficient gradient computation across hundreds of millions of parameters."
Chain Rule · Gradient · NVIDIA
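To make the chain rule concrete, here is a hand-derived backward pass for a deliberately tiny network with one weight per layer, checked against finite differences. The network shape, the tanh activation, and the squared-error loss are assumptions chosen to keep the example short.

```python
import math

def forward(w1, w2, x, target):
    """Tiny network: x -> tanh(w1 * x) -> w2 * h, scored with squared error."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y, (y - target) ** 2

def backward(w1, w2, x, target):
    """Apply the chain rule from the loss back to each weight."""
    h, y, _ = forward(w1, w2, x, target)
    dloss_dy = 2.0 * (y - target)          # derivative of the loss w.r.t. the output
    dloss_dw2 = dloss_dy * h               # output-layer weight
    dloss_dh = dloss_dy * w2               # pass the gradient back through w2
    dloss_dz = dloss_dh * (1.0 - h ** 2)   # derivative of tanh
    dloss_dw1 = dloss_dz * x               # input-layer weight
    return dloss_dw1, dloss_dw2

def numerical_gradients(w1, w2, x, target, eps=1e-6):
    """Central finite differences, used only to sanity-check backward()."""
    g1 = (forward(w1 + eps, w2, x, target)[2] - forward(w1 - eps, w2, x, target)[2]) / (2 * eps)
    g2 = (forward(w1, w2 + eps, x, target)[2] - forward(w1, w2 - eps, x, target)[2]) / (2 * eps)
    return g1, g2

print(backward(0.5, -1.5, 2.0, 1.0))
print(numerical_gradients(0.5, -1.5, 2.0, 1.0))
```

The two printed pairs should agree closely. Frameworks automate exactly this bookkeeping, reusing each intermediate gradient instead of recomputing it for every weight.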
Gradient Descent
An optimization algorithm that iteratively adjusts model parameters in the direction of steepest descent of the loss landscape, seeking a local or global minimum.
"Stochastic gradient descent (SGD) with mini-batches dramatically accelerates training on large datasets."
Learning Rate · SGD · Convergence
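The update rule itself is a single line. The sketch below minimizes a made-up one-dimensional loss, (w - 3)^2, using its exact gradient; real training would estimate the gradient from mini-batches instead, which is what makes it stochastic.

```python
def gradient_descent(grad_fn, w, lr, steps):
    """Repeatedly step against the gradient, scaled by the learning rate."""
    for _ in range(steps):
        w -= lr * grad_fn(w)
    return w

def toy_gradient(w):
    """Gradient of the hypothetical loss (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)

w_final = gradient_descent(toy_gradient, w=0.0, lr=0.1, steps=100)
print(w_final)
```

On this particular loss, any learning rate above 1.0 makes each step overshoot by more than it corrects, so the iterates diverge; that is the usual motivation for tuning the learning rate carefully.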
Optimizer
An algorithm (e.g., SGD with momentum, RMSProp, Adam) that applies gradient-based weight updates, often adding momentum or per-parameter adaptive learning rates to accelerate convergence and escape poor local minima.
"The Adam optimizer is the de facto standard for training deep neural networks today."
Adam · Momentum · Adaptive Learning Rate
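As an illustration of what "adaptive" means here, the following sketch implements the standard Adam update for a single scalar parameter. The default hyperparameters are the commonly cited ones; the toy loss (w - 3)^2 and the function names are assumptions for this example.

```python
import math

def adam_minimize(grad_fn, w, steps, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a scalar function with the Adam update rule."""
    m = 0.0  # running mean of gradients (momentum term)
    v = 0.0  # running mean of squared gradients (per-parameter scale)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

def toy_gradient(w):
    """Gradient of the hypothetical loss (w - 3)^2."""
    return 2.0 * (w - 3.0)

print(adam_minimize(toy_gradient, w=0.0, steps=1))    # first step has size close to lr
print(adam_minimize(toy_gradient, w=0.0, steps=500))  # approaches the minimum at 3
```

Because the step is the ratio of the two running averages, the first update has magnitude close to `lr` no matter how large the raw gradient is, which is one reason Adam is less sensitive to gradient scale than plain SGD.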
Hyperparameter
Parameters set before training begins (e.g., learning rate, batch size, number of layers) that control the training process and critically affect model performance and convergence speed.
"Grid search, random search, and Bayesian optimization are common methods for systematic hyperparameter tuning."
Learning Rate · Batch Size · Hyperparameter Tuning
Regularization
Techniques (L1/L2 penalty, dropout, data augmentation) that discourage overfitting by adding constraints or injecting noise during training, improving the model's ability to generalize.
"L2 regularization adds a penalty proportional to the sum of squared weights to the total loss."
Dropout · L1/L2 · Data Augmentation
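The L2 case from the example sentence is simple enough to write out. In this sketch the strength `lam` and the weights are made-up values, and the data loss is passed in as a plain number.

```python
def l2_penalty(weights, lam):
    """Penalty proportional to the sum of squared weights."""
    return lam * sum(w * w for w in weights)

def regularized_loss(data_loss, weights, lam):
    """Total loss: how well the model fits the data, plus the weight penalty."""
    return data_loss + l2_penalty(weights, lam)

def l2_gradient(weights, lam):
    """Extra gradient contributed by the penalty; it pulls each weight toward zero."""
    return [2.0 * lam * w for w in weights]

weights = [1.0, -2.0, 3.0]
print(regularized_loss(0.5, weights, lam=0.01))
print(l2_gradient(weights, lam=0.01))
```

The gradient term is why L2 regularization is often described as weight decay: each update shrinks every weight slightly in proportion to its size.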
Embedding
A dense, low-dimensional representation of discrete entities (words, items, nodes) where similar items are positioned close together in the learned vector space, capturing semantic relationships.
"Word embeddings capture semantic relationships: king minus man plus woman approximates queen."
Word2Vec · Vector Space · Google
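The king/queen analogy can be reproduced with cosine similarity over a toy vocabulary. The two-dimensional vectors below are hand-made for illustration (roughly "royalty" and "gender" axes) and are not learned embeddings; real models use hundreds of dimensions learned from text.

```python
import math

# Hypothetical hand-crafted vectors, not the output of Word2Vec or any model.
EMBEDDINGS = {
    "king": [0.9, 0.8],
    "queen": [0.9, -0.8],
    "man": [0.1, 0.9],
    "woman": [0.1, -0.9],
    "apple": [-0.8, 0.1],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, -1.0 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])]
    candidates = [w for w in EMBEDDINGS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(EMBEDDINGS[w], target))

print(analogy("king", "man", "woman"))
```

Excluding the three query words from the candidates follows the usual convention for analogy tests, since the nearest vector to the target is often one of the inputs.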
Tokenization
The process of breaking raw text into discrete tokens, whether words, subwords (BPE, WordPiece, SentencePiece), or characters, that a language model can process as numerical input.
"The sentence 'AI is amazing' might tokenize to 'AI', ' is', ' amazing' with a BPE tokenizer."
BPE · WordPiece · Subword
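A rough sketch of how BPE learns its subword vocabulary: start from characters and repeatedly merge the most frequent adjacent pair. The tiny word-frequency corpus is invented for illustration, and production tokenizers add details (byte-level fallback, special tokens, pre-tokenization) that are omitted here.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across the corpus, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for pair in zip(symbols, symbols[1:]):
            pairs[pair] += freq
    return max(pairs, key=pairs.get) if pairs else None

def merge_pair(words, pair):
    """Rewrite every word, fusing each occurrence of `pair` into one symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged

def learn_bpe(corpus_counts, num_merges):
    """Return the learned merge rules and the corpus segmented with them."""
    words = {tuple(word): count for word, count in corpus_counts.items()}
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        merges.append(pair)
        words = merge_pair(words, pair)
    return merges, words

# Hypothetical corpus: word -> number of occurrences.
merges, segmented = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, num_merges=2)
print(merges)
print(segmented)
```

After two merges the frequent suffix "est" has become a single token, which is how BPE ends up with compact encodings for common word pieces while still being able to spell out rare words character by character.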
Convolutional Neural Network (CNN)
A neural network using learnable filters (kernels) that slide across input data to automatically detect spatial hierarchies — the foundational architecture for modern computer vision.
"CNNs excel at image classification, object detection, and video analysis tasks."
Convolutional Layer · Pooling Layer · Computer Vision
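The "sliding filter" idea can be shown with a small, unpadded, stride-1 convolution in plain Python. The 4x4 image and the hand-written vertical-edge kernel are illustrative assumptions; in a CNN the kernel values are learned. As in most deep learning libraries, the operation below is technically cross-correlation, since the kernel is not flipped.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image and sum the elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Hypothetical image: dark left half (0), bright right half (1).
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written kernel that responds where brightness increases left to right.
VERTICAL_EDGE = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(IMAGE, VERTICAL_EDGE)
print(feature_map)
```

The feature map is non-zero only in the column where the edge sits, which is the sense in which a filter "detects" a local pattern wherever it appears in the input.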
Recurrent Neural Network (RNN)
A neural network architecture designed for sequential data, in which recurrent connections feed each step's hidden state back into the next step, allowing information to persist across time steps.
"RNNs process text sequentially, maintaining a hidden state that summarizes the context seen so far."
Sequence Model · Hidden State · Time-Series Data
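The hidden-state recurrence can be sketched with scalar weights. Real RNNs use weight matrices and vector states; the scalar form and the specific weight values here are simplifying assumptions.

```python
import math

def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a one-unit RNN over a sequence, returning the hidden state at each step."""
    h = h0
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

# A single non-zero input followed by silence.
states = rnn_forward([1.0, 0.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
print(states)
```

The later states stay non-zero even though the later inputs are zero: the first input persists through the recurrent connection, fading a little at each step. That fading is the intuition behind the long-range dependency problem that gated architectures were designed to mitigate.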
Long Short-Term Memory (LSTM)
A specialized RNN architecture with gating mechanisms (input, forget, output gates) that allow it to learn long-range dependencies and effectively mitigate the vanishing gradient problem.
"LSTMs revolutionized speech recognition and machine translation before the Transformer era."
Gating Mechanism · Long-Term Dependencies · Sequence Modeling
Transformer Architecture
A neural network architecture introduced in the 2017 paper "Attention Is All You Need" that relies entirely on attention mechanisms, dispensing with recurrence and convolution and enabling massive parallelization across GPUs.
"GPT, BERT, Claude, and Gemini are all ultimately built on the Transformer architecture."
Attention Mechanism · Parallel Computing · NVIDIA
Self-Attention
A mechanism that, for each position in a sequence, computes a weighted sum over the values at all positions, with the weights derived from learned Query and Key projections, enabling capture of long-range dependencies regardless of distance.
"Self-attention lets the model directly relate 'sat' to both 'cat' and 'mat' regardless of their distance."
Query/Key/Value · Attention Scores · Context Modeling
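A minimal single-head sketch of scaled dot-product self-attention in plain Python follows. The identity projection matrices and the three toy token vectors are assumptions made so the result is easy to inspect; in a trained model the projections are learned and there are many heads.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a sequence x."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))
    outputs, weights = [], []
    for q_i in q:
        # Score this position's query against every position's key.
        scores = [sum(a * b for a, b in zip(q_i, k_j)) / scale for k_j in k]
        row = softmax(scores)
        weights.append(row)
        # The output is the attention-weighted sum of all value vectors.
        outputs.append([sum(w * v_j[c] for w, v_j in zip(row, v))
                        for c in range(len(v[0]))])
    return outputs, weights

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]              # hypothetical projections
TOKENS = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]    # tokens 0 and 2 are identical
outputs, weights = self_attention(TOKENS, IDENTITY, IDENTITY, IDENTITY)
print(weights[0])
```

Token 0 attends equally to itself and to the identical token 2, and less to token 1, even though token 2 is farther away: the weights depend on content, not on distance.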
3
AI Core (LLMs, etc.)
Large Language Model (LLM)
A neural network with billions to hundreds of billions of parameters, pre-trained on massive amounts of text data, capable of understanding and generating human language at remarkable scale and fluency.
"GPT-4, Claude 3 Opus, and Gemini Ultra are examples of frontier large language models."
Generative AI · NLP · Microsoft · Google
Prompt Engineering
The discipline of crafting and optimizing input prompts to elicit desired outputs from language models — including few-shot prompting, chain-of-thought reasoning, and structured system instructions.
"Chain-of-thought prompting asks the model to reason step-by-step, dramatically improving accuracy on math and logic tasks."
Few-shot · Zero-shot · CoT
Fine-Tuning
The process of taking a pre-trained model and continuing training (updating weights) on a smaller, task-specific or domain-specific dataset to adapt it for a particular application or behavior.
"Fine-tuning Llama 3 on medical literature produces a specialist model for clinical Q&A."
Transfer Learning · RLHF · LoRA
Retrieval-Augmented Generation (RAG)
An architecture that augments a language model's generation by retrieving relevant documents from an external knowledge base at inference time, reducing hallucination and improving factual accuracy.
"RAG enables a model to answer questions about recent events by dynamically retrieving live news articles."
Vector Database · Knowledge Base · Hallucination Mitigation
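The retrieve-then-generate flow can be sketched without any model at all. The example below stands in bag-of-words cosine similarity for a real embedding model and vector database, stops at building the augmented prompt, and uses a made-up three-document knowledge base; all names are hypothetical.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Crude stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = bag_of_words(query)
    return sorted(documents, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)[:k]

def build_prompt(query, documents, k=1):
    """Prepend the retrieved context to the question before calling a model."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical knowledge base.
DOCUMENTS = [
    "The H100 is a GPU made by NVIDIA.",
    "TPUs are ASICs designed by Google.",
    "Kubernetes orchestrates containers.",
]
print(build_prompt("Who designed TPUs?", DOCUMENTS))
```

The resulting prompt would then be sent to a language model. Grounding the answer in retrieved text is what lets the model cite facts that were never in its training data.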
Hallucination / Fabrication
When an LLM generates confident, fluent responses that are factually incorrect, internally inconsistent, or unsupported by its training data — one of the most critical challenges in LLM deployment.
"A model might confidently cite a non-existent academic paper — that is a classic hallucination."
Factuality · RAG · Alignment
Reinforcement Learning from Human Feedback (RLHF)
A training methodology that uses human preference labels to train a reward model, then optimizes the policy via reinforcement learning (typically PPO) to align model outputs with human values and intent.
"RLHF is a key technique behind the conversational alignment of ChatGPT and Claude."
Alignment · Reward Model · PPO
Agentic AI / AI Agents
AI systems that can autonomously plan, reason, use tools, and execute multi-step workflows to accomplish complex goals without continuous human intervention — going beyond single prompt-response interactions.
"An Agentic AI can browse the web, write and execute code, and iterate on results autonomously to solve a research question."
AI Agent · Tool Use · Autonomous Decision-Making
AI Alignment
The research field focused on ensuring AI systems behave in ways that are beneficial and consistent with human intentions and values, preventing unintended or harmful outcomes as capabilities increase.
"AI alignment research addresses the challenge of keeping superintelligent AI under meaningful human control."
AI Safety · Value Alignment · RLHF
Scaling Laws
Empirical observations that a model's performance improves predictably as the amount of training data, model parameters, and compute increase, often following clean power-law relationships across many orders of magnitude.
"Scaling laws suggest that each doubling of compute reduces loss by a roughly constant factor, which makes the returns on additional compute predictable."
Emergent Abilities · Compute · Data Scale
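What a power law implies for "doubling compute" can be checked numerically. The constants `a` and `b` below are invented for illustration and are not fitted values from any paper; published fits also typically add an irreducible-loss constant, which is omitted here.

```python
def power_law_loss(compute, a=10.0, b=0.05):
    """Hypothetical scaling curve: loss = a * compute^(-b)."""
    return a * compute ** (-b)

def doubling_factor(compute, a=10.0, b=0.05):
    """Ratio of the loss after doubling compute to the loss before."""
    return power_law_loss(2 * compute, a, b) / power_law_loss(compute, a, b)

# The ratio is the same whether we start from a small or a very large budget.
print(doubling_factor(1e20))
print(doubling_factor(1e24))
print(2 ** -0.05)
```

Under a pure power law every doubling multiplies the loss by the same factor, 2^(-b), regardless of the starting budget. That constancy is what lets a curve fitted on small runs be extrapolated to larger ones.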
4
Compute Infrastructure
Graphics Processing Unit (GPU)
A specialized processor originally designed for graphics rendering, now the dominant accelerator for deep learning due to its massively parallel architecture with thousands of cores optimized for matrix/tensor operations.
"NVIDIA reports that the H100 GPU delivers up to 30x the inference throughput of the previous-generation A100 on certain large language model workloads."
Parallel Computing · CUDA · NVIDIA · AMD
Tensor Processing Unit (TPU)
A Google-designed ASIC optimized for tensor operations that powers large-scale deep learning workloads in Google Cloud and research, offering high throughput for matrix multiplications.
"TPUs enabled Google to train BERT on massive text corpora in hours rather than days."
ASIC · Matrix Multiplication · Google
Neural Processing Unit (NPU)
A specialized AI accelerator chip designed to efficiently execute neural network operations (convolution, matrix multiplication) with low power consumption, commonly integrated into mobile SoCs and edge devices.
"Apple's NPU, the Neural Engine, powers on-device AI features like computational photography and Siri."
Edge AI · Low Power · Apple
Field-Programmable Gate Array (FPGA)
An integrated circuit that can be reconfigured after manufacturing, offering a flexible hardware platform for custom AI acceleration with lower latency than GPUs for specific workloads.
"Microsoft deploys FPGAs in Azure for real-time AI inference in its data centers."
Hardware Acceleration · Low Latency · Microsoft
TFLOPS (Tera Floating-Point Operations per Second)
A measure of computing performance representing one trillion (10^12) floating-point operations per second, commonly used to describe the throughput of GPUs and AI accelerators for training and inference workloads.
"The NVIDIA H100 (SXM) is rated at up to 3,958 TFLOPS of FP8 Tensor Core performance with sparsity, and roughly half that for dense math."
Compute · FP8 · Performance Metric
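To connect the unit to an actual workload, the sketch below counts the floating-point operations in a dense matrix multiplication (about one multiply and one add per inner-product term) and converts a measured time into achieved TFLOPS. The matrix size and the 1 ms timing are made-up inputs.

```python
def matmul_flops(m, k, n):
    """Approximate FLOPs for multiplying an (m x k) matrix by a (k x n) matrix."""
    return 2 * m * k * n

def achieved_tflops(flops, seconds):
    """Convert a FLOP count and a wall-clock time into TFLOPS."""
    return flops / seconds / 1e12

flops = matmul_flops(4096, 4096, 4096)
print(flops)
print(achieved_tflops(flops, seconds=0.001))  # hypothetical 1 ms measurement
```

Comparing the achieved figure with a chip's rated peak gives its utilization. Real workloads usually land well below peak because of memory bandwidth and communication limits.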
VRAM (Video Memory)
Dedicated memory on a GPU used to store model parameters, activations, and intermediate computations during training and inference — a critical bottleneck for large models that determines maximum model size.
"Training GPT-3 requires machines with enough VRAM to collectively hold 175 billion parameters."
Memory Bandwidth · Model Size · NVIDIA
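The GPT-3 example can be turned into back-of-the-envelope arithmetic. This sketch counts weights only, at 2 bytes per parameter (FP16), and assumes 80 GB devices; gradients, optimizer state, and activations would add substantially more during training.

```python
import math

def weight_memory_gb(num_params, bytes_per_param):
    """Memory needed just to store the weights, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9

def min_devices(total_gb, gb_per_device):
    """Smallest number of devices whose combined memory fits total_gb."""
    return math.ceil(total_gb / gb_per_device)

gpt3_fp16_gb = weight_memory_gb(175e9, 2)   # 175B parameters, 2 bytes each
print(gpt3_fp16_gb)
print(min_devices(gpt3_fp16_gb, 80))        # assumes 80 GB of memory per device
```

Even in this weights-only view the model does not fit on a single device, which is why large models have to be sharded across GPUs.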
Distributed Training
The practice of splitting a large model and/or large dataset across multiple GPUs and machines, training in parallel using data parallelism, model parallelism, and pipeline parallelism to overcome single-device limits.
"Distributed training across thousands of GPUs is what makes training GPT-4-scale models feasible."
Data Parallelism · Model Parallelism · Pipeline
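Data parallelism, the simplest of the three strategies, can be simulated in a single process: each "worker" computes a gradient on its shard, and the gradients are averaged before the update. The linear model, the dataset, and the worker count are assumptions; on real hardware the shards run concurrently and the averaging is done by an all-reduce collective.

```python
def gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(w, data, num_workers, lr):
    """One training step with the batch split evenly across simulated workers."""
    shards = [data[i::num_workers] for i in range(num_workers)]
    local_grads = [gradient(w, shard) for shard in shards]  # parallel on real hardware
    avg_grad = sum(local_grads) / num_workers               # stands in for all-reduce
    return w - lr * avg_grad

# Hypothetical dataset following y = 2x.
DATA = [(float(x), 2.0 * x) for x in range(1, 9)]
w_parallel = data_parallel_step(0.0, DATA, num_workers=4, lr=0.001)
w_single = 0.0 - 0.001 * gradient(0.0, DATA)
print(w_parallel, w_single)
```

With equal-sized shards the averaged gradient equals the full-batch gradient, so the parallel step matches the single-device step while each worker touches only a quarter of the data.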
Edge Computing
A distributed computing paradigm that processes data near the source (IoT devices, smartphones, local servers) rather than in centralized cloud data centers, reducing latency and enabling real-time AI inference.
"Edge computing allows autonomous vehicles to make split-second decisions without cloud round-trip latency."
Low Latency · IoT · Apple
Kubernetes (Container Orchestration Platform)
An open-source container orchestration platform that automates deployment, scaling, and management of containerized applications across clusters — widely used for serving ML models at scale with automatic load balancing.
"Kubernetes enables automatic scaling of ML inference endpoints based on incoming request load."
Container Orchestration · Microservices · Autoscaling
Cloud Computing
The delivery of computing resources (servers, storage, GPU, networking) over the internet on a pay-per-use basis, providing on-demand access to the massive infrastructure required for training and deploying large AI models.
"AWS, Azure, and Google Cloud offer GPU instances that make AI training accessible without buying hardware."
IaaS · On-Demand Scaling · Amazon · Microsoft
Container
A lightweight, standalone package that encapsulates a software application along with its runtime dependencies and configuration — enabling consistent ML model deployment across development, testing, and production environments.
"Docker containers package a trained model with all its dependencies for reproducible deployment."
Docker · Environment Isolation · Portability
5
Programming Languages
Python
A high-level, interpreted programming language known for its readable syntax and vast ecosystem of libraries (NumPy, PyTorch, TensorFlow), making it the dominant language for AI research and machine learning development.
"Most state-of-the-art AI models are prototyped in Python before being optimized for production deployment."
AI Development · PyTorch · TensorFlow · Google
C / C++
Low-level, high-performance programming languages that provide fine-grained control over memory and computation, widely used for implementing AI inference engines, GPU kernels (CUDA), and performance-critical AI serving infrastructure.
"CUDA kernels are written in C++ to achieve the low-level hardware control required for efficient GPU execution."
High Performance · CUDA · Inference Engine
Rust
A systems programming language emphasizing memory safety, concurrency, and zero-cost abstractions, increasingly adopted for writing safe, high-performance AI inference servers and ML frameworks (e.g., Hugging Face's candle).
"Rust is used by projects like Hugging Face's candle for memory-safe, high-performance ML inference; llama.cpp, by contrast, is written in C/C++."
Memory Safety · High Performance · ML Inference
Go
A statically typed, compiled language designed for simplicity and concurrency, popular for building scalable ML serving systems, data pipelines, and cloud-native AI infrastructure due to its excellent concurrency primitives.
"Go is widely used at Google and cloud providers for building scalable AI microservice backends."
Concurrency · Cloud Native · Microservices
Structured Query Language (SQL)
A domain-specific language for managing and querying relational databases — essential for feature engineering, data preprocessing, and retrieval in AI/ML pipelines, as well as vector databases that support RAG architectures.
"Feature stores use SQL to define and manage the computed features used in ML model training."
Database · Feature Engineering · RAG
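As a small illustration of SQL-based feature engineering, the sketch below uses Python's built-in `sqlite3` module to aggregate raw purchase rows into per-user features. The `purchases` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

def user_features(rows):
    """Aggregate (user_id, amount) rows into per-user features with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (user_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
    query = """
        SELECT user_id, COUNT(*) AS n_purchases, AVG(amount) AS avg_amount
        FROM purchases
        GROUP BY user_id
        ORDER BY user_id
    """
    features = conn.execute(query).fetchall()
    conn.close()
    return features

print(user_features([(1, 10.0), (1, 30.0), (2, 5.0)]))
```

Each returned tuple is one training example's worth of features. A feature store does essentially this at scale, adding versioning and consistency between training and serving.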
JavaScript
A versatile, interpreted language that powers web browsers and Node.js — increasingly used for deploying lightweight ML inference via TensorFlow.js, ONNX.js, and client-side AI features directly in web applications.
"TensorFlow.js enables running pre-trained models directly in the browser with no server round-trip."
Web AI · TensorFlow.js · Client-Side Inference
6
Emerging Technologies
Edge AI
The deployment of AI models directly on edge devices (smartphones, IoT sensors, smart cameras, wearables) enabling real-time inference without cloud connectivity, addressing privacy, latency, and bandwidth constraints.
"Edge AI allows a smart camera to detect faces locally without sending video streams to the cloud."
Edge Computing · On-Device AI · Privacy Protection
Autonomous Driving
The development of vehicles capable of sensing and navigating their environment without human input, using a combination of computer vision, sensor fusion (LiDAR, radar), reinforcement learning, and high-definition mapping.
"Autonomous driving systems use deep learning for real-time object detection and path planning."
Computer Vision · Sensor Fusion · Path Planning · Tesla
Quantum Computing
A computing paradigm that leverages quantum mechanical phenomena (superposition, entanglement) to perform certain calculations exponentially faster than classical computers, with potential applications in optimization, drug discovery, and cryptography.
"Quantum computing could eventually break current encryption standards and revolutionize drug design."
Quantum Mechanics · Quantum Supremacy · Post-Quantum Cryptography