Trigonometric Functions (Trigonometry)

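📌 Core Definition (What)

In one sentence: trigonometric functions relate an angle to ratios of side lengths in a right triangle. In AI they are commonly used for positional encoding, for modeling periodic data, and for rotation transforms.

Function	Pronunciation	Definition	Range
sin θ	/saɪn/ sine	opposite / hypotenuse	$[-1, 1]$
cos θ	/kɒs/ cosine	adjacent / hypotenuse	$[-1, 1]$
tan θ	/tæn/ tangent	opposite / adjacent = sin θ / cos θ	$(-\infty, +\infty)$, undefined where cos θ = 0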
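
As a quick check of the ratio definitions in the table, the sketch below compares side ratios with Python's built-in functions. The 3-4-5 right triangle is an illustrative choice, not something from the original page.

```python
import math

# Hypothetical 3-4-5 right triangle, chosen only for illustration.
opposite, adjacent = 3.0, 4.0
hypotenuse = math.hypot(opposite, adjacent)  # sqrt(3^2 + 4^2)

# Angle between the adjacent side and the hypotenuse, in radians.
theta = math.atan2(opposite, adjacent)

sin_ratio = opposite / hypotenuse  # opposite / hypotenuse
cos_ratio = adjacent / hypotenuse  # adjacent / hypotenuse
tan_ratio = opposite / adjacent    # opposite / adjacent

# Each ratio should agree with the library function evaluated at theta.
print(math.isclose(math.sin(theta), sin_ratio))
print(math.isclose(math.cos(theta), cos_ratio))
print(math.isclose(math.tan(theta), tan_ratio))
```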

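🎨 Interactive Demo: Unit Circle (Interactive)

[Interactive demo: drag the point on the unit circle to change the angle θ and see the geometric meaning of sin θ, cos θ, and tan θ. Initial readout at θ = 0.0°: sin θ = 0.000, cos θ = 1.000, tan θ = 0.000. The orange segment is the tangent line: tan θ is the height at which the ray through the point meets the vertical line x = 1.]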
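
The geometric reading of tan θ as an intercept on the line x = 1 can be reproduced numerically. The following is a minimal sketch; the function name `unit_circle_readout` and the tolerance used to detect a vertical ray are assumptions made for illustration.

```python
import math

def unit_circle_readout(theta_deg):
    """Return (cos, sin, intercept) for an angle given in degrees.

    `intercept` is the height at which the ray from the origin through the
    point (cos, sin) meets the vertical line x = 1, which equals tan(theta).
    """
    theta = math.radians(theta_deg)
    x, y = math.cos(theta), math.sin(theta)  # point on the unit circle
    # A vertical ray (cos = 0) is parallel to x = 1, so there is no intercept.
    intercept = y / x if abs(x) > 1e-12 else float("nan")
    return x, y, intercept

for angle in (0, 30, 45, 60):
    c, s, t = unit_circle_readout(angle)
    print(f"theta = {angle:2d} deg: cos = {c:.3f}, sin = {s:.3f}, tan = {t:.3f}")
```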

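🏠 Real-Life Analogy (Analogy)

🎡 "Your position on a Ferris wheel"

Imagine you are riding a Ferris wheel of radius 1:

cos θ: your horizontal position (how far left or right of the center you are)
sin θ: your vertical position (how far above or below the center you are)
θ: the angle the wheel has turned through

Each time the wheel completes one full turn (360° = 2π radians), you return to the same position. This repetition is the periodicity of the trigonometric functions.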
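
The Ferris wheel picture translates directly into code. This is a small sketch under the assumption of a wheel centered at the origin; `wheel_position` and its `radius` parameter are hypothetical names introduced here.

```python
import math

def wheel_position(theta_deg, radius=1.0):
    """Return the (horizontal, vertical) offset from the wheel's center."""
    theta = math.radians(theta_deg)
    return radius * math.cos(theta), radius * math.sin(theta)

# One full turn in quarter-turn steps.
for angle in (0, 90, 180, 270, 360):
    x, y = wheel_position(angle)
    print(f"{angle:3d} deg -> horizontal = {x:+.3f}, vertical = {y:+.3f}")

# Periodicity: adding 360 degrees lands on the same position (up to rounding).
x0, y0 = wheel_position(30)
x1, y1 = wheel_position(30 + 360)
print(math.isclose(x0, x1, abs_tol=1e-12), math.isclose(y0, y1, abs_tol=1e-12))
```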

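🎯 Why Does AI Need It? (Why)

1. Transformer positional encoding

The Transformer encodes sequence positions with sin and cos:

Positional encoding formula

$$
\begin{aligned}
PE_{(pos,\, 2i)} &= \sin\left(\frac{pos}{10000^{2i/d}}\right) \\
PE_{(pos,\, 2i+1)} &= \cos\left(\frac{pos}{10000^{2i/d}}\right)
\end{aligned}
$$

$pos$ : position of the token in the sequence
$i$ : index of the (sin, cos) dimension pair, $0 \le i < d/2$
$d$ : embedding dimension
Why sin/cos? For a fixed offset $k$, the encoding at $pos + k$ is a linear function (a rotation) of the encoding at $pos$, which makes relative positions easy for the model to learn.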
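
The relative-position property can be checked numerically with the angle-addition formulas. The values `d = 8`, `pos = 5`, and `k = 3` below are arbitrary illustrative choices, and `pe_pairs` is a hypothetical helper that groups the encoding into (sin, cos) pairs.

```python
import numpy as np

d = 8          # small embedding dimension, for illustration only
pos, k = 5, 3  # a position and a fixed offset

i = np.arange(d // 2)
omega = 1.0 / (10000.0 ** (2 * i / d))  # one frequency per (sin, cos) pair

def pe_pairs(p):
    """Rows are the pairs (PE[p, 2i], PE[p, 2i+1]) = (sin, cos)."""
    return np.stack([np.sin(p * omega), np.cos(p * omega)], axis=1)

base = pe_pairs(pos)         # shape (d/2, 2)
shifted = pe_pairs(pos + k)  # the encoding we want to predict

# Angle addition: the coefficients below depend on the offset k only,
# not on pos, so the shift is the same linear map at every position.
c, s = np.cos(k * omega), np.sin(k * omega)
predicted = np.stack([
    base[:, 0] * c + base[:, 1] * s,  # sin(a + b) = sin a cos b + cos a sin b
    base[:, 1] * c - base[:, 0] * s,  # cos(a + b) = cos a cos b - sin a sin b
], axis=1)

print(np.allclose(predicted, shifted))
```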

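2. Rotary positional encoding (RoPE)

Modern LLMs (such as LLaMA and Qwen) encode position by rotation:

RoPE rotation matrix

$$
f(x, m) = \begin{pmatrix} \cos m\theta & -\sin m\theta \\ \sin m\theta & \cos m\theta \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
$$

Here $m$ is the token position, $\theta$ is a fixed base angle (frequency), and $(x_1, x_2)$ is one pair of coordinates of a query or key vector. Position enters as the rotation angle $m\theta$, so the dot product between a rotated query and a rotated key depends only on their relative offset. The rotation is defined for any position, although in practice extrapolating well beyond the training length usually needs additional techniques.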
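
To see why rotation encodes relative position, the sketch below applies the 2-D rotation matrix to a toy query and key. It covers a single coordinate pair only; `rotate_2d`, the base angle `theta = 0.1`, and the vectors are illustrative assumptions, not any library's API.

```python
import numpy as np

def rotate_2d(x, m, theta):
    """Rotate the 2-D vector x by the angle m * theta (position m)."""
    angle = m * theta
    rot = np.array([[np.cos(angle), -np.sin(angle)],
                    [np.sin(angle),  np.cos(angle)]])
    return rot @ x

theta = 0.1               # illustrative base angle
q = np.array([1.0, 2.0])  # toy query pair
k = np.array([0.5, -1.0]) # toy key pair

# Same relative offset (4) at very different absolute positions.
score_near = rotate_2d(q, 3, theta) @ rotate_2d(k, 7, theta)
score_far = rotate_2d(q, 103, theta) @ rotate_2d(k, 107, theta)
print(np.isclose(score_near, score_far))

# Rotation changes direction only; the vector length is preserved.
print(np.isclose(np.linalg.norm(rotate_2d(q, 3, theta)), np.linalg.norm(q)))
```

A full RoPE layer applies such a rotation to every coordinate pair, each with its own frequency; that part is omitted here.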

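3. Fourier transform / signal processing

When processing audio or images, sin and cos are the basis functions used to decompose a signal into its frequency components.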
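
As a minimal illustration of frequency decomposition, the sketch below builds a signal from two sine waves and recovers their frequencies with NumPy's FFT. The sampling rate, the two frequencies (5 Hz and 12 Hz), and the amplitudes are made-up values for the example.

```python
import numpy as np

fs = 100                # samples per second (assumed)
t = np.arange(fs) / fs  # one second of samples
signal = 1.0 * np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)

# Real-input FFT: one complex coefficient per non-negative frequency bin.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)

# Scale so that a pure sine of amplitude A shows up as A in its bin.
amplitude = 2 * np.abs(spectrum) / len(signal)

# The two strongest bins should be the two frequencies we mixed in.
top = np.sort(freqs[np.argsort(amplitude)[-2:]])
print("dominant frequencies (Hz):", top)
```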

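📊 Core Formulas (Math)

Basic identities

Pythagorean identity

$$
\sin^2\theta + \cos^2\theta = 1
$$

Every point $(\cos\theta, \sin\theta)$ on the unit circle lies at distance 1 from the origin.

Derivatives

Function	Derivative
$\sin x$	$\cos x$
$\cos x$	$-\sin x$
$\tan x$	$\sec^2 x = \frac{1}{\cos^2 x}$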
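
The derivative table can be sanity-checked with central finite differences. This is only a numerical sketch; the evaluation point `x = 0.7` and the step size `h` are arbitrary choices.

```python
import math

def central_diff(f, x, h=1e-6):
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7  # arbitrary point where tan is well defined

# Pairs of (numerical derivative, closed-form derivative from the table).
checks = {
    "sin": (central_diff(math.sin, x), math.cos(x)),
    "cos": (central_diff(math.cos, x), -math.sin(x)),
    "tan": (central_diff(math.tan, x), 1 / math.cos(x) ** 2),
}

for name, (numeric, exact) in checks.items():
    print(f"d/dx {name}: numeric = {numeric:.6f}, formula = {exact:.6f}")
```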

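Euler's formula ("the most beautiful formula")

$$
e^{i\theta} = \cos\theta + i\sin\theta
$$

It links the exponential function, the trigonometric functions, and complex numbers, and it underlies the Fourier transform.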
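
Euler's formula is easy to verify with Python's complex-number support. The angle `theta = pi / 3` below is an arbitrary test value.

```python
import cmath
import math

theta = math.pi / 3  # arbitrary test angle

lhs = cmath.exp(1j * theta)                      # e^{i theta}
rhs = complex(math.cos(theta), math.sin(theta))  # cos(theta) + i sin(theta)
print(cmath.isclose(lhs, rhs))

# Setting theta = pi gives Euler's identity e^{i pi} + 1 = 0,
# which holds in floating point up to rounding error.
identity = cmath.exp(1j * math.pi) + 1
print(abs(identity) < 1e-12)
```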

import math

import torch


def positional_encoding(seq_len, d_model):
    """Sinusoidal Transformer positional encoding, shape (seq_len, d_model)."""
    pe = torch.zeros(seq_len, d_model)
    position = torch.arange(0, seq_len).unsqueeze(1).float()  # (seq_len, 1)

    # Frequencies 1 / 10000^(2i / d_model), one per (sin, cos) pair.
    # Using ceil(d_model / 2) terms keeps an odd d_model working as well.
    div_term = torch.exp(
        torch.arange(0, d_model, 2).float() *
        (-math.log(10000.0) / d_model)
    )

    # Even dimensions use sin, odd dimensions use cos.
    pe[:, 0::2] = torch.sin(position * div_term)
    # With an odd d_model there is one fewer odd column than even columns.
    pe[:, 1::2] = torch.cos(position * div_term[: d_model // 2])

    return pe


# Example
pe = positional_encoding(seq_len=10, d_model=512)
print(f"PE shape: {pe.shape}")  # torch.Size([10, 512])
print(f"PE[0, :4]: {pe[0, :4]}")  # first 4 dimensions at position 0

import numpy as np

# Basic trigonometric functions
theta = np.pi / 4  # 45 degrees

print(f"sin(45°) = {np.sin(theta):.4f}")  # 0.7071
print(f"cos(45°) = {np.cos(theta):.4f}")  # 0.7071
print(f"tan(45°) = {np.tan(theta):.4f}")  # 1.0000

# Verify the Pythagorean identity
print(f"sin²+cos² = {np.sin(theta)**2 + np.cos(theta)**2:.4f}")  # 1.0000

# Sample sin and cos over one period (for example, as input to a plot)
x = np.linspace(0, 2*np.pi, 100)
sin_y = np.sin(x)
cos_y = np.cos(x)

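⚠️ Common Pitfalls (Pitfalls)

Degrees: 360° = one full turn
Radians: 2π = one full turn

In Python, NumPy, and PyTorch, sin and cos take their argument in radians!

import numpy as np

# ❌ Wrong: passing degrees directly
np.sin(90)  # ≈ 0.894, not 1

# ✅ Right: convert to radians first
np.sin(np.radians(90))  # = 1
np.sin(np.pi / 2)       # = 1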

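🎵 Fourier Transform Visualization (Fourier Transform)

A flagship application of trigonometric functions: any reasonably well-behaved periodic waveform can be decomposed into a sum of simple sine waves.

[Interactive demo: a square wave built from sine components. With the number of components set to 5, the components are f=1, A=1.00; f=3, A=0.33; f=5, A=0.20; f=7, A=0.14; f=9, A=0.11, that is, odd frequencies f with amplitude A = 1/f.]

💡 Key insights:

Superposition: a complex waveform = the sum of several simple sine waves
Rotating circles: each circle represents one frequency component, and the chained circles together trace out the waveform
AI application: Transformer positional encoding uses sin/cos at different frequencies

💡 Adjust the number of components to watch sine waves approximate the square wave. The overshoot that persists near each jump is the Gibbs phenomenon.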
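
The square-wave approximation can be reproduced in a few lines. The sketch below sums odd harmonics with amplitude 1/f; the extra factor 4/π (an addition here, taken from the standard Fourier series of a square wave) scales the result so it approaches ±1. The function name and grid size are illustrative choices.

```python
import numpy as np

def square_wave_partial_sum(x, n_components):
    """Sum the first n odd harmonics sin(f x) / f, scaled by 4 / pi."""
    total = np.zeros_like(x, dtype=float)
    for n in range(n_components):
        f = 2 * n + 1               # odd frequencies 1, 3, 5, 7, 9, ...
        total += np.sin(f * x) / f  # amplitude A = 1 / f
    return 4 / np.pi * total

x = np.linspace(0, 2 * np.pi, 2001)
for n in (1, 5, 50):
    approx = square_wave_partial_sum(x, n)
    # The peak stays above 1 even with many components (Gibbs overshoot).
    print(f"{n:2d} components: peak value = {approx.max():.3f}")
```

Away from the jumps the sum settles toward 1, while the peak next to each jump stays noticeably above 1 however many components are added.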

🔗 Related Concepts

Linear algebra - rotating vectors
Calculus - derivatives of sin/cos
Transformer - positional encoding in practice