InfluxDB Time Series Database

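InfluxDB is one of the most widely used time series databases. It is built to store and query timestamped data such as monitoring metrics, sensor readings, and experiment measurements, and it supports downsampling and data retention policies, which makes it a common choice for device monitoring and experiment data logging.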

Key Points

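Topic	Description
Database type	Time series database (TSDB)
Latest version	InfluxDB 3 (Core, Enterprise, and Cloud offerings)
Query languages	SQL and InfluxQL in v3 (SQL-first); Flux and InfluxQL in 2.x
Core strengths	High write throughput, automatic compression, retention policies, downsampling
Typical use cases	System monitoring, experiment data collection, IoT, performance analysis
Storage engine	v3 is built on Apache Arrow and Parquet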

Installation and Configuration

Docker Installation

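# InfluxDB 2.x (stable, recommended)
# Port 8086 serves both the web UI and the HTTP API.
docker run -d \
  --name influxdb \
  -p 8086:8086 \
  -v influxdb_data:/var/lib/influxdb2 \
  influxdb:2.7

# Open http://localhost:8086 to complete the initial setup:
# choose an organization name, a bucket name, and an admin account.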
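
Before opening the setup page, it can help to confirm that the container is actually answering. The following is a minimal sketch using only the standard library, assuming the server is reachable at http://localhost:8086 and exposes the InfluxDB 2.x `/health` endpoint; the helper names `parse_health` and `influx_is_healthy` are made up for this example.

```python
import json
import urllib.request


def parse_health(body: bytes) -> bool:
    """Return True when a /health response body reports status "pass"."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return isinstance(payload, dict) and payload.get("status") == "pass"


def influx_is_healthy(base_url: str = "http://localhost:8086", timeout: float = 3.0) -> bool:
    """Call GET {base_url}/health; any network or HTTP error counts as unhealthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return parse_health(resp.read())
    except OSError:  # URLError and HTTPError are both subclasses of OSError
        return False


if __name__ == "__main__":
    print("InfluxDB healthy:", influx_is_healthy())
```

Splitting the parsing from the network call keeps the check easy to test without a running server.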

Basic Usage

1. Writing Data (Line Protocol)

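# InfluxDB ingests data in Line Protocol.
# Format: measurement,tag_key=tag_val field_key=field_val timestamp
# The timestamp is in nanoseconds by default; an integer field value takes an "i" suffix (e.g. 42i).

# Write experiment data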
curl -X POST "http://localhost:8086/api/v2/write?org=myorg&bucket=experiment" \
  -H "Authorization: Token YOUR_TOKEN" \
  -H "Content-Type: text/plain" \
  --data-raw '
qc_metrics,sample=T2D_001,tool=fastp total_reads=50000000i,pass_rate=0.95 1700000000000000000
qc_metrics,sample=T2D_001,tool=fastp q20_rate=0.98,q30_rate=0.93 1700000000000000000
qc_metrics,sample=T2D_002,tool=fastp total_reads=45000000i,pass_rate=0.92 1700003600000000000
'
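
The Line Protocol format described above can also be produced programmatically. The sketch below is a simplified illustration, not the official client's serializer: the function name `to_line_protocol` and its helpers are hypothetical, and the sketch assumes the usual escaping rules (commas and spaces in measurement names; commas, equals signs, and spaces in tag keys, tag values, and field keys; double quotes and backslashes in string field values). Python integers are written with the `i` suffix so they are stored as integer fields.

```python
def _escape(text: str, chars: str) -> str:
    """Backslash-escape every character of `chars` that occurs in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value) -> str:
    """Render a field value: bool, integer (i suffix), float, or quoted string."""
    if isinstance(value, bool):  # check bool first, since bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(measurement, tags, fields, timestamp_ns=None) -> str:
    """Build one Line Protocol line; tags are emitted in sorted key order."""
    if not fields:
        raise ValueError("a point needs at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    body = ",".join(_escape(k, ",= ") + "=" + _format_field(v) for k, v in fields.items())
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


if __name__ == "__main__":
    print(to_line_protocol(
        "qc_metrics",
        {"sample": "T2D_001", "tool": "fastp"},
        {"total_reads": 50000000, "pass_rate": 0.95},
        1700000000000000000,
    ))
```

Omitting `timestamp_ns` leaves the timestamp off the line, in which case the server assigns its own receive time.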

2. Using the Python Client

from influxdb_client import InfluxDBClient, Point  # pip install influxdb-client
from influxdb_client.client.write_api import SYNCHRONOUS

# Connect
client = InfluxDBClient(
    url="http://localhost:8086",
    token="YOUR_TOKEN",
    org="myorg"
)

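# Write data
write_api = client.write_api(write_options=SYNCHRONOUS)

# Write a single point (parentheses allow inline comments on the chained calls)
point = (
    Point("qc_metrics")
    .tag("sample", "T2D_003")        # tags: indexed metadata, used for filtering and grouping
    .tag("tool", "fastp")
    .field("total_reads", 48000000)  # fields: the measured values
    .field("pass_rate", 0.94)
)

write_api.write(bucket="experiment", record=point)

# Query data
# Note: the curl example above uses fixed timestamps from November 2023,
# so widen range() (e.g. start: 2023-11-01T00:00:00Z) to see those points.
query_api = client.query_api()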
query_api = client.query_api()
result = query_api.query('''
    from(bucket: "experiment")
    |> range(start: -24h)
    |> filter(fn: (r) => r._measurement == "qc_metrics")
    |> filter(fn: (r) => r.sample == "T2D_001")
''')

for table in result:
    for record in table.records:
        print(f"{record.get_time()}: {record.get_field()} = {record.get_value()}")

client.close()

3. Data Retention Policy

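# Set a retention period (data older than this is deleted automatically)
# In the InfluxDB UI: Load Data → Buckets → set the Retention Period
# Or create a bucket with a retention rule through the API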
curl -X POST "http://localhost:8086/api/v2/buckets" \
  -H "Authorization: Token YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "short_term",
    "orgID": "YOUR_ORG_ID",
    "retentionRules": [{"type": "expire", "everySeconds": 2592000}]
  }'
# 30 days = 2592000 seconds; data older than 30 days is deleted automatically
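
To avoid computing `everySeconds` by hand, the request body can be generated. This is a small sketch that mirrors the JSON used in the curl call above; the helper name `bucket_payload` is hypothetical, and `YOUR_ORG_ID` remains a placeholder to be replaced with a real organization ID.

```python
import json

SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def bucket_payload(name: str, org_id: str, retention_days: int) -> dict:
    """Build the JSON body for POST /api/v2/buckets with an expiry rule in days."""
    if retention_days <= 0:
        raise ValueError("retention_days must be a positive number of days")
    return {
        "name": name,
        "orgID": org_id,
        "retentionRules": [
            {"type": "expire", "everySeconds": retention_days * SECONDS_PER_DAY}
        ],
    }


if __name__ == "__main__":
    # Same bucket as the curl example: 30 days of retention.
    print(json.dumps(bucket_payload("short_term", "YOUR_ORG_ID", 30), indent=2))
```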

Common Errors and Fixes

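Error	Cause	Fix
`unauthorized`	Invalid token or insufficient permissions	Check the API token
`bucket not found`	The bucket does not exist	Create the bucket in the UI or through the API
Write fails	Malformed Line Protocol	Check the space and comma separators
Query times out	Time range too large	Narrow the range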
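
When writing over HTTP, the errors in the table usually show up as status codes. The lookup below is an illustrative sketch, assuming the commonly documented responses of the 2.x `/api/v2/write` endpoint (204 on success, 400, 401, 404, 413); the names `WRITE_STATUS_HINTS` and `explain_write_status` are made up for this example, and the response body should remain the authoritative source for the actual cause.

```python
# Hints for common /api/v2/write status codes (assumed mapping, see lead-in).
WRITE_STATUS_HINTS = {
    204: "Write accepted.",
    400: "Bad request, often malformed Line Protocol: check spaces and commas.",
    401: "Unauthorized: check the API token and its permissions.",
    404: "Organization or bucket not found: create the bucket or fix the name.",
    413: "Payload too large: send smaller batches.",
}


def explain_write_status(status: int) -> str:
    """Return a short troubleshooting hint for a write response status code."""
    return WRITE_STATUS_HINTS.get(
        status, f"Unexpected HTTP status {status}: inspect the response body."
    )


if __name__ == "__main__":
    for code in (204, 401, 500):
        print(code, "->", explain_write_status(code))
```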

Cheat Sheet

# ===== InfluxDB cheat sheet =====

# Docker install
docker run -d -p 8086:8086 influxdb:2.7

# Line Protocol format
# measurement,tag1=v1,tag2=v2 field1=v1,field2=v2 timestamp

# Flux query
# from(bucket: "b") |> range(start: -1h) |> filter(fn: (r) => r._measurement == "m")

# Python: pip install influxdb-client
# client = InfluxDBClient(url="http://localhost:8086", token="...", org="...")
# write_api.write(bucket="b", record=point)
# query_api.query('from(bucket:"b") |> range(start:-1h)')

# Common use cases
# Metrics collection          CPU / memory / disk usage
# Experiment data logging     sequencing QC metrics over time
# Pipeline performance        per-step runtime statistics