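Elasticsearch Performance Tuning Tutorial

Tuning Elasticsearch performance spans index design, query optimization, JVM settings, and hardware configuration. On a BandwagonHost VPS, where resources are limited, sensible tuning lets Elasticsearch make the most of the memory and CPU available. This article walks through ES tuning methods from a practical angle, so that you can get good search and analytics performance on constrained hardware. Before you start, make sure you have completed the Elasticsearch deployment.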

1. Index Design Optimization

1.1 Choosing a Sensible Shard Count

# List shards, sorted by index name
curl -X GET "localhost:9200/_cat/shards?v&s=index"

# Set the shard count when creating an index
curl -X PUT "localhost:9200/logs-2026" -H 'Content-Type: application/json' -d '{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "refresh_interval": "30s"
  }
}'

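Shard sizing: aim for 10-50 GB per shard. On a single-node BandwagonHost VPS, one primary shard per index is enough, and replicas should stay at 0 because a replica can never be allocated on the same node as its primary. Too many shards waste memory and CPU, since every shard carries its own overhead.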
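
The sizing rule above can be turned into a quick estimate. The following is a minimal sketch, not an official formula: `estimate_shards` is a hypothetical helper, and the 30 GB default target is an assumption picked from the middle of the 10-50 GB range.

```shell
# Hypothetical helper: estimate the primary shard count for one index.
# $1 = expected index size in GB, $2 = target shard size in GB (assumed default: 30)
estimate_shards() {
  size_gb=$1
  target_gb=${2:-30}
  # Ceiling division so that no shard exceeds the target size
  shards=$(( (size_gb + target_gb - 1) / target_gb ))
  # An index always has at least one primary shard
  if [ "$shards" -lt 1 ]; then
    shards=1
  fi
  echo "$shards"
}

estimate_shards 20    # small index: a single shard is enough
estimate_shards 120   # larger index, split at roughly 30 GB per shard
```

On a small VPS the result is almost always 1, which matches the setting used in the index creation request above.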

1.2 Mapping Optimization

{
  "mappings": {
    "properties": {
      "title": { "type": "text", "analyzer": "ik_max_word" },
      "status": { "type": "keyword" },
      "price": { "type": "scaled_float", "scaling_factor": 100 },
      "tags": { "type": "keyword" },
      "description": { "type": "text", "index": false },
      "created_at": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss||epoch_millis" }
    },
    "dynamic": "strict"
  }
}

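Set "index": false on fields that never need to be searched.
Leave fielddata disabled on text fields (the default); use a keyword field for aggregations and sorting instead.
Use scaled_float for decimals such as prices to save space.
Set "dynamic": "strict" so that unexpected fields are rejected instead of polluting the mapping.
The ik_max_word analyzer comes from the IK analysis plugin, which must be installed before this mapping can be created.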
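
To see why scaled_float saves space: the value is multiplied by scaling_factor at index time and rounded to the nearest long, so a price is stored as an integer number of cents. The sketch below only mimics that arithmetic locally; `scaled_value` is a hypothetical helper and does not talk to Elasticsearch.

```shell
# Hypothetical helper: mimic how a scaled_float value is stored.
# $1 = field value, $2 = scaling_factor from the mapping
scaled_value() {
  # Multiply by the scaling factor and round to the nearest integer
  awk -v v="$1" -v f="$2" 'BEGIN { printf "%.0f\n", v * f }'
}

scaled_value 99.99 100    # price from the bulk example, stored as whole cents
scaled_value 199.99 100
```

A scaling_factor of 100 keeps two decimal places; digits beyond that are lost to rounding, so pick the factor to match the precision you actually need.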

1.3 Index Templates

curl -X PUT "localhost:9200/_index_template/logs_template" -H 'Content-Type: application/json' -d '{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0,
      "refresh_interval": "30s",
      "codec": "best_compression"
    }
  }
}'

2. Write Performance Optimization

# Bulk indexing with the _bulk API
curl -X POST "localhost:9200/_bulk" -H 'Content-Type: application/x-ndjson' -d '
{"index":{"_index":"products","_id":"1"}}
{"name":"Product 1","price":99.99}
{"index":{"_index":"products","_id":"2"}}
{"name":"Product 2","price":199.99}
'

# Raise refresh_interval during heavy indexing
curl -X PUT "localhost:9200/products/_settings" -H 'Content-Type: application/json' -d '{
  "index.refresh_interval": "60s"
}'

# Restore it once indexing is done
curl -X PUT "localhost:9200/products/_settings" -H 'Content-Type: application/json' -d '{
  "index.refresh_interval": "1s"
}'

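Key points for writes: index in batches with the _bulk API (5-15 MB per request); raise refresh_interval during heavy indexing; and, on clusters that use replicas, set number_of_replicas to 0 for the initial load and restore it afterwards.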
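
For more than a handful of documents, the bulk body is usually generated rather than typed. The following is a minimal sketch under stated assumptions: `bulk_body` is a hypothetical helper, and the document fields are made up to match the two-document example above.

```shell
# Hypothetical helper: emit an NDJSON _bulk body for document IDs $2..$3.
# $1 = index name, $2 = first ID, $3 = last ID
bulk_body() {
  index=$1
  i=$2
  last=$3
  while [ "$i" -le "$last" ]; do
    # Action line, followed by the document source on its own line
    printf '{"index":{"_index":"%s","_id":"%s"}}\n' "$index" "$i"
    printf '{"name":"Product %s","price":%s.99}\n' "$i" "$i"
    i=$((i + 1))
  done
}

# Preview a small batch locally
bulk_body products 1 3

# To send a batch, pipe it to curl with --data-binary so newlines are preserved:
# bulk_body products 1 1000 | curl -s -X POST "localhost:9200/_bulk" \
#   -H 'Content-Type: application/x-ndjson' --data-binary @-
```

Every document takes two lines, and the body must end with a newline. Adjust the ID range per request until each batch lands in the 5-15 MB range.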

3. Query Performance Optimization

# Use filter context instead of query context (cacheable, no scoring)
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "active" } },
        { "range": { "price": { "gte": 100 } } }
      ]
    }
  }
}

# Return only the fields you need
{
  "query": { "match": { "name": "笔记本" } },
  "_source": ["name", "price", "category"]
}

# Use search_after instead of from/size for deep pagination
# By default, from + size may not exceed 10000 (index.max_result_window)
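
A search_after request repeats the same query and sort on every page and passes the sort values of the last hit from the previous page. The sketch below only builds the request body. `search_after_body` is a hypothetical helper, and `sku` is an assumed unique keyword field used as a tiebreaker; it is not part of the mapping shown in section 1.2.

```shell
# Hypothetical helper: build the request body for the next page.
# $1 = created_at sort value of the last hit (epoch millis)
# $2 = sku sort value of the last hit (assumed unique keyword field)
search_after_body() {
  printf '{"size":100,"query":{"bool":{"filter":[{"term":{"status":"active"}}]}},'
  printf '"sort":[{"created_at":"asc"},{"sku":"asc"}],'
  printf '"search_after":[%s,"%s"]}\n' "$1" "$2"
}

search_after_body 1767225600000 SKU-0100

# Usage sketch against a running node:
# search_after_body 1767225600000 SKU-0100 | curl -s -X GET "localhost:9200/products/_search" \
#   -H 'Content-Type: application/json' --data-binary @-
```

The first page is the same request without the search_after key. A unique tiebreaker in the sort keeps documents from being skipped or repeated when several share the same created_at value.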

4. JVM and System Tuning

# JVM heap settings (/etc/elasticsearch/jvm.options.d/custom.options)
# BandwagonHost 2 GB plan (keep only the pair that matches your plan)
-Xms768m
-Xmx768m

# BandwagonHost 4 GB plan
-Xms2g
-Xmx2g

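# Kernel parameters (run as root)
echo "vm.max_map_count=262144" >> /etc/sysctl.conf
echo "vm.swappiness=1" >> /etc/sysctl.conf
sysctl -p

# Disable swap (lasts until reboot; remove the swap entries from /etc/fstab to make it permanent)
swapoff -a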

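Set the JVM heap to no more than 50% of physical memory, and never above 31 GB (the compressed object pointer limit). The remaining memory is left to the operating system's filesystem cache, which Lucene relies on.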
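
The heap rule can be written as a small calculation. The following is a rough sketch: `heap_mb` is a hypothetical helper that returns the upper bound from the rule above, and on very small plans a lower value (such as the 768m shown for the 2 GB plan) leaves more room for the OS.

```shell
# Hypothetical helper: upper bound for the JVM heap, in MB.
# $1 = physical memory in MB
heap_mb() {
  ram_mb=$1
  cap_mb=$((31 * 1024))      # stay under the compressed pointer limit
  heap=$((ram_mb / 2))       # at most half of physical memory
  if [ "$heap" -gt "$cap_mb" ]; then
    heap=$cap_mb
  fi
  echo "$heap"
}

heap_mb 4096    # 4 GB plan

# On Linux, the installed memory can be read from /proc/meminfo:
# heap_mb "$(awk '/MemTotal/ {print int($2/1024)}' /proc/meminfo)"
```

Write the result into both -Xms and -Xmx so the heap is not resized at runtime.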

5. Cache Management

# Check cache usage
curl -X GET "localhost:9200/_nodes/stats/indices/query_cache,request_cache,fielddata?pretty"

# Clear caches
curl -X POST "localhost:9200/_cache/clear"

# Enable the request cache
curl -X PUT "localhost:9200/products/_settings" -H 'Content-Type: application/json' -d '{
  "index.requests.cache.enable": true
}'
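
The node stats response reports hit_count and miss_count for both query_cache and request_cache. A hit ratio is a quick way to judge whether a cache is earning its memory. This is a minimal sketch; `hit_ratio` is a hypothetical helper, and the two counters are assumed to be copied by hand from the stats output.

```shell
# Hypothetical helper: cache hit ratio as a whole percentage.
# $1 = hit_count, $2 = miss_count (taken from the node stats output)
hit_ratio() {
  hits=$1
  misses=$2
  total=$((hits + misses))
  # Avoid division by zero on a cache that has not been used yet
  if [ "$total" -eq 0 ]; then
    echo 0
    return
  fi
  echo $(( hits * 100 / total ))
}

hit_ratio 750 250
```

A persistently low ratio suggests that queries are rarely repeated, in which case clearing or enlarging the cache is unlikely to help.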

6. Index Lifecycle Management

# Create an ILM policy
curl -X PUT "localhost:9200/_ilm/policy/logs_policy" -H 'Content-Type: application/json' -d '{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "5gb", "max_age": "7d" }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "shrink": { "number_of_shards": 1 },
          "forcemerge": { "max_num_segments": 1 }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}'
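
A policy only acts on indices that reference it. For indices that are not part of a data stream, the rollover action also needs a write alias. The sketch below prints the two settings involved. `ilm_settings` is a hypothetical helper, and the alias name `logs` is an assumption.

```shell
# Hypothetical helper: print the index settings that attach an ILM policy.
# $1 = policy name, $2 = rollover alias (assumed name, not from the original setup)
ilm_settings() {
  printf '{"index.lifecycle.name":"%s","index.lifecycle.rollover_alias":"%s"}\n' "$1" "$2"
}

ilm_settings logs_policy logs
```

These two keys would go into the "settings" object of the logs-* index template from section 1.3. The first index then has to be created with the alias marked as its write index, so that rollover has something to switch.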

7. Monitoring Key Metrics

# Cluster health and key metrics
curl -X GET "localhost:9200/_cluster/stats?pretty"

# Hot threads per node (for diagnosing high CPU usage)
curl -X GET "localhost:9200/_nodes/hot_threads"

# Slow query log configuration
curl -X PUT "localhost:9200/products/_settings" -H 'Content-Type: application/json' -d '{
  "index.search.slowlog.threshold.query.warn": "5s",
  "index.search.slowlog.threshold.query.info": "2s",
  "index.search.slowlog.threshold.fetch.warn": "1s"
}'

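8. Optimization Checklist

Keep shard size within 10-50 GB and the total number of shards per node under 1000.
Disable indexing on fields that do not need to be searched.
Prefer filter context over query context.
Use the _bulk API for batch writes.
Avoid deep pagination (from + size > 10000).
Set the JVM heap to at most 50% of physical memory, never above 31 GB.
Disable swap and raise vm.max_map_count.
Manage index lifecycles with ILM.
Run forcemerge periodically to reduce the segment count, but only on indices that are no longer being written to.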

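Summary

Elasticsearch performance tuning is an ongoing process that has to be matched to your workload and data. On a BandwagonHost VPS, focus on memory allocation, shard strategy, and query optimization. For more, see the Query DSL tutorial. If your search needs are simpler, Meilisearch and Typesense are lighter-weight options. To choose a BandwagonHost VPS, see the full list of plans; the coupon code NODESEEK2026 gives a 6.77% discount, and bwh81.net leads to the official site.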