
Migrating from AWS OpenSearch to Elasticsearch: A Rundown of Issues

Original article
Author: 岳涛
Last modified: 2024-12-24 10:47:51
Column: 大数据生态

Note

The problems and solutions described in this article also apply to Tencent Cloud Elasticsearch Service (ES).

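Background

The OpenSearch project was forked from the Elasticsearch 7.10 codebase. After Elastic moved later Elasticsearch releases (7.11 onward) away from the Apache 2.0 license to the SSPL and Elastic License, AWS launched OpenSearch as a fully open-source alternative.

Migrating from AWS OpenSearch to Tencent Cloud Elasticsearch can surface a number of compatibility problems. These need to be resolved first so that the migration itself runs smoothly.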

Issues

1. mapping issues

1.1. mapping: the _all parameter is incompatible

Migrating the index definitions fails with:

[2024-11-27 17:22:51,099] [INFO] Starting sync indices..
[2024-11-27 17:22:51,153] [ERROR] Index [trades_uniswapv2_v202410] FAILED {'error': {'root_cause': [{'type': 'mapper_parsing_exception', 'reason': 'Root mapping definition has unsupported parameters:  [_all : {enabled=false}]'}], 'type': 'mapper_parsing_exception', 'reason': 'Failed to parse mapping [_doc]: Root mapping definition has unsupported parameters:  [_all : {enabled=false}]', 'caused_by': {'type': 'mapper_parsing_exception', 'reason': 'Root mapping definition has unsupported parameters:  [_all : {enabled=false}]'}}, 'status': 400}
[2024-11-27 17:22:51,204] [ERROR] Index [aggrtrades_hadax_v202410] FAILED {'error': {'root_cause': [{'type': 'mapper_parsing_exception', 'reason': 'Root mapping definition has unsupported parameters:  [_all : {enabled=false}]'}], 'type': 'mapper_parsing_exception', 'reason': 'Failed to parse mapping [_doc]: Root mapping definition has unsupported parameters:  [_all : {enabled=false}]', 'caused_by': {'type': 'mapper_parsing_exception', 'reason': 'Root mapping definition has unsupported parameters:  [_all : {enabled=false}]'}}, 'status': 400}
[2024-11-27 17:22:51,247] [ERROR] Index [aggrtrades_binance_5m_v202410] FAILED {'error': {'root_cause': [{'type': 'mapper_parsing_exception', 'reason': 'Root mapping definition has unsupported parameters:  [_all : {enabled=false}]'}], 'type': 'mapper_parsing_exception', 'reason': 'Failed to parse mapping [_doc]: Root mapping definition has unsupported parameters:  [_all : {enabled=false}]', 'caused_by': {'type': 'mapper_parsing_exception', 'reason': 'Root mapping definition has unsupported parameters:  [_all : {enabled=false}]'}}, 'status': 400}
[2024-11-27 17:22:51,384] [INFO] Index [aggrtrades_bitget_v202410] SUCCESS
[2024-11-27 17:22:53,839] [INFO] Sync completed.
Root cause

Inspect the mapping on the OpenSearch side:

{
  "trades_uniswapv2_v202410" : {
    "aliases" : {
      "transaction_uniswapv2_read" : { }
    },
    "mappings" : {
      "_all" : {
        "enabled" : false
      },
      "properties" : {
        "amount" : {
          "type" : "double"
        },
        "date" : {
          "type" : "date",
          "format" : "epoch_millis"
        },
        "id" : {
          "type" : "long"
        },
        "key" : {
          "type" : "keyword",
          "eager_global_ordinals" : true
        },
        "price" : {
          "type" : "double"
        },
        "trade_type" : {
          "type" : "keyword",
          "eager_global_ordinals" : true
        },
        "volume" : {
          "type" : "double"
        }
      }
    },
    "settings" : {
      "index" : {
        "refresh_interval" : "15s",
        "translog" : {
          "sync_interval" : "5s",
          "durability" : "request"
        },
        "provided_name" : "trades_uniswapv2_v202410",
        "max_result_window" : "65536",
        "creation_date" : "1726645622697",
        "history" : {
          "uuid" : "ZCWlNPggSm-9PsvPpaxRtw"
        },
        "sort" : {
          "field" : [ "date", "id" ],
          "order" : [ "desc", "desc" ]
        },
        "unassigned" : {
          "node_left" : {
            "delayed_timeout" : "5m"
          }
        },
        "number_of_replicas" : "0",
        "uuid" : "J0mJY064T1ufpodUDObykA",
        "version" : {
          "created" : "6080299",
          "upgraded" : "135248027"
        },
        "number_of_shards" : "1"
      }
    }
  }
}

The mapping contains the _all parameter, which the target does not accept:

      "_all" : {
        "enabled" : false
      }

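The _all field was deprecated in Elasticsearch 6.0 and removed in 7.0, so it must be stripped from the mapping during migration.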
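
Stripping the parameter can be sketched as follows. This is a minimal illustration, not the migration tool's actual code: the function name drop_all_field is hypothetical, and it assumes the index definition has already been fetched as a Python dict with the typeless layout shown above (_all directly under mappings).

```python
import copy


def drop_all_field(index_body):
    """Return a copy of an index definition without the root-level `_all` mapping.

    Assumes a typeless mapping, i.e. `_all` sits directly under "mappings".
    """
    cleaned = copy.deepcopy(index_body)  # leave the caller's dict untouched
    cleaned.get("mappings", {}).pop("_all", None)
    return cleaned


source = {
    "mappings": {
        "_all": {"enabled": False},
        "properties": {"price": {"type": "double"}},
    }
}
target = drop_all_field(source)
```

The copy keeps the definition read from the source cluster intact, so it can still be logged or compared after the cleaned version is sent to the target.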

1.2. Custom type compatibility

Migrating the index definitions fails with "has multiple types":

[2024-11-15 11:20:30,970] [INFO] Starting sync indices..
[2024-11-15 11:20:32,107] [WARNING] Index [open_search_sales_sku] Already exists, skipped
[2024-11-15 11:20:32,168] [ERROR] Index [open_push_msg_log] has multiple types.
[2024-11-15 11:20:32,805] [ERROR] Index [alamein_b_spu_sku_search_new1] has multiple types.
[2024-11-15 11:20:32,882] [ERROR] Index [alamein_b_spu_sku_search] has multiple types.
[2024-11-15 11:20:32,924] [WARNING] Index [dto_shared_spu] Already exists, skipped
[2024-11-15 11:20:32,965] [WARNING] Index [open_search_sales_spu] Already exists, skipped
[2024-11-15 11:20:33,140] [INFO] Index [alamein_c_spu_sku_search] SUCCESS
[2024-11-15 11:20:33,343] [INFO] Index [alamein_b_spu_sku_index] SUCCESS
[2024-11-15 11:20:33,383] [ERROR] Index [open_search_after_sale] has multiple types.
[2024-11-15 11:20:33,425] [WARNING] Index [open_fail_msg_log] Already exists, skipped
[2024-11-15 11:20:33,476] [WARNING] Index [alamein_delivery] Already exists, skipped
[2024-11-15 11:20:33,520] [WARNING] Index [open_api_doc_content] Already exists, skipped
[2024-11-15 11:20:33,581] [WARNING] Index [alamein_po_delivery] Already exists, skipped
[2024-11-15 11:20:33,581] [INFO] Sync completed.
Root cause

The mapping of one affected index (name redacted as xxxx):
{
  "xxxx" : {
    "aliases" : {
      "xxxx_alias" : { }
    },
    "mappings" : {
      "dynamic_templates" : [ {
        "string_as_keyword" : {
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "keyword"
          }
        }
      } ],
      "properties" : {
        "appKey" : {
          "type" : "keyword"
        },
        "businessNumber" : {
          "type" : "keyword"
        },
        "createTime" : {
          "type" : "date",
          "format" : "8uuuu-MM-dd HH:mm:ss||8uuuu-MM-dd||epoch_millis"
        },
        "elapsedTime" : {
          "type" : "long"
        },
        "lastUpdateTime" : {
          "type" : "date",
          "format" : "8uuuu-MM-dd HH:mm:ss||8uuuu-MM-dd||epoch_millis"
        },
        "mchId" : {
          "type" : "integer"
        },
        "msgBornTime" : {
          "type" : "date",
          "format" : "8uuuu-MM-dd HH:mm:ss||8uuuu-MM-dd||epoch_millis"
        },
        "msgName" : {
          "type" : "keyword"
        },
        "requestText" : {
          "type" : "text"
        },
        "respMsg" : {
          "type" : "text"
        },
        "respStatus" : {
          "type" : "keyword"
        },
        "retryTimes" : {
          "type" : "integer"
        },
        "sourceMsgId" : {
          "type" : "keyword"
        }
      }
    },
    "settings" : {
      "index" : {
        "replication" : {
          "type" : "DOCUMENT"
        },
        "number_of_shards" : "1",
        "provided_name" : "xxxx",
        "creation_date" : "1728377239608",
        "number_of_replicas" : "1",
        "uuid" : "TjowT6POSyqAqB5-3SSn-g",
        "version" : {
          "created" : "136347827"
        }
      }
    }
  }
}

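The mapping shows that the index does not actually have multiple types. The migration tool counted dynamic_templates as a type. The multi-type check therefore has to filter dynamic_templates out of the keys under mappings before counting the mapping types.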
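
A corrected check might look like the sketch below. The function names and the exact contents of ROOT_MAPPING_KEYS are assumptions made for illustration; the article only calls for excluding dynamic_templates, and the other entries are root-level mapping parameters that are likewise not type names.

```python
# Root-level mapping parameters that must not be counted as type names.
# The article only requires excluding "dynamic_templates"; the rest of this
# set is an assumed, non-exhaustive list and can be extended as needed.
ROOT_MAPPING_KEYS = {
    "properties", "dynamic", "dynamic_templates", "date_detection",
    "dynamic_date_formats", "numeric_detection",
    "_all", "_source", "_routing", "_meta", "_field_names",
}


def mapping_type_names(mappings):
    """Return the keys under "mappings" that look like custom type names."""
    return [key for key in mappings if key not in ROOT_MAPPING_KEYS]


def has_multiple_types(mappings):
    """True only if more than one real type name is present."""
    return len(mapping_type_names(mappings)) > 1


typeless = {"dynamic_templates": [], "properties": {"appKey": {"type": "keyword"}}}
two_types = {"order": {"properties": {}}, "refund": {"properties": {}}}
```

With this filter the typeless mapping shown above yields no type names at all, while a mapping that really holds two custom types is still reported.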

2. setting issues

2.1. setting: the replication parameter is incompatible

OpenSearch index settings include a replication parameter:

index.replication.type
Root cause

This setting is specific to OpenSearch and is not recognized by Elasticsearch, so it must be dropped during migration.
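
Dropping the setting can be sketched in the same way as the mapping cleanup. The function name drop_replication_setting is hypothetical, and the sketch assumes the nested settings layout shown in section 1.2 (settings, then index, then replication). Settings returned in flat form, such as a literal "index.replication.type" key, are not handled here.

```python
import copy


def drop_replication_setting(index_body):
    """Return a copy of an index definition without settings.index.replication.

    Assumes nested settings as returned by GET /<index> without flat_settings.
    """
    cleaned = copy.deepcopy(index_body)  # leave the caller's dict untouched
    cleaned.get("settings", {}).get("index", {}).pop("replication", None)
    return cleaned


source = {
    "settings": {
        "index": {
            "replication": {"type": "DOCUMENT"},
            "number_of_shards": "1",
            "number_of_replicas": "1",
        }
    }
}
target = drop_replication_setting(source)
```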

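3. Connection issues

3.1. Connecting to OpenSearch without authentication

Sending credentials to an OpenSearch service that does not require authentication fails with "not a valid key=value pair":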

[ERROR] https://search-api-os-prod-xxxxxxxx.cn-north-1.es.amazonaws.com.cn: 403 {"message":"'ZWxhc3RpYzpOb25l' not a valid key=value pair (missing equal-sign) in Authorization header: 'Basic ZWxhc3RpYzpOb25l'."}
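Root cause

OpenSearch is strict about how credentials are passed. Elasticsearch tolerates a username and password sent to a cluster that has authentication disabled, but OpenSearch rejects them. When accessing OpenSearch, the headers therefore have to be built according to whether a password was actually supplied: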

import base64

headers = {"Content-Type": "application/json"}

# Build the Authorization header manually, and only when a password is set.
# self.auth is the (username, password) pair held by the client object.
username, password = self.auth
if password:
    credentials = f"{username}:{password}"
    encoded_credentials = base64.b64encode(credentials.encode("utf-8")).decode("utf-8")
    headers["Authorization"] = f"Basic {encoded_credentials}"
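
To confirm what a client actually sent, the Basic token quoted in the error message can be decoded. The helper below is a small diagnostic sketch. The name decode_basic_header is hypothetical and not part of the migration tool.

```python
import base64


def decode_basic_header(header_value):
    """Split a 'Basic <token>' Authorization value into (username, password)."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


# Header value copied from the 403 response above.
print(decode_basic_header("Basic ZWxhc3RpYzpOb25l"))  # ('elastic', 'None')
```

The decoded password is the literal string None, which suggests that an unset password was formatted into the credentials. That is the case the if password guard is meant to prevent.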

3.2. read timeout

warning: thread "638eed1584ba653eee01d0d55f62ae81dc3fa166ec38e2099a515c74ebca0bcb_slice_0" terminated with exception (report_on_exception is true):warning: thread "638eed1584ba653eee01d0d55f62ae81dc3fa166ec38e2099a515c74ebca0bcb_slice_2" terminated with exception (report_on_exception is true):

Manticore::SocketTimeout: Read timed out
       initialize at /root/es/logstash-7.10.2/vendor/bundle/jruby/2.5.0/gems/manticore-0.7.0-java/lib/manticore/response.rb:37
...
Manticore::SocketTimeout: Read timed out
       initialize at /root/es/logstash-7.10.2/vendor/bundle/jruby/2.5.0/gems/manticore-0.7.0-java/lib/manticore/response.rb:37
...
warning: thread "638eed1584ba653eee01d0d55f62ae81dc3fa166ec38e2099a515c74ebca0bcb_slice_1" terminated with exception (report_on_exception is true):
Manticore::SocketTimeout: Read timed out
       initialize at /root/es/logstash-7.10.2/vendor/bundle/jruby/2.5.0/gems/manticore-0.7.0-java/lib/manticore/response.rb:37
...
[2024-11-28T10:26:18,637][ERROR][logstash.javapipeline    ][main][638eed1584ba653eee01d0d55f62ae81dc3fa166ec38e2099a515c74ebca0bcb] A plugin had an unrecoverable error. Will restart this plugin.
  Pipeline_id:main
  Plugin: <LogStash::Inputs::Elasticsearch slices=>3, password=><password>, size=>5000, hosts=>["https://xxx-xxxx.ap-southeast-1.es.amazonaws.com"], scroll=>"5m", index=>"*_v202410", docinfo=>true, id=>"638eed1584ba653eee01d0d55f62ae81dc3fa166ec38e2099a515c74ebca0bcb", user=>"elastic-aic", enable_metric=>true, codec=><LogStash::Codecs::JSON id=>"json_6fce07e4-7241-4132-8296-5f25ffcb3f58", enable_metric=>true, charset=>"UTF-8">, query=>"{ \"sort\": [ \"_doc\" ] }", docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], connect_timeout_seconds=>10, request_timeout_seconds=>60, socket_timeout_seconds=>60, ssl=>false>
  Error: Read timed out
  Exception: Manticore::SocketTimeout
  Stack: /root/es/logstash-7.10.2/vendor/bundle/jruby/2.5.0/gems/manticore-0.7.0-java/lib/manticore/response.rb:37:in `block in initialize'
...
[2024-11-28T10:26:18,816][FATAL][logstash.runner          ] An unexpected error occurred! {:error=>#<Manticore::SocketTimeout: Read timed out>, :backtrace=>["/root/es/logstash-7.10.2/vendor/bundle/jruby/2.5.0/gems/manticore-0.7.0-java/lib/manticore/response.rb:37:in `block in initialize'", 
...
[2024-11-28T10:26:18,824][ERROR][org.logstash.Logstash    ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit
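Root cause

Because the input sets slices=>3, the concurrent multi-threaded reads hit the 60 s read timeout (socket_timeout_seconds=>60). There are two options:

● Change scroll=>"5m" to scroll=>"50s";

● Remove the slices parameter.

Either option resolves the problem.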

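Original-content notice: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

For infringement concerns, contact cloudcommunity@tencent.com to request removal.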
