Elasticsearch Advanced | A Hands-On Record of Running DSL Scripts in Kibana

Introduction | Reflections on a hands-on session running DSL scripts in Kibana

Elasticsearch ranks 8th in the DB-Engines popularity ranking

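Overview | Elasticsearch, a distributed full-text search engine

Elasticsearch is a search server built on Lucene. It provides a distributed, multi-tenant full-text search engine behind a RESTful web interface. It is written in Java, released as open source under the Apache License, and is a popular enterprise search engine. It is widely used in cloud environments and offers near-real-time search, stability, reliability, speed, and easy installation. Elasticsearch, ES for short, is a document-oriented NoSQL (non-relational) database and a popular big-data component in the internet industry.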


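1. Elasticsearch Script History - Distributed Full-Text Search - Scripting Engine History

Early versions of ES used MVEL scripts. Groovy scripting was introduced to address MVEL's security risks.

Groovy then brought security vulnerabilities and memory leaks of its own, so Painless was announced with ES 5.0. That was several years ago, and Painless has been the scripting language in front of developers ever since.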


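2. Elasticsearch Script Application Scenarios - Distributed Full-Text Search - Scripting Engine Use Cases

We all know Elasticsearch as a full-text search engine. Every release line offers a rich DSL for create, read, update, and delete operations. This article uses the 6.x line, specifically 6.8.6, as its example.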

https://www.elastic.co/guide/en/elasticsearch/reference/6.8/docs.html

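Plain CRUD handles more than 80% of business scenarios with ease. Scripts come in for the relatively complex ones:

custom multi-field updates, custom reindex, dynamically appending to array fields, and so on.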

https://www.elastic.co/guide/en/elasticsearch/painless/6.8/painless-regexes.html

Of course, the same can be achieved by developing your own plugin on top of the script engine:

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting-engine.html


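As its name suggests, Painless is meant to be "painless" and free of those vulnerabilities. Two points still deserve particular attention: do not start ES as the root user, and do not expose ES paths to other users.

Judging from the official introduction to scripting, performance is the first concern and the business use case the second. eBay's performance tuning write-up (English version) reflects this as well: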

https://www.ebayinc.com/stories/blogs/tech/elasticsearch-performance-tuning-practice-at-ebay/

The Chinese version is bookmarked here too:

https://www.infoq.cn/article/elasticsearch-performance-tuning-practice-at-ebay


For the 80%+ of business scenarios, see the author's collected Elasticsearch Kibana DSL CRUD reference:

GET _search
{ "query": { "match_all": {} } }

# Node information
GET _cat/nodes?v

# Disk allocation per node
GET _cat/allocation?v

# Index information
GET _cat/indices?v

# Shard information
GET _cat/shards?v

# Register a snapshot repository (shared filesystem)
PUT _snapshot/my_backup
{ "type": "fs", "settings": { "location": "/home/user/yxd179/es/backup" }}

# View the repository
GET /_snapshot/my_backup?pretty

# List the registered snapshot repositories
GET _snapshot

# Create a snapshot named snapshot_yd in the my_backup repository. Without "indices", every open index would be backed up; here only index 809ijpomsi2zmjruqkrr0q is included. By default the call returns immediately and the snapshot runs in the background. With wait_for_completion=true the call blocks until the snapshot finishes, which can take a long time for a large snapshot.
PUT /_snapshot/my_backup/snapshot_yd?wait_for_completion=true
{ "indices": "809ijpomsi2zmjruqkrr0q", "ignore_unavailable": true, "include_global_state": false, "metadata": { "taken_by": "phr", "taken_because": "backup before upgrading" }}

# View a snapshot
GET /_snapshot/my_backup/snapshot_yd

# View all snapshots
GET /_snapshot/my_backup/_all

# Delete a snapshot
DELETE /_snapshot/my_backup/snapshot_yd

# Monitor snapshot creation or restore progress
GET /_snapshot/my_backup/snapshot_yd/_status

# Restore a snapshot
POST /_snapshot/my_backup/snapshot_yd/_restore
# Index template with dynamic mapping templates
PUT /_template/yxd179_tpl
{ "index_patterns": [ "yxd179-2021*" ], "settings": { "number_of_shards": 1, "number_of_replicas": 1 }, "mappings": { "yd": { "dynamic_templates": [ { "strings": { "match_mapping_type": "string", "mapping": { "type": "text", "index": true, "copy_to": "full_context", "analyzer": "ik_max_word", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } } } } ], "properties": { "full_context": { "type": "text", "analyzer": "ik_max_word", "fielddata": true, "store": true } } } }}

# Set the number of replica shards
PUT /yxd179-2021/_settings
{ "number_of_replicas": "1"}

# Paginated search
GET /yxd179-2021/yd/_search
{ "from": 0, "size": 30}

# Get a document by id
GET /yxd179-2021/yd/647461503271768064
# Bool query
GET /yxd179-2021/yd/_search
{ "query": { "bool": { "must": [ { "bool": { "should": [ { "match": { "regnumber": "20203030651" } } ] } }, { "term": { "status": "1" } } ] } }, "sort": [ { "createtime": { "order": "desc" } } ], "from": 0, "size": 10}

# Raise the maximum result window (from + size) allowed for the index
PUT /yxd179-2021/_settings
{ "index": { "max_result_window": 13000000 }}

# Show how a field's analyzer tokenizes a text
POST /yxd179-2021/_analyze
{ "field": "regnumber", "text": "国械标准20203030651号"}

# Wildcard (fuzzy) matching
GET /yxd179-2021/yd/_search
{ "query": { "bool": { "must": [ { "bool": { "should": [ { "wildcard": { "regnumber.keyword": "*20203030651*" } } ] } }, { "term": { "status": "1" } } ] } }, "sort": [ { "createtime": { "order": "desc" } } ], "from": 0, "size": 10}

# Search a field with a specific analyzer
GET /yxd179-2021/yd/_search
{ "query": { "bool": { "must": [ { "match": { "hdsd0001004": { "query": "1828551417", "analyzer": "char_analyzer" } } } ] } }, "from": 0, "size": 30}

# Wildcard (fuzzy) matching
GET /yxd179-2021/yd/_search
{ "query": { "bool": { "must": [ { "wildcard": { "hdsd0001002.keyword": "*yxd179*" } } ] } }, "from": 0, "size": 30}
# Close the index
POST yxd179-2021/_close

# Open the index
POST yxd179-2021/_open

# Assign an analyzer to a field
PUT /yxd179-2021/_mapping/yd
{ "properties": { "hdsd0001004": { "type": "text", "analyzer": "char_analyzer" } }}

# View the mapping
GET yxd179-2021/_mapping

# Define the analyzer (analysis settings are static: close the index first, then reopen it)
PUT yxd179-2021/_settings
{ "analysis": { "analyzer": { "char_analyzer": { "tokenizer": "char_tokenizer", "filter": "lowercase" } }, "tokenizer": { "char_tokenizer": { "type": "pattern", "pattern": "|" } } }}

# minimum_should_match (query_string takes default_operator rather than operator)
GET /yxd179-2021/yd/_search
{ "query": { "query_string": { "query": "182855141y7", "type": "phrase", "default_operator": "and", "minimum_should_match": "100%", "fields": [ "hdsd0001004" ] } }}
# Return only selected fields
GET /yxd179-2021/yd/_search
{ "_source": { "includes": [ "id", "productid" ] }, "query": { "bool": { "must": [ { "terms": { "productid": [ 636654265306419462 ] } } ] } }, "from": 0, "size": 30}

# Highlighted search
GET /yxd179-2021/yd/_search
{ "query": { "bool": { "must": [ { "bool": { "should": [] } }, { "term": { "status": "1" } }, { "term":{ "id":636662671736099971 } } ] } }, "sort": [ { "id": { "order": "asc" } } ], "highlight": { "pre_tags": [ "" ], "post_tags": [ "" ], "fields": { "commonname": { "type": "plain" } } }, "from": 0, "size": 10}

# Clear the read_only_allow_delete block
PUT /yxd179-2021/_settings
{ "index":{ "blocks":{ "read_only_allow_delete":"false" } }}

# List index templates
GET /_template

# Search every index matching a wildcard pattern
GET /yxd179-2021*/yd/_search
{ "from": 0, "size": 30}

# Bool query on a single field
GET /yxd179-2021/yd/_search
{ "query": { "bool": { "must": [ { "term": { "id": "636651493706133509" } } ] } }, "from": 0, "size": 30}

# Bulk indexing (one JSON object per line)
POST /_bulk
{"index":{"_index":"yxd179-2021","_type":"yd","_id":"65965969996688"}}
{"id":"65965969996688","hdsd0001002":"sdff","hdsd0001008":"fsdf","hdsd0001006":"000000000000000000","create_time":"2021-07-29","cancel_flag":0}
{"index":{"_index":"yxd179-2021","_type":"yd","_id":"66049829996688"}}
{"id":"66049829996688","hdsd0001002":"sdgsdg","hdsd0001008":"fsdfsdf","hdsd0001006":"000000000000000000","create_time":"2021-07-29","cancel_flag":1}

# Outer intersection (must) query
GET /yxd179-2021/yd/_search
{ "query": { "bool": { "must": [ { "bool": { "should": [ { "match": { "regnumber": "国sd20182642128" } } ] } }, { "term": { "status": "1" } } ] } }, "sort": [ { "createtime": { "order": "desc" } } ], "from": 0, "size": 10}
# Complex bool query with boosts and scoring (explain enabled)
GET /yxd179-2021/yd/_search
{ "from": 0, "size": 10, "query": { "bool": { "must": [ { "bool": { "must": [ { "term": { "cancelflag": { "value": "0", "boost": 1 } } } ], "adjust_pure_negative": true, "boost": 1 } }, { "bool": { "should": [ { "match": { "yhe": { "query": "张", "operator": "or", "prefix_length": 0, "max_expansions": 50, "fuzzy_transpositions": true, "lenient": false, "zero_terms_query": "none", "auto_generate_synonyms_phrase_query": true, "boost": 1 } } }, { "match": { "yhr": { "query": "张", "operator": "or", "prefix_length": 0, "max_expansions": 50, "fuzzy_transpositions": true, "lenient": false, "zero_terms_query": "none", "auto_generate_synonyms_phrase_query": true, "boost": 1 } } }, { "match": { "yht": { "query": "张", "operator": "or", "prefix_length": 0, "max_expansions": 50, "fuzzy_transpositions": true, "lenient": false, "zero_terms_query": "none", "auto_generate_synonyms_phrase_query": true, "boost": 1 } } }, { "match": { "yhg": { "query": "张", "operator": "or", "prefix_length": 0, "max_expansions": 50, "fuzzy_transpositions": true, "lenient": false, "zero_terms_query": "none", "auto_generate_synonyms_phrase_query": true, "boost": 1 } } } ], "adjust_pure_negative": true, "boost": 1 } } ], "adjust_pure_negative": true, "boost": 1 } }, "explain": true, "sort": [ { "id": { "order": "desc" } } ]}

# Profile a query to analyze where time is spent
GET /yxd179-2021/yd/_search
{ "profile": true, "query":{ "term":{ "tu":6583120 } }}

# Update a document by id
POST /yxd179-2021/yd/b00e89b652484b0b8da16e090302e012/_update
{ "doc":{ "fd":"1" }}

# Update with _update_by_query and the Painless scripting engine
POST /yxd179-2021/_update_by_query
{ "query":{ "term":{ "fdh":6583120 } }, "script":{ "lang":"painless", "source": "ctx._source.cancelflag=params.cancelflag;ctx._source.updatetime=params.updatetime", "params": { "cancelflag":"0", "updatetime":"2021-07-28T01:17:36.000Z" } }}

# Intersection query: every condition must match
GET /yxd179-2021/yd/_search
{ "query": { "bool": { "must": [ { "term": { "cancelflag": "0" } }, { "bool": { "must": [ { "wildcard": { "hdsd0001002.keyword": "*yxd179*" } }, { "match": { "hdsd0001003": "2" } } ] } } ] } }, "sort": [ { "id": { "order": "desc" } } ], "highlight": { "pre_tags": [ "" ], "post_tags": [ "" ], "fields": { "hdsd0001002": { "type": "plain" } } }, "from": 0, "size": 30}

# Outer intersection query with a nested inner intersection
GET /yxd179-2021/yd/_search
{ "from": 0, "size": 10, "query": { "bool": { "must": [ { "bool": { "must": [ { "term": { "cancelflag": { "value": "0", "boost": 1 } } } ], "adjust_pure_negative": true, "boost": 1 } }, { "bool": { "must": [ { "match": { "hdsd0001002": { "query": "张", "operator": "or", "prefix_length": 0, "max_expansions": 50, "fuzzy_transpositions": true, "lenient": false, "zero_terms_query": "none", "auto_generate_synonyms_phrase_query": true, "boost": 1 } } }, { "match": { "hdsd0001003": { "query": "2", "operator": "or", "prefix_length": 0, "max_expansions": 50, "fuzzy_transpositions": true, "lenient": false, "zero_terms_query": "none", "auto_generate_synonyms_phrase_query": true, "boost": 1 } } } ], "adjust_pure_negative": true, "boost": 1 } } ], "adjust_pure_negative": true, "boost": 1 } }, "explain": true}

# Union (should) query
GET /yxd179-2021/yd/_search
{ "from": 0, "size": 10, "query": { "bool": { "must": [ { "term": { "cancelflag": { "value": "0", "boost": 1 } } } ], "should": [ { "match": { "hdsd0001002": { "query": "张", "operator": "or", "prefix_length": 0, "max_expansions": 50, "fuzzy_transpositions": true, "lenient": false, "zero_terms_query": "none", "auto_generate_synonyms_phrase_query": true, "boost": 1 } } } ], "adjust_pure_negative": true, "boost": 1 } }, "explain": true}

# Union query returning only selected fields
GET /yxd179-2021/yd/_search
{ "from": 0, "size": 10, "query": { "bool": { "must": [ { "match": { "cancelflag": { "query": "0", "operator": "and", "prefix_length": 0, "max_expansions": 50, "fuzzy_transpositions": true, "lenient": false, "zero_terms_query": "none", "auto_generate_synonyms_phrase_query": true, "boost": 1 } } } ], "should": [ { "match": { "hdsd0001002": { "query": "张", "operator": "or", "prefix_length": 0, "max_expansions": 50, "fuzzy_transpositions": true, "lenient": false, "zero_terms_query": "none", "auto_generate_synonyms_phrase_query": true, "boost": 1 } } }, { "match": { "hdsd0001002.pinyin": { "query": "zhang", "operator": "or", "prefix_length": 0, "max_expansions": 50, "fuzzy_transpositions": true, "lenient": false, "zero_terms_query": "none", "auto_generate_synonyms_phrase_query": true, "boost": 1 } } } ], "adjust_pure_negative": true, "boost": 1 } }, "explain": true, "_source": { "includes": [ "id", "th001id", "createtime", "updatetime", "hdsd0001001", "hdsd0001002", "cancelflag" ], "excludes": [] }}

# Force a refresh when recent changes must become searchable sooner
GET /yxd179-2021/_refresh

# Delete a document by id
DELETE /yxd179-2021/yd/ud6-5xkbwvbb7hkjg5k0

# Delete the index
DELETE /yxd179-2021

# Delete the template (dynamic mapping)
DELETE /_template/yxd179_tpl

# Sorting
GET /yxd179-2021/yd/_search
{ "sort": [ { "createtime": { "order": "desc" } } ], "from": 0, "size": 30}
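
Many of the search requests above share one shape: a bool query with must clauses (the "intersection"), optional should clauses (the "union"), plus sort and paging. The Python sketch below assembles such bodies. The helper name bool_search is hypothetical and the field names are simply reused from the list above. Note that when a must clause is present, should clauses are optional by default and only influence the score.

```python
import json

def bool_search(must=(), should=(), sort=(), start=0, size=10):
    """Build a search body: must clauses all have to match, should clauses are alternatives."""
    bool_clause = {}
    if must:
        bool_clause["must"] = list(must)
    if should:
        bool_clause["should"] = list(should)
    body = {"query": {"bool": bool_clause}, "from": start, "size": size}
    if sort:
        body["sort"] = list(sort)
    return body

# Intersection: regnumber has to match AND status has to be "1".
intersection = bool_search(
    must=[{"match": {"regnumber": "20203030651"}}, {"term": {"status": "1"}}],
    sort=[{"createtime": {"order": "desc"}}],
)

# Union-style: cancelflag has to be "0"; the should clause is optional here.
union = bool_search(
    must=[{"term": {"cancelflag": "0"}}],
    should=[{"match": {"hdsd0001002": "张"}}],
)

# Print bodies that can be pasted under a GET /yxd179-2021/yd/_search line.
print(json.dumps(intersection, ensure_ascii=False))
print(json.dumps(union, ensure_ascii=False))
```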


3. Elasticsearch Script in Practice - Distributed Full-Text Search - Scripting Engine Hands-On


Only update-by-query is used as the example here:



 

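In the _update_by_query request, lang names the scripting engine (painless), source holds the script snippet, and params holds the script's parameter values.

Passing values through params gets around ES's limit on script compilations. The limit itself can also be raised with the setting below (in 6.x, script.max_compilations_per_minute is deprecated in favor of script.max_compilations_rate, which takes a rate such as 75/5m):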
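
As a minimal sketch, the three script keys can be assembled in Python and printed as JSON for pasting into Kibana Console. The query and script values are taken from the _update_by_query entry in the list above; the helper name build_update_by_query is hypothetical.

```python
import json

def build_update_by_query(term_field, term_value, source, params):
    """Assemble an _update_by_query body: a term query plus a Painless script."""
    return {
        "query": {"term": {term_field: term_value}},
        "script": {
            "lang": "painless",  # scripting engine
            "source": source,    # script snippet; values are read from params.*
            "params": params,    # values supplied at run time
        },
    }

body = build_update_by_query(
    "fdh",
    6583120,
    "ctx._source.cancelflag=params.cancelflag;ctx._source.updatetime=params.updatetime",
    {"cancelflag": "0", "updatetime": "2021-07-28T01:17:36.000Z"},
)

print("POST /yxd179-2021/_update_by_query")
print(json.dumps(body, indent=2))
```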

PUT /_cluster/settings
{
  "transient": {
    "script.max_compilations_per_minute": 40
  }
}

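Important: ES has to compile every distinct script it receives. During a bulk update, if each script is compiled on the fly, the limit is reached very quickly. The underlying reason: ES compiles a script only the first time it sees it and reuses the compiled form afterwards, but when values are hard-coded into the script, every new value produces a different script source that has to be compiled again. Passing the variable parts through params keeps the source identical across requests, which solves the compilation limit at its root.


Next, how do we build a TCP (transport) client in Java against version 6.8.6 to run Painless scripts?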
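
The effect can be illustrated with a toy model in Python. This is not how ES implements its cache: the class ScriptCache is hypothetical, it keys compiled scripts by source text only, and it ignores the time window of the real limit. It only shows why 100 hard-coded values hit a limit of 40 while one parameterized source compiles once.

```python
class ScriptCache:
    """Toy compile cache keyed by script source (an illustration, not the ES implementation)."""

    def __init__(self, max_compilations):
        self.max_compilations = max_compilations
        self.compiled = set()
        self.compilations = 0

    def run(self, source, params):
        # A source seen before is served from the cache; a new one must be compiled.
        if source not in self.compiled:
            if self.compilations >= self.max_compilations:
                raise RuntimeError("too many dynamic script compilations")
            self.compilations += 1
            self.compiled.add(source)
        # Executing the script is out of scope; only compilation is modelled.
        return self.compilations

values = [str(i) for i in range(100)]

# Hard-coded values: every distinct value yields a new script source.
inlined = ScriptCache(max_compilations=40)
try:
    for v in values:
        inlined.run("ctx._source.cancelflag='%s'" % v, {})
    inlined_ok = True
except RuntimeError:
    inlined_ok = False

# Parameterized: one source, compiled once, reused for every value.
parameterized = ScriptCache(max_compilations=40)
for v in values:
    parameterized.run("ctx._source.cancelflag=params.cancelflag", {"cancelflag": v})

print(inlined_ok, inlined.compilations, parameterized.compilations)
```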




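Addendum: an updateByQuery call starts by taking a snapshot of the index and indexes whatever documents it finds using internal versioning.

A version conflict occurs when a document changes between the time of the snapshot and the time the index request is processed. When the versions match, updateByQuery updates the document and increments its version number. To keep a version conflict from aborting updateByQuery, set abortOnVersionConflict(false). This is appropriate when the goal is, for example, to pick up an online mapping change: a version conflict then means the conflicting document was updated between the start of updateByQuery and the moment it tried to update that document, and that other update will already have picked up the mapping change. updateByQuery can also use the ingest node by specifying a pipeline. The UpdateByQueryRequestBuilder API supports filtering the documents to update, limiting the total number of documents updated, updating documents with a script, refreshing immediately, setting the number of retries, and so on.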
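
The snapshot-then-check behaviour can be sketched with an in-memory model in Python. Everything here is an assumption made for illustration: the function update_by_query, the dictionary layout, and the flag name abort_on_version_conflict only mirror the behaviour described above and are not the Elasticsearch client API.

```python
def update_by_query(index, snapshot, update, abort_on_version_conflict=True):
    """Apply `update` to every document in `snapshot`, checking versions against `index`.

    index:    {doc_id: {"version": int, "source": dict}}  (live state)
    snapshot: {doc_id: version seen when the request started}
    """
    updated, conflicts = 0, 0
    for doc_id, seen_version in snapshot.items():
        doc = index[doc_id]
        if doc["version"] != seen_version:
            # The document changed after the snapshot was taken.
            conflicts += 1
            if abort_on_version_conflict:
                raise RuntimeError("version conflict on %s" % doc_id)
            continue  # count the conflict and move on
        update(doc["source"])
        doc["version"] += 1
        updated += 1
    return {"updated": updated, "version_conflicts": conflicts}

index = {
    "a": {"version": 1, "source": {"cancelflag": "1"}},
    "b": {"version": 1, "source": {"cancelflag": "1"}},
}
snapshot = {doc_id: doc["version"] for doc_id, doc in index.items()}

# Another writer touches "b" after the snapshot was taken.
index["b"]["version"] = 2

result = update_by_query(
    index,
    snapshot,
    lambda source: source.update(cancelflag="0"),
    abort_on_version_conflict=False,
)
print(result)
```

With abort_on_version_conflict left at True, the same run would stop at document "b" instead of counting the conflict.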

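Retry:

Suppose clients A and B fetch the same document at almost the same time and both receive its version, say _version=1. Client A modifies part of the document and writes the change back. On write, Elasticsearch compares the version submitted by A (still 1) with the version of the stored document (also 1), finds them equal, performs the write, and sets _version=2. Client B also modifies part of the document, but its write arrives slightly later. ES runs the same check and finds that B submitted version 1 while the stored document is at version 2. That is a conflict, so this partial update fails and is retried.

Concurrency control strategy:

Partial updates rely on optimistic locking for concurrency control.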
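
The A/B scenario and the retry can be replayed with a small Python model. The Store class and the partial_update helper are hypothetical stand-ins for a versioned document and a client-side retry loop; in Elasticsearch itself the Update API exposes this behaviour through its retry_on_conflict parameter.

```python
class VersionConflict(Exception):
    pass

class Store:
    """Single-document store with version-checked writes (toy model)."""

    def __init__(self, source):
        self.source = dict(source)
        self.version = 1

    def get(self):
        return dict(self.source), self.version

    def write(self, source, expected_version):
        # Optimistic lock: accept the write only if nobody wrote in between.
        if expected_version != self.version:
            raise VersionConflict("expected %d, found %d" % (expected_version, self.version))
        self.source = dict(source)
        self.version += 1

def partial_update(store, changes, retries=3):
    """Re-read the document and retry when the version check fails."""
    for attempt in range(retries + 1):
        source, version = store.get()
        source.update(changes)
        try:
            store.write(source, version)
            return attempt
        except VersionConflict:
            continue
    raise VersionConflict("gave up after %d retries" % retries)

store = Store({"status": "1"})
doc_a, ver_a = store.get()   # client A reads version 1
doc_b, ver_b = store.get()   # client B reads version 1

doc_a["owner"] = "a"
store.write(doc_a, ver_a)    # A wins, the version becomes 2

doc_b["status"] = "0"
try:
    store.write(doc_b, ver_b)  # B still holds the stale version 1
    conflict = False
except VersionConflict:
    conflict = True

attempts = partial_update(store, {"status": "0"})  # B re-reads and writes again
print(conflict, attempts, store.version, store.source)
```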

Quick exercise:

How can a script update several fields in one go?

Option 1:

ctx._source.putAll(params)

Option 2:

for (k in params.keySet()){if (!k.equals('ctx')){ctx._source.put(k,params.get(k))}}
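
The difference between the two options can be seen by mirroring them on plain dictionaries in Python. This is only an emulation of the Painless semantics: the function names are hypothetical, and the "ctx" entry is placed into params by hand as an assumption, to show what the second option filters out.

```python
def put_all(source, params):
    """Mirror of ctx._source.putAll(params): copy every entry."""
    source.update(params)
    return source

def put_all_except_ctx(source, params):
    """Mirror of the loop: copy every entry except the key 'ctx'."""
    for key in params.keys():
        if key != "ctx":
            source[key] = params[key]
    return source

params = {
    "cancelflag": "0",
    "updatetime": "2021-07-28T01:17:36.000Z",
    "ctx": {"op": "index"},  # assumed extra entry, added here for illustration
}

a = put_all({"id": "1"}, params)
b = put_all_except_ctx({"id": "1"}, params)
print(sorted(a), sorted(b))
```

Option 1 is shorter; option 2 is the safer choice whenever params may carry entries that should not end up in the document.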



Last modified: 2021-11-26 13:54:32
[Copyright notice] This article is original content by a 墨天轮 user. Reposts must credit the source (墨天轮), the article link, and the author.