popexizhi: Using regexp queries in Elasticsearch

https://www.elastic.co/guide/en/elasticsearch/reference/5.5/query-dsl-regexp-query.html

http://cwiki.apachecn.org/pages/viewpage.action?pageId=4260873

Standard regular expression syntax is supported. For example, the following query works:

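# Assumes a node listening on localhost:9200; substitute your own host and port.
# On Elasticsearch 6.0 and later, also pass -H 'Content-Type: application/json'.
curl -XGET "localhost:9200/_search" -d'
{
  "query": {
    "regexp": {
      "node_name": "cascade-node-[12][2-9]{1,2}"
    }
  }
}'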
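
To get a feel for which node names the pattern above would select, here is a small Python sketch. It is only an approximation: it uses Python's re.fullmatch to imitate the fact that Lucene patterns must match the whole term, and the sample node names are made up for illustration. It also assumes node_name is indexed as a single un-analyzed term, because the regexp query is matched against the terms in the index.

```python
import json
import re

FIELD = "node_name"
PATTERN = "cascade-node-[12][2-9]{1,2}"


def build_regexp_query(field, pattern):
    """Build the same request body as the curl example."""
    return {"query": {"regexp": {field: pattern}}}


def matches_whole_term(pattern, term):
    """Approximate Lucene's always-anchored matching with re.fullmatch."""
    return re.fullmatch(pattern, term) is not None


# Hypothetical node names, invented for this illustration.
sample_terms = [
    "cascade-node-12",     # '1' then one digit in 2-9
    "cascade-node-129",    # '1' then two digits in 2-9
    "cascade-node-10",     # '0' is outside 2-9
    "cascade-node-1",      # needs at least one more digit
    "cascade-node-32",     # first digit must be 1 or 2
    "my-cascade-node-12",  # extra prefix, so the whole term does not match
]

results = {term: matches_whole_term(PATTERN, term) for term in sample_terms}

print(json.dumps(build_regexp_query(FIELD, PATTERN), indent=2))
for term, ok in results.items():
    print(f"{term:<20} {'match' if ok else 'no match'}")
```

Python's re module is not the Lucene engine, so this is only useful for simple patterns like this one that use syntax common to both.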

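I checked the official documentation: grouping, anchoring, and alternation are all covered. Lucene appears to ship its own built-in regex engine. Note that Lucene patterns are always anchored, so a pattern has to match the whole term rather than a substring of it.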

Excerpts from the documentation follow:

Grouping

Parentheses "()" can be used to form sub-patterns. The quantity operators listed above operate on the shortest previous pattern, which can be a group. For string "ababab":

(ab)+       # match
ab(ab)+     # match
(..)+       # match
(...)+      # no match
(ab)*       # match
abab(ab)?   # match
ab(ab)?     # no match
(ab){3}     # match
(ab){1,2}   # no match
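
The whole-term behaviour behind rows such as "ab(ab)?" can be tried locally. The following is a rough sketch that uses Python's re.fullmatch as a stand-in for Lucene's anchored matching; it is not the Lucene engine. The "(...)+" row is left out on purpose: Python accepts it for "ababab" (as "aba" followed by "bab"), so this sketch cannot reproduce that row.

```python
import re


def matches_whole_term(pattern, term):
    """Approximate Lucene's always-anchored matching with re.fullmatch."""
    return re.fullmatch(pattern, term) is not None


TERM = "ababab"
GROUPING_PATTERNS = [
    "(ab)+",
    "ab(ab)+",
    "(..)+",
    "(ab)*",
    "abab(ab)?",
    "ab(ab)?",    # covers at most "abab", so the whole term is not matched
    "(ab){3}",
    "(ab){1,2}",  # covers at most "abab" as well
]

grouping_results = {p: matches_whole_term(p, TERM) for p in GROUPING_PATTERNS}

for pattern, ok in grouping_results.items():
    print(f"{pattern:<12}# {'match' if ok else 'no match'}")
```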

Alternation

The pipe symbol "|" acts as an OR operator. The match will succeed if the pattern on either the left-hand side OR the right-hand side matches. The alternation applies to the longest pattern, not the shortest. For string "aabb":

aabb|bbaa   # match
aacc|bb     # no match
aa(cc|bb)   # match
a+|b+       # no match
a+b+|b+a+   # match
a+(b|c)+    # match
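
The alternation rows can be checked the same way. This is again a sketch with Python's re.fullmatch standing in for Lucene's anchored matching, so it only illustrates the whole-term rule and does not exercise Lucene itself.

```python
import re


def matches_whole_term(pattern, term):
    """Approximate Lucene's always-anchored matching with re.fullmatch."""
    return re.fullmatch(pattern, term) is not None


TERM = "aabb"
ALTERNATION_PATTERNS = [
    "aabb|bbaa",
    "aacc|bb",    # neither side covers the whole term
    "aa(cc|bb)",
    "a+|b+",      # each side covers only half of the term
    "a+b+|b+a+",
    "a+(b|c)+",
]

alternation_results = {p: matches_whole_term(p, TERM) for p in ALTERNATION_PATTERNS}

for pattern, ok in alternation_results.items():
    print(f"{pattern:<12}# {'match' if ok else 'no match'}")
```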

Character classes

Ranges of potential characters may be represented as character classes by enclosing them in square brackets "[]". A leading ^ negates the character class. The allowed forms are:

[abc]   # 'a' or 'b' or 'c'
[a-c]   # 'a' or 'b' or 'c'
[-abc]  # '-' or 'a' or 'b' or 'c'
[abc\-] # '-' or 'a' or 'b' or 'c'
[^abc]  # any character except 'a' or 'b' or 'c'
[^a-c]  # any character except 'a' or 'b' or 'c'
[^-abc]  # any character except '-' or 'a' or 'b' or 'c'
[^abc\-] # any character except '-' or 'a' or 'b' or 'c'

Note that the dash "-" indicates a range of characters, unless it is the first character or if it is escaped with a backslash.

For string "abcd":

ab[cd]+     # match
[a-d]+      # match
[^a-d]+     # no match
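
The character class forms listed above happen to be valid in Python's re module as well, so they can be tried out in a short sketch. As before, re.fullmatch is only an approximation of Lucene's anchored matching, and the single-character inputs are chosen just for illustration.

```python
import re


def matches_whole_term(pattern, term):
    """Approximate Lucene's always-anchored matching with re.fullmatch."""
    return re.fullmatch(pattern, term) is not None


# The three rows for the string "abcd".
TERM = "abcd"
class_results = {
    p: matches_whole_term(p, TERM)
    for p in ["ab[cd]+", "[a-d]+", "[^a-d]+"]
}

# A dash is literal when it comes first or is escaped with a backslash.
dash_results = {
    "[-abc]": matches_whole_term(r"[-abc]", "-"),
    "[abc\\-]": matches_whole_term(r"[abc\-]", "-"),
    "[a-c]": matches_whole_term(r"[a-c]", "-"),  # here the dash denotes a range
}

# A leading ^ negates the class.
negation_results = {
    "d": matches_whole_term(r"[^abc]", "d"),
    "a": matches_whole_term(r"[^abc]", "a"),
}

for pattern, ok in class_results.items():
    print(f"{pattern:<12}# {'match' if ok else 'no match'}")
```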


Thursday, November 29, 2018
