Search and Query
Query Context
Query Example
GET kibana_sample_data_ecommerce/_search
{
"size": 1
}
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4675,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "kibana_sample_data_ecommerce",
"_type" : "_doc",
"_id" : "VB2OXXwBYKHeDs_3_B9c",
"_score" : 1.0,
"_source" : {
"category" : [
"Men's Clothing"
],
"currency" : "EUR",
"customer_first_name" : "Eddie",
"customer_full_name" : "Eddie Underwood",
"customer_gender" : "MALE",
"customer_id" : 38,
"customer_last_name" : "Underwood",
"customer_phone" : "",
"day_of_week" : "Monday",
"day_of_week_i" : 0,
"email" : "eddie@underwood-family.zzz",
"manufacturer" : [
"Elitelligence",
"Oceanavigations"
],
"order_date" : "2021-10-18T09:28:48+00:00",
"order_id" : 584677,
"products" : [
{
"base_price" : 11.99,
"discount_percentage" : 0,
"quantity" : 1,
"manufacturer" : "Elitelligence",
"tax_amount" : 0,
"product_id" : 6283,
"category" : "Men's Clothing",
"sku" : "ZO0549605496",
"taxless_price" : 11.99,
"unit_discount_amount" : 0,
"min_price" : 6.35,
"_id" : "sold_product_584677_6283",
"discount_amount" : 0,
"created_on" : "2016-12-26T09:28:48+00:00",
"product_name" : "Basic T-shirt - dark blue/white",
"price" : 11.99,
"taxful_price" : 11.99,
"base_unit_price" : 11.99
},
{
"base_price" : 24.99,
"discount_percentage" : 0,
"quantity" : 1,
"manufacturer" : "Oceanavigations",
"tax_amount" : 0,
"product_id" : 19400,
"category" : "Men's Clothing",
"sku" : "ZO0299602996",
"taxless_price" : 24.99,
"unit_discount_amount" : 0,
"min_price" : 11.75,
"_id" : "sold_product_584677_19400",
"discount_amount" : 0,
"created_on" : "2016-12-26T09:28:48+00:00",
"product_name" : "Sweatshirt - grey multicolor",
"price" : 24.99,
"taxful_price" : 24.99,
"base_unit_price" : 24.99
}
],
"sku" : [
"ZO0549605496",
"ZO0299602996"
],
"taxful_total_price" : 36.98,
"taxless_total_price" : 36.98,
"total_quantity" : 2,
"total_unique_products" : 2,
"type" : "order",
"user" : "eddie",
"geoip" : {
"country_iso_code" : "EG",
"location" : {
"lon" : 31.3,
"lat" : 30.1
},
"region_name" : "Cairo Governorate",
"continent_name" : "Africa",
"city_name" : "Cairo"
}
}
}
]
}
}
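Structure
{
  "took" : 0, -- time the request took, in milliseconds
  "timed_out" : false, -- whether the request timed out
  "_shards" : { -- shards the request was executed on
    "total" : 1, -- one shard in total
    "successful" : 1, -- one succeeded
    "skipped" : 0, -- none skipped
    "failed" : 0 -- none failed
  },
  "hits" : { -- search results
    "total" : { -- hit count
      "value" : 4675, -- the query matched 4675 documents
      "relation" : "eq" -- how to read value: "eq" means the count is exact, "gte" means it is a lower bound
    },
    "max_score" : 1.0, -- highest score among the returned results
    "hits" : [ -- the result documents
      {
        "_index" : "kibana_sample_data_ecommerce", -- index the document lives in
        "_type" : "_doc", -- document type; customizable before 7.0, fixed to _doc since then
        "_id" : "VB2OXXwBYKHeDs_3_B9c", -- document id
        "_score" : 1.0, -- relevance score; by default results are sorted by score, highest first
        "_source" : { -- the data that was indexed
          "customer_full_name" : "Eddie Underwood",
          "customer_gender" : "MALE",
          "customer_id" : 38,
          "customer_last_name" : "Underwood"
        }
      }
    ]
  }
}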
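To make the structure above concrete, here is a minimal Python sketch that reads the commonly used fields from a response body. The response text is a trimmed copy of the sample above, and `summarize` is a hypothetical helper name, not part of any Elasticsearch client.

```python
import json

# Trimmed copy of the sample response shown above (only a few fields kept).
response_text = """
{
  "took": 0,
  "timed_out": false,
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 4675, "relation": "eq"},
    "max_score": 1.0,
    "hits": [
      {
        "_index": "kibana_sample_data_ecommerce",
        "_id": "VB2OXXwBYKHeDs_3_B9c",
        "_score": 1.0,
        "_source": {"customer_full_name": "Eddie Underwood", "customer_id": 38}
      }
    ]
  }
}
"""


def summarize(response):
    """Pull the commonly used fields out of a search response body."""
    total = response["hits"]["total"]
    return {
        "took_ms": response["took"],
        "healthy": not response["timed_out"] and response["_shards"]["failed"] == 0,
        "total": total["value"],
        "total_is_exact": total["relation"] == "eq",
        "sources": [hit["_source"] for hit in response["hits"]["hits"]],
    }


summary = summarize(json.loads(response_text))
print(summary["total"], summary["total_is_exact"])
```

Checking `relation` before trusting `value` matters once an index grows, because the count may then be reported as a lower bound rather than an exact number.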
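Relevance Score
When no sort condition is given, Elasticsearch computes a relevance score for each document from the query conditions and lists the documents by that score, highest first.
Before 5.0 the relevance score was computed with the TF/IDF algorithm by default; since 5.0 the default is BM25, and the classic TF/IDF similarity was removed in 7.0.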
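As a rough illustration of how BM25 scores a single term, the sketch below implements the textbook form of the formula in Python. The function name is hypothetical; `k1=1.2` and `b=0.75` are the usual defaults, and Lucene versions differ in constant factors, so absolute values will not necessarily match the `_score` Elasticsearch reports.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, doc_count, docs_with_term,
                    k1=1.2, b=0.75):
    """Textbook BM25 score of one query term for one document (a sketch)."""
    # Rare terms get a higher inverse document frequency.
    idf = math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Long documents are penalized relative to the average field length.
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # Term frequency saturates: each extra occurrence adds less than the last.
    return idf * tf * (k1 + 1) / (tf + length_norm)


rare = bm25_term_score(tf=1, doc_len=3, avg_doc_len=3, doc_count=3, docs_with_term=1)
common = bm25_term_score(tf=1, doc_len=3, avg_doc_len=3, doc_count=3, docs_with_term=3)
print(rare > common)
```

Unlike plain TF/IDF, the term-frequency part is bounded, so repeating a word many times cannot inflate the score without limit.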
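Metadata
Disabling the _source metadata field
Advantage: lower storage overhead, and no cost for returning data that queries do not need. Disadvantages (the following are lost):
- The update, update_by_query, and reindex APIs are not supported.
- Highlighting is not supported.
- Reindexing from one index to another is not possible, whether to change the mapping or analyzer or to upgrade to a new major version.
- You can no longer debug queries or aggregations by looking at the original document used at index time.
- A possible future ability to repair index corruption automatically would not be available.
Summary: to save disk space, compressing the index is a better choice than disabling _source outright.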
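The two options in the summary can be compared side by side as index-creation request bodies. This is a sketch written as Python dictionaries; it assumes the `_source.enabled` mapping option and the `index.codec` setting with the value `best_compression`, and it only prints the JSON rather than sending it to a cluster.

```python
import json

# Option A: disable _source in the mapping (loses the features listed above).
disable_source_body = {
    "mappings": {"_source": {"enabled": False}},
}

# Option B: keep _source, but ask for stronger compression of stored data.
compressed_body = {
    "settings": {"index": {"codec": "best_compression"}},
}

for label, body in (("disable _source", disable_source_body),
                    ("compress instead", compressed_body)):
    print(label)
    print(json.dumps(body, indent=2))
```

Either body would be sent when the index is created, since both choices are made up front rather than toggled on a live index.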
GET kibana_sample_data_ecommerce/_search
{
  "_source": false, -- add "_source": false to the request
  "size": 1
}
Result:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 6,
    "successful" : 6,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4744,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : ".kibana-event-log-7.8.0-000001",
        "_type" : "_doc",
        "_id" : "Ih2OXXwBYKHeDs_3uR8r",
        "_score" : 1.0,
        "_source" : { } -- no _source content is returned
      }
    ]
  }
}
Source Filtering
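includes: the fields to return in the result.
excludes: the fields to leave out of the result. They are only omitted from the returned result; you can still search on them.
Usage:
Define the filter in the mapping: wildcards are supported, but this approach is not recommended because the mapping cannot be changed afterwards.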
PUT user -- set the mappings of the user index
{
  "mappings": {
    "_source": {
      "includes": [
        "name",
        "age"
      ],
      "excludes": [
        "sex",
        "birth"
      ]
    }
  }
}
PUT user/_doc/1 -- insert a document
{
  "name": "空痕影",
  "age": 18,
  "birth": "2000-12-12",
  "sex": "男"
}
GET user/_search -- query
Result:
{
  "took" : 880,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "user",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : { -- only name and age are shown; birth and sex are not
          "name" : "空痕影",
          "age" : 18
        }
      }
    ]
  }
}
Specifying _source dynamically at query time
"_source": false
"_source": "obj.*"
"_source": ["obj1.*", "obj2.*"]
"_source": {
  "includes": ["obj1.*", "obj2.*"],
  "excludes": ["*.obj3"]
}
Note: if includes and excludes overlap, excludes wins, i.e. the overlapping fields are not returned.
GET user/_search
{
  "_source": {
    "includes": ["name", "age", "birth"],
    "excludes": ["age"]
  }
}
Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "user",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "空痕影",
          "birth" : "2000-12-12"
        }
      }
    ]
  }
}
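The includes/excludes rule above can be approximated in a few lines of Python. This is only a sketch of the behaviour for top-level fields, using shell-style wildcards from the standard library; `filter_source` is a hypothetical helper, and nested dotted paths are not handled.

```python
from fnmatch import fnmatchcase


def filter_source(source, includes=(), excludes=()):
    """Approximate _source filtering for top-level fields only."""
    result = {}
    for field, value in source.items():
        # With no includes given, every field is a candidate.
        included = not includes or any(fnmatchcase(field, p) for p in includes)
        # excludes always wins over includes.
        excluded = any(fnmatchcase(field, p) for p in excludes)
        if included and not excluded:
            result[field] = value
    return result


doc = {"name": "空痕影", "age": 18, "birth": "2000-12-12", "sex": "男"}
filtered = filter_source(doc, includes=["name", "age", "birth"], excludes=["age"])
print(filtered)
```

Here `age` appears in both lists and is dropped, which mirrors the result of the request above.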
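Query String
Query all
GET user/_search
With a parameter (match on a specific field)
GET user/_search?q=name:空痕影
With pagination and sorting
GET user/_search?from=0&size=2&sort=age:asc
Note: once a sort is specified, _score is returned as null. Scoring has to be enabled explicitly (for example with track_scores) if it is still needed.
_all search: equivalent to searching across all indexed fields
GET user/_search?q=空痕影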
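When these URI searches are issued from code rather than the console, the parameters need URL encoding. A minimal sketch using only the Python standard library follows; `build_search_url` is a hypothetical helper and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(index, params=None):
    """Build the path and query string of a URI search request."""
    path = f"/{index}/_search"
    query = urlencode(params or {})
    return f"{path}?{query}" if query else path


# "from" is a Python keyword, so the parameters are passed as a dict.
paged_url = build_search_url("user", {"from": 0, "size": 2, "sort": "age:asc"})
field_url = build_search_url("user", {"q": "name:空痕影"})
print(paged_url)
print(field_url)
```

The colon and the non-ASCII characters are percent-encoded in the output, and decode back to the original values on the server side.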
Full-Text Queries
match: matches documents whose field contains any of the analyzed query terms
GET user/_search
{
  "query": {
    "match": {
      "device": "huawei mate book"
    }
  }
}
Result:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.1143606,
    "hits" : [
      {
        "_index" : "user",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.1143606,
        "_source" : {
          "name" : "空痕影2",
          "age" : 15,
          "birth" : "2011-12-12",
          "sex" : "男",
          "device" : "huawei mate book"
        }
      },
      {
        "_index" : "user",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.13353139,
        "_source" : {
          "name" : "空痕影1",
          "age" : 18,
          "birth" : "2010-12-12",
          "sex" : "男",
          "device" : "huawei mate pad"
        }
      },
      {
        "_index" : "user",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.13353139,
        "_source" : {
          "name" : "空痕影3",
          "age" : 20,
          "birth" : "1996-12-12",
          "sex" : "男",
          "device" : "huawei mate phone"
        }
      }
    ]
  }
}
Analysis: the English analyzer splits on whitespace, so the query string is broken into three terms (huawei, mate, and book), which are matched against the analyzed terms of the device field. Document 2 contains all three terms and scores highest; documents 1 and 3 match only huawei and mate.
match_all: matches all documents
GET user/_search
{
  "query": {
    "match_all": {}
  }
}
multi_match: one condition across multiple fields
// Find records whose name or desc field contains the query string "3".
GET user/_search
{
  "query": {
    "multi_match": {
      "query": "3",
      "fields": ["name", "desc"]
    }
  }
}
Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.3862942,
    "hits" : [
      {
        "_index" : "user",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.3862942,
        "_source" : {
          "name" : "空痕影3",
          "age" : 20,
          "birth" : "1996-12-12",
          "sex" : "男",
          "device" : "huawei mate phone"
        }
      },
      {
        "_index" : "user",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 0.6931471,
        "_source" : {
          "name" : "空痕影4",
          "age" : 20,
          "birth" : "1996-12-12",
          "sex" : "男",
          "device" : "huawei mate phone",
          "desc" : "这是第3条数据"
        }
      }
    ]
  }
}
match_phrase: phrase query
// Find documents in which the whole group of terms matches as a phrase, i.e. the field contains both mate and book, adjacent and in that order.
GET user/_search
{
  "query": {
    "match_phrase": {
      "device": "mate book"
    }
  }
}
Result:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.3093333,
    "hits" : [
      {
        "_index" : "user",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.3093333,
        "_source" : {
          "name" : "空痕影2",
          "age" : 15,
          "birth" : "2011-12-12",
          "sex" : "男",
          "device" : "huawei mate book"
        }
      }
    ]
  }
}
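The difference between match and match_phrase can be imitated with a toy model in Python. This sketch assumes an analyzer that only lowercases and splits on whitespace, the default OR operator for match, and a slop of 0 for match_phrase; scoring is not modelled, and the function names are hypothetical.

```python
def analyze(text):
    """Stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(field_value, query):
    """match with the OR operator: any query term occurs in the field."""
    field_terms = set(analyze(field_value))
    return any(term in field_terms for term in analyze(query))


def match_phrase(field_value, query):
    """match_phrase with slop 0: query terms occur adjacent and in order."""
    field_terms = analyze(field_value)
    query_terms = analyze(query)
    n = len(query_terms)
    return n > 0 and any(
        field_terms[i:i + n] == query_terms
        for i in range(len(field_terms) - n + 1)
    )


# The device values of the three documents used in the examples above.
devices = {"1": "huawei mate pad", "2": "huawei mate book", "3": "huawei mate phone"}
match_ids = [i for i, d in devices.items() if match(d, "huawei mate book")]
phrase_ids = [i for i, d in devices.items() if match_phrase(d, "mate book")]
print(match_ids, phrase_ids)
```

All three documents satisfy match because each shares at least one term with the query, while only document 2 contains mate and book next to each other.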
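Exact Lookup: Term
term: matches results that are exactly equal to the search term.
Differences between term and match_phrase:
- match_phrase analyzes the search keywords. Every resulting term must be present among the analyzed terms of the searched field, in the same order and, by default, contiguous.
- term does not analyze the search string. Whether the field value in the source document is analyzed is controlled by using the keyword type.
Neither term nor keyword involves analysis, but they apply to different things:
- term means the search string is not analyzed.
- keyword is a field type; it means the field value in the source data is not analyzed.
// The ik analyzer splits "NFC手机" into two terms, nfc and 手机.
// Matches only if the analyzed terms of the name field contain the complete term "nfc手机".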
GET product/_search
{
"query": {
"term": {
"name": {
"value": "nfc手机"
}
}
}
}
// keyword: the value is not analyzed when it is stored and indexed.
// Matches if the source value of the name field is exactly "nfc手机".
GET product/_search
{
"query": {
"term": {
"name.keyword": {
"value": "nfc手机"
}
}
}
}
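The two term queries above behave differently because they are compared against different things. The Python sketch below models that under stated assumptions: the token list is the ik output described in the comments, not something computed here, and the document with name "nfc手机" is hypothetical.

```python
def term_on_text(indexed_terms, value):
    """term against an analyzed text field: value must equal one indexed term."""
    return value in indexed_terms


def term_on_keyword(stored_value, value):
    """term against a keyword field: value must equal the whole stored value."""
    return stored_value == value


# Hypothetical document whose name is "nfc手机".
name_keyword = "nfc手机"
# Assumed analyzer output for the text field, as described in the comments above.
name_terms = ["nfc", "手机"]

print(term_on_text(name_terms, "nfc手机"))       # the full string is not an indexed term
print(term_on_text(name_terms, "nfc"))           # a single indexed term is
print(term_on_keyword(name_keyword, "nfc手机"))  # the keyword value is compared whole
```

This is why a term query for the full string usually belongs on the `.keyword` sub-field rather than on the analyzed field.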
terms: matches results that match any item in the list of search terms
// Find products whose name field contains the term 小米 or the term nfc.
GET product/_search
{
"query": {
"terms": {
"name": [
"小米",
"nfc"
]
}
}
}
range: range lookup
// Find records with a price of at least 1000 and at most 3000 (gte and lte are inclusive).
GET product/_search
{
"query": {
"range": {
"price": {
"gte": 1000,
"lte": 3000
}
}
}
}
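The boundary behaviour of the four range operators is easy to get wrong, so here is a small Python sketch of it. The `in_range` helper and the price list are made up for illustration.

```python
def in_range(value, gt=None, gte=None, lt=None, lte=None):
    """Apply range-style bounds; gte/lte are inclusive, gt/lt are exclusive."""
    if gt is not None and not value > gt:
        return False
    if gte is not None and not value >= gte:
        return False
    if lt is not None and not value < lt:
        return False
    if lte is not None and not value <= lte:
        return False
    return True


prices = [999, 1000, 1999, 3000, 3001]  # hypothetical prices
inclusive = [p for p in prices if in_range(p, gte=1000, lte=3000)]
exclusive = [p for p in prices if in_range(p, gt=1000, lt=3000)]
print(inclusive, exclusive)
```

With gte and lte the boundary values 1000 and 3000 are kept; switching to gt and lt drops them.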
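Filter
The main differences between query and filter:
- filter is result-oriented, while query is process-oriented.
- query asks "how relevant is this document to the query", while filter asks "does this document satisfy the condition". In other words, a query computes a relevance score for every result and a filter does not.
- filter results can also be cached, which improves query efficiency.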
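Compound Queries: Bool query
A bool query combines multiple query conditions. It follows a more-matches-is-better approach, so the scores from each matching must and should clause are added together to give the document's final score.
// Format:
{
  "bool" : {
    "must" : [],
    "should" : [],
    "filter" : [],
    "must_not" : []
  }
}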
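- must: required. The clause (query) must appear in matching documents and contributes to the score.
- filter: required, without scoring. The clause (query) must appear in matching documents, but unlike must its score is ignored. Filter clauses run in filter context, which means scoring is ignored and the clauses are considered for caching.
- should: optional (an "or" condition). The clause (query) should appear in matching documents.
- must_not: forbidden (a "not" condition), without scoring. The clause (query) must not appear in matching documents. The clauses run in filter context, which means scoring is ignored and the clauses are considered for caching. Because scoring is ignored, a score of 0 is returned for all documents.
minimum_should_match: specifies the number or percentage of should clauses that a returned document must match. If the bool query contains at least one should clause and no must or filter clauses, the default is 1; otherwise the default is 0.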
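The default for minimum_should_match is the part of the bool query that most often surprises people, so the sketch below models just the matching logic in Python. Clauses are represented as plain predicates, only integer values of minimum_should_match are handled, and scoring is left out; all names and sample documents are hypothetical.

```python
def default_minimum_should_match(bool_query):
    """Default described above: 1 with should but no must/filter, else 0."""
    has_should = bool(bool_query.get("should"))
    has_must_or_filter = bool(bool_query.get("must")) or bool(bool_query.get("filter"))
    return 1 if has_should and not has_must_or_filter else 0


def bool_matches(doc, bool_query):
    """Decide whether doc matches; each clause is a plain Python predicate."""
    required = bool_query.get("must", []) + bool_query.get("filter", [])
    if not all(clause(doc) for clause in required):
        return False
    if any(clause(doc) for clause in bool_query.get("must_not", [])):
        return False
    needed = bool_query.get("minimum_should_match",
                            default_minimum_should_match(bool_query))
    matched = sum(1 for clause in bool_query.get("should", []) if clause(doc))
    return matched >= needed


# Hypothetical predicates and documents, for illustration only.
def is_cheap(doc):
    return doc["price"] <= 3000


def has_nfc(doc):
    return "nfc" in doc["tags"]


phone = {"price": 2500, "tags": ["nfc"]}
tablet = {"price": 2500, "tags": []}

only_should = {"should": [has_nfc]}
filter_and_should = {"filter": [is_cheap], "should": [has_nfc]}

print(bool_matches(tablet, only_should), bool_matches(tablet, filter_and_should))
```

The tablet fails the should-only query because one should clause is then required, but passes once a filter clause is present, where should only influences the score.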