elasticsearch v6.3.0 sql group-order

 

Overview

es v6.3.0之后,推出了es-SQL的支持。今天来试试这个功能。


测试数据集

geonames


简单语句

在简单语句的情况下,这个功能ok,具体表现如下,

simple es-sql

# execute
curl -X POST "$HOST/_xpack/sql?format=txt" -H 'Content-Type: application/json' -d'
{
    "query": "SELECT * FROM bm ORDER BY longitude DESC limit 3",
    "fetch_size": 3
}'

# translate to es DSL
curl -X POST "$HOST/_xpack/sql/translate?pretty" -H 'Content-Type: application/json' -d'
{
    "query": "SELECT * FROM bm ORDER BY longitude DESC limit 3",
    "fetch_size": 3
}'

# execute2(双引号里面的字符串)
curl -X POST "$HOST/_xpack/sql?format=txt" -H 'Content-Type: application/json' -d"
{
    \"query\": \"SELECT country_code, population AS sum_pop FROM bm WHERE population > 1 AND country_code = 'CN' ORDER BY population DESC\",
    \"fetch_size\": 11
}"

translate from es DSL

稍复杂语句

mysql

我们先看在mysql数据库下面,这些复杂语句的语法准确性mysql-process

es-sql

only group by

only group by

translate from es DSL与execute的返回结果一致

group by with order by

当在group by之后添加order by,es-sql就不能正常解析了。而在es-DSL里面是可以实现这个agg-sort功能的。

es-sql fail with group-order

根据上一节的without order by解析出来的DSL,再配合agg-sort这个功能,来实现group-order。 without order

with order

# without order
curl -X POST "$HOST/bm/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size" : 0,
  "query" : {
    "range" : {
      "population" : {
        "from" : 0,
        "to" : null,
        "include_lower" : true,
        "include_upper" : false,
        "boost" : 1.0
      }
    }
  },
  "_source" : false,
  "stored_fields" : "_none_",
  "aggregations" : {
    "groupby" : {
      "composite" : {
        "size" : 11,
        "sources" : [
          {
            "1674" : {
              "terms" : {
                "field" : "country_code",
                "order" : "asc"
              }
            }
          }
        ]
      },
      "aggregations" : {
        "1683" : {
          "sum" : {
            "field" : "population"
          }
        }
      }
    }
  }
}
'

# with order
curl -X POST "$HOST/bm/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size" : 0,
  "query" : {
    "range" : {
      "population" : {
        "from" : 0,
        "to" : null,
        "include_lower" : true,
        "include_upper" : false,
        "boost" : 1.0
      }
    }
  },
  "_source" : false,
  "stored_fields" : "_none_",
  "aggregations" : {
    "groupby" : {
      "composite" : {
        "size" : 11,
        "sources" : [
          {
            "1674" : {
              "terms" : {
                "field" : "country_code",
                "order" : "asc"
              }
            }
          }
        ]
      },
      "aggregations" : {
        "1683" : {
          "sum" : {
            "field" : "population"
          }
        }
        ,"population_bucket_sort": {
            "bucket_sort": {
                "sort": [
                  {"1683": {"order": "desc"}}
                ]
            }
        }
      }
    }
  }
}
'

Others

es-sql source code

不知道这个fix/enhancement是否可以在es-string通过antlr义成AST的es-DSL。有时间再回头看这个issue

costin回复说Bucket Sort Aggregation只是局部排序,非全局排序。但是至于如何实现全局排序,我仍然没有弄明白。

costin reply