Elasticsearch Sorting fields anomaly
I am trying to sort a list of employees on certain fields, firstName and lastName, but I have noticed some inconsistent results.
I am running a simple query:
//Return all the employees from a specific company ordering by lastName asc | desc
GET employee-index-sorting/_search
{
"query": {
"bool": {
"filter": {
"term": {
"companyId": 3179
}
}
}
},
"sort": [
{
"lastName.keyword": { <-- Should this be keyword? or not_analyzed
"order": "desc"
}
}
]
}
In the result, why would van der Mescht and van Breda come before Zwane and Zwezwe?
I suspect there is something wrong with my mappings.
{
"_index": "employee-index",
"_type": "_doc",
"_id": "637467",
"_score": null,
"_source": {
"companyId": 3179,
"firstName": "Name",
"lastName": "van der Mescht"
},
"sort": [
"van der Mescht"
]
},
{
"_index": "employee-index",
"_type": "_doc",
"_id": "678335",
"_score": null,
"_source": {
"companyId": 3179,
"firstName": "Name3",
"lastName": "van Breda"
},
"sort": [
"van Breda"
]
},
{
"_index": "employee-index",
"_type": "_doc",
"_id": "113896",
"_score": null,
"_source": {
"companyId": 3179,
"firstName": "Name2",
"lastName": "Zwezwe"
},
"sort": [
"Zwezwe"
]
},
{
"_index": "employee-index",
"_type": "_doc",
"_id": "639639",
"_score": null,
"_source": {
"companyId": 3179,
"firstName": "Name1",
"lastName": "Zwane"
},
"sort": [
"Zwane"
]
}
Mappings
I am posting the entire mapping because I am not sure whether something else might be wrong with it.
How should I change the lastName and firstName properties to allow sorting on them?
PUT employee-index-sorting
{
"settings": {
"index": {
"analysis": {
"filter": {},
"analyzer": {
"keyword_analyzer": {
"filter": [
"lowercase",
"asciifolding",
"trim"
],
"char_filter": [],
"type": "custom",
"tokenizer": "keyword"
},
"edge_ngram_analyzer": {
"filter": [
"lowercase"
],
"tokenizer": "edge_ngram_tokenizer"
},
"edge_ngram_search_analyzer": {
"tokenizer": "lowercase"
}
},
"tokenizer": {
"edge_ngram_tokenizer": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 5,
"token_chars": [
"letter"
]
}
}
}
}
},
"mappings": {
"_doc": {
"properties": {
"employeeId": {
"type": "keyword"
},
"companyGroupId": {
"type": "keyword"
},
"companyId": {
"type": "keyword"
},
"number": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"preferredName": {
"type": "text",
"index": false
},
"firstName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"middleName": {
"type": "text",
"index": false
},
"lastName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"fullName": {
"type": "text",
"fields": {
"keywordstring": {
"type": "text",
"analyzer": "keyword_analyzer"
},
"edgengram": {
"type": "text",
"analyzer": "edge_ngram_analyzer",
"search_analyzer": "edge_ngram_search_analyzer"
}
},
"analyzer": "standard"
},
"terminationDate": {
"type": "date"
},
"companyName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"email": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"idNumber": {
"type": "text"
},
"description": {
"type": "text",
"index": false
},
"jobNumber": {
"type": "keyword"
},
"frequencyId": {
"type": "long"
},
"frequencyCode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"frequencyAccess": {
"type": "boolean"
}
}
}
}
}
elasticsearch kibana dsl
As far as I know you should not add .keyword when querying; use just the field name.
– PeterM
Nov 15 '18 at 20:44
If I take it out I get the following error: "Fielddata is disabled on text fields by default. Set fielddata=true on [lastName] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
– R4nc1d
Nov 16 '18 at 6:46
So I enabled fielddata on the text fields, but I am still not getting the correct results.
– R4nc1d
Nov 19 '18 at 19:34
@PeterM in the example you point at, the field used for sorting just doesn't have the .keyword suffix, but it is indeed a keyword field.
– Val
Nov 20 '18 at 12:26
@PeterM the way it works is that the lastName field is of type text and the synthetic lastName.keyword field is of type keyword. But in the source you only see lastName.
– Val
Nov 21 '18 at 7:21
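To make the text versus keyword distinction from these comments concrete, here is a minimal sketch in Python that only builds the search body as a plain dict (no client library is used). The helper name build_sorted_search is hypothetical and not part of any Elasticsearch API; the field names and the companyId value are taken from the question.

```python
import json


def build_sorted_search(company_id, sort_field="lastName", order="desc"):
    """Build a search body that filters on companyId and sorts on a keyword sub-field.

    Hypothetical helper for illustration only.
    """
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    return {
        "query": {"bool": {"filter": {"term": {"companyId": company_id}}}},
        # Sort on the keyword multi-field: the parent field is of type text,
        # and fielddata is disabled on text fields by default.
        "sort": [{f"{sort_field}.keyword": {"order": order}}],
    }


body = build_sorted_search(3179)
print(json.dumps(body, indent=2))
```

The printed JSON is the same body as in the question; sorting on the bare lastName field instead would trigger the fielddata error quoted above.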
asked Nov 15 '18 at 20:26 by R4nc1d
1 Answer
For sorting you need to use lastName.keyword; that's correct, no need to change anything there.
The reason why van der Mescht and van Breda come before Zwane and Zwezwe is that strings are sorted lexicographically, i.e. basically by their position in the ASCII table, where uppercase characters come before lowercase ones. In ascending order, names starting with an uppercase letter therefore come before names starting with a lowercase letter. But since you're sorting in desc mode, it's exactly the opposite:
- z...
- ...
- van der Mescht
- ...
- van Breda
- ...
- a...
- ...
- Zwezwe
- ...
- Zwane
- ...
- Z...
- ...
- A...
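The ordering above can be reproduced outside Elasticsearch. The following is only a sketch in Python, assuming that comparing strings by code point matches the term order of a keyword field for these plain ASCII names; the four names are the ones from the question.

```python
names = ["Zwane", "van Breda", "Zwezwe", "van der Mescht"]

# Python compares strings by code point. For these ASCII names this matches
# a case-sensitive, byte-wise ordering: every uppercase letter sorts before
# every lowercase letter.
descending = sorted(names, reverse=True)
print(descending)
# ['van der Mescht', 'van Breda', 'Zwezwe', 'Zwane']

# 'Z' has a smaller code point than 'v', so in descending order 'v' comes first.
print(ord("Z"), ord("v"))
# 90 118
```

This is the same order as the hits in the question, which suggests the mapping is behaving as designed rather than being broken.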
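To fix this, simply add a normalizer to your lastName.keyword field, i.e. change your settings and mapping as shown here. The normalizer is applied when documents are indexed, so documents that are already in the index have to be reindexed before the new sort order takes effect: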
{
"settings": {
"index": {
"analysis": {
"filter": {},
"analyzer": {
...
},
"tokenizer": {
...
},
"normalizer": { <-- add this
"lowersort": {
"type": "custom",
"filter": [
"lowercase"
]
}
}
}
}
},
"mappings": {
"_doc": {
"properties": {
...
"lastName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"normalizer": "lowersort", <-- add this
"ignore_above": 256
}
}
},
...
}
}
}
}
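As a rough illustration of what the lowercase normalizer changes, here is a sketch in Python that sorts the same four names on a lowercased key. The lowersort function is only a stand-in named after the normalizer above, and it assumes that Python's str.lower behaves like the Elasticsearch lowercase filter for these ASCII names.

```python
names = ["Zwane", "van Breda", "Zwezwe", "van der Mescht"]


def lowersort(value: str) -> str:
    """Stand-in for the `lowersort` normalizer: lowercase the whole value."""
    return value.lower()


# Sorting on the normalized value makes the comparison case-insensitive.
descending = sorted(names, key=lowersort, reverse=True)
print(descending)
# ['Zwezwe', 'Zwane', 'van der Mescht', 'van Breda']
```

With the normalizer in place, the desc sort should put the Z names ahead of the van names in the same way.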
ah, did not realize that the upper and lowercase would have such an effect. But now I know. So seems to be working will be testing some other scenarios quickly.
– R4nc1d
Nov 20 '18 at 6:33
Awesome, glad it helped!
– Val
Nov 20 '18 at 6:42
Thanks, Val, it is working. I found one scenario where the order does seem a bit off but I am still investigating that. But 99.9% is working
– R4nc1d
Nov 23 '18 at 17:55
Feel free to provide more info and we'll sort this out
– Val
Nov 23 '18 at 18:01
awesome thanks, will def do that
– R4nc1d
Nov 23 '18 at 19:26
answered Nov 20 '18 at 4:51 by Val