ElasticSearch - How can I get all of a document's fields?

I'm trying to investigate an ElasticSearch index for which I have no documentation. Some of the documents in this index have parent-child relationships. So I issued:
curl -XGET 'http://localhost:9200/myindex/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": {
"has_parent": {
"type": "entity",
"query": {
"term": {
"_id": "PROFILE_19986956"
}
}
}
}
}'
And got:
"hits" : {
"total" : 13,
"max_score" : 1.0,
"hits" : [ {
"_index" : "myindex",
"_type" : "property",
"_id" : "PROFILE_19986956_name",
"_score" : 1.0
},
...
]
}
Now I want to get the value of the document with ID PROFILE_19986956_name so I do curl -XGET 'http://localhost:9200/myindex/property/PROFILE_19986956_name?routing=0&pretty' and get:
{
"_index" : "myindex",
"_type" : "property",
"_id" : "PROFILE_19986956_name",
"_version" : 3,
"found" : true
}
The response has no value for the name, which I was expecting to get. I know it has to be there, because searching for the entity's name yields a result, but for some reason I can't get the field that contains the name. How can I get ES to show it?

Look at the mapping. I think the fields are indexed but the source is disabled. Try:
curl -XGET 'http://localhost:9200/myindex?pretty'
and see if the mapping has:
"_source": {
"enabled": false
}
If you see this, the original source of the documents was not stored in Elasticsearch, so you can't retrieve it.
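The check above can also be scripted. The following is a minimal sketch, assuming the JSON returned by the index request has already been parsed into a Python dict. The helper name source_disabled_types and the sample response are hypothetical, and both the typed (older) and typeless (newer) mapping layouts are handled on a best-effort basis.

```python
def source_disabled_types(index_response, index_name):
    """Return the mapping names whose _source is explicitly disabled.

    index_response is the parsed JSON of GET /<index> or GET /<index>/_mapping.
    """
    mappings = index_response.get(index_name, {}).get("mappings", {})
    # Typeless layout (newer versions): _source sits directly under "mappings".
    if "_source" in mappings or "properties" in mappings:
        candidates = {"(typeless)": mappings}
    else:
        # Typed layout (older versions): one entry per mapping type.
        candidates = mappings
    return sorted(
        name
        for name, body in candidates.items()
        if isinstance(body, dict)
        and body.get("_source", {}).get("enabled") is False
    )


# Hypothetical response, shaped like the index in the question.
response = {
    "myindex": {
        "mappings": {
            "entity": {"properties": {"name": {"type": "string"}}},
            "property": {
                "_source": {"enabled": False},
                "_parent": {"type": "entity"},
                "properties": {"value": {"type": "string"}},
            },
        }
    }
}
print(source_disabled_types(response, "myindex"))  # ['property']
```

An empty list means no mapping explicitly disables _source, in which case the missing fields have another cause.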

Related

Delete all documents where id starts with a number in Elasticsearch

What is the fastest way to get all _ids?
I need a query to delete all documents whose _id starts with a number in Elasticsearch.
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "_2432475",
"_score" : 1.0,
"_source" : {
"name" : "999",
"file" : null,
"age" : null
}
},
Your best bet is to first copy the internal _id into a doc-level field (let's call it internal_id):
POST myindex/_update_by_query
{
"query": {
"match_all": {}
},
"script": {
"source": "ctx._source.internal_id = ctx._id",
"lang": "painless"
}
}
and then use a match_phrase_prefix query like so:
GET myindex/_search
{
"query": {
"match_phrase_prefix": {
"internal_id": "_24"
}
}
}
curl -X POST 'http://localhost:9200/myindex/_delete_by_query' \
-H 'Content-Type: application/json' \
-d '{
"query": {
"terms": {
"_id": [ "1", "2" ]
}
}
}'
Wildcards on _id are not supported in Elasticsearch. Either you have to index a similar key explicitly into the doc, or
you can update the docs using _update_by_query and add the _id value as a field.
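If the ids have already been fetched (for example from a search or scroll response), the filtering can also be done on the client and fed into the terms-based _delete_by_query shown above. This is only a sketch: the helper names ids_matching and delete_by_ids_body and the sample hits are hypothetical, and the default pattern assumes "starts with a number" means a leading digit.

```python
import re


def ids_matching(hits, pattern=r"\d"):
    """Collect the _id of every hit whose _id starts with the given regex."""
    prefix = re.compile(pattern)
    # re.match only matches at the beginning of the string.
    return [hit["_id"] for hit in hits if prefix.match(hit["_id"])]


def delete_by_ids_body(ids):
    """Build a _delete_by_query body that targets exactly these ids."""
    return {"query": {"terms": {"_id": list(ids)}}}


# Hypothetical hits; only the _id key is read.
hits = [{"_id": "_2432475"}, {"_id": "17"}, {"_id": "abc"}, {"_id": "9x"}]
numeric_ids = ids_matching(hits)
print(delete_by_ids_body(numeric_ids))
```

Since the sample id in the question starts with an underscore, a pattern such as r"_?\d" may be closer to what is needed there.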

Elasticsearch Sort By Epoch MilliSeconds Timestamp

I have the ES document structure as below.
"hits" : [
{
"_index" : "testindex",
"_type" : "_doc",
"_id" : "566d9a9d-62d4-4dcd-b3f3-c0598638fa43",
"_score" : 1.0,
"_source" : {
"values" : {
"isActive" : "false",
"length" : 18.49,
"latitude" : 33.69076,
"accuracy" : 7
},
"metadata" : {
"name" : "866425030270849",
"type" : "BAT-M1",
"ts" : "1572493157000"
}
}
},
I want to sort the index by metadata.ts (a date field with format 'epoch_millis'). I am using the following query to get the latest record.
curl -X GET "https://localhost:9200/testindex/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query" : {
"term" : { "metadata.name" : "866425030270849" }
},
"sort": [
{ "devicedata.metadata.ts": "desc" }
],
"size": 1
}
'
But I am unable to get the most recent record. Please help!
In the query, devicedata is the nested object that contains metadata.

Returning all documents when query string is empty

Say I have the following mapping:
{
'properties': {
'title': {'type': 'text'},
'created': {'type': 'text'}
}
}
Sometimes the user will query by created, and sometimes by title and created. In both cases I want the query JSON to be as similar as possible. What's a good way to create a query that filters only by created when the user is not using the title to query?
I tried something like:
{
bool: {
must: [
{range: {created: {gte: '2010-01-01'}}},
{query: {match_all: {}}}
]
}
}
But that didn't work. What would be the best way of writing this query?
Your query didn't work because created is of type text, not date. Range queries on dates stored as strings will not work as expected, so you should change the mapping from text to date and reindex your data.
Reindex your data (with the new mappings) step by step.
Now, if I understand correctly, you want a generic query that filters on title and/or created depending on the user input.
In this case, my suggestion is to use a query_string query.
An example (version 7.4.x):
Mappings (created is now of type date instead of text)
PUT my_index
{
"mappings": {
"properties": {
"title": {
"type": "text"
},
"created": {
"type": "date"
}
}
}
}
Index a few documents
PUT my_index/_doc/1
{
"title":"test1",
"created": "2010-01-01"
}
PUT my_index/_doc/2
{
"title":"test2",
"created": "2010-02-01"
}
PUT my_index/_doc/3
{
"title":"test3",
"created": "2010-03-01"
}
Search Query (created)
GET my_index/_search
{
"query": {
"query_string": {
"query": "created:>=2010-02-01",
"fields" : ["created"]
}
}
}
Results
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"title" : "test2",
"created" : "2010-02-01"
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"title" : "test3",
"created" : "2010-03-01"
}
}]
Search Query (title)
GET my_index/_search
{
"query": {
"query_string": {
"query": "test2",
"fields" : ["title"]
}
}
}
Results
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.9808292,
"_source" : {
"title" : "test2",
"created" : "2010-02-01"
}
}
]
Search Query (title and created)
GET my_index/_search
{
"query": {
"query_string": {
"query": "(created:>=2010-02-01) AND test3"
}
}
}
Results
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.9808292,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.9808292,
"_source" : {
"title" : "test3",
"created" : "2010-03-01"
}
}
]
The fields parameter of query_string can list both fields. If you remove fields, the query applies to all fields in your mappings.
Hope this helps
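To keep the request shape identical whether or not the user supplies a title, the query string can be assembled in code. The sketch below is one possible way to do it: the helper name build_search_body is hypothetical, the title clause uses the field-prefixed form title:(...) rather than the bare term from the example, and user input is not escaped here, which it would need to be if it can contain query_string reserved characters.

```python
def build_search_body(created_gte, title=None):
    """Build a query_string search body.

    The created clause is always present; the title clause is added
    only when the user actually supplied a title.
    """
    clauses = ["(created:>=" + created_gte + ")"]
    if title:
        clauses.append("title:(" + title + ")")
    return {"query": {"query_string": {"query": " AND ".join(clauses)}}}


print(build_search_body("2010-02-01"))
print(build_search_body("2010-02-01", "test3"))
```

Both calls return the same structure, and only the query text differs.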

Elasticsearch prefix query not working on date

I have the following documents in Elasticsearch, and I'd like to apply a prefix query on the logtime field, but nothing is returned.
{
"_index" : "test",
"_type" : "fluentd",
"_id" : "6Cn38mMBMKvgU4HnnURh",
"_score" : 1.0,
"_source" : {
"logtime" : "2018-06-11 03:08:02,117",
"userid" : "",
"payload" : "40",
"qs" : "[['I have a dream, that everybody'], ['the'], ['steins']]"
}
}
the prefix query is
curl -X GET "localhost:9200/test/_search" -H 'Content-Type: application/json' -d'{ "query": {"prefix" : { "logtime" : "2018-06-11" }}}'
Could someone help? Thanks a lot.
You can use a range query in that case (this assumes logtime is mapped as a date field), like:
{
"query": {
"range": {
"logtime": {
"gte":"2018-06-11",
"lte": "2018-06-11",
"format": "yyyy-MM-dd"
}
}
}
}
Hope it helps.
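The same body can be generated for any field and day. This is a small sketch that mirrors the range query above: the helper name day_range_body is hypothetical, and it assumes the target field is mapped as a date.

```python
import json


def day_range_body(field, day):
    """Build a range query that uses the same day as lower and upper bound."""
    return {
        "query": {
            "range": {
                field: {"gte": day, "lte": day, "format": "yyyy-MM-dd"}
            }
        }
    }


# Serialize the body so it can be passed to curl with -d.
print(json.dumps(day_range_body("logtime", "2018-06-11"), indent=2))
```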

ElasticSearch ignoring field named 'tags' when specified in "fields"

I have a search index, products, containing a field named tags, which is an array. The tags values appear in results when I don't add a fields section to my query, but when I do, the field is ignored outright and doesn't appear in the results, as shown below.
$ curl -XPOST 'http://localhost:9200/products/_search?pretty' -d '{ "query": {"match_all": {} }, "fields": ["tags", "id", "slug"], "size": 2}'
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 321826,
"max_score" : 1.0,
"hits" : [ {
"_index" : "products",
"_type" : "products",
"_id" : "39969794",
"_score" : 1.0,
"fields" : {
"id" : [ "39969794" ],
"slug" : [ "slug-39969794" ]
}
}, {
"_index" : "products",
"_type" : "products",
"_id" : "21296413",
"_score" : 1.0,
"fields" : {
"id" : [ "21296413" ],
"slug" : [ "slug-21296413" ]
}
} ]
}
}
Is there a reason or known issue for this? Is tags some kind of reserved word for ElasticSearch?
I'm using ES version 1.1.2 (Lucene 4.7).
tags is not an ES reserved word. So that's not your problem.
Is your tags an array of atomic types (numbers, strings or booleans)? Or is it an array of objects?
fields only works with leaf nodes. So "fields": ["tags"] should work fine with an array of strings but it would fail with an array of tag objects.
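To tell the two cases apart, a document's tags value can be inspected on the client. This is an illustrative sketch only: the helper name is_leaf_array and the sample values are hypothetical, and it merely encodes the "array of atomic types" distinction described above.

```python
def is_leaf_array(value):
    """True if value is a list made only of strings, numbers or booleans."""
    atomic = (str, int, float, bool)
    return isinstance(value, list) and all(isinstance(v, atomic) for v in value)


print(is_leaf_array(["red", "blue"]))                        # True
print(is_leaf_array([{"name": "red"}, {"name": "blue"}]))    # False
```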
Confused as to why you are using "fields" instead of "terms"?
$ curl -XPOST 'http://localhost:9200/products/_search?pretty' -d '{
"query": {
"match_all": {}
},
"terms": ["tags", "id", "slug"],
"size": 2
}'
