Field not searchable in ES? - elasticsearch

I created an index myindex in elasticsearch, loaded a few documents into it. When I visit:
localhost:9200/myindex/mytype/1023
I noticed that my particular index has the following metadata for mappings:
mappings: {
mappinggroupname: {
properties: {
Aproperty: {
type: string
}
Bproperty: {
type: string
}
}
}
}
Is there some way to add "store:yes" and index: "analyzed" without having to reload/reindex all the documents?
Note that when i want to view a single document...
i.e. localhost:9200/myindex/mytype/1023
I can see the _source field contains all the fields of that document are and when I go to the "Browser" section of the head plugin it appears that all the columns are correct and corresponding to my fieldnames. So why is it that "stored" is not showing up in metadata? I can even perform a _search on them.
What is the difference between "stored":"true" versus the fact that I can see all my fields and values after indexing all my documents via the means I mention above?

Nope, no way! That's how your documents got indexed in the underlying lucene. The only way to change it is to reindex them all!
You see all those fields because you see the content of the special _source field in lucene, that's stored by default through elasticsearch. You are not storing all the fields separately but you do have the source document that you originally indexed through the _source, a single field that contains the whole document.
Generally the _source field is just enough, you don't usually need to configure every field as stored.
Also, the default is "index":"analyzed" if not specified for all the string fields. That means those fields are indexed and analyzed using the standard analyzer if not specified in the mapping. Therefore, as far as I can see from your mapping those two fields should be indexed, thus searchable.

Related

Best way to add data elastic search

I am posting data so I can search it later in elastic search.
I am posting it like this:
POST https://localhost:9200/superheroes/_doc/27
body:
{
"name": "Flash",
"super_power": "Super Speed"
}
This is automatically stored on a _source object... but I read the _source field itself is not indexed (and thus is not searchable) online, and the entire purpose of this application is quick search by value... like if I wanna know which superheroes have super speed and I write super speed on the search bar.
you are right about the _source field, but if you don't define the mapping for your fields, Elasticsearch generates the default mapping with default param for these fields . You can read more about the mapping param and in your case, if you want field to be searchable it needs to be the part of the inverted index and this is controlled by the index param.
You can override the value of these params by defining your own mapping(recommended otherwise Elasticsearch will guess the field type based on the first data you index in a field).
Hope this helps and clears your doubt, in short if you are not defining your mapping, your text data is searchable by default.

How to only store the index,not the original text in ES

I am using elastic search 7.10.1. I would store and search against my blogs. The blog has id,title and content fields.
I would like to search against id, title and content, but since the content of blog is too big, so that I would like to save the original content text outside of Elastic Search, such as HBase.
I would ask how to achieve this in ES?
If you are using a static mapping then simply don't define your content field in your index mapping, and don't populate it while indexing your document to ES.
Refer to Mapping param for more info, and specifically, store param default false which means you can't retrieve field value if _source(true by default) is also disabled.
index param default true, which controls whether the field is searchable or not, in your case if you don't want to search and retrieve it you have to disable these two params.

Does Elasticsearch store or not store field values by default?

In Elasticsearch all fields of a mapping have a stored property which determines whether the data of the field will be stored on disk (in addition to the storing of the whole _source).
It defaults to false.
However each segment in every shard also has a Docvalues structure per field in the mapping. The structure stores the value of the field for all documents in the segment.
All documents and fields are included in the structured by default.
So on one hand, by default Elasticsearch doesn't store the values for fields. On the other hand, it does store the values in the Docvalues structure.
So which is it? Does Elasticsearch store or not store values by default?
ES stores the same field in multiple formats for different purposes.
For eg. Consider this :
"prop_1":{ "type":"string", "index":"not_analyzed","store":true,"doc_values":true}
prop_1 would be stored on its own as an indexed, doc_values and
stored field. On top of that the prop_1 is stored to into the
_source field together with your other fields.
As explained above, even if stored:false, the field data is still persisted on disk in multiple formats.
Stored fields are designed for optimal storage whereas doc values are
designed the access field values quickly. During the execution of
query many of doc values fields are accessed for candidate hits, so
the access must be fast. This the reason why you should use doc values
in sorting, aggregations and scrips.On the other hand stored fields should be used for return field values for the top matching documents.
Now, you can use doc_values to return fields in response as well :-
GET /_search
{
"query" : {
"match_all": {}
},
"docvalue_fields" : ["test1", "test2"]
}
Doc value fields can work on fields that are not stored. So IMO, stored fields do not have any significance now.

combine fields of different documents in same index

I have 2 fields type in my index;
doc1
{
"category":"15",
"url":"http://stackoverflow.com/questions/ask"
}
doc2
{
"url":"http://stackoverflow.com/questions/ask"
"requestsize":"231",
"logdate":"22/12/2012",
"username":"mehmetyeneryilmaz"
}
now I need such a query that filter in same url field and returns fields both of documents:
result:
{
"category":"15",
"url":"http://stackoverflow.com/questions/ask"
"requestsize":"231",
"logdate":"22/12/2012",
"username":"mehmetyeneryilmaz"
}
The results given by elasticsearch are always per document, means that if there are multiple documents satisfying your query/filter, they would always appear as a different documents in the result and never merged into a single document. Hence merging them at client side is the one option which you can use. To avoid getting complete document and just to get the relevant fields, you can use "fields" in your query.
If this is not what you need and still needs narrowing down the result from the query itself, you can use top hit aggregations. It will give you the complete list of documents under a single bucket. But it would also have source field which would contain the complete documents itself.
Try giving a read to page:
https://www.elastic.co/guide/en/elasticsearch/reference/1.4/search-aggregations-metrics-top-hits-aggregation.html

Can I index nested documents on ElasticSearch without mapping?

I have a document whose structure changes often, how can I index nested documents inside it without changing the mapping on ElasticSearch?
You can index documents in Elasticsearch without providing a mapping yes.
However, Elasticsearch makes a decision about the type of a field when the first document contains a value for that field. If you add document 1 and it has a field called item_code, and in document 1 item_code is a string, Elasticsearch will set the type of field "item_code" to be string. If document 2 has an integer value in item_code Elasticsearch will have already set the type as string.
Basically, field type is index dependant, not document dependant.
This is mainly because of Apache Lucene and the way it handles this information.
If you're having a case where some data structure changes, while other doesn't, you could use an object type, http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-object-type.html.
You can even go as far as to use "enabled": false on it, which makes elasticsearch just store the data. You couldn't search it anymore, but maybe you actually don't even want that?

Resources