elasticsearch - field filterable but not searchable - elasticsearch

Using elastic 2.3.5. Is there a way to make a field filterable, but not searchable? For example, I have a language field, with values like en-US. Setting several filters in query->bool->filter->term, I'm able to filter the result set without affecting the score, for example, searching for only documents that have en-US in the language field.
However, I want a query searching for the term en-US to return no results, since this is not really an indexed field for searching, but just so I can filter.
Can I do this?

Elasticsearch uses an _all field to allow fast full-text search across entire documents. This is why searching for en-US across all fields returns the documents containing 'language':'en-US'.
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-all-field.html
You can specify "include_in_all": false in the mapping to exclude a field from _all.
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"title": {
"type": "string"
},
"country": {
"type": "string"
},
"language": {
"type": "string",
"include_in_all": false
}
}
}
}
}
In this example, searching for 'US' in the _all field will return only documents containing US in title or country, but you will still be able to filter your query using the language field.
https://www.elastic.co/guide/en/elasticsearch/reference/current/include-in-all.html
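As a rough illustration of the combination described above, the following Python sketch builds a search body that scores only on title and country while restricting results with a non-scoring term filter on language. The function name filtered_search and the sample query text are hypothetical; the field names come from the example mapping. It also assumes the language field is indexed so that the exact term en-US is present (for example with "index": "not_analyzed" on 2.x), since a term filter compares against the indexed term as-is.

```python
import json


def filtered_search(text, language, search_fields=("title", "country")):
    """Build a search body that scores on `search_fields` and filters on `language`."""
    return {
        "query": {
            "bool": {
                # Scored full-text part: only the listed fields are searched.
                "must": {
                    "multi_match": {"query": text, "fields": list(search_fields)}
                },
                # Non-scoring part: restricts the result set by exact term.
                "filter": {"term": {"language": language}},
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the body of GET my_index/_search.
    print(json.dumps(filtered_search("US", "en-US"), indent=2))
```

Naming the searched fields explicitly is an alternative to relying on _all, so the language value never takes part in scoring.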

Related

How to conditionally apply an analyzer at index time to a field that could be one of many languages?

I have documents with a field (e.g. input_text) that contains a string that could be one of 20 odd languages. I have another field that has the short form of the language (e.g. lang)
I want to conditionally apply an analyzer at index time to the text field dependent on what the language is as detected from the language field.
I eventually want a Kibana dashboard with a single word cloud of the most common words in the text field (ie in multiple languages) but only words that have been stemmed and tokenized with stop words removed.
Is there a way to do this?
The elasticsearch documents suggest using multiple fields for each language and then specifying an analyzer for the appropriate field, but I can't do this as there are 20 some languages and this would overload my nodes.
There is no way to achieve what you want in Elasticsearch (applying an analyzer to field A based on the value of field B).
I would recommend creating one index per language, then creating an index alias that groups all those indices, and querying against the alias.
PUT lang_de
{
"mappings": {
"properties": {
"input_text": {
"type": "text",
"analyzer": "german"
}
}
}
}
PUT lang_en
{
"mappings": {
"properties": {
"input_text": {
"type": "text",
"analyzer": "english"
}
}
}
}
POST _aliases
{
"actions": [
{
"add": {
"index": "lang_*",
"alias": "lang"
}
}
]
}
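With one index per language, the client has to pick the target index from the lang field when indexing. A minimal sketch of that routing, assuming each index is named lang_ followed by the short language code; the helper names index_for and bulk_body and the sample documents are hypothetical:

```python
import json


def index_for(lang, prefix="lang_"):
    """Return the per-language index name for a short language code."""
    return prefix + lang.lower()


def bulk_body(docs):
    """Build a newline-delimited _bulk body that routes each doc by its `lang` field."""
    lines = []
    for doc in docs:
        # Action line: tells the bulk API which index receives the next document.
        lines.append(json.dumps({"index": {"_index": index_for(doc["lang"])}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk API expects a trailing newline


if __name__ == "__main__":
    docs = [
        {"lang": "de", "input_text": "Die Katzen schlafen"},
        {"lang": "en", "input_text": "The cats are sleeping"},
    ]
    print(bulk_body(docs), end="")
```

The resulting text would be sent as the body of POST _bulk, while searches and the Kibana visualization go through the lang alias.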

Elastic Search - store only field

Is there an option in Elasticsearch to store values just for the purpose of retrieving them, without using them for searching? So when indexing we'll index all fields, and when searching we'll search on a single field only, but we need the other data as well.
For example, we'll index products, fields could be Name, SKU, Supplier Name etc. Out of which, only Name needs to be indexed and searched. SKU and Supplier Name are just for storing and retrieving with a search.
Since the _source document is stored anyway, the best way to achieve what you want is to neither store nor index any fields, except the one you're searching on, like this:
PUT my-index
{
"mappings": {
"_source": {
"enabled": true <--- true by default, but adding for completeness
},
"properties": {
"name": {
"type": "text",
"index": true <--- true by default, but adding for completeness
},
"sku": {
"type": "keyword",
"index": false, <--- don't index this field
"store": false <--- false by default, but adding for completeness
},
"supplier": {
"type": "keyword",
"index": false, <--- don't index this field
"store": false <--- false by default, but adding for completeness
}
}
}
}
So to sum up:
the fields you want to search on must have index: true
the fields you don't want to search on must have index: false
store is false by default so you don't need to specify it
_source is enabled by default, so you don't need to specify it
enabled should only be used at the top-level or on object fields, so it doesn't have its place here
With the above mapping, you can
search on name
retrieve all fields from the _source document since the _source field is stored by default and contains the original document
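For illustration, a search against such a mapping could look like the body built below: the query touches only name, and the other fields come back from _source. This is a sketch; the function name name_search and the sample search text are hypothetical, and the _source list in the body is optional since the whole _source is returned by default.

```python
import json


def name_search(text, returned_fields=("name", "sku", "supplier")):
    """Build a search body that queries `name` and returns the listed _source fields."""
    return {
        # Only `name` is indexed, so it is the only field that can be queried.
        "query": {"match": {"name": text}},
        # Fields are read back from the stored _source document, not from the index.
        "_source": list(returned_fields),
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the body of GET my-index/_search.
    print(json.dumps(name_search("garden chair"), indent=2))
```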

Elasticsearch copy_to not working on keyword field

I am trying to copy two fields onto a third field, which should have the type 'keyword' (because I want to be able to aggregate by it, and do not need to perform a full-text search)
PUT /test/_mapping/_doc
{
"properties": {
"first": {
"copy_to": "full_name",
"type": "keyword"
},
"last": {
"copy_to": "full_name",
"type": "keyword"
},
"full_name": {
"type": "keyword"
}
}
}
I then post a new document:
POST /test/_doc
{
"first": "Bar",
"last": "Foo"
}
And query it using the composite field full_name:
GET /test/_search
{
"query": {
"match": {
"full_name": "Bar Foo"
}
}
}
And no hits are returned.
If the type of the composite field full_name were text then it works as expected and described in the docs:
https://www.elastic.co/guide/en/elasticsearch/reference/current/copy-to.html
Is it not possible to copy onto a keyword-type field?
The problem is the combination of a keyword target field and a multi-word match query. Keyword fields are, according to the ES documentation, "...only searchable by their exact value."
copy_to copies each source value into full_name as a separate value, so the field holds the two keywords Bar and Foo rather than the single string "Bar Foo". A match query on a keyword field does not split the query text into terms, so it looks for the exact value "Bar Foo", which matches nothing.
You have a few options I can think of in this case:
Change the field type to text, so that both the indexed values and the match query are analyzed into terms.
If you also need case-insensitive matching on the keyword field, add a normalizer that performs lower-casing.
Don't query more than a single value at a time (for example Bar), and use a term query instead of match.
It seems that the destination field of copy_to needs to be of type text for this kind of multi-word query to work.
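One way to picture the difference between the two target types is the toy model below. It is not how Elasticsearch is implemented. It assumes that copy_to adds each source value to the target as a separate value, that a keyword field matches only on a whole value, and that a text field matches when any lowercased, whitespace-separated term is shared. All function names are hypothetical.

```python
def copy_to_values(doc, sources):
    """Collect the values copied into the target field, one entry per source value."""
    return [doc[name] for name in sources if name in doc]


def keyword_match(values, query):
    """Keyword-style matching: the query must equal one stored value exactly."""
    return query in values


def text_match(values, query):
    """Text-style matching (simplified): lowercase, split on whitespace, any shared term."""
    indexed = {term for value in values for term in value.lower().split()}
    return any(term in indexed for term in query.lower().split())


if __name__ == "__main__":
    full_name = copy_to_values({"first": "Bar", "last": "Foo"}, ["first", "last"])
    print(full_name)                            # ['Bar', 'Foo']
    print(keyword_match(full_name, "Bar Foo"))  # False
    print(keyword_match(full_name, "Bar"))      # True
    print(text_match(full_name, "Bar Foo"))     # True
```

Under these assumptions the keyword target still works for aggregations and for single-value lookups such as Bar, but not for a combined "Bar Foo" query.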

Fields that need not be searchable in ElasticSearch

I am using ElasticSearch v6 to search my product catalog.
My product has a number fields, such as title, description, price, etc... one of the fields is: photo_path, which would contain the location of product photo on disk.
photo_path does not need to be searched, but it needs to be retrieved.
Question: Is there a way to mark this field as not searchable/not indexed? And is this a good idea, for example will I save storage/process time, by marking this field not searchable.
I have seen this answer and read, _source and _all, but since _all is deprecated in version 6, I am confused what to do.
If you want a field to be neither indexed nor queryable, set "index": false on it. If you only want the photo_path field returned in search results, include only that field in _source (this saves disk space and fetches less data from disk), as in the mapping below:
{
"mappings": {
"data": {
"_source": {
"includes": [
"photo_path" // search result only contains this
]
},
"properties": {
"photo_path": {
"type": "keyword",
"doc_values": false, // Set docValues as false if you don't want to use this field to sort/aggregate
"index": false // Not index this field
},
"title": {
"type": "..."
}
}
}
}
}
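The effect of the _source includes list can be pictured with the small sketch below. It is a simplification that handles only exact top-level field names, whereas the real setting also accepts wildcard patterns. The function name and the sample product are hypothetical.

```python
def apply_source_includes(doc, includes):
    """Keep only the top-level fields listed in `includes` (no wildcard support here)."""
    return {key: value for key, value in doc.items() if key in includes}


if __name__ == "__main__":
    product = {
        "title": "Garden chair",
        "price": 49.0,
        "photo_path": "/photos/chair.jpg",
    }
    # Only photo_path survives in the stored _source.
    print(apply_source_includes(product, ["photo_path"]))
```

Fields left out of _source, such as title here, can still be indexed and searched, but their original values are no longer returned with the hits.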

Elastic search map multiple fields to a single field

Does Elasticsearch provide functionality to map different fields to a single field and use that single field for search?
For example, _all refers to all the fields in the docs.
Similarly, is there a mapping configuration to define a field that refers to multiple fields?
For example, I have fields called Brand, Name and Category.
I need to map Brand and Name to a single field, custome_field.
I want this at mapping time and not at query time. I know cross fields does that at query time.
Take a look at copy_to functionality. It acts just like a custom _all. See here more about this:
In Metadata: _all field we explained that the special _all field
indexes the values from all other fields as one big string. Having all
fields indexed into one field is not terribly flexible though. It
would be nice to have one custom _all field for the person’s name, and
another custom _all field for their address.
Elasticsearch provides us with this functionality via the copy_to
parameter in a field mapping:
PUT /my_index
{
"mappings": {
"person": {
"properties": {
"first_name": {
"type": "string",
"copy_to": "full_name"
},
"last_name": {
"type": "string",
"copy_to": "full_name"
},
"full_name": {
"type": "string"
}
}
}
}
}
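Applied to the fields from the question, the same pattern could be generated as in the sketch below. The helper copy_to_mapping is hypothetical, and the default field type text is an assumption for versions where string has been replaced by text and keyword; on older versions the type would be string, as in the quoted example.

```python
import json


def copy_to_mapping(sources, target, field_type="text"):
    """Build a `properties` block where every source field is copied into `target`."""
    properties = {
        name: {"type": field_type, "copy_to": target} for name in sources
    }
    # The target must be mapped as well; it receives the copied values.
    properties[target] = {"type": field_type}
    return {"properties": properties}


if __name__ == "__main__":
    # Brand and Name feed custome_field; Category is left out on purpose.
    print(json.dumps(copy_to_mapping(["Brand", "Name"], "custome_field"), indent=2))
```

Searching custome_field with a match query then covers both Brand and Name without any query-time field list.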

Resources