I am using Elasticsearch 5.2.2.
In my index I have data that looks like this:
{
"_index": "index",
"_type": "273caf76-ec03-478c-b980-9743180bc863",
"_id": "eee46e24-f383-4ae7-8930-dc3836e030a5",
"_score": 3.41408,
"_source": {
"Father Name": [
{
"id": "some id",
"value": "Some value test test"
}
],
"Mother Name": [
{
"id": "some id",
"value": "Another value haha"
}
],
"Other values": [{ "id": "", "value": "" }]
}
}
When I search with _all, everything works fine and I find all the results with reasonable scores:
{"query":{"match":{"_all":"value"}},"from":0,"size":20}
But that query is searching in all the fields. If I want for instance just to find results in Father Name or in Father Name and Mother Name, then I find nothing.
{"query":{"match":{"Father Name":"value"}},"from":0,"size":20}
My goal is to have a search that behaves like _all but is limited to a few fields.
Your fields Father Name and Mother Name are arrays of inner objects.
To search within the value field within Father Name, for example, do
curl -XGET localhost:9200/myindex/_search?pretty -d '
{
"query": {
"match": {
"Father Name.value": "first"
}
},
"from": 0,
"size": 20
}'
I'm not sure, however, how to query for all fields within Father Name.
Reference: Arrays of Inner Objects
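For searching several fields at once without _all, one option is a multi_match query that lists the sub-fields explicitly. The sketch below only builds the request body as a Python dict; the helper name multi_field_match is hypothetical, and the field names are taken from the question. multi_match also accepts wildcard patterns in fields (for example "Father Name.*"), which may cover the "all fields within Father Name" case, though that has not been verified against this particular index.

```python
def multi_field_match(text, fields, size=20, offset=0):
    """Build a search body that matches `text` against several fields at once.

    `fields` is a list of full field paths such as "Father Name.value".
    """
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),
            }
        },
        "from": offset,
        "size": size,
    }


# Search only in the value sub-field of the two name fields from the question.
body = multi_field_match("value", ["Father Name.value", "Mother Name.value"])
```

The resulting dict can be serialized with json.dumps and sent as the body of a _search request.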
Sam Shen's answer is the way to go if you need to configure which properties to use on a per-query basis.
One alternative is to configure the fields to not be included in the _all query.
For example, this would cause only Father Name to be included in the _all query, by disabling all the fields at the type level, then enabling all of the subfields on Father Name.
PUT index
{
"mappings": {
"type": {
"include_in_all": false,
"properties": {
"Father Name" : {
"include_in_all": true
}
}
}
}
}
You can set the include_in_all property on any level in the mapping, including subfield properties.
The big drawback here is that this isn't configured on a query-by-query basis; it applies to every query that uses the _all field.
I am pretty new to Elasticsearch. I have a dataset with multiple fields like name, product_info, description, etc. When searching for a document, the search term can come from any of these fields (let us call them the "search core fields").
If I start storing the data in Elasticsearch, should I derive a field that concatenates all the "search core fields", and then index this field alone?
I came across the _all mapping concept and am a little confused. Does it do the same thing?
No, you don't need to create any new field with concatenated terms.
You can just use _all with match query to search a text from any field.
About _all, yes, it searches the text from any field
The _all field was deprecated in ES 6.0 (it can no longer be enabled on indices created in 6.0 or later) and removed in ES 7, so it only works on indices created in earlier versions. The main reason for this is that it used too much storage space.
However, you can define your own all field using the copy_to feature. You basically specify in your mapping which fields should be copied to your custom all field and then you can search on that field.
You can define your mapping like this:
PUT my-index
{
"mappings": {
"properties": {
"name": {
"type": "text",
"copy_to": "custom_all"
},
"product_info": {
"type": "text",
"copy_to": "custom_all"
},
"description": {
"type": "text",
"copy_to": "custom_all"
},
"custom_all": {
"type": "text"
}
}
}
}
PUT my-index/_doc/1
{
"name": "XYZ",
"product_info": "ABC product",
"description": "this product does blablabla"
}
And then you can search on your "all" field like this:
POST my-index/_search
{
"query": {
"match": {
"custom_all": {
"query": "ABC",
"operator": "and"
}
}
}
}
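If the list of "search core fields" changes often, the mapping above can be generated rather than written by hand. This is a minimal sketch under the assumption that every source field is a plain text field; the helper name copy_to_mapping is hypothetical and the output mirrors the mapping shown earlier.

```python
def copy_to_mapping(source_fields, target="custom_all"):
    """Build an index mapping where each source field is copied into `target`."""
    if target in source_fields:
        raise ValueError("target field must not be one of the source fields")
    # Every source field is text and copies its value into the catch-all field.
    properties = {
        name: {"type": "text", "copy_to": target} for name in source_fields
    }
    # The catch-all field itself is an ordinary text field.
    properties[target] = {"type": "text"}
    return {"mappings": {"properties": properties}}


mapping = copy_to_mapping(["name", "product_info", "description"])
```

The dict can be sent as the body of the PUT my-index request.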
When I search across all fields in Elasticsearch, is there a way to know which field matched?
PUT my_index/user/1
{
"first_name": "John",
"last_name": "Smith",
"date_of_birth": "1970-10-24"
}
GET my_index/_search
{
"query": {
"match": {
"_all": "john 1970"
}
}
}
In the above example, "john 1970" is searched for across all fields.
Since the indexed document matches on "first_name" and "date_of_birth", it is returned as a result.
How can I find out that it was "first_name" and "date_of_birth" that matched?
The thing is that _all is a field into which all values from all other fields are copied at indexing time. Concretely, when you index your document, what ES conceptually sees is this (though the source is not modified to contain _all and _all itself is not stored, just indexed):
{
"first_name": "John",
"last_name": "Smith",
"date_of_birth": "1970-10-24",
"_all": "john smith 1970 10 24"
}
So if you match against _all, the only field that can match is _all itself. There's no way to "reverse-engineer" which field contained which matching value solely based on _all.
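To make the loss of field information concrete, here is a rough Python imitation of what ends up in _all. It is only an approximation for illustration: the real analysis is done by the Elasticsearch analyzer, and the function name conceptual_all is hypothetical.

```python
import re


def conceptual_all(source):
    """Approximate the terms that would be indexed into _all for a document.

    Values are lowercased and split on anything that is not a letter or digit,
    which is a crude stand-in for the real analyzer.
    """
    tokens = []
    for value in source.values():
        tokens.extend(re.findall(r"[a-z0-9]+", str(value).lower()))
    return " ".join(tokens)


doc = {
    "first_name": "John",
    "last_name": "Smith",
    "date_of_birth": "1970-10-24",
}
all_text = conceptual_all(doc)  # "john smith 1970 10 24"
```

Once the values are merged into one string, nothing records that "john" came from first_name and "1970" from date_of_birth.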
What you can do, however, is to use another feature called highlighting. Since the _all field is not stored it cannot be highlighted but the other fields can, so you can highlight which original fields match which values:
{
"query": {
"match": {
"_all": "john 1970"
}
},
"highlight": {
"fields": {
"*": {
"require_field_match": false
}
}
}
}
In the response, you'll see something like this which shows that first_name matches the query.
"highlight": {
"first_name": [
"<em>John</em>"
]
}
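On the client side, the matching fields can then be read off the highlight section of each hit. A minimal sketch, assuming the usual search response shape (hits.hits[] with an optional highlight object); the helper name matched_fields and the sample response are made up for illustration.

```python
def matched_fields(response):
    """Map each hit's _id to the sorted list of fields that were highlighted."""
    result = {}
    for hit in response.get("hits", {}).get("hits", []):
        # Hits without a highlight section simply get an empty list.
        result[hit["_id"]] = sorted(hit.get("highlight", {}))
    return result


# Hypothetical response fragment, shaped like a search response.
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "highlight": {
                    "first_name": ["<em>John</em>"],
                    "date_of_birth": ["<em>1970</em>-10-24"],
                },
            },
            {"_id": "2"},
        ]
    }
}
fields_by_id = matched_fields(sample_response)
```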
Consider a document in Elasticsearch like this:
{
"id": 1,
"Comment": "Comment text",
"Reply": [{
"id": 2,
"Comment": "Nested comment text"
}, {
"id": 3,
"Comment": "Another nested comment text"
}]
}
I want to search for id == 2 without knowing whether it is in the first level of the document or in the second. Is this possible? Please also keep in mind that the nesting depth can be anything (unknown at development time).
If this is possible, what's the query to return this document by searching for id == 2 without knowing that the id is in the second level of the document?
Try this:
{
"query": {
"query_string": {
"fields": ["*.id","id"],
"query": "2"
}
}
}
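To see which field paths the two patterns are meant to cover, the following sketch walks a document and lists every dotted path that ends in a given key. It runs entirely on the client and is only an illustration; the function name paths_for_key is hypothetical, and it assumes object fields are addressed by dotted paths such as Reply.id.

```python
def paths_for_key(doc, key, prefix=""):
    """Return the sorted, de-duplicated dotted paths at which `key` occurs."""
    paths = []
    if isinstance(doc, dict):
        for name, value in doc.items():
            path = f"{prefix}.{name}" if prefix else name
            if name == key:
                paths.append(path)
            paths.extend(paths_for_key(value, key, path))
    elif isinstance(doc, list):
        # Array items share the path of the array field itself.
        for item in doc:
            paths.extend(paths_for_key(item, key, prefix))
    return sorted(set(paths))


doc = {
    "id": 1,
    "Comment": "Comment text",
    "Reply": [
        {"id": 2, "Comment": "Nested comment text"},
        {"id": 3, "Comment": "Another nested comment text"},
    ],
}
id_paths = paths_for_key(doc, "id")  # ["Reply.id", "id"]
```

Here "id" corresponds to the top-level path and "*.id" to Reply.id and any deeper path ending in .id.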
I have 2 fields in my Elasticsearch cache that I created using Oracle river-jdbc.
First column is numeric and second is string as name.
I want to use this indexing in autocomplete textbox using jQuery.
All implementation is done for the name field.
The user can type any string (at least 3 characters); the request then goes to Elasticsearch, which searches for the string as a substring of the name field and returns the results, similar to using the LIKE operator on the name field in SQL. This works, and the data is loaded in the UI.
I want to do the same with the numeric field, but unless I give the complete value of the numeric field, Elasticsearch doesn't return any data. So autocomplete does not work for the numeric field.
Below is the code:
Creating river field as:
{
"type": "jdbc",
"jdbc": {
"driver": "oracle.jdbc.driver.OracleDriver",
"url": "jdbc:oracle:thin:@//<ip-addr>:1521/db",
"user": "user",
"password": "pwd",
"sql": "select curr_duns_number as duns, TRIM(name) as company from subject where rownum < 10000"
},
"index": {
"index": "subject",
"type": "name"
},
"properties": {
"duns": {"type": "string", "store": "yes"},
"company": {"type": "string"}
}
}
Fetching company field:
POST http://<ip-addr>:9200/subject/name/_search
{
"from": 0,
"size": 10,
"query": {
"wildcard": {
"COMPANY": "boo*"
}
},
"sort": [
{
"COMPANY": {"order": "asc"}
}
]
}
After trying various combinations such as wildcard, match, and query_string, I still get no results, and I'm left with the following problems:
I cannot query numeric fields in a similar way to how it's done using SQL, e.g. select * from subject where curr_duns_number like '%123%';
The sort order is not applied properly, because the token Elasticsearch uses for sorting is usually a single word from the company name.
After a lot of research I could not find an answer to this. As a workaround, I changed the numeric field to a string type by appending a string to it, which made auto-completion work for this project.
For the sake of closing the question I am accepting this answer, but if a better solution comes up in the future I will add it, and anyone else is welcome to add comments.
Thanks
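Once the number is stored as a string, the SQL-style LIKE '%123%' lookup can be expressed as a wildcard query with a wildcard on both sides. The sketch below only builds the request body; the helper name duns_contains_query is hypothetical, the field name "duns" is an assumption (the river may expose it in upper case, as with COMPANY above), and leading wildcards can be slow on large indices.

```python
def duns_contains_query(fragment, field="duns", size=10):
    """Build a wildcard query that finds values containing `fragment`."""
    # Restricting input to digits avoids users injecting * or ? themselves.
    if not (fragment.isascii() and fragment.isdigit()):
        raise ValueError("fragment must contain only ASCII digits")
    return {
        "from": 0,
        "size": size,
        "query": {"wildcard": {field: f"*{fragment}*"}},
    }


body = duns_contains_query("123")
```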
Let's say I have the following mapping:
"site": {
"properties": {
"title": { "type": "string" },
"description": { "type": "string" },
"category": { "type": "string" },
"tags": { "type": "array" },
"point": { "type": "geo_point" },
"localities": {
"type": "nested",
"properties": {
"title": { "type": "string" },
"description": { "type": "string" },
"point": { "type": "geo_point" }
}
}
}
}
I'm then doing an "_geo_distance" sort on the parent document and am able to sort the documents on "site.point". However I would also like the nested localities to be sorted by "_geo_distance", inside the parent document.
Is this possible? If so, how?
Unfortunately, no (at least not yet).
A query in Elasticsearch just identifies which documents match the query, and how well they match.
To understand what nested documents are useful for, consider this example:
{
"title": "My post",
"body": "Text in my body...",
"followers": [
{
"name": "Joe",
"status": "active"
},
{
"name": "Mary",
"status": "pending"
}
]
}
The above JSON, once indexed in ES, is functionally equivalent to the following. Note how the followers field has been flattened:
{
"title": "My post",
"body": "Text in my body...",
"followers.name": ["Joe","Mary"],
"followers.status": ["active","pending"]
}
A search for: followers with status == active and name == Mary would match this document... incorrectly.
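The false positive can be reproduced in a few lines of Python. This is a simplified model of the flattening described above, not Elasticsearch code; the function names flatten, flat_match, and nested_match are hypothetical.

```python
def flatten(doc):
    """Flatten arrays of objects into parallel value lists, as described above."""
    flat = {}
    for key, value in doc.items():
        if isinstance(value, list) and all(isinstance(v, dict) for v in value):
            for item in value:
                for sub_key, sub_value in item.items():
                    flat.setdefault(f"{key}.{sub_key}", []).append(sub_value)
        else:
            flat[key] = value
    return flat


def flat_match(flat, field, criteria):
    """Each criterion is checked on its own list, so object boundaries are lost."""
    return all(want in flat.get(f"{field}.{k}", []) for k, want in criteria.items())


def nested_match(doc, field, criteria):
    """All criteria must hold within the same inner object."""
    return any(
        all(item.get(k) == want for k, want in criteria.items())
        for item in doc.get(field, [])
    )


post = {
    "title": "My post",
    "body": "Text in my body...",
    "followers": [
        {"name": "Joe", "status": "active"},
        {"name": "Mary", "status": "pending"},
    ],
}
criteria = {"status": "active", "name": "Mary"}
flat_result = flat_match(flatten(post), "followers", criteria)   # True (wrong)
nested_result = nested_match(post, "followers", criteria)        # False (right)
```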
Nested fields allow us to work around this limitation. If the followers field is declared to be of type nested instead of type object then its contents are created as a separate (invisible) sub-document internally. That means that we can use a nested query or nested filter to query these nested documents as individual docs.
However, the output from the nested query/filter clauses only tells us if the main doc matches, and how well it matches. It doesn't even tell us which of the nested docs matched. To figure that out, we'd have to write code in our application to check each of the nested docs against our search criteria.
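The same kind of application-side code can also order the nested docs once the parent has been fetched. The following is a minimal sketch that sorts a site's localities by great-circle (haversine) distance from a reference point. It assumes each point is stored in the {"lat": ..., "lon": ...} object form and uses a mean Earth radius of 6371 km; the function names are hypothetical and this is not an Elasticsearch feature.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sort_localities(site, origin):
    """Return the site's localities ordered from nearest to farthest."""
    return sorted(
        site.get("localities", []),
        key=lambda loc: haversine_km(
            origin["lat"], origin["lon"], loc["point"]["lat"], loc["point"]["lon"]
        ),
    )


# Hypothetical document, shaped like the _source of a site.
site = {
    "title": "Example site",
    "localities": [
        {"title": "far", "point": {"lat": 0.0, "lon": 3.0}},
        {"title": "near", "point": {"lat": 0.0, "lon": 1.0}},
    ],
}
ordered = sort_localities(site, {"lat": 0.0, "lon": 0.0})
```

This keeps the parent sort in Elasticsearch and only reorders the nested entries of each returned document.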
There are a few open issues requesting the addition of these features, but it is not an easy problem to solve.
The only way to achieve what you want is to index your sub-docs as separate documents, and to query and sort them independently. It may be useful to establish a parent-child relationship between the main doc and these separate sub-docs (see parent-type mapping, the Parent & Child section of the index api docs, and the top-children and has-child queries).
Also, an ES user has mailed the list about a new has_parent filter that they are currently working on in a fork. However, this is not available in the main ES repo yet.