I'm pretty new to Elasticsearch. I'm using v2.0.0. I would like to know how to focus a search query on one portion of the document in order to answer the question "Get me (a page of 50) people who are members of the group 'Developers'".
The document structure of a single person might look something like this:
{
"_index": "people",
"_type": "employee",
"_id": "8725",
"_source": {
"id": 43470,
"firstName": "John",
"lastName": "Smith",
"groups": [
{
"id": 345,
"name": "Developers"
},
{
"id": 75432,
"name": "Scrummasters"
},
{
"id": 5789,
"name": "UX"
}
]
}
}
So what I want to do is look at the name of each group of each person to see if it matches what I'm looking for and, if so, select the whole person. The position of the group I'm looking for is obviously not fixed, so I can't do something simpler like
q=roles:developer
It turns out I can select employees that are in the group Developers by specifying the query string q=groups.name:Developers. I can also use a wildcard to match anybody in a group named Admin, Admins, or Administrators with q=groups.name:Admin*.
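For the "page of 50" requirement, the same filter can be expressed in the query DSL together with from/size paging. A minimal sketch, assuming a node on localhost:9200 and reusing the index, type, and field names from the sample document above:

POST http://localhost:9200/people/employee/_search
{
  "query": {
    "match": { "groups.name": "Developers" }
  },
  "from": 0,
  "size": 50
}

The group objects are plain (non-nested) objects here, which is enough for a single-field filter like this; a nested mapping would only be needed to match the id and name of the same group together.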
When I search with only birthdate in FHIR, I get results.
For example: http://localhost:8080/hapi-fhir-jpaserver/fhir/Patient?_pretty=true&birthdate=2020-03-16 will return the patient whose birthdate is 2020-03-16.
When I search with _content, however, I get no results. Something like this:
http://localhost:8080/hapi-fhir-jpaserver/fhir/Patient?_content=2019-09-05
_content is for searching text content.
If you want to search for dates, you need to use a date search parameter, e.g.:
http://localhost:8080/hapi-fhir-jpaserver/fhir/Patient?birthDate=2019-09-05
This can be achieved using Search Parameters.
Search parameters are essentially named paths within resources that are indexed by the system so that they can be used to find resources that match given criteria.
Using search parameters:
we can add additional search parameters that index fields which do not have a standard search parameter defined.
we can add additional search parameters that index extensions used by your clients.
we can disable built-in search parameters.
Example:
Let's say I have a PractitionerRole:
{
"resourceType": "PractitionerRole",
"id": "6639",
"meta": {
"versionId": "1",
"lastUpdated": "2020-03-19T13:26:34.748+05:30",
"source": "#aYyeIlv9Yutudiwy"
},
"text": {
"status": "generated",
"div": "<div xmlns=\"<http://www.w3.org/1999/xhtml\">foo</div>">
},
"active": true,
"practitioner": {
"reference": "Practitioner/6607"
},
"organization": {
"reference": "Organization/6528"
},
"specialty": [
{
"coding": [
{
"system": "<http://snomed.info/sct",>
"code": "42343242",
"display": "Clinical immunology"
}
]
}
]
}
PractitionerRole has its own standard search parameters. Apart from those, we wanted a search parameter that filters all practitioner roles based on practitioner.reference. We can achieve this using a custom SearchParameter; all we need to do is create a new search parameter like the one below.
{
"resourceType": "SearchParameter",
"title": "Practitioner Referecene",
"base": [ "PractitionerRole" ],
"status": "active",
"code": "practitioner_reference",
"type": "token",
"expression": "PractitionerRole.practitioner.reference",
"xpathUsage": "normal"
}
This tells FHIR that when a user filters on practitioner_reference, it should look at PractitionerRole.practitioner.reference.
The resulting query looks like this:
http://localhost:8080/hapi-fhir-jpaserver/fhir/PractitionerRole?practitioner_reference=Practitioner/6607
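For completeness, the SearchParameter resource above is installed by POSTing it to the server like any other resource. A sketch using the same local HAPI base URL as the examples; note that resources stored before the parameter was created may additionally need a reindex before they start matching:

POST http://localhost:8080/hapi-fhir-jpaserver/fhir/SearchParameter
Content-Type: application/fhir+json

{
  "resourceType": "SearchParameter",
  "title": "Practitioner Reference",
  "base": [ "PractitionerRole" ],
  "status": "active",
  "code": "practitioner_reference",
  "type": "token",
  "expression": "PractitionerRole.practitioner.reference",
  "xpathUsage": "normal"
}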
We can also extend this to search across multiple fields. We can create a search parameter whose expression combines several paths with an or (|) condition so that it matches on any of them:
{
"resourceType": "SearchParameter",
"title": "Patient Multi Search",
"base": [ "Patient" ],
"status": "active",
"code": "pcontent",
"type": "token",
"expression": "Patient.managingOrganization.reference|Patient.birthDate|Patient.address[0].city",
"xpathUsage": "normal"
}
The above SearchParameter will search with Patient.managingOrganization.reference or Patient.birthDate or Patient.address[0].city.
The query looks like this:
Search With City → http://localhost:8080/hapi-fhir-jpaserver/fhir/Patient?pcontent=Bruenmouth
Search With Birth Date → http://localhost:8080/hapi-fhir-jpaserver/fhir/Patient?pcontent=2019-04-06
There is a microservice-based architecture wherein each service has a different type of entity. For example:
Service-1:
{
"entity_type": "SKU",
"sku": "123",
"ext_sku": "201",
"store": "1",
"product": "abc",
"timestamp": 1564484862000
}
Service-2:
{
"entity_type": "PRODUCT",
"product": "abc",
"parent": "xyz",
"description": "curd",
"unit_of_measure": "gm",
"quantity": "200",
"timestamp": 1564484863000
}
Service-3:
{
"entity_type": "PRICE",
"meta": {
"store": "1",
"sku": "123"
},
"price": "200",
"currency": "INR",
"timestamp": 1564484962000
}
Service-4:
{
"entity_type": "INVENTORY",
"meta": {
"store": "1",
"sku": "123"
},
"in_stock": true,
"inventory": 10,
"timestamp": 1564484864000
}
I want to write an audit service backed by Elasticsearch, which will ingest all these entities and index them based on entity_type, store, sku, and timestamp.
Will Elasticsearch be a good choice here? Also, how will the indexing work? For example, if I search for store=1, it should return all the different entities that have store set to 1. Secondly, will I be able to get all the entities between two timestamps?
Will ES and Kibana (to visualize) be good choices here?
Yes. Your use case is pretty much exactly what is described in the docs under filter context:
In filter context, a query clause answers the question “Does this document match this query clause?” The answer is a simple Yes or No — no scores are calculated. Filter context is mostly used for filtering structured data, e.g.
Does this timestamp fall into the range 2015 to 2016?
Is the status field set to published?
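Applied to the audit example, both requirements (store=1 and a time window) become filter clauses in one bool query. A minimal sketch, assuming all entities are ingested into a single index named audit, that the meta.store of the PRICE/INVENTORY entities is flattened to a top-level store field at ingest time, and that timestamp is mapped as a date in epoch milliseconds:

POST http://localhost:9200/audit/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "store": "1" } },
        { "range": { "timestamp": { "gte": 1564484862000, "lte": 1564484962000 } } }
      ]
    }
  }
}

Because both clauses run in filter context, no scores are calculated and the results are cacheable, which is exactly the behaviour described in the quoted passage.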
I have a document in our Elasticsearch index which looks like this:
{
"_index": "nm_doc",
"_type": "nm_doc",
"_id": "JRPXqmQBatyecf67YEfq",
"_score": 0.86147696,
"_source": {
"text": "A 29-year-old IT professional from Bhopal was convicted and sentenced to life imprisonment by an Additional Sessions Court in Pune on Wednesday for the rape and brutal murder of a woman in 2008, after she had refused his advances. Watch What Else is Making News The court found Manu Mohinder Ebrol, who worked in the same firm as the girl, of raping and killing the woman after stabbing her 18 times on the night of October 20, 2008, in her rented apartment. After committing the crime, Ebrol had fled to Bhopal. He was arrested later by Pune Police. The prosecution examined 26 witnesses for the case and forensic evidence such as call details and medical records also proved crucial. For all the latest Pune News , download Indian Express App",
"entities": [
{
"name": "Mohinder Ebrol"
},
{
"name": "Sessions Court"
},
{
"name": "Pune Police"
},
{
"name": "Pune News"
},
{
"name": "Indian Express"
}
]
}
}
If I wanted to edit just the first name in that array (Mohinder Ebrol) to be Manu Ebrol, how would I accomplish this via an API call? Do I need to pass in the entire array just to update the one name?
I have figured it out via the documentation:
The call Url is:
POST http://elastichost:9200/indexname/_doc/JRPXqmQBatyecf67YEfq/_update?pretty
And the body simply looks like this (yes, you do have to provide the entire array):
{
"doc": { "entities": [
{
"name": "Manu Ebrol"
},
{
"name": "Sessions Court"
},
{
"name": "Pune Police"
},
{
"name": "Pune News"
},
{
"name": "Indian Express"
}
] }
}
Hope this can help someone in the future.
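As a variation on the same _update endpoint (not needed for the doc-merge approach above), a scripted update can change just the one array element instead of resending the whole array. A minimal sketch, mirroring the URL shape used above and assuming an Elasticsearch version with Painless scripting:

POST http://elastichost:9200/indexname/_doc/JRPXqmQBatyecf67YEfq/_update?pretty
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.entities[0].name = params.newName",
    "params": { "newName": "Manu Ebrol" }
  }
}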
I search for the keyword machine4 in my ES. My Python client call is simply:
result = es.search(index='machines', q='machine4')
The result looks like this:
[
{
"_score": 0.13424811,
"_type": "person",
"_id": "2",
"_source": {
"date": "**20180601**",
"deleted": [],
"changed": [
"machine1",
"machine2",
"machine3"
],
"**added**": [
"**machine4**",
"machine5"
]
},
"_index": "contacts"
},
{
"_score": 0.13424811,
"_type": "person",
"_id": "3",
"_source": {
"date": "**20180701**",
"deleted": [
"machine2"
],
"**changed**": [
"machine1",
"**machine4**",
"machine3"
],
"added": [
"machine7"
]
},
"_index": "contacts"
}
]
So we can easily see:
On date 20180601, machine4 belonged to added.
On date 20180701, machine4 belonged to changed.
I can write another function to analyze the result: basically, loop through every key/value pair of each item and check whether the searched keyword belongs to it, like this:
for result in search_results['hits']['hits']:
    source_result = result['_source']
    for key, value in source_result.items():
        if 'machine4' in value:
            print(key)
However, I wonder whether ES has an API to detect which key/mapping/field the searched keyword belongs to? In this case it is added in the 1st result and changed in the 2nd result.
Thank you so much
Alex
The simple answer seems to be that no, Elasticsearch doesn't have a way to do this out of the box, because Lucene doesn't have it, as per this thread.
Elasticsearch has the concept of highlights, however. These could be useful, but they do require you to have some idea about which fields the match may be in.
The ES Python search documentation suggests there's no way to do that as a simple parameter to search, but you can build a query with a highlight section and pass it as the request body. Against the fields shown in your documents, it would look something like:
q = {"query": {"multi_match": {"query": "machine4", "fields": ["added", "changed", "deleted"]}}, "highlight": {"fields": {"added": {}, "changed": {}, "deleted": {}}}}
result = es.search(index='machines', body=q)
Hope this is helpful!
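For reference, the matching field names come back as the keys of the highlight object on each hit, so there is no need to rescan _source by hand. An illustrative fragment of a hit returned by the query above (shape as documented for highlighting, values abbreviated):

{
  "_id": "2",
  "highlight": {
    "added": [ "<em>machine4</em>" ]
  }
}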
I use Elasticsearch and denormalize data, like:
PUT /my_index/user/1
{
"name": "John Smith",
"email": "john#smith.com",
"dob": "1970/10/24"
}
PUT /my_index/blogpost/2
{
"title": "Relationships",
"body": "It's complicated...",
"user": {
"id": 1,
"name": "John Smith"
}
}
But the problem is that Elasticsearch does not support ACID transactions. Changes to individual documents are ACIDic, but not changes involving multiple documents. If I want to change the user name in both /my_index/user/1 and /my_index/blogpost/2 in one transaction, so that an error in one rolls back the other, how do I do that?
There are no transactions in ES, and there never will be, according to inside sources.
The best way to achieve what you want is to make your updates in bulk and then check each individual response in the result.
POST _bulk
{"index": {"_index": "my_index", "_type": "user", "_id": "1"}}
{ "name": "John Smith", "email": "john#smith.com", "dob": "1970/10/24" }
{"index": {"_index": "my_index", "_type": "blogpost", "_id": "2"}}
{ "title": "Relationships", "body": "It's complicated...", "user": { "id": 1, "name": "John Smith" }}
When your client gets the response, it should check the items array and make sure that each item's status is 200 (updated) or 201 (created). If that's the case, your bulk "transaction" was fully applied; if not, only the operations with status 200 or 201 were committed and the rest failed, so the "commit" was only partial.
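The shape of the bulk response makes that check straightforward: there is a top-level errors flag plus one entry per operation under items, each carrying its own status. An illustrative fragment of what a fully successful run of the request above would return:

{
  "took": 30,
  "errors": false,
  "items": [
    { "index": { "_index": "my_index", "_type": "user", "_id": "1", "status": 200 } },
    { "index": { "_index": "my_index", "_type": "blogpost", "_id": "2", "status": 201 } }
  ]
}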