How to query from two elasticsearch indexes with one query - elasticsearch

Newbie here. I have two indexes: /items and /categories that contain data similar to this:
#Items:
{ id:1, name: "banana", categories: [1] },
{ id:2, name: "coconut", categories: [1] },
{ id:3, name: "smoothie", categories: [1, 2] }
#Categories
{ id:1, name: "Apple" },
{ id:2, name: "Banana" }
I want to run a query with bana as the query string that returns items 1 and 3. It should not return category objects. Item 1 should match because of its name, and item 3 because it has a category whose name matches.
How could a query like this be constructed?
My current query does not return item 3. It is simply:
GET /items/_search
{
"query": {
"bool":{
"must":{
"query_string":{
"query":"bana"
}
}
}
}
}

Is it necessary to create two indexes for this? You could store the category names directly in the item documents in place of the IDs:
{ id:1, name: "banana", categories: ["Apple"] }
Since the index keeps its own term dictionary, repeating the names should not cost much extra disk space.
If you really need to join the two indexes, you could try a parent-child relationship. Note that the _parent mapping is deprecated in favor of the join datatype, which also keeps the data in one index.
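A minimal Python sketch of the denormalization idea, using the sample data from the question. The helper names denormalize and build_query are hypothetical, and the trailing wildcard is an assumption (bana is only a prefix of the indexed terms); the request body uses the query_string query with its fields parameter.

```python
from typing import Dict, List

ITEMS = [
    {"id": 1, "name": "banana", "categories": [1]},
    {"id": 2, "name": "coconut", "categories": [1]},
    {"id": 3, "name": "smoothie", "categories": [1, 2]},
]
CATEGORIES = [
    {"id": 1, "name": "Apple"},
    {"id": 2, "name": "Banana"},
]


def denormalize(items: List[Dict], categories: List[Dict]) -> List[Dict]:
    """Replace category IDs with category names so one index holds everything."""
    names = {c["id"]: c["name"] for c in categories}
    return [
        {**item, "categories": [names[cid] for cid in item["categories"]]}
        for item in items
    ]


def build_query(text: str) -> Dict:
    """Search body for GET /items/_search over both the name and category names.

    Assumption: a trailing wildcard is wanted, since the text is a prefix.
    """
    return {
        "query": {
            "query_string": {
                "query": f"{text}*",
                "fields": ["name", "categories"],
            }
        }
    }


if __name__ == "__main__":
    for doc in denormalize(ITEMS, CATEGORIES):
        print(doc)
    print(build_query("bana"))
```

With documents shaped like this, item 3 carries the name "Banana" in its own categories field, so a single query against /items can match it without touching /categories.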

Related

GraphQL - When to use a resolver or an argument with recursive and normalized data?

I'm working with a very large normalized and recursive object. I want to get the list of all recursive items. Should I use an argument or a custom resolver?
My object looks like:
{
products: [{
product_id: "car",
bundle_id: 5
},{
product_id: "door",
bundle_id: 6
},
{ product_id: "wheel" },
{ product_id: "metal" },
{ product_id: "glass" }],
bundles: [{
bundle_id: 5,
options: [{product_id: "door"},{product_id: "wheel"}]
},
{
bundle_id: 6,
options: [{product_id: "metal"},{product_id: "glass"}]
}]
}
You might notice that "car" is a bundle that has a door and a wheel. "door" is also a bundle that has metal and glass. This structure could recurse indefinitely. That is, a bundle could have infinitely more bundle products underneath it.
I want to get a list of all products for a bundle (example: "car"). What is the best approach?
I see two options.
First Option - use a custom resolver, for example child_products that would recurse and resolve to a flat array of all children:
products(product_id: "car") {
product_id
bundle {
options {
product_id
}
}
child_products {
product_id
bundle {
options {
product_id
}
}
}
}
Second Option - use an argument that specifies including all children:
products(product_id: "car", include_children: true) {
product_id
bundle {
options {
product_id
}
}
}
I'm going to build a JS library that can take the array of products and options and build the nested structure. Please let me know what you think is the right way. Thanks!
You should not need an argument like include_children because a client's query will be sufficient to determine whether to include the nodes or not -- if a client doesn't need the nodes, it can simply omit the appropriate field.
Based on the provided JSON object, I would expect a schema that looks something like this:
type Query {
product(id: ID!): Product
}
type Product {
id: ID!
bundle: Bundle
}
type Bundle {
id: ID!
options: [Product!]!
}
which would let you make a query like:
query {
product(id: "car") {
id
bundle {
options {
id
bundle {
id
# and so on...
}
}
}
}
}
The actual depth of this query would be left up to the client's needs. Recursive type definitions like this do present a possible attack vector and so you should also look into using a library like graphql-depth-limit or graphql-query-complexity.
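If a flat list of all descendants is still wanted (the child_products idea from the question), the recursion could look roughly like the Python sketch below. It works on the normalized object from the question; the function name and the cycle guard are assumptions added for illustration, not part of any GraphQL library.

```python
from typing import Dict, List, Set

PRODUCTS = [
    {"product_id": "car", "bundle_id": 5},
    {"product_id": "door", "bundle_id": 6},
    {"product_id": "wheel"},
    {"product_id": "metal"},
    {"product_id": "glass"},
]
BUNDLES = [
    {"bundle_id": 5, "options": [{"product_id": "door"}, {"product_id": "wheel"}]},
    {"bundle_id": 6, "options": [{"product_id": "metal"}, {"product_id": "glass"}]},
]


def child_products(product_id: str,
                   products: List[Dict] = PRODUCTS,
                   bundles: List[Dict] = BUNDLES) -> List[str]:
    """Return every product reachable from product_id, depth-first, without duplicates."""
    bundle_of = {p["product_id"]: p.get("bundle_id") for p in products}
    options_of = {
        b["bundle_id"]: [o["product_id"] for o in b["options"]] for b in bundles
    }
    seen: Set[str] = {product_id}  # guards against bundles that reference each other
    result: List[str] = []

    def visit(pid: str) -> None:
        # Products without a bundle have no options, so the loop is skipped.
        for child in options_of.get(bundle_of.get(pid), []):
            if child in seen:
                continue
            seen.add(child)
            result.append(child)
            visit(child)

    visit(product_id)
    return result


if __name__ == "__main__":
    print(child_products("car"))
```

The seen set keeps the recursion finite even if the data contains a cycle, which addresses on the data side the same concern that depth limiting addresses on the query side.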

How to adapt query to API?

I'm trying to wrap my head around GraphQL.
Right now I'm just playing with the public API of Artsy (an art website, playground at https://metaphysics-production.artsy.net). What I want to achieve is the following:
1) I want to get all node entity types without declaring them by hand (is there a shortcut for this?).
2) I want every node to have a type field from which I can read the type, without parsing through imageUrl etc. to find that out.
What I constructed as of right now is this:
{
search(query: "Berlin", first: 100, page: 1, entities: [ARTIST, ARTWORK, ARTICLE]) {
edges {
node {
displayLabel
imageUrl
href
}
}
}}
Very primitive I guess. Can you guys help me?
TL;DR:
1) There is no shortcut, it's not something GraphQL offers out of the box. Nor is it something I was able to find via their Schema.
2) Their returned node of type Searchable does not contain the type property you're looking for. But you can access it with an inline fragment: ... on SearchableItem.
Explanation:
For question 1):
Looking at their schema, you can see that their search query has the following type details:
search(
query: String!
entities: [SearchEntity]
mode: SearchMode
aggregations: [SearchAggregation]
page: Int
after: String
first: Int
before: String
last: Int
): SearchableConnection
The query accepts an entities property of type SearchEntity which looks like this:
enum SearchEntity {
ARTIST
ARTWORK
ARTICLE
CITY
COLLECTION
FAIR
FEATURE
GALLERY
GENE
INSTITUTION
PROFILE
SALE
SHOW
TAG
}
Depending on your use case, if you're constructing this query in code, you can find out which SearchEntity values exist with an introspection query:
{
__type(name: "SearchEntity") {
name
enumValues {
name
}
}
}
Which returns:
{
"data": {
"__type": {
"name": "SearchEntity",
"enumValues": [
{
"name": "ARTIST"
},
{
"name": "ARTWORK"
},
...
]
}
}
}
Then store them in an array and pass the array back to the original query as an argument. Enum values are written without quotation marks when they appear inline in the query text, but as JSON strings when they are passed through query variables.
Something along the lines of this:
query search($entities: [SearchEntity]) {
search(query: "Berlin", first: 100, page: 1, entities: $entities) {
edges {
node {
displayLabel
imageUrl
href
}
}
}
}
and in your query variables section, you just need to add:
{
"entities": ["ARTIST", "ARTWORK", ...]
}
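As a rough Python sketch of that step, the function below turns an introspection response of the shape shown earlier into the variables object. The name entity_variables is hypothetical, and the sample response is shortened to two values for illustration.

```python
import json
from typing import Dict, List


def entity_variables(introspection_response: Dict) -> Dict[str, List[str]]:
    """Collect the enum value names into a GraphQL variables object."""
    enum_values = introspection_response["data"]["__type"]["enumValues"]
    return {"entities": [value["name"] for value in enum_values]}


if __name__ == "__main__":
    # Shortened sample; the real response lists every SearchEntity value.
    response = {
        "data": {
            "__type": {
                "name": "SearchEntity",
                "enumValues": [{"name": "ARTIST"}, {"name": "ARTWORK"}],
            }
        }
    }
    # Prints: {"entities": ["ARTIST", "ARTWORK"]}
    print(json.dumps(entity_variables(response)))
```

The serialized output can be sent as the variables part of the request alongside the query that declares $entities.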
As for question 2)
The query itself returns a SearchableConnection object.
type SearchableConnection {
pageInfo: PageInfo!
edges: [SearchableEdge]
pageCursors: PageCursors
totalCount: Int
aggregations: [SearchAggregationResults]
}
Digging deeper, we can see that they have edges, of type SearchableEdge - which is what you're querying.
type SearchableEdge {
node: Searchable
cursor: String!
}
and finally, node of type Searchable which contains the data you're trying to access.
Now, the type Searchable doesn't contain type:
type Searchable {
displayLabel: String
imageUrl: String
href: String
}
But if you look at which types implement Searchable, you can see SearchableItem, which has a displayType property that doesn't exist on Searchable itself.
You can access the property of SearchableItem and get the displayType, like so:
{
search(query: "Berlin", first: 100, page: 1, entities: [ARTIST, ARTWORK, ARTICLE]) {
edges {
node {
displayLabel
imageUrl
href
... on SearchableItem {
displayType
}
}
}
}
}
and your result will look like this:
{
"data": {
"search": {
"edges": [
{
"node": {
"displayLabel": "Boris Berlin",
"imageUrl": "https://d32dm0rphc51dk.cloudfront.net/CRxSPNyhHKDIonwLKIVmIA/square.jpg",
"href": "/artist/boris-berlin",
"displayType": "Artist"
}
},
...

ElasticSearch query for items not in given array

I am trying to write part of a query that filters out any items whose type is "group" and whose group id isn't in a given array of ids. I started writing a bool query with a must and a must_not, but I got tripped up on how to express "id not in the given array".
EDIT:
I am actually converting an outdated query using "and" and "not" to be ES 5.5 compatible. Here is the old query that worked.
:and => [
{
term: {
type: 'group'
}
},
{
:not => {
terms: {
group_id: group_ids
}
}
},
{
:not => {
terms: {
user_id: user_ids
}
}
}
]
group_ids and user_ids are arrays.
Your ID fields are probably not analyzed, so a terms query can match them exactly. You can use a bool query: put the term query on the type in a must clause (a filter clause also works if you don't need scoring), and put one terms query per ID array in a must_not clause.
bool: {
must: {
term: {
type: 'group'
}
},
must_not: [
{
terms: {
group_id: group_ids
}
},
{
terms: {
user_id: user_ids
}
}
]
}
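For comparison, here is a small Python sketch that builds the same kind of request body as a dictionary. The function name is hypothetical, and it swaps must for a filter clause on the assumption that scoring is not needed for the type match.

```python
from typing import Dict, Iterable


def exclude_ids_query(group_ids: Iterable[int], user_ids: Iterable[int]) -> Dict:
    """Bool query: type must be 'group'; group_id and user_id must not be in the lists."""
    return {
        "query": {
            "bool": {
                # filter behaves like must but does not contribute to the score
                "filter": [{"term": {"type": "group"}}],
                "must_not": [
                    {"terms": {"group_id": list(group_ids)}},
                    {"terms": {"user_id": list(user_ids)}},
                ],
            }
        }
    }


if __name__ == "__main__":
    print(exclude_ids_query([1, 2, 3], [10, 20]))
```

Each entry in must_not is evaluated independently, so a document is dropped if either its group_id or its user_id appears in the corresponding list, which mirrors the two :not clauses of the old query.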

Query nested object based on key in MongoDB

My Schema Sample:
{
_id: '1234',
daily: {
'12-06-03':{
a:1,
b:2
},
'12-06-04':{
c:1,
d:2
},
'12-06-05':{
e:1,
f:2
},
'12-06-06':{
a:1,
b:2
}
}
}
My query: I want to retrieve all of the nested objects under 'daily' whose date key is greater than or less than a particular date (say, 12-06-05).
I understand one method is to retrieve entire daily object and then compare by iterating over each key of daily object.
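That client-side approach could be sketched in Python as below, operating on a document that has already been fetched. The function name is hypothetical, and the comparison assumes the keys are zero-padded YY-MM-DD strings from the same century, so plain string comparison follows date order.

```python
from typing import Dict

DOC = {
    "_id": "1234",
    "daily": {
        "12-06-03": {"a": 1, "b": 2},
        "12-06-04": {"c": 1, "d": 2},
        "12-06-05": {"e": 1, "f": 2},
        "12-06-06": {"a": 1, "b": 2},
    },
}


def filter_daily(doc: Dict, cutoff: str, newer: bool = True) -> Dict[str, Dict]:
    """Keep the daily entries strictly after (newer=True) or before the cutoff key."""
    daily = doc.get("daily", {})
    if newer:
        return {day: data for day, data in daily.items() if day > cutoff}
    return {day: data for day, data in daily.items() if day < cutoff}


if __name__ == "__main__":
    print(filter_daily(DOC, "12-06-05"))               # entries after the cutoff
    print(filter_daily(DOC, "12-06-05", newer=False))  # entries before the cutoff
```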

Tire search return terms by first letter

I'm using Tire/ElasticSearch to create an alphabetical browse of all the tags in my database. However, the tire search returns the tag I want as well as all the other tags associated to the same item. So, for example, if my letter was "A" and an item had the tags 'aardvark' and 'biscuit', both 'aardvark' and 'biscuit' would show up as results for the 'A' query. How can I construct this so that I only get 'aardvark'?
def explore
#get alphabetical tire results with term and count only
my_letter = "A"
self.search_result = Tire.search index_name, 'tags' => 'count' do
query {string 'tags:' + my_letter + '*'}
facet 'tags' do
terms 'tags', :order => 'term'
end
end.results
end
Mapping:
{
items: {
item: {
properties: {
tags: {
type: "string",
index_name: "tag",
index: "not_analyzed",
omit_norms: true,
index_options: "docs"
},
}
}
}
}
You'll need to change the following:
Mapping
You need to map the tags properly in order to search through them. Because your tags are inside your item document, you need to map tags as nested so that you can apply your search query in the facets too. Here is the mapping that you need to set:
{
items: {
item: {
properties: {
tags: {
type: "nested",
properties: {
value: {
type: "string",
index: "not_analyzed"
}
}
}
}
}
}
}
Query
Now you can use a prefix query to search for tags that start with a certain letter and compute the facets. Here is the complete query:
query: {
nested: {
path: "tags",
query: {
prefix: {
'tags.value' : 'A'
}
}
}
},
facets: {
words: {
terms: {field: 'tags.value'},
type: 'nested',
facet_filter: {prefix: {
'tags.value' : 'A'
}
}
}
}
The facet filter is applied while the facets are computed, so you'll only get the facet terms that match your criteria. I preferred a prefix query over a regular expression query for performance reasons, but I am not sure whether the prefix query fits your problem. Let me know if it doesn't work.
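To illustrate the effect of the facet filter outside Elasticsearch, here is a plain-Python sketch that counts only the tags starting with the requested letter. It is an analogy rather than the Tire or Elasticsearch API; the function name and the case-insensitive comparison are assumptions.

```python
from collections import Counter
from typing import Dict, List, Tuple


def tag_counts_for_letter(items: List[Dict], letter: str) -> List[Tuple[str, int]]:
    """Count tags across items, keeping only tags that start with the letter."""
    prefix = letter.lower()
    counts = Counter(
        tag
        for item in items
        for tag in item["tags"]
        if tag.lower().startswith(prefix)  # the "facet filter" step
    )
    return sorted(counts.items())  # ordered by term, like :order => 'term'


if __name__ == "__main__":
    items = [
        {"tags": ["aardvark", "biscuit"]},
        {"tags": ["apple", "aardvark"]},
    ]
    # 'biscuit' shares an item with 'aardvark' but is filtered out of the counts.
    print(tag_counts_for_letter(items, "A"))
```

Without the startswith check, every tag on a matching item would be counted, which is the behavior described in the question.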
