What is a good practice for mapping my Elasticsearch index?

I need to model different types of metrics and audit logs, and search results should be able to include both of them.
Which of the 3 options below is recommended for fast search performance and indexing efficiency?
Mapping different subfields for each object
Using a generic object that has predefined fields and a dynamic extra metadata field
Using a different index for each object
Examples for each option:
Option 1 - Mapping different subfields under the same index unified_index_00001
{
'type': 'audit_log',
'audit_log_object': {
'name': 'object_name',
'action': 'edit',
'before': 'field_value',
'after': 'field_value_update'
}
},
{
'type': 'network',
'network': {
'name': 'network_name',
'event': 'access',
'ip': 'xxx.xxx.xxx.xxx'
}
},
{...} ...
Option 2 - Mapping a generic object under the same index unified_index_00001
{
'type': 'audit_log',
'name': 'object_name',
'action': 'edit',
'meta': {
'before': 'field_value',
'after': 'field_value_update'
}
},
{
'type': 'network',
'name': 'network_name',
'action': 'access',
'meta': {
'ip': 'xxx.xxx.xxx.xxx'
}
},
{...} ...
Option 3 - Using a different index for each object
audit_log_index_00001
{
'name': 'object_name',
'action': 'edit',
'before': 'field_value',
'after': 'field_value_update'
},
{...}
...
metric_index_00001
{
'name': 'network_name',
'event': 'event_type',
'ip': 'xxx.xxx.xxx.xxx'
},
{...} ...
Note: Only term-level indexing is needed (no full-text search).

In Elasticsearch, you usually want to start with the queries and then work backwards from there.
If you always query only events of one type, separate indices make sense. If there are mixed queries, you would be happier with a joint index, usually with the common fields extracted ("option 2"); otherwise those queries won't really work.
Also, take into account Elasticsearch limitations:
fields per index (1000 by default)
shards and indices per cluster (a cluster can handle thousands, but every one adds overhead)
etc.
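As a rough illustration of "option 2", the sketch below builds a mapping body and a mixed-type query as plain Python dictionaries. It assumes the shared fields are mapped as `keyword` (the question only needs term matching) and that `meta` is left as a dynamic object. The helper names `build_mapping` and `build_mixed_query` are made up for this example, and nothing here talks to a real cluster.

```python
# Sketch of "option 2": shared top-level keyword fields plus a free-form "meta" object.
# Field names follow the question; helper names are hypothetical.

def build_mapping():
    """Mapping body for the unified index (keyword = exact term matching only)."""
    return {
        "mappings": {
            "properties": {
                "type": {"type": "keyword"},
                "name": {"type": "keyword"},
                "action": {"type": "keyword"},
                # Type-specific extras; dynamically added sub-fields still
                # count toward the per-index field limit.
                "meta": {"type": "object", "dynamic": True},
            }
        }
    }


def build_mixed_query(name, types):
    """Term-level query matching several event types through the shared fields."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"name": name}},
                    {"terms": {"type": list(types)}},
                ]
            }
        }
    }


mapping = build_mapping()
query = build_mixed_query("object_name", ["audit_log", "network"])
```

Because `name` and `type` live at the same path for every document, a single filter covers audit logs and network events alike. With option 1 the same query would need one clause per type-specific sub-object.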

Related

Spring Data: Update an element in an objects array of a nested MongoDB document

I'm building a Spring/MongoDB backend app for an online store, and I have the following MongoDB document structure:
'client_data': [
'_id': 'sdas2DSA',
'client_name': 'aurelis',
'stores': [
{
'store_id': 1,
'products': [
{
'product_id': 1,
'name': 'T-shirt',
'price': 120
},
{
'product_id': 2,
'name': 'Pants',
'price': 120
},
],
}
]
]
I need to implement logic for updating a product entity in a given client's store by product_id, but I'm stuck at the querying stage.
Suppose the client with 'client_name'='aurelis' sends a POST request with the parameter 'store_id'=1 and the following body:
{
'product_id': 1,
'name': 'T-shirt',
'price': 90
},
How can I build a query that updates the product object of the specific client's store by product_id?
I tried to implement the functionality by myself:
Query query = new Query(new Criteria().andOperator(
Criteria.where("client_name").is(clientName),
Criteria.where("stores.store_id").is(storeId),Criteria.where("stores.products").elemMatch(Criteria.where("products.product_id").is(product_id))
));
Update update = new Update().set("stores.products", product)
mongoTemplate = new MongoTemplate().findAndModify(query, update, Product.class);
but it doesn't work.
I'd be grateful for your answers:)
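One possible direction, sketched in Python rather than Spring Data: MongoDB 3.6 and later support filtered positional operators (`$[<identifier>]`) together with `arrayFilters`, which can target a single element inside a nested array. The sketch only builds the three documents involved. It assumes `client_name` and `stores` sit at the top level of the document, and the helper name `build_product_update` and the identifiers `s` and `p` are illustrative. Spring Data MongoDB's `Update.filterArray(...)` appears to expose the same feature, but check the version you use.

```python
def build_product_update(client_name, store_id, product):
    """Return (filter, update, array_filters) for replacing one nested product."""
    # Select the client document.
    query = {"client_name": client_name}
    # $[s] and $[p] are placeholders resolved by the array filters below.
    update = {"$set": {"stores.$[s].products.$[p]": product}}
    array_filters = [
        {"s.store_id": store_id},                 # which store
        {"p.product_id": product["product_id"]},  # which product in that store
    ]
    return query, update, array_filters


query, update, array_filters = build_product_update(
    "aurelis", 1, {"product_id": 1, "name": "T-shirt", "price": 90}
)

# With pymongo and a collection handle named `clients` (an assumption),
# the call would look like:
# clients.update_one(query, update, array_filters=array_filters)
```

The attempt in the question sets "stores.products" as a whole, which does not address one array element; the placeholders above are what narrow the update to a single product.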

Which event(s) to use in place of Google Chat's "DeprecatedEvent"

I have several questions related to google.golang.org/api/chat/v1.
It seems the only available Event object is DeprecatedEvent. So, what is the non-deprecated one to use?
If DeprecatedEvent is indeed the intended one to use, inside the DeprecatedEvent object, there's a User *User field. However, it seems that the User object is different from what I actually got from the responses.
For example:
{
'eventTime': '2019-08-27T06:50:12.391141Z',
'user': {
'name': 'users/112...',
'email': 'iskandar.setiadi#...',
'avatarUrl': 'https://lh3.googleusercontent.com/a-/AAu...',
'displayName': 'Iskandar Setiadi',
'type': 'HUMAN'
},
'type': 'ADDED_TO_SPACE',
'space': {
'name': 'spaces/7ag...',
'type': 'DM'
}
}
In the API, the User object only contains displayName, name, and type; email and avatarUrl are missing. Is v1 outdated, or are there alternatives that I don't know about?

Complex secondary index operations

I can create a secondary index for this table:
{
contact: [
'example#example.com'
]
}
Like this: indexCreate('contact', {multi: true})
But can I create an index for this:
{
contact: [
{
type: 'email',
main: true,
value: 'andros705#gmail.com'
},
{
type: 'phone',
value: '0735521632'
}
]
}
The secondary index should only cover objects whose type is 'email' and whose main is set to true.
Here's how you might create such an index:
table.indexCreate(
'email',
row => row('contact').filter({type: 'email'})('value'),
{multi: true})
This works by using a multi-index. When the multi: true argument is passed to indexCreate, the index function is expected to return an array instead of a single value. Every element in that array can be used to look up the document in the index (using getAll or between). If the array is empty, the document will not show up in the index.
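The multi-index behaviour described above can be imitated in plain Python to see what ends up in the index. This is only a simulation, not RethinkDB code: the sample rows, the helper names `email_keys` and `build_multi_index`, and the `main_only` flag are all invented for illustration.

```python
from collections import defaultdict


def email_keys(row, main_only=False):
    """Index function: returns a list of keys (possibly empty) for one document."""
    return [
        c["value"]
        for c in row.get("contact", [])
        if c.get("type") == "email" and (c.get("main") is True or not main_only)
    ]


def build_multi_index(rows, key_fn):
    """Map every key returned by key_fn to the ids of the documents producing it."""
    index = defaultdict(list)
    for row in rows:
        for key in key_fn(row):  # an empty list means the row is not indexed
            index[key].append(row["id"])
    return index


rows = [
    {"id": 1, "contact": [
        {"type": "email", "main": True, "value": "a@example.com"},
        {"type": "phone", "value": "0735521632"},
    ]},
    {"id": 2, "contact": [{"type": "phone", "value": "555"}]},            # no email
    {"id": 3, "contact": [{"type": "email", "value": "b@example.com"}]},  # not main
]

index = build_multi_index(rows, email_keys)
main_index = build_multi_index(rows, lambda r: email_keys(r, main_only=True))
```

The index function shown in the answer filters on type only, which corresponds to `index` here. To also require main to be true, as the question asks, the ReQL filter would presumably become `filter({type: 'email', main: true})`, which corresponds to `main_index`.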

GraphQL Root Query: Same schema, different types

I'm pretty new to GraphQL and within my root query I have two fields that are very similar aside from their "type" property, that I would like to combine.
allPosts returns an array of post objects, while post returns a single post.
Each field is using the same schema, and the loaders/resources are being determined within those respective fields based on the argument passed in.
const RootQuery = new graphql.GraphQLObjectType({
name: 'Query',
description: 'Root Query',
fields: {
allPosts: {
type: new graphql.GraphQLList(postType),
args: {
categoryName: {
type: graphql.GraphQLString
}
},
resolve: (root, args) => resolver(args)
},
post: {
type: postType,
args: {
slug: {
type: graphql.GraphQLString
}
},
resolve: (root, args) => resolver(args)
},
}
});
Is it possible to combine these two fields into one and have the type determined by the argument passed in or another variable?
No, you can't.
Once you define a field as a GraphQLList, you always get an array; it cannot suddenly return a single object instead.
The same applies in the other direction: a field defined as a GraphQLObjectType (or a scalar type) cannot return an array.
These two fields really serve different purposes.
That said, you can always add limit logic to your allPosts field and limit the result to one. You will still get an array, just one containing a single post.
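The limit idea can be sketched independently of any GraphQL library. In the Python stand-in below, the `POSTS` data, the `limit` argument, and the function name `resolve_all_posts` are assumptions made for illustration; the point is only that a list-typed resolver returns a list even when one item is requested.

```python
# In-memory stand-in for the real data source (made-up sample posts).
POSTS = [
    {"slug": "hello", "category": "news", "title": "Hello"},
    {"slug": "intro", "category": "news", "title": "Intro"},
    {"slug": "tips", "category": "howto", "title": "Tips"},
]


def resolve_all_posts(category_name=None, limit=None):
    """List resolver: the return value is always a list, whatever the arguments."""
    result = [
        p for p in POSTS
        if category_name is None or p["category"] == category_name
    ]
    # Slicing keeps the list type; limit=1 yields a one-element list, not an object.
    return result if limit is None else result[:limit]


one = resolve_all_posts(category_name="news", limit=1)
```

A client asking for one post through such a field would still have to unwrap the first element, which is why keeping a separate post field is usually cleaner.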

GridFS - product images & thumbnails - what is the best DB structure?

I have an e-commerce website running on MongoDB + GridFS.
Each product may have up to 5 images.
Each image has 3 thumbnails of different sizes.
I need advice on the best DB structure for this.
Currently I'm thinking of storing the image IDs and the thumbnail IDs (IDs from GridFS) in each product:
{
'_id': 1,
'title': 'Some Product',
'images': [
{'id': '11', 'thumbs': {'small': '22', 'medium': '33'}},
{'id': '44', 'thumbs': {'small': '55', 'medium': '66'}}
]
}
Or would it be better to store path in GridFS?
{
'_id': '111',
'filename': '1.jpg',
'path': 'product/988/image/111/'
},
{
'_id': '222',
'filename': '1.jpg',
'path': 'product/988/image/111/thumbnail_small'
},
{
'_id': '333',
'filename': '1.jpg',
'path': 'product/988/image/111/thumbnail_large'
}
UPDATE: "path" field in GridFS is a "fake" path, not a real one. Just a quick way to find all related files. It is cheaper to have 1 indexed field than several fields with compound indexes.
If you are going to store the images with GridFS within MongoDB, I would go with the first one.
The second schema doesn't seem right. GridFS is meant to store files, so once you have the id of the image you don't need any path within those documents. If you simply want to store the path of the file, embed it directly in your primary collection, so you avoid the overhead of a somewhat useless collection.
In general, see "Storing Images in DB - Yea or Nay?" to decide whether you really should store your images in a DBMS.
Additionally, if you only save the path, you may need few or no changes should you switch to a CDN to deliver your images to customers.
