I have a field in Elasticsearch that is sometimes a string and sometimes a string array. The field in my .NET model is a string array. During deserialization I would like to always convert it to a string array to match my model. Would I use IElasticsearchSerializer, or is that for handling the entire source rather than a single field? Does anyone have a simple example I could try?
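Whatever the serializer hook ends up being, the coercion itself is just a shape check. Here is a minimal sketch, in TypeScript purely for illustration (the tags field name is made up); the same logic would live inside whichever field-level converter your .NET client supports:

function toStringArray(value: string | string[]): string[] {
  // Accept either a single string or an array and always return an array.
  return Array.isArray(value) ? value : [value];
}

// Example: normalizing the field on two differently shaped _source payloads.
const hitA = JSON.parse('{"tags": "red"}');
const hitB = JSON.parse('{"tags": ["red", "blue"]}');
console.log(toStringArray(hitA.tags)); // ["red"]
console.log(toStringArray(hitB.tags)); // ["red", "blue"]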
I am working on a project, and I have this event:
event AddedDoctor(
address indexed doctorAddress,
string indexed name,
string indexed doctorRegistrationId,
uint256 dateOfRegistration,
string specialization,
address hospitalAddress
);
I am not able to access all of the parameters of this event to index them in The Graph. I am facing two issues:
The string indexed name parameter is indexed, so it is accessible via event.params.name, but it comes through in Bytes format. Searching the net, I found that indexed strings and arrays are stored as hashes rather than plain strings. How do I get unstuck?
I am not able to read the unindexed parameters string specialization and address hospitalAddress using event.params.specialization and event.params.hospitalAddress. How do I access these unindexed parameters?
Basically I want to index all these event parameters in The Graph for easy retrieval of data. How can I do that?
I found that The Graph can read an unindexed string parameter as a plain string, and we can work with that directly. Indexed string parameters are hashed into a 32-byte value (the keccak-256 hash of the string), which is difficult to work with. I found this answer very helpful:
https://ethereum.stackexchange.com/questions/6840/indexed-event-with-string-not-getting-logged
So, basically, I removed the indexed keyword from all parameters of string or array type, and it works.
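For reference, a minimal sketch of what the corresponding mapping handler could look like once indexed is removed from the string parameters. It is AssemblyScript (TypeScript syntax); the generated import paths, the Registry contract name, and the Doctor entity are hypothetical stand-ins for whatever graph codegen produces from your ABI and schema:

import { AddedDoctor } from "../generated/Registry/Registry";
import { Doctor } from "../generated/schema";

export function handleAddedDoctor(event: AddedDoctor): void {
  // With `indexed` removed, the string parameters arrive as plain strings
  // instead of 32-byte keccak-256 hashes.
  let doctor = new Doctor(event.params.doctorAddress.toHexString());
  doctor.name = event.params.name;
  doctor.doctorRegistrationId = event.params.doctorRegistrationId;
  doctor.dateOfRegistration = event.params.dateOfRegistration; // uint256 maps to BigInt
  doctor.specialization = event.params.specialization;
  doctor.hospitalAddress = event.params.hospitalAddress; // address maps to Bytes
  doctor.save();
}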
My log POCO has several fixed properties, like user ID and timestamp, plus a flexible data-bag property, which is a JSON representation of any extra information I'd like to attach to the log. This means the property names within the data bag could be anything, which raises two questions:
How can I configure the mapping so that the data-bag property, which is of type string, is mapped to a JSON object during indexing instead of being treated as a plain string?
Since the data-bag object has arbitrary property names, the overall document type could end up with a huge number of properties. Would this hurt search performance?
For the translation from string to JSON you can use an ingest pipeline with the JSON processor:
https://www.elastic.co/guide/en/elasticsearch/reference/master/json-processor.html
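As a concrete illustration, here is a minimal sketch using the official JavaScript/TypeScript client (v8-style API); the pipeline id, index name, and the dataBag field are made-up examples:

import { Client } from "@elastic/elasticsearch";

const client = new Client({ node: "http://localhost:9200" });

async function main() {
  // Create a pipeline whose JSON processor parses the `dataBag` string
  // into a structured object at index time.
  await client.ingest.putPipeline({
    id: "parse-data-bag",
    description: "Parse the dataBag string field into a JSON object",
    processors: [{ json: { field: "dataBag" } }],
  });

  // Route documents through the pipeline when indexing.
  await client.index({
    index: "logs",
    pipeline: "parse-data-bag",
    document: {
      userId: 42,
      timestamp: new Date().toISOString(),
      dataBag: '{"orderId": 7, "region": "eu-west"}',
    },
  });
}

main().catch(console.error);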
It depends on your queries. If you use free-text search, then yes, a huge number of fields will slow the query down. If you use queries like "field":"value", then no, the number of fields is not a problem for searches. You can find additional information about query optimization here:
https://www.elastic.co/guide/en/elasticsearch/reference/7.15/tune-for-search-speed.html#search-as-few-fields-as-possible
And the question is: what do you mean by "huge number"? 1,000? 10,000? 100,000? As part of the optimization I recommend using dynamic templates with a definition that ingests each string field into the index as "keyword" only, rather than text + keyword. This setting cuts the number of string fields in half.
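For example, a minimal sketch of such a dynamic template, again via the JavaScript/TypeScript client with a made-up index name:

import { Client } from "@elastic/elasticsearch";

const client = new Client({ node: "http://localhost:9200" });

async function main() {
  // Every dynamically added string field becomes a single `keyword` field
  // instead of the default `text` plus `.keyword` multi-field.
  await client.indices.create({
    index: "logs",
    mappings: {
      dynamic_templates: [
        {
          strings_as_keyword: {
            match_mapping_type: "string",
            mapping: { type: "keyword" },
          },
        },
      ],
    },
  });
}

main().catch(console.error);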
I am wondering what advantages, apart from type validation, an integer field type has over a string type. As far as I know, in the Lucene index those fields are stored in a common byte format anyway.
The reason I am asking is that I have a field value which can be either a string or an integer. I am wondering whether I should create different types inside a mapping, i.e. localhost:9200/index/string_type and localhost:9200/index/integer_type, or whether I can safely (in terms of performance and other aspects) use a string type for both variants.
I am using Elasticsearch 2.4.
You could actually go with the string_type for both. I personally don't see any advantage to having an integer_type over the string. But then make sure you map the string as not_analyzed, so the value of the field will not be analyzed or tokenized; that way you can simply use the field for aggregations. Maybe you should have a look at this one, which elaborates more. Having both field types at once would not make any difference compared to doing the above.
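For illustration, a minimal sketch of such a mapping against Elasticsearch 2.x, using the legacy elasticsearch npm client; the index, type, and field names are made up:

import * as elasticsearch from "elasticsearch";

const client = new elasticsearch.Client({ host: "localhost:9200" });

async function main() {
  // `not_analyzed` keeps the exact value in the index, so the field can be
  // used for term queries and aggregations without tokenization.
  await client.indices.putMapping({
    index: "myindex",
    type: "mytype",
    body: {
      mytype: {
        properties: {
          value: { type: "string", index: "not_analyzed" },
        },
      },
    },
  });
}

main().catch(console.error);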
I have an Elasticsearch (version 1.7) cluster with multiple indices. Each index has multiple doc_types, and each of those has fields with a variety of types. I'd like to get a list of field names for a given field type. This would necessarily be a nested list. For example, I'd like to query for field type "string" and get back {index1: {doc_type1.1: [field1.1.1, field1.1.2], ...}, ...}; the leaves of this nested dict are only those fields with the given type. So the hits for this query won't be documents but rather a subset of the cluster's mapping. Is this possible using Elasticsearch?
One solution: I know I can get the mapping as a dict using Python and then work on that mapping dict to recover this nested list (sketched below). But I think there should be an Elasticsearch-native way of doing this, not a Python solution. In my searches through the documentation I just keep finding the "type filter", which filters by doc_type, not field type.
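For what it's worth, here is a minimal sketch of that client-side approach, in TypeScript with the legacy elasticsearch npm client (this being a 1.x cluster); it only walks top-level properties and ignores nested ones:

import * as elasticsearch from "elasticsearch";

const client = new elasticsearch.Client({ host: "localhost:9200" });

// Collect {index: {docType: [fieldName, ...]}} for fields of a given type
// by walking the mapping on the client side.
async function fieldsByType(wanted: string): Promise<any> {
  const mappings: any = await client.indices.getMapping();
  const result: any = {};
  for (const index of Object.keys(mappings)) {
    for (const docType of Object.keys(mappings[index].mappings)) {
      const props = mappings[index].mappings[docType].properties || {};
      const hits = Object.keys(props).filter((f) => props[f].type === wanted);
      if (hits.length > 0) {
        result[index] = result[index] || {};
        result[index][docType] = hits;
      }
    }
  }
  return result;
}

fieldsByType("string").then((r) => console.log(JSON.stringify(r, null, 2)));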
There's currently no way of achieving this. The _mapping endpoint will return all fields of the requested mapping type(s).
However, there might be a way, provided your fields have a special naming convention hinting at their type, for instance name_str (string field for "name"), age_int (integer field for "age"), etc. In this case, you could use response filtering on the _mapping call and retrieve only the fields ending with _str:
curl -XGET 'localhost:9200/yourindex/_mapping/yourtype?filter_path=*.mappings.*.properties.*_str'
I am developing an API using CodeIgniter and MongoDB.
In this system I am saving the full name and _id of the users that the selected user is following.
What is best to do regarding the _id: store it as an ObjectId or as a string? If I store it as an ObjectId, I need to convert it to a string when echoing out followers, otherwise the output looks strange.
My question really is: is it OK to store the _id as a string rather than as an ObjectId? What is the downside of storing it as a string?
Thankful for all input!
Performance for requests (and updates) is really better with ObjectId. Moreover, ObjectIds are quite small in terms of space.
From the official docs:
BSON includes a binary data datatype for storing byte arrays. Using this will make the id values, and their respective keys in the _id index, twice as small.
Here are two links that can help you:
- http://www.mongodb.org/display/DOCS/Optimizing+Object+IDs
- http://www.mongodb.org/display/DOCS/Object+IDs
When you use ObjectId, the _id it generates is unique across all your machines, so if you use sharding you will not have to worry about _id conflicts. See how an ObjectId is generated in the specification.
But if you use strings, you have to generate unique values carefully yourself.
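A minimal sketch of the difference, using the Node.js MongoDB driver in TypeScript (the database, collection, and field names are made up):

import { MongoClient, ObjectId } from "mongodb";

async function main() {
  const client = await MongoClient.connect("mongodb://localhost:27017");
  const users = client.db("app").collection("users");

  // Store the reference as an ObjectId (12 bytes) rather than its 24-char
  // hex string, which keeps the _id index smaller and comparisons cheaper.
  const followedId = new ObjectId();
  await users.insertOne({ name: "Jane Doe", following: [followedId] });

  // Convert to a string only at the edge, when echoing JSON to the client.
  const user = await users.findOne({ name: "Jane Doe" });
  const output = user!.following.map((id: ObjectId) => id.toHexString());
  console.log(output);

  await client.close();
}

main().catch(console.error);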