Say I have a field A with values:
"some string"
12
["I'm an array"]
{"great": "also an object"}
How does this work? (if it does at all)
I.e.: in Elasticsearch, for example, an implicit field mapping is created under the covers based on the first value that comes in for said field, if an explicit mapping doesn't exist.
E.g.: if "some string" comes in as the first value for A, A is assumed to contain strings from then on. If anything that can't be coerced to a string is persisted afterwards, the insert will fail.
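A rough sketch of that behaviour against a hypothetical index (index and type names are made up here, and the exact error varies by Elasticsearch version):
PUT my_index/doc/1
{ "A": "some string" }
PUT my_index/doc/2
{ "A": 12 }
PUT my_index/doc/3
{ "A": { "great": "also an object" } }
The first insert makes dynamic mapping type A as a string field; the second still succeeds because the number 12 can be indexed as the string "12"; the third is rejected (a mapper_parsing_exception) because an object can't be stored in a field mapped as a string.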
Since RethinkDb is schemaless (no field mappings), does the same logic apply here?
Or, as an alternative, nothing at all is assumed on type, and polymorphic values can live happily side by side in the same field?
Nothing at all is assumed about the type; the same field can hold values of different types, and they can live happily side by side. When querying, if you need to make a decision based on the type of a field, you can use something like branch and typeOf (see the sketch after the inserts below), or do some pre-processing with map.
You can try this in the Data Explorer:
r.table('user').insert({f: "12"});
r.table('user').insert({f: 12});
r.table('user').insert({f: [12]});
r.table('user').insert({f: {v: 12}});
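And a rough sketch of the branch/typeOf approach mentioned above (same hypothetical table and field as the inserts; normalizing numbers to strings is just an illustration):
r.table('user').map(function (doc) {
  return r.branch(
    doc('f').typeOf().eq('NUMBER'),                // is the field a number?
    doc.merge({f: doc('f').coerceTo('string')}),   // then normalize it to a string
    doc                                            // otherwise leave the document as-is
  );
});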
Related
I have a scenario where I want to search for 'bank of india', and the documents retrieved have hits for 'reserve bank of india', 'state bank of india', etc. Basically, the named entity in the search string is part of other named entities as well.
What are the ways to avoid this in Elasticsearch?
If you use the keyword type instead of text as the mapping for your entity field, you will no longer have those partial matches. keyword says treat this text as a single unit (named entities are like this), while text says treat each word as a unit and consider the field a bag of words, so the query looks for the most word matches, regardless of order or whether all of the words are there. There are different queries that can get at that by requiring order (match_phrase) or requiring all words to match (the minimum_should_match parameter), but I like to use the term query if you follow the keyword mapping strategy. Does that make sense?
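A minimal sketch of that strategy (index and field names are made up; on Elasticsearch versions before 7 the mapping also needs a type name):
PUT entities
{
  "mappings": {
    "properties": {
      "entity": { "type": "keyword" }
    }
  }
}
GET entities/_search
{
  "query": {
    "term": { "entity": "bank of india" }
  }
}
Because entity is a keyword, the term query only matches documents whose value is exactly "bank of india", so "state bank of india" and "reserve bank of india" no longer come back.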
When I write queries to fetch a non-scalar type, I often forget to include a subselection of the type's fields I want to see.
Is there a way to have some fields subselected by default if I omit the subselection syntax?
Nope. Nothing like it in the spec. I'd say the reason is that you'd be almost guaranteed to select an infinite (or at least humongous) graph by accident.
We hit an odd behaviour in Elasticsearch (ES). We have a database of companies and wanted to sort results by company name (which is a field of type text). This failed because ES sorted on any word in the name, so e.g. the name "Zoo on a Bus" would come out near the top, as it contains "a", which sorts highly.
As a work-around, we're sorting on a field of type keyword.
Is there a better solution? A parameter we could pass into sort that would make it work alphabetically on the whole field?
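For reference, that keyword work-around usually takes the shape of a keyword sub-field next to the analyzed text field, with the sort targeting the sub-field (the names here are hypothetical, and on Elasticsearch versions before 7 the mapping also needs a type name):
PUT companies
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "raw": { "type": "keyword" }
        }
      }
    }
  }
}
GET companies/_search
{
  "sort": [ { "name.raw": "asc" } ]
}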
I am wondering what other advantages, apart from type validation, the integer field type has over the string type. As far as I know, those fields are stored in a common byte format in the Lucene index anyway.
The reason I am asking is that I have a field whose value can be either a string or an integer. I am wondering whether I should create different types inside a mapping, i.e. localhost:9200/index/string_type and localhost:9200/index/integer_type, or whether I can safely (in terms of performance and other aspects) use the string type for both variants.
I am using Elastic 2.4.
You could actually go with the string type for both. I don't personally see any advantage of having an integer type over the string here. But then make sure that you map the string as not_analyzed, so that the value of the field will not be analyzed or tokenized and you can simply use the field for aggregations. Maybe you should have a look at this one, which elaborates more. Having both field types at once would not make any difference from doing the above.
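A sketch of that not_analyzed mapping on Elasticsearch 2.4 (index, type and field names are placeholders):
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "value": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}
With this mapping the whole value (whether it arrives as "12" or 12) is stored as a single un-tokenized term, so it can be used directly in term queries and aggregations.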
I'm trying to filter records that have a specific key and delete them. I tried "withFields" and "hasFields", but it seems that I can't apply delete to them. My question is: how can I do that?
r.db('databaseFoo').table('checkpoints').filter(function (user) {
  return user('type').default(false);
}).delete();
If you want all documents that have a type key, you can use hasFields for that.
r.db('databaseFoo').table('checkpoints')
.hasFields('type')
In your current query, what you are doing is keeping only the documents whose type value is truthy; documents that don't have a type key, or where the value of type is equal to false, get filtered out. This might be what you want, but it's a little confusing if you only want documents that have a type property.
Keeping a reference to the original document
The problem with using hasFields is that it converts a selection (a sequence with a reference to the specific rows in the database), which you can update and delete from, into a plain sequence, with which you can't do that. This is a known issue in RethinkDB. You can read this blog post to understand the different types in ReQL a bit better.
In order to get around this, you can use the hasFields method inside the filter method.
r.db('databaseFoo').table('checkpoints')
.filter(r.row.hasFields('type'))
.delete()
This query will work since it returns a selection which can then be passed into delete.
If you want to get all records with a specific value at a specific key, you can do so in a couple of different ways. To get all documents where the property type is equal to false, you can do the following:
r.db('databaseFoo').table('checkpoints')
.filter({ type: false })
or, you can do:
r.db('databaseFoo').table('checkpoints')
.filter(r.row('type').eq(false))
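Either filter returns a selection, so the delete from the original question can be chained straight onto it:
r.db('databaseFoo').table('checkpoints')
  .filter({ type: false })
  .delete()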