Sorting inner hits results using NEST

I'm having trouble figuring out the correct usage of .Sort() for inner hits. If I have a document structure like:
{
"id": "whatever",
"sales": [
{
"startsOn": "2022-05-01T00:00:00",
"endsOn": "2022-05-31T23:59:59",
"discountPercent": 0.10
},
{
"startsOn": "2022-06-01T00:00:00",
"endsOn": "2022-06-30T23:59:59",
"discountPercent": 0.15
}
]
}
I'm trying to order the innerResults by startsOn descending and my InnerHits code looks something like:
client.SearchAsync<Product>(x =>
x.Query(q =>
q.Nested(n =>
n.Path(p => p.sales)
.Query(q => q.DateRange( /* misc date range*/))
.InnerHits(ih => ih.Name("sorted_Sales")))));
I tried adding .Sort() after .Name(), but I get an error along the lines of:
FieldSortDescriptor does not contain a definition for sales and no accessible extension method, yada yada

The following sorts the inner hits in descending order by the startsOn field:
.InnerHits(ih => ih
.Name("sorted_Sales")
.Sort(s => s
.Descending(d => d.sales.FirstOrDefault()!.startsOn))))))

Related

Creating a nested hash from an API call doesn't work properly

I am new here and I hope that I'm doing everything right.
I also searched the forum and Google, but I didn't find the answer. (Or I did not notice that the solution was right in front of me. If so, I'm sorry >.< .)
I have a problem and I don't know exactly what I am doing wrong at the moment.
I make an API request and get a big JSON response back. It looks something like this:
"apps": [
{
"title": "XX",
... many more data
},
{
"title": "XX",
... many more data
},
{
"title": "XX",
... many more data
}
... and so on
]
After that I want to create a hash with the data I need. For example, it should look like:
{
"APP_0" => {"Title"=>"Name1", "ID"=>"1234", "OS"=>"os"},
"APP_1" => {"Title"=>"Name2", "ID"=>"5678", "OS"=>"os"}
}
But the values in the hash that I create with my code look like:
"APP_1", {"Title"=>"Name2", "ID"=>"5678", "OS"=>"os"}
I don't know if this is a valid hash. After that I want to iterate through the hash and output just the ID, but I get an error (TypeError). What am I doing wrong?
require 'json'
require 'net/http'
require 'uri'
require 'httparty'
response = HTTParty.get('https://xxx/api/2/app', {
headers: {"X-Toke" => "xyz"},
})
all_apps_parse = JSON.parse(response.body)
all_apps = Hash.new
all_apps_parse["apps"].each_with_index do |app, i|
all_apps["APP_#{i}"] = {'Title' => app["title"],
'ID' => app["id"],
'OS' => app["platform"]}
end
all_apps.each_with_index do |app, i|
app_id = app["App_#{i}"]["id"]
p app_id
end
I hope someone can understand the problem and can help me :-). Thanks in advance.
Assuming the data looks something like this:
all_apps_parse = { "apps" => [
{
"title" => "Name1",
"id" => 1234,
"platform" => "os"
},
{
"title" => "Name2",
"id" => 5678,
"platform" => "os"
},
{
"title" => "Name3",
"id" => 1111,
"platform" => "windows"
}]
}
and with a little idea of what you want to achieve, here is my solution:
all_apps = Hash.new
all_apps_parse["apps"].each_with_index do |app, i|
all_apps["APP_#{i}"] = { 'Title' => app["title"],
'ID' => app["id"],
'OS' => app["platform"] }
end
all_apps
=> {"APP_0"=>{"Title"=>"Name1", "ID"=>1234, "OS"=>"os"}, "APP_1"=>{"Title"=>"Name2", "ID"=>5678, "OS"=>"os"}, "APP_2"=>{"Title"=>"Name3", "ID"=>1111, "OS"=>"windows"}}
all_apps.each do |key, value|
puts key # => e.g. "APP_0"
puts value['ID'] # => e.g. 1234
end
# Prints
APP_0
1234
APP_1
5678
APP_2
1111
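As for why the original loop raised a TypeError: when each_with_index is called on a Hash with a single block parameter, that parameter receives a [key, value] Array, so indexing it with a String fails. The key spelling ("App_" vs "APP_") and the "id" vs "ID" mismatch would also have returned nil once that was fixed. A minimal sketch, using made-up sample data in the same shape as above:

```ruby
# Sample data in the shape built above (values are illustrative).
all_apps = {
  "APP_0" => { "Title" => "Name1", "ID" => 1234, "OS" => "os" },
  "APP_1" => { "Title" => "Name2", "ID" => 5678, "OS" => "os" }
}

# each_with_index on a Hash yields ([key, value], index).
first_pair = all_apps.each_with_index.first[0]
p first_pair.class # prints Array

# Indexing that Array with a String is what raises the TypeError.
error = begin
  first_pair["App_0"]
  nil
rescue TypeError => e
  e
end
puts error.class # prints TypeError

# Destructuring the pair (or using each/map with |key, value|) avoids it.
ids = all_apps.map { |_key, value| value["ID"] }
p ids # prints [1234, 5678]
```

If the index is still needed, the pair can be destructured in the block signature, e.g. `each_with_index { |(key, value), i| ... }`.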

How to search on multiple fields with boost weights?

At the moment I'm using this:
response = await ElasticClient.SearchAsync<Product>(s => s
.From(skip)
.Size(productSearch.ItemsPerPage)
.Index(productSearch.Company + PartOfIndexName + productSearch.Country)
.Query(q => q
.QueryString(c => c
.Fields(f => f
.Field(p => p.IdPart1, 4.0)
.Field(p => p.Title, 4.0)
.Field(p => p.BrandName, 3.0)
.Field(p => p.Description, 2.0)
)
.Query("*" + productSearch.Query + "*")
)
)
);
But this doesn't work. No results get returned. But I get a valid response (debug information: "Valid NEST response built from a successful low level call on POST"). Does anyone have any idea what I'm doing wrong? It's been days now and I still can't figure it out.
When I query it via the Elasticsearch REST API like this:
POST http://localhost:9200/company_products_country/_search
body:
{
"size": 10,
"query": {
"match": {
"title": "something"
}
}
}
Then it works and I get results. But if I search the description field for something like: "787920/1", then I get no results. The description field is a 500 char text field.
I index the documents like this:
ElasticClient.Index(product, idx => idx.Index(indexName));

Multiple limit condition in mongodb

I have a collection in which one of the fields is "type". I want to get a fixed number of documents for each type, using a condition that is the same for all types. For example, I want 2 documents for type A, 2 for type B, and so on.
How can I do this in a single query? I am using Ruby Active Record.
What you are describing is a relatively common question in the MongoDB community, which we could call the "top n results problem": given some input that is likely sorted in some way, how do you get the top n results without relying on arbitrary index values in the data?
MongoDB has the $first operator which is available to the aggregation framework which deals with the "top 1" part of the problem, as this actually takes the "first" item found on a grouping boundary, such as your "type". But getting more than "one" result of course gets a little more involved. There are some JIRA issues on this about modifying other operators to deal with n results or "restrict" or "slice". Notably SERVER-6074. But the problem can be handled in a few ways.
Popular implementations of the Rails Active Record pattern for MongoDB storage are Mongoid and MongoMapper. Both allow access to the "native" MongoDB collection functions via a .collection accessor. This is basically what you need in order to use native methods such as .aggregate(), which supports more functionality than general Active Record aggregation.
Here is an aggregation approach with mongoid, though the general code does not alter once you have access to the native collection object:
require "mongoid"
require "pp";
Mongoid.configure.connect_to("test");
class Item
include Mongoid::Document
store_in collection: "item"
field :type, type: String
field :pos, type: String
end
Item.collection.drop
Item.collection.insert( :type => "A", :pos => "First" )
Item.collection.insert( :type => "A", :pos => "Second" )
Item.collection.insert( :type => "A", :pos => "Third" )
Item.collection.insert( :type => "A", :pos => "Fourth" )
Item.collection.insert( :type => "B", :pos => "First" )
Item.collection.insert( :type => "B", :pos => "Second" )
Item.collection.insert( :type => "B", :pos => "Third" )
Item.collection.insert( :type => "B", :pos => "Fourth" )
res = Item.collection.aggregate([
{ "$group" => {
"_id" => "$type",
"docs" => {
"$push" => {
"pos" => "$pos", "type" => "$type"
}
},
"one" => {
"$first" => {
"pos" => "$pos", "type" => "$type"
}
}
}},
{ "$unwind" => "$docs" },
{ "$project" => {
"docs" => {
"pos" => "$docs.pos",
"type" => "$docs.type",
"seen" => {
"$eq" => [ "$one", "$docs" ]
},
},
"one" => 1
}},
{ "$match" => {
"docs.seen" => false
}},
{ "$group" => {
"_id" => "$_id",
"one" => { "$first" => "$one" },
"two" => {
"$first" => {
"pos" => "$docs.pos",
"type" => "$docs.type"
}
},
"splitter" => {
"$first" => {
"$literal" => ["one","two"]
}
}
}},
{ "$unwind" => "$splitter" },
{ "$project" => {
"_id" => 0,
"type" => {
"$cond" => [
{ "$eq" => [ "$splitter", "one" ] },
"$one.type",
"$two.type"
]
},
"pos" => {
"$cond" => [
{ "$eq" => [ "$splitter", "one" ] },
"$one.pos",
"$two.pos"
]
}
}}
])
pp res
The naming in the documents is not actually used by the code. The "pos" values shown in the data ("First", "Second", etc.) are only there to illustrate that you are indeed getting the "top 2" documents from the listing as a result.
So the approach here is essentially to create a "stack" of the documents "grouped" by your key, such as "type". The very first thing here is to take the "first" document from that stack using the $first operator.
The subsequent steps match the "seen" elements from the stack and filter them out, then you take the "next" document off the stack, again using the $first operator. The final steps are really just there to return the documents to the original form as found in the input, which is generally what is expected from such a query.
So the result is of course, just the top 2 documents for each type:
{ "type"=>"A", "pos"=>"First" }
{ "type"=>"A", "pos"=>"Second" }
{ "type"=>"B", "pos"=>"First" }
{ "type"=>"B", "pos"=>"Second" }
There was a longer discussion and version of this as well as other solutions in this recent answer:
Mongodb aggregation $group, restrict length of array
Essentially the same thing despite the title and that case was looking to match up to 10 top entries or greater. There is some pipeline generation code there as well for dealing with larger matches as well as some alternate approaches that may be considered depending on your data.
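To make the expected "top 2 per type" result concrete, here is a minimal in-memory sketch in plain Ruby. It is not the aggregation pipeline above and is not a substitute for it (it assumes all documents have already been loaded into an array); it only shows the selection the pipeline is meant to produce, using the same sample values:

```ruby
# Sample documents mirroring the inserts above, as plain hashes.
items = [
  { "type" => "A", "pos" => "First" },
  { "type" => "A", "pos" => "Second" },
  { "type" => "A", "pos" => "Third" },
  { "type" => "B", "pos" => "First" },
  { "type" => "B", "pos" => "Second" },
  { "type" => "B", "pos" => "Third" }
]

# Group by the key, then keep the first two documents of each group.
# group_by preserves the input order within each group.
top_two = items
  .group_by { |doc| doc["type"] }
  .flat_map { |_type, docs| docs.first(2) }

top_two.each { |doc| p doc }
```

Doing this on the client means transferring every document first, which is why the server-side pipeline is usually preferable for large collections.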
You will not be able to do this directly with only the type field and the constraint that it must be one query. However, there is (as always) a way to accomplish this.
To find documents of different types, you would need some additional value that, on average, distributes the types according to how you want the data back.
db.users.insert({type: 'A', index: 1})
db.users.insert({type: 'B', index: 2})
db.users.insert({type: 'A', index: 3})
db.users.insert({type: 'B', index: 4})
db.users.insert({type: 'A', index: 5})
db.users.insert({type: 'B', index: 6})
Then when querying for items with db.users.find({index: {$gt: 2, $lt: 7}}) you will have the right distribution of items.
Though I'm not sure this is what you were looking for.

Example of how to use synonyms in NEST

I haven't found a solid example of how to create and use synonyms using NEST for Elasticsearch. If anyone has one, it would be helpful.
My attempt looks like this, but I don't know how to apply it to a field.
var syn = new SynonymTokenFilter
{
Synonyms = new [] { "pink, p!nk => pink", "lil, little", "ke$ha, kesha => ke$ha" },
IgnoreCase = true,
Tokenizer = "standard"
};
client.CreateIndex("myindex", i =>
{
i
.Analysis(a => a.Analyzers(an => an
.Add("fullTermCaseInsensitive", fullTermCaseInsensitive)
)
.TokenFilters(x => x
.Add("synonym", syn)
)
)
...
It's very simple :)
You first need to define the synonym filter; then you can use it in your custom analyzer, where you can also add other types of filters.
Small example:
.Analysis(descriptor => descriptor
.Analyzers(bases => bases
.Add("folded_word", new CustomAnalyzer()
{
Filter = new List<string> { "icu_folding", "trim", "synonym" },
Tokenizer = "standard"
}
)
)
.TokenFilters(i => i
.Add("synonym", new SynonymTokenFilter()
{
SynonymsPath="analysis/synonym.txt",
Format = "Solr"
}
)
)
Then you can use the custom analyzer in the mapping part.
Assuming your fullTermCaseInsensitive analyzer is custom, you need to add your synonym filter to it:
var fullTermCaseInsensitive = new CustomAnalyzer()
{
.
.
.
Filter = new string[] { "syn" }
};
And upon creating your index, you can add a mapping and apply the fullTermCaseInsensitive analyzer to your field(s):
client.CreateIndex("myindex", c => c
.Analysis(a => a
.Analyzers(an => an.Add("fullTermCaseInsensitive", fullTermCaseInsensitive))
.TokenFilters(tf => tf.Add("syn", syn)))
.AddMapping<MyType>(m => m
.Properties(p => p
.String(s => s.Name(t => t.MyField).Analyzer("fullTermCaseInsensitive")))));

How do I extract values from nested JSON?

After parsing some JSON:
data = JSON.parse(data)['info']
puts data
I get:
[
{
"title"=>"CEO",
"name"=>"George",
"columns"=>[
{
"display_name"=> "Salary",
"value"=>"3.85",
}
, {
"display_name"=> "Bonus",
"value"=>"994.19",
}
, {
"display_name"=> "Increment",
"value"=>"8.15",
}
]
}
]
columns has nested data in itself.
I want to save the data in a database or CSV file.
title, name, value_Salary, value_Bonus, value_increment
But I'm not concerned about getting display_name, so just the values of first of columns, second of columns data, etc.
I tried data.map after converting to a hash, and also hash.flatten, but couldn't find a way out:
.map{|x| x['columns']}
.map {|s| s["value"]}
I tried to at least get the values separately, but couldn't.
This is a simple problem, and resolves down to a couple of nested map blocks.
Here's the data retrieved from JSON, plus an extra row to demonstrate how easy it is to handle a more complex JSON response:
data = [
{
"title" => "CEO",
"name" => "George",
"columns" => [
{
"display_name" => "Salary",
"value" => "3.85",
},
{
"display_name" => "Bonus",
"value" => "994.19",
},
{
"display_name" => "Increment",
"value" => "8.15",
}
]
},
{
"title" => "CIO",
"name" => "Fred",
"columns" => [
{
"display_name" => "Salary",
"value" => "3.84",
},
{
"display_name" => "Bonus",
"value" => "994.20",
},
{
"display_name" => "Increment",
"value" => "8.15",
}
]
}
]
Here's the code:
records = data.map { |record|
title, name = record.values_at('title', 'name')
values = record['columns'].map{ |column| column['value'] }
[title, name, *values]
}
Here's the resulting data structure, an array of arrays:
records
# => [["CEO", "George", "3.85", "994.19", "8.15"],
# ["CIO", "Fred", "3.84", "994.20", "8.15"]]
Saving it into a database or CSV is left for you to figure out, but Ruby's CSV class makes it trivial to write a file, and an ORM like Sequel makes it really easy to insert the data into a database.
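As a rough sketch of the CSV half, assuming the records array from above and header names taken from the column list in the question (adjust them to whatever you need), the standard library's CSV.generate builds the text in memory:

```ruby
require "csv"

# The array of arrays produced by the nested map above.
records = [
  ["CEO", "George", "3.85", "994.19", "8.15"],
  ["CIO", "Fred", "3.84", "994.20", "8.15"]
]

# Header names are an assumption based on the columns listed in the question.
header = %w[title name value_Salary value_Bonus value_Increment]

# CSV.generate returns the CSV text as a String.
csv_text = CSV.generate do |csv|
  csv << header
  records.each { |row| csv << row }
end

puts csv_text
```

To write a file instead, the same block works with CSV.open and a path of your choosing in "w" mode.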
