Reject hash from array of hashes based on a condition in Ruby

I have an array of hashes like below.
[
  {
    "name":"keith",
    "age":"20",
    "weight":"100lb",
    "status":"CURRENT"
  },
  {
    "name":"keith",
    "age":"20",
    "weight":"110lb",
    "status":"PREVIOUS"
  },
  {
    "name":"keith",
    "age":"20",
    "weight":"120lb",
    "status":"FUTURE"
  }
]
I am trying to filter the array so that, when name and age are the same, only the hash whose status is CURRENT is kept and the hashes with status PREVIOUS and FUTURE are removed. So the output should be
[
  {
    "name":"keith",
    "age":"20",
    "weight":"100lb",
    "status":"CURRENT"
  }
]
I have tried using group_by and flat_map but couldn't get the desired output.
Could someone please help me figure this out?

Try the following:
arr.uniq { |h| [h[:name], h[:age]] }.select { |h| h[:status] == 'CURRENT' }
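Note that uniq keeps the first hash in each (name, age) group, so this relies on the CURRENT entry appearing before the PREVIOUS and FUTURE ones, as it does in the sample data. With the array above it returns:
[{:name=>"keith", :age=>"20", :weight=>"100lb", :status=>"CURRENT"}]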

As far as I understand the problem (including the context added in the comments), the task is as follows:
for any combination of (name, age), if there is an entry with status == CURRENT, we should return only that entry;
otherwise, we should return the original entries as they are.
For example, for the data
data = [
  {
    "name":"keith",
    "age":"20",
    "weight":"100lb",
    "status":"CURRENT"
  },
  {
    "name":"keith",
    "age":"20",
    "weight":"110lb",
    "status":"PREVIOUS"
  },
  {
    "name":"keith",
    "age":"20",
    "weight":"120lb",
    "status":"FUTURE"
  },
  {
    "name":"alice",
    "age":"20",
    "weight":"120lb",
    "status":"PREVIOUS"
  },
  {
    "name":"alice",
    "age":"20",
    "weight":"120lb",
    "status":"FUTURE"
  },
]
the result of the operation described above should be
[
  {:name=>"keith", :age=>"20", :weight=>"100lb", :status=>"CURRENT"},
  {:name=>"alice", :age=>"20", :weight=>"120lb", :status=>"PREVIOUS"},
  {:name=>"alice", :age=>"20", :weight=>"120lb", :status=>"FUTURE"}
]
(so for (keith, 20) we return only the hash where status == CURRENT, and for (alice, whatever) we return the original entries unchanged).
The solution seems to be very straightforward:
Group by (name, age)
For each group, if it contains an entry (hash) with status == CURRENT, return just that entry; otherwise return the whole group as is.
Or the same in Ruby:
data
  .group_by { |name:, age:, **| [name, age] }
  .reduce([]) do |acc, (_, group)|
    current = group.find { |status:, **| status == "CURRENT" }
    current ? acc.push(current) : acc.concat(group)
  end
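Note that keyword-destructuring block parameters such as |name:, age:, **| rely on automatic hash-to-keyword conversion, which Ruby 3.0 removed, so on Ruby 3+ this block may raise ArgumentError (missing keyword). A sketch of the same approach using plain hash access, assuming symbol keys as in the sample data:
data
  .group_by { |h| h.values_at(:name, :age) }
  .flat_map do |_, group|
    # keep only the CURRENT entry when a group has one, otherwise keep the whole group
    current = group.find { |h| h[:status] == "CURRENT" }
    current ? [current] : group
  end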

Related

GraphQL on clause with enum type

I have a question regarding GraphQL, because I do not know whether what I want is possible.
I have a simple schema like this:
enum Range {
  D,
  D_1,
  D_7
}

type Data {
  id: Int!
  levels(range: [Range!]): [LevelEntry]
}

type LevelEntry {
  range: Range!
  levelData: LevelData
}

type LevelData {
  range: Range!
  users: Int
  name: String
  stairs: Int
  money: Float
}
Basically, I want to write a query that retrieves different attributes for the different entries in the levelData property of the levels array, filtered by some range of levels.
For instance:
data {
  "id": 1,
  "levels": [
    {
      "range": D,
      "levelData": {
        "range": D,
        "users": 1
      }
    },
    {
      "range": D_1,
      "levelData": {
        "range": D_1,
        "users": 1,
        "name": "somename"
      }
    }
  ]
}
This means that for D I want the "range, users" properties, and for D_1 the "range, users, name" properties.
I have drafted an example query, but I do not know whether this is possible:
query data(range: [D,D_1]){
  id,
  levels {
    range
    ... on D {
      range,
      users
    }
    ... on D_1 {
      range,
      users,
      name
    }
  }
}
Is it possible? If it is, how can I do it?

Null pointer exception while consuming streams

{
  "rules": [
    {
      "rank": 1,
      "grades": [
        {
          "id": 100,
          "hierarchyCode": 32
        },
        {
          "id": 200,
          "hierarchyCode": 33
        }
      ]
    },
    {
      "rank": 2,
      "grades": []
    }
  ]
}
I have a JSON like the above and I'm using streams to return "hierarchyCode" based on some condition. For example, if I pass 200, my result should print 33. So far I have done something like this:
request.getRules().stream()
        .flatMap(ruleDTO -> ruleDTO.getGrades().stream())
        .map(gradeDTO -> gradeDTO.getHierarchyCode())
        .forEach(hierarchyCode -> {
            // I'm doing some business logic here
            Optional<SomePojo> dsf = someList.stream()
                    .filter(pojo -> hierarchyCode.equals(pojo.getId())) // lets say pojo.getId() returns 200
                    .findFirst();
            System.out.println(dsf.get().getCode());
        });
In the first iteration it returns the expected output, 33, but in the second iteration it fails with a NullPointerException instead of just skipping the loop, since the "grades" array is empty this time. How do I handle the NullPointerException here?
You can use the code snippet below, using Java 8:
int result;
int valueToFilter = 200;
List<Grade> gradeList = data.getRules().stream()
        .map(Rule::getGrades)
        .filter(x -> x != null && !x.isEmpty())
        .flatMap(Collection::stream)
        .collect(Collectors.toList());
Optional<Grade> optional = gradeList.stream()
        .filter(x -> x.getId() == valueToFilter)
        .findFirst();
if (optional.isPresent()) {
    result = optional.get().getHierarchyCode();
    System.out.println(result);
}
I have created POJOs according to my code; you can try this approach with your own code structure.
In case you need the POJOs for this code, I will share them as well.
Thanks,
Girdhar
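As a side note, the exception in the original pipeline can also be avoided without collecting into an intermediate list. Here is a minimal, hedged sketch of that idea, reusing the names from the question and guarding both a possibly null grades list and the empty Optional (ifPresent instead of get()):
request.getRules().stream()
        .map(ruleDTO -> ruleDTO.getGrades())
        .filter(grades -> grades != null)            // skip rules whose grades list is null
        .flatMap(grades -> grades.stream())          // an empty list simply contributes nothing
        .map(gradeDTO -> gradeDTO.getHierarchyCode())
        .forEach(hierarchyCode -> someList.stream()
                .filter(pojo -> hierarchyCode.equals(pojo.getId()))
                .findFirst()
                .ifPresent(pojo -> System.out.println(pojo.getCode())));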

ElasticSearch Painless: using vector functions in for loops bug

I ran into what seems to be a bug in Painless where, if a vector function such as l2norm() is used inside a loop, the outcome remains the same as in the first iteration. I'm using the Painless script in a function_score query; I hope the query below sheds some light. I'm using the "exception" to see what the value is in each iteration, and it is always the score of the first vector. I know this because I cycled the parameters a couple of times, and the score is "stuck" on the first result every time. So what I think is happening is that l2norm() (and all vector functions?!) are object instances that can only be instantiated once? If that is the case, what would be a workaround?
Link to the ES discussion: https://discuss.elastic.co/t/painless-bug-using-for-loops-and-vector-functions/267263
{
  "query": {
    "nested": {
      "path": "media",
      "query": {
        "function_score": {
          "boost_mode": "replace",
          "query": {
            "bool": {
              "filter": [{
                "exists": {
                  "field": "media.full_body_dense_vector"
                }
              }]
            }
          },
          "functions": [{
            "script_score": {
              "script": {
                "source": "if (params.filterVectors.size() > 0 && params.filterCutOffScore >= 0) {\n for (int i=0; i < params.filterVectors.size();i++) {\n def c = params.filterVectors[i]; double euDistance = l2norm(c, doc['media.full_body_dense_vector']);\n if (i==1) { throw new Exception(euDistance + ''); } \n }\n return 1.0f;",
                "params": {
                  "filterVectors": [
                    [1.0,2.0,3.0],[0.1,0.4,0.5]
                  ],
                  "filterCutOffScore": 1.04
                },
                "lang": "painless"
              }
            }
          }]
        }
      }
    }
  },
  "size": 500,
  "from": 0,
  "track_scores": true
}
While l2norm is a static method, it certainly shouldn't keep returning the same value for different arguments!
I've investigated a bit and it seems the bug is limited to loops. When you call l2norm outside of a loop with either parametrized or hard-coded vectors, the results are always different -- as they should be. But not within a for loop (I've tested a while loop too -- same result). Here's a minimal reproducible example that could be used to report a bug on GitHub:
"script": {
  "source": """
    def field = doc['media.full_body_dense_vector'];
    def hardcodedVectors = [ [1,2,3], [0.1,0.4,0.5] ];
    def noLoopDistances = [
      l2norm(hardcodedVectors[0], field),
      l2norm(hardcodedVectors[1], field)
    ];
    def hardcodedDistances = [];
    for (vector in hardcodedVectors) {
      double euDistance = l2norm(vector, field);
      hardcodedDistances.add(euDistance);
    }
    def parametrizedDistances = [];
    for (vector in params.filterVectors) {
      double euDistance = l2norm(vector, field);
      parametrizedDistances.add(euDistance);
    }
    def comparisonMap = [
      "no-loop": noLoopDistances,
      "hardcoded": hardcodedDistances,
      "parametrized": parametrizedDistances
    ];
    Debug.explain(comparisonMap);
  """,
  "params": {
    "filterVectors": [ [1,2,3], [0.1,0.4,0.5] ]
  },
  "lang": "painless"
}
which yields
{
  "no-loop": [
    8.558621384311845, // <-- the only time two different l2norm calls behave correctly
    11.071133967619906
  ],
  "parametrized": [
    8.558621384311845,
    8.558621384311845
  ],
  "hardcoded": [
    8.558621384311845,
    8.558621384311845
  ]
}
What this tells me is that it's not a matter of runtime caching but rather something else that should be investigated further by the Elastic team.
The workaround, for now, would be to keep using the parametrized vectors but instead of looping perform stone-age-like checks:
if (params.filterVectors.size() == 0) {
  // default to something
} else if (params.filterVectors.size() == 1) {
  // call l2norm once
} else if (params.filterVectors.size() == 2) {
  // call l2norm twice, separately
}
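For illustration only, a fleshed-out version of that check might look like the sketch below. The scoring rule (return 1.0 when at least one filter vector lies within params.filterCutOffScore of the document vector, 0.0 otherwise) is an assumption about the intent of the original script, not something stated in the question:
// hedged sketch, assumes at most two filter vectors as in the workaround above
def field = doc['media.full_body_dense_vector'];
int n = params.filterVectors.size();
if (n == 0) {
  return 1.0f;                                   // nothing to filter on
}
if (l2norm(params.filterVectors[0], field) <= params.filterCutOffScore) {
  return 1.0f;                                   // first vector is within the cutoff
}
if (n >= 2 && l2norm(params.filterVectors[1], field) <= params.filterCutOffScore) {
  return 1.0f;                                   // second vector is within the cutoff
}
return 0.0f;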
P.S. Throwing a new Exception() in order to debug Painless is fine. Using Debug.explain is even better for reasons explained in this sub-chapter on Debugging of my Elasticsearch Handbook.
First off, thanks to Joe for confirming I wasn't imagining things and that it is indeed a bug. Second, the lovely Elasticsearch team has been triaging the issue and confirmed it is a bug, so the answer to this post is a link to the GitHub issue, so that in the future people can track in which Elasticsearch version this behaviour is patched.

GraphQL fallback query if no results

I have the following query:
{
  entity(id: "theId") {
    source1: media(source: 1){
      images{
        src, alt
      }
    }
    source2: media(source: 2){
      images{
        src, alt
      }
    }
  }
}
That gives me a result like:
{
  "entity": [
    {
      "source1": {
        "images": [{"src": "", "alt": ""}]
      },
      "source2": {
        "images": [{"src": "", "alt": ""}]
      }
    }
  ]
}
Is there a way to get a single result from source1 and source2: query source1 and, if it has no result, use source2 as a fallback?
You are querying two fields (source1, source2), so something has to come back for both of them (null being a possible option). If you want to check them in sequence, you should probably break the query in two and run them one at a time from the client.
Could you perhaps change it so that you only query a single source field and have the resolver (on the server) return whatever makes sense based on what is available? Like this:
{
  entity(id: "theId") {
    source: media(sourcesList: [1, 2]){
      images{
        src, alt
      }
    }
  }
}
where sourcesList is the list of sources to try, in order. The resolver (server) can then check whether source 1 is available and, if not, return source 2.
You could also add a field to let the client know which source was actually returned from the proposed list (sourceNumberReturned below would return 1 if source 1 was returned, otherwise 2).
{
  entity(id: "theId") {
    source: media(sourcesList: [1, 2]){
      images{
        src, alt
      }
      sourceNumberReturned
    }
  }
}
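For what it's worth, a rough resolver sketch of that fallback logic, written as an Apollo-style resolver map in JavaScript. The type name Entity and the fetchMediaForSource helper are hypothetical, not something from the question:
const resolvers = {
  Entity: {
    media: async (entity, { sourcesList }) => {
      // Try each requested source in order and return the first one that has images.
      for (const source of sourcesList) {
        const media = await fetchMediaForSource(entity.id, source); // hypothetical data-access helper
        if (media && media.images && media.images.length > 0) {
          // Tell the client which source actually answered.
          return { ...media, sourceNumberReturned: source };
        }
      }
      return null; // no source had anything to show
    },
  },
};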

Elasticsearch - group by element in child collection

I have documents of the following format in an Elasticsearch index:
{
  "item":"Firefox",
  "tags":["a","b","c"]
},
{
  "item":"Chrome",
  "tags":["b","c","d"]
}
I want to group by each element in the tags property, so that I get results like:
"a" = 1, "b" = 2, "c" = 2, "d" = 1
Any help or pointers would be appreciated.
If you index (write) your documents to index x, type y, then:
POST x/y/_search
{
  "size": 0,
  "aggs": {
    "t": {
      "terms": {
        "field": "tags"
      }
    }
  }
}
To see how this works in detail, read up on Elasticsearch terms aggregations.
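One caveat worth flagging: a terms aggregation needs a keyword (or otherwise aggregatable) field. If tags was indexed with default dynamic mapping, it is a text field with a keyword sub-field, so you would aggregate on tags.keyword instead:
POST x/y/_search
{
  "size": 0,
  "aggs": {
    "t": {
      "terms": {
        "field": "tags.keyword"
      }
    }
  }
}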
