How to get first two occurrences using LINQ? - performance

I need to know the number of elements that satisfy a condition so I perform the following:
int numberOfItems = context.SomeEntity.Count(someCondition);
However, I only need to know whether numberOfItems is exactly 1, so I would like to make the query more efficient by stopping the count as soon as two items satisfy the condition (once two matches have been found, there is no need to keep checking). Something like:
bool existsOnlyOne = context.SomeEntity....
How to achieve this?

You can change the query to
bool existsOnlyOne = context.SomeEntity.Where(someCondition).Take(2).Count() == 1;
If more than two items match, Take(2) ignores everything from the third match onward, so the count can only be 0, 1, or 2, and it equals 1 only when exactly one item matches.
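For readers outside .NET, the same short-circuit idea can be sketched in Python. This is only an illustration over an in-memory iterable, not the LINQ provider's behaviour; exists_exactly_one, is_even, and the sample list are hypothetical names introduced here.

```python
from itertools import islice

def exists_exactly_one(items, predicate):
    """Return True if exactly one item satisfies predicate."""
    matches = (item for item in items if predicate(item))
    # islice stops pulling from the generator after two matches,
    # which mirrors Where(...).Take(2).Count() == 1.
    return sum(1 for _ in islice(matches, 2)) == 1

# Hypothetical data to show the early exit.
seen = []

def is_even(n):
    seen.append(n)  # record which items were actually inspected
    return n % 2 == 0

result = exists_exactly_one([1, 2, 3, 4, 5, 6], is_even)
print(result, seen)
```

In this run the items 5 and 6 are never inspected, because the second match (4) already decides the answer.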

XQuery/XPath "except" operation - Select part of sequence that is not in other sequence

I have a pretty simple example but I am just learning and can't find a solution for the following:
Given 2 sequences, being
<emp>10</emp>
<emp>42</emp>
<emp>100</emp>
and another sequence
<emp>10</emp>
<emp>42</emp>
What I want to do is compare the sequences and return the part that is in the first sequence but not in the second, which is <emp>100</emp> in this case.
I was thinking about an "except" operation, but can't figure out how to make it work.
Help greatly appreciated.
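The except expression operates on node identity, not node value. What I think you want is a value comparison over your sequences. The general comparison . = $seq2 is true when the current item equals any item in $seq2, so negating it keeps only the items that have no match. For example: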
let $seq1 :=
(<emp>10</emp>,
<emp>42</emp>,
<emp>100</emp>)
let $seq2 :=
(<emp>10</emp>,
<emp>42</emp>)
return $seq1[not(. = $seq2)]
=>
<emp>100</emp>
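The difference between identity and value can also be sketched in Python with the standard xml.etree.ElementTree module. This is a rough analogue under the assumption that comparing the text content of each element is what is wanted; seq2_values and only_in_first are hypothetical names.

```python
import xml.etree.ElementTree as ET

seq1 = [ET.fromstring(s) for s in ("<emp>10</emp>", "<emp>42</emp>", "<emp>100</emp>")]
seq2 = [ET.fromstring(s) for s in ("<emp>10</emp>", "<emp>42</emp>")]

# Compare by text value, not by object identity: the elements in seq2 are
# distinct objects from those in seq1 even when their content is equal.
seq2_values = {e.text for e in seq2}
only_in_first = [e for e in seq1 if e.text not in seq2_values]

print([ET.tostring(e, encoding="unicode") for e in only_in_first])
```

An identity-based difference would have returned all three elements of seq1 here, since no element object is shared between the two lists.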

Difficulty Accessing Members of Tuple in Apache Pig

I have a variable titled F.
Describe F returns:
F: {group: bytearray,indexkey: {(indexkey: chararray)}}
Dump F returns:
(321,{(CHOW),(DREW)})
(5011,{(CHOW),(DREW)})
(5825,{(TANNER),(SPITZENBERGER)})
(16631,{(CHOW),(DREW)})
(34299,{(CHOW),(DREW)})
(35044,{(TANNER),(SPITZENBERGER)})
(65623,{(CHOW),(DREW)})
(74597,{(SPITZENBERGER),(TANNER)})
(83499,{(SPITZENBERGER),(TANNER)})
(90257,{(SPITZENBERGER),(TANNER)})
What I need is to produce an output that looks like this (only 1st row as an example):
(321,DREW,{(CHOW)})
I've tried using dereference to pull out the first element by using this:
G = FOREACH F generate indexkey.$0;
But, this still returns the whole tuple.
Can anyone suggest a method for doing this? I was under the impression that the dereference operator should allow me to do this.
Thanks in advance!
Daniel
You can't index into bags like that, because bags have no notion of ordering. Selecting the "first" item in a bag should be treated as picking an arbitrary one.
Either way, if you want only one item instead of all of them, you can use a nested FOREACH to apply a LIMIT of 1:
first = FOREACH F {
    lim = LIMIT indexkey 1;
    GENERATE group, lim;
}
(Disclaimer: I can't test this code right now; if it doesn't work, let me know. Hopefully you get the gist.)
You can take this a bit further and FLATTEN it to remove the bag of one item entirely, but be careful: if the bag is empty, I think the entire record is thrown away.
first = FOREACH F {
    lim = LIMIT indexkey 1;
    GENERATE group, FLATTEN(lim);
}

XPath :: running counter two levels

Using the count(preceding-sibling::*) XPath expression, one can obtain incrementing counters. However, can the same be accomplished across a sequence that is two levels deep?
example XML instance
<grandfather>
    <father>
        <child>a</child>
    </father>
    <father>
        <child>b</child>
        <child>c</child>
    </father>
</grandfather>
code (with Saxon HE 9.4 jar on the CLASSPATH for XPath 2.0 features)
I am trying to get a counter sequence of 1, 2 and 3 for the three child nodes with different kinds of XPath expressions:
XPathExpression expr = xpath.compile("/grandfather/father/child");
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
    Node node = nodes.item(i);
    System.out.printf("child's index is: %s %s %s, name is: %s\n",
        xpath.compile("count(preceding-sibling::*)").evaluate(node),
        xpath.compile("count(preceding-sibling::child)").evaluate(node),
        xpath.compile("//child/position()").evaluate(doc),
        xpath.compile(".").evaluate(node));
}
The above code prints:
child's index is: 0 0 1, name is: a
child's index is: 0 0 1, name is: b
child's index is: 1 1 1, name is: c
None of the three XPaths I tried managed to produce the correct sequence: 1, 2, 3. Clearly it can trivially be done using the loop variable i, but I want to accomplish it with XPath if possible. I also need to keep the basic framework of evaluating an XPath expression to get all the nodes to visit and then iterating over that set, since that is how the real application I work on is structured. Basically, I visit each node and then need to evaluate a number of XPath expressions on it (node) or on the document (doc); one of these XPath expressions is supposed to produce this incrementing sequence.
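Use the preceding axis with a name test instead. It counts every earlier child element in document order regardless of its parent, so the results start at 0; add 1 if you need the sequence to start at 1.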
count(preceding::child)
Using XPath 2.0, there is a much better way to do this. Fetch all <child/> nodes and use the position() function to get the index:
//child/concat("child's index is: ", position(), ", name is: ", text())
You don't say efficiency is important, but I really hate to see this done with O(n^2) code! Jens' solution shows how to do that if you can use the result in the form of a sequence of (position, name) pairs. You could also return an alternating sequence of strings and numbers using //child/(string(.), position()), though you would then want to use the s9api API rather than JAXP, because JAXP can only really handle the data types that arise in XPath 1.0.
If you need to compute the index of each node as part of other processing, it might still be worth computing the index for every node in a single initial pass, and then looking it up in a table. But if you're doing that, the simplest way is surely to iterate over the result of //child and build a map from nodes to the sequence number in the iteration.
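The single-pass lookup table described above might look like the following Python sketch. It uses the standard xml.etree.ElementTree module rather than Saxon or JAXP, so it only illustrates the idea; index_of and children are hypothetical names.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<grandfather>"
    "<father><child>a</child></father>"
    "<father><child>b</child><child>c</child></father>"
    "</grandfather>"
)

# One pass in document order: map each <child> element to its 1-based index.
children = doc.findall("./father/child")
index_of = {node: i for i, node in enumerate(children, start=1)}

# Later processing can look the index up instead of recounting preceding nodes.
for node in children:
    print(f"child's index is: {index_of[node]}, name is: {node.text}")
```

Building the table costs one traversal, and each later lookup is a dictionary access, which avoids recounting the preceding nodes for every node visited.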

Array List looping for a duplicate value

I am looking for an easy or simple way to make an array of something, let's say ice creams. This would be an ice cream class with various attributes (ID, flavour, size, scoops). I would like to build an array that gathers every ice cream ordered and then search through this list for any duplicate values (2+ with the same size).
My first idea was a for loop that creates the array, then grabs the ice cream ID for the first instance and checks its flavour against the array. If no duplicate is found, the ID is increased by 1 (ID++) and that ice cream's flavour is then run against the array. If a match is found, I would set a Boolean to true.
Every approach I take seems rather long-winded, and I haven't got one working yet. I'm hoping some fresh, more experienced eyes can help.
In answer to below;
The XML would hold something like below
<iceCream id="1">
    <flavour>chocolate</flavour>
    <scoops>5</scoops>
</iceCream>
<iceCream id="2">
    <flavour>banana</flavour>
    <scoops>2</scoops>
</iceCream>
I would want to use Drools (probably with an ArrayList?) to gather each iceCream tag and check whether any of the ice creams have the same flavour, then output something (set a boolean to true) if a match is found. My understanding was to make an array, then run each ice cream through the array, using its ID to identify it, and increment the ID inside each loop (int ID = 1, then ID++ in the loop), as well as search through the flavour child tag.
int ID = 0;
boolean match = false;
ArrayList iceCreams = new ArrayList($cont.getIceCreams());
for(iceCream $Flavour: (ArrayList<iceCream>)iceCreams)
{
ID++
if($Flavour.getFlavour().equals(icecream with id of (ID variable).getFlavour)
{
match = true;
}
}
if(match)
{etc etc etc}
Something along these lines if this helps?
1) If you control the creation of the first array, why not make sure at insertion time that you insert only ice creams that are unique? So, when you are about to insert, say, ID=1, first iterate through the array and check whether it already contains an ice cream with ID 1; if not, put it into the array and carry on.
2) Searching: while inserting, keep the array in ascending order of ID so that you can use binary search for the lookup.
Note: I don't know Drools; I have just posted the logic as I understand the problem.
I don't know Drools either, but I'll post some pseudocode for what I think you are trying to accomplish:
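match = false
for (i = 0; i < len(ice_cream_array) && !match; i++)
{
    for (j = i + 1; j < len(ice_cream_array); j++)
    {
        if (ice_cream_array[i].flavour == ice_cream_array[j].flavour)
        {
            match = true   // two ice creams share a flavour
            break          // leave the inner loop; the outer loop stops because match is true
        }
    }
}
// match is still false here only if no two ice creams share a flavour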
You may also want to look up bubble sorts and binary searches.
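As an alternative to the nested loops, a set of already-seen flavours finds a duplicate in a single pass. The Python sketch below is not Drools and swaps in a different technique from the answers above; the IceCream class and has_duplicate_flavour are hypothetical stand-ins for the asker's types.

```python
from dataclasses import dataclass

@dataclass
class IceCream:  # hypothetical stand-in for the asker's ice cream class
    id: int
    flavour: str
    scoops: int

def has_duplicate_flavour(ice_creams):
    """Return True as soon as two ice creams share a flavour."""
    seen = set()
    for ice_cream in ice_creams:
        if ice_cream.flavour in seen:
            return True  # this flavour was already ordered
        seen.add(ice_cream.flavour)
    return False

# Sample order taken from the XML in the question.
order = [IceCream(1, "chocolate", 5), IceCream(2, "banana", 2)]
match = has_duplicate_flavour(order)
print(match)
```

The set lookup replaces the inner loop, so no ID counter is needed to step through the list.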

mathematica: PadRight[] and \[PlusMinus]

Is there any way that
PadRight[a \[PlusMinus] b,2,""]
Returns
{a \[PlusMinus] b,""}
Instead of
a \[PlusMinus] b \[PlusMinus] ""
?
I believe that I need to somehow deactivate the operator properties of \[PlusMinus].
Why do I need this?
I'm creating a program to display tables with physical quantities. To me, that means tables with entries like
(value of a) \[PlusMinus] (uncertainty of a)
When I have several columns with different heights, I pad the shorter ones with "", so I can use Transpose on the numeric part of the table.
If the column has more than one entry, there's no problem:
PadRight[{a \[PlusMinus] b,c \[PlusMinus] d},4,""]
gives what I want:
{a \[PlusMinus] b,c \[PlusMinus] d,"",""}
It is when the column has only one entry that my problem appears.
This is the code that constructs the body padded with "":
If[tested[Sbody],1,
body = PadRight[body, {Length[a], Max[Map[Length, body]]
With
tested[a__] :=
If[Length[DeleteDuplicates[Map[Dimensions, {a}]]] != 1, False,
True];
, a function that checks whether its arguments have the same dimensions
and
a={Quantity1,Quantity2,...}
Where the quantities are the ones that I want in my table.
Thanks
First, you need to be aware that any expression in Mathematica has the form Head[Body],
where Body may be empty, a single expression, or a sequence of expressions separated by commas.
Length operates on expressions, not necessarily lists,
so
Length[PlusMinus[a,b]]
returns 2, since the body of the expression contains two expressions (atoms in this case), a and b.
Read the documentation on PadRight. The second argument defines the final length of the expression,
so
PadRight[{a,b},4,c] results in a list of length 4 with the last two elements equal to c, and
PadRight[{a,b},2,c] results in the original list, since it is already of length 2.
Therefore
PadRight[PlusMinus[a,b],2,anything] just returns the same PlusMinus[a,b] unchanged, since it is already of length 2.
So your first example is wrong. You cannot get a result with head List from PadRight when the expression you pad has head PlusMinus.
There is no problem executing
PadRight[PlusMinus[a,b],3,""]
but the result looks funny (at best) and is logically meaningless. If this is what you wanted in the first place, you get it, and following my explanations above you can figure out why.
HTH
best
yehuda
