How to count distinct field values in the MongoDB Java API

I need to find the number of distinct values of a field. I used MongoCollection.distinct(), which returns a DistinctIterable, but it has no size method, so to find the size I have to iterate the DistinctIterable and count. Is there any method that gives me the number of distinct field values without iterating?
MongoCollection<Document> collection = db.getCollection("test");
// assumes the "name" field holds strings; the driver needs the result class
DistinctIterable<String> disIterable = collection.distinct("name", String.class);
int count = 0;
Iterator<String> iterator = disIterable.iterator();
while (iterator.hasNext()) {
    iterator.next();
    count = count + 1;
}

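Try this:
public long returnSize() {
    MongoCollection<Document> collection = db.getCollection("test");
    // assumes the "name" field holds strings; the driver needs the result class
    DistinctIterable<String> disIterable = collection.distinct("name", String.class);
    // into() still iterates the cursor internally; it only hides the loop
    return disIterable.into(new ArrayList<>()).size();
}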

Related

Java8 stream average of object property in collection

I'm new to Java so if this has already been answered somewhere else then I either don't know enough to search for the correct things or I just couldn't understand the answers.
So the question being:
I have a bunch of objects in a list:
try(Stream<String> logs = Files.lines(Paths.get(args))) {
return logs.map(LogLine::parseLine).collect(Collectors.toList());
}
And this is how the properties are added:
LogLine line = new LogLine();
line.setUri(matcher.group("uri"));
line.setrequestDuration(matcher.group("requestDuration"));
....
How do I process the logs so that I end up with a list where objects with the same "uri" appear only once, with the average requestDuration?
Example:
object1.uri = 'uri1', object1.requestDuration = 20;
object2.uri = 'uri2', object2.requestDuration = 30;
object3.uri = 'uri1', object3.requestDuration = 50;
Result:
object1.uri = 'uri1', 35;
object2.uri = 'uri2', 30;
Thanks in advance!
Take a look at Collectors.groupingBy and Collectors.averagingDouble. In your case, you could use them as follows:
Map<String, Double> result = logLines.stream()
    .collect(Collectors.groupingBy(
        LogLine::getUri,
        TreeMap::new,
        Collectors.averagingDouble(LogLine::getRequestDuration)));
The Collectors.groupingBy method does what you want. It is overloaded, so that you can specify the function that returns the key to group elements by, the factory that creates the returned map (I'm using TreeMap here, because you want the entries ordered by key, in this case the URI), and a downstream collector, which collects the elements that match the key returned by the first parameter.
If requestDuration is an int, you can use Collectors.averagingInt instead. Note that it still produces a Double average; if you need an Integer result, round or truncate the value afterwards.
This assumes LogLine has getUri() and getRequestDuration() methods.
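For reference, here is a self-contained sketch of the same grouping on the three sample objects from the question. The LogLine class below is only a hypothetical stand-in (the real one is not shown in the question), and it assumes requestDuration is numeric.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class AverageByUri {

    // Hypothetical stand-in for the asker's LogLine class.
    static final class LogLine {
        private final String uri;
        private final double requestDuration;

        LogLine(String uri, double requestDuration) {
            this.uri = uri;
            this.requestDuration = requestDuration;
        }

        String getUri() { return uri; }
        double getRequestDuration() { return requestDuration; }
    }

    // Groups the lines by URI and averages the durations within each group.
    // The TreeMap keeps the result ordered by URI.
    static Map<String, Double> averageDurationByUri(List<LogLine> logLines) {
        return logLines.stream()
                .collect(Collectors.groupingBy(
                        LogLine::getUri,
                        TreeMap::new,
                        Collectors.averagingDouble(LogLine::getRequestDuration)));
    }

    public static void main(String[] args) {
        List<LogLine> logLines = Arrays.asList(
                new LogLine("uri1", 20),
                new LogLine("uri2", 30),
                new LogLine("uri1", 50));
        // prints "uri1 -> 35.0" then "uri2 -> 30.0"
        averageDurationByUri(logLines)
                .forEach((uri, avg) -> System.out.println(uri + " -> " + avg));
    }
}
```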

Sorting for Azure DocumentDB

I want to use DocumentDB to store roughly 200,000 documents of the same type. Each document gets an integer id field, and I would like to retrieve the documents paged, in reverse order (highest id first).
I recently found out that there is no sorting in DocumentDB (see also DocumentDB - query result order). Perhaps it would be better to go for a different database (such as RavenDB), but time is pressing and I want to avoid the cost of switching to another database.
The question:
I have been looking at implementing my own sorted index of the documents on the client side (ASP Web API 2). I was thinking of creating a SortedList of key(id) and value(document.selflink). Then I could create a Getter with parameters for count, offset and a predicate to filter the documents. Below I added a quick example.
I just have the feeling this is a bad idea: either slow, costing too many resources, or better done another way. So I am open to implementation suggestions...
public class SortableDocumentDbRepository
{
    private SortedList _sorted = new SortedList();
    private readonly string _sortedPropertyName;

    private DocumentCollection ReadOrCreateCollection(string databaseLink) {
        DocumentCollection col = base.ReadOrCreateCollection(databaseLink);
        var docs = Client.CreateDocumentQuery(Collection.DocumentsLink)
            .AsEnumerable();
        lock (_sorted.SyncRoot) {
            foreach (Document doc in docs) {
                var propVal = doc.GetPropertyValue<string>(_sortedPropertyName);
                if (propVal != null) {
                    _sorted.Add(propVal, doc.SelfLink);
                }
            }
        }
        return col;
    }

    public List<T> GetItems<T>(int count, int offset, Expression<Func<T, bool>> predicate) {
        List<T> result = new List<T>();
        lock (_sorted.SyncRoot) {
            var values = _sorted.GetValueList();
            for (int i = offset; i < _sorted.Count; i++) {
                var queryable = predicate != null ?
                    Client.CreateDocumentQuery<T>(values[i].ToString()).Where(predicate) :
                    Client.CreateDocumentQuery<T>(values[i].ToString());
                T item = queryable.AsEnumerable().FirstOrDefault();
                if (item == null || item.Equals(default(T))) continue;
                result.Add(item);
                if (result.Count >= count) return result;
            }
        }
        return result;
    }
}
Microsoft has since implemented sorting:
https://learn.microsoft.com/en-us/azure/cosmos-db/sql-api-sql-query-reference#bk_orderby_clause
Example: SELECT * FROM c ORDER BY c._ts DESC
As you mentioned, ORDER BY unfortunately wasn't implemented yet when this was written.
Your approach looks reasonable to me.
I see you are using a predicate to narrow the query result set (pulling 200,000 records for any DB will be costly).
Since it looks like you want to order by id, you can also look into setting up a range index on id, which allows range queries (e.g. < and >) on the id to narrow the query result set further. A range index is also included by default on the _ts (timestamp) system property of documents, which may be helpful in this context.
See: http://azure.microsoft.com/en-us/documentation/articles/documentdb-indexing-policies/

How to sort IEnumerable with limited result count? (another implementation of .OrderBy.Take)

I have a binary file which contains more than 100 million objects. I read the file using BinaryReader and return (yield) each object (the file reader and IEnumerable implementation are here: Performance comparison of IEnumerable and raising event for each item in source?).
One of object's properties indicates the object rank (like A5). Assume that I want to get sorted top n objects based on the property.
I saw the code for OrderBy function: it uses QuickSort algorithm. I tried to sort the IEnumerable result with OrderBy and Take(n) function together, but I got OutOfMemory exception, because OrderBy function creates an array with size of total objects count to implement Quicksort.
Actually, the total memory I need is n so there is no need to create a big array. For instance, if I get Take(1000) it will return only 1000 objects and it doesn't depend on the total count of whole objects.
How can I get the result of OrderBy combined with Take? In other words, I need a bounded sorted list whose capacity is defined by the end user.
If you want the top N from an ordered source with the default LINQ operators, the only option is to load all items into memory, sort them, and select the first N results:
items.OrderBy(keySelector).Take(N) // Out of memory
If you only need to sort the first N items of the source (not the overall top N), simply take the items first and then sort them:
items.Take(N).OrderBy(keySelector)
UPDATE: you can use a buffer that keeps the N largest items in order:
public static IEnumerable<T> TakeOrdered<T, TKey>(
    this IEnumerable<T> source, int count, Func<T, TKey> keySelector)
{
    Comparer<T, TKey> comparer = new Comparer<T, TKey>(keySelector);
    List<T> buffer = new List<T>(); // kept sorted ascending, at most count items
    if (count <= 0)
        return buffer; // avoids indexing into an empty buffer below
    using (var iterator = source.GetEnumerator())
    {
        while (iterator.MoveNext())
        {
            T current = iterator.Current;
            if (buffer.Count == count)
            {
                // skip the item unless it is greater than the minimal buffered item
                if (comparer.Compare(current, buffer[0]) <= 0)
                    continue;
                buffer.RemoveAt(0); // remove the minimal item
            }
            // find the index that keeps the buffer sorted
            int index = buffer.BinarySearch(current, comparer);
            buffer.Insert(index >= 0 ? index : ~index, current);
        }
    }
    return buffer;
}
This solution also uses a custom comparer for items (to compare them by keys):
public class Comparer<T, TKey> : IComparer<T>
{
    private readonly Func<T, TKey> _keySelector;
    private readonly Comparer<TKey> _comparer = Comparer<TKey>.Default;

    public Comparer(Func<T, TKey> keySelector)
    {
        _keySelector = keySelector;
    }

    public int Compare(T x, T y)
    {
        return _comparer.Compare(_keySelector(x), _keySelector(y));
    }
}
Sample usage:
string[] items = { "b", "ab", "a", "abcd", "abc", "bcde", "b", "abc", "d" };
var top5byLength = items.TakeOrdered(5, s => s.Length);
var top3byValue = items.TakeOrdered(3, s => s);
LINQ does not have a built-in class that lets you take the top n elements without loading the whole collection into memory, but you can definitely build it yourself.
One simple approach would be to use a SortedDictionary of lists: keep adding elements to it until you hit the limit of n. After that, compare each element that you are about to add against the smallest element that you have kept so far (i.e. dict.Keys.First()). If the new element is smaller, discard it; otherwise, remove the smallest element and add the new one.
At the end of the loop your sorted dictionary will hold at most n elements, sorted according to the comparer that you set on the dictionary.
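The bounded-buffer idea from both answers can also be sketched with a min-heap. The version below is in Java rather than C#, so it is an illustration of the technique and not a drop-in replacement; the TopN class and its largest method are names made up for this example. It keeps at most count elements in memory, no matter how long the source is.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class TopN {

    // Returns the `count` largest elements of `source`, largest first,
    // holding at most `count` elements in memory at any time.
    static <T> List<T> largest(Iterable<T> source, int count, Comparator<? super T> order) {
        if (count <= 0) {
            return new ArrayList<>();
        }
        // Min-heap: the head is the smallest element currently kept.
        PriorityQueue<T> heap = new PriorityQueue<>(count, order);
        for (T item : source) {
            if (heap.size() < count) {
                heap.add(item);
            } else if (order.compare(item, heap.peek()) > 0) {
                heap.poll();    // drop the current minimum
                heap.add(item); // keep the larger newcomer
            }
        }
        List<T> result = new ArrayList<>(heap);
        result.sort(order.reversed());
        return result;
    }

    public static void main(String[] args) {
        List<Integer> ranks = Arrays.asList(5, 1, 9, 3, 7, 2, 8);
        // prints [9, 8, 7]
        System.out.println(largest(ranks, 3, Comparator.<Integer>naturalOrder()));
    }
}
```

Each element costs O(log n) at most, so a stream of 100 million objects is processed in one pass with only n objects buffered.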

How to order integers according to size and track their positions by variable name

I have a program with multiple int variables, where a count is added to the relevant variable each time a set fail condition is encountered. I want the user to be able to see, on a button click, how many failures of each category they have encountered. I want to display the values in a DataGridView, ordered from the highest integer down to the lowest, with the name of the related test step in the adjacent column.
My plan was to use Array.Sort to order the integers, but then I lose track of their names and can't fill the adjacent string column. I tried using a Hashtable, but if I use the string as the key it sorts alphabetically rather than numerically, and if I use the integer as the key, duplicate values don't get added to the table. Below is one of the attempts I tried, which has the problems described above. Essentially I want to end up with two arrays in matching order, one holding the names and one holding the values. FYI, the variables were declared before this section of code; variables ending in x hold the string name for the integer in the matching variable without the x.
Hashtable sorter = new Hashtable();
sorter[download] = downloadx;
sorter[power] = powerx;
sorter[phase] = phasex;
sorter[eeprom] = eepromx;
sorter[upulse] = upulsex;
sorter[vpulse] = vpulsex;
sorter[wpulse] = wpulsex;
sorter[volts] = voltsx;
sorter[current] = currentx;
sorter[ad] = adx;
sorter[comms] = commsx;
sorter[ntc] = ntcx;
sorter[prt] = prtx;
string list = "";
string[] names = new string[13];
foreach (DictionaryEntry child in sorter)
{
    list += child.Value.ToString() + "z";
}
int[] ordered = new int[] { download, power, phase, eeprom, upulse, vpulse, wpulse, volts, current, ad, comms, ntc, prt };
Array.Sort(ordered);
Array.Reverse(ordered);
for (int i = 0; i < sorter.Count; i++)
{
    int pos = list.IndexOf("z");
    names[i] = list.Substring(0, pos);
    list = list.Substring(pos + 1);
}
First question here, so I hope it's not too long-winded.
Thanks
Use a Dictionary. You can order it by value with myDico.OrderBy(x => x.Value).Reverse(), which gives a numerically descending sort. Then you just have to enumerate the result.
I hope I understand your need. Otherwise ignore me.
You want to be using a
Dictionary <string, int>
to store your numbers. I'm not clear on how you're displaying results at the end - do you have a grid or a list control?
You ask about usings. Which ones do you already have?
EDIT for .NET 2.0
There might be a more elegant solution, but you could implement the logic by putting your rows in a DataTable. Then you can make a DataView of that table and sort by whichever column you like, ascending or descending.
See http://msdn.microsoft.com/en-us/library/system.data.dataview(v=VS.80).aspx for example.
EDIT for .NET 3.5 and higher
As far as sorting a Dictionary by its values:
var sortedEntries = myDictionary.OrderBy(pair => pair.Value);
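If you need the results as a Dictionary, you can call .ToDictionary(pair => pair.Key, pair => pair.Value) on that, but note that Dictionary does not guarantee any enumeration order, so the sorted sequence is usually what you want to keep. For reverse order, use .OrderByDescending(pair => pair.Value).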
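The same idea, keeping each name attached to its count and sorting the pairs by count, looks like this as a small Java sketch (the thread is about C#, so treat it as an illustration only; the FailureRanking class and the sample counts are made up). Because the names are the keys, equal counts are no problem.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FailureRanking {

    // Returns the entries of `counts` ordered from highest to lowest count.
    // The sort is stable, so equal counts keep the map's iteration order.
    static List<Map.Entry<String, Integer>> byCountDescending(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> rows = new ArrayList<>(counts.entrySet());
        rows.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample data: test step name -> failure count.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("download", 4);
        counts.put("power", 9);
        counts.put("phase", 4);
        counts.put("eeprom", 1);
        // Each row carries both the name column and the value column.
        for (Map.Entry<String, Integer> row : byCountDescending(counts)) {
            System.out.println(row.getKey() + "\t" + row.getValue());
        }
    }
}
```

Each row can then be written straight into the two grid columns, so there is no need for two separate arrays that have to be kept in step.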

How to get values after dictionary sorting by values with linq

I have a dictionary which I sorted by value with LINQ. How can I get the sorted values out of the result?
This is what I have so far:
Dictionary<char, int> lettersAcurr = new Dictionary<char, int>();//sort by int value
var sortedDict = (from entry in lettersAcurr orderby entry.Value descending select entry);
While debugging I can see that sortedDict contains KeyValuePair items, but I can't access them.
Thanks for the help.
sortedDict is an IEnumerable<KeyValuePair<char, int>>; iterate over it.
Just iterate over it.
foreach (var kv in sortedDict)
{
    var value = kv.Value;
    ...
}
If you just want the char values you could modify your query as:
var sortedDict = (from entry in lettersAcurr orderby entry.Value descending select entry.Key);
which will give you a result of IEnumerable<char>
If you want it in a dictionary again, you might be tempted to write
var q = (from entry in lettersAcurr orderby entry.Value descending select entry.Key).ToDictionary(x => x);
but do bear in mind that the resulting dictionary will not be sorted, since Dictionary<TKey, TValue> does not maintain the sorted order.
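For comparison only: the question is about C#, but the same pitfall exists in Java, where a plain HashMap does not keep the sorted order either. There, one way around it is to collect into an order-preserving map. A minimal sketch, with a made-up class name and sample data:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class SortedLetterCounts {

    // Copies `counts` into a LinkedHashMap, which iterates in insertion
    // order, so the result enumerates highest count first.
    static Map<Character, Integer> sortedByCountDescending(Map<Character, Integer> counts) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<Character, Integer>comparingByValue().reversed())
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (first, second) -> first, // keys are unique, so never called
                        LinkedHashMap::new));
    }

    public static void main(String[] args) {
        Map<Character, Integer> lettersCount = new HashMap<>();
        lettersCount.put('a', 3);
        lettersCount.put('b', 7);
        lettersCount.put('c', 1);
        // prints {b=7, a=3, c=1}
        System.out.println(sortedByCountDescending(lettersCount));
    }
}
```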

Resources