LINQ way to count elements in each value of dictionary?

I have a Dictionary<String, String[]> and I'm hoping for a LINQ way to count up all the strings in the values. Currently I'm using a good ol' fashioned for:
int total = 0;
for (int i = 0; i < dict.Count; i++) {
total += dict.ElementAt(i).Value.Length;
}
I was thinking of a Select(), but then I'd have an IEnumerable<String[]>, basically a 2D array of strings. Is there a better way?

Sum [1] should work for that:
int total = dict.Sum(d => d.Value.Length);
Here is a full working demo you can test in LINQPad or a console app if so inclined:
Dictionary<string, string[]> dict = new Dictionary<string, string[]>();
for( int i = 0; i < 10; i++ )
{
dict["s"+i] = Enumerable.Range(0,10).Select(s => s.ToString()).ToArray();
}
int total = dict.Sum(d => d.Value.Length);
Console.Write(total);//100
[1] MSDN: Enumerable.Sum Method

int total = dict.Select(n => n.Value.Length).Sum();
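For comparison, here is a minimal Python sketch of the same idea (Python is used only for illustration, and the names count_values, count_values_flat and groups are made up for this example). It shows that summing the per-key lengths gives the same total as flattening all the arrays first and counting, which in LINQ terms corresponds to SelectMany followed by Count.

```python
def count_values(groups):
    """Sum the length of every value list without building a flat list."""
    return sum(len(values) for values in groups.values())


def count_values_flat(groups):
    """Flatten all value lists into one list first, then count its items."""
    flattened = [item for values in groups.values() for item in values]
    return len(flattened)


# Same shape as the demo above: 10 keys, each holding 10 strings.
groups = {f"s{i}": [str(n) for n in range(10)] for i in range(10)}
print(count_values(groups), count_values_flat(groups))
```

The first form avoids allocating the intermediate flat list, which is why summing the lengths is usually preferable.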

Related

Why won't my random selection without replacement algorithm work?

I have made a very simple algorithm that picks a set of numToPick random numbers from the range 0 to batchMax, without replacement. Then it places each selected number in an array called numsPicked. For some reason I cannot explain, it isn't working on DartPad.
import 'dart:math';
void main() {
print(randNoReplace(2, 9));
}
List<int> randNoReplace(int numToPick, int batchMax) {
List<int> numsPicked = List(numToPick);
List<int> tmpArray = List(batchMax);
//this for loop creates the tmpArray from 0 to batchMax.
for (int i = 0; i <= batchMax; i++) {
tmpArray[i] = i;
}
//this for loop randomly scrambles said tmpArray.
for (int i = 0; i <= batchMax; i++) {
int randIndex = Random().nextInt(batchMax);
int tmp = tmpArray[i];
tmpArray[i] = tmpArray[randIndex];
tmpArray[randIndex] = tmp;
}
//finally, this for loop adds the first numToPick entries of the scrambled tmpArray and adds them to numsPicked.
for (int i = 0; i < numToPick; i++) {
numsPicked[i] = tmpArray[i];
}
return numsPicked;
}
So, for example, with 2 and 9 respectively, this algorithm should theoretically give me 2 random non-duplicate numbers in the range [0, 9].
I think the main issue in your code is that your first two for-loops go from 0 to batchMax, including batchMax. This is a problem since you are using batchMax to specify the size of your tmpArray. Since the index of a List starts at 0, the last valid index is batchMax - 1, so we cannot ask for the element at index batchMax.
So your code should probably be as follows (or tmpArray should be one element bigger; note that with the version below the numbers are drawn from 0 to batchMax - 1):
import 'dart:math';
void main() {
print(randNoReplace(2, 9));
}
List<int> randNoReplace(int numToPick, int batchMax) {
List<int> numsPicked = List(numToPick);
List<int> tmpArray = List(batchMax);
//this for loop fills tmpArray with the values 0 to batchMax - 1.
for (int i = 0; i < batchMax; i++) {
tmpArray[i] = i;
}
//this for loop randomly scrambles said tmpArray.
for (int i = 0; i < batchMax; i++) {
int randIndex = Random().nextInt(batchMax);
int tmp = tmpArray[i];
tmpArray[i] = tmpArray[randIndex];
tmpArray[randIndex] = tmp;
}
//finally, this for loop adds the first numToPick entries of the scrambled tmpArray and adds them to numsPicked.
for (int i = 0; i < numToPick; i++) {
numsPicked[i] = tmpArray[i];
}
return numsPicked;
}
Some other minor comments:
You should probably not initialize a new Random() object each time you need a random number. Instead, you should create one instance and reuse it.
Your code is not making much use of the Dart SDK. In fact, your code could probably be simplified into:
void main() {
print(randNoReplace(2, 9));
}
List<int> randNoReplace(int numToPick, int batchMax) =>
(List.generate(batchMax, (index) => index)..shuffle())
.sublist(0, numToPick);
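As a cross-check of the shuffle-and-slice idea, here is a minimal Python sketch (not Dart; the function name rand_no_replace and the optional rng parameter are assumptions made for this example). Unlike the versions above, it builds the pool with batch_max + 1 entries, so batch_max itself can be picked, which matches the [0, 9] range the question asks for.

```python
import random


def rand_no_replace(num_to_pick, batch_max, rng=None):
    """Pick num_to_pick distinct integers from 0..batch_max (inclusive)."""
    if rng is None:
        rng = random.Random()  # one generator, reused for the whole shuffle
    if not 0 <= num_to_pick <= batch_max + 1:
        raise ValueError("cannot pick more numbers than the range contains")
    pool = list(range(batch_max + 1))  # 0, 1, ..., batch_max
    rng.shuffle(pool)                  # in-place uniform shuffle
    return pool[:num_to_pick]          # the first entries are the picks


print(rand_no_replace(2, 9))
```

In Python, random.sample(range(batch_max + 1), num_to_pick) would give the same kind of result without shuffling the whole pool.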

Create Random Number List With No Repetition

I'm looking to create a list of 'random' numbers from 1 to 15 but without any repetition. I have created an array and looking to store each number in it but can't figure out how to do this. I've gotten as far as creating the random list and storing them in the array but can't quite get to ensure there are no repetitions. Any help would be appreciated. My code is as follows:
int[] myList = new int[15];
Random random = new Random();
for (int i = 0; i < myList.Length; i++)
{
myList[i] = random.Next(1, 15);
}
Because the size of your list is equal to the number of possible values, you can just create the list in normal order:
int[] myList = new int[15];
for (int i = 0; i < myList.Length; i++)
{
myList[i] = i + 1;
}
and then shuffle it, for example by assigning a random value to each entry and sorting by that value:
Random random = new Random();
myList = myList.OrderBy(a => random.Next()).ToArray();
You can do it using the Fisher–Yates shuffle.
Sample implementation:
int n = 15;
int[] myList = new int[n];
Random random = new Random();
for (int i = 0; i < n; i++)
{
myList[i] = i + 1;
}
for (int i = n - 1; i >= 1; i--)
{
int j = random.Next(0, i + 1); // upper bound is exclusive, so j is in [0, i]
int temp=myList[i];
myList[i]=myList[j];
myList[j]=temp;
}
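The index range for j is the part of Fisher–Yates that is easiest to get wrong, so here is a minimal Python sketch of the same loop (Python is used only for illustration; the name fisher_yates and the optional rng parameter are assumptions for this example). Each position i, walking from the end, is swapped with a position j chosen uniformly from 0 to i inclusive, so an element may stay where it is.

```python
import random


def fisher_yates(n, rng=None):
    """Return the numbers 1..n in a uniformly random order."""
    if rng is None:
        rng = random.Random()
    items = list(range(1, n + 1))
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends: 0 <= j <= i
        items[i], items[j] = items[j], items[i]
    return items


print(fisher_yates(15))
```

Excluding i or 0 from the range of j still produces a permutation, but no longer every permutation with equal probability.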
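You need to get the algorithm right.
Start from i=15.
Pick a random position among the first i entries of the original list.
Append the value at that position to the new list.
Swap that value with the one at index i-1, so it drops out of the remaining range.
Decrement i by 1.
Repeat the above steps until i reaches 0.
The code for the above can be: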
int[] myList = new int[15];
int[] original_list = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
Random random = new Random();
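for (int i = original_list.Length; i > 0; i--)
{
int randNo = random.Next(0, i); // upper bound is exclusive: index in [0, i-1]
myList[original_list.Length - i] = original_list[randNo];
// swap the picked value with the last entry of the remaining range
int temp = original_list[i - 1];
original_list[i - 1] = original_list[randNo];
original_list[randNo] = temp;
}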

How to set a number string into a real string formation?

I'm using NPOI to handle Excel export, and here's part of my code (.NET):
int rowIndex = 1;
for (int i = 0; i < dt.Rows.Count; i++)
{
IRow dataRow = sheet.CreateRow(rowIndex);
for (int j = 0; j < cellCount; j++)
{
cell = dataRow.CreateCell(j,CellType.String);
cell.SetCellValue(new HSSFRichTextString(dt.Rows[i][j].ToString()));
}
rowIndex++;
}
What surprises me is that a value whose number string is "20150525" gets displayed in the "2015……E+10" format (scientific notation). However, I want to keep it as a string value. How can I do that?
Thanks!
In fact we have to set a CellStyle; a sample snippet is below:
IRow row = book[0].CreateRow(rowIndex + 1);
ICell rowCell = null;
rowCell = row.CreateCell(colIndex);
rowCell.SetCellValue(realCellValue);
ICellStyle cellStyle = book.CreateCellStyle();
cellStyle.DataFormat = HSSFDataFormat.GetBuiltinFormat("#");
rowCell.CellStyle = cellStyle;

Most efficient way to search through rows and change values in columns of a large array (50,000 by 50,000)

Say there is a large two-dimensional array (50,000 by 50,000). Essentially all the elements in the array are 1s and 0s. I am trying to figure out an efficient way to change values in columns (1s to 0s and 0s to 1s) and, more importantly, to see which rows have matching elements (a row with all the same values).
I suppose you could do something like
int[][] data = new int[1000][1000];
// populate
final Random random = new Random();
for (int i = 0; i < data.length; i++) {
for (int j = 0; j < data[i].length; j++) {
final int value = random.nextInt();
data[i][j] = ((value % 2 == 0) ? 1 : 0);
}
}
final List<BigInteger> bitStrings = new ArrayList<>();
StringBuilder builder = null;
for (int i = 0; i < data.length; i++) {
builder = new StringBuilder();
for (int j = 0; j < data[i].length; j++) {
final int value = data[i][j];
builder.append(value);
}
final BigInteger bitString = new BigInteger(builder.toString(), 2);
bitStrings.add(bitString);
}
for (final BigInteger bitString : bitStrings) {
System.out.println(bitString.toString(2));
}
Then use BigInteger's bitwise operations (for example xor, flipBit and bitCount) on those values.
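To make the bit-string idea concrete, here is a minimal Python sketch (not Java; Python integers stand in for BigInteger, and the helper names pack_row, flip_column and uniform_rows are made up for this example). Each row is packed into one integer, flipping a column becomes an XOR with a one-bit mask, and a row with all the same values is either 0 or the all-ones mask.

```python
def pack_row(bits):
    """Pack a sequence of 0/1 values into one integer, first value highest."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def flip_column(rows, column, width):
    """Flip the given column (0 = leftmost) in every packed row."""
    mask = 1 << (width - 1 - column)
    return [row ^ mask for row in rows]


def uniform_rows(rows, width):
    """Return the indexes of rows whose bits are all 0 or all 1."""
    all_ones = (1 << width) - 1
    return [i for i, row in enumerate(rows) if row == 0 or row == all_ones]


rows = [pack_row([0, 0, 0]), pack_row([1, 0, 1]), pack_row([1, 1, 1])]
print(uniform_rows(rows, 3))                     # rows 0 and 2 are uniform
print(uniform_rows(flip_column(rows, 1, 3), 3))  # after the flip only row 1 is
```

The width is passed explicitly because leading zeros are not stored in the integer itself.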

Longest Common Subsequence among 3 Strings

I've implemented the dynamic programming solution to find the longest common subsequence among 2 strings. There is apparently a way to generalize this algorithm to find the LCS among 3 strings, but in my research I have not found any information on how to go about this. Any help would be appreciated.
To find the Longest Common Subsequence (LCS) of 2 strings A and B, you can traverse a 2-dimensional array diagonally, as shown in the link you posted. Every element in the array corresponds to the problem of finding the LCS of the prefixes A' and B' (A cut at its row number, B cut at its column number). The problem is solved by calculating the value of every element in the array. You must be certain that when you calculate the value of an array element, all sub-problems required to calculate that value have already been solved. That is why you traverse the 2-dimensional array diagonally.
This solution can be scaled to finding the longest common subsequence between N strings, but this requires a general way to iterate an array of N dimensions such that any element is reached only when all sub-problems it depends on have been solved.
Instead of iterating the N-dimensional array in a special order, you can also solve the problem recursively. With recursion it is important to save the intermediate solutions, since many branches will require the same intermediate solutions. I have written a small example in C# that does this:
string lcs(string[] strings)
{
if (strings.Length == 0)
return "";
if (strings.Length == 1)
return strings[0];
int max = -1;
int cacheSize = 1;
for (int i = 0; i < strings.Length; i++)
{
cacheSize *= strings[i].Length;
if (strings[i].Length > max)
max = strings[i].Length;
}
string[] cache = new string[cacheSize];
int[] indexes = new int[strings.Length];
for (int i = 0; i < indexes.Length; i++)
indexes[i] = strings[i].Length - 1;
return lcsBack(strings, indexes, cache);
}
string lcsBack(string[] strings, int[] indexes, string[] cache)
{
for (int i = 0; i < indexes.Length; i++ )
if (indexes[i] == -1)
return "";
bool match = true;
for (int i = 1; i < indexes.Length; i++)
{
if (strings[0][indexes[0]] != strings[i][indexes[i]])
{
match = false;
break;
}
}
if (match)
{
int[] newIndexes = new int[indexes.Length];
for (int i = 0; i < indexes.Length; i++)
newIndexes[i] = indexes[i] - 1;
string result = lcsBack(strings, newIndexes, cache) + strings[0][indexes[0]];
cache[calcCachePos(indexes, strings)] = result;
return result;
}
else
{
string[] subStrings = new string[strings.Length];
for (int i = 0; i < strings.Length; i++)
{
if (indexes[i] <= 0)
subStrings[i] = "";
else
{
int[] newIndexes = new int[indexes.Length];
for (int j = 0; j < indexes.Length; j++)
newIndexes[j] = indexes[j];
newIndexes[i]--;
int cachePos = calcCachePos(newIndexes, strings);
if (cache[cachePos] == null)
subStrings[i] = lcsBack(strings, newIndexes, cache);
else
subStrings[i] = cache[cachePos];
}
}
string longestString = "";
int longestLength = 0;
for (int i = 0; i < subStrings.Length; i++)
{
if (subStrings[i].Length > longestLength)
{
longestString = subStrings[i];
longestLength = longestString.Length;
}
}
cache[calcCachePos(indexes, strings)] = longestString;
return longestString;
}
}
int calcCachePos(int[] indexes, string[] strings)
{
int factor = 1;
int pos = 0;
for (int i = 0; i < indexes.Length; i++)
{
pos += indexes[i] * factor;
factor *= strings[i].Length;
}
return pos;
}
My code example can be optimized further. Many of the strings being cached are duplicates, and some are duplicates with just one additional character added. This uses more space than necessary when the input strings become large.
On input: "666222054263314443712", "5432127413542377777", "6664664565464057425"
The LCS returned is "54442"
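For the specific case of 3 strings, the memoized recursion can be written much more compactly. The following is a minimal Python sketch, not a port of the C# code above (the name lcs3 is an assumption for this example, and functools.lru_cache takes the place of the hand-rolled cache array). If the last characters of all three prefixes match, that character is part of the answer; otherwise the best result among the three "drop one character" sub-problems is kept.

```python
from functools import lru_cache


def lcs3(a, b, c):
    """Return one longest common subsequence of the strings a, b and c."""

    @lru_cache(maxsize=None)
    def solve(i, j, k):
        # i, j, k are the indexes of the last character of each prefix.
        if i < 0 or j < 0 or k < 0:
            return ""
        if a[i] == b[j] == c[k]:
            return solve(i - 1, j - 1, k - 1) + a[i]
        candidates = (
            solve(i - 1, j, k),
            solve(i, j - 1, k),
            solve(i, j, k - 1),
        )
        return max(candidates, key=len)

    return solve(len(a) - 1, len(b) - 1, len(c) - 1)


print(lcs3("AGGT12", "12TXAYB", "12XBA"))
```

The recursion depth grows with the combined length of the inputs, so for long strings an iterative 3-dimensional table would be the safer choice. When several subsequences share the maximum length, this sketch returns just one of them.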
