Best way to refactor current method - refactoring

I tried really hard to refactor this code, but was unsuccessful. Please tell me how to go about it. I have spent hours trying to find a solution. I have read some excerpts from the book Clean Code; however, being a beginner, I find it really hard to refactor. Sorry, this is my first honest attempt, but I am not able to figure out how to get this function down to about 4 lines or fewer.
public boolean[] validateTrueFalse(String[] checkBoxValues) {
boolean[] answer = new boolean[checkBoxValues.length];
for (int i = 0; i < checkBoxValues.length; i++) {
// values are like 1_true
String[] values = checkBoxValues[i].split("_"); // split each value
// from my array
int configId = Integer.parseInt(values[0]);
boolean isAns = Boolean.parseBoolean(values[1]);
for (TrueFalseConfigurationModel tm : dt.getTfModelList()) {
if (tm.getConfiguration_id() == configId) {
if (tm.isAnswer() == isAns) { // are values from both true
answer[i] = true;
} else {
answer[i] = false;
}
}
}
}
return answer;
}

Remember that shorter doesn't necessarily mean better. A longer method is often more readable and easier to understand and maintain. You will sometimes need to look at your code a year or two after you first wrote it, and it isn't worth much if you made it so short that you can no longer tell what the method was meant to do. Of course, the other extreme should also be avoided: a method that is too long is not modular and can be difficult to understand when you want to change only a specific part of it.
In my opinion, that method you wrote is at a good length and doesn't need to be shortened.
But just to answer your question: you can always shorten your methods by dividing them into more methods. For example, in your case:
public boolean[] validateTrueFalse(String[] checkBoxValues) {
boolean[] answer = new boolean[checkBoxValues.length];
for (int i = 0; i < checkBoxValues.length; i++) {
answer[i] = GetAnswer(checkBoxValues[i]);
}
return answer;
}
public boolean GetAnswer(String aCheckBoxValue)
{
String[] values = aCheckBoxValue.split("_");
int configId = Integer.parseInt(values[0]);
boolean isAns = Boolean.parseBoolean(values[1]);
for (TrueFalseConfigurationModel tm : dt.getTfModelList())
{
if (tm.getConfiguration_id() == configId)
{
return tm.isAnswer() == isAns;
}
}
return false;
}
Notice how I divided the one big action in the method into smaller actions, which gave shorter methods. You can continue in that manner and divide the GetAnswer method itself into two methods if you can find a logical way to divide it.
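One possible way to take that last step is to separate parsing the "1_true" string from looking up the expected answer. This is only a sketch: the class name TrueFalseValidator, the nested ConfigModel class (a stand-in for TrueFalseConfigurationModel), and the method names isCorrect and matchesExpected are assumptions made up for illustration, not part of the original code.

```java
import java.util.List;

final class TrueFalseValidator {
    // Hypothetical stand-in for the question's TrueFalseConfigurationModel.
    static final class ConfigModel {
        final int configurationId;
        final boolean answer;

        ConfigModel(int configurationId, boolean answer) {
            this.configurationId = configurationId;
            this.answer = answer;
        }
    }

    private final List<ConfigModel> models;

    TrueFalseValidator(List<ConfigModel> models) {
        this.models = models;
    }

    // Validates every submitted check box value.
    boolean[] validateTrueFalse(String[] checkBoxValues) {
        boolean[] answers = new boolean[checkBoxValues.length];
        for (int i = 0; i < checkBoxValues.length; i++) {
            answers[i] = isCorrect(checkBoxValues[i]);
        }
        return answers;
    }

    // Parses one value of the form "1_true" and checks it.
    private boolean isCorrect(String checkBoxValue) {
        String[] parts = checkBoxValue.split("_");
        int configId = Integer.parseInt(parts[0]);
        boolean submitted = Boolean.parseBoolean(parts[1]);
        return matchesExpected(configId, submitted);
    }

    // Compares the submitted answer with the first model that has the given id;
    // returns false when no model has that id.
    private boolean matchesExpected(int configId, boolean submitted) {
        for (ConfigModel model : models) {
            if (model.configurationId == configId) {
                return model.answer == submitted;
            }
        }
        return false;
    }
}
```

Each method now does one thing, but whether three short methods read better than the original single method is still a judgment call.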

You can reduce
if (tm.isAnswer() == isAns) { // are values from both true
answer[i] = true;
} else {
answer[i] = false;
}
to
answer[i] = tm.isAnswer() == isAns;

Related

Two sum data structure problems

I built a data structure for the two sum problem. It has an add method and a find method.
add - Add the number to an internal data structure.
find - Find if there exists any pair of numbers which sum is equal to the value.
For example:
add(1); add(3); add(5);
find(4) // return true
find(7) // return false
The following is my code. What is wrong with it? Some test cases on the test website do not pass:
http://www.lintcode.com/en/problem/two-sum-data-structure-design/
public class TwoSum {
private List<Integer> sets;
TwoSum() {
this.sets = new ArrayList<Integer>();
}
// Add the number to an internal data structure.
public void add(int number) {
// Write your code here
this.sets.add(number);
}
// Find if there exists any pair of numbers which sum is equal to the value.
public boolean find(int value) {
// Write your code here
Collections.sort(sets);
for (int i = 0; i < sets.size(); i++) {
if (sets.get(i) > value) break;
for (int j = i + 1; j < sets.size(); j++) {
if (sets.get(i) + sets.get(j) == value) {
return true;
}
}
}
return false;
}
}
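The logic is mostly sound, with one caveat: the early exit if (sets.get(i) > value) break; is only valid when all numbers are non-negative. With negative numbers it can skip a valid pair; for example, after add(-2) and add(-1), find(-3) returns false.
A coding challenge may also require a more performant solution. You check every item against every other item, which takes O(N^2) per find, on top of sorting the list on every call.
A better way to implement find is to use a HashMap, which takes O(N). It's explained in more detail here.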
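A minimal sketch of the hash-based approach, assuming the same add/find contract as in the question. The field name counts and the use of a count per number (to handle the same value added twice) are choices made for this example, not something the challenge site prescribes.

```java
import java.util.HashMap;
import java.util.Map;

final class TwoSum {
    // Maps each added number to how many times it has been added.
    private final Map<Integer, Integer> counts = new HashMap<>();

    // Add the number to the internal data structure.
    public void add(int number) {
        counts.merge(number, 1, Integer::sum);
    }

    // Find if there exists any pair of numbers whose sum equals the value.
    public boolean find(int value) {
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            int first = entry.getKey();
            long needed = (long) value - first; // long avoids int overflow
            if (needed < Integer.MIN_VALUE || needed > Integer.MAX_VALUE) {
                continue; // no int could complete the pair
            }
            int second = (int) needed;
            if (second == first) {
                // The same number can only pair with itself if it was added twice.
                if (entry.getValue() >= 2) {
                    return true;
                }
            } else if (counts.containsKey(second)) {
                return true;
            }
        }
        return false;
    }
}
```

Here add is O(1) on average and find is O(N) in the number of distinct values, with no sorting and no early exit that depends on the sign of the numbers.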

Comparing each string in a datatable with a list takes a long time (poor performance)

I have a datatable of 200,000 rows and want to validate each row's string against the codes in a list, codesList.
It is taking a very long time, and I want to improve the performance.
for (int i = 0; i < dataTable.Rows.Count; i++)
{
bool isCodeValid = CheckIfValidCode(codevar, codesList,out CodesCount);
}
private bool CheckIfValidCode(string codevar, List<Codes> codesList, out int count)
{
    // Scans the whole list on every call and materializes all matches.
    List<Codes> tempcodes = codesList.Where(code => code.StdCode.Equals(codevar)).ToList();
    count = tempcodes.Count;
    bool bRetVal;
    if (tempcodes.Count == 0)
    {
        bRetVal = false;
    }
    else
    {
        bRetVal = true;
    }
    return bRetVal;
}
codesList is a list which also contains 200,000 records. Please suggest improvements. I tried FindAll, which takes the same time, and a LINQ query, which also takes the same time.
A few optimizations come to mind:
You could start by removing the ToList() altogether.
Replace the Count check with .Any(), which returns true as soon as there is an item in the result.
It's probably also a lot faster when you replace the List with a HashSet<Codes> (this requires your Codes class to implement GetHashCode and Equals properly). Alternatively, you could populate a HashSet<string> with the contents of Codes.StdCode.
It looks like you're not using the out count at all. Removing it would make this method a lot faster, since computing a count requires you to check all codes.
You could also split the List into a Dictionary<char, List<Codes>> which you populate by taking the first character of the code. That would reduce the number of codes to check drastically, since you can exclude 95% of the codes by their first character.
Tell string.Equals to use a StringComparison of type Ordinal or OrdinalIgnoreCase to speed up the comparison.
It looks like you can stop processing a lot earlier as well; the use of .Any takes care of that in the second method. A similar construct can be used in the first: instead of using for and looping through each row, you could short-circuit after the first failure is found (unless this code is incomplete and you mark each row as invalid individually).
Something like:
private bool CheckIfValidCode(string codevar, List<Codes> codesList)
{
    HashSet<string> codes = new HashSet<string>(codesList.Select(c => c.StdCode));
    return codes.Contains(codevar);
    // or: return codes.Any(c => string.Equals(codevar, c, StringComparison.Ordinal));
}
If you're adamant about the count:
private bool CheckIfValidCode(string codevar, List<Codes> codesList, out int count)
{
    HashSet<string> codes = new HashSet<string>(codesList.Select(c => c.StdCode));
    // A HashSet holds distinct values, so the count is either 0 or 1.
    count = codes.Contains(codevar) ? 1 : 0;
    // or: count = codes.Count(c => string.Equals(codevar, c, StringComparison.Ordinal));
    return count > 0;
}
You can optimize further by creating the HashSet outside of the call and reusing the instance:
InCallingCode
{
    ...
    HashSet<string> codes = new HashSet<string>(codesList.Select(c => c.StdCode));
    for (/*loop*/) {
        bool isValid = CheckIfValidCode(codevar, codes, out int count);
    }
    ....
}
private bool CheckIfValidCode(string codevar, HashSet<string> codes, out int count)
{
    // A HashSet holds distinct values, so the count is either 0 or 1.
    count = codes.Contains(codevar) ? 1 : 0;
    // or: count = codes.Count(c => string.Equals(codevar, c, StringComparison.Ordinal));
    return count > 0;
}

How to convert this to Linq?

I got another Linq problem.. Because I'm not really sure if there is another way to do this. Here is what I want to convert:
class ID
{
public string name {get; set;}
public int id {get; set;}
}
ID[] num1 = new ID[2] { {"david",1} , {"mark",2} };
ID[] num2 = new ID[3] { {"david",1} , {"david",2} };
for(int i = 0; i < num1.Length; i++)
{
for(int j = 0; j < num2.Length; j++)
{
if(num1.name.Equals(num2.name) && num1.num == num2.num)
{
Console.Writeline("name: " + num1.name + " id: " + num1.id);
//Do something
break; //to avoid useless iterations once found
}
}
}
It's not perfect code, but hopefully it captures what I want to do. Currently I am implementing this in Linq like this:
var match =
from a in num1
from b in num2
where (a.name.Equals(b.name) && a.num == b.num)
select a;
//do something with match
I'm pretty new to Linq, so I'm not sure if this is the best way to do it or whether there is a simpler way. It seems like I'm just converting it to Linq while the code essentially does the same thing.
Thank you!
The Linq code you wrote is already on the right track to solve the problem, though it is not the only way to solve it.
Instead of using a where clause, you could override the Equals method on the class (together with GetHashCode, which Intersect relies on), or implement an IEqualityComparer<Number>. Then you could use the Intersect Linq method.
Something like this:
public class Number
{
public override bool Equals(object other)
{
var otherAsNumber = other as Number;
return otherAsNumber != null
&& (otherAsNumber.Name == null
? this.Name == null
: otherAsNumber.Name.Equals(this.Name)
)
&& otherAsNumber.Num == this.Num
;
}
// ...
}
// ...
var result = num1.Intersect(num2);
foreach(var item in result)
{
// Do something
}
This of course assumes that you've fixed your code so that it compiles, and so that num1 and num2 refer to collections of Number classes, instead of individual Number instances. There are a lot of problems in the code you wrote, so I'll leave fixing that problem to you.

LINQ Partition List into Lists of 8 members [duplicate]

This question already has answers here:
Split List into Sublists with LINQ
(34 answers)
Closed 10 years ago.
How would one take a List (using LINQ) and break it into a List of Lists partitioning the original list on every 8th entry?
I imagine something like this would involve Skip and/or Take, but I'm still pretty new to LINQ.
Edit: Using C# / .Net 3.5
Edit2: This question is phrased differently than the other "duplicate" question. Although the problems are similar, the answers in this question are superior: the "accepted" answer is very solid (with the yield statement), as is Jon Skeet's suggestion to use MoreLinq (which is not recommended in the "other" question). Sometimes duplicates are good in that they force a re-examination of a problem.
Use the following extension method to break the input into subsets:
public static class IEnumerableExtensions
{
public static IEnumerable<List<T>> InSetsOf<T>(this IEnumerable<T> source, int max)
{
List<T> toReturn = new List<T>(max);
foreach(var item in source)
{
toReturn.Add(item);
if (toReturn.Count == max)
{
yield return toReturn;
toReturn = new List<T>(max);
}
}
if (toReturn.Any())
{
yield return toReturn;
}
}
}
We have just such a method in MoreLINQ as the Batch method:
// As IEnumerable<IEnumerable<T>>
var items = list.Batch(8);
or
// As IEnumerable<List<T>>
var items = list.Batch(8, seq => seq.ToList());
You're better off using a library like MoreLinq, but if you really had to do this using "plain LINQ", you can use GroupBy:
var sequence = new[] {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};
var result = sequence.Select((x, i) => new {Group = i/8, Value = x})
.GroupBy(item => item.Group, g => g.Value)
.Select(g => g.Where(x => true));
// result is: { {1,2,3,4,5,6,7,8}, {9,10,11,12,13,14,15,16} }
Basically, we use the version of Select() that provides an index for the value being consumed, we divide the index by 8 to identify which group each value belongs to. Then we group the sequence by this grouping key. The last Select just reduces the IGrouping<> down to an IEnumerable<IEnumerable<T>> (and isn't strictly necessary since IGrouping is an IEnumerable).
It's easy enough to turn this into a reusable method by factoring out the constant 8 in the example and replacing it with a specified parameter.
It's not necessarily the most elegant solution, and it is no longer a lazy, streaming solution ... but it does work.
You could also write your own extension method using iterator blocks (yield return) which could give you better performance and use less memory than GroupBy. This is what the Batch() method of MoreLinq does IIRC.
It's not at all what the original Linq designers had in mind, but check out this misuse of GroupBy:
public static IEnumerable<IEnumerable<T>> BatchBy<T>(this IEnumerable<T> items, int batchSize)
{
var count = 0;
return items.GroupBy(x => (count++ / batchSize)).ToList();
}
[TestMethod]
public void BatchBy_breaks_a_list_into_chunks()
{
var values = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var batches = values.BatchBy(3);
batches.Count().ShouldEqual(4);
batches.First().Count().ShouldEqual(3);
batches.Last().Count().ShouldEqual(1);
}
I think it wins the "golf" prize for this question. The ToList is very important since you want to make sure the grouping has actually been performed before you try doing anything with the output. If you remove the ToList, you will get some weird side effects.
Take won't be very efficient, because it doesn't remove the entries taken.
Why not use a simple loop:
public static IEnumerable<IList<T>> Partition<T>(this /* <-- see extension methods */ IEnumerable<T> src, int num)
{
    // Dispose the enumerator once iteration ends.
    using (IEnumerator<T> enu = src.GetEnumerator())
    {
        while (true)
        {
            List<T> result = new List<T>(num);
            for (int i = 0; i < num; i++)
            {
                if (!enu.MoveNext())
                {
                    // Return the last, partially filled chunk if it has any items.
                    if (i > 0) yield return result;
                    yield break;
                }
                result.Add(enu.Current);
            }
            yield return result;
        }
    }
}
from b in Enumerable.Range(0,8) select items.Where((x,i) => (i % 8) == b);
The simplest solution is given by Mel:
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> items,
int partitionSize)
{
int i = 0;
return items.GroupBy(x => i++ / partitionSize).ToArray();
}
Concise but slower. The above method splits an IEnumerable into chunks of desired fixed size with total number of chunks being unimportant. To split an IEnumerable into N number of chunks of equal sizes or close to equal sizes, you could do:
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> items,
int numOfParts)
{
int i = 0;
return items.GroupBy(x => i++ % numOfParts);
}
To speed up things, a straightforward approach would do:
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> items,
int partitionSize)
{
if (partitionSize <= 0)
throw new ArgumentOutOfRangeException("partitionSize");
int innerListCounter = 0;
int numberOfPackets = 0;
foreach (var item in items)
{
innerListCounter++;
if (innerListCounter == partitionSize)
{
yield return items.Skip(numberOfPackets * partitionSize).Take(partitionSize);
innerListCounter = 0;
numberOfPackets++;
}
}
if (innerListCounter > 0)
yield return items.Skip(numberOfPackets * partitionSize);
}
This is faster than anything currently on planet now :) The equivalent methods for a Split operation here

What does ExpressionVisitor.Visit<T> Do?

Before someone shouts out the answer, please read the question through.
What is the purpose of the method in .NET 4.0's ExpressionVisitor:
public static ReadOnlyCollection<T> Visit<T>(ReadOnlyCollection<T> nodes, Func<T, T> elementVisitor)
My first guess as to the purpose of this method was that it would visit each node in each tree specified by the nodes parameter and rewrite the tree using the result of the elementVisitor function.
This does not appear to be the case. Actually this method appears to do a little more than nothing, unless I'm missing something here, which I strongly suspect I am...
I tried to use this method in my code and when things didn't work out as expected, I reflectored the method and found:
public static ReadOnlyCollection<T> Visit<T>(ReadOnlyCollection<T> nodes, Func<T, T> elementVisitor)
{
T[] list = null;
int index = 0;
int count = nodes.Count;
while (index < count)
{
T objA = elementVisitor(nodes[index]);
if (list != null)
{
list[index] = objA;
}
else if (!object.ReferenceEquals(objA, nodes[index]))
{
list = new T[count];
for (int i = 0; i < index; i++)
{
list[i] = nodes[i];
}
list[index] = objA;
}
index++;
}
if (list == null)
{
return nodes;
}
return new TrueReadOnlyCollection<T>(list);
}
So where would someone actually go about using this method? What am I missing here?
Thanks.
It looks to me like a convenience method to apply an arbitrary transform function to a collection of expression nodes and return the resulting transformed collection, or the original collection if there is no change.
I can't see how this pattern is any different from a standard expression visitor, except that it uses a function instead of a visitor type.
As for usage:
Expression<Func<int, int, int>> addLambdaExpression= (a, b) => a + b;
// Change add to subtract
Func<Expression, Expression> changeToSubtract = e =>
{
if (e is BinaryExpression)
{
return Expression.Subtract((e as BinaryExpression).Left,
(e as BinaryExpression).Right);
}
else
{
return e;
}
};
var nodes = new Expression[] { addLambdaExpression.Body }.ToList().AsReadOnly();
var subtractExpression = ExpressionVisitor.Visit(nodes, changeToSubtract);
You don't explain how you expected it to behave and why therefore you think it does little more than nothing.
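To make the behavior of the decompiled method in the question easier to see, here is a rough, language-neutral sketch of the same copy-on-first-change pattern, written in Java rather than C#. The class name CopyOnChangeVisit and the use of List and UnaryOperator are assumptions made for this illustration; this is not the .NET implementation. It applies the function to each element of the collection (it does not recurse into child nodes by itself) and only allocates a new collection once some element actually changes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.UnaryOperator;

final class CopyOnChangeVisit {
    // Applies elementVisitor to every element. Returns the original list
    // instance when no element was replaced, otherwise a new read-only list.
    static <T> List<T> visit(List<T> nodes, UnaryOperator<T> elementVisitor) {
        List<T> changed = null; // allocated only once some element is replaced
        for (int i = 0; i < nodes.size(); i++) {
            T original = nodes.get(i);
            T visited = elementVisitor.apply(original);
            if (changed != null) {
                changed.add(visited);
            } else if (visited != original) { // reference comparison, as in the decompiled code
                // First change: copy the untouched prefix, then the new element.
                changed = new ArrayList<>(nodes.subList(0, i));
                changed.add(visited);
            }
        }
        return changed == null ? nodes : Collections.unmodifiableList(changed);
    }
}
```

Seen this way, the method is a helper for visitors that want to avoid rebuilding a parent node when none of its children changed: the caller can compare the returned collection with the input by reference.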
