I am working on a project with about 8 other people, and would like to know the best code practice here, given that other people will be working on this code for years to come.
Say I have an enum with 10 values:
typedef enum {
Tag1 = 1,
Tag2,
Tag3,
Tag4,
Tag5,
Tag6,
Tag7,
Tag8,
Tag9,
Tag10
} Tag;
If I wanted to check if a tag is equal to Tag6, Tag7, Tag8, Tag9, or Tag10, is it good practice to use a comparison like:
if(myTag >= Tag6 && myTag <= Tag10) {
//Do something
}
?
Or is it best to use an OR and check for each tag?
Using >= and <= looks nicer and is less clunky, but if down the line, someone were to insert a new Tag between Tag7 and Tag8, it would mess up all the logic.
Can I expect that someone wouldn't add a new Tag between other Tags?
Yes, but only for enums that express a scale of values, for instance:
enum Priority {
None = 0,
Low,
Medium,
High,
Critical
}
Then this code makes sense and is readable:
if(message.Priority >= Priority.Medium) {
// Notify user
}
If the enum doesn't express a scale like this then avoid using < or > as they can be rather confusing. Use bit flags instead.
Flags enums use binary values so that values can be combined:
enum UserAudiences {
// Basic values: dec // binary
None = 0, // 0000
Client = 1, // 0001
Employee = 2, // 0010
Contractor = 4, // 0100
Key = 8, // 1000
// Combined: dec // binary
KeyClient = 9, // 1001 : Key + Client
BoardMember = 10, // 1010 : Key + Employee
CounterParty = 5, // 0101 : Client + Contractor
BusinessPartner = 13 // 1101 : Key + Client + Contractor
}
Then, when checking for a combined enum value, we look at the binary number and whether the appropriate bit is set. For instance, if we want to check for UserAudiences.Employee we can just look at the bit that represents 2; if it is set, then we have one of the enum values that includes it:
if((message.Audience & UserAudiences.Employee) != 0) {
// Post on intranet
} else {
// Send externally
}
There's no way to set that bit through any combination of Key, Client or Contractor enums, it can only be set if Employee is one of the 'source' enums.
Most languages have helpers for this (or you can write your own):
if(message.Audience.HasFlag(UserAudiences.Employee)) { ...
The maths could work in any base - you could use 1, 10, 100, etc in decimal. However, you'd need much bigger numbers much sooner.
Finally, there's a convention to use singular names for regular enums, and plural names for flagged enums, hinting to the programmer whether to use equality or bitwise checks.
Can I expect that someone wouldn't add a new Tag between other Tags?
I wouldn't bet on it. Unless the enum's ordinal/underlying values have some inherent meaning or order, I would avoid relying on them too much.
I would only use range checks if I actually want somebody to be able to insert additional enums without adapting all the checks. This is probably a rather rare case though. Keith gives a good example with the Priority enum; another example I can think of is log levels.
The exact syntax depends on the language of course but I would usually consider something like this as most readable:
if(myTag in [Tag6, Tag7, Tag8]) {
// ...
}
Or even better use some describing variable names which make it obvious what the other tags are:
topTags = [Tag6, Tag7, Tag8]
if(myTag in topTags) {
// ...
}
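In C#, for example, that check might look like this (just a sketch; a small array with Contains would work equally well):
var topTags = new HashSet<Tag> { Tag.Tag6, Tag.Tag7, Tag.Tag8 };
if (topTags.Contains(myTag))
{
    // ...
}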
Related
Frameworks I've seen before allow you to pass a chain of multiple constants via a single parameter, like so, I believe:
foo(FLAG_A | FLAG_B | FLAG_C);
They act like boolean so the function knows which flags have been given.
Now I want to implement something like that.
What is this concept called?
It's based on binary-ORing. Normally, each of the symbolic constants will be just one distinct bit, e.g., as in:
enum {
FLAG_A = 1,
FLAG_B = 1<<1,
FLAG_C = 1<<2,
};
so that you can then combine them with |, test for each individual one with &, and remove one set of flags from another with & ~.
In .Net, these are defined by an enum using the FlagsAttribute:
[Flags()]
public enum Foo
{
Bit1 = 1,
Bit2 = 2,
Bit4 = 4,
Bit8 = 8,
etc.
}
// Or define using explicit binary syntax
[Flags()]
public enum Foo
{
Bit1 = 0b_0000_0001,
Bit2 = 0b_0000_0010,
Bit4 = 0b_0000_0100,
Bit8 = 0b_0000_1000,
etc.
}
And to utilise:
SomeFunction(Foo.Bit1 | Foo.Bit4 | etc);
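For illustration, here is how the combine / test / remove operations mentioned earlier look with this Foo enum (a minimal sketch):
Foo flags = Foo.Bit1 | Foo.Bit4;           // combine
bool hasBit4 = (flags & Foo.Bit4) != 0;    // test an individual flag
flags &= ~Foo.Bit1;                        // remove a flag
// .NET also provides: flags.HasFlag(Foo.Bit4)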
I would suggest that your current name (Flags) seems to be the most appropriate definition, at least in this context.
Apparently "flags and bitmasks" are the right keywords to find more about this; "Flags" alone didn't turn up much before. Many thanks for the explanatory answers nevertheless!
I am attempting to sort a CSV file by specifying which column order to sort in:
for example: ./csort 3, 1, 5 < DATA > SORTED_DATA
or ./csort 3, 4, 6, 2, 1, 5 < DATA ...
example line of DATA: 177,27,2,42,285,220
I used a vector split(string str) function to store the columns specified in the arguments which require sorting. Creating a vector:
vector<int> columns {3, 1, 5}; // for example
Not entirely sure how to use this columns vector to proceed with the sorting process; though, I am aware that I could use sort.
sort(v.begin(), v.end(), myfunction);
As I understand your question, you have already parsed your data into 4 vectors, 1 vector per column, and you want to be able to sort your data, specifying the precedence of the columns to sort by -- i.e. sort by col1, then col3, then col4...
What you want to do isn't too difficult, but you'll have to backtrack a bit. There are multiple ways to approach the problem, but here's a rough outline. Based on the level of expertise you exhibit in your question, you might have to look up a few terms in the following outline, but if you do you'll have a good flexible solution to your problem.
You want to store your data by row, since you want to sort rows... 4 vectors for 4 columns won't help you here. If all 4 elements in the row are going to be of the same type, you could use a std::vector or std::array for the row. std::array is solid if the number of columns is known at compile time, std::vector for runtime. If the types are inhomogeneous, you could use a tuple, or a struct. Whatever type you use, let's call it RowT.
Parse and store into your rows, make a vector of RowT.
Define a function object which provides operator() taking a left-hand and a right-hand RowT. It must implement the "less than" operation following the precedence you want. Let's call that class CustomSorter.
Once you have that in place, your final sort will be:
CustomSorter cs(/*precedence arguments*/);
std::sort(rows.begin(), rows.end(), cs);
Everything is really straightforward; a basic example can be seen here in the customsort example. In my experience the only part you will have to work at is the sort algorithm itself.
The easiest way is to use a class that has a list of indexes as a member, and go through the list in order to see if the item is less than the other.
#include <string>
#include <vector>

class VecLess
{
    std::vector<int> indexes;
public:
    VecLess(std::vector<int> init) : indexes(init)
    {
    }
    bool operator()(const std::vector<std::string> & lhs, const std::vector<std::string> & rhs) const
    {
        // Compare column by column, in the order given by 'indexes'.
        for (auto i = indexes.begin(); i != indexes.end(); ++i)
        {
            if (lhs[*i] < rhs[*i])
                return true;
            if (rhs[*i] < lhs[*i])
                return false;
        }
        return false;
    }
};
This is a bit of a side project I have taken on to solve a no-fix issue for work. Our system outputs a code to represent a combination of things on another thing. Some example codes are:
9-9-0-4-4-5-4-0-2-0-0-0-2-0-0-0-0-0-2-1-2-1-2-2-2-4
9-5-0-7-4-3-5-7-4-0-5-1-4-2-1-5-5-4-6-3-7-9-72
9-15-0-9-1-6-2-1-2-0-0-1-6-0-7
The max number in one of the slots I've seen so far is about 150 but they will likely go higher.
When the system was designed there was no requirement for what this code would look like. But now the client wants to be able to type it in by hand from a sheet of paper, something the code above isn't suited for. We've said we won't do anything about it, but it seems like a fun challenge to take on.
My question is where is a good place to start loss-less compressing this code? Obvious solutions such as store this code with a shorter key are not an option; our database is read only. I need to build a two way method to make this code more human friendly.
1) I agree that you definitely need a checksum - data entry errors are very common, unless you have really well trained staff and independent duplicate keying with automatic cross-checking.
2) I suggest http://en.wikipedia.org/wiki/Huffman_coding to turn your list of numbers into a stream of bits. To get the probabilities required for this, you need a decent-sized sample of real data, so you can make a count, setting Ni to the number of times number i appears in the data. Then I suggest setting Pi = (Ni + 1) / (Sum_i (Ni + 1)) - which smooths the probabilities a bit (there's a small sketch of this smoothing after this list). Also, with this method, if you see e.g. numbers 0-150 you could add a bit of slack by entering numbers 151-255 and setting them to Ni = 0. Another way round rare large numbers would be to add some sort of escape sequence.
3) Finding a way for people to type the resulting sequence of bits is really an applied psychology problem but here are some suggestions of ideas to pinch.
3a) Software licences - just encode six bits per character in some 64-character alphabet, but group characters in a way that makes it easier for people to keep place e.g. BC017-06777-14871-160C4
3b) UK car license plates. Use a change of alphabet to show people how to group characters e.g. ABCD0123EFGH4567IJKL...
3c) A really large alphabet - get yourself a list of 2^n words for some decent sized n and encode n bits as a word e.g. GREEN ENCHANTED LOGICIAN... -
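Here is the small smoothing sketch referred to in point 2, in C# (it assumes counts[i] has already been tallied from a sample of real data):
// Laplace-style smoothing: Pi = (Ni + 1) / Sum_i (Ni + 1)
static double[] SmoothedProbabilities(int[] counts)
{
    double denom = 0;
    foreach (int n in counts) denom += n + 1;
    var p = new double[counts.Length];
    for (int i = 0; i < counts.Length; i++)
        p[i] = (counts[i] + 1) / denom;   // never zero, so every symbol stays encodable
    return p;
}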
I worried about this problem a while back. It turns out that you can't do much better than base64 - trying to squeeze a few more bits per character isn't really worth the effort (once you get into "strange" numbers of bits, encoding and decoding become more complex). But at the same time, you end up with something that's likely to have errors when entered (confusing a 0 with an O etc). One option is to choose a modified set of characters and letters (so it's still base 64, but, say, you substitute ">" for "0"). Another is to add a checksum. Again, for simplicity of implementation, I felt the checksum approach was better.
Unfortunately I never got any further - things changed direction - so I can't offer code or a particular checksum choice.
PS: I realised there's a missing step I didn't explain: I was going to compress the text into some binary form before encoding (using some standard compression algorithm). So to summarize: compress, add checksum, base64 encode; base64 decode, check checksum, decompress.
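A rough C# sketch of that pipeline, assuming Deflate for the compression step and a simple one-byte additive checksum as a placeholder (a real implementation would likely use CRC or similar):
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class CodePacker
{
    // Toy checksum: one byte summing all data bytes.
    static byte Checksum(byte[] data)
    {
        int sum = 0;
        foreach (byte b in data) sum += b;
        return (byte)(sum & 0xFF);
    }
    public static string Pack(string code)
    {
        byte[] raw = Encoding.UTF8.GetBytes(code);
        using var ms = new MemoryStream();
        using (var deflate = new DeflateStream(ms, CompressionLevel.Optimal))
            deflate.Write(raw, 0, raw.Length);
        byte[] compressed = ms.ToArray();
        byte[] withSum = new byte[compressed.Length + 1];
        Array.Copy(compressed, withSum, compressed.Length);
        withSum[compressed.Length] = Checksum(compressed);
        return Convert.ToBase64String(withSum);   // what the user would type in
    }
    public static string Unpack(string typed)
    {
        byte[] withSum = Convert.FromBase64String(typed);
        byte[] compressed = new byte[withSum.Length - 1];
        Array.Copy(withSum, compressed, compressed.Length);
        if (Checksum(compressed) != withSum[withSum.Length - 1])
            throw new InvalidDataException("Checksum mismatch -- probably a typo.");
        using var input = new MemoryStream(compressed);
        using var deflate = new DeflateStream(input, CompressionMode.Decompress);
        using var reader = new StreamReader(deflate, Encoding.UTF8);
        return reader.ReadToEnd();
    }
}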
This is similar to what I have used in the past. There are certainly better ways of doing this, but I used this method because it was easy to mirror in Transact-SQL which was a requirement at the time. You could certainly modify this to incorporate Huffman encoding if the distribution of your id's is non-random, but it's probably unnecessary.
You didn't specify language, so this is in c#, but it should be very easy to transition to any language. In the lookup you'll see commonly confused characters are omitted. This should speed up entry. I also had the requirement to have a fixed length, but it would be easy for you to modify this.
static public class CodeGenerator
{
static Dictionary<int, char> _lookupTable = new Dictionary<int, char>();
static CodeGenerator()
{
PrepLookupTable();
}
private static void PrepLookupTable()
{
_lookupTable.Add(0,'3');
_lookupTable.Add(1,'2');
_lookupTable.Add(2,'5');
_lookupTable.Add(3,'4');
_lookupTable.Add(4,'7');
_lookupTable.Add(5,'6');
_lookupTable.Add(6,'9');
_lookupTable.Add(7,'8');
_lookupTable.Add(8,'W');
_lookupTable.Add(9,'Q');
_lookupTable.Add(10,'E');
_lookupTable.Add(11,'T');
_lookupTable.Add(12,'R');
_lookupTable.Add(13,'Y');
_lookupTable.Add(14,'U');
_lookupTable.Add(15,'A');
_lookupTable.Add(16,'P');
_lookupTable.Add(17,'D');
_lookupTable.Add(18,'S');
_lookupTable.Add(19,'G');
_lookupTable.Add(20,'F');
_lookupTable.Add(21,'J');
_lookupTable.Add(22,'H');
_lookupTable.Add(23,'K');
_lookupTable.Add(24,'L');
_lookupTable.Add(25,'Z');
_lookupTable.Add(26,'X');
_lookupTable.Add(27,'V');
_lookupTable.Add(28,'C');
_lookupTable.Add(29,'N');
_lookupTable.Add(30,'B');
}
public static bool TryPCodeDecrypt(string iPCode, out Int64 oDecryptedInt)
{
//Prep the result so we can exit without having to fiddle with it if we hit an error.
oDecryptedInt = 0;
if (iPCode.Length > 3)
{
Char[] Bits = iPCode.ToCharArray(0,iPCode.Length-2);
int CheckInt7 = 0;
int CheckInt3 = 0;
if (!int.TryParse(iPCode[iPCode.Length-1].ToString(),out CheckInt7) ||
!int.TryParse(iPCode[iPCode.Length-2].ToString(),out CheckInt3))
{
//Unsuccessful -- the last check ints are not integers.
return false;
}
//Adjust the CheckInts to the right values.
CheckInt3 -= 2;
CheckInt7 -= 2;
int COffset = iPCode.LastIndexOf('M')+1;
Int64 tempResult = 0;
int cBPos = 0;
while ((cBPos + COffset) < Bits.Length)
{
//Calculate the current position.
int cNum = 0;
foreach (int cKey in _lookupTable.Keys)
{
if (_lookupTable[cKey] == Bits[cBPos + COffset])
{
cNum = cKey;
}
}
tempResult += cNum * (Int64)Math.Pow((double)31, (double)(Bits.Length - (cBPos + COffset + 1)));
cBPos += 1;
}
if (tempResult % 7 == CheckInt7 && tempResult % 3 == CheckInt3)
{
oDecryptedInt = tempResult;
return true;
}
return false;
}
else
{
//Unsuccessful -- too short.
return false;
}
}
public static string PCodeEncrypt(int iIntToEncrypt, int iMinLength)
{
int Check7 = (iIntToEncrypt % 7) + 2;
int Check3 = (iIntToEncrypt % 3) + 2;
StringBuilder result = new StringBuilder();
result.Insert(0, Check7);
result.Insert(0, Check3);
int workingNum = iIntToEncrypt;
while (workingNum > 0)
{
result.Insert(0, _lookupTable[workingNum % 31]);
workingNum /= 31;
}
if (result.Length < iMinLength)
{
for (int i = result.Length + 1; i <= iMinLength; i++)
{
result.Insert(0, 'M');
}
}
return result.ToString();
}
}
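A possible usage sketch (the value and the minimum length are arbitrary):
string code = CodeGenerator.PCodeEncrypt(1234567, 8);   // base-31 digits plus two check digits, padded with leading 'M's to at least 8 chars
if (CodeGenerator.TryPCodeDecrypt(code, out Int64 original))
{
    Console.WriteLine(original);   // 1234567
}
else
{
    Console.WriteLine("Check digits don't match -- ask for the code again.");
}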
Is it possible to optimize this kind of (matrix) algorithm:
// | case 1 | case 2 | case 3 |
// ------|--------|--------|--------|
// | | | |
// case a| a1 | a2 | a3 |
// | | | |
// case b| b1 | b2 | b3 |
// | | | |
// case c| c1 | c2 | c3 |
// | | | |
switch (var)
{
case 1:
switch (subvar)
{
case a:
process a1;
case b:
process b1;
case c:
process c1;
}
case 2:
switch (subvar)
{
case a:
process a2;
case b:
process b2;
case c:
process c2;
}
case 3:
switch (subvar)
{
case a:
process a3;
case b:
process b3;
case c:
process c3;
}
}
The code is fairly simple, but you have to imagine something more complex, with more "switch / case" levels.
I work with 3 variables. Depending on whether they take the values 1, 2, 3 or a, b, c or alpha, beta, charlie, there are different processes to run. Is it possible to optimize this any other way than through a series of "switch / case"?
(Question already asked in French here.)
Edit: (from Dran Dane's responses to comments below. These might as well be in this more prominent place!)
"optimize" is to be understood in terms of having to write less code, fewer "switch / case". The idea is to improve readability, maintainability, not performance.
There is maybe a way to write less code via a "Chain of Responsibility" but this solution is not optimal on all points, because it requires the creation of many objects in memory.
It sounds like what you want is a 'Finite State Machine' where using those cases you can activate different processes or 'states'. In C this is usually done with an array (matrix) of function pointers.
So you essentially make an array and put the right function pointers at the right indicies and then you use your 'var' as an index to the right 'process' and then you call it. You can do this in most languages. That way different inputs to the machine activate different processes and bring it to different states. This is very useful for numerous applications; I myself use it all of the time in MCU development.
Edit: Valya pointed out that I probably should show a basic model:
stateMachine[var1][var2](); // calls the right 'process' for input var1, var2
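In C#, for instance, the same matrix can be built with delegates instead of raw function pointers (ProcessA1 etc. are placeholders for your own static void methods, and the values are assumed to be mapped to 0-based indices first):
// One row per value of var, one column per value of subvar.
static readonly Action[,] stateMachine = new Action[,]
{
    { ProcessA1, ProcessB1, ProcessC1 },   // var == 1
    { ProcessA2, ProcessB2, ProcessC2 },   // var == 2
    { ProcessA3, ProcessB3, ProcessC3 },   // var == 3
};
// ...
stateMachine[varIndex, subvarIndex]();   // calls the right 'process'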
There are no good answers to this question :-(
because so much of the response depends on
The effective goals (what is meant by "optimize", what is unpleasing about the nested switches)
The context in which this construct is going to be applied (what are the ultimate needs implicit to the application)
TokenMacGuy was wise to ask about the goals. I took the time to check the question and its replies on the French site and I'm still puzzled as to the goals... Dran Dane's latest response seems to point towards lessening the amount of code / improving readability, but let's review for sure:
Processing speed: not an issue; the nested switches are quite efficient, possibly a tad less than 3 multiplications to get an index into a map table, but maybe not even.
Readability: yes, possibly an issue. As the number of variables and levels increases the combinatorial explosion kicks in, and also the format of the switch statement tends to spread the branching spot and associated values over a long vertical stretch. In this case a 3-dimensional (or more) table initialized with function pointers puts the branching values and the function to be called back together on a single line.
Writing less code: sorry, not much help here; at the end of the day we need to account for a relatively high number of combinations and the "map", whatever its form, must be written somewhere. Code generators such as TokenMacGuy's may come in handy, though it does seem a bit of overkill in this case. Generators have their place, but I'm not sure it is the case here. One of two cases: if the number of variables and levels is small enough, the generator is not worth it (it takes more time to set it up than to write the actual code in the first place); if the number of variables and levels is significant, the generated code is hard to read, hard to maintain...
In a nutshell, my recommendation with regards to making the code more readable (and a bit faster to write) is the table/matrix approach described on the French site.
This solution is in two part:
a one-time initialization of a 3-dimensional array (for 3 levels), or a "fancier" container structure if preferred (a tree, for example). This is done with code like:
// This is positively more compact / readable
...
FctMap[1][4][0] = fctAlphaOne;
FctMap[1][4][1] = fctAlphaOne;
..
FctMap[3][0][0] = fctBravoCharlie4;
FctMap[3][0][1] = NULL; // impossible case
FctMap[3][0][2] = fctBravoCharlie4; // note how the same fct may serve in mult. places
And a relatively simple snippet wherever the functions need to be called:
if (FctMap[cond1][cond2][cond3]) {
retVal = FctMap[cond1][cond2][cond3](Arg1, Arg2);
if (retVal < 0)
DoSomething(); // anyway we're leveraging the common api to these fct not the switch alternative ....
}
A case which may prompt one NOT to use the solution above is if the combination space is relatively sparsely populated (many "branches" in the switch "tree" are not used) or if some of the functions require a different set of parameters; for both of these cases, I'd like to plug a solution Joel Goodwin proposed first here, which essentially combines the various keys for the several levels into one longer key (with a separator character if need be), essentially flattening the problem back to a long, but single-level, switch statement.
Now...
The real discussion should be about why we need such a mapping/decision-tree in the first place. To answer this unfortunately requires understanding the true nature of the underlying application. To be sure, I'm not saying that this is indicative of bad design. A big dispatching section may make sense in some applications. However, even with the C language (which the French site contributors seemed to consider disqualifying for object-oriented design), it is possible to adopt object-oriented methodology and patterns. Anyway, I'm diverging... It is possible that the application would overall be better served with alternative design patterns where the "information tree about what to call when" has been distributed in several modules and/or several objects.
Apologies to speak about this in rather abstract terms, it's just the lack of application specifics... The point remains: challenge the idea that we need this big dispatching tree; think of alternative approaches to the application at large.
Alors, bonne chance! ;-)
Depending on the language, some form of hash map with the pair (var, subvar) as the key and first-class functions as the values (or whatever your language offers to best approximate that, e.g. instances of classes extending some proper interface in Java) is likely to provide top performance -- and the utter conciseness of fetching the appropriate function (or whatever;-) from the map based on the key, and executing it, leads to high readability for readers familiar with the language and such functional idioms.
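For example, in C# that map could be a dictionary keyed on the value pair, with the processes stored as delegates (ProcessA1, ProcessB1, HandleUnknownCombination and the variable names are made-up placeholders):
// requires: using System; using System.Collections.Generic;
var dispatch = new Dictionary<(int, char), Action>
{
    [(1, 'a')] = ProcessA1,
    [(1, 'b')] = ProcessB1,
    [(2, 'a')] = ProcessA2,
    // ...
};
if (dispatch.TryGetValue((var1, subvar), out var process))
    process();
else
    HandleUnknownCombination();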
The idea of a function pointer is probably best (as per mjv, Shhnap). But, if the code under each case is fairly small, it may be overkill and result in more obfuscation than intended. In that case, I might implement something snappy and fast-to-read like this:
string decision = var1.ToString() + var2.ToString() + var3.ToString();
switch(decision)
{
case "1aa":
....
case "1ab":
....
}
Unfamiliar with your particular scenario so perhaps the previous suggestions are more appropriate.
I had exactly the same problem once, albeit for an imminent mess of a 5-parameter nested switch. I figured, why type all these O(N^5) cases myself, why even invent 'nested' function names if the compiler can do this for me. And all this resulted in a 'nested specialized template switch' referring to a 'specialized template database'.
It's a little complicated to write. But I found it worth it: it results in a 'knowledge' database that is very easy to maintain, to debug, to add to etc... And I must admit: a sense of pride.
// the return type: might be an object actually _doing_ something
struct Result {
const char* value;
Result(): value(NULL){}
Result( const char* p ):value(p){};
};
Some variable types for switching:
// types used:
struct A { enum e { a1, a2, a3 }; };
struct B { enum e { b1, b2 }; };
struct C { enum e { c1, c2 }; };
A 'forward declaration' of the knowledge base: the 'api' of the nested switch.
// template database declaration (and default value - omit if not needed)
// specializations may execute code instead of returning values...
template< A::e, B::e, C::e > Result valuedb() { return "not defined"; };
The actual switching logic (condensed)
// template layer 1: work away the first parameter, then the next, ...
struct Switch {
static Result value( A::e a, B::e b, C::e c ) {
switch( a ) {
case A::a1: return SwitchA<A::a1>::value( b, c );
case A::a2: return SwitchA<A::a2>::value( b, c );
case A::a3: return SwitchA<A::a3>::value( b, c );
default: return Result();
}
}
template< A::e a > struct SwitchA {
static Result value( B::e b, C::e c ) {
switch( b ) {
case B::b1: return SwitchB<a, B::b1>::value( c );
case B::b2: return SwitchB<a, B::b2>::value( c );
default: return Result();
}
}
template< A::e a, B::e b > struct SwitchB {
static Result value( C::e c ) {
switch( c ) {
case C::c1: return valuedb< a, b, C::c1 >();
case C::c2: return valuedb< a, b, C::c2 >();
default: return Result();
}
};
};
};
};
And the knowledge base itself
// the template database
//
template<> Result valuedb<A::a1, B::b1, C::c1 >() { return "a1b1c1"; }
template<> Result valuedb<A::a1, B::b2, C::c2 >() { return "a1b2c2"; }
This is how it can be used.
int main()
{
// usage:
Result r = Switch::value( A::a1, B::b2, C::c2 );
return 0;
}
Yes, there is definitely an easier way to do that, both faster and simpler. The idea is basically the same as proposed by Alex Martelli. Instead of seeing your problem as two-dimensional, see it as a one-dimensional lookup table.
It means combining var, subvar, subsubvar, etc to get one unique key and use it as your lookup table entry point.
The way to do it depends on the language used. With Python, combining var, subvar, etc. into a tuple and using it as the key in a dictionary is enough.
With C or the like it's usually simpler to convert each key to an enum, then combine them using logical operators to get just one number that you can use in your switch (that's also an easy way to use a switch instead of string comparisons with cascading ifs). You also get another benefit from doing it. It's quite usual that several treatments in different branches of the initial switch are the same. With the initial form it's quite difficult to make that obvious. You'll probably have some calls to the same functions, but at different points in the code. Now you can just group the identical cases when writing the switch.
I used such transformation several times in production code and it's easy to do and to maintain.
In summary you can get something like this... the mix function obviously depends on your application specifics.
switch (mix(var, subvar))
{
case a1:
process a1;
case b1:
process b1;
case c1:
process c1;
case a2:
process a2;
case b2:
process b2;
case c2:
process c2;
case a3:
process a3;
case b3:
process b3;
case c3:
process c3;
}
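One possible mix, sketched in C# under the assumption that both values are (or can be mapped to) small integers that fit in a byte -- shifting keeps every pair unique:
static int Mix(int var1, int subvar) => (var1 << 8) | subvar;
// the case labels a1, b1, ... above then become constants built the same way,
// e.g. const int a1 = (1 << 8) | 1;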
Perhaps what you want is code generation?
#! /usr/bin/python
first = [1, 2, 3]
second = ['a', 'b', 'c']
def emit(first, second):
    result = "switch (var)\n{\n"
    for f in first:
        result += " case {0}:\n switch (subvar)\n {{\n".format(f)
        for s in second:
            result += " case {1}:\n process {1}{0};\n".format(f,s)
        result += " }\n"
    result += "}\n"
    return result
print emit(first,second)
#file("autogen.c","w").write(emit(first,second))
This is pretty hard to read, of course, and you might really want a nicer template language to do your dirty work, but this will ease some parts of your task.
If C++ is an option I would try using virtual functions and maybe double dispatch. That could make it much cleaner. But it will probably only pay off if you have many more cases.
This article on DDJ.com might be a good entry.
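To keep one language across these examples, here is the double-dispatch shape sketched in C# rather than C++ (all type and method names are invented):
interface ISubVar { void Accept(IVar v); }
interface IVar
{
    void Process(SubA s);
    void Process(SubB s);
}
class SubA : ISubVar { public void Accept(IVar v) { v.Process(this); } }
class SubB : ISubVar { public void Accept(IVar v) { v.Process(this); } }
class Var1 : IVar
{
    public void Process(SubA s) { /* process a1 */ }
    public void Process(SubB s) { /* process b1 */ }
}
// someSubVar.Accept(someVar);  -- two virtual calls pick the right "cell" of the matrix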
If you're just trying to eliminate the two-level switch/case statements (and save some vertical space), you can encode the two variable values into a single value, then switch on it:
// Assumes var is in [1,3] and subvar in [1,3]
// and that var and subvar can be cast to int values
switch (10*var + subvar)
{
case 10+1:
process a1;
case 10+2:
process b1;
case 10+3:
process c1;
//
case 20+1:
process a2;
case 20+2:
process b2;
case 20+3:
process c2;
//
case 30+1:
process a3;
case 30+2:
process b3;
case 30+3:
process c3;
//
default:
process error;
}
If your language is C#, and your choices are short enough and contain no special characters you can use reflection and do it with just a few lines of code. This way, instead of manually creating and maintaining an array of function pointers, use one that the framework provides!
Like this:
using System.Reflection;
...
void DispatchCall(string var, string subvar)
{
string functionName="Func_"+var+"_"+subvar;
// BindingFlags are needed because the Func_* methods below are private instance methods.
MethodInfo m = this.GetType().GetMethod(functionName, BindingFlags.Instance | BindingFlags.NonPublic);
if (m == null) throw new ArgumentException("Invalid function name "+ functionName);
m.Invoke(this, new object[] { /* put parameters here if needed */ });
}
void Func_1_a()
{
//executed when var=1 and subvar=a
}
void Func_2_charlie()
{
//executed when var=2 and subvar=charlie
}
Solution from developpez.com
Yes, you can optimize it and make it so much cleaner. You can now use a "Chain of Responsibility" with a Factory:
import java.util.ArrayList;
import java.util.List;

public class ProcessFactory {
    private List<Process> processes = null;
    public ProcessFactory(){
        processes = new ArrayList<Process>();
        processes.add(new ProcessC1());
        processes.add(new ProcessC2());
        processes.add(new ProcessC3());
        processes.add(new ProcessC4());
        processes.add(new ProcessC5(6));
        processes.add(new ProcessC5(22));
    }
    public Process getProcess(int var, int subvar){
        for(Process process : processes){
            if(process.canDo(var, subvar)){
                return process;
            }
        }
        return null;
    }
}
Then, as long as your processes implement a Process interface with a canDo(var, subvar) method, you can easily use:
new ProcessFactory().getProcess(var,subvar).launch();
Let's say that I'm writing a function to convert between temperature scales. I want to support at least Celsius, Fahrenheit, and Kelvin. Is it better to pass the source scale and target scale as separate parameters of the function, or some sort of combined parameter?
Example 1 - separate parameters:
function convertTemperature("celsius", "fahrenheit", 22)
Example 2 - combined parameter:
function convertTemperature("c-f", 22)
The code inside the function is probably where it counts. With two parameters, the logic to determine what formula we're going to use is slightly more complicated, but a single parameter doesn't feel right somehow.
Thoughts?
Go with the first option, but rather than allow literal strings (which are error prone), take constant values or an enumeration if your language supports it, like this:
convertTemperature (TempScale.CELSIUS, TempScale.FAHRENHEIT, 22)
Depends on the language.
Generally, I'd use separate arguments with enums.
If it's an object oriented language, then I'd recommend a temperature class, with the temperature stored internally however you like and then functions to output it in whatever units are needed:
temp.celsius(); // returns the temperature of object temp in celsius
When writing such designs, I like to think to myself, "If I needed to add an extra unit, what design would make it easiest to do so?" Doing this, I come to the conclusion that enums would be easiest for the following reasons:
1) Adding new values is easy.
2) I avoid doing string comparison
However, how do you write the conversion method? 3P2 is 6. So that means there are 6 different combinations of Celsius, Fahrenheit, and Kelvin. What if I wanted to add a new temperature format "foo"? That would mean 4P2, which is 12! Two more? 5P2 = 20 combinations. Three more? 6P2 = 30 combinations!
You can quickly see how each additional modification requires more and more changes to the code. For this reason I don't do direct conversions! Instead, I do an intermediate conversion. I'd pick one scale, say Kelvin: initially, I'd convert the input to Kelvin, and then convert Kelvin to the desired scale. Yes, it does result in an extra calculation. However, it makes scaling the code a ton easier: adding a new temperature unit will always result in only two new modifications to the code. Easy.
A few things:
I'd use an enumerated type that a syntax checker or compiler can check rather than a string that can be mistyped. In Pseudo-PHP:
define ('kCelsius', 0); define ('kFarenheit', 1); define ('kKelvin', 2);
$a = ConvertTemperature(22, kCelsius, kFarenheit);
Also, it seems more natural to me to place the thing you operate on, in this case the temperature to be converted, first. It gives a logical ordering to your parameters (convert -- what? from? to?) and thus helps with mnemonics.
Your function will be much more robust if you use the first approach. If you need to add another scale, that's one more parameter value to handle. In the second approach, adding another scale means adding as many values as you already had scales on the list, times 2. (For example, to add K to C and F, you'd have to add K-C, K-F, C-K, and C-F.)
A decent way to structure your program would be to first convert whatever comes in to an arbitrarily chosen intermediate scale, and then convert from that intermediate scale to the outgoing scale.
A better way would be to have a little library of slopes and intercepts for the various scales, and just look up the numbers for the incoming and outgoing scales and do the calculation in one generic step.
In C# (and probably Java) it would be best to create a Temperature class that stores the temperature privately as Celsius (or whatever) and which has Celsius, Fahrenheit, and Kelvin properties that do all the conversions for you in their get and set accessors.
Depends how many conversions you are going to have. I'd probably choose one parameter, given as an enum. Consider this expanded version of conversion:
enum Conversion
{
CelsiusToFahrenheit,
FahrenheitToCelsius,
KilosToPounds
}
Convert(Conversion conversion, X from);
You now have sane type safety at point of call - one cannot give correctly typed parameters that give an incorrect runtime result. Consider the alternative.
enum Units
{
Pounds,
Kilos,
Celcius,
Farenheight
}
Convert(Unit from, Unit to, X fromAmount);
I can, in a type-safe way, call
Convert(Pounds, Celcius, 5);
But the result is meaningless, and you'll have to fail at runtime. Yes, I know you're only dealing with temperature at the moment, but the general concept still holds (I believe).
I would choose
Example 1 - separate parameters: function convertTemperature("celsius", "fahrenheit", 22)
Otherwise within your function definition you would have to parse "c-f" into "celsius" and "fahrenheit" anyway to get the required conversion scales, which could get messy.
If you're providing something like Google's search box to users, having handy shortcuts like "c-f" is nice for them. Underneath, though, I would convert "c-f" into "celsius" and "fahrenheit" in an outer function before calling convertTemperature() as above.
In this case the single parameter looks totally obscure.
The function converts a temperature from one scale to another scale.
IMO it's more natural to pass source and target scales as separate parameters. I definitely don't want to have to grasp the format of the first argument.
I would make an enumeration out of the temperature types and pass in the 2 scale parameters. Something like (in c#):
public void ConvertTemperature(TemperatureTypeEnum SourceTemp,
TemperatureTypeEnum TargetTemp,
decimal Temperature)
{}
I'm always on the lookout for ways to use objects to solve my programming problems. I hope this means that I'm more OO than when I was only using functions to solve problems, but that remains to be seen.
In C#:
interface ITemperature
{
CelciusTemperature ToCelcius();
FarenheitTemperature ToFarenheit();
}
struct FarenheitTemperature : ITemperature
{
public readonly int Value;
public FarenheitTemperature(int value)
{
this.Value = value;
}
public FarenheitTemperature ToFarenheit() { return this; }
public CelciusTemperature ToCelcius()
{
return new CelciusTemperature((this.Value - 32) * 5 / 9);
}
}
struct CelciusTemperature : ITemperature
{
public readonly int Value;
public CelciusTemperature(int value)
{
this.Value = value;
}
public CelciusTemperature ToCelcius() { return this; }
public FarenheitTemperature ToFarenheit()
{
return new FarenheitTemperature(this.Value * 9 / 5 + 32);
}
}
and some tests:
// Freezing
Debug.Assert(new FarenheitTemperature(32).ToCelcius().Equals(new CelciusTemperature(0)));
Debug.Assert(new CelciusTemperature(0).ToFarenheit().Equals(new FarenheitTemperature(32)));
// crossover
Debug.Assert(new FarenheitTemperature(-40).ToCelcius().Equals(new CelciusTemperature(-40)));
Debug.Assert(new CelciusTemperature(-40).ToFarenheit().Equals(new FarenheitTemperature(-40)));
and an example of a bug that this approach avoids:
CelciusTemperature theOutbackInAMidnightOilSong = new CelciusTemperature(45);
FarenheitTemperature x = theOutbackInAMidnightOilSong; // ERROR: Cannot implicitly convert type 'CelciusTemperature' to 'FarenheitTemperature'
Adding Kelvin conversions is left as an exercise.
By the way, it doesn't have to be more work to implement the three-parameter version, as suggested in the question statement.
These are all linear functions, so you can implement something like
float LinearConvert(float in, float scale, float add, bool invert);
where the last bool indicates if you want to do the forward transform or reverse it.
Within your conversion technique, you can have a scale/add pair for X -> Kelvin. When you get a request to convert format X to Y, you can first run X -> Kelvin, then Kelvin -> Y by reversing the Y -> Kelvin process (by flipping the last bool to LinearConvert).
This technique gives you something like 4 lines of real code in your convert function, and one piece of data for every type you need to convert between.
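A rough C# sketch of that idea -- the scale/add pairs map each unit to Kelvin, and the parameter names differ slightly from the signature above because "in" is a reserved word in C#:
// requires: using System.Collections.Generic;
// Each unit is defined by: kelvin = value * scale + add
static readonly Dictionary<string, (double scale, double add)> ToKelvin = new()
{
    ["celsius"]    = (1.0, 273.15),
    ["fahrenheit"] = (5.0 / 9.0, 459.67 * 5.0 / 9.0),
    ["kelvin"]     = (1.0, 0.0),
};
static double LinearConvert(double value, double scale, double add, bool invert)
    => invert ? (value - add) / scale : value * scale + add;
static double ConvertTemperature(double value, string from, string to)
{
    var (s1, a1) = ToKelvin[from];
    var (s2, a2) = ToKelvin[to];
    return LinearConvert(LinearConvert(value, s1, a1, false), s2, a2, true);
}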
Similar to what @Rob, @wcm and @David explained...
public class Temperature
{
private double celcius;
public static Temperature FromFarenheit(double farenheit)
{
return new Temperature { Farhenheit = farenheit };
}
public static Temperature FromCelcius(double celcius)
{
return new Temperature { Celcius = celcius };
}
public static Temperature FromKelvin(double kelvin)
{
return new Temperature { Kelvin = kelvin };
}
private double kelvinToCelcius(double kelvin)
{
    return kelvin - 273.15;
}
private double celciusToKelvin(double celcius)
{
    return celcius + 273.15;
}
private double farhenheitToCelcius(double farhenheit)
{
    return (farhenheit - 32) * 5 / 9;
}
private double celciusToFarenheit(double celcius)
{
    return celcius * 9 / 5 + 32;
}
public double Kelvin
{
get { return celciusToKelvin(celcius); }
set { celcius = kelvinToCelcius(value); }
}
public double Celcius
{
get { return celcius; }
set { celcius = value; }
}
public double Farhenheit
{
get { return celciusToFarenheit(celcius); }
set { celcius = farhenheitToCelcius(value); }
}
}
I think I'd go whole hog one direction or another. You could write a mini-language that does any sort of conversion like units does:
$ units 'tempF(-40)' tempC
-40
Or use individual functions like the recent Convert::Temperature Perl module does:
use Convert::Temperature;
my $c = new Convert::Temperature();
my $res = $c->from_fahr_to_cel('59');
But that brings up an important point---does the language you are using already have conversion functions? If so, what coding convention do they use? So if the language is C, it would be best to follow the example of the atoi and strtod library functions (untested):
double fahrtocel(double tempF){
    return (tempF - 32.0) * 5.0 / 9.0;  /* note 5.0/9.0, not 5/9 -- integer division would give 0 */
}
double celtofahr(double tempC){
    return 9.0 / 5.0 * tempC + 32.0;
}
In writing this post, I ran across a very interesting post on using emacs to convert dates. The take-away for this topic is that it uses the one function-per-conversion style. Also, conversions can be very obscure. I tend to do date calculations using SQL because it seems unlikely there are many bugs in that code. In the future, I'm going to look into using emacs.
Here is my take on this (using PHP):
function Temperature($value, $input, $output)
{
$value = floatval($value);
if (isset($input, $output) === true)
{
switch ($input)
{
case 'K': $value = $value - 273.15; break; // Kelvin
case 'F': $value = ($value - 32) * (5 / 9); break; // Fahrenheit
case 'R': $value = ($value - 491.67) * (5 / 9); break; // Rankine
}
switch ($output)
{
case 'K': $value = $value + 273.15; break; // Kelvin
case 'F': $value = $value * (9 / 5) + 32; break; // Fahrenheit
case 'R': $value = ($value + 273.15) * (9 / 5); break; // Rankine
}
}
return $value;
}
Basically the $input value is converted to the standard Celsius scale and then converted back again to the $output scale - one function to rule them all. =)
My vote is two parameters for conversion types, one for the value (as in your first example). I would use enums instead of string literals, however.
Use enums, if your language allows it, for the unit specifications.
I'd say the code inside would be easier with two. I'd have a table with pre-add, multiply, and post-add, and run the value through the item for one unit, and then through the item for the other unit in reverse. Basically converting the input temperature to a common base value inside, and then out to the other unit. This entire function would be table-driven.
I wish there was some way to accept multiple answers. Based on everyone's recommendations, I think I will stick with the multiple parameters, changing the strings to enums/constants, and moving the value to be converted to the first position in the parameter list. Inside the function, I'll use Kelvin as a common middle ground.
Previously I had written individual functions for each conversion and the overall convertTemperature() function was merely a wrapper with nested switch statements. I'm writing in both classic ASP and PHP, but I wanted to leave the question open to any language.