Boost Spirit sequential key value parser - boost

Is there a better way of doing this with Spirit? I'm parsing a sequential series of key-value pairs, with some line endings and other cruft in between. The format is not consistent enough for me to pull the key-value pairs out with a single rule, so I've got an adapter and production rules like:
BOOST_FUSION_ADAPT_STRUCT(
Record,
( std::string, messageHeader )
( double, field1 )
( std::string, field2 )
( Type3, field3 )
// ...
( TypeN, fieldN )
)
template< typename Iterator, typename Skipper >
class MyGrammar : public qi::grammar< Iterator, Record(), Skipper >
{
public:
MyGrammar() : MyGrammar::base_type{ record }
{
record =
qi::string( "Message header" )
>> field1 >> field2
// ...
>> fieldN;
field1 = qi::lit( "field 1:" ) >> qi::double_;
// ...
}
// field rule declarations...
};
This is a straightforward, if tedious, way of going about it, and I've already exceeded the compiler's rule complexity threshold once, which forced me to refactor the fields into separate rules. Also, if there's an error parsing a message, the parser always reports the error at the beginning of the string, as if the rule doesn't give it enough context to figure out where the problem actually is. I assume this is because of the way the >> operator works.
Edit:
In response to sehe's question, I've run into two problems with this approach and the MSVC 15 compiler. The first was a compiler error on my top-level production when it hit somewhere in the vicinity of 80 components separated by >>:
recursive type or function dependency context too complex
So I pushed everything I could down into subordinate rules to reduce the complexity. Unfortunately now, after adding still more rules, I'm running into:
fatal error C1060: compiler is out of heap space
So I find that I do need some way to further decompose the problem that's not just a long series of concatenated production rules...

Related

Reordering members in a template by alignment

Assume I write the following code:
template<typename T1, typename T2>
struct dummy {
T1 first;
T2 second;
};
I would like to know in general how I can order members in a template class by descending size. In other words, I would like the above class to be
struct dummy {
int first;
char second;
};
when instantiated as dummy<int, char>. However, I would like to obtain
struct dummy {
int second;
char first;
};
in the case dummy<char, int>.
On most platforms, padding for std::pair occurs only at "natural" alignment. This sort of padding will end up the same for either order.
For std::tuple, some arrangements can be more efficient than others, but the library can choose any memory layout it likes, so any TMP you add on top is only second-guessing.
In general, yes, you can define a sorting algorithm using templates, but it would be a fair bit of work.
This can be done; the only issue is the naming: how would you name your fields?
I did what you are asking for not long ago. I used std::tuple and some metaprogramming: I wrote a merge sort to reorder the template arguments. It is really fun to do (if you like functional programming).
For the naming, I used some macros to access the fields.
I really encourage you to do it yourself, since it is intellectually interesting. However, if you would like to see some code, please tell me!
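For the two-member case from the question, the idea can be sketched without a full sort. This is only a minimal illustration, assuming C++17; size_ordered_pair, larger, smaller, first() and second() are hypothetical names, and the accessor functions stand in for the macros mentioned above.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: stores the larger of T1/T2 first and the smaller second.
template <typename T1, typename T2>
struct size_ordered_pair {
    // True when T2 is strictly larger and therefore has to be stored first.
    static constexpr bool swapped = sizeof(T1) < sizeof(T2);

    using larger_t  = std::conditional_t<swapped, T2, T1>;
    using smaller_t = std::conditional_t<swapped, T1, T2>;

    larger_t  larger;   // physical layout: bigger member first
    smaller_t smaller;  // then the smaller one

    // Accessors restore the logical naming: first() is T1, second() is T2.
    T1& first()  { if constexpr (swapped) return smaller; else return larger; }
    T2& second() { if constexpr (swapped) return larger;  else return smaller; }
};

// Usage sketch: the caller keeps using first()/second() regardless of layout.
inline int usage_example() {
    size_ordered_pair<char, int> p{};
    p.first() = 'x';    // T1 = char, physically stored second
    p.second() = 41;    // T2 = int, physically stored first
    assert(p.first() == 'x');
    return p.second() + 1;
}
```

Generalizing this to N members is where the tuple plus compile-time sort described above comes in.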

Mismatch Error in FoxPro SQL insertion

I need help tracing a "mismatched data type" error in Visual FoxPro 6.0 when I issue a command like "insert into tmpcur from memvar".
tmpcur is a cursor with a large number of columns, and it is really hard to trace which one has the mismatched data type that makes the insertion fail.
Unlike with MSSQL Profiler, it is pretty difficult to trace the insertion of each record into VFP tables one by one.
I would appreciate any help. Thanks.
This should help you. I have a temp cursor created with some bogus field / column names testing for types of character, integer, double, currency, date and time. Trying to follow what is the result of your scenario, I am taking the memory variable of "bbbb" which should be double (or numeric at the least), and changed it to a string.
I am then HOLDING the error trapping routine that MAY be in effect, then setting my own (as I don't think try/catch existed in VFP6; it may, but I just don't remember). So, I did an ON ERROR that sets a variable to true. I default the flag to false, try the insert, then check the flag. If the flag IS set, I go into a loop over each column in the given table/alias (in my example it is "C_Tmp", so replace it with your table/alias). For each column, if the data type of the memory variable differs from the table structure, it dumps the column name and the table / memory values for you to review.
You could put this to a log file or something.
Now, another consideration. Some types are completely valid and common for implied conversion: character and memo fields can both take strings, and integer, double, float and currency fields can all work with generic "numeric" values.
So, if you encounter these differences, then we can go one level further and look for comparable types, but let me know and we can adjust as needed.
At least this should give you a huge jump to your insert issue.
CREATE CURSOR C_tmp ( cccc c(10), iiii i, bbbb b(2), ccyyyy y, ddd d, tttt t )
SCATTER MEMVAR memo
m.bbbb = "wrong data type, was double with 2 decimal"
lcHoldError = ON("ERROR")
ON ERROR lFailInsert = .t.
lFailInsert = .f.
INSERT INTO C_Tmp FROM memvar
IF lFailInsert
FOR lnI = 1 TO FCOUNT( "C_Tmp" )
lcTmp = FIELD( lnI, "C_Tmp" )
IF NOT TYPE( "C_Tmp." + lcTmp ) == TYPE( "m.&lcTmp" )
? "Invalid " + lcTmp, C_Tmp.&lcTmp, m.&lcTmp
ENDIF
ENDFOR
ENDIF
ON ERROR &lcHoldError

C++ Boost qi recursive rule construction

[It seems my explanations and expectations are not clear at all, so I added precision on how I'd like to use the feature at the end of the post]
I'm currently working on grammars using boost qi. I had a loop construction for a rule because I needed to build it from the elements of a vector. I have rewritten it with simple types, and it looks like:
#include <string>
#include <vector>
// using boost 1.43.0
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/qi_eps.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace bqi = boost::spirit::qi;
typedef const char* Iterator;
// function that you can find [here][1]
template<typename P> void test_phrase_parser(char const* input, P const& p, bool full_match = true);
int main()
{
// my working rule type:
bqi::rule<Iterator, std::string()> myLoopBuiltRule;
std::vector<std::string> v;
std::vector<std::string>::const_iterator iv;
v.push_back("abc");
v.push_back("def");
v.push_back("ghi");
v.push_back("jkl");
myLoopBuiltRule = (! bqi::eps);
for(iv = v.begin() ; iv != v.end() ; iv++)
{
myLoopBuiltRule =
myLoopBuiltRule.copy() [ bqi::_val = bqi::_1 ]
| bqi::string(*iv) [ bqi::_val = bqi::_1 ]
;
}
debug(myLoopBuiltRule);
char s[] = " abc ";
test_phrase_parser(s, myLoopBuiltRule);
}
(The "here" link above does not seem to get replaced by the corresponding hyperlink, so this is the address where the function test_phrase_parser() can be found: http://www.boost.org/doc/libs/1_43_0/libs/spirit/doc/html/spirit/qi/reference/basics.html)
All was for the best in the best of all worlds... until I had to pass an argument to this rule. Here is the new rule type:
// my not-anymore-working rule type:
bqi::rule<Iterator, std::string(int*)> myLoopBuiltRule;
The 'int*' type is for example purposes only; my real pointer addresses a much more complex class... but it is still a mere pointer.
I changed my 'for' loop accordingly, i.e.:
for(iv = v.begin() ; iv != v.end() ; iv++)
{
myLoopBuiltRule =
myLoopBuiltRule.copy()(bqi::_r1) [ bqi::_val = bqi::_1 ]
| bqi::string(*iv) [ bqi::_val = bqi::_1 ]
;
}
I had to add a new rule because test_phrase_parser() cannot guess which value is to be given to the int pointer:
bqi::rule<Iterator> myInitialRule;
And change everything that followed the for loop:
myInitialRule = myLoopBuiltRule((int*)NULL);
debug(myLoopBuiltRule);
char s[] = " abc ";
test_phrase_parser(s, myInitialRule);
Then everything crashed:
/home/sylvain.darras/software/repository/software/external/include/boost/boost_1_43_0/boost/spirit/home/qi/nonterminal/rule.hpp:199: error: no matching function for call to ‘assertion_failed(mpl_::failed************ (boost::spirit::qi::rule<Iterator, T1, T2, T3, T4>::operator=(const Expr&)
Then I got crazy and tried:
myLoopBuiltRule =
myLoopBuiltRule.copy(bqi::_r1) [ bqi::_val = bqi::_1 ]
| bqi::string(*iv) [ bqi::_val = bqi::_1 ]
-->
error: no matching function for call to ‘boost::spirit::qi::rule<const char*, std::string(int*), boost::fusion::unused_type, boost::fusion::unused_type, boost::fusion::unused_type>::copy(const boost::phoenix::actor<boost::spirit::attribute<1> >&)’
Then I got mad and wrote:
myLoopBuiltRule =
myLoopBuiltRule(bqi::_r1) [ bqi::_val = bqi::_1 ]
| bqi::string(*iv) [ bqi::_val = bqi::_1 ]
Which compiles since it is perfectly syntactically correct, but which magnificently stack overflows coz it happily, nicely, recursively, calls itself to death...
Then I lost my mind and typed:
myLoopBuiltRule =
jf jhsgf jshdg fjsdgh fjsg jhsdg jhg sjfg jsgh df
Which, as you probably expect, has failed to compile.
As you can imagine, before writing the above novel I searched the web, but I didn't find anything about using copy() and argument passing at the same time. Has anyone already experienced this problem? Have I missed something?
Be assured that any help will be really really appreciated.
PS: Great thanks to hkaiser who has, without knowing it, answered a lot of my boost::qi problems through google (but this one).
Further information:
The purpose of my parser is to read files written in a given language L. The purpose of my post is to propagate my "context" (i.e.: variable definitions and especially constant values, so I can compute expressions).
The number of variable types I handle is small, but it's bound to grow, so I keep these types in a container class. I can loop on these managed types.
So, let's consider a pseudo-algorithm of what I would like to achieve:
LTypeList myTypes;
LTypeList::const_iterator iTypes;
bqi::rule<Iterator, LType(LContext*)> myLoopBuiltRule;
myLoopBuiltRule = (! bqi::eps);
for(iTypes = myTypes.begin() ; iTypes != myTypes.end() ; iTypes++)
{
myLoopBuiltRule =
myLoopBuiltRule.copy()(bqi::_r1) [ bqi::_val = bqi::_1 ]
| iTypes->getRule()(bqi::_r1) [ bqi::_val = bqi::_1 ]
}
This is done during initialization and then myLoopBuiltRule is used and reused with different LContext*, parsing multiple types. And since some L types can have bounds, which are integer expressions, and that these integer expressions can exhibit constants, I (think that I) need my inherited attribute to take my LContext around and be able to compute expression value.
Hope I've been clearer in my intentions.
Note I just extended my answer with a few more informational links. In this particular case I have a hunch that you could just get away with the Nabialek trick and replacing the inherited attribute with a corresponding qi::locals<> instead. If I have enough time, I might work out a demonstration later.
Caveats, expositioning the problem
Please be advised that there are issues when copying proto expression trees and spirit parser expressions in particular - it will create dangling references as the internals are not supposed to live past the end of the containing full expressions. See BOOST_SPIRIT_AUTO on Zero to 60 MPH in 2 seconds!
Also see these answers, which likewise concern themselves with building/composing rules on the fly (at runtime):
Generating Spirit parser expressions from a variadic list of alternative parser expressions
Can Boost Spirit Rules be parameterized, which demonstrates how to return rules from a function using boost::proto::deep_copy (like BOOST_SPIRIT_AUTO does, actually)
Nabialek Trick
In general, I'd very strongly advise against combining rules at runtime. Instead, if you're looking to 'add alternatives' to a rule at runtime, you can always use qi::symbols<> instead. The trick is to store a rule in the symbol-table and use qi::lazy to call the rule. In particular, this is known as the Nabialek Trick.
I have a toy command-line arguments parser here that demonstrates how you could use this idiom to match a runtime-defined set of command line arguments:
https://gist.github.com/sehe/2a556a8231606406fe36
Limitations of qi::lazy, what's next?
Unfortunately, qi::lazy does not support inherited arguments; see e.g.
http://boost.2283326.n4.nabble.com/pass-inhertited-attributes-to-nabialek-trick-td2679066.html
You might be better off writing a custom parser component, as documented here:
http://boost-spirit.com/home/articles/qi-example/creating-your-own-parser-component-for-spirit-qi/
I'll try to find some time to work out a sample that replaces inherited arguments by qi::locals later.

How can I optimize a multiple (matrix) switch / case algorithm?

Is it possible to optimize this kind of (matrix) algorithm:
// | case 1 | case 2 | case 3 |
// ------|--------|--------|--------|
// | | | |
// case a| a1 | a2 | a3 |
// | | | |
// case b| b1 | b2 | b3 |
// | | | |
// case c| c1 | c2 | c3 |
// | | | |
switch (var)
{
    case 1:
        switch (subvar)
        {
            case a:
                process a1;
                break;
            case b:
                process b1;
                break;
            case c:
                process c1;
                break;
        }
        break;
    case 2:
        switch (subvar)
        {
            case a:
                process a2;
                break;
            case b:
                process b2;
                break;
            case c:
                process c2;
                break;
        }
        break;
    case 3:
        switch (subvar)
        {
            case a:
                process a3;
                break;
            case b:
                process b3;
                break;
            case c:
                process c3;
                break;
        }
        break;
}
The code here is fairly simple, but imagine something more complex with more levels of "switch / case".
I work with 3 variables. Depending on whether they take the values 1, 2, 3 or a, b, c or alpha, beta, charlie, different processes have to be carried out. Is it possible to optimize this in any other way than through a series of "switch / case" statements?
(Question already asked in French here).
Edit: (from Dran Dane's responses to comments below. These might as well be in this more prominent place!)
"optimize" is to be understood in terms of having to write less code, fewer "switch / case". The idea is to improve readability, maintainability, not performance.
There is maybe a way to write less code via a "Chain of Responsibility" but this solution is not optimal on all points, because it requires the creation of many objects in memory.
It sounds like what you want is a 'Finite State Machine' where using those cases you can activate different processes or 'states'. In C this is usually done with an array (matrix) of function pointers.
So you essentially make an array and put the right function pointers at the right indices, then you use your 'var' as an index to the right 'process' and call it. You can do this in most languages. That way different inputs to the machine activate different processes and bring it to different states. This is very useful for numerous applications; I myself use it all of the time in MCU development.
Edit: Valya pointed out that I probably should show a basic model:
stateMachine[var1][var2](); // calls the right 'process' for input var1, var2
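Spelled out a little more, the table could look like the sketch below. It is written in C++ rather than C, and it assumes the nine processes take no arguments and return a label so that the result is observable; Var, Sub, Process and dispatch are names made up for this illustration.

```cpp
#include <cassert>
#include <string>

enum Var { Var1, Var2, Var3, VarCount };   // case 1, 2, 3
enum Sub { SubA, SubB, SubC, SubCount };   // case a, b, c

using Process = std::string (*)();         // common signature of every 'process'

// Placeholder processes: each one just reports which cell it belongs to.
std::string a1() { return "a1"; }  std::string b1() { return "b1"; }  std::string c1() { return "c1"; }
std::string a2() { return "a2"; }  std::string b2() { return "b2"; }  std::string c2() { return "c2"; }
std::string a3() { return "a3"; }  std::string b3() { return "b3"; }  std::string c3() { return "c3"; }

// The matrix from the question, stored as data instead of nested switches.
const Process stateMachine[VarCount][SubCount] = {
    /* Var1 */ { a1, b1, c1 },
    /* Var2 */ { a2, b2, c2 },
    /* Var3 */ { a3, b3, c3 },
};

std::string dispatch(Var var, Sub sub) {
    assert(var < VarCount && sub < SubCount);  // indices must stay inside the table
    return stateMachine[var][sub]();           // calls the right 'process'
}
```

Adding a new case then means adding a row or column to the table rather than another level of switch.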
There are no good answers to this question :-(
because so much of the response depends on
The effective goals (what is meant by "optimize", what is unpleasing about the nested switches)
The context in which this construct is going to be applied (what are the ultimate needs implicit to the application)
TokenMacGuy was wise to ask about the goals. I took the time to check the question and its replies on the French site and I'm still puzzled as to the goals... Dran Dane's latest response seems to point towards lessening the amount of code / improving readability, but let's review to be sure:
Processing speed: not an issue. The nested switches are quite efficient, possibly a tad less so than the 3 multiplications needed to get an index into a map table, but maybe not even.
Readability: yes, possibly an issue. As the number of variables and levels increases, the combinatorial explosion kicks in, and the format of the switch statement tends to spread the branching spot and its associated values over a long vertical stretch. In this case a table of 3 (or more) dimensions initialized with function pointers puts the branching values and the function to be called back together on a single line.
Writing less code: sorry, not much help here. At the end of the day we need to account for a relatively high number of combinations, and the "map", whatever its form, must be written somewhere. Code generators such as TokenMacGuy's may come in handy, but they do seem a bit of an overkill in this case. Generators have their place, but I'm not sure this is it. One of two cases applies: if the number of variables and levels is small enough, the generator is not worth it (it takes more time to set up than to write the actual code in the first place); if the number of variables and levels is significant, the generated code is hard to read and hard to maintain.
In a nutshell, my recommendation with regards to making the code more readable (and a bit faster to write) is the table/matrix approach described on the French site.
This solution is in two parts:
A one-time initialization of a 3-dimensional array (for 3 levels), or of a "fancier" container structure if preferred (a tree, for example). This is done with code like:
// This is positively more compact / readable
...
FctMap[1][4][0] = fctAlphaOne;
FctMap[1][4][1] = fctAlphaOne;
..
FctMap[3][0][0] = fctBravoCharlie4;
FctMap[3][0][1] = NULL; // impossible case
FctMap[3][0][2] = fctBravoCharlie4; // note how the same fct may serve in mult. places
And a relatively simple snippet wherever the functions need to be called:
if (FctMap[cond1][cond2][cond3]) {
retVal = FctMap[cond1][cond2][cond3](Arg1, Arg2);
if (retVal < 0)
DoSomething(); // anyway we're leveraging the common api to these fct not the switch alternative ....
}
One case which may prompt one NOT to use the solution above is when the combination space is relatively sparsely populated (many "branches" in the switch "tree" are not used), or when some of the functions require a different set of parameters. For both of these cases, I'd like to plug a solution Joel Goodwin proposed first here, which combines the various keys for the several levels into one longer key (with a separator character if need be), essentially flattening the problem back to a long but single-level switch statement.
Now...
The real discussion should be about why we need such a mapping/decision tree in the first place. Answering this unfortunately requires understanding the true nature of the underlying application. To be sure, I'm not saying that this is indicative of bad design; a big dispatching section may make sense in some applications. However, even with the C language (which the French site contributors seemed to consider unsuited to object-oriented design), it is possible to adopt object-oriented methodology and patterns. (Anyway, I'm digressing...) It is possible that the application would overall be better served with alternative design patterns where the "information tree about what to call when" has been distributed in several modules and/or several objects.
Apologies for speaking about this in rather abstract terms; it's just the lack of application specifics... The point remains: challenge the idea that we need this big dispatching tree; think of alternative approaches to the application at large.
Alors, bonne chance! ;-)
Depending on the language, some form of hash map with the pair (var, subvar) as the key and first-class functions as the values (or whatever your language offers to best approximate that, e.g. instances of classes extending some proper interface in Java) is likely to provide top performance -- and the utter conciseness of fetching the appropriate function (or whatever;-) from the map based on the key, and executing it, leads to high readability for readers familiar with the language and such functional idioms.
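In C++ terms, one way to read this suggestion is a std::map keyed on the pair, with std::function values. The following is a sketch only: the three entries, the Key and Action aliases and the dispatch helper are invented for the example, and the handlers return a label purely so the effect can be checked.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

using Key = std::pair<int, char>;               // (var, subvar)
using Action = std::function<std::string()>;    // whatever "process" means

// Hypothetical table: only the combinations that actually exist need an entry.
const std::map<Key, Action>& actions() {
    static const std::map<Key, Action> table = {
        { {1, 'a'}, [] { return std::string("a1"); } },
        { {2, 'b'}, [] { return std::string("b2"); } },
        { {3, 'c'}, [] { return std::string("c3"); } },
    };
    return table;
}

// Returns true and fills `out` when a handler is registered for (var, subvar).
bool dispatch(int var, char subvar, std::string& out) {
    const auto it = actions().find(Key(var, subvar));
    if (it == actions().end()) return false;    // sparse cases simply have no entry
    assert(it->second);                         // an empty std::function would throw when called
    out = it->second();
    return true;
}
```

Compared with a dense array of function pointers, the map costs a lookup but copes naturally with a sparsely populated combination space.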
The idea of a function pointer is probably best (as per mjv, Shhnap). But, if the code under each case is fairly small, it may be overkill and result in more obfuscation than intended. In that case, I might implement something snappy and fast-to-read like this:
string decision = var1.ToString() + var2.ToString() + var3.ToString();
switch(decision)
{
case "1aa":
....
case "1ab":
....
}
Unfamiliar with your particular scenario so perhaps the previous suggestions are more appropriate.
I had exactly the same problem once, albeit for an immanent mess of a 5-parameter nested switch. I figured: why type all these O(N^5) cases myself, and why even invent 'nested' function names, if the compiler can do this for me? All this resulted in a 'nested specialized template switch' referring to a 'specialized template database'.
It's a little complicated to write. But I found it worth it: it results in a 'knowledge' database that is very easy to maintain, to debug, to add to etc... And I must admit: a sense of pride.
// the return type: might be an object actually _doing_ something
struct Result {
const char* value;
Result(): value(NULL){}
Result( const char* p ):value(p){};
};
Some variable types for switching:
// types used:
struct A { enum e { a1, a2, a3 }; };
struct B { enum e { b1, b2 }; };
struct C { enum e { c1, c2 }; };
A 'forward declaration' of the knowledge base: the 'api' of the nested switch.
// template database declaration (and default value - omit if not needed)
// specializations may execute code instead of returning values...
template< A::e, B::e, C::e > Result valuedb() { return "not defined"; };
The actual switching logic (condensed)
// Note: strictly speaking, the valuedb specializations should be declared
// before this code in a real translation unit.

// template layer 3: the last parameter selects the database entry
template< A::e a, B::e b > struct SwitchB {
    static Result value( C::e c ) {
        switch( c ) {
            case C::c1: return valuedb< a, b, C::c1 >();
            case C::c2: return valuedb< a, b, C::c2 >();
            default: return Result();
        }
    }
};
// template layer 2: work away the second parameter
template< A::e a > struct SwitchA {
    static Result value( B::e b, C::e c ) {
        switch( b ) {
            case B::b1: return SwitchB<a, B::b1>::value( c );
            case B::b2: return SwitchB<a, B::b2>::value( c );
            default: return Result();
        }
    }
};
// template layer 1: work away the first parameter, then the next, ...
struct Switch {
    static Result value( A::e a, B::e b, C::e c ) {
        switch( a ) {
            case A::a1: return SwitchA<A::a1>::value( b, c );
            case A::a2: return SwitchA<A::a2>::value( b, c );
            case A::a3: return SwitchA<A::a3>::value( b, c );
            default: return Result();
        }
    }
};
And the knowledge base itself
// the template database
//
template<> Result valuedb<A::a1, B::b1, C::c1 >() { return "a1b1c1"; }
template<> Result valuedb<A::a1, B::b2, C::c2 >() { return "a1b2c2"; }
This is how it can be used.
int main()
{
// usage:
Result r = Switch::value( A::a1, B::b2, C::c2 );
return 0;
}
Yes, there is definitely an easier way to do that, both faster and simpler. The idea is basically the same as the one proposed by Alex Martelli. Instead of seeing your problem as two-dimensional, see it as a one-dimensional lookup table.
It means combining var, subvar, subsubvar, etc. to get one unique key and using it as your lookup table entry point.
The way to do it depends on the language used. In Python, combining var, subvar, etc. into a tuple and using it as a key in a dictionary is enough.
With C or similar languages it's usually simpler to convert each key to an enum, then combine them using bitwise operators to get just one number that you can use in your switch (that's also an easy way to use a switch instead of string comparisons with cascading ifs). You also get another benefit from doing it. It's quite usual that several treatments in different branches of the initial switch are the same. With the initial form it's quite difficult to make that obvious: you'll probably have some calls to the same functions, but at different points in the code. Now you can just group the identical cases when writing the switch.
I used such transformation several times in production code and it's easy to do and to maintain.
Summarily you can get something like this... the mix function obviously depends on your application specifics.
switch (mix(var, subvar))
{
    case a1:
        process a1;
        break;
    case b1:
        process b1;
        break;
    case c1:
        process c1;
        break;
    case a2:
        process a2;
        break;
    case b2:
        process b2;
        break;
    case c2:
        process c2;
        break;
    case a3:
        process a3;
        break;
    case b3:
        process b3;
        break;
    case c3:
        process c3;
        break;
}
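As one possible shape for mix, here is a C++ sketch that packs two enums into a single integer with a shift and an OR, so the combined value can also be used in case labels. The enum names, the two-bit packing and the "shared c" grouping are assumptions made up for the illustration; the real mix depends on your value ranges.

```cpp
#include <cassert>
#include <string>

enum Var { Var1 = 0, Var2 = 1, Var3 = 2 };
enum Sub { SubA = 0, SubB = 1, SubC = 2 };

// Hypothetical mix(): var goes in the high bits, sub in the two low bits.
constexpr int mix(Var var, Sub sub) {
    return (static_cast<int>(var) << 2) | static_cast<int>(sub);
}

std::string process(Var var, Sub sub) {
    assert(sub <= SubC);  // mix() only reserves two bits for sub
    switch (mix(var, sub)) {
        case mix(Var1, SubA): return "a1";
        case mix(Var1, SubB): return "b1";
        case mix(Var2, SubA): return "a2";
        case mix(Var2, SubB): return "b2";
        case mix(Var3, SubA): return "a3";
        case mix(Var3, SubB): return "b3";
        // Identical treatments from different branches can now sit side by side.
        case mix(Var1, SubC):
        case mix(Var2, SubC):
        case mix(Var3, SubC): return "shared c";
        default:              return "unhandled";
    }
}
```

Because mix is constexpr, the case labels stay readable as (var, sub) pairs instead of magic numbers.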
Perhaps what you want is code generation?
#! /usr/bin/env python3
first = [1, 2, 3]
second = ['a', 'b', 'c']
def emit(first, second):
    # Build the nested switch as text, one outer case per value of 'first'.
    result = "switch (var)\n{\n"
    for f in first:
        result += " case {0}:\n switch (subvar)\n {{\n".format(f)
        for s in second:
            result += " case {1}:\n process {1}{0};\n break;\n".format(f, s)
        result += " }\n break;\n"
    result += "}\n"
    return result
print(emit(first, second))
#open("autogen.c", "w").write(emit(first, second))
This is pretty hard to read, of course, and you might really want a nicer template language to do your dirty work, but this will ease some parts of your task.
If C++ is an option, I would try using virtual functions and maybe double dispatch. That could make it much cleaner, but it will probably pay off only if you have many more cases.
This article on DDJ.com might be a good entry.
If you're just trying to eliminate the two-level switch/case statements (and save some vertical space), you can encode the two variable values into a single value, then switch on it:
// Assumes var is in [1,3] and subvar in [1,3]
// and that var and subvar can be cast to int values
switch (10*var + subvar)
{
    case 10+1:
        process a1;
        break;
    case 10+2:
        process b1;
        break;
    case 10+3:
        process c1;
        break;
    //
    case 20+1:
        process a2;
        break;
    case 20+2:
        process b2;
        break;
    case 20+3:
        process c2;
        break;
    //
    case 30+1:
        process a3;
        break;
    case 30+2:
        process b3;
        break;
    case 30+3:
        process c3;
        break;
    //
    default:
        process error;
}
If your language is C#, and your choices are short enough and contain no special characters you can use reflection and do it with just a few lines of code. This way, instead of manually creating and maintaining an array of function pointers, use one that the framework provides!
Like this:
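using System;
using System.Reflection;
...
void DispatchCall(string var, string subvar)
{
    string functionName = "Func_" + var + "_" + subvar;
    // The handlers below are private instance methods, so non-public members must be requested explicitly.
    MethodInfo m = this.GetType().GetMethod(functionName, BindingFlags.Instance | BindingFlags.NonPublic);
    if (m == null) throw new ArgumentException("Invalid function name " + functionName);
    m.Invoke(this, new object[] { /* put parameters here if needed */ });
}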
void Func_1_a()
{
//executed when var=1 and subvar=a
}
void Func_2_charlie()
{
//executed when var=2 and subvar=charlie
}
Solution from developpez.com
Yes, you can optimize it and make it so much cleaner. You can use a kind of "Chain of Responsibility" with a Factory:
public class ProcessFactory {
private ArrayList<Process> processses = null;
public ProcessFactory(){
super();
processses = new ArrayList<Process>();
processses.add(new ProcessC1());
processses.add(new ProcessC2());
processses.add(new ProcessC3());
processses.add(new ProcessC4());
processses.add(new ProcessC5(6));
processses.add(new ProcessC5(22));
}
public Process getProcess(int var, int subvar){
for(Process process : processses){
if(process.canDo(var, subvar)){
return process;
}
}
return null;
}
}
Then, as long as your processes implement a Process interface with a canDo(var, subvar) method, you can simply use:
new ProcessFactory().getProcess(var,subvar).launch();
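For readers following the C++ snippets elsewhere on this page, the same idea might look roughly like the sketch below. It is not a translation of the original ProcessC1...ProcessC5 classes, which are not shown: FixedProcess, the registered (var, subvar) pairs and the returned labels are all invented for the illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Interface every process implements: "can you handle this case?" plus the work itself.
struct Process {
    virtual ~Process() = default;
    virtual bool canDo(int var, int subvar) const = 0;
    virtual std::string launch() const = 0;
};

// Hypothetical process that handles exactly one (var, subvar) combination.
class FixedProcess : public Process {
public:
    FixedProcess(int var, int subvar, std::string label)
        : var_(var), subvar_(subvar), label_(std::move(label)) {}
    bool canDo(int var, int subvar) const override { return var == var_ && subvar == subvar_; }
    std::string launch() const override { return label_; }
private:
    int var_;
    int subvar_;
    std::string label_;
};

class ProcessFactory {
public:
    ProcessFactory() {
        add(std::make_unique<FixedProcess>(1, 1, "a1"));
        add(std::make_unique<FixedProcess>(1, 2, "b1"));
        add(std::make_unique<FixedProcess>(2, 1, "a2"));
    }
    void add(std::unique_ptr<Process> process) {
        assert(process);  // a null entry would crash the lookup loop below
        processes_.push_back(std::move(process));
    }
    // Walks the chain and returns the first process that accepts the case, or nullptr.
    const Process* getProcess(int var, int subvar) const {
        for (const auto& process : processes_) {
            if (process->canDo(var, subvar)) return process.get();
        }
        return nullptr;
    }
private:
    std::vector<std::unique_ptr<Process>> processes_;
};
```

Note that getProcess can return a null pointer for an unknown combination, so the caller should check the result before calling launch().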

What is the advantage of this peculiar formatting?

I've seen this format used for comma-delimited lists in some C++ code (although this could apply to any language):
void function( int a
, int b
, int c
)
I was wondering why would someone use that over a more common format such as:
void function (int a,
int b,
int c
)
That's a pretty common coding style when writing SQL statements:
SELECT field1
, field2
, field3
-- , field4
, field5
FROM tablename
Advantages:
Lets you add, remove, or rearrange fields easily without having to worry about that final trailing comma.
Lets you easily comment out a row (TSQL uses "--") without messing up the rest of the statement.
I wouldn't think you'd want to rearrange parameter order in a function as frequently as you do in SQL, so maybe it's just somebody's habit.
The ability to comment one of them out will depend on the specific language being used. Not sure about C++. I know that VB.Net wouldn't allow it, but that's because it requires a continuation character ( _ ) to split statements across lines.
It is easier to add a parameter at the end by duplicating the previous parameter (line).
This makes sense when you are sure that the first parameter will never change, which is often the case.
Malice?
Seriously though, it's hard to account for formatting style sometimes. It's largely a matter of personal taste. Personally, I think that both forms are a little nasty unless you're seriously restricted in terms of line-length.
Another advantage is that in the first example you could comment-out either the B or C lines, and it will stay syntactically correct. In the second example, if you tried to comment out the C line, you'd have a syntax error.
Not really worth making it that ugly, if you ask me.
The only benefit I would see, is when you add a parameter, you just have to copy and paste the last line, saving you the extra couple key strokes of editing comma position and such.
Seems to me like a personal choice.
No reason, I suspect it's just a matter of personal preference.
I'd personally prefer the second one.
void function (int a,
int b,
int c
)
The only benefit I would see, is when you add a parameter, you just have to copy and paste the last line, saving you the extra couple key strokes of editing comma position and such.
The same goes for if you are removing the last parameter.
When scanning the file quickly, it's clear that each line that begins with a comma is a continuation of the line above it (compared to a line that's simply indented further than the one above). It's a generalization of the following style:
std::cout << "some info "
<< "some more info " << 4
+ 5 << std::endl;
(Please note, in this case, breaking up 4 + 5 is stupid, but if you have a complex math statement it may be necessary).
I use this a lot, especially when dealing with conditionals such as if, for, and while statements, because it's also common for one-line conditionals to omit the curlies.
std::vector<int> v = ...;
std::vector<int> w = ...;
for (std::vector<int>::iterator i = v.begin()
, j = w.begin()
; i != v.end() && j != w.end()
; ++i, ++j)
std::cout << *i + *j << std::endl;
When you add another field to the end, the single line you add contains the new comma, producing a diff of a single line addition, making it slightly easier to see what has changed when viewing change logs some time in the future.
It seems like most of the answers center around the ability to comment out or add new parameters easily. But it seems that you get the same effect with putting the comma at the end of the line rather than the beginning:
function(
int a,
int b,
// int c,
int d
)
You might say that you can't do that to the last parameter, and you would be right, but with the other form, you can't do it to the first parameter:
function (
// int a
, int b
, int c
, int d
)
So the tradeoff is being able to comment out the first parameter vs. being able to comment out the last parameter + being able to add new parameters without adding a comma to the previous last parameter.
I know that when I wrap ANDs in a SQL or if statement, I try to make sure the AND is at the start of the next line.
If A and B
and C
I think it makes it clear that the C is still part of the if. The first format you show may be that. But as with most style questions, the simple matter is that if the team decides on one style then it should be adhered to.
