Is there any difference between the terms line of code and statement?

Is there any difference between the terms 'line of code' and 'statement' in programming languages?

Yes. Consider the following:
int a = 0;
while (a < 100)
{
    cout <<
        "Is it ok" <<
        a <<
        "is the current value of 'a'" << endl;
    a++;
}
The code snippet above has:
Lines of code: 9
Simple statements: 3
Compound statements: 1

A line of code is a physical line in the source file, so the count is basically how many line endings you use. A statement is a unit of code that you write to produce an expected result.
For example, a conditional statement using if/else can be written across multiple lines of code or on a single line using the ternary operator.
Sample:
if ($var == 0) {
    echo "this is zero";
} else {
    echo "not a zero";
}
That takes 5 lines of code. With the ternary operator you can write it like this:
echo $var == 0 ? "this is zero" : "not a zero";
As you can see, the result is the same, but it takes only 1 line of code.
Hope this helps you moving forward.

Lines of code presumably refers to lines of content in a source file, i.e. the number of lines in a file. On the other hand, a given statement of code can, and often does, exceed a single line. For example, consider the following Java statement from the Javadoc for streams:
int sum = widgets.stream()
                 .filter(b -> b.getColor() == RED)
                 .mapToInt(b -> b.getWeight())
                 .sum();
One statement spans four physical lines in the editor. However, we could inline the whole thing into a single line.
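To make the distinction measurable, here is a minimal sketch in Python (picked only for illustration; the helper name count_lines_and_statements is made up for this example). It counts non-blank physical lines by splitting the text, and counts statements by walking the syntax tree from the standard ast module, so a call spread over several lines still counts as one statement.

```python
import ast


def count_lines_and_statements(source: str) -> tuple[int, int]:
    """Return (non-blank physical lines, statements) for Python source."""
    lines = sum(1 for line in source.splitlines() if line.strip())
    # Every statement node in the tree is an instance of ast.stmt, including
    # compound statements (while, if) and the statements nested inside them.
    tree = ast.parse(source)
    statements = sum(isinstance(node, ast.stmt) for node in ast.walk(tree))
    return lines, statements


snippet = """\
a = 0
while a < 100:
    print(
        "Is it ok",
        a,
        "is the current value of 'a'",
    )
    a += 1
"""

print(count_lines_and_statements(snippet))  # (8, 4)
```

The snippet has 8 physical lines but only 4 statements: the assignment, the while loop, the print call, and the increment.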

Related

How does an interpreted program (e.g. Perl, shell) work?

I propose a hypothesis: 1. the operating system creates a process space to start the interpreter; 2. the interpreter creates a new process space to start the program that needs to be interpreted, translating the first statement into machine language; 3. the execution of the first statement ends and the program is interrupted; 4. the interpreter translates the next statement, dynamically modifying and creating new instructions. Well, I can't figure it out. I don't understand the concept of interpreting and executing.
Here's an example interpreter:
while (<>) {
    my ($cmd, @args) = split;
    if ($cmd eq '...') { ... }
    elsif ($cmd eq '...') { ... }
    elsif ($cmd eq '...') { ... }
    else { ... }
}
This points out that the interpreted program isn't run in a separate process from the interpreter.
This also points out there isn't necessarily any translation to machine language.
Do note that Perl is a compiled language rather than an interpreted one.
$ perl -MO=Concise,-exec -e'print("Hello, world!\n");'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <0> pushmark s
4 <$> const[PV "Hello, world!\n"] s
5 <#> print vK
6 <#> leave[1 ref] vKP/REFC
-e syntax OK
That said, the compiled form is not native instructions. There are different ways this can be handled, but Perl effectively interprets these. The following is that interpreter:
int
Perl_runops_standard(pTHX)
{
    OP *op = PL_op;
    PERL_DTRACE_PROBE_OP(op);
    while ((PL_op = op = op->op_ppaddr(aTHX))) {
        PERL_DTRACE_PROBE_OP(op);
    }
    PERL_ASYNC_CHECK();
    TAINT_NOT;
    return 0;
}
(Copied from here.)
The ops are really data structures arranged in a linked list (with other pointers for jumps) rather than a stream of bytes encoding instructions. The above loop traverses the list, executing the function associated with each op. Each function returns the address of the next op to execute, thus forming the program.
Some languages probably take a similar approach. Other languages definitely take a different approach.
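As a rough illustration of that "linked ops plus a run loop" idea, here is a toy sketch in Python. It is not Perl's actual implementation; the names Op, op_const, op_print and run_ops are invented for this example. Each op carries a handler function, and the handler returns the next op to execute.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Op:
    """One node of the compiled program: a handler plus a link to the next op."""
    run: Callable[["Op", list], Optional["Op"]]
    arg: object = None
    next: Optional["Op"] = None


def op_const(op, stack):
    stack.append(op.arg)  # push a constant onto the stack
    return op.next        # tell the loop which op comes next


def op_print(op, stack):
    print(stack.pop(), end="")
    return op.next


def run_ops(op, stack):
    # The whole "interpreter": keep calling handlers until one returns None.
    while op is not None:
        op = op.run(op, stack)


# "Compile" print("Hello, world!\n") by hand into two linked ops.
program = Op(op_const, "Hello, world!\n")
program.next = Op(op_print)

run_ops(program, [])
```

Nothing here is translated to machine code, and nothing runs in a separate process: the interpreted program is just data that the loop walks.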

How can I find both identical and similar strings in a particular field in a text file in Linux?

My apologies ahead of time - I'm not sure that there is an answer for this one using only Linux command-line fu. Please note I am not a programmer, but I have been playing around with bash and python a bit over the last few years.
I have a large text file with rows and columns that resemble the following (note - fields are separated with tabs):
1074 Beetle OOB11061MNH 12/22/16 Confirmed
3430 Hightop 0817BESTYET 08/07/17 Queued
3431 Hightop 0817BESTYET 08/07/17 Queued
3078 Copland 2017GENERAL 07/07/17 Confirmed
3890 Bartok FOODS 09/11/17 Confirmed
5440 Alphapha 00B1106IMNH 01/09/18 Queued
What I want to do is find and output only those rows where the third field is either identical OR similar to another in the list. I don't really care whether the other fields are similar or not, but they should all be included in the output. By similar, I mean no more than [n] characters are different in that particular field (for example, no more than 3 characters are different). So the output I would want would be:
1074 Beetle OOB11061MNH 12/22/16 Confirmed
3430 Hightop 0817BESTYET 08/07/17 Queued
3431 Hightop 0817BESTYET 08/07/17 Queued
5440 Alphapha 00B1106IMNH 01/09/18 Queued
The line beginning 1074 has a third field that differs by 3 characters with 5440, so both of them are included. 3430 and 3431 are included because they are exactly identical. 3078 and 3890 are eliminated because they are not similar.
Through googling the forums I've managed to piece together this rather longish pipeline to be able to find all of the instances where field 3 is exactly identical:
cat inputfile.txt | awk 'BEGIN { OFS=FS="\t" } {if (count[$3] > 1) print $0; else if (count[$3] == 1) { print save[$3]; print $0; } else save[$3] = $0; count[$3]++; }' > outputfile.txt
I must confess I don't really understand awk all that well; I'm just copying and adapting from the web. But that seemed to work great at finding exact duplicates (i.e., it would output only 3430 and 3431 above). But I have no idea how to approach trying to find strings that are not identical but that differ in no more than 3 places.
For instance, in my example above, it should match 1074 and 5440 because they would both fit the pattern:
??B1106?MNH
But I would want it to be able to match also any other random pattern of matches, as long as there are no more than three differences, like this:
20?7G?N?RAL
These differences could be arbitrarily in any position.
The reason for needing this is we are trying to find a way to automatically find typographical errors in a serial-number-like field. There might be a mis-key, or perhaps a letter "O" replaced with a number "0", or the like.
So... any ideas? Thanks for the help!
You can use this script:
$ more hamming.awk
function hamming(x,y,xs,ys,min,max,h) {
    if(x==y) return 0;
    else {
        nx=split(x,xs,"");
        mx=split(y,ys,"");
        min=nx<mx?nx:mx;
        max=nx<mx?mx:nx;
        for(i=1;i<=min;i++) if(xs[i]!=ys[i]) h++;
        return h+(max-min);
    }
}
BEGIN {FS=OFS="\t"}
NR==FNR {
    if($3 in a) nrs[NR];
    for(k in a)
        if(hamming(k,$3)<4) {
            nrs[NR];
            nrs[a[k]];
        }
    a[$3]=NR;
    next
}
FNR in nrs
Usage:
$ awk -f hamming.awk file{,}
It's a double-scan algorithm that finds the Hamming distance (the one you described) between keys. Note that it's an O(n^2) algorithm, so it may not be suitable for very large data sets. However, I'm not sure any other algorithm can do better.
NB: an additional note based on a comment that I missed in the post. This algorithm compares the keys character by character, so displacements won't be identified. For example, 123 and 23 will give a distance of 3.
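For readers more comfortable with Python, the same distance can be sketched as follows. This mirrors the definition used in the awk function (mismatched positions plus the difference in length); the function name hamming is simply reused for illustration.

```python
def hamming(x: str, y: str) -> int:
    """Count positions that differ, plus the difference in length."""
    mismatches = sum(a != b for a, b in zip(x, y))
    return mismatches + abs(len(x) - len(y))


print(hamming("OOB11061MNH", "00B1106IMNH"))  # 3
print(hamming("123", "23"))                   # 3
```

The second call shows the displacement problem: dropping one leading character shifts every position, so the distance is 3 even though only one character is missing.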
Levenshtein distance, aka "edit distance", suits your task best. The Perl script below requires the Text::Levenshtein module (on Debian/Ubuntu: sudo apt install libtext-levenshtein-perl).
use Text::Levenshtein qw(distance);
$maxdist = shift;
@ll = (<>);
@k = map {
    $k = (split /\t/, $_)[2];
    # $k =~ s/O/0/g;
    $k;
} @ll;
for ($i = 0; $i < @ll; ++$i) {
    for ($j = 0; $j < @ll; ++$j) {
        if ($i != $j and distance($k[$i], $k[$j]) <= $maxdist) {
            print $ll[$i];
            last;
        }
    }
}
Usage:
perl lev.pl 3 inputfile.txt > outputfile.txt
The algorithm is the same O(n^2) as in @karakfa's post, but the matching is more flexible.
Also note the commented line # $k =~ s/O/0/g;. If you uncomment it, then all O's in the key will become 0's, which will fix keys damaged by the O->0 transformation. When working with damaged data I always use small rules like this to fix the data gradually, refining the rules from run to run, to the point where the data is almost perfect and fuzzy matching is no longer needed.
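If installing a module is not an option, edit distance is short enough to write by hand. Below is a minimal sketch in Python of the standard dynamic-programming formulation; it is not the code of Text::Levenshtein, and the function name levenshtein is just a label for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes and substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


print(levenshtein("OOB11061MNH", "00B1106IMNH"))  # 3
print(levenshtein("123", "23"))                   # 1
```

Unlike a position-by-position comparison, a single dropped character costs only 1 here, which is why it is the more flexible match for mis-keyed serial numbers.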

golang's fallthrough seems unexpected

I have the following code:
package main
import (
    "fmt"
)
func main() {
    switch {
    case 1 == 1:
        fmt.Println("1 == 1")
        fallthrough
    case 2 == 1:
        fmt.Println("2 == 1")
    }
}
Which prints both lines on the go playground - see example here. I would have expected the fallthrough statement to include evaluation of the next case statement, but this seems not to be the case.
Of course, I can always use a bunch of if statements, so this is not a real impediment, but I am curious what the intention here is, since this seems to me to be a non-obvious result.
Anyone care to explain? For example: in this code, how can I get the 1st and 3rd cases to execute?
Switch is not a bunch of ifs. It's more akin to an if {} else if {} construct, but with a couple of twists, namely break and fallthrough. It's not possible to make a switch execute the first and third cases: a switch does not check each condition, it finds the first match and executes it. That's all.
Its primary purpose is to walk through a list of possible values and execute different code for each value. In fact, in C (where the switch statement came from) the switch expression can only be of integral type and case values can only be constants that the switch expression will be compared to. It's only relatively recently that languages started adding support for strings, boolean expressions, etc. in switch cases.
As to the fallthrough logic, it also comes from C. There is no fallthrough statement in C. In C, execution falls through into the next case (without checking case values) unless a break statement is encountered. The reason for this design is that sometimes you need to do something special and then do the same steps as in another case. So, this design merely allows that. Unfortunately, it's rather rarely useful, so falling through by default caused more trouble when a programmer forgot to put a break statement in than it helped when the break was truly omitted intentionally. So, many modern languages changed this logic to never fall through by default and to require an explicit fallthrough statement if falling through is actually required.
Unfortunately, it's a bit hard to come up with a non-contrived example of fallthrough being useful that would be short enough to fit into an answer. As I said, it's relatively rare. But sometimes you need to write code similar to this:
if x == a || x == b {
    if x == a {
        // do action a
    }
    // do action ab
} else if x == c {
    // do action c
} else if x == d {
    // do action d
}
In fact, I needed code of similar structure quite recently in one of my projects. So, I used switch statement instead. And it looked like this:
switch x {
case a: // do action a
    fallthrough
case b: // do action ab
case c: // do action c
case d: // do action d
}
And your switch from the question is functionally equivalent to this:
if 1 == 1 || 2 == 1 {
    if 1 == 1 {
        fmt.Println("1 == 1")
    }
    fmt.Println("2 == 1")
}
Presumably, Go's fallthrough behavior is modeled after C, which always worked like this. In C, switch statements are just shorthands for chains of conditional gotos, so your particular example would be compiled as if it was written like:
# Pseudocode
    if 1 == 1 goto alpha
    if 2 == 1 goto beta
alpha:
    fmt.Println("1 == 1")
beta:
    fmt.Println("2 == 1")
As you can see, once execution enters the alpha case, it would just keep flowing down, ignoring the beta label (since labels by themselves don't really do anything). The conditional checks have already happened and won't happen again.
Hence, the non-intuitive nature of fallthrough switch statements is simply because switch statements are thinly veiled goto statements.
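The two phases (pick an entry point once, then just keep running) can be simulated in a few lines. This is only a sketch in Python to show the control flow, not how Go compiles a switch; run_switch and the (condition, body, falls_through) tuple layout are invented for the example.

```python
def run_switch(clauses):
    """clauses: list of (condition, body, falls_through) tuples."""
    # Phase 1: evaluate conditions top to bottom and stop at the first true one.
    start = next((i for i, (cond, _, _) in enumerate(clauses) if cond()), None)
    if start is None:
        return
    # Phase 2: run bodies from there; no condition is evaluated again.
    for _, body, falls_through in clauses[start:]:
        body()
        if not falls_through:
            break


output = []
run_switch([
    (lambda: 1 == 1, lambda: output.append("1 == 1"), True),   # ends with fallthrough
    (lambda: 2 == 1, lambda: output.append("2 == 1"), False),
])
print(output)  # ['1 == 1', '2 == 1']
```

The second body runs even though its condition is false, because after the jump the conditions are never consulted again.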
From the language spec:
A "fallthrough" statement transfers control to the first statement of the next case clause in an expression "switch" statement. It may be used only as the final non-empty statement in such a clause.
That seems to perfectly describe your observed behavior.

While loop throwing error at the start of while

I want to take the user's input of a positive integer where 1 < a < 10^6 and run a loop on it and then store it in a matrix which gets printed to the screen. However, my code is throwing a syntax error pointing to the letter "e" in while. Does anyone know why this error is appearing?
A = (while (a!=1)
If(rem(a,2)=0
floor(a^(1/2));
Else
floor(a^(3/2));
endwhile)
disp(A);
You have several different problems in your code:
a while loop doesn't return anything, so you can't assign it to A
syntax is case sensitive, so it's if and else, not If and Else
you're missing a closing parenthesis after the if condition
you're missing an endif
rem(a,2)=0 is an assignment; use == to compare for equality

What is the advantage of this peculiar formatting?

I've seen this format used for comma-delimited lists in some C++ code (although this could apply to any language):
void function( int a
             , int b
             , int c
             )
I was wondering why would someone use that over a more common format such as:
void function (int a,
               int b,
               int c
              )
That's a pretty common coding style when writing SQL statements:
SELECT field1
     , field2
     , field3
--   , field4
     , field5
FROM tablename
Advantages:
Lets you add, remove, or rearrange fields easily without having to worry about that final trailing comma.
Lets you easily comment out a row (TSQL uses "--") without messing up the rest of the statement.
I wouldn't think you'd want to rearrange parameter order in a function as frequently as you do in SQL, so maybe it's just somebody's habit.
The ability to comment one of them out will depend on the specific language being used. Not sure about C++. I know that VB.Net wouldn't allow it, but that's because it requires a continuation character ( _ ) to split statements across lines.
It is easier to add a parameter at the end by duplicating the previous parameter (line).
It makes sense when you are sure that the first parameter will never change, which is often the case.
Malice?
Seriously though, it's hard to account for formatting style sometimes. It's largely a matter of personal taste. Personally, I think that both forms are a little nasty unless you're seriously restricted in terms of line-length.
Another advantage is that in the first example you could comment-out either the B or C lines, and it will stay syntactically correct. In the second example, if you tried to comment out the C line, you'd have a syntax error.
Not really worth making it that ugly, if you ask me.
The only benefit I would see, is when you add a parameter, you just have to copy and paste the last line, saving you the extra couple key strokes of editing comma position and such.
Seems to me like a personal choice.
No reason, I suspect it's just a matter of personal preference.
I'd personally prefer the second one.
void function (int a,
               int b,
               int c
              )
The only benefit I would see, is when you add a parameter, you just have to copy and paste the last line, saving you the extra couple key strokes of editing comma position and such.
The same goes for if you are removing the last parameter.
When scanning the file quickly, it's clear that each line that begins with a comma is a continuation of the line above it (compared to a line that's simply indented further than the one above). It's a generalization of the following style:
std::cout << "some info "
          << "some more info " << 4
             + 5 << std::endl;
(Please note, in this case, breaking up 4 + 5 is stupid, but if you have a complex math statement it may be necessary).
I use this a lot, especially when dealing with if, for, and while statements, because it's also common for one-line conditionals to omit the curly braces.
std::vector<int> v = ...;
std::vector<int> w = ...;
for (std::vector<int>::iterator i = v.begin()
                              , j = w.begin()
    ; i != v.end() && j != w.end()
    ; ++i, ++j)
    std::cout << *i + *j << std::endl;
When you add another field to the end, the single line you add contains the new comma, producing a diff of a single line addition, making it slightly easier to see what has changed when viewing change logs some time in the future.
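The diff claim is easy to check. Here is a small sketch in Python using the standard difflib module; the helper changed_lines and the two sample signatures are made up for the comparison. It keeps only the added and removed lines of a unified diff for each comma style.

```python
import difflib


def changed_lines(before: str, after: str) -> list[str]:
    """Return only the added/removed lines of a unified diff."""
    diff = difflib.unified_diff(before.splitlines(), after.splitlines(), lineterm="")
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Leading commas: appending a parameter touches one line.
leading_before = "f( int a\n , int b\n )"
leading_after = "f( int a\n , int b\n , int c\n )"

# Trailing commas: the previous last line changes too (it gains a comma).
trailing_before = "f(int a,\n  int b\n )"
trailing_after = "f(int a,\n  int b,\n  int c\n )"

print(changed_lines(leading_before, leading_after))    # ['+ , int c']
print(changed_lines(trailing_before, trailing_after))  # ['-  int b', '+  int b,', '+  int c']
```

With leading commas the change log shows one added line; with trailing commas it shows one removed and two added lines for the same logical change.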
It seems like most of the answers center around the ability to comment out or add new parameters easily. But it seems that you get the same effect with putting the comma at the end of the line rather than the beginning:
function(
    int a,
    int b,
    // int c,
    int d
)
You might say that you can't do that to the last parameter, and you would be right, but with the other form, you can't do it to the first parameter:
function (
    // int a
    , int b
    , int c
    , int d
)
So the tradeoff is being able to comment out the first parameter vs. being able to comment out the last parameter + being able to add new parameters without adding a comma to the previous last parameter.
I know when I wrap and's in a SQL or if statement I try to make sure the and is the start of the next line.
If A and B
   and C
I think it makes it clear that the C is still part of the if. The first format you show may be that. But as with most style questions, the simple matter is that if the team decides on one style then it should be adhered to.

Resources