shell script to get added lines of a file - shell

I get a large log file which I have to process.
After a week, I'll get a new one. It will be the same with added new lines (logs).
I just need the new added lines.
How do I do that?
EDIT: I've tried sed so far but haven't been successful

diff would allow you to find any and all differences between these files, as long as the changes are restricted to added and/or removed lines. On most Linux distributions it's a part of GNU diffutils, but it exists on pretty much every Unix-like system.

If lines are only appended to the log file and you still have the old one, you could try:
tail -n $(( $(wc -l < newLogFileName) - $(wc -l < oldLogFileName) )) newLogFileName

comm -13 oldfile newfile will get you the lines that appear only in newfile. Note that comm expects both files to be sorted.

# get new.log
tail -n+$(($(wc -l < old.log)+1)) new.log
mv new.log old.log
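The tail approach above can be wrapped in a small function that first checks that the new log really does begin with the old one. This is only a sketch: the name added_lines is made up for illustration, and it assumes both files end with a newline and that lines are only ever appended.

```shell
#!/bin/sh
# added_lines OLD NEW (hypothetical helper, not from the answers above)
# Prints the lines of NEW that follow the first N lines, where N is the
# number of lines in OLD. Assumes NEW is OLD plus appended lines.
added_lines() {
    old=$1
    new=$2
    # Arithmetic expansion strips the padding some wc implementations add.
    old_count=$(( $(wc -l < "$old") ))
    # Sanity check: the first old_count lines of NEW must be identical to OLD.
    if ! head -n "$old_count" "$new" | cmp -s - "$old"; then
        echo "added_lines: $new does not start with the contents of $old" >&2
        return 1
    fi
    tail -n "+$((old_count + 1))" "$new"
}
```

Usage would look like added_lines old.log new.log > added.log, followed by mv new.log old.log once the added lines have been processed.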

Related

Output into new column .CSV Shell

I am still new to shell scripting. In JavaScript it is super easy to put output into a new column: all you need is a comma. But I am still struggling to do the same in the shell. I've gone through most of the answers on Stack Overflow and still couldn't get it to work. Most of the answers are about cutting from an existing file and pasting into a new one, etc. I'm pretty sure I am making a simple syntax error somewhere.
At the moment I have this:
echo "Mq1:" >> ~/Desktop/howmanySKUs.csv
cd /Volumes/Hams\ Hall\ Workspace/Mannequin_1_WIP && ls |grep \_01.tif$ | wc -l | sed "s/,//" >> ~/Desktop/howmanySKUs.csv
It counts the amount of files in specified directory.
I get this:
But now I am trying to Output Mq1: in one column, and then the sum of found files in the 2nd column.
Desired Output:
Any help would be much appreciated.
You can write the label and the count on the same line, separated by a comma:
cd /Volumes/Hams\ Hall\ Workspace/Mannequin_1_WIP && echo "Mq1:,"`ls |grep \_01.tif$ | wc -l` > ~/Desktop/howmanySKUs.csv
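A variation on the same idea, in case it helps: the sketch below counts matching files with a glob instead of parsing ls output, and prints one label,count row. The function name count_row and its three arguments are assumptions made for this example.

```shell
#!/bin/sh
# count_row LABEL DIR SUFFIX (hypothetical helper)
# Prints "LABEL,COUNT", where COUNT is the number of regular files in DIR
# whose names end in SUFFIX.
count_row() {
    label=$1
    dir=$2
    suffix=$3
    count=0
    for f in "$dir"/*"$suffix"; do
        # An unmatched glob stays literal, so only count files that exist.
        [ -f "$f" ] && count=$((count + 1))
    done
    printf '%s,%s\n' "$label" "$count"
}
```

For the directory in the question, the call would be something like count_row "Mq1:" "/Volumes/Hams Hall Workspace/Mannequin_1_WIP" "_01.tif" >> ~/Desktop/howmanySKUs.csv, which appends one row per call.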

How to quickly check a .gz file without unzip? [duplicate]

How do I get the first few lines from a gzipped file?
I tried zcat, but it's throwing an error:
zcat CONN.20111109.0057.gz|head
CONN.20111109.0057.gz.Z: A file or directory in the path name does not exist.
zcat(1) can be supplied by either compress(1) or by gzip(1). On your system, it appears to be compress(1) -- it is looking for a file with a .Z extension.
Switch to gzip -cd in place of zcat and your command should work fine:
gzip -cd CONN.20111109.0057.gz | head
Explanation
-c --stdout --to-stdout
Write output on standard output; keep original files unchanged. If there are several input files, the output consists of a sequence of independently compressed members. To obtain better compression, concatenate all input files before compressing them.
-d --decompress --uncompress
Decompress.
On some systems (e.g., Mac), you need to use gzcat.
On a Mac you need to use < with zcat:
zcat < CONN.20111109.0057.gz|head
If a contiguous range of lines is needed, one option might be:
gunzip -c file.gz | sed -n '5,10p;11q' > subFile
which extracts lines 5 through 10 (inclusive) of file.gz into a new file, subFile. For sed options, refer to the manual.
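If every, say, 5th line is required (GNU sed only, since first~step addresses are a GNU extension):
gunzip -c file.gz | sed -n '1~5p' > subFile
which prints the 1st line, skips 4 lines, prints the 6th line, and so on. Use 0~5p instead to get lines 5, 10, 15, and so on.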
If you want to use zcat, this will show the first 10 rows
zcat your_filename.gz | head
Let's say you want the first 16 rows:
zcat your_filename.gz | head -n 16
This awk snippet lets you show not only the first few lines but any range you specify. It also adds line numbers, which I needed for debugging an error message pointing to a certain line way down in a gzipped file.
gunzip -c file.gz | awk -v from=10 -v to=20 'NR>=from { print NR,$0; if (NR>=to) exit 1}'
Here is the awk snippet used in the one-liner above. In awk, NR is a built-in variable (the number of records read so far), which is usually equivalent to the line number. The from and to variables are picked up from the command line via the -v options.
NR>=from {
print NR,$0;
if (NR>=to)
exit 1
}
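The pieces above (gzip -cd for portability, awk for the numbered range) can be combined into one reusable function. This is a sketch only: gz_lines is a made-up name, and the "N: text" output format is a choice made for this example.

```shell
#!/bin/sh
# gz_lines FILE FROM TO (hypothetical helper)
# Prints lines FROM through TO of a gzip-compressed FILE, each prefixed
# with its line number, without decompressing the file on disk.
gz_lines() {
    file=$1
    from=$2
    to=$3
    gzip -cd "$file" |
        awk -v from="$from" -v to="$to" '
            NR >= from { print NR ": " $0 }   # inside the range: print it
            NR >= to   { exit }               # stop reading after line TO
        '
}
```

For example, gz_lines CONN.20111109.0057.gz 10 20 would show lines 10 to 20. Because awk exits as soon as the range is done, the rest of a large file is not read.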

Create files using grep and wildcards with input file

This should be a no-brainer, but apparently I have no brain today.
I have 50 20-gig logs that contain entries from multiple apps, one of which adds a transaction ID to its log lines. I have 42 transaction IDs I need to review, and I'd like to parse out the appropriate lines into separate files.
To do a single file, the command would be simply,
grep CDBBDEADBEEF2020X02393 server.log* > CDBBDEADBEEF2020X02393.log
that creates a log isolated to that transaction, from all 50 server.logs.
Now, I have a file with 42 txnIDs (shortening to 4 here):
CDBBDEADBEEF2020X02393
CDBBDEADBEEF6548X02302
CDBBDE15644F2020X02354
ABBDEADBEEF21014777811
And I wrote:
#/bin/sh
grep $1 server.\* > $1.log
But that is not working. Changing the shebang to #/bin/bash -xv, gives me this weird output (obviously I'm playing with what the correct escape magic must be):
$ ./xtrakt.sh B7F6E465E006B1F1A
#!/bin/bash -xv
grep - ./server\.\*
' grep - './server.*
: No such file or directory
I have also tried the command line
grep - server.* < txids.txt > $1
But OBVIOUSLY that $1 is pointless and I have no idea how to get a file named per txid using the input redirect form of the command.
Thanks in advance for any ideas. I haven't gone the route of doing a foreach in the shell script, because I want grep to put the original filename in the output lines so I can examine context later if I need to.
Also - it would be great to have the server.* files ordered numerically (server.log.1, server.log.2 NOT server.log.1, server.log.10...)
try this:
while read -r txid
do
grep "$txid" server.* > "$txid.log"
done < txids.txt
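For the file ordering, rename the files that have a one-digit suffix so they use two digits with a leading zero, e.g. mv server.log.1 server.log.01. The shell expands server.* in lexical order, which then matches numeric order.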
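The renaming step could be scripted along these lines. It is a sketch under the assumption that the logs are named server.log.N and that there are fewer than 100 of them; pad_logs is a hypothetical name.

```shell
#!/bin/sh
# pad_logs DIR (hypothetical helper)
# Renames DIR/server.log.N to DIR/server.log.0N for single-digit N, so that
# the glob server.log.* expands in numeric order.
pad_logs() {
    dir=$1
    for f in "$dir"/server.log.[0-9]; do
        # Skip the literal pattern when nothing matches.
        [ -e "$f" ] || continue
        n=${f##*.}                   # the single-digit suffix
        mv -- "$f" "${f%.*}.0$n"     # server.log.1 becomes server.log.01
    done
}
```

After running pad_logs . once, the while loop above can keep using the server.* glob unchanged.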

Process loop over multiple file sets

In order to simplify my work I usually do this:
for FILE in ./*.txt;
do ID=`echo ${FILE} | sed 's/^.*\///'`;
bin/Tool ${FILE} > ${ID}_output.txt;
done
Hence process loops over all *.txt files.
Now I have two file groups - my Tool uses two inputs (-a & -b). Is there any command to run Tool for every FILE_A over every FILE_B and name the output file as a combination of both them?
I imagine it should look like something like this:
for FILE_A in ./filesA/*.txt;
do for FILE_B in ./filesB/*.txt;
bin/Tool -a ${FILE_A} -b ${FILE_B} > output.txt;
done
So the process would run number of *.txt in filesA over number of *.txt in filesB.
And also the naming issue which I even don't know where to put in...
Hope it is clear what I am asking. Never had to do such task before and a command line would be really helpful.
Looking forward!
NEWNAME="${FILE_A##*/}_${FILE_B##*/}_output.txt"
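Put together with the nested loop from the question, that might look like the sketch below. The function name run_pairs and the output directory argument are assumptions for this example, and unlike the NEWNAME line above it also strips the .txt extension from each input name.

```shell
#!/bin/sh
# run_pairs DIR_A DIR_B TOOL OUT_DIR (hypothetical helper)
# Runs TOOL -a A -b B for every pair of .txt files (A from DIR_A, B from
# DIR_B) and writes each result to OUT_DIR/<A>_<B>_output.txt.
run_pairs() {
    dir_a=$1
    dir_b=$2
    tool=$3
    out_dir=$4
    for file_a in "$dir_a"/*.txt; do
        for file_b in "$dir_b"/*.txt; do
            # Skip literal patterns left over from unmatched globs.
            [ -f "$file_a" ] && [ -f "$file_b" ] || continue
            name_a=${file_a##*/}      # drop the directory part
            name_a=${name_a%.txt}     # drop the extension
            name_b=${file_b##*/}
            name_b=${name_b%.txt}
            "$tool" -a "$file_a" -b "$file_b" > "$out_dir/${name_a}_${name_b}_output.txt"
        done
    done
}
```

With the layout from the question this would be called as run_pairs ./filesA ./filesB bin/Tool . and produces one output file per pair.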

Copying part of a large file using command line

I've a text file with 2 million lines. Each line has some transaction information.
e.g.
23848923748, sample text, feild2 , 12/12/2008
etc
What I want to do is create a new file from a certain unique transaction number onwards. So I want to split the file at the line where this number exists.
How can I do this from the command line?
I can find the line by doing this:
cat myfile.txt | grep 23423423423
use sed like this
sed '/23423423423/,$!d' myfile.txt
Just confirm that the unique transaction number cannot appear as a pattern in some other part of the line (especially, before the correctly matching line) in your file.
There is already a 'perl' answer here, so I'll give one more AWK way :-)
awk 'BEGIN{skip=1} /number/ {skip=0} {if (skip!=1) print $0}' myfile.txt
On a random file in my tmp directory, this is how I output everything from the line matching popd onwards in a file named tmp.sh:
tail -n+`grep -n popd tmp.sh | cut -f 1 -d:` tmp.sh
tail -n+X prints from line X onwards; grep -n prefixes each matching line with lineno:, and cut extracts just lineno from grep's output.
So for your case it would be:
tail -n+`grep -n 23423423423 myfile.txt | cut -f 1 -d:` myfile.txt
And it should indeed match from the first occurrence onwards.
It's not a pretty solution, but how about using -A parameter of grep?
Like this:
mc#zolty:/tmp$ cat a
1
2
3
4
5
6
7
mc#zolty:/tmp$ cat a | grep 3 -A1000000
3
4
5
6
7
The only problem I see in this solution is the 1000000 magic number. Probably someone will know the answer without using such a trick.
You can probably get the line number using grep and then use tail to print the file from that point into your output file.
Sorry I don't have actual code to show, but hopefully the idea is clear.
I would write a quick Perl script, frankly. It's invaluable for anything like this (relatively simple issues) and as soon as something more complex rears its head (as it will do!) then you'll need the extra power.
Something like:
#!/bin/perl
my $out = 0;
while (<STDIN>) {
$out = 1 if /23423423423/;
print $_ if $out;
}
and run it using:
$ perl mysplit.pl < input > output
Not tested, I'm afraid.
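For comparison, the same "print everything from the first match onwards" logic as a shell function built on awk. This is a sketch: from_match is a made-up name, and it matches the transaction number as a plain substring (via index) rather than as a regular expression, so it assumes the number contains no backslashes.

```shell
#!/bin/sh
# from_match TEXT FILE (hypothetical helper)
# Prints FILE from the first line containing TEXT through the end of the file.
from_match() {
    text=$1
    file=$2
    # index() does a literal substring search; once a line matches, the
    # flag stays set and every following line is printed as well.
    awk -v text="$text" 'index($0, text) { found = 1 } found' "$file"
}
```

For the file in the question: from_match 23423423423 myfile.txt > newfile.txt. As noted above, this still assumes the number does not appear earlier in the file as part of some other field.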
