Bash - Read only the last 100 lines of a long log file

I am reading a long log file and splitting the columns in variables using bash.
cd $LOGDIR
IFS=","
while read LogTIME name md5
do
LogTime+="$(echo $LogTIME)"
Name+="$(echo $name)"
LOGDatamd5+="$(echo $md5)"
done < LOG.txt
But this is really slow and I don't need all the lines. The last 100 lines are enough (but the log file itself needs all the other lines for different programs).
I tried to use tail -n 10 LOG.txt | while read LogTIME name md5, but that takes really long as well and I had no output at all.
Another way I tested without success was:
cd $LOGDIR
foo="$(tail -n 10 LOG.txt)"
IFS=","
while read LogTIME name md5
do
LogTime+="$(echo $LogTIME)"
Name+="$(echo $name)"
LOGDatamd5+="$(echo $md5)"
done < "$foo"
But that gives me only the output of foo in total. Nothing was written into the variables inside the while loop.
There is probably a really easy way to do this, that I can't see...
Cheers,
BallerNacken

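Process substitution is the common pattern:
while IFS=, read -r LogTIME name md5 ; do
LogTime+=$LogTIME
Name+=$name
LOGDatamd5+=$md5
done < <(tail -n 100 LOG.txt)
Unlike a pipeline, this runs the loop in the current shell, so the variables are still set after done. Your second attempt fails because done < "$foo" expects a file name, not the text itself.
Note that you don't need "$(echo $var)"; you can assign $var directly.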
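A minimal sketch of the same pattern, assuming you want each column kept as separate array elements rather than one concatenated string. The function name read_log_tail and the switch to arrays are illustrative assumptions, not part of the question:

```bash
# Hypothetical helper: read the last N lines of a comma-separated log
# into three bash arrays (LogTime, Name, LOGDatamd5).
read_log_tail() {
    local file=$1 count=${2:-100}
    local t n m
    LogTime=() Name=() LOGDatamd5=()
    # IFS=, applies only to this read; -r keeps backslashes literal.
    while IFS=, read -r t n m; do
        LogTime+=("$t")
        Name+=("$n")
        LOGDatamd5+=("$m")
    done < <(tail -n "$count" "$file")   # loop runs in the current shell
}
```

After read_log_tail LOG.txt 100, "${Name[0]}" holds the name from the oldest of the last 100 lines and "${#Name[@]}" gives the number of lines read.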

Read a txt file and iterate through it in bash script [duplicate]

I have 20 files from which I want to grep all the lines that have inside a given id (id123), and save them in a new text file. So, in the end, I would have several txt files, as much as ids we have.
If you have a small number of Ids, you can create a script with the list inside. E.g:
list=("id123" "id124" "id125" "id126")
for i in "${list[@]}"
do
zgrep -Hx $i *.vcf.gz > /home/Roy/$i.txt
done
This would give us 4 txt files (id123.txt...) etc.
However, this list is around 500 ids, so it's much easier to read the txt file that stores the ids and iterate through it.
I was trying to do something like:
list = `cat some_data.txt`
for i in "${list[@]}"
do
zgrep -Hx $i *.vcf.gz > /home/Roy/$i.txt
done
However, this only provides the last id of the file.
If each id in the file is on a distinct line, you can do
while read i; do ...; done < panel_genes_cns.txt
If that is not the case, you can simply massage the file to make it so:
tr -s '[:space:]' '\n' < panel_genes_cns.txt | while read i; do ...; done
There are a few caveats to be aware of. In each, the commands inside the loop are also reading from the same input stream that while reads from, and this may consume ids unexpectedly. In the second, the pipeline will (depending on the shell) run in a subshell, and any variables defined in the loop will be out of scope after the loop ends. But for your simple case, either of these should work without worrying too much about these issues.
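One way to sidestep the first caveat is to read the ids on a separate file descriptor, so commands inside the loop keep their normal standard input. The sketch below is an assumption-laden illustration: grep_ids is a hypothetical name, and it calls plain grep so it can be tried on uncompressed files (swap in zgrep for .vcf.gz files):

```bash
# Hypothetical helper: for every id listed in id_file (one per line),
# write the matching lines of the given files to out_dir/<id>.txt.
grep_ids() {
    local id_file=$1 out_dir=$2 id
    shift 2
    # The ids arrive on file descriptor 3, leaving stdin untouched.
    while IFS= read -r id <&3; do
        [ -n "$id" ] || continue          # skip empty lines
        # grep exits with 1 when nothing matches; that is not an error here.
        grep -Hx -- "$id" "$@" > "$out_dir/$id.txt" || [ "$?" -eq 1 ]
    done 3< "$id_file"
}
```

For example, grep_ids some_data.txt /home/Roy *.vcf would produce one file per id in /home/Roy.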
I did not check the whole code, but at first glance you are using the wrong redirection.
You have to use >> instead of >.
> overwrites and >> appends.
list=( $(cat panel_genes_cns.txt) )
for i in "${list[@]}"
do
zgrep -Hx "$i" *.vcf.gz >> "/home/Roy/$i.txt"
done

Delete duplicate commands of zsh_history keeping last occurence

I'm trying to write a shell script that deletes duplicate commands from my zsh_history file. Having no real shell script experience and given my C background I wrote this monstrosity that seems to work (only on Mac though), but takes a couple of lifetimes to end:
#!/bin/sh
history=./.zsh_history
currentLines=$(grep -c '^' $history)
wordToBeSearched=""
currentWord=""
contrastor=0
searchdex=""
echo "Currently handling a grand total of: $currentLines lines. Please stand by..."
while (( $currentLines - $contrastor > 0 ))
do
searchdex=1
wordToBeSearched=$(awk "NR==$currentLines - $contrastor" $history | cut -d ";" -f 2)
echo "$wordToBeSearched A BUSCAR"
while (( $currentLines - $contrastor - $searchdex > 0 ))
do
currentWord=$(awk "NR==$currentLines - $contrastor - $searchdex" $history | cut -d ";" -f 2)
echo $currentWord
if test "$currentWord" == "$wordToBeSearched"
then
sed -i .bak "$((currentLines - $contrastor - $searchdex)) d" $history
currentLines=$(grep -c '^' $history)
echo "Line deleted. New number of lines: $currentLines"
let "searchdex--"
fi
let "searchdex++"
done
let "contrastor++"
done
^THIS IS HORRIBLE CODE NOONE SHOULD USE^
I'm now looking for a less life-consuming approach using more shell-like conventions, mainly sed at this point. Thing is, zsh_history stores commands in a very specific way:
: 1652789298:0;man sed
Where the command itself is always preceded by ":0;".
I'd like to find a way to delete duplicate commands while keeping the last occurrence of each command intact and in order.
Currently I'm at a point where I have a functional line that will delete strange lines that find their way into the file (newlines and such):
#sed -i '/^:/!d' $history
But that's about it. Not really sure how get the expression to look for into a sed without falling back into everlasting whiles or how to delete the duplicates while keeping the last-occurring command.
The zsh option hist_ignore_all_dups should do what you want. Just add setopt hist_ignore_all_dups to your zshrc.
I wanted something similar, but I don't care about preserving the last occurrence as you mentioned. This just finds duplicates and removes them.
I used this command, then removed my .zsh_history and replaced it with the .zhistory file that the command outputs.
So from your home folder:
cat -n .zsh_history | sort -t ';' -uk2 | sort -nk1 | cut -f2- > .zhistory
This gives you a file .zhistory containing the deduplicated list. In my case it went from 9000 lines to 3000; you can check the line count with wc -l .zhistory.
Please double check and make a backup of your zsh history before doing anything with it.
The sort command might be modifiable to sort by numerical value and somehow achieve what you want, but you will have to investigate that further.
I found the script here, along with some commands to avoid saving duplicates in the future
I didn't want to rename the history file.
# dedupe_lines.zsh
if [ $# -eq 0 ]; then
echo "Error: No file specified" >&2
exit 1
fi
if [ ! -f "$1" ]; then
echo "Error: File not found" >&2
exit 1
fi
# Note: sorting reorders the file; quoting "$1" keeps paths with spaces intact.
sort "$1" | uniq >temp.txt
mv temp.txt "$1"
Add dedupe_lines.zsh to your home directory, then make it executable.
chmod +x dedupe_lines.zsh
Run it.
./dedupe_lines.zsh .zsh_history
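None of the approaches above keeps the last occurrence of each command in its original position, which is what the question asks for. A rough sketch of one way to do that with a two-pass awk script follows. It assumes every history entry sits on a single line in the ": timestamp:0;command" form shown in the question (multi-line entries are not handled), and dedupe_keep_last is a hypothetical name:

```bash
# Hypothetical helper: print the history file with duplicate commands
# removed, keeping only the last occurrence of each, in original order.
dedupe_keep_last() {
    # The file is passed to awk twice: the first pass records the line
    # number of the last occurrence of each command, the second pass
    # prints only those lines.
    awk '
        { cmd = substr($0, index($0, ";") + 1) }   # text after the first ";"
        NR == FNR { last[cmd] = FNR; next }        # first pass
        last[cmd] == FNR                           # second pass
    ' "$1" "$1"
}
```

Write the result to a new file and inspect it before replacing anything, for example dedupe_keep_last ~/.zsh_history > ~/.zsh_history.new, and keep a backup of the original.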

Bash Script IF statement not functioning

I am currently testing a simple dictionary attack using bash scripts. I have encoded my password "Snake" with sha256sum by simply typing the following command:
echo -n Snake | sha256sum
This produced the following:
aaa73ac7721342eac5212f15feb2d5f7631e28222d8b79ffa835def1b81ff620 *-
I then copy pasted the hashed string into the program, but the script is not doing what is intended to do. The script is (Note that I have created a test dictionary text file which only contains 6 lines):
echo "Enter:"
read value
cat dict.txt | while read line1
do
atax=$(echo -n "$line1" | sha256sum)
if [[ "$atax" == "$value" ]];
then
echo "Cracked: $line1"
exit 1
fi
echo "Trying: $line1"
done
Result:
Trying: Dog
Trying: Cat
Trying: Rabbit
Trying: Hamster
Trying: Goldfish
Trying: Snake
The code should display "Cracked: Snake" and terminate, when it compares the hashed string with the word "Snake". Where am I going wrong?
EDIT: The bug was indeed the DOS lines in my textfile. I made a unix file and the checksums matched. Thanks everyone.
One problem is that, as pakistanprogrammerclub points out, you're never initializing name (as opposed to line1).
Another problem is that sha256sum does not just print out the checksum, but also *- (meaning "I read the file from standard input in binary mode").
I'm not sure if there's a clean way to get just the checksum — probably there is, but I can't find it — but you can at least write something like this:
atax=$(echo -n "$name" | sha256sum | sed 's/ .*//')
(using sed to strip off everything from the space onwards).
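Putting the two fixes from this thread together (keep only the first field of the sha256sum output, and strip DOS carriage returns from the dictionary lines), a sketch could look like the following. The function names hash_of and crack are illustrative assumptions, and sha256sum from GNU coreutils is assumed to be installed:

```bash
# Print only the hex digest of a string, without the trailing file marker.
hash_of() {
    printf '%s' "$1" | sha256sum | awk '{ print $1 }'
}

# Hypothetical helper: try each word of a dictionary file against a digest.
crack() {
    local target=$1 dict=$2 word
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r word || [ -n "$word" ]; do
        word=${word%$'\r'}               # drop a trailing CR from DOS files
        if [ "$(hash_of "$word")" = "$target" ]; then
            printf 'Cracked: %s\n' "$word"
            return 0
        fi
    done < "$dict"
    return 1                             # no word matched
}
```

Because the loop reads from a redirection rather than a pipeline, return leaves the function directly instead of only ending a subshell.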
A couple of issues: the variable name is not set anywhere; did you mean value? It is also better form to use redirection instead of cat:
while read ...; do ...; done <dict.txt
Variables set by a while loop in a pipeline are not available in the parent shell (not the other way around, as I mistakenly said before), but that is not an issue here.
It could be a cut-and-paste error. Add an echo after the first read:
echo "value \"$value\""
and another after atax is set:
echo "line1 \"$line1\" atax \"$atax\""

Shell script: Get name of last file in a folder by alphabetical order

I have a folder with backups from a MySQL database that are created automatically. Their name consists of the date the backup was made, like so:
2010-06-12_19-45-05.mysql.gz
2010-06-14_19-45-05.mysql.gz
2010-06-18_19-45-05.mysql.gz
2010-07-01_19-45-05.mysql.gz
What is a way to get the filename of the last file in the list, i.e. of the one which in alphabetical order comes last?
In a shell script, I would like to do something like
LAST_BACKUP_FILE= ???
gunzip $LAST_BACKUP_FILE;
ls -1 | tail -n 1
If you want to assign this to a variable, use $(...) or backticks.
FILE=`ls -1 | tail -n 1`
FILE=$(ls -1 | tail -n 1)
@Sjoerd's answer is correct, I'll just pick a few nits from it:
you don't need the -1 option to enforce one path per line if you pipe the output somewhere:
ls | tail -n 1
you can use -r to get the listing in reverse order, and take the first one:
ls -r | head -n 1
gunzip some.log.gz will write uncompressed data into some.log and remove some.log.gz, which may or may not be what you want (probably isn't). if you want to keep the compressed source, pipe it into gunzip:
gunzip < some.file.gz
you might want to protect the script against situation when the dir contains no files, since
gunzip $empty_variable
expands to just
gunzip
and such invocation will wait indefinitely for data on standard input:
latest="$(ls -r /some/where/*.gz | head -n 1)"
if test -z "$latest"; then
# there are no logs yet, bail out
exit
fi
gunzip < "$latest"
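Parsing the output of ls can yield unexpected results if the filenames contain unusual characters. The following works for any filename, provided the directory contains at least one entry (with no match, bash leaves the unexpanded pattern * in the variable):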
for LAST_BACKUP_FILE in *; do : ; done
for LAST_BACKUP_FILE in * loops through every filename (and folder name, if there are any) in order in the current directory, storing each in $LAST_BACKUP_FILE
do : does nothing
done finishes after the last file
Now, the last file is stored in $LAST_BACKUP_FILE.
If you happen to want the first file, use this:
for FIRST_BACKUP_FILE in *; do break; done
The break statement jumps out of the loop after the first file is stored in $FIRST_BACKUP_FILE.
(from comment below) If you want hidden files included in the search, then use the command shopt -s dotglob before running the loops.
The shell is more powerful than many think. Just let it work for you. Assuming filenames without spaces,
set -- $(ls -r *.gz)
LAST_BACKUP_FILE=$1
does the trick with a single fork, no pipes, and you can even avoid the fork if your shell supports arithmetic expansion as in
set -- *.gz
shift $(($# - 1))
LAST_BACKUP_FILE=$1
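The glob-based answers can be combined with the empty-directory check into one small function. This is only a sketch: last_backup is a hypothetical name, and it relies on the shell expanding globs in sorted order, which matches the date-based names in the question:

```bash
# Hypothetical helper: print the last *.gz file in a directory (in glob
# order), or return 1 if there is none.
last_backup() {
    local dir=$1 f last=
    for f in "$dir"/*.gz; do
        # With no match the pattern stays literal, so test for existence.
        [ -e "$f" ] && last=$f
    done
    [ -n "$last" ] || return 1
    printf '%s\n' "$last"
}
```

A caller can then guard the gunzip step, for example: if latest=$(last_backup /some/where); then gunzip < "$latest"; fi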

Handle special characters in bash for...in loop

Suppose I've got a list of files
file1
"file 1"
file2
a for...in loop breaks it up between whitespace, not newlines:
for x in $( ls ); do
echo $x
done
results:
file
1
file1
file2
I want to execute a command on each file. "file" and "1" above are not actual files. How can I do that if the filenames contains things like spaces or commas?
It's a little trickier than I think find -print0 | xargs -0 could handle, because I actually want the command to be something like "convert input/file1.jpg .... output/file1.jpg" so I need to permutate the filename in the process.
Actually, Mark's suggestion works fine without even doing anything to the internal field separator. The problem is running ls in a subshell, whether by backticks or $( ) causes the for loop to be unable to distinguish between spaces in names. Simply using
for f in *
instead of the ls solves the problem.
#!/bin/bash
for f in *
do
echo "$f"
done
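For the original goal of running a command on each file and writing the result under the same name in another directory, the glob loop extends naturally. The sketch below is illustrative only: process_dir is a hypothetical helper, the .jpg pattern is taken from the question's example, and the command to run is passed in by the caller and invoked with the input and output paths as its last two arguments:

```bash
# Hypothetical helper: run a command once per .jpg file in src, passing
# the input path and the matching output path in dst as arguments.
process_dir() {
    local src=$1 dst=$2 f
    shift 2
    mkdir -p "$dst"
    for f in "$src"/*.jpg; do
        [ -e "$f" ] || continue            # skip the literal pattern if no match
        # ${f##*/} strips the directory part and keeps only the file name.
        "$@" "$f" "$dst/${f##*/}"
    done
}
```

With ImageMagick installed, one could define a small wrapper such as shrink() { convert "$1" -resize 50% "$2"; } and call process_dir input output shrink. Names containing spaces or commas are passed through intact because every expansion is quoted.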
UPDATE BY OP: this answer sucks and shouldn't be on top ... @Jordan's post below should be the accepted answer.
one possible way:
ls -1 | while IFS= read -r x; do
echo "$x"
done
I know this one is LONG past "answered", and with all due respect to eduffy, I came up with a better way and I thought I'd share it.
What's "wrong" with eduffy's answer isn't that it's wrong, but that it imposes what for me is a painful limitation: there's an implied creation of a subshell when the output of the ls is piped and this means that variables set inside the loop are lost after the loop exits. Thus, if you want to write some more sophisticated code, you have a pain in the buttocks to deal with.
My solution was to take the "readline" function and write a program out of it in which you can specify any specific line number that you may want that results from any given function call. ... As a simple example, starting with eduffy's:
ls_output=$(ls -1)
# The cut at the end of the following line removes any trailing new line character
declare -i line_count=$(echo "$ls_output" | wc -l | cut -d ' ' -f 1)
declare -i cur_line=1
while [ $cur_line -le $line_count ] ;
do
# NONE of the values in the variables inside this do loop are trapped here.
filename=$(echo "$ls_output" | readline -n $cur_line)
# Now filename contains one filename from the preceding ls command
cur_line=cur_line+1
done
Now you have wrapped up all the subshell activity into neat little contained packages and can go about your shell coding without having to worry about the scope of your variable values getting trapped in subshells.
I wrote my version of readline in gnuc if anyone wants a copy, it's a little big to post here, but maybe we can find a way...
Hope this helps,
RT
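The subshell limitation described in this answer can be seen in a small experiment. The sketch below assumes bash with its default options (the lastpipe option is off); the function names are made up for the demonstration:

```bash
# The loop is the last stage of a pipeline, so it runs in a subshell
# and the increments to n are lost when the pipeline ends.
count_with_pipe() {
    local n=0 line
    printf 'a\nb\nc\n' | while IFS= read -r line; do n=$((n + 1)); done
    echo "$n"
}

# The loop reads from a process substitution instead, so it runs in the
# current shell and n keeps its value.
count_with_redirect() {
    local n=0 line
    while IFS= read -r line; do n=$((n + 1)); done < <(printf 'a\nb\nc\n')
    echo "$n"
}

count_with_pipe       # prints 0
count_with_redirect   # prints 3
```

When the input is a list of filenames, a plain for f in * loop avoids both the pipeline and the need to parse ls output.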
