How to write a tail script without the tail command - bash

How would you achieve this in bash? It's a question I was asked in an interview, and I could think of answers in high-level languages but not in shell.
As I understand it, the real implementation of tail seeks to the end of the file and then reads backwards.

The main idea is to keep a fixed-size buffer and to remember the last lines. Here's a quick way to do a tail using the shell:
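#!/bin/bash
# Keep the last SIZE lines of the file "text" in a circular buffer.
SIZE=5
idx=0
count=0
while IFS= read -r line
do
    arr[idx]=$line
    idx=$(( (idx + 1) % SIZE ))
    count=$(( count + 1 ))
done < text
# With fewer than SIZE lines the buffer never wrapped, so start at slot 0.
(( count < SIZE )) && idx=0
for (( i = 0; i < SIZE && i < count; i++ ))
do
    printf '%s\n' "${arr[idx]}"
    idx=$(( (idx + 1) % SIZE ))
done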

If all not-tail commands are allowed, why not be whimsical?
#!/bin/sh
[ -r "$1" ] && exec < "$1"
tac | head | tac

Use wc -l to count the number of lines in the file. Subtract the number of lines you want from this, and add 1, to get the starting line number. Then use this with sed or awk to start printing the file from that line number, e.g.
sed -n "$start,\$p"
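The arithmetic described above can be sketched roughly as follows. The function name tail_sed and its argument order are made up for illustration, and the sketch assumes a regular file that ends with a newline (the file is read twice, so a pipe will not work).

```shell
#!/bin/bash
# tail_sed FILE [N]: print the last N lines of FILE (N defaults to 10).
# Hypothetical helper; FILE must be a regular file because it is read twice.
tail_sed() {
    local file=$1 n=${2:-10} total start
    total=$(wc -l < "$file")       # redirecting keeps the file name out of the output
    start=$(( total - n + 1 ))     # first line to print
    (( start < 1 )) && start=1     # fewer than N lines: print everything
    sed -n "${start},\$p" "$file"
}
```

Called as tail_sed somefile 3, this should print the last three lines of somefile.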

There's this:
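#!/bin/bash
# Usage: script N < file   (prints the last N lines; N defaults to 10)
readarray file                  # one array element per line, newlines kept
total=${#file[@]}
n=${1:-10}
(( n > total )) && n=$total
for (( line = total - n; line < total; line++ )); do
    printf '%s' "${file[line]}"
done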
Based on this answer: https://stackoverflow.com/a/8020488/851273
You pass in the number of lines you want to see from the end of the file and send the file via stdin. The script reads the entire file into an array and prints only the last N lines of the array.

The only way I can think of in “pure” shell is to do a while read linewise on the whole file into an array variable with indexing modulo n, where n is the number of tail lines (default 10) — i.e. a circular buffer, then iterate over the circular buffer from where you left off when the while read ends. It's not efficient or elegant, in any sense, but it'll work and avoids reading the whole file into memory. For example:
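#!/bin/bash
# incmod I N: print (I + 1) mod N
incmod() {
    local next=$(( $1 + 1 ))
    if [ "$next" -ge "$2" ]; then
        echo 0
    else
        echo "$next"
    fi
}
n=10
i=0
buffer=()
while IFS= read -r line; do
    buffer[i]=$line
    i=$(incmod "$i" "$n")
done < "$1"
# i now points at the oldest slot; walk the ring once from there.
j=$i
while :; do
    # Skip slots that were never filled (file shorter than n lines).
    [ -n "${buffer[i]+set}" ] && printf '%s\n' "${buffer[i]}"
    i=$(incmod "$i" "$n")
    [ "$i" -eq "$j" ] && break
done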

This script roughly imitates tail:
#!/bin/bash
shopt -s extglob
LENGTH=10
while [[ $# -gt 0 ]]; do
case "$1" in
--)
FILES+=("${@:2}")
break
;;
-+([0-9]))
LENGTH=${1#-}
;;
-n)
if [[ $2 != +([0-9]) ]]; then
echo "Invalid argument to '-n': $2"
exit 1
fi
LENGTH=$2
shift
;;
-*)
echo "Unknown option: $1"
exit 1
;;
*)
FILES+=("$1")
;;
esac
shift
done
PRINTHEADER=false
case "${#FILES[@]}" in
0)
FILES=("/dev/stdin")
;;
1)
;;
*)
PRINTHEADER=true
;;
esac
IFS=
for I in "${!FILES[@]}"; do
F=${FILES[I]}
if [[ $PRINTHEADER == true ]]; then
[[ I -gt 0 ]] && echo
echo "==> $F <=="
fi
if [[ LENGTH -gt 0 ]]; then
LINES=()
COUNT=0
while read -r LINE; do
LINES[COUNT++ % LENGTH]=$LINE
done < "$F"
for (( I = COUNT >= LENGTH ? LENGTH : COUNT; I; --I )); do
echo "${LINES[--COUNT % LENGTH]}"
done
fi
done
Example run:
> bash script.sh -n 12 <(yes | sed 20q) <(yes | sed 5q)
==> /dev/fd/63 <==
y
y
y
y
y
y
y
y
y
y
y
y
==> /dev/fd/62 <==
y
y
y
y
y
> bash script.sh -4 <(yes | sed 200q)
y
y
y
y

Here's the answer I would give if I were actually asked this question in an interview:
What environment is this where I have bash but not tail? Early boot scripts, maybe? Can we get busybox in there so we can use the full complement of shell utilities? Or maybe we should see if we can squeeze a stripped-down Perl interpreter in, even without most of the modules that would make life a whole lot easier. You know dash is much smaller than bash and perfectly good for scripting use, right? That might also help. If none of that is an option, we should check how much space a statically linked C mini-tail would need, I bet I can fit it in the same number of disk blocks as the shell script you want.
If that doesn't convince the interviewer that it's a silly question, then I go on to observe that I don't believe in using bash extensions, because the only good reason to write anything complicated in shell script nowadays is if total portability is an overriding concern. By avoiding anything that isn't portable even in one-offs, I don't develop bad habits, and I don't get tempted to do something in shell when it would be better done in a real programming language.
Now the thing is, in truly portable shell, arrays may not be available. (I don't actually know whether the POSIX shell spec has arrays, but there certainly are legacy-Unix shells that don't have them.) So, if you have to emulate tail using only shell builtins and it's got to work everywhere, this is the best you can do, and yes, it's hideous, because you're writing in the wrong language:
#! /bin/sh
a=""
b=""
c=""
d=""
e=""
f=""
while read x; do
a="$b"
b="$c"
c="$d"
d="$e"
e="$f"
f="$x"
done
printf '%s\n' "$a"
printf '%s\n' "$b"
printf '%s\n' "$c"
printf '%s\n' "$d"
printf '%s\n' "$e"
printf '%s\n' "$f"
Adjust the number of variables to match the number of lines you want to print.
The battle-scarred will note that printf is not 100% available either. Unfortunately, if all you have is echo, you are up a creek: some versions of echo cannot print the literal string "-n", and others cannot print the literal string "\n", and even figuring out which one you have is a bit of a pain, particularly as, if you don't have printf (which is in POSIX), you probably don't have user-defined functions either.
(N.B. The code in this answer, sans rationale, was originally posted by user 'Nirk' but then deleted under downvote pressure from people whom I shall charitably assume were not aware that some shells do not have arrays.)
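As a middle ground between arrays and hard-coded variables, the positional parameters can serve as a queue. The following is only a sketch, assuming a POSIX shell with function support (which rules out some of the legacy shells mentioned above); the name tail_posix is made up for illustration.

```shell
#!/bin/sh
# tail_posix [N]: print the last N lines of standard input (N defaults to 10).
# Hypothetical helper; the positional parameters act as the queue, so no arrays.
# POSIX sh has no "local", so n and line are global variables.
tail_posix() {
    n=${1:-10}
    set --                           # empty the positional parameters
    while IFS= read -r line; do
        set -- "$@" "$line"          # enqueue the newest line
        [ "$#" -gt "$n" ] && shift   # drop the oldest once the queue is too long
    done
    for line in "$@"; do
        printf '%s\n' "$line"
    done
}
```

A final line without a trailing newline is dropped by this sketch, because read reports failure for it and the loop ends.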


Why will it not echo "$d" answer

I need the script to output the result, but echo "$d" does not output anything. I made the ciphertext.gz earlier in the script and the $fil is ciphertext.gz. Bash script:
echo "Fil: ciphertext.gz"
a="ABCDEFGHIJKLMNOPQRSTUVXYZ"
[[ "${*/-d/}" != "" ]] &&
echo "Usage: $0 [-d]" && exit 1
m=${1:+-}
m=-
t=$fil
printf "Nøgle 'eks. ABCDE': "
read -r k
k=$(echo "$k" | tr [a-vx-z] [A-VX-Z] )
printf "\n"
for ((i=0;i<${#t};i++)); do
p1=${a%%${t:$i:1}*}
p2=${a%%${k:$((i%${#k})):1}*}
d="${d}${a:$(((${#p1}${m:-+}${#p2})%${#a})):1}"
done
echo "$d"
Extended comments, (not really an answer, since it's unclear what the code should do):
This is wrong:
m=${1:+-}
m=-
...since it has the same effect as:
m=-
This reads one line from standard input:
read -r k
...which, unless ciphertext is only one line long, probably defeats the purpose of the next eight lines of code. Even if standard input was unzip < ciphertext.gz |, it would only decode the first line of ciphertext.
Wrap the for in an appropriate while read k loop.

printing line numbers that are multiple of 5

Hi, I am trying to print/echo the lines whose line numbers are multiples of 5, in a shell script. I am getting errors and am unable to proceed. Below is the script:
#!/bin/bash
x=0
y=$wc -l $1
while [ $x -le $y ]
do
sed -n `$x`p $1
x=$(( $x + 5 ))
done
When executing above script i get below errors
#./echo5.sh sample.h
./echo5.sh: line 3: -l: command not found
./echo5.sh: line 4: [: 0: unary operator expected
Please help me with this issue.
For efficiency, you don't want to be invoking sed multiple times on your file just to select a particular line. You want to read through the file once, filtering out the lines you don't want.
#!/bin/bash
i=0
while IFS= read -r line; do
(( ++i % 5 == 0 )) && echo "$line"
done < "$1"
Demo:
$ i=0; while read line; do (( ++i % 5 == 0 )) && echo "$line"; done < <(seq 42)
5
10
15
20
25
30
35
40
A funny pure Bash possibility:
#!/bin/bash
mapfile ary < "$1"
printf "%.0s%.0s%.0s%.0s%s" "${ary[@]}"
This slurps the file into an array ary, with each line of the file in a field of the array. Then printf takes care of printing one line out of every 5: %.0s consumes a field but prints nothing, and %s prints the field. Since mapfile is used without the -t option, the newlines are included in the array. Of course this really slurps the whole file into memory, so it might not be good for huge files. For large files you can use a callback with mapfile:
#!/bin/bash
callback() {
printf '%s' "$2"
ary=()
}
mapfile -c 5 -C callback ary < "$1"
We're removing all the elements of the array during the callback, so that the array doesn't grow too large, and the printing is done on the fly, as the file is read.
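To see the printf trick in isolation, here is a small sketch that keeps every third line instead of every fifth. The function name every_third is made up for illustration; the behaviour it relies on is that printf reuses its format string until all arguments are consumed.

```shell
#!/bin/bash
# every_third: read lines from standard input and print lines 3, 6, 9, ...
# Hypothetical helper; needs bash 4 or later for mapfile.
every_third() {
    local lines
    mapfile lines                        # no -t: each element keeps its newline
    # '%.0s' consumes one argument and prints zero characters of it;
    # '%s' prints the third argument of each group, newline included.
    printf '%.0s%.0s%s' "${lines[@]}"
}
```

A trailing incomplete group prints nothing, because the missing arguments are treated as empty strings.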
Another funny possibility, in the spirit of glenn jackmann's solution, yet without a counter (and still pure Bash):
#!/bin/bash
while read && read && read && read && IFS= read -r line; do
printf '%s\n' "$line"
done < "$1"
Use sed.
sed -n '0~5p' "$1"
This prints every fifth line of the file (lines 5, 10, 15, ...). The first~step address form is a GNU sed extension.
Also,
y=$wc -l $1
won't work; use
y=$(wc -l < "$1")
You need command substitution, $( ... ), because bash treats the space as the end of the assignment. Also, if you just want the number, it's best to redirect the file into wc so that the file name isn't printed.
I don't know what you were trying to do with this:
x=$(( $x + 5 ))
It is valid as written, but if you were aiming for let-style arithmetic, it would look more like
(( x = x + 5 ))
Hope this helps.
There are cleaner ways to do it, but what you're looking for is this.
#!/bin/bash
x=5
y=`wc -l $1`
y=`echo $y | cut -f1 -d\ `
while [ "$y" -ge "$x" ]
do
sed -n "${x}p" "$1"
x=$(( $x + 5 ))
done
Initialize x to 5, since there is no "line zero" in your file $1.
Also, wc -l $1 will display the number of line counts, followed by the name of the file. Use cut to strip the file name out and keep just the first word.
Note that in Bash conditionals it is a command's exit status of zero that counts as "true".
In your sed command, the backticks around $x run it as a command instead of expanding it. Use curly braces to put the variable right next to the p: "${x}p".
You can do this quite succinctly using awk:
awk 'NR % 5 == 0' "$1"
NR is the record number (line number in this case). Whenever it is a multiple of 5, the expression is true, so the line is printed.
You might also like the even shorter but slightly less readable:
awk '!(NR%5)' "$1"
which does the same thing.

To Continuously loop using for in shell scripting

for m in $count
do
`cat $op ${arr[$m]} > $op1`
`rm -f $op`
`touch $op`
`cat $op1 ${arr[$m+1]} > $op`
if [ $m ge $count ]; then
`rm -f $op1`
`touch $op1`
fi
m=$((m+1))
done
I wanted to continuously loop from the start count 2 till the end count 10 . The $count=10 here. But the above piece of code executes the for loop only once.
Rainy Sunday, having much free time, long answer ;)
There are many issues with your script; here are some recommended solutions. Because you used the construction m=$((m+1)), I will assume bash as the shell. (Consider adding the bash tag.)
For the loop, there are several possibilities:
count=10
m=2 #start with 2
while (( $m <= $count )) #while m is less or equal to 10
do #do
echo $m #this action
let m++ #increment m (add one to m)
done #end of while
or, if the count is a constant (not a variable), you can write
for m in {2..10} #REMEMBER: this does not work with variables, like {2..$count}
do
echo "$m"
done
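The reason for that restriction is that bash performs brace expansion before variable expansion. A minimal sketch (the variable names literal and expanded are only for illustration):

```shell
#!/bin/bash
count=4
# Brace expansion runs before variable expansion, so bash sees no valid
# numeric range here and leaves the braces in place.
literal=$(echo {2..$count})
echo "$literal"
# An arithmetic for loop accepts variables directly.
expanded=$(for ((m = 2; m <= count; m++)); do printf '%s ' "$m"; done)
echo "$expanded"
```

The first echo shows the unexpanded text {2..4}, while the second shows the numbers 2 3 4.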
another variant - using the seq (man seq) command for counting
for m in $(seq 2 ${count:=10}) # ${count:=10} - defaults the $count to 10 if it is undefined
do
echo $m
done
or C-like for loop
let count=10
for ((m=2; m<=count; m++))
do
echo $m
done
All 4 loops produce:
2
3
4
5
6
7
8
9
10
So now we have a working loop. Next, add your specific actions.
The:
rm -f $op
touch $op
can be replaced by one command
echo -n > $op #echo nothing and write the "nothing" into the file
It is faster, because echo is a bash builtin (it doesn't start two external commands).
So your actions could look like
cat $op ${arr[$m]} > $op1
echo -n > $op
cat $op1 ${arr[$m+1]} > $op
In this case the echo is useless, because the second cat will write its output to $op anyway (and truncates the file to zero size before writing), so the following is identical to the above:
cat $op ${arr[$m]} > $op1
cat $op1 ${arr[$m+1]} > $op
Those two cat commands can be shortened to one, using bash's >> append-to-file redirection
cat ${arr[$m]} ${arr[m+1]} >> $op
The whole script could look like this:
#making a testing environment
for f in $(seq 12) #create 12 files opdata-N
do
arr[$f]="opdata-$f" #store the filenames in the array "arr"
echo "data-$f" > ${arr[$f]} #each file contains one line "data-N"
done
#echo ${arr[@]}
#setting the $op and $op1 filenames
#consider choosing more descriptive variable names
op="file_op"
#op1="file_op1" #not needed
#add some initial (old) value to $op
echo "initial value" > $op
#end of creating the testing environment
#the script
count=10
for m in $(seq 2 $count)
do
cat ${arr[$m]} ${arr[m+1]} >> $op
done
at the end, file $op will contain:
initial value
data-2
data-3
data-3
data-4
data-4
data-5
data-5
data-6
data-6
data-7
data-7
data-8
data-8
data-9
data-9
data-10
data-10
data-11
BTW, are you sure about the result? If you only want to add the files 2 to 10 to the end of $op (without duplicated entries), you can simply write:
cat opdata-{2..10} >> $op #the '>>' appends to the end of the file
or, using your array:
startpos=2
count=10
cat ${arr[@]:$startpos:$((count - startpos + 1))} >> $op #the slice length is a number of elements, not an end index
Ufff.. ;)
Ps: usually it is good practice to enclose variables in double quotes like "$filename" - in the above examples for better readability I omitted them.
Any loop needs a "condition to keep looping". When you use a
for m in count
type of loop, the condition is "if there are more elements in the collection count, pick the next one and keep going". This doesn't seem to be what you want. You are looking for the bash equivalent of
for(m = 0; m < 10; m++)
I think. The best way to do this is - with exactly that kind of loop (but note - an extra pair of parentheses, and a semicolon):
#!/bin/bash
# Display message 5 times
for ((i = 0 ; i < 5 ; i++)); do
echo "Welcome $i times."
done
see nix craft for original
I think you can extend this to your situation… if I understood your question correctly you need something like this:
for ((m = 2; m <= 10; m++))
do
cat $op ${arr[$m]} > $op1
rm -f $op
touch $op
cat $op1 ${arr[$m+1]} > $op
if [ $m -ge $count ]; then
rm -f $op1
touch $op1
fi
done
Use a while loop instead.
The for loop is when you have multiple objects to iterate against. You have only one, i.e. $count.

parse and expand interval

In my script I need to expand an interval, e.g.:
input: 1,5-7
to get something like the following:
output: 1,5,6,7
I've found other solutions here, but they involve python and I can't use it in my script.
Solution with Just Bash 4 Builtins
You can use Bash range expansions. For example, assuming you've already parsed your input you can perform a series of successive operations to transform your range into a comma-separated series. For example:
value1=1
value2='5-7'
value2=${value2/-/..}
value2=`eval echo {$value2}`
echo "output: $value1,${value2// /,}"
All the usual caveats about the dangers of eval apply, and you'd definitely be better off solving this problem in Perl, Ruby, Python, or AWK. If you can't or won't, then you should at least consider including some pipeline tools like tr or sed in your conversions to avoid the need for eval.
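If eval is a concern, the expansion can also be done with an arithmetic loop. The following is a sketch rather than a drop-in replacement: the function name expand_ranges is made up, and it assumes non-negative decimal integers and ascending ranges.

```shell
#!/bin/bash
# expand_ranges "1,5-7": print the expanded list, e.g. 1,5,6,7.
# Hypothetical helper; rejects anything that is not N or N-M.
expand_ranges() {
    local part i parts out=()
    IFS=, read -ra parts <<< "$1"            # split on commas without globbing
    for part in "${parts[@]}"; do
        if [[ $part =~ ^([0-9]+)-([0-9]+)$ ]]; then
            # 10# forces base 10 so that leading zeros are not read as octal
            for (( i = 10#${BASH_REMATCH[1]}; i <= 10#${BASH_REMATCH[2]}; i++ )); do
                out+=("$i")
            done
        elif [[ $part =~ ^[0-9]+$ ]]; then
            out+=("$part")
        else
            echo "expand_ranges: bad item: $part" >&2
            return 1
        fi
    done
    local IFS=,
    echo "${out[*]}"                         # "${out[*]}" joins with the first character of IFS
}
```

For the input from the question, expand_ranges 1,5-7 should print 1,5,6,7.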
Try something like this:
#!/bin/bash
for f in ${1//,/ }; do
if [[ $f =~ - ]]; then
a+=( $(seq ${f%-*} 1 ${f#*-}) )
else
a+=( $f )
fi
done
a=${a[*]}
a=${a// /,}
echo $a
Edit: As @Maxim_united mentioned in the comments, appending might be preferable to re-creating the array over and over again.
This should work with multiple ranges too.
#! /bin/bash
input="1,5-7,13-18,22"
result_str=""
for num in $(tr ',' ' ' <<< "$input"); do
if [[ "$num" == *-* ]]; then
res=$(seq -s ',' $(sed -n 's#\([0-9]\+\)-\([0-9]\+\).*#\1 \2#p' <<< "$num"))
else
res="$num"
fi
result_str="$result_str,$res"
done
echo ${result_str:1}
Will produce the following output:
1,5,6,7,13,14,15,16,17,18,22
expand_commas()
{
local arg
local st en i
set -- ${1//,/ }
for arg
do
case $arg in
[0-9]*-[0-9]*)
st=${arg%-*}
en=${arg#*-}
for ((i = st; i <= en; i++))
do
echo $i
done
;;
*)
echo $arg
;;
esac
done
}
Usage:
result=$(expand_commas arg)
eg:
result=$(expand_commas 1,5-7,9-12,3)
echo $result
You'll have to turn the separated words back into commas, of course.
It's a bit fragile with bad inputs but it's entirely in bash.
Here's my stab at it:
input=1,5-7,10,17-20
IFS=, read -a chunks <<< "$input"
output=()
for chunk in "${chunks[@]}"
do
IFS=- read -a args <<< "$chunk"
if (( ${#args[@]} == 1 )) # single number
then
output+=(${args[*]})
else # range
output+=($(seq "${args[@]}"))
fi
done
joined=$(sed -e 's/ /,/g' <<< "${output[*]}")
echo $joined
Basically split on commas, then interpret each piece. Then join back together with commas at the end.
A generic bash solution using the sequence expression `{x..y}'
#!/bin/bash
function doIt() {
local inp="${@//,/ }"
declare -a args=( $(echo ${inp//-/..}) )
local item
local sep
for item in "${args[@]}"
do
case ${item} in
*..*) eval "for i in {${item}} ; do echo -n \${sep}\${i}; sep=, ; done";;
*) echo -n ${sep}${item};;
esac
sep=,
done
}
doIt "1,5-7"
Should work with any input following the sample in the question. Also with multiple occurrences of x-y
Use only bash builtins
Using ideas from both @Ansgar Wiechers and @CodeGnome:
input="1,5-7,13-18,22"
for s in ${input//,/ }
do
if [[ $s =~ - ]]
then
a+=( $(eval echo {${s//-/..}}) )
else
a+=( $s )
fi
done
oldIFS=$IFS; IFS=$','; echo "${a[*]}"; IFS=$oldIFS
Works in Bash 3
Considering all the other answers, I came up with this solution, which does not use any sub-shells (but one call to eval for brace expansion) or separate processes:
# range list is assumed to be in $1 (e.g. 1-3,5,9-13)
# convert $1 to an array of ranges ("1-3" "5" "9-13")
IFS=,
local range=($1)
unset IFS
list=() # initialize result list
local r
for r in "${range[@]}"; do
if [[ $r == *-* ]]; then
# if the range is of the form "x-y",
# * convert to a brace expression "{x..y}",
# * using eval, this gets expanded to "x" "x+1" … "y" and
# * append this to the list array
eval list+=( {${r/-/..}} )
else
# otherwise, it is a simple number and can be appended to the array
list+=($r)
fi
done
# test output
echo ${list[@]}

Counting newline characters in bash shell script

I cannot get this script to work at all. I am just trying to count the number of lines in a file WITHOUT using wc. here is what I have so far
FILE=file.txt
lines=0
while IFS= read -n1 char
do
if [ "$char" == "\n" ]
then
lines=$((lines+1))
fi
done < $FILE
this is just a small part of a bigger script that should count total words, characters and lines in a file. I cannot figure any of it out though. Please help
The problem is that the if-statement condition is never true. It's as if the program cannot detect what a '\n' is.
declare -i lines=0 words=0 chars=0
while IFS= read -r line; do
((lines++))
array=($line) # don't quote the var to enable word splitting
((words += ${#array[@]}))
((chars += ${#line} + 1)) # add 1 for the newline
done < "$filename"
echo "$lines $words $chars $filename"
You have two problems there. They are fixed in the following:
#!/bin/bash
file=file.txt
lines=0
while IFS= read -rN1 char; do
if [[ "$char" == $'\n' ]]; then
((++lines))
fi
done < "$file"
One problem was the $'\n' in the test, the other one, more subtle, was that you need to use the -N switch, not the -n one in read (help read for more information). Oh, and you also want to use the -r option (check with and without, when you have backslashes in your file).
Minor things I changed: Use more robust [[...]], used lower case variable names (it's considered bad practice to have upper case variable names). Used arithmetic ((++lines)) instead of the silly lines=$((lines+1)).
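Both problems can be seen in isolation with a few lines. This is only an illustrative sketch; the variable names are made up, and read -N needs bash 4.1 or later.

```shell
#!/bin/bash
# "\n" in double quotes is two characters: a backslash and the letter n.
plain="\n"
# $'\n' is ANSI-C quoting and yields a single newline character.
ansi=$'\n'
echo "${#plain} ${#ansi}"
[[ $plain == "$ansi" ]] || echo "not equal"

# read -n1 still treats the newline as the line delimiter and returns an
# empty string for it; read -N1 returns the newline character itself.
IFS= read -r -n1 with_n <<< $'\nrest'
IFS= read -r -N1 with_N <<< $'\nrest'
echo "${#with_n} ${#with_N}"
```

The lengths printed by the two echo commands show why the original comparison never matched.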
