Say I have a file
3 boy
2 hello
3 bus
and I want to select the ith letter from each line, where i is the number in front of the line (resulting in y, e, s). Is there an easy way to do this with sed/cut? I tried matching for example the substring with the first that many letters with
cat test.txt | sed -e 's/\([0-9]\) \(.*\)\{\1\}.*/\2/'
to then cut it afterwards, but this yields the error "Invalid content of \{\}". What is the proper way to do this (preferably with just sed/cut/..., so without for-loops etc.)?
I am looking for a way that can be done as pipelining, i.e. starting the line with cat test.txt | ....
a single awk script can be written as
awk '{print substr($2,$1,1)}' inputFile
gives output as
y
e
s
The substr(str, pos, len) function returns the substring of str starting at position pos (counting from 1) with length len.
You can use a loop:
while read -r number name
do
echo "${name:$number - 1:1}"
done < file
This takes advantage of the ${string:position:length} syntax, which extracts a substring of $length characters from $string starting at $position. As the first character is at position 0, we have to subtract 1 to get the needed one.
For your given input it returns:
$ while read -r number name; do echo "${name:$number - 1:1}"; done < a
y
e
s
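Since the question mentions cut, a variant of the same loop that hands the character selection to cut might look like the sketch below. The function name pick_with_cut is made up for illustration. It reads the "number word" lines from standard input, so it can sit at the end of a pipeline, and it relies on cut -c counting positions from 1, so no subtraction is needed.

```shell
# pick_with_cut: hypothetical helper name, not part of the original answers.
# Reads lines of the form "N word" from stdin and prints the Nth character.
pick_with_cut() {
    while read -r number name; do
        # cut -c N selects position N, counting from 1
        printf '%s\n' "$name" | cut -c "$number"
    done
}

printf '3 boy\n2 hello\n3 bus\n' | pick_with_cut
```

With the file from the question this would be called as cat test.txt | pick_with_cut. It starts one cut process per line, so the awk version is likely to be faster on large inputs.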
Suppose I have a file containing numbers like:
1 4 7
2 5 8
and I want to add 1 to all these numbers, making the output like:
2 5 8
3 6 9
Is there a simple one-line command (e.g. awk) to do this?
Try the following:
awk '{for(i=1;i<=NF;i++){$i=$i+1}} 1' Input_file
EDIT: As per the OP's request for a solution without a loop, here is one (written for the shown sample only).
With hardcoding of number of fields.
awk -v RS='[ \n]' '{ORS=NR%3==0?"\n":" ";print $0+1}' Input_file
OR
Without hardcoding number of fields.
awk -v RS='[ \n]' -v col=$(awk 'FNR==1{print NF}' Input_file) '{ORS=NR%col==0?"\n":" ";print $0+1}' Input_file
Explanation: The first solution in the EDIT hardcodes the number of fields (3). The second one instead sets a variable named col to the number of fields on the first line of Input_file, obtained with a separate awk call. Both set the record separator to a space or a newline, so every number becomes its own record and can be incremented without a loop. ORS is a space after each number, except that a newline is printed whenever the record number NR is evenly divisible by the number of fields (3 or col), which restores the original line layout. Note that a regular expression as RS is not required by POSIX awk, although gawk and mawk support it.
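Both record-separator variants assume that every line has the same number of fields. When that is not guaranteed, the loop version from the first command in this answer is the safer choice, because NF is evaluated per line. A minimal sketch, with add_one as a made-up wrapper name:

```shell
# add_one: hypothetical wrapper name; reads whitespace-separated integers on stdin
add_one() {
    awk '{
        # NF is re-evaluated for every line, so rows may differ in length
        for (i = 1; i <= NF; i++) $i = $i + 1
        print
    }'
}

# ragged input: three fields, then two, then one
printf '1 4 7\n2 5\n8\n' | add_one
```

Assigning to a field makes awk rebuild the line with single spaces between fields, so the original spacing is not preserved.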
In native bash (no awk or other external tool needed):
#!/usr/bin/env bash
while read -r -a nums; do          # read a line into an array, splitting on spaces
  out=( )                          # initialize an empty output array for that line
  for num in "${nums[@]}"; do      # iterate over the input array...
    out+=( "$(( num + 1 ))" )      # ...and add n+1 to the output array.
  done
  printf '%s\n' "${out[*]}"        # then print that output array with a newline following
done <in.txt >out.txt              # with input from in.txt and output to out.txt
You can do this using gnu awk:
awk -v RS="[[:space:]]+" '{$0++; ORS=RT} 1' file
2 5 8
3 6 9
If you don't mind Perl:
perl -pe 's/(\d+)/$1+1/eg' file
Substitute any number composed of one or more digits (\d+) with that number ($1) plus 1. /e means to evaluate the replacement as an expression, and /g means to replace every match on each line rather than only the first.
As mentioned in the comments, the above only works for positive integers - per the OP's original sample file. If you wanted it to work with negative numbers, decimals and still retain text and spacing, you could go for something like this:
perl -pe 's/([-]?[.0-9]+)/$1+1/eg' file
Input file
Some column headers # words
1 4 7 # a comment
2 5 cat dog # spacing and stray words
+5 0 # plus sign
-7 4 # minus sign
+1000.6 # positive decimal
-21.789 # negative decimal
Output
Some column headers # words
2 5 8 # a comment
3 6 cat dog # spacing and stray words
+6 1 # plus sign
-6 5 # minus sign
+1001.6 # positive decimal
-20.789 # negative decimal
I am trying to create a Bash script that
- prints a random word
- if a number is supplied as the first command line argument then it will select from only words with that many characters.
This is my go at the first section (print a random word):
C=$(sed -n "$RANDOM p" /usr/share/dict/words)
echo $C
I am really stuck with the second section. Can anyone help?
This might help someone coming from Ryan's tutorial:
#!/bin/bash
charlen=$1
grep -E "^.{$charlen}$" $PWD/words.txt | shuf -n 1
One approach is a while loop that reads every line of the file and checks whether the length of the word equals the specified number (apostrophes count as characters). On my system the file has 99171 lines.
#!/usr/bin/env bash
readWords() {
    declare -i int="$1"
    (( int == 0 )) && {
        printf "%s\n" "$int is 0, can't find 0 words"
        return 1
    }
    # IFS= and -r keep each line exactly as it appears in the file
    while IFS= read -r getWords; do
        if [[ ${#getWords} -eq $int ]]; then
            printf "%s\n" "$getWords"
        fi
    done < /usr/share/dict/words
}
readWords 20
This function takes a single argument. The declare -i command coerces the argument into an integer; a non-numeric string is coerced to 0. Since there are no words of length 0, the function returns early if the argument is 0 (or a string coerced to 0).
It then reads every line in /usr/share/dict/words, gets the length of each line with ${#getWords} (${#var} gives the length of a string, $# gives the number of command-line parameters, and ${#array[@]} gives the size of an array), and checks whether it equals the specified number.
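The difference between the length of a string and the number of parameters can be checked with a short sketch. The names str_len and arg_count are made up for illustration:

```shell
# str_len and arg_count are hypothetical helper names
str_len() {
    word=$1
    echo "${#word}"     # number of characters in the string
}

arg_count() {
    echo "$#"           # number of arguments passed to the function
}

str_len hello           # prints 5
arg_count a b c         # prints 3
```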
A loop is not required, you can do something like
CH=$1; # how many characters the word must have
WordFile=/usr/share/dict/words; # file to read from
# find how many words match that length
TOTW=$(grep -Ec "^.{$CH}$" "$WordFile");
# pick a random one, if you expect more than 32767 hits you
# need to do something like ($RANDOM+1)*($RANDOM+1)
RWORD=$(($RANDOM%$TOTW+1));
# show that word
grep -E "^.{$CH}$" "$WordFile"|sed -n "$RWORD p"
Depending on your needs, you probably want to add checks, for example that $1 is a reasonable number, that the file exists, that TOTW is greater than 0, and so on.
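The checks mentioned above could be sketched roughly as follows. This is not the code from the answer: random_word is a made-up function name, the return codes are arbitrary, and the random pick is done with awk's srand()/rand() instead of $RANDOM so that the sketch does not depend on bash.

```shell
# random_word: hypothetical helper, usage: random_word LENGTH [WORDFILE]
random_word() {
    len=$1
    file=${2:-/usr/share/dict/words}

    # LENGTH must consist of digits only and be greater than 0
    case $len in
        ''|*[!0-9]*) echo "length must be a positive integer" >&2; return 2 ;;
    esac
    [ "$len" -gt 0 ] || { echo "length must be greater than 0" >&2; return 2; }

    # the word list must exist and be readable
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }

    # collect the candidates, then pick one; exit status 1 means no match
    awk -v n="$len" '
        length($0) == n { words[++count] = $0 }
        END {
            if (count == 0) exit 1
            srand()
            print words[int(rand() * count) + 1]
        }' "$file"
}
```

A call such as random_word 5 would then print one five-letter word from /usr/share/dict/words, if that file exists. Because srand() is seeded from the clock, two calls within the same second may return the same word.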
This code would achieve what you want:
awk -v n="$1" 'length($0) == n' /usr/share/dict/words > /tmp/wordsHolder
shuf -n 1 /tmp/wordsHolder
Some comments: using $RANDOM (as in your original attempt) generates an integer in the range 0 - 32767, which could be more (or less) than the number of words (lines) available for the desired word length, so there is potential for errors.
To avoid that, we use shuf, which picks a pseudo-random line from the whole file (from line 1 to the last line).
Suppose I have a file as follows (a sorted, unique list of integers, one per line):
1
3
4
5
8
9
10
I would like the following output (i.e. the missing integers in the list):
2
6
7
How can I accomplish this within a bash terminal (using awk or a similar solution, preferably a one-liner)?
Using awk you can do this:
awk '{for(i=p+1; i<$1; i++) print i} {p=$1}' file
2
6
7
Explanation:
{p = $1}: Variable p contains value from previous record
{for ...}: We loop from p+1 to the current row's value (excluding the current value) and print each value; these are the missing values
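Because p starts out empty (treated as 0), the command above also reports every integer below the first value when the list does not start at 1. If only gaps between the first and last value are wanted, a small variation (an assumption about the desired behaviour, not part of the original answer) is to skip the first record. missing_ints is a made-up name:

```shell
# missing_ints: hypothetical helper; reads sorted integers, one per line, from stdin
missing_ints() {
    awk 'NR > 1 { for (i = prev + 1; i < $1; i++) print i }  # fill the gap
         { prev = $1 }                                        # remember this value
    '
}

printf '1\n3\n4\n5\n8\n9\n10\n' | missing_ints
```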
Using seq and grep:
seq $(head -n1 file) $(tail -n1 file) | grep -vwFf file -
seq creates the full sequence, grep removes the lines that exist in the file from it.
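Wrapped in a function, the same idea might look like the sketch below. The name missing_by_seq is made up, and -x (match whole lines) is used here in place of -w as a variation; it assumes the file has exactly one integer per line with no stray whitespace.

```shell
# missing_by_seq: hypothetical helper; FILE holds sorted integers, one per line
missing_by_seq() {
    file=$1
    first=$(head -n 1 "$file")
    last=$(tail -n 1 "$file")
    # -v inverts the match, -x matches whole lines only,
    # -F treats patterns as fixed strings, -f reads the patterns from FILE
    seq "$first" "$last" | grep -vxFf "$file"
}
```

It would be called as missing_by_seq file. Note that grep exits with status 1 when nothing is missing, which matters if the script runs under set -e.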
perl -nE 'say for $a+1 .. $_-1; $a=$_'
Calling no external program (if filein contains the list of numbers):
#!/bin/bash
i=0
while read num; do
while (( ++i<num )); do
echo $i
done
done <filein
To adapt choroba's clever answer for my own use case, I needed my sequence to deal with zero-padded numbers.
The -w switch to seq is the magic here - it automatically pads the first number with the necessary number of zeroes to keep it aligned with the second number:
-w, --equal-width equalize width by padding with leading zeroes
My integers go from 0 to 9999, so I used the following:
seq -w 0 9999 | grep -vwFf "file.txt"
...which finds the missing integers in a sequence from 0000 to 9999. Or to put it back into the more universal solution in choroba's answer:
seq -w $(head -n1 "file.txt") $(tail -n1 "file.txt") | grep -vwFf "file.txt"
I didn't personally find that the - in his answer was necessary (grep reads standard input by default when no file operand is given), but there may be use cases which make it so.
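A small sketch of the padded variant, with the range passed in explicitly. missing_padded is a made-up name, and the sketch assumes the numbers in the file are padded to the same width that seq -w produces for the given range:

```shell
# missing_padded: hypothetical helper, usage: missing_padded FILE FIRST LAST
missing_padded() {
    file=$1 first=$2 last=$3
    # seq -w pads every number with leading zeroes to the width of the widest one
    seq -w "$first" "$last" | grep -vwFf "$file"
}

seq -w 8 10     # prints 08, 09 and 10 on separate lines
```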
Using Raku (formerly known as Perl_6)
raku -e 'my @a = lines.map: *.Int; say @a.Set (^) @a.minmax.Set;'
Sample Input:
1
3
4
5
8
9
10
Sample Output:
Set(2 6 7)
I'm sure there's a Raku solution similar to @JJoao's clever Perl5 answer, but in thinking about this problem my mind naturally turned to Set operations.
The code above reads lines into the @a array, mapping each line so that elements in the @a array are Ints, not strings. In the second statement, @a.Set converts the array to a Set on the left-hand side of the (^) operator. Also in the second statement, @a.minmax.Set converts the array to a second Set, on the right-hand side of the (^) operator, but this time because the minmax operator is used, all Int elements from the min to max are included. Finally, the (^) symbol is the symmetric set-difference (infix) operator, which finds the difference.
To get an unordered whitespace-separated list of missing integers, replace the above say with put. To get a sequentially-ordered list of missing integers, add the explicit sort below:
~$ raku -e 'my @a = lines.map: *.Int; .put for (@a.Set (^) @a.minmax.Set).sort.map: *.key;' file
2
6
7
The advantage of all Raku code above is that finding "missing integers" doesn't require a "sequential list" as input, nor is the input required to be unique. So hopefully this code will be useful for a wide variety of problems in addition to the explicit problem stated in the Question.
OTOH, Raku is a Perl-family language, so TMTOWTDI. Below, a @a.minmax array is created, and grepped so that none of the elements of @a are returned (none junction):
~$ raku -e 'my @a = lines.map: *.Int; .put for @a.minmax.grep: none @a;' file
2
6
7
https://docs.raku.org/language/setbagmix
https://docs.raku.org/type/Junction
https://raku.org
I am trying to iterate through a string taken as input through the read command. The script should use a loop to output each letter in turn, together with its position. For example, if the user enters "picasso", the output should be:
Letter 1: p
Letter 2: i
Letter 3: c
Letter 4: a
Letter 5: s
Letter 6: s
Letter 7: o
Here is my current code:
#!/bin/bash
# Prompt a user to enter a word and output each letter in turn.
read -p "Please enter a word: " word
for i in $word
do
echo "Letter $i: $word"
done
Should I be placing the input to an array? I'm still new to programming loops but I'm finding it impossible to figure out the logic.
Any advice? Thanks.
Combining answers from dtmilano and patrat would give you:
read -p "Please enter a word: " word
for i in $(seq 1 ${#word})
do
echo "Letter $i: ${word:i-1:1}"
done
${#word} gives you the length of the string.
Use the substring operator
${word:i:1}
to obtain the character of word at index i (counting from 0).
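For shells without the ${word:i:1} substring expansion, a variant that peels off one character at a time might look like this sketch. The name print_letters is made up; the loop uses only the ${var#pattern} and ${var%pattern} expansions and needs neither seq nor an index calculation.

```shell
# print_letters: hypothetical helper name
print_letters() {
    rest=$1
    i=1
    while [ -n "$rest" ]; do
        first=${rest%"${rest#?}"}   # keep only the first character of what is left
        printf 'Letter %d: %s\n' "$i" "$first"
        rest=${rest#?}              # drop that character
        i=$((i + 1))
    done
}

print_letters picasso
```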
Check out the seq command.
For example:
seq 1 10
will print the numbers 1 through 10, one per line.
For letters, you can try brace expansion:
echo {a..g}
Result
a b c d e f g
Now you should be able to handle your problem.
I have a text file like this:
AAAAAA this is some content.
This is AAAAAA some more content AAAAAA. AAAAAA
This is yet AAAAAA some more [AAAAAA] content.
I need to replace all occurrence of AAAAAA with an incremented number, e.g., the output would look like this:
1 this is some content.
This is 2 some more content 3. 4
This is yet 5 some more [6] content.
How can I replace all of the matches with an incrementing number?
Here is one way of doing it:
$ awk '{for(x=1;x<=NF;x++)if($x~/AAAAAA/){sub(/AAAAAA/,++i)}}1' file
1 this is some content.
This is 2 some more content 3. 4
This is yet 5 some more [6] content.
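The command above calls sub() once per field that contains the marker, so it relies on each whitespace-separated field holding at most one occurrence. A sketch that does not depend on field boundaries is shown below. number_matches is a made-up name, and the loop assumes the pattern can never match the inserted digits (otherwise it would not terminate).

```shell
# number_matches: hypothetical helper, usage: number_matches PATTERN < input
number_matches() {
    awk -v pat="$1" '{
        # match() sets RSTART and RLENGTH for the leftmost occurrence;
        # splice the next counter value in and look again
        while (match($0, pat))
            $0 = substr($0, 1, RSTART - 1) (++n) substr($0, RSTART + RLENGTH)
        print
    }'
}

printf 'AAAAAA first, [AAAAAA] second\nAAAAAAAAAAAA\n' | number_matches AAAAAA
```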
A perl solution:
perl -pe 'BEGIN{$A=1;} s/AAAAAA/$A++/ge' test.dat
This might work for you (GNU sed):
sed -r ':a;/AAAAAA/{x;:b;s/9(_*)$/_\1/;tb;s/^(_*)$/0\1/;s/$/:0123456789/;s/([^_])(_*):.*\1(.).*/\3\2/;s/_/0/g;x;G;s/AAAAAA(.*)\n(.*)/\2\1/;ta}' file
This is a toy example, perl or awk would be a better fit for a solution.
The solution only acts on lines which contain the required string (AAAAAA).
The hold buffer is used as a place to keep the incremented integer.
In overview: when a required string is encountered, the integer in the hold space is incremented, appended to the current line, swapped for the required string and the process is then repeated until all occurrences of the string are accounted for.
Incrementing an integer simply swaps the last digit (other than trailing 9's) for the next integer in sequence i.e. 0 to 1, 1 to 2 ... 8 to 9. Where trailing 9's occur, each trailing 9 is replaced by a non-integer character e.g '_'. If the number being incremented consists entirely of trailing 9's a 0 is added to the front of the number so that it can be incremented to 1. Following the increment operation, the trailing 9's (now _'s) are replaced by '0's.
As an example say the integer 9 is to be incremented:
9 is replaced by _, a 0 is prepended (0_), the 0 is swapped for 1 (1_), the _ is replaced by 0. resulting in the number 10.
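To try the increment step on its own, the commands that operate on the hold space can be lifted out of the one-liner and applied to a single number. This is only a sketch: increment is a made-up function name, and -E is used where the original uses -r (GNU sed accepts both).

```shell
# increment: hypothetical helper that applies only the increment part of the
# sed script above to its argument.
# Steps, in order: turn each trailing 9 into _ (loop via label b); prepend 0
# if only _ characters remain; append a lookup string; replace the last real
# digit by its successor from the lookup; turn every _ into 0.
increment() {
    printf '%s\n' "$1" | sed -E \
        -e ':b' \
        -e 's/9(_*)$/_\1/' \
        -e 'tb' \
        -e 's/^(_*)$/0\1/' \
        -e 's/$/:0123456789/' \
        -e 's/([^_])(_*):.*\1(.).*/\3\2/' \
        -e 's/_/0/g'
}

increment 9      # prints 10
increment 199    # prints 200
```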
See comments directed at @jaypal for further notes.
Maybe something like this
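#!/bin/bash
NR=1
while IFS= read -r line
do
    # replace one occurrence at a time so that each match gets its own number
    while [[ $line == *AAAAAA* ]]
    do
        line=${line/AAAAAA/$NR}
        NR=$((NR + 1))
    done
    printf '%s\n' "$line"
done < filename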
Perl did the job for me
perl -pi -e 's/\b'DROP'\b/$&.'_'.++$A /ge' /folder/subfolder/subsubfolder/*
Input:
DROP
drop
$drop
$DROP
$DROP="DROP"
$DROP='DROP'
$DROP=$DROP
$DROP="DROP";
$DROP='DROP';
$DROP=$DROP;
$var="DROP_ACTION"
drops
DROPS
CODROP
'DROP'
"DROP"
/DROP/
Output:
DROP_1
drop
$drop
$DROP_2
$DROP_3="DROP_4"
$DROP_5='DROP_6'
$DROP_7=$DROP_8
$DROP_9="DROP_10";
$DROP_11='DROP_12';
$DROP_13=$DROP_14;
$var="DROP_ACTION"
drops
DROPS
CODROP
'DROP_15'
"DROP_16"
/DROP_17/