List of extensions of filenames in bash script in one line - bash

I currently have the following line of code:
ls /some/dir/prefix* | sed -e 's/prefix.//' | tr '\n' ' '
Which does achieve what I want it to do:
Get list of files starting with prefix
Remove path and prefix from each string
Remove newlines and replace with spaces for later processing.
For example:
/some/dir/prefix.hello
/some/dir/prefix.world
Should become
hello world
But I feel like there's a nicer way of doing this. Is there a better way to do this in one line?

Here is a two-liner using just built-ins that does it:
fnames=(some/dir/prefix*)
echo "${fnames[@]##*.}"
And here's how this works:
fnames=(some/dir/prefix*) creates an array with all the files starting with prefix and avoids all the problems that come with parsing ls
echo "${fnames[@]##*.}" is a combination of two parameter expansions: ${fnames[@]} expands to all array elements, and the ##*. part removes the longest match of anything that ends with . from each array element, leaving just the extension
If you're hell-bent on a one-liner, just join the two commands with &&.
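One thing the array approach does not cover by itself is the case where nothing matches: the unexpanded pattern then ends up in the array. A minimal sketch of one way to handle that, assuming bash (arrays and shopt are not POSIX sh); list_extensions is a hypothetical helper name, not part of the answer above:

```shell
#!/usr/bin/env bash
# list_extensions PREFIX
# Prints the extensions of all files whose path starts with PREFIX,
# separated by spaces. The ( ... ) function body runs in a subshell,
# so the nullglob setting does not leak into the calling shell.
list_extensions() (
    shopt -s nullglob          # an unmatched glob expands to nothing
    fnames=("$1"*)
    # "${fnames[*]...}" applies the expansion to every element and joins
    # the results with the first character of IFS (a space by default).
    echo "${fnames[*]##*.}"
)

# Demo in a throwaway directory
demo_dir=$(mktemp -d)
touch "$demo_dir/prefix.hello" "$demo_dir/prefix.world"
list_extensions "$demo_dir/prefix"    # prints: hello world
rm -rf "$demo_dir"
```

Note that a file without any dot in its name would come out as its full path, since ##*. then has nothing to remove.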

Passing ls output to external programs is not recommended; the following bash solution may help here.
for file in prefix*; do echo "${file##*.}"; done
Here is the same solution in multi-line form:
for file in prefix*
do
    echo "${file##*.}"
done
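The loop prints one extension per line, whereas the question asked for a single space-separated line. A possible variation, sketched here with a hypothetical helper called join_extensions, collects the extensions in a variable first. It only uses POSIX sh features:

```shell
# join_extensions PREFIX
# Prints the extensions of all files starting with PREFIX on one line.
join_extensions() {
    exts=""
    for file in "$1"*; do
        [ -e "$file" ] || continue          # skip the literal pattern if nothing matched
        exts="${exts:+$exts }${file##*.}"   # add a space only between items
    done
    printf '%s\n' "$exts"
}

# Demo in a throwaway directory
demo_dir=$(mktemp -d)
touch "$demo_dir/prefix.hello" "$demo_dir/prefix.world"
join_extensions "$demo_dir/prefix"    # prints: hello world
rm -rf "$demo_dir"
```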

Here is a very simple Awk one-liner to achieve this :
awk -F. '{$0=FILENAME; printf $NF" "; nextfile}' /some/dir/prefix*
It essentially does the following :
-F.: Set the field separator FS to a .. This way $NF represents the extension.
$0=FILENAME: Discard the current record and replace it with FILENAME, which makes awk re-split the fields.
printf $NF" "; nextfile : print the extension followed by a space and go to the next file.
The problem with this is that awk must still read one record from each file before the action runs. If a file is empty, no record is read and its extension is never printed.
To make this work with empty files, you could use the gawk extension BEGINFILE
awk -F. 'BEGINFILE{$0=FILENAME; printf $NF" "; nextfile}' /some/dir/prefix*
Or you can loop over all the arguments :
awk -F. 'BEGIN{for(i=1;i<ARGC;i++){$0=ARGV[i]; printf $NF" "};exit}' /some/dir/prefix*
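To check the empty-file case described above, the argument loop can be wrapped in a small function and run against empty files. This is only a sketch: awk_extensions is a hypothetical name, and the loop starts at index 1 because ARGV[0] holds the awk command name rather than a file.

```shell
# awk_extensions FILE...
# Prints the extension of every file name given, without reading the files.
awk_extensions() {
    awk -F. 'BEGIN {
        for (i = 1; i < ARGC; i++) {
            $0 = ARGV[i]                              # re-split the file name on "."
            printf "%s%s", (i > 1 ? " " : ""), $NF    # space between items only
        }
        print ""
        exit                                          # never open the files
    }' "$@"
}

# Demo with empty files in a throwaway directory
demo_dir=$(mktemp -d)
: > "$demo_dir/prefix.hello"
: > "$demo_dir/prefix.world"
awk_extensions "$demo_dir"/prefix*    # prints: hello world
rm -rf "$demo_dir"
```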

One approach with awk:
ls /some/dir/prefix* | awk -F"." '{printf "%s ", $2} END {print ""}'
It might qualify as being "nicer" because there's only one command the output is piped through?!

Related

Bash read filename and return version number with awk

I am trying to use one or two lines of Bash (that can be run in a command line) to read a folder-name and return the version inside of the name.
So if I have myfolder_v1.0.13 I know that I can use echo "myfolder_v1.0.13" | awk -F"v" '{ print $2 }' and it will return with 1.0.13.
But how do I get the shell to read the folder name and pipe with the awk command to give me the same result without using echo? I suppose I could always navigate to the directory and translate the output of pwd into a variable somehow?
Thanks in advance.
Edit: As soon as I asked I figured it out. I can use
result=${PWD##*/}; echo $result | awk -F"v" '{ print $2 }'
and it gives me what I want. I will leave this question up for others to reference unless someone wants me to take it down.
But you don't need awk at all here; just use bash parameter expansion.
string="myfolder_v1.0.13"
printf "%s\n" "${string##*v}"
1.0.13
You can use
basename "$(cd "foldername" ; pwd )" | awk -Fv '{print $2}'
to get the shell to give you the directory name, but if you really want to use the shell, you could also avoid the use of awk completely:
Assuming you have the path to the folder with the version number in the parameter "FOLDERNAME":
echo "${FOLDERNAME##*v}"
This removes the longest prefix matching the glob expression "*v" in the value of the parameter FOLDERNAME.
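Because ##*v removes everything up to the last v in the string, a name with a later v in it gives a surprising result: a hypothetical myfolder_v1.0.13-dev would come out empty. A minimal sketch that anchors on _v instead, assuming the version always follows that marker (folder_version is an illustrative name):

```shell
# folder_version PATH
# Prints whatever follows the last "_v" in the final path component.
# Returns 1 when the name contains no "_v" marker.
folder_version() {
    name=${1%/}          # drop one trailing slash, if present
    name=${name##*/}     # keep only the last path component
    case $name in
        *_v*) printf '%s\n' "${name##*_v}" ;;
        *)    return 1 ;;
    esac
}

folder_version "/opt/myfolder_v1.0.13"    # prints: 1.0.13
folder_version "myfolder_v1.0.13-dev"     # prints: 1.0.13-dev
folder_version "$PWD" || echo "no version in the current directory name"
```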

Find string in col 1, print col 2 in awk

I'm on a Mac, and I want to find a field in a CSV file adjacent to a search string
This is going to be a single file with a hard path; here's a sample of it:
84:a5:7e:6c:a6:b0, AP-ATC-151g84
84:a5:7e:6c:a6:b1, AP-A88-131g84
84:a5:7e:73:10:32, AP-AG7-133g56
84:a5:7e:73:10:30, AP-ADC-152g81
84:a5:7e:73:10:31, AP-D78-152e80
so if my search string is "84:a5:7e:73:10:32"
I want to get returned "AP-AG7-133g56"
I had been working within an Applescript, but maybe a shell script will do.
I just need the proper syntax for opening the file and having awk search it. Again, I'm weak conceptually on how shell commands run, how they must be executed, etc
This errors, gives me ("command not found"):
set the_file to "/Users/Paw/Desktop/AP-Decoder 3.app/Contents/Resources/BSSIDtable.csv"
set the_val to "70:56:81:cb:a2:dc"
do shell script "'awk $1 ~ the_val {print $2} the_file'"
Thank you for coddling me...
This is relatively simple:
awk '$1 == "70:56:81:cb:a2:dc," {print "The answer is "$2}' 'BSSIDtable.csv'
(the "The answer is " text can be omitted if you wish to see only the data, but this shows you how to get more user-friendly output if desired).
The comma is included since awk uses white space for separators so the comma becomes part of column 1.
If the thing you're looking for is in a shell variable, you can use -v to provide that to awk as an awk variable:
lookfor="70:56:81:cb:a2:dc,"
awk -v mac="$lookfor" '$1 == mac {print "The answer is "$2}' 'BSSIDtable.csv'
As an aside, your AppleScript solution is probably not working because the $1/$2 are being interpreted as shell variables rather than awk variables. If you insist on using AppleScript, you will have to figure out how to construct a shell command that quotes the awk commands correctly.
My advice is to just use the shell directly, the number of people proficient in that almost certainly far outnumber those proficient in AppleScript :-)
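An alternative to carrying the comma around in the search string is to make the comma part of the field separator. The following is a sketch under that assumption; lookup_name is a hypothetical helper, and the sample rows are taken from the question:

```shell
# lookup_name MAC FILE
# Prints the second column of the first row whose first column equals MAC.
lookup_name() {
    # The separator is a comma followed by optional spaces or tabs,
    # so $1 is the bare MAC address and $2 is the name.
    awk -F',[ \t]*' -v mac="$1" '$1 == mac { print $2; exit }' "$2"
}

# Demo with a temporary copy of the sample data
csv=$(mktemp)
cat > "$csv" <<'EOF'
84:a5:7e:6c:a6:b0, AP-ATC-151g84
84:a5:7e:73:10:32, AP-AG7-133g56
84:a5:7e:73:10:30, AP-ADC-152g81
EOF
lookup_name "84:a5:7e:73:10:32" "$csv"    # prints: AP-AG7-133g56
rm -f "$csv"
```

The exit stops awk at the first match, which avoids reading the rest of a large table.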
If sed is available (it normally is on a Mac, even though the question is not tagged sed):
Simple, but reads the whole file:
sed -n 's/84:a5:7e:73:10:32,[[:blank:]]*//p' YourFile
Quit after the first occurrence (on average about 50% faster on a huge file):
sed -n -e '/84:a5:7e:73:10:32,[[:blank:]]*/!b' -e 's///p;q' YourFile
With awk:
awk '/^84:a5:7e:73:10:32/ {print $2}' YourFile
# OR using a variable for batch interaction (the field separator keeps the comma out of $1)
awk -F', *' -v Src='84:a5:7e:73:10:32' '$1 == Src {print $2}' YourFile
# OR if the case is unknown (tolower is portable; IGNORECASE is gawk-only)
awk -F', *' -v Src='84:a5:7e:73:10:32' 'tolower($1) == tolower(Src) {print $2}' YourFile
A bare regex is matched against the whole line ($0) by default; the ^ anchors it to the start of the line, which is where the first field begins.

How do I write an awk print command in a loop?

I would like to write a loop creating various output files with the first column of each input file, respectively.
So I wrote
for i in $(\ls -d /home/*paired.isoforms.results)
do
awk -F"\t" {print $1}' $i > $i.transcript_ids.txt
done
As an example if there were 5 files in the home directory named
A_paired.isoforms.results
B_paired.isoforms.results
C_paired.isoforms.results
D_paired.isoforms.results
E_paired.isoforms.results
I would like to print the first column of each of these files into a separate output file, i.e. I would like to have 5 output files called
A.transcript_ids.txt
B.transcript_ids.txt
C.transcript_ids.txt
D.transcript_ids.txt
E.transcript_ids.txt
or any other name as long as it is 5 different names and I can still link them back to the original files.
I understand, that there is a problem with the double usage of $ in both the awk and the loop command, but I don't know how to change that.
Is it possible to write a command like this in a loop?
This should do the job:
for file in /home/*paired.isoforms.results
do
    base=${file##*/}
    base=${base%%_*}
    awk -F"\t" '{print $1}' "$file" > "$base.transcript_ids.txt"
done
I assume that there can be spaces in the first field since you set the delimiter explicitly to tab. This runs awk once per file. There are ways to do it running awk once for all files, but I'm not convinced the benefit is significant. You could consider using cut instead of awk '{print $1}', too. Note that using ls as you did is less satisfactory than using globbing directly; it runs afoul of file names with oddball characters (spaces, tabs, etc) in the name.
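The cut variant mentioned above might look like the following sketch. It relies on cut using tab as its default delimiter. The function name extract_first_columns is hypothetical, and writing the output next to the input files (rather than into the current directory) is an assumption made for the demo:

```shell
# extract_first_columns DIR
# For every DIR/X_paired.isoforms.results, writes the first tab-separated
# column to DIR/X.transcript_ids.txt.
extract_first_columns() {
    for file in "$1"/*_paired.isoforms.results; do
        [ -e "$file" ] || continue       # nothing matched the pattern
        base=${file##*/}                 # strip the directory part
        base=${base%%_*}                 # keep the text before the first "_"
        cut -f1 "$file" > "$1/$base.transcript_ids.txt"
    done
}

# Demo in a throwaway directory
demo_dir=$(mktemp -d)
printf 'id1\t10\nid2\t20\n' > "$demo_dir/A_paired.isoforms.results"
extract_first_columns "$demo_dir"
cat "$demo_dir/A.transcript_ids.txt"    # prints id1 and id2 on separate lines
rm -rf "$demo_dir"
```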
You can do that entirely in awk:
awk -F"\t" '{split(FILENAME,a,"_"); out=a[1]".transcript_ids.txt"; print $1 > out}' *_paired.isoforms.results
If your input files don't have names as indicated in the question, you'd have to split on something else ( as well as use a different pattern match for the input files ).
My original answer is actually doing extra name resolution every time something is printed. Here's a version that only updates the output filename when FILENAME changes:
awk -F"\t" 'FILENAME!=lf{split(FILENAME,a,"_"); out=a[1]".transcript_ids.txt"; lf=FILENAME} {print $1 > out}' *_paired.isoforms.results

How to put punctuation quotation in Awk command?

I am new to awk. I am trying to write something to modify my text file, but I failed.
I want output like 'hello'.
I used the command awk '{print "'hello'"}' filename to do it, but it failed:
The output was: hello
Then I used the command awk '{print "\'hello\'"}' filename, which failed again:
The output was: >
It seems that awk does not understand what I mean.
I am confused about how to solve this problem.
Thanks, guys.
Using the ascii code:
awk '{print "\x27" "hello" "\x27"}' filename
Using a variable:
awk -v q="'" '{print q "hello" q}' filename
Example:
$ seq 2 > filename
$ awk '{print "\x27" "hello" "\x27"}' filename
'hello'
'hello'
$ awk -v q="'" '{print q "hello" q}' filename
'hello'
'hello'
Simply use double quotes:
awk "{print \"'hello'\"}" filename
Although that won't really modify your file.
awk '{print "'"'"'hello'"'"'"}' filename
clyfish's answer works, if you must have it output single quotes and you must use scripts that you pass on the command line.
What I usually do in cases like these, though, when I need to do quoting but I don't want to write a 'real' awk script, is this:
awk 'function q(word) { return "\"" word "\"" }
{ printf("mv %s SomeDir/;", q($0)) }'
What I've done is to define a function that returns whatever you pass it in double quotes. Then use printf to actually use it. Without doing that, I would have had to do:
awk '{ print("mv \"" $0 "\" SomeDir/;") }';
It gets pretty nasty. For more complicated examples, this can be a life saver.
However, suppose you really do need to output something with actual single quotes. In that case dealing with odd shell quoting rules while trying to pass scripts like this on the command line is going to drive you completely insane, so I would suggest you just write a simple throwaway file.
#!/usr/bin/awk -f
# hi.awk
{ print("'hello'") }
then call it:
awk -f ./hi.awk
You don't really even need the #! line in the file if you do it that way, but neither does it hurt.
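For short command-line scripts, one more option is the octal escape \047, which POSIX awk accepts in string literals and which keeps literal single quotes out of the shell-quoted program altogether. A minimal sketch, with quote_lines as a hypothetical helper name:

```shell
# quote_lines [FILE...]
# Wraps every input line in single quotes.
quote_lines() {
    # \047 is the octal code for the single-quote character.
    awk '{ print "\047" $0 "\047" }' "$@"
}

printf 'hello\nworld\n' | quote_lines
# prints:
# 'hello'
# 'world'
```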

How do I print a field from a pipe-separated file?

I have a file with fields separated by pipe characters and I want to print only the second field. This attempt fails:
$ cat file | awk -F| '{print $2}'
awk: syntax error near line 1
awk: bailing out near line 1
bash: {print $2}: command not found
Is there a way to do this?
Or just use one command:
cut -d '|' -f FIELDNUMBER
The key point here is that the pipe character (|) must be escaped to the shell. Use "\|" or "'|'" to protect it from shell interpretation and allow it to be passed to awk on the command line.
Reading the comments I see that the original poster presents a simplified version of the original problem which involved filtering the file before selecting and printing the fields. A pass through grep was used and the result piped into awk for field selection. That accounts for the wholly unnecessary cat file that appears in the question (it replaces the grep <pattern> file).
Fine, that will work. However, awk is largely a pattern matching tool on its own, and can be trusted to find and work on the matching lines without needing to invoke grep. Use something like:
awk -F\| '/<pattern>/{print $2;}{next;}' file
The /<pattern>/ bit tells awk to perform the action that follows on lines that match <pattern>.
The lost-looking {next;} is a default action skipping to the next line in the input. It does not seem to be necessary, but I have this habit from long ago...
The pipe character needs to be escaped so that the shell doesn't interpret it. A simple solution:
$ awk -F\| '{print $2}' file
Another choice would be to quote the character:
$ awk -F'|' '{print $2}' file
Another way using awk
awk 'BEGIN { FS = "|" } ; { print $2 }'
And 'file' contains no pipe symbols, so it prints nothing. You should either use 'cat file' or simply list the file after the awk program.
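Putting the pieces from the answers together, a small sketch that filters by a pattern and prints the second pipe-separated field might look like this. second_field is a hypothetical name, and passing the pattern in with -v is a choice made here, not something taken from the answers:

```shell
# second_field PATTERN [FILE...]
# Prints field 2 of every pipe-separated line matching the awk regex PATTERN.
# Reads standard input when no file is given.
second_field() {
    pattern=$1
    shift
    # Quoting '|' keeps the shell from treating it as a pipeline operator.
    awk -F'|' -v pat="$pattern" '$0 ~ pat { print $2 }' "$@"
}

printf 'a|b|c\nx|y|z\n' | second_field '^x'    # prints: y
```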
