How to place files containing increasing numeric names consecutively in the terminal - bash

I have certain files named something like file_1.txt, file_2.txt, ..., file_40.txt and I want to plot them in the terminal using xmgrace like this:
xmgrace file_01.txt file_02.txt [...] file_40.txt
What would be a bash code, maybe a for loop code so that I don't have to write them one by one from 1 to 40, please?
[Edit:]
I should mention that I tried to use the for loop as follows: for i in {00-40}; do xmgrace file_$i.txt; done, but it didn't help as it opens each file separately.

Depending on the tool you use:
xmgrace file_*.txt
This uses a glob, which passes every file matching the pattern (in the shell's sort order, which is not necessarily numeric).
Or, as Jetchisel wrote in the comments:
xmgrace file_{1..40}.txt
This is brace expansion.
For general purposes, if the tool requires a loop:
for i in {1..40}; do something "$i"; done
or
for ((i=1; i<=40; i++)); do something "$i"; done
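If the real names are zero-padded (file_01.txt rather than file_1.txt, as in the command in the question), brace expansion can pad as well. A minimal sketch, assuming bash 4 or newer; the array name files is only for illustration, and the xmgrace call is commented out so the snippet runs without it:

```bash
#!/bin/bash
# Zero-padded brace expansion (bash 4+): 01, 02, ..., 40.
files=(file_{01..40}.txt)

# Brace expansion generates the names as text; the files need not exist.
echo "${#files[@]} names, from ${files[0]} to ${files[39]}"

# Pass all of them to a single invocation:
# xmgrace "${files[@]}"
```

Unlike a glob, this keeps numeric order and does not depend on which files are present.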

Related

Double sequence with brackets?

I was wondering if there is a simple way to produce a double sequence with something similar to curly braces.
I would like to produce a double sequence like this one:
eog directory1/somethingelse/file2.png directory3/somethingelse/file6.png ... directory25/somethingelse/file50.png
The sequences of directories and files are regular (for example, something like {1..25..2} for directories and {2..50..4} for files).
I wonder if there is a simple way to produce the sequences instead of using vectors with all the values. I mean something like
eog directory[someOpenedBracket]1..25..2[someClosedBracket]/somethingelse/file[someOpenedBracket]2..50..4[someClosedBracket].png
Thanks in advance
You can populate two separate arrays and loop through them:
dirs=(directory{1..25..2})
files=(file{2..50..4})
for ((i=0; i<${#dirs[@]}; i++)); do
printf '%s ' "${dirs[i]}/somethingelse/${files[i]}.png"
done
echo
Output (printed on a single line; shown here one entry per line):
directory1/somethingelse/file2.png
directory3/somethingelse/file6.png
directory5/somethingelse/file10.png
directory7/somethingelse/file14.png
directory9/somethingelse/file18.png
directory11/somethingelse/file22.png
directory13/somethingelse/file26.png
directory15/somethingelse/file30.png
directory17/somethingelse/file34.png
directory19/somethingelse/file38.png
directory21/somethingelse/file42.png
directory23/somethingelse/file46.png
directory25/somethingelse/file50.png
I think this is all you can achieve with Bash.
eog $(i=0; while ((++i<=25)); do echo dir$i/file$((i++*2)); done)
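Another way to keep the two sequences in step is a single arithmetic loop with two counters, collecting the paths in an array so the viewer is called only once. This is a sketch, assuming bash; the names args, d and f are illustrative, and the eog call is commented out so the snippet runs anywhere:

```bash
#!/bin/bash
# Advance both counters together: d = 1,3,...,25 and f = 2,6,...,50.
args=()
for ((d=1, f=2; d<=25; d+=2, f+=4)); do
  args+=("directory$d/somethingelse/file$f.png")
done

printf '%s\n' "${args[@]}"

# Open all of them in one invocation:
# eog "${args[@]}"
```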

concatenate files with similar names using shell

I have very limited knowledge of shell scripting, for example if I have the following files in a folder
abcd_1_1.txt
abcd_1_2.txt
def_2_1.txt
def_2_2.txt
I want the output as abcd_1.txt, def_2.txt. For each pattern in the file names, concatenate the files and generate 'pattern'.txt as the output
patterns list <-?
for i in patterns; do echo cat "$i"* > "$i".txt; done
I am not sure how to code this in a shell script, any help is appreciated.
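Maybe something like this (assumes bash, and I didn't test it).
# Collect the unique prefixes as keys of an associative array
declare -A prefix
files=(*.txt)
for f in "${files[@]}"; do
# ${f%_*} strips the last _ and everything after it: abcd_1_2.txt -> abcd_1
prefix[${f%_*}]=1
done
# Print the output file name for each prefix
for key in "${!prefix[@]}"; do
echo "$key.txt"
done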
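Putting the prefix detection and the concatenation together might look like the sketch below. It assumes bash 4+ (for associative arrays) and names of the form prefix_N_M.txt; the temporary directory and the sample files exist only so the example can run on its own:

```bash
#!/bin/bash
# Work in a scratch directory with four hypothetical sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for f in abcd_1_1 abcd_1_2 def_2_1 def_2_2; do
  echo "$f" > "$f.txt"
done

# Collect unique prefixes: abcd_1_2.txt -> abcd_1
declare -A prefixes
for f in *_*_*.txt; do
  prefixes[${f%_*}]=1
done

# Concatenate the parts of each prefix into prefix.txt.
# The glob prefix_*.txt does not match the output file prefix.txt itself.
for p in "${!prefixes[@]}"; do
  cat "$p"_*.txt > "$p.txt"
done

ls
```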
for i in abcd_1 def_2
do
cat "$i"*.txt > "$i".txt
done
The above will work in any POSIX shell, such as dash or bash.
If, for some reason, you want to maintain a list of patterns and then loop through them, then it is appropriate to use an array:
#!/bin/bash
patterns=(abcd_1 def_2)
for i in "${patterns[@]}"
do
cat "$i"*.txt > "$i".txt
done
Arrays require an advanced shell such as bash.
Related Issue: File Order
Does the order in which files are added to abcd_1.txt or def_2.txt matter to you? The * results in lexical ordering, which can conflict with numeric ordering. For example:
$ echo def_2_*.txt
def_2_10.txt def_2_11.txt def_2_12.txt def_2_1.txt def_2_2.txt def_2_3.txt def_2_4.txt def_2_5.txt def_2_6.txt def_2_7.txt def_2_8.txt def_2_9.txt
Observe that def_2_12.txt appears in the list ahead of def_2_1.txt. Is this a problem? If so, we can explicitly force numeric ordering. One method to do this is bash's brace expansion:
$ echo def_2_{1..12}.txt
def_2_1.txt def_2_2.txt def_2_3.txt def_2_4.txt def_2_5.txt def_2_6.txt def_2_7.txt def_2_8.txt def_2_9.txt def_2_10.txt def_2_11.txt def_2_12.txt
In the above, the files are numerically ordered.
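When the highest number is not known in advance, one alternative is to sort the glob results numerically. The sketch below assumes GNU sort (its -V "version sort" option) and file names without newlines; the scratch directory and sample files are hypothetical:

```bash
#!/bin/bash
# Scratch directory with three hypothetical parts: 1, 2 and 10.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for n in 1 2 10; do
  echo "part $n" > "def_2_$n.txt"
done

prefix=def_2
# sort -V orders the names as 1, 2, 10 instead of the lexical 1, 10, 2.
printf '%s\n' "$prefix"_*.txt | sort -V | while IFS= read -r part; do
  cat "$part"
done > "$prefix.txt"

cat "$prefix.txt"
```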

Bash: Trying to append to a variable name in the output of a function

this is my very first post on Stackoverflow, and I should probably point out that I am EXTREMELY new to a lot of programming. I'm currently a postgraduate student doing projects involving a lot of coding in various programs, everything from LaTeX to bash, MATLAB etc etc.
If you could explicitly explain your answers that would be much appreciated as I'm trying to learn as I go. I apologise if there is an answer else where that does what I'm trying to do, but I have spent a couple of days looking now.
So to the problem I'm trying to solve: I'm currently using a selection of bioinformatics tools to analyse a range of genomes, and I'm trying to somewhat automate the process.
I have a few sequences with names that look like this for instance (all contained in folders of their own currently as paired files):
SOL2511_S5_L001_R1_001.fastq
SOL2511_S5_L001_R2_001.fastq
SOL2510_S4_L001_R1_001.fastq
SOL2510_S4_L001_R2_001.fastq
...and so on...
I basically wish to automate the process by turning these in to variables and passing these variables to each of the programs I use in turn. So for example my idea thus far was to assign them as wildcards, using the R1 and R2 (which appears in all the file names, as they represent each strand of DNA) as follows:
#!/bin/bash
seq1=*R1_001*
seq2=*R2_001*
On a rudimentary level this works, as it returns the correct files, so now I pass these variables to my first function which trims the DNA sequences down by a specified amount, like so:
# seqtk is the program suite, trimfq is a function within it,
# and the options -b -e specify how many bases to trim from the beginning and end of
# the DNA sequence respectively.
seqtk trimfq -b 10 -e 20 $seq1 >
seqtk trimfq -b 10 -e 20 $seq2 >
So now my problem is I wish to be able to append something like "_trim" to the output file which appears after the >, but I can't find anything that seems like it will work online.
Alternatively, I've been hunting for a script that will take the name of the folder that the files are in, and create a variable for the folder name which I can then give to the functions in question so that all the output files are named correctly for use later on.
Many thanks in advance for any help, and I apologise that this isn't really much of a minimum working example to go on, as I'm only just getting going on all this stuff!
Joe
EDIT
So I modified @ghoti's for loop (does the job wonderfully I might add, rep for you :D ) and now I prepend trim_, as the loop as it was before ended up giving me a .fastq.trim which will cause errors later.
Is there any way I can append _trim to the end of the filename, but before the extension?
Explicit is usually better than implied, when matching filenames. Your wildcards may match more than you expect, especially if you have versions of the files with "_trim" appended to the end!
I would be more precise with the wildcards, and use for loops to process the files instead of relying on seqtk to handle multiple files. That way, you can do your own processing on the filenames.
Here's an example:
#!/bin/bash
# Define an array of sequences
sequences=(R1_001 R2_001)
# Step through the array...
for seq in "${sequences[@]}"; do
# Step through the files in this sequence...
for file in SOL*_${seq}.fastq; do
seqtk trimfq -b 10 -e 20 "$file" > "${file}.trim"
done
done
I don't know how your folders are set up, so I haven't addressed that in this script. But the basic idea is that if you want the script to be able to manipulate individual filenames, you need something like a for loop to handle that manipulation on a per-filename basis.
Does this help?
UPDATE:
To put _trim before the extension, replace the seqtk line with the following:
seqtk trimfq -b 10 -e 20 "$file" > "${file%.fastq}_trim.fastq"
This uses something documented in the Bash man page under Parameter Expansion if you want to read up on it. Basically, the ${file%.fastq} takes the $file variable and strips off a suffix. Then we add your extra text, along with the suffix.
You could also strip an extension using basename(1), but there's no need to call something external when you can use something built in to the shell.
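To see the suffix stripping in isolation, here is a small sketch that only builds names and does not call seqtk. The function names trimmed_name and sample_from_dir are made up for illustration; the second one shows how the same kind of expansion could pull a folder name out of a path (assuming no trailing slash), which the question also asks about:

```bash
#!/bin/bash
# Build the output name: strip .fastq, then add _trim.fastq.
trimmed_name() {
  local file=$1
  printf '%s\n' "${file%.fastq}_trim.fastq"
}

# Take the last path component, e.g. the folder a sample lives in.
sample_from_dir() {
  local dir=$1
  printf '%s\n' "${dir##*/}"
}

trimmed_name SOL2511_S5_L001_R1_001.fastq   # prints SOL2511_S5_L001_R1_001_trim.fastq
sample_from_dir /data/runs/SOL2511          # prints SOL2511
```

Inside a script, sample_from_dir "$PWD" would give the name of the current folder.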
Instead of setting variables with the filenames, you could pipe the output of ls to the command you want to run with these filenames, like this:
ls *R{1,2}_001* | xargs -I@ sh -c 'seqtk trimfq -b 10 -e 20 "$1" > "${1}_trim"' -- @
xargs -I@ reads each line of output from the previous command and substitutes it for @, so the file name reaches seqtk as "$1".

Create a new sequence of files from an existing sequence, along with numbering

I know this question has been asked, but I can't find more than one solution, and it does not work for me. Essentially, I'm looking for a bash script that will take a file list that looks like this:
image1.jpg
image2.jpg
image3.jpg
And then make a copy of each one, but number it sequentially backwards. So, the sequence would have three new files created, being:
image4.jpg
image5.jpg
image6.jpg
And yet, image4.jpg would have been an untouched copy of image3.jpg, and image5.jpg an untouched copy of image2.jpg, and so on. I have already tried the solution outlined in this stackoverflow question with no luck. I am admittedly not very far down the bash scripting path, and if I take the chunk of code in the first listed answer and make a script, I always get "2: Syntax error: "(" unexpected" over and over. I've tried changing the syntax with the ( around a bit, but no success ever. So, either I am doing something wrong or there's a better script around.
Sorry for not posting this earlier, but the code I'm using is:
image=( image*.jpg )
MAX=${#image[*]}
for i in ${image[*]}
do
num=${i:5:3} # grab the digits
compliment=$(printf '%03d' $(echo $MAX-$num | bc))
ln $i copy_of_image$compliment.jpg
done
And I'm taking this code and pasting it into a file with nano, and adding !#/bin/bash as the first line, then chmod +x script and executing in bash via sh script. Of course, in my test runs, I'm using files appropriately titled image1.jpg - but I was also wondering about a way to apply this script to a directory of jpegs, not necessarily titled image(integer).jpg - in my file keeping structure, most of these are a single word, followed by a number, then .jpg, and it would be nice to not have to rewrite the script for each use.
Perhaps something like this. It will work well for something like script image*.jpg where the wildcard matches a set of files which follow a regular pattern with monotonically increasing numbers of the same length, and less ideally with a less regular subset of the files in the current directory. It simply assumes that the new numbers run from the last file's index plus one through that index plus the number of file names.
#!/bin/sh
# Extract number from final file name
eval "lastidx=\${$#}"
tmp=${lastidx#*[!0-9][0-9]}
lastidx=${lastidx#${lastidx%[0-9]$tmp}}
tmp=${lastidx%[0-9][!0-9]*}
lastidx=${lastidx%${lastidx#$tmp[0-9]}}
num=$(expr $lastidx + $#)
width=${#lastidx}
for f; do
pref=${f%%[0-9]*}
suff=${f##*[0-9]}
# Maybe show a warning if pref, suff, or width changed since the previous file
printf "cp '$f' '$pref%0${width}i$suff'\\n" $num
num=$(expr $num - 1)
done |
sh
This is sh-compatible; the expr stuff and the substring extraction up front is ugly but Bourne-compatible. If you are fine with the built-in arithmetic and string manipulation constructs of Bash, converting to that form should be trivial.
(To be explicit, ${var%foo} returns the value of $var with foo trimmed off the end, and ${var#foo} does similar trimming from the beginning of the value. Regular shell wildcard matching operators are available in the expression for what to trim. ${#var} returns the length of the value of $var.)
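As a quick illustration of those operators, the sketch below splits one hypothetical file name into prefix, digits and suffix. It uses only POSIX parameter expansion, so it should behave the same in sh and bash:

```bash
#!/bin/sh
f=image003.jpg

pref=${f%%[0-9]*}       # longest suffix starting at a digit removed -> image
suff=${f##*[0-9]}       # longest prefix ending in a digit removed   -> .jpg
digits=${f#"$pref"}     # drop the prefix                            -> 003.jpg
digits=${digits%"$suff"}  # drop the suffix                          -> 003

echo "prefix=$pref digits=$digits suffix=$suff width=${#digits}"
# prints prefix=image digits=003 suffix=.jpg width=3
```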
Maybe your real test data runs from 001 to 300, but here you have image1, image2, image3, and therefore you should extract one digit, not three, from the filename: num=${i:5:1}
Integer arithmetic can be done in bash without calling bc.
${#image[@]} is more robust than ${#image[*]}, but it shouldn't make a difference here.
I didn't consult a dictionary, but isn't compliment something for your girl friend? The opposite is complement, isn't it? :)
The other command made links; to make copies, call cp.
Code:
#!/bin/bash
image=( image*.jpg )
MAX=${#image[@]}
for i in "${image[@]}"
do
num=${i:5:1}
complement=$((2*$MAX-$num+1))
cp "$i" "image$complement.jpg"
done
Most important: If it is bash, call it with bash. Best: do a shebang (as you did), make it executable and call it by ./name . Calling it with sh name will force the wrong interpreter. If you don't make it executable, call it bash name.
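For numbers with more than one digit, the fixed offset ${i:5:1} no longer fits. One possible variation, sketched below, strips every non-digit from the name instead. It assumes bash, files named image1.jpg through imageN.jpg with no gaps, and digits only in the number part; the scratch directory and sample files are there just so it runs on its own:

```bash
#!/bin/bash
# Scratch directory with three hypothetical images.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for n in 1 2 3; do
  echo "content $n" > "image$n.jpg"
done

images=(image*.jpg)
max=${#images[@]}
for f in "${images[@]}"; do
  num=${f//[!0-9]/}                 # keep only the digits of the name
  # 10# forces base 10, so a name like image08.jpg is not read as octal.
  cp "$f" "image$((2 * max - 10#$num + 1)).jpg"
done

ls
```

With three files this copies image3.jpg to image4.jpg, image2.jpg to image5.jpg and image1.jpg to image6.jpg.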

How to rename files keeping a variable part of the original file name

I'm trying to make a script that will go into a directory and run my own application with each file matching a regular expression, specifically Test[0-9]*.txt.
My input filenames look like this TestXX.txt. Now, I could just use cut and chop off the Test and .txt, but how would I do this if XX wasn't predefined to be two digits? What would I do if I had Test1.txt, ..., Test10.txt? In other words, How would I get the [0-9]* part?
Just so you know, I want to be able to make a OutputXX.txt :)
EDIT:
I have files with filename Test[0-9]*.txt and I want to manipulate the string into Output[0-9]*.txt
Would something like this help?
#!/bin/bash
for f in Test*.txt ;
do
process < "$f" > "${f/Test/Output}"
done
See "Shell Parameter Expansion" in the Bash manual.
A good tutorial on regexes in bash is here. Summarizing, you need something like:
if [[ $filenamein =~ ^Test([0-9]*)\.txt$ ]]; then
filenameout="Output${BASH_REMATCH[1]}.txt"
fi
and so on. Leave the regex unquoted, since bash matches a quoted pattern literally, and write the assignment without spaces around the =. The key is that, when you perform the =~ regex match, the "sub-matches" to parentheses-enclosed groups in the RE are set in the entries of the array BASH_REMATCH (the [0] entry is the whole match, [1] the first parentheses-enclosed group, etc).
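A self-contained version of that idea could look like this sketch. The function name to_output_name is invented for the example; the regex is kept in a variable, which avoids quoting problems inside [[ ]]:

```bash
#!/bin/bash
# Map TestNN.txt to OutputNN.txt; return non-zero for anything else.
to_output_name() {
  local re='^Test([0-9]*)\.txt$'
  if [[ $1 =~ $re ]]; then
    # BASH_REMATCH[1] holds the text matched by the first (...) group.
    printf 'Output%s.txt\n' "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

to_output_name Test10.txt   # prints Output10.txt
to_output_name Notes.txt || echo "no match"
```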
You need to use round brackets (parentheses) around the part you want to keep,
i.e. "Test([0-9]*).txt"
The syntax for replacing these bracketed groups varies between programs, but you'll probably find you can use \1, something like this (in sed's basic regular expressions the grouping parentheses are written \( and \)):
s/Test\([0-9]*\)\.txt/Output\1.txt/
If you're using a unix shell, then 'sed' might be your best bet for performing the transformation.
http://www.grymoire.com/Unix/Sed.html#uh-4
Hope that helps
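Applied to a single name, the sed substitution might be used as in the sketch below. The wrapper function sed_output_name is hypothetical; the expression uses basic regular expressions, so it should work with any POSIX sed:

```bash
#!/bin/sh
# Rewrite TestNN.txt to OutputNN.txt using a captured group and \1.
sed_output_name() {
  printf '%s\n' "$1" | sed 's/^Test\([0-9]*\)\.txt$/Output\1.txt/'
}

sed_output_name Test10.txt   # prints Output10.txt
```

Names that do not match the pattern pass through unchanged, so a caller may want to compare the result with the input.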
for file in Test[0-9]*.txt;
do
num=${file//[^0-9]/}
process "$file" > "Output${num}.txt"
done
