I want to be able to point a script at a directory of fastq files so that it loops through all the files and does some processing. Here is my attempt:
threads=24
current_path=`pwd`
input_file=${current_path}/raw/
files=${current_path}/raw/*
for file in ${files}; do
    output_file=${current_path}/${file}_out/
    mkdir -m gu=wrx,o=rx ${output_file}
    spades.py \
        --s1 ${input_file}${file} \
        -t ${threads} \
        --plasmid \
        --careful \
        -o ${output_file}
done
With this script I get an error: "cannot make directory: directory does not exist", and the script generates a /home folder. I don't know if I am specifying the files incorrectly or if I am using the for loop incorrectly.
Thank you!
You concatenate the full path to the file with the folder path again on this line:
output_file=${current_path}/${file}_out/
Since the glob ${current_path}/raw/* already expands to full paths, ${file} already contains ${current_path}, so it should be:
output_file=${file}_out/
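For reference, here is how the whole loop might look with that fix applied, plus quoting around expansions (a sketch; it assumes the fastq files sit directly under raw/ and that spades.py is on your PATH):
threads=24
current_path=$(pwd)
for file in "${current_path}"/raw/*; do
    # ${file} already holds the absolute path, so only the suffix is appended
    output_file="${file}_out/"
    mkdir -m gu=wrx,o=rx "${output_file}"
    spades.py \
        --s1 "${file}" \
        -t "${threads}" \
        --plasmid \
        --careful \
        -o "${output_file}"
done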
I'm working on a Python script that verifies the integrity of some downloaded projects.
On my NAS, I have all my compressed folders: folder1.tar.gz, folder2.tar.gz, …
On my Linux computer, the equivalent uncompressed folders: folder1, folder2, …
I want to compare the integrity of my files without untarring or downloading anything!
I think I can do it on the NAS with something like this (with md5sum):
sshpass -p 'password' ssh login@my.nas.ip tar -xvf /path/to/my/folder.tar.gz | md5sum | awk '{ print $1 }'
This gives me a hash, but I don't know how to get an equivalent hash to compare with the normal folder on my computer. Maybe the way I am doing it is wrong.
I need one command for the NAS, and one for the Linux computer, that output the same hash (if the folders are the same, of course).
If you did that, tar xf would actually extract the files. md5sum would only see the file listing, and not the file content.
However, if you have GNU tar on the server and the standard utility paste, you could create checksums this way:
mksums:
#!/bin/bash
data=/path/to/data.tar.gz
sums=/path/to/data.md5

paste \
    <(tar xzf "$data" --to-command=md5sum) \
    <(tar tzf "$data" | grep -v '/$') \
    | sed 's/-\t//' > "$sums"
Run mksums above on the machine with the tar file.
Copy the sums file it creates to the computer with the folders and run:
cd /top/level/matching/tar/contents
md5sum -c "$sums"
paste joins lines of files given as arguments
<( ...) runs a command, making its output appear in a fifo
--to-command is a GNU tar extension which allows running commands which will receive their data from stdin
grep filters out directories from the tar listing
sed removes the extraneous -\t so the checksum file can be understood by md5sum
The above assumes you don't have any oddly named files (for example, the names can't contain newlines).
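Putting it together with the sshpass invocation from the question, the end-to-end flow could look like this (a sketch; hostnames and paths are placeholders):
# on the NAS (requires GNU tar and paste there): run the script over ssh
sshpass -p 'password' ssh login@my.nas.ip 'bash -s' < mksums
# copy the resulting checksum file to the Linux computer
sshpass -p 'password' scp login@my.nas.ip:/path/to/data.md5 /tmp/data.md5
# verify from the directory matching the tar's top level
cd /top/level/matching/tar/contents
md5sum -c /tmp/data.md5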
I have a logic issue: I'm trying to write a nested for loop in a Makefile for a folder which holds two subfolders with code files in them.
for folder in $(twofolders); do \
for file in $$folder; do \
zip -j $$file"_lambda.zip" $(PYTHONPATH)/$$file/*.py; \
aws s3 cp ./$$file"_lambda.zip" s3://$(S3_BUCKET)/$(SOURCE_CODE_VERSION)/lambda/; \
done
done
But I receive the following error
/bin/bash: -c: line 5: syntax error: unexpected end of file
make: *** [Makefile:62: build] Error 1
Does anyone know what I am doing wrong?
You forgot a line continuation after the first done (and note the "; \" there: once the backslashes join the recipe into one line, the two done keywords need a separator between them). And you should definitely double-quote bash and make variable expansions:
target: prerequisites
	for folder in $(twofolders); do \
	    for file in "$$folder"; do \
	        zip -j "$${file}_lambda.zip" "$(PYTHONPATH)/$$file"/*.py; \
	        aws s3 cp ./"$${file}_lambda.zip" s3://"$(S3_BUCKET)/$(SOURCE_CODE_VERSION)"/lambda/; \
	    done; \
	done
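Alternatively, on GNU make 3.82 or later you can use the .ONESHELL directive, which runs the whole recipe in a single shell, so the backslashes are no longer needed (a sketch assuming the same variables; note that .ONESHELL applies to every recipe in the Makefile):
.ONESHELL:
target: prerequisites
	for folder in $(twofolders); do
	    for file in "$$folder"; do
	        zip -j "$${file}_lambda.zip" "$(PYTHONPATH)/$$file"/*.py
	        aws s3 cp ./"$${file}_lambda.zip" s3://"$(S3_BUCKET)/$(SOURCE_CODE_VERSION)"/lambda/
	    done
	done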
I am trying to run a bash script which should load data into Jena. The script comes from a GitHub repository and was allegedly working on the owner's machine, but on mine it won't run, even though I followed the instructions. Let me first describe what the script does based on my understanding: it should load .nt (RDF) data into Jena using Docker, via the Jena Docker image stain/jena. Here is the script:
#!/bin/bash
files=$(echo $(pwd)/rawdata-bearb/hour/alldata.IC.nt/*.nt | sed "s%$(pwd)/rawdata-bearb/hour/alldata.IC.nt%/var/data/in%g")
mkdir output # added
for file in $files; do
    v=$(echo $file | sed "s/^.*\/\([0-9][0-9]*\)\.nt$/\1-1/" | bc)
    echo "$v"
    mkdir -p /var/data/out/ic/$v
    time docker run \
        -it \
        --rm \
        -v $(pwd)/tdb-bearb-hour/:/var/data/out/ \
        -v $(pwd)/rawdata-bearb/hour/alldata.IC.nt/:/var/data/in/ \
        stain/jena /jena/bin/tdbloader2 \
        --sort-args "-S=16G" \
        --loc /var/data/out/ic/$v $file \
        > output/load-bearb-hour-ic-$v-.txt
done
However, when I execute the script, I get the following message in the saved log file:
13:12:46 INFO -- TDB Bulk Loader Start
mkdir: cannot create directory ‘/var/data/out/ic/0’: No such file or directory
13:12:46 ERROR Failed during data phase
According to the tdbloader2 manual, the --loc parameter should create the directory if it does not exist:
-- loc: Sets the location in which the database should be created.
This location must be a directory and must be empty,
if a non-existent path is specified it will be created as a new directory.
I created the directories /var/data/out/ic/0 through /var/data/out/ic/10 manually and re-executed the script; still, I got the same error message. My first guess was that tdbloader2 or docker uses the mkdir command without the -p flag, but since I created the directories manually, so they existed before the execution, and I still got the same error, it must be something else. I am kindly asking for your help.
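For reference, here is how the two -v flags map paths between the host and the container, as far as I understand Docker bind mounts:
# host path                                     container path
# $(pwd)/tdb-bearb-hour/                  -->   /var/data/out/
# $(pwd)/rawdata-bearb/hour/alldata.IC.nt/ -->  /var/data/in/
#
# "mkdir -p /var/data/out/ic/$v" runs on the host and creates a literal
# /var/data/out/ic/$v there; inside the container, /var/data/out/ic/$v
# exists only if $(pwd)/tdb-bearb-hour/ic/$v exists on the host side
# of the bind mount.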
I'm trying to make a Makefile that exports a Markdown file to a PDF file with the same filename as the original Markdown file. I used the "basename" command, but it produces "inputfile.md.pdf" instead of "inputfile.pdf".
Please see my code below (I adapted code I found on the Internet. Thank you!):
.PHONY: pdf docx apa format
FILES := $(wildcard ./*.md)
pdf:
	for file in $(FILES); do \
	    pandoc $$file \
	        --bibliography mypath \
	        --csl mypath \
	        --filter pandoc-citeproc \
	        --template eisvogel \
	        -o $(basename $$file).pdf; \
	    open $(basename $$file).pdf; \
	done
Can anyone help me? I'm a novice with Makefiles (and programming in general), so any detailed help would be very much appreciated.
I also tried the variants below, but they generated an error message:
-o $(basename -s ".md" $$file).pdf; \
-o $(basename -s .md $$file).pdf; \
The way you write $(basename …) you get the basename make function. This would normally be the right thing, but you try to reference the shell variable file in its argument, which is unavailable at the make layer: make expands the line before the shell runs, so the function only sees the literal text $file, which has no suffix to strip.
In this case, it is probably easiest to call the basename shell utility, at the shell level. Therefore, you need to escape the $ to get shell command substitution, like this:
-o "$$(basename -s .md $$file)".pdf; \
open "$$(basename -s .md $$file)".pdf; \
Alternatively, you could try to move the loop to the make layer, perhaps using foreach.
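That could look something like this (a sketch; the foreach is expanded by make before the shell runs, so make's own basename function works here, because its argument is the make variable $(f)):
pdf:
	$(foreach f,$(FILES),pandoc $(f) --bibliography mypath --csl mypath --filter pandoc-citeproc --template eisvogel -o $(basename $(f)).pdf && open $(basename $(f)).pdf &&) true
The trailing && after each file's commands chains everything into one shell command line, and the final true terminates the chain.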
There is something I am missing, or maybe my whole approach is wrong. I am trying to download NCDC data from the NCDC datasets and am unable to do it from the Unix box.
The command I have used thus far is:
wget ftp://ftp.ncdc.noaa.gov:21/pub/data/noaa/1901/029070-99999-1901.gz">029070-99999-1901.gz
This is for one file, but I will be very happy if I can download the entire parent directory.
You seem to have a lonely " just before the >.
To download everything, you can try this command to get the whole directory contents:
wget -r ftp://ftp.ncdc.noaa.gov:21/pub/data/noaa/1901/*
for i in {1990..1993}
do
    echo "$i"
    cd /home/chile/data
    # -nH Disable generation of host-prefixed directories
    # -nd All files will get saved to the current directory
    # -np Do not ever ascend to the parent directory when retrieving recursively
    # -R "*.html,999999-99999-$i.gz*" Don't download files matching these patterns
    # (quoted so the shell doesn't expand the * before wget sees it)
    wget -r -nH -nd -np -R "*.html,999999-99999-$i.gz*" http://www1.ncdc.noaa.gov/pub/data/noaa/$i/
done