So, I just took up Shell Scripting and I'm developing an address book.
For the user to insert a contact I made this form:
form=$(dialog \
--title "INSERIR" \
--form "" \
0 0 0 \
"Nome:" 1 1 "$nome" 1 10 20 0 \
"Morada:" 2 1 "$morada" 2 10 20 0 \
"Telefone:" 3 1 "$telefone" 3 10 20 0 \
"E-Mail:" 4 1 "$mail" 4 10 20 0 \
2>&1 1>&3)
And I want to insert those values through a MySQL query. I saw somewhere that I had to use, for instance:
form[$1]
In order to access the variable $nome. However, it was a comment from 2008.
What is the easiest way to access those variables?
Thank you!
IFS=$'\n' read -r -d '' nome morada telefone mail < <( dialog ... )
Unlike dialog ... | { read; ... } (which scopes the variables it reads to a subshell), this approach puts dialog in the subshell and your variables in the main shell -- much more convenient.
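A minimal, self-contained sketch of that approach with the form from the question: dialog draws its interface on stdout and writes the entered values to stderr, so the two streams are swapped inside the process substitution, and the interface is sent to /dev/tty so it still reaches the terminal.
IFS=$'\n' read -r -d '' nome morada telefone mail < <(
dialog --title "INSERIR" \
--form "" \
0 0 0 \
"Nome:" 1 1 "$nome" 1 10 20 0 \
"Morada:" 2 1 "$morada" 2 10 20 0 \
"Telefone:" 3 1 "$telefone" 3 10 20 0 \
"E-Mail:" 4 1 "$mail" 4 10 20 0 \
2>&1 >/dev/tty
)
echo "$nome / $morada / $telefone / $mail"
One caveat: read -d '' never finds its NUL delimiter, so it returns non-zero even on success; test the variables rather than read's exit status.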
So, after a bit of tinkering I got what I was looking for.
Here is the new form:
exec 3>&1
dialog \
--separate-widget $'\n' \
--title "INSERIR" \
--form "" \
0 0 0 \
"Nome:" 1 1 "$nome" 1 10 30 0 \
"Morada:" 2 1 "$morada" 2 10 30 0 \
"Telefone:" 3 1 "$telefone" 3 10 30 0 \
"E-Mail:" 4 1 "$mail" 4 10 30 0 \
2>&1 1>&3 | {
read -r nome
read -r morada
read -r telefone
read -r mail
#The rest of the script goes here -- note that nome, morada, etc.
#only exist inside this subshell block
}
exec 3>&-
So you can really just put the output into an array and deal with that, which avoids the subshell problem entirely. The redirection at the end (3>&1 1>&2 2>&3 3>&-) is ugly, but all it does is swap stdout and stderr so the command substitution captures dialog's results while the interface still reaches the terminal.
response=$(dialog \
--title "INSERIR" \
--form "" \
0 0 0 \
"Nome:" 1 1 "$nome" 1 10 20 0 \
"Morada:" 2 1 "$morada" 2 10 20 0 \
"Telefone:" 3 1 "$telefone" 3 10 20 0 \
"E-Mail:" 4 1 "$mail" 4 10 20 0 \
3>&1 1>&2 2>&3 3>&-)
#dialog --form prints one field per line; split the output on newlines
#(a bare responsearray=($response) would also split fields containing spaces)
mapfile -t responsearray <<< "$response"
echo "${responsearray[0]}" #nome
echo "${responsearray[1]}" #morada
echo "${responsearray[2]}" #telefone
echo "${responsearray[3]}" #mail
...and Bob's your uncle.
After several days looking for a way to get those variables, here is what I used, with your form:
nome=""
morada=""
telefone=""
mail=""
user_record=$(\
dialog \
--separate-widget $'\n' \
--title "INSERIR" \
--form "" \
0 0 0 \
"Nome:" 1 1 "$nome" 1 10 30 0 \
"Morada:" 2 1 "$morada" 2 10 30 0 \
"Telefone:" 3 1 "$telefone" 3 10 30 0 \
"E-Mail:" 4 1 "$mail" 4 10 30 0 \
3>&1 1>&2 2>&3 3>&- \
)
nome=$(echo "$user_record" | sed -n 1p)
morada=$(echo "$user_record" | sed -n 2p)
telefone=$(echo "$user_record" | sed -n 3p)
mail=$(echo "$user_record" | sed -n 4p)
echo "$nome"
echo "$morada"
echo "$telefone"
echo "$mail"
This way you can use those variables later in your script.
Hope it helps others.
The question regarding the easiest way to access the result depends partly on whether the items might contain blanks. If the items can contain arbitrary data, then line-oriented output (the default) seems the only way to go. If they are more constrained, e.g., not containing some readily-used punctuation character which can be used as a delimiter, then that makes it simpler.
The manual page mentions an option (and alias) which can be used to do this:
--separator string
--output-separator string
Specify a string that will separate the output on dialog's output from checklists, rather than a newline (for --separate-output) or a space. This applies to other widgets such as forms and editboxes which normally use a newline.
For example, if the data does not include a : (colon), then you could use the option
--output-separator :
and get colon-separated values on a single line.
If there are no commas or quotes in the string, you could conceivably use
--output-separator \",\"
and embed the result directly in an SQL statement. However, commas occur more frequently than the other punctuation mentioned, so processing the form's output with sed is the most likely way one might proceed.
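For instance, a rough sketch along those lines, assuming a database named agenda and a table named contactos (both hypothetical) and fields guaranteed to be free of quotes and commas; anything less constrained should be escaped properly or go through a prepared statement:
values=$(dialog --output-separator "','" \
--title "INSERIR" \
--form "" \
0 0 0 \
"Nome:" 1 1 "$nome" 1 10 20 0 \
"Morada:" 2 1 "$morada" 2 10 20 0 \
"Telefone:" 3 1 "$telefone" 3 10 20 0 \
"E-Mail:" 4 1 "$mail" 4 10 20 0 \
2>&1 >/dev/tty)
# $values now looks like: nome','morada','telefone','mail
mysql -u user -p agenda <<SQL
INSERT INTO contactos (nome, morada, telefone, mail) VALUES ('$values');
SQL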
Related
I have a tab separated file, consisting of 7 columns.
ABC     1437    1       0       71      15.7    174.4
DEF     0       0       0       1       45.9    45.9
GHIJ    2       3       0       9       1.1     1.6
What I need is to replace the tab character with a variable number of space characters in order to maintain the column alignment. Note that I do not want every tab to be replaced by 8 spaces. Instead, I want 5 spaces after row #1 column #1 (8 - length(ABC) = 5), 4 spaces after row #1 column #2 (8 - length(1437) = 4), etc.
Is there a Linux tool to do it for me, or should I write it myself?
The POSIX utility pr called as pr -e -t does exactly what you want and AFAIK is present in every Unix installation.
$ cat file
ABC     1437    1       0       71      15.7    174.4
DEF     0       0       0       1       45.9    45.9
GHIJ    2       3       0       9       1.1     1.6
$ pr -e -t file
ABC     1437    1       0       71      15.7    174.4
DEF     0       0       0       1       45.9    45.9
GHIJ    2       3       0       9       1.1     1.6
and with the tabs visible as ^Is:
$ cat -ET file
ABC^I1437^I1^I0^I71^I15.7^I174.4$
DEF^I0^I0^I0^I1^I45.9^I45.9$
GHIJ^I2^I3^I0^I9^I1.1^I1.6$
$ pr -e -t file | cat -ET
ABC     1437    1       0       71      15.7    174.4$
DEF     0       0       0       1       45.9    45.9$
GHIJ    2       3       0       9       1.1     1.6$
There is a command pair dedicated to this task:
$ expand file
will do exactly what you want. The counterpart, unexpand -a, does the reverse. There are a few other useful options in both.
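You can check the round trip with cat -ET (the default tab stop of 8 is assumed here):
$ expand file | unexpand -a | cat -ET
ABC^I1437^I1^I0^I71^I15.7^I174.4$
DEF^I0^I0^I0^I1^I45.9^I45.9$
GHIJ^I2^I3^I0^I9^I1.1^I1.6$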
Use column, as suggested in the comment by anubhava, specifically using -t and -s options:
column -t -s $'\t' in_file
From the column manual:
-s, --separator separators
Specify the possible input item delimiters (default is
whitespace).
-t, --table
Determine the number of columns the input contains and
create a table. Columns are delimited with whitespace, by
default, or with the characters supplied using the
--output-separator option. Table output is useful for
pretty-printing.
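With the sample file from the question, the output is padded to the widest entry in each column (exact spacing may vary between column implementations):
$ column -t -s $'\t' file
ABC   1437  1  0  71  15.7  174.4
DEF   0     0  0  1   45.9  45.9
GHIJ  2     3  0  9   1.1   1.6
Note that this aligns the columns but does not keep the original 8-column tab stops.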
I have a file that looks like this:
t # 3-7, 1
v 0 104
v 1 92
v 2 95
u 0 1 2
u 0 2 2
u 1 2 2
t # 3-8, 1
v 0 94
v 1 13
v 2 19
v 3 5
u 0 1 2
u 0 2 2
u 0 3 2
t # 3-9, 1
v 0 94
v 1 13
v 2 19
v 3 7
u 0 1 2
u 0 2 2
u 0 3 2
t corresponds to the header of each block.
I would like to extract multiple patterns from the file and output the transactions that contain all of the required patterns.
I tried the following code:
ps | grep -e 't\|u 0 1 2' file.txt
and it works well to extract the header and the pattern 'u 0 1 2'. However, when I add one more pattern, the output lists only the headers starting with t #. My modified code looks like this:
ps | grep -e 't\|u 0 1 2 && u 0 2 2' file.txt
I tried sed and awk solutions as well, but they did not work for me either.
Thank you for your help!
Olha
Use | as the separator before the third alternative, just like the second alternative.
grep -E 't|u 0 1 2|u 0 2 2' file.txt
Also, it doesn't make sense to specify a filename and also pipe ps to grep. If you provide filename arguments, it doesn't read from the pipe (unless you use - as a filename).
You can use grep with multiple -e expressions to grep for more than one thing at a time:
$ printf '%d\n' {0..10} | grep -e '0' -e '5'
0
5
10
Expanding on @kojiro's answer, you'll want to use an array to collect the arguments:
mapfile -t lines < file.txt
arguments=()
for line in "${lines[@]}"
do
arguments+=(-e "$line")
done
grep "${arguments[@]}"
You'll probably need a condition within the loop to check whether the line is one you want to search for, but that's it.
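For instance, a sketch (the patterns.txt name and the non-empty test are just illustrations):
mapfile -t lines < patterns.txt
arguments=()
for line in "${lines[@]}"
do
# only turn non-empty lines into -e pattern arguments
[ -n "$line" ] && arguments+=(-e "$line")
done
grep "${arguments[@]}" file.txt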
I am working with PLINK to analyse SNP chip data.
Does anyone know how to remove duplicated SNPs (duplicated by position)?
If we already have files in PLINK format then we should have a .bim file for binary filesets or a .map file for text filesets. In both, the SNP names are in the 2nd column and the bp positions in the 4th column (the 3rd column holds the genetic distance).
We need to create a list of SNPs that are duplicated:
# sort by bp position, print all lines that repeat from the 4th field on,
# then keep the SNP names (2nd column)
sort -k4n myFile.map | uniq -f3 -D | cut -f2 > dupeSNP.txt
Then run plink with the --exclude flag:
plink --file myFile --exclude dupeSNP.txt --out myFileSubset
You can also do it directly in PLINK 1.9 using the --list-duplicate-vars flag
together with the <require-same-ref>, <ids-only>, or <suppress-first> modifiers, depending on what you want to do.
Check https://www.cog-genomics.org/plink/1.9/data#list_duplicate_vars for more details.
If you want to delete all occurrences of a variant with duplicates, you will have to use the --exclude flag on the output file of --list-duplicate-vars,
which should have a .dupvar extension.
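For example, a sketch assuming a binary fileset named myFile (the ids-only modifier restricts the .dupvar report to variant IDs, so it can be fed straight to --exclude):
plink --bfile myFile --list-duplicate-vars ids-only --out dups
plink --bfile myFile --exclude dups.dupvar --make-bed --out myFileSubset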
I should caution that the two answers given above yield different results. This is because the sort | uniq method only takes into account SNP and bp location, whereas the PLINK method (--list-duplicate-vars) takes into account A1 and A2 as well.
Similar to sort | uniq on the .map file, we could use AWK on a .gen file, which looks like this:
22 rs1 12 A G 1 0 0 1 0 0
22 rs1 12 G A 0 1 0 0 0 1
22 rs2 16 C A 1 0 0 0 1 0
22 rs2 16 C G 0 0 1 1 0 0
22 rs3 17 T CTA 0 0 1 0 1 0
22 rs3 17 CTA T 1 0 0 0 0 1
# Get list of duplicate rsXYZ ID's
awk -F' ' '{print $2}' chr22.gen |\
sort |\
uniq -d > chr22_rsid_duplicates.txt
# Get list of duplicated bp positions
awk -F' ' '{print $3}' chr22.gen |\
sort |\
uniq -d > chr22_location_duplicates.txt
# Now match this list of bp positions to gen file to get the rsid for these locations
awk 'NR==FNR{a[$1]=$2;next}$3 in a{print $2}' \
chr22_location_duplicates.txt \
chr22.gen |\
sort |\
uniq \
> chr22_rsidBylocation_duplicates.txt
cat chr22_rsid_duplicates.txt \
chr22_rsidBylocation_duplicates.txt \
> tmp
# Get list of duplicates (by location and/or rsid)
cat tmp | sort | uniq > chr22_duplicates.txt
plink --gen chr22.gen \
--sample chr22.sample \
--exclude chr22_duplicates.txt \
--recode oxford \
--out chr22_noDups
This will classify rs2 as a duplicate; however, for the PLINK list-duplicate-vars method rs2 will not be flagged as a duplicate.
If one wants to obtain the same results using PLINK (a non-trivial task for BGEN file formats, since awk, sed etc. do not work on binary files!), you can use the --rm-dup command from PLINK 2.0. The list of all duplicate SNPs removed can be logged (to a file ending in .rmdup.list) using the list parameter, like so:
# export the result as bgen version 1.1 (see the note below); an inline
# comment after a line-continuation backslash would break the command
plink2 --bgen chr22.bgen \
--sample chr22.sample \
--rm-dup exclude-all list \
--export bgen-1.1 \
--out chr22_noDups
Note: I'm saving the output as version 1.1 since plink1.9 still has commands not available in plink version 2.0. Therefore the only way to use bgen files with plink1.9 (at this time) is with the older 1.1 version.
I have a directory with 100 files of the same format:
> S43.txt
Gene S43-A1 S43-A10 S43-A11 S43-A12
DDX11L1 0 0 0 0
WASH7P 0 0 0 0
C1orf86 0 15 0 1
> S44.txt
Gene S44-A1 S44-A10 S44-A11 S44-A12
DDX11L1 0 0 0 0
WASH7P 0 0 0 0
C1orf86 0 15 0 1
I want to make a giant table containing all the columns from all the files, however when I do this:
paste S88.txt S89.txt | column -d '\t' >test.merge
Naturally, the file contains two 'Gene' columns.
How can I paste ALL the files in the directory at once?
How can I exclude the first column from all the files after the first one?
Thank you.
If you're using bash, you can use process substitution in paste:
paste S43.txt <(cut -d ' ' -f2- S44.txt) | column -t
Gene     S43-A1  S43-A10  S43-A11  S43-A12  S44-A1  S44-A10  S44-A11  S44-A12
DDX11L1  0       0        0        0        0       0        0        0
WASH7P   0       0        0        0        0       0        0        0
C1orf86  0       15       0        1        0       15       0        1
The cut -f2- command reads all but the first column of S44.txt; adjust the -d delimiter (space above, tab in the loop below) to whatever your files actually use.
To do this for all the files matching S*.txt, use this snippet:
arr=(S*txt)
file="${arr[0]}"
for f in "${arr[@]:1}"; do
paste "$file" <(cut -d$'\t' -f2- "$f") > _file.tmp && mv _file.tmp file.tmp
file=file.tmp
done
# Clean up final output:
column -t file.tmp
Use join with the --nocheck-order option:
join --nocheck-order S43.txt S44.txt | column -t
(the column -t command to make it pretty)
However, as you say you want to join all the files, and join only takes 2 at a time, you should be able to do this (assuming your shell is bash):
tmp=$(mktemp)
files=(*.txt)
cp "${files[0]}" result.file
for file in "${files[@]:1}"; do
join --nocheck-order result.file "$file" | column -t > "$tmp" && mv "$tmp" result.file
done
I am trying to save all the data entered into the table as an individual record in the contacts.txt file like so
Record: 12
Bob
Roberts
Bobs Stuff
Bobby
Bos Road
Bobsville
BB0 B22
01234123456
At the moment my code is saving each field as an individual record like so
Record: 5
Bob
==========================
Record: 6
Roberts
and so on. How do I get round this?
This is my code:
#!/bin/bash
BOOK="contacts.txt"
# set field names i.e. shell variables
forename=""
surname=""
company=""
position=""
street=""
town=""
postcode=""
phone=""
# open fd
exec 3>&1
# Store data to $VALUES variable
VALUES=$(dialog --ok-label "Submit" \
--backtitle "Contacts" \
--title "Add New Contact" \
--form "Create a new contact" \
15 50 0 \
"Forename:" 1 1 "$forename" 1 10 10 0 \
"Surname:" 2 1 "$surname" 2 10 15 0 \
"Company:" 3 1 "$company" 3 10 45 0 \
"Position:" 4 1 "$position" 4 10 40 0 \
"Street:" 5 1 "$street" 5 10 50 0 \
"Town:" 6 1 "$town" 6 10 20 0 \
"Postcode:" 7 1 "$postcode" 7 10 8 0 \
"Phone:" 8 1 "$phone" 8 10 11 0 \
2>&1 1>&3)
# close fd
exec 3>&-
# Echo the answers and ask for confirmation
echo "Should I enter the values:"
echo -e " $VALUES";
echo -n "y/n: "
read answer
# Convert the answer to lower case
fixedanswer=`echo $answer | tr "A-Z" "a-z"`;
if [ "$fixedanswer" = "y" ]
then
# Write the values to the address book
echo "$VALUES" >>$BOOK
echo "Added the entry OK"
sleep 5
else
# Give the user a message
echo -e " $VALUES \n NOT written to $BOOK"
sleep 5
fi
exit 0
In the line
echo "$VALUES" >>$BOOK
remove the quotes around $VALUES.
With the quotes, the newlines that dialog puts between the fields are preserved, so each value is written as its own line of text. Without them, word splitting collapses the newlines and echo prints all the values on one line, with a space between them.
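A quick standalone way to see the difference:
VALUES=$'Bob\nRoberts\nBobs Stuff'
echo "$VALUES"   # three lines, newlines preserved
echo $VALUES     # one line: Bob Roberts Bobs Stuff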