So let's say I have several characters in an email address which don't belong. I want to take them out with the tr command. For example...
jsmith#test1.google.com
msmith#test2.google.com
zsmith#test3.google.com
I want to take out all of the test[123]. parts, so I am using the command tr -s 'test[123].' < email > mail. That is one way I have tried, but the two or three variations I have attempted all fail to work as intended. The output I am trying to get is...
jsmith#google.com
msmith#google.com
zsmith#google.com
You could use sed.
$ sed 's/#test[1-3]\./#/' file
jsmith#google.com
msmith#google.com
zsmith#google.com
[1-3] matches any character that falls within the range 1 to 3 (that is, 1, 2, or 3). Add the in-place edit parameter -i to save the changes back to the file.
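For example, to write the change back to the file in place (a small sketch assuming GNU sed and the same file name as above; BSD/macOS sed would need a suffix argument after -i):
$ sed -i 's/#test[1-3]\./#/' file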
I have a query in a shell script that gives me results like:
article;20200120
fruit;22
fish;23
I execute that report every day. I would like that, when I execute the query the next day, the output looks like this:
article;20200120;20200121
fruit;22;11
fish;23;12
I run this report with PostgreSQL in a Linux shell script, and the CSV output is generated by redirecting the output with ">>".
Any help to achieve that would be appreciated.
Thanks
This might be somewhat fragile, but it sounds like what you want can be accomplished with cut and paste.
Let's start with two files we want to join:
$ cat f1.csv
article;20200120
fruit;22
fish;23
$ cat f2.csv
article;20200121
fruit;11
fish;12
We first use cut to strip the headers from the second file, then send that into paste with the first file to combine corresponding lines:
$ cut -d ';' -f 2- f2.csv | paste -d ';' f1.csv -
article;20200120;20200121
fruit;22;11
fish;23;12
Parsing that command line, the -d ';' tells cut to use semicolons as the delimiter (the default is tab), and -f 2- says to print the second and later fields. f2.csv is the input file for cut. Then the -d ';' similarly tells paste to use semicolons to join the lines, and f1.csv - are the two files to paste together, in that order, with - representing the input piped in using the | shell operator.
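To see the intermediate step on its own, running just the cut half of the pipeline against the f2.csv shown above gives:
$ cut -d ';' -f 2- f2.csv
20200121
11
12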
Now, like I say, this is somewhat fragile. We're not matching the lines based on the header information, only their line number from the start of the file. If some fields are optional, or the set of fields changes over time, this will silently produce garbage. One way to mitigate that would be to first call cut -d ';' -f 1 on each of the input files and insist the results are the same before combining them.
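Here is a rough sketch of that check, using bash process substitution and the file names from the example (combined.csv is just an illustrative output name):
# only combine the reports if the key columns are identical
if cmp -s <(cut -d ';' -f 1 f1.csv) <(cut -d ';' -f 1 f2.csv); then
    cut -d ';' -f 2- f2.csv | paste -d ';' f1.csv - > combined.csv
else
    echo 'key columns differ, not combining' >&2
fi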
I'm trying to excerpt a bit of content from 2 text files and to send it as the body of an e-mail using the mailx program. I am trying to do this as a bash script, since I do have at least a limited amount of experience with creating simple bash scripts and so have a rudimentary knowledge in this area. That said, I am not opposed to entertaining other scripting options such as perl/python/whatever.
I've gotten partway to where I'd like to be using sed: sed -e '1,/excerpt delimiter 1/d' -e '/excerpt delimiter 2/,$d' file1.txt && sed -e '1,/excerpt delimiter one/d' -e '/excerpt delimiter two/,$d' file2.txt outputs to stdout the content I'm aiming to get into the e-mail body. But piping said content to mailx is not working, for reasons that are not entirely clear to me. That is to say that sed -e '1,/excerpt delimiter 1/d' -e '/excerpt delimiter 2/,$d' file1.txt && sed -e '1,/excerpt delimiter one/d' -e '/excerpt delimiter two/,$d' file2.txt | mail -s excerpts me#mymail.me does not send the output of both sed commands in the body of the e-mail: it only sends the output of the final sed command. I'm trying to understand why this is and to remedy matters by getting the output of both sed commands into the e-mail body.
Further background. The two text files contain many lines of text and are actually web page dumps I'm getting using lynx browser. I need just a block of a few lines from each of those files, so I'm using sed to delimit the blocks I need and to allow me to excise out those few lines from each file. My task might be easier and/or simpler if I were trying to excise from just one file rather than from two. But since the web pages with the content I'm after require entry of login credentials, and because I am trying to automate this process, I am using lynx's cmd_script option to first log in, then save (print-to-file, actually) the pages I need. lynx does not offer any way, so far as I can tell, to concatenate files, so I seem stuck with working with two separate files.
There must certainly be alternate ways of accomplishing my aim and I am not constrained, either by preference or by necessity, to use any particular utility. The only real constraint is, since I'm trying to automate this, that it be done as a script I can invoke as a cron job. I am using Linux and have at my disposal all the standard text manipulating tools. As may be clear, my scripting knowledge/abilities are quite limited, so I've been trying to accomplish what I'm aiming at using a one-liner. mailx is properly configured and working on this system.
The pipe only applies to the last command in the && list, not to the list as a whole. You need to combine the two into a single compound command whose output is piped to mailx.
{ sed -e '1,/excerpt delimiter 1/d' \
-e '/excerpt delimiter 2/,$d' file1.txt &&
sed -e '1,/excerpt delimiter one/d' \
-e '/excerpt delimiter two/,$d' file2.txt ; } |
mail -s excerpts me#mymail.me
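Equivalently, you could group the two sed commands in a subshell with parentheses; the result is the same, at the cost of running them in a child shell:
( sed -e '1,/excerpt delimiter 1/d' -e '/excerpt delimiter 2/,$d' file1.txt &&
  sed -e '1,/excerpt delimiter one/d' -e '/excerpt delimiter two/,$d' file2.txt ) |
mail -s excerpts me#mymail.me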
Input:
echo "1234ABC89,234" # A
echo "0520001DEF78,66" # B
echo "46545455KRJ21,00"
From the above strings, I need to split the characters to get the alphabetic field and the number after that.
From "1234ABC89,234", the output should be:
ABC
89,234
From "0520001DEF78,66", the output should be:
DEF
78,66
I have many strings that I need to split like this.
Here is my script so far:
echo "1234ABC89,234" | cut -d',' -f1
but it gives me 1234ABC89 which isn't what I want.
Assuming that you want to discard leading digits only, and that the letters will be all upper case, the following should work:
echo "1234ABC89,234" | sed 's/^[0-9]*\([A-Z]*\)\([0-9].*\)/\1\n\2/'
This works fine with GNU sed (I have 4.2.2), but other sed implementations might not like the \n, in which case you'll need to substitute something else.
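One portable substitute is a backslash followed by a literal newline in the replacement text, which POSIX sed accepts. The same command in that form, as a sketch:
echo "1234ABC89,234" | sed 's/^[0-9]*\([A-Z]*\)\([0-9].*\)/\1\
\2/'
ABC
89,234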
Depending on the version of sed you can try:
echo "0520001DEF78,66" | sed -E -e 's/[0-9]*([A-Z]*)([,0-9]*)/\1\n\2/'
or:
echo "0520001DEF78,66" | sed -E -e 's/[0-9]*([A-Z]*)([,0-9]*)/\1$\2/' | tr '$' '\n'
DEF
78,66
Explanation: the regular expression rewrites the input into the expected output, except that instead of the newline it puts a "$" sign, which we then replace with a newline using the tr command.
Where do the strings come from? Are they read from a file (or other source external to the script), or are they stored in the script? If they're in the script, you should simply reformat the data so it is easier to manage. It is therefore sensible to assume they come from an external data source, such as a file or data piped to the script.
You could simply feed the data through sed:
sed 's/^[0-9]*\([A-Z]*\)/\1 /' |
while read alpha number
do
…process the two fields…
done
The only trick to watch there is that if you set variables in the loop, they won't necessarily be visible to the script after the done. There are ways around that problem — some of which depend on which shell you use. This much is the same in any derivative of the Bourne shell.
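In bash, one way around the visibility problem (a sketch, with a hypothetical input file and counter variable) is to feed the loop from process substitution instead of a pipe, so the while loop runs in the current shell:
count=0
while read -r alpha number
do
    # both fields are usable here, and count survives past done
    count=$((count + 1))
done < <(sed 's/^[0-9]*\([A-Z]*\)/\1 /' strings.txt)
echo "processed $count lines"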
You said you have many strings like this, so I recommend, if possible, saving them to a file such as input.txt:
1234ABC89,234
0520001DEF78,66
46545455KRJ21,00
On your command line, try this sed command reading input.txt as file argument:
$ sed -E 's/([0-9]+)([[:alpha:]]{3})(.+)/\2\t\3/g' input.txt
ABC 89,234
DEF 78,66
KRJ 21,00
How it works
uses -E for extended regular expressions to save on typing; otherwise, for grouping, we would have to escape the parentheses as \( and \)
uses grouping with ( and ), searching for three groups:
the first is digits; + specifies one or more of them (the POSIX class [[:digit:]] could be used here in place of [0-9])
the next searches for POSIX alphabetical characters, regardless of whether they are lowercase or uppercase, and {3} specifies to match exactly 3 of them
the last group matches ., meaning any character, with + for one or more occurrences
\2\t\3 then returns group 2 and group 3, with a tab separator
Thus you are able to extract two separate fields per line, just separated by tab, for easier manipulation later.
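For instance, because the separator is a tab (cut's default delimiter), pulling out just the numeric part afterwards could look like this:
$ sed -E 's/([0-9]+)([[:alpha:]]{3})(.+)/\2\t\3/g' input.txt | cut -f2
89,234
78,66
21,00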
I have a file that has a list of product ids, each on its own line. I want to modify this file so that all product ids are on one line, comma-separated, and in inverted commas. Original format -
1\n2\n3\n
Expected format -
'1','2','3'
I tried the following command -
paste -s -d "','" velocities.txt > out.txt
The result looks like this -
1',2'3'4,
I do understand that using the above command I won't get anything before the first product id, but I will be able to handle that case.
You could use sed to quote all digits:
paste -s -d, velocities.txt | sed "s|\([0-9]\+\)|'\1'|g" > out.txt
P.S. Another command that also handles IP addresses:
sed "s|^\(.*\)$|'\1'|g" velocities.txt | paste -s -d, - > out.txt
I have a text file and I want to remove all lines containing the words: facebook, youtube, google, amazon, dropbox, etc.
I know to delete lines containing a string with sed:
sed '/facebook/d' myfile.txt
I don't want to run this command five different times though for each string, is there a way to combine all the strings into one command?
Try this:
sed '/facebook\|youtube\|google\|amazon\|dropbox/d' myfile.txt
From GNU's sed manual:
regexp1\|regexp2
Matches either regexp1 or regexp2. Use parentheses to use
complex alternative regular expressions. The matching process tries
each alternative in turn, from left to right, and the first one that
succeeds is used. It is a GNU extension.
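If your sed supports -E for extended regular expressions, the alternation can be written without the backslashes, since | is part of ERE syntax rather than a GNU extension:
sed -E '/facebook|youtube|google|amazon|dropbox/d' myfile.txt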
grep -vf wordsToExcludeFile myfile.txt
"wordsToExcludeFile" should contain the words you don't want, one per line.
If you need to save the result back to the same file, then add this to the command:
> myfile.new && mv myfile.new myfile.txt
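Put together, the whole thing would be:
grep -vf wordsToExcludeFile myfile.txt > myfile.new && mv myfile.new myfile.txt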
With awk
awk '!/facebook|youtube|google|amazon|dropbox/' myfile.txt > filtered.txt