Automated renaming of Linux filenames to new filenames that are legal in Windows

I would like to rename a Linux file to a filename that is legal in Windows. It should not be longer than Windows allows and should not contain characters that Windows forbids. Sometimes I copy the title of a paper into a filename, and titles often contain special characters such as –, ®, or ?
Also, there are sometimes stray characters at the ends of lines, generated when copying and pasting a title from a PDF. You can see them when using sed -n 'l':
echo 'Estrogen receptor agonists and estrogen attenuate TNF-α induced
α
apoptosis in VSC4.1 motoneurons.pdf' | sed -n 'l'
Estrogen receptor agonists and estrogen attenuate TNF-\316\261 induce\
d$
\316\261$
apoptosis in VSC4.1 motoneurons.pdf$
or
echo 'A synthetic review of the five molecular Sorlie’s subtypes in
breast cancer' | sed -n 'l'
A synthetic review of the \357\254\201ve molecular Sorlie\342\200\231\
s subtypes in$
breast cancer$
I have started a script, but it is not elegant and is incomplete. Has someone already done something like this, or is there a fast, elegant way to do it?
fn2win="$1"
testFn=$(echo "$fn2win" | sed -n 'l')
#SPEC_CHAR="ÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞàáâãäåçèéêëìíîïðñòóôõöøùúûüýþÿ"
#NORM_CHAR="AAAAAACEEEEIIIIDNOOOOOOUUUUYPaaaaaaceeeeiiiionoooooouuuuyby"
#SPEC_LOW_CHAR="aàáâãäåāăąbḃcćçčĉċdḑďḋđeèéěêëēĕęėfḟƒgǵģǧĝğġǥhĥħiìíîĩïīĭįıjĵkḱķǩlĺļľłmṁnńņňñoòóôõöōŏøpṗqrŕŗřsśşšŝṡſtţťṫŧuùúûũüůūŭųvwẁẃŵẅxyỳýŷÿzźžż"
#NORM_LOW_CHAR="aaaaaaaaaabbccccccdddddeeeeeeeeeefffgggggggghhhiiiiiiiiiijjkkkklllllmmnnnnnoooooooooppqrrrrssssssstttttuuuuuuuuuuvwwwwwxyyyyyzzzz"
#SPEC_CAP_CHAR="AÀÁÂÃÄÅĀĂĄBḂCĆÇČĈĊDḐĎḊĐEÈÉĚÊËĒĔĘĖFḞGǴĢǦĜĞĠǤHĤĦIÌÍÎĨÏĪĬĮİJĴKḰĶǨĸLĹĻĽŁMṀNŃŅŇÑOÒÓÔÕÖŌŎØPṖQRŔŖŘSŚŞŠŜṠTŢŤṪŦUÙÚÛŨÜŮŪŬŲVWẀẂŴẄXYỲÝŶŸZŹŽŻ"
#SPEC_CAP_CHAR="AAAAAAAAAABBCCCCCCDDDDDEEEEEEEEEEFFGGGGGGGGHHHIIIIIIIIIIJJKKKKKLLLLLMMNNNNNOOOOOOOOOPPQRRRRSSSSSSTTTTTUUUUUUUUUUVWWWWWXYYYYYZZZZ"
#sed -e "y/'$SPEC_CHAR'/'$NORM_CHAR'/"
if [ "$fn2win" != "$testFn" ]; then
newLinFn=$(echo "$fn2win" | fromdos | tr "\n" " " |\
sed -e "
s/[?()\[\]=+<>:;©®”,*|]/_/g
s/"$'\t'"/ /g
s/–/-/g
s/’/'/g
s/α/alpha/g
s/β/beta/g
s/µ/micro/g
s/Æ/AE/g
s/Ǽ/AE/g
s/æ/ae/g
s/ǽ/ae/g
s/Ǳ/DZ/g
s/Ǆ/DZ/g
s/ǅ/Dz/g
s/ǲ/Dz/g
s/ǳ/dz/g
s/ǆ/dz/g
s/ﬀ/ff/g
s/ﬁ/fi/g
s/ﬂ/fl/g
s/ﬃ/ffi/g
s/ﬄ/ffl/g
s/ﬅ/ft/g
s/Ĳ/IJ/g
s/ĳ/ij/g
s/Ǉ/LJ/g
s/ǈ/Lj/g
s/ǉ/lj/g
s/Ǌ/NJ/g
s/ǋ/Nj/g
s/ǌ/nj/g
s/Œ/OE/g
s/œ/oe/g
s/ß/SZ/g
s/\"/_/g
s/[[:cntrl:]]/_/g
s/\ $//g
" |\
fold -s -w 251 | head -1 | sed 's/\ $/.pdf/')
if [ "$fn2win" != "$newLinFn" ]; then
mv "$fn2win" "$newLinFn"
fi
fi
winFn=$(echo "z:"$newLinFn | sed 's/\//\\/g' )

This looks like it should do it: http://pwet.fr/man/linux/commandes/konwert
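If konwert is not an option, a minimal sketch of the same idea with iconv and standard tools could look like this (assuming GNU iconv's //TRANSLIT; the 251-character cap mirrors the script above, and the whole thing is illustrative rather than exhaustive, e.g. α comes out as ? rather than alpha):
fn2win="$1"
# Transliterate to ASCII where possible (ligatures such as the fi-ligature become "fi";
# characters with no ASCII mapping become "?" with GNU iconv), then replace the
# Windows-illegal characters \ / : * ? " < > | and control characters with underscores,
# squeeze runs of whitespace, and trim to a length Windows accepts.
newFn=$(printf '%s' "$fn2win" \
  | iconv -f UTF-8 -t ASCII//TRANSLIT \
  | tr '\n\t' '  ' | tr -d '\r' \
  | sed -e 's/[\\/:*?"<>|[:cntrl:]]/_/g' -e 's/  */ /g' -e 's/ *$//' \
  | cut -c1-251)
if [ "$fn2win" != "$newFn" ]; then
  mv -- "$fn2win" "$newFn"
fi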

Related

Parsing CSV records when a value is multiline

Source file looks like this:
"google.com", "vuln_example1
vuln_example2
vuln_example3"
"facebook.com", "vuln_example2"
"reddit.com", "stupidly_long_vuln_name1"
"stackoverflow.com", ""
I've been trying to get the output to look something like the table below, but the line breaks seem to cause me no end of problems. I'm using a "while read line" loop to do this because I do some processing on the columns (e.g. the vulnerability count and URL in this example). This is output into a Jenkins job (yuk).
The basic summary of the problem: get the line breaks in the csv into the third column of the output while retaining the table structure. I've got a sort of weird example of the desired output below.
||hostname ||Vulnerability count|| Vulnerability list || URL ||
|google.com |3 |vuln_example1 |http://cve.com/vuln_example1|
| | |vuln_example2 |http://cve.com/vuln_example2|
| | |vuln_example3 |http://cve.com/vuln_example3|
|facebook.com |1 |vuln_example2 |http://cve.com/vuln_example2|
|reddit.com |1 |stupidly_long_vuln_name1 |http://cve.com/stupidly_long_vuln_name1|
|stackoverflow.com |0 | ||
Looking at this... I've got a feeling it might be easier to explain by showing some code and example output.
Parsing your input with the command line below makes the problem easier (I'm assuming the inputs are correct):
perl -0777 -pe 's/([^"])\s*\n/\1 /g ; s/[",]//g' < sample.txt
This line invokes Perl to perform two regex substitutions:
s/([^"])\s*\n/\1 /g: removes a line ending whenever it is not preceded by a quote " (i.e. whenever a host entry, with all its vulnerabilities, is not yet complete).
s/[",]//g: removes all remaining quotes and commas.
For each host entry like this one:
"google.com", "vuln_example1
vuln_example2
vuln_example3"
You'll get:
google.com vuln_example1 vuln_example2 vuln_example3
Then you can assume that each line holds a host followed by its set of vulnerabilities.
The example below stores the vulnerabilities in an array and loops through it, formatting and printing each row:
# Replace this with your custom function
# to get a URL for a given vulnerability
function get_vuln_url () {
    # This just displays a random url for a non-empty arg
    [[ -z "$1" ]] || echo "http://host/$1.htm"
}
# Format your line (see printf help)
function print_row () {
    printf "%-20s|%5s|%-30s|%s\n" "$@"
}
# The Perl reformatting from above
perl -0777 -pe 's/([^"])\s*\n/\1 /g ; s/[",]//g' < sample.txt |
while read -r line ; do
    arr=(${line})
    print_row "${arr[0]}" "$((${#arr[@]} - 1))" "${arr[1]}" "$(get_vuln_url ${arr[1]})"
    #echo -e "${arr[0]}\t|$vul_count\t|${arr[1]}\t|$(get_vuln_url ${arr[1]})"
    for v in "${arr[@]:2}" ; do
        print_row " " " " "$v" "$(get_vuln_url "$v")"
    done
done
Output:
google.com          |    3|vuln_example1                 |http://host/vuln_example1.htm
                    |     |vuln_example2                 |http://host/vuln_example2.htm
                    |     |vuln_example3                 |http://host/vuln_example3.htm
facebook.com        |    1|vuln_example2                 |http://host/vuln_example2.htm
reddit.com          |    1|stupidly_long_vuln_name1      |http://host/stupidly_long_vuln_name1.htm
stackoverflow.com   |    0|                              |
Update.
If you don't have Perl, and if your file doesn't contain tabs, you can use this command as a workaround instead:
tr '\n' '\t' < sample.txt | sed -r -e 's/([^"])\s*\t/\1 /g' -e 's/[",]//g' -e 's/\t/\n/g'
tr '\n' '\t' replaces every newline with a tab
the sed part acts like the Perl line, except that it works on tabs instead of newlines and finally restores the tabs back to newlines.

Using sed to find a string with wildcards and then replacing with same wildcards

So I am trying to remove newlines using sed, because it's the only way I can think of to do it. I'm completely self-taught, so there may be a more efficient way that I just don't know.
The string I am searching for is \HF=-[0-9](newline character). The problem is that the data it is searching through can look like this (note: there are actual newline characters in this data, which I think is causing part of the problem):
1\1\GINC-N076\SP\RMP2-FC\CC-pVDZ\C12H12\R2536\09-Apr-2020\0\\# mp2/cc-
pVDZ\\Squish3_Slide0\\0,1\H,0,0.,2.4822,0.\C,0,0.,1.3948,0.\C,0,0.,-1.
3948,0.\C,0,1.2079,0.6974,0.\C,0,-1.2079,0.6974,0.\C,0,-1.2079,-0.6974
,0.\C,0,1.2079,-0.6974,0.\H,0,2.1497,1.2411,0.\H,0,-2.1497,1.2411,0.\H
,0,-2.1497,-1.2411,0.\H,0,2.1497,-1.2411,0.\H,0,0.,-2.4822,0.\C,0,0.,1
.3948,3.\C,0,0.,-1.3948,3.\C,0,1.2079,0.6974,3.\C,0,-1.2079,0.6974,3.\
C,0,-1.2079,-0.6974,3.\C,0,1.2079,-0.6974,3.\H,0,0.,2.4822,3.\H,0,2.14
97,1.2411,3.\H,0,-2.1497,1.2411,3.\H,0,-2.1497,-1.2411,3.\H,0,2.1497,-
1.2411,3.\H,0,0.,-2.4822,3.\\Version=ES64L-G09RevD.01\State=1-AG\HF=-4
61.3998608\MP2=-463.0005321\RMSD=3.490e-09\PG=D02H [SG"(C4H4),X(C8H8)]
\\#
OR
1\1\GINC-N076\SP\RMP2-FC\CC-pVDZ\C12H12\R2536\09-Apr-2020\0\\# mp2/cc-
pVDZ\\Squish3.1_Slide0\\0,1\H,0,0.,2.4822,0.\C,0,0.,1.3948,0.\C,0,0.,-
1.3948,0.\C,0,1.2079,0.6974,0.\C,0,-1.2079,0.6974,0.\C,0,-1.2079,-0.69
74,0.\C,0,1.2079,-0.6974,0.\H,0,2.1497,1.2411,0.\H,0,-2.1497,1.2411,0.
\H,0,-2.1497,-1.2411,0.\H,0,2.1497,-1.2411,0.\H,0,0.,-2.4822,0.\C,0,0.
,1.3948,3.1\C,0,0.,-1.3948,3.1\C,0,1.2079,0.6974,3.1\C,0,-1.2079,0.697
4,3.1\C,0,-1.2079,-0.6974,3.1\C,0,1.2079,-0.6974,3.1\H,0,0.,2.4822,3.1
\H,0,2.1497,1.2411,3.1\H,0,-2.1497,1.2411,3.1\H,0,-2.1497,-1.2411,3.1\
H,0,2.1497,-1.2411,3.1\H,0,0.,-2.4822,3.1\\Version=ES64L-G09RevD.01\St
ate=1-AG\HF=-461.4104442\MP2=-463.0062587\RMSD=3.651e-09\PG=D02H [SG"(
C4H4),X(C8H8)]\\#
OR
1\1\GINC-N076\SP\RMP2-FC\CC-pVDZ\C12H12\R2536\09-Apr-2020\0\\# mp2/cc-
pVDZ\\Squish3.3_Slide1.7\\0,1\H,0,0.,2.4822,0.\C,0,0.,1.3948,0.\C,0,0.
,-1.3948,0.\C,0,1.2079,0.6974,0.\C,0,-1.2079,0.6974,0.\C,0,-1.2079,-0.
6974,0.\C,0,1.2079,-0.6974,0.\H,0,2.1497,1.2411,0.\H,0,-2.1497,1.2411,
0.\H,0,-2.1497,-1.2411,0.\H,0,2.1497,-1.2411,0.\H,0,0.,-2.4822,0.\C,0,
0.,-0.3052,3.3\C,0,0.,-3.0948,3.3\C,0,1.2079,-1.0026,3.3\C,0,-1.2079,-
1.0026,3.3\C,0,-1.2079,-2.3974,3.3\C,0,1.2079,-2.3974,3.3\H,0,0.,0.782
2,3.3\H,0,2.1497,-0.4589,3.3\H,0,-2.1497,-0.4589,3.3\H,0,-2.1497,-2.94
11,3.3\H,0,2.1497,-2.9411,3.3\H,0,0.,-4.1822,3.3\\Version=ES64L-G09Rev
D.01\State=1-AG\HF=-461.436061\MP2=-463.0177441\RMSD=7.859e-09\PG=C02H
[SGH(C4H4),X(C8H8)]\\#
OR
1\1\GINC-N076\SP\RMP2-FC\CC-pVDZ\C12H12\R2536\09-Apr-2020\0\\# mp2/cc-
pVDZ\\Squish3.6_Slide0.9\\0,1\H,0,0.,2.4822,0.\C,0,0.,1.3948,0.\C,0,0.
,-1.3948,0.\C,0,1.2079,0.6974,0.\C,0,-1.2079,0.6974,0.\C,0,-1.2079,-0.
6974,0.\C,0,1.2079,-0.6974,0.\H,0,2.1497,1.2411,0.\H,0,-2.1497,1.2411,
0.\H,0,-2.1497,-1.2411,0.\H,0,2.1497,-1.2411,0.\H,0,0.,-2.4822,0.\C,0,
0.,0.4948,3.6\C,0,0.,-2.2948,3.6\C,0,1.2079,-0.2026,3.6\C,0,-1.2079,-0
.2026,3.6\C,0,-1.2079,-1.5974,3.6\C,0,1.2079,-1.5974,3.6\H,0,0.,1.5822
,3.6\H,0,2.1497,0.3411,3.6\H,0,-2.1497,0.3411,3.6\H,0,-2.1497,-2.1411,
3.6\H,0,2.1497,-2.1411,3.6\H,0,0.,-3.3822,3.6\\Version=ES64L-G09RevD.0
1\State=1-AG\HF=-461.4376969\MP2=-463.0163868\RMSD=7.263e-09\PG=C02H [
SGH(C4H4),X(C8H8)]\\#
Basically the number I am looking for can be broken up into two lines at any point based on character count. I need to get rid of the newline breaking up the number so that I can extract the entire value into a separate file. (I have no problems with the extraction to a new file, hence why it isn't included in the code)
Currently I am using this code
sed -i ':a;N;$!ba;s/HF=-*[0-9]*\n/HF=-*[0-9]*/g' $i &&
Which ALMOST works, except it doesn't replace the wildcard values with the same values. It replaces them with the literal text [0-9] instead, and it doesn't always remove the newline character.
Important to note is that THERE ARE ACTUAL NEWLINE CHARACTERS in the output file, and there is no way to change that without messing up the other 30 lines I am extracting from this output file.
What I want is to just get rid of the newline characters that occur when that string is found, regardless of how many digits there are in between the - sign and the newline character.
So the expected output would be something like
1\1\GINC-N076\SP\RMP2-FC\CC-pVDZ\C12H12\R2536\09-Apr-2020\0\\# mp2/cc-
pVDZ\\Squish3_Slide0\\0,1\H,0,0.,2.4822,0.\C,0,0.,1.3948,0.\C,0,0.,-1.
3948,0.\C,0,1.2079,0.6974,0.\C,0,-1.2079,0.6974,0.\C,0,-1.2079,-0.6974
,0.\C,0,1.2079,-0.6974,0.\H,0,2.1497,1.2411,0.\H,0,-2.1497,1.2411,0.\H
,0,-2.1497,-1.2411,0.\H,0,2.1497,-1.2411,0.\H,0,0.,-2.4822,0.\C,0,0.,1
.3948,3.\C,0,0.,-1.3948,3.\C,0,1.2079,0.6974,3.\C,0,-1.2079,0.6974,3.\
C,0,-1.2079,-0.6974,3.\C,0,1.2079,-0.6974,3.\H,0,0.,2.4822,3.\H,0,2.14
97,1.2411,3.\H,0,-2.1497,1.2411,3.\H,0,-2.1497,-1.2411,3.\H,0,2.1497,-
1.2411,3.\H,0,0.,-2.4822,3.\\Version=ES64L-G09RevD.01\State=1-AG\HF=-461.3998608\MP2=-463.0005321\RMSD=3.490e-09\PG=D02H [SG"(C4H4),X(C8H8)]
\\#
These files are rather large and this line of code is executed over 1500 times, so the more efficient the better.
Everything else in the script is done with a combination of grep, awk, sed, and basic UNIX commands.
EDIT
After trying
sed -i -E ':a;N;$!ba;s/(\\HF=-?[.0-9]*)\n/\1/' $i &&
I still had no luck getting rid of those pesky new line characters.
If it has any effect on the answers at all, here is the rest of the code that goes with the one line causing problems:
echo name HF MP2 mpdiff | cat > allE
for i in *.out
do echo name HF MP2 mpdiff | cat > $i.allE
grep "Slide" $i | cut -d "\\" -f2 | cat | tr -d '\n' > $i.name &&
grep "EUMP2" $i | cut -d "=" -f3 | cut -c 1-25 | tr '\n' ' ' | tr -s ' ' >> $i.mp &&
grep "EUMP2" $i | cut -d "=" -f2 | cut -c 1-25 | tr '\n' ' ' | tr -s ' ' >> $i.mpdiff &&
sed -i -E ':a;N;$!ba;s/(\\HF=-?[.0-9]*)\n/\1/' $i &&
grep '\\HF' $i | awk -F 'HF' '{print substr($2,2,14)}' | tr '\n' ' ' >> $i.hf &&
paste $i.name >> $i.energies &&
sed -i 's/ /0 /g' $i.hf &&
sed -i 's/\\/0/g' $i.hf &&
sed -i 's/[A-Z]/0/g' $i.hf &&
paste $i.hf >> $i.energies &&
sed -i 's/[ABCEFGHIJKLMNOPQRSTUVWXYZ]//g' $i.mp &&
paste $i.mp >> $i.energies &&
sed -i 's/[ABCEFGHIJKLMNOPQRSTUVWXYZ]//g' $i.mpdiff &&
paste $i.mpdiff >> $i.energies &&
transpose $i.energies >> $i.allE #temp.txt &&
#cat temp.txt > $i.energies
#echo $i is finished
done
echo see allE for energies
#rm *.energies #temp.txt
rm *.name
rm *.mp
rm *.hf
rm *.mpdiff
Here is how you can fix your current attempt.
sed -E ':a;N;$!ba;s/(\\HF=-?[.0-9]*)\n/\1/'
Add the -i flag if you want to make the changes to the file itself, keep the trailing && if you are chaining it with the next command as in your script, and so on. The -E flag is needed because the unescaped grouping parentheses used for the backreference (see below) are extended regular expression syntax.
I made the following changes: I changed -* to -? as there should be at most one dash (if I understand correctly and that is in fact a minus sign, not a dash). I added the period to the bracket expression, so that the decimal point would be matched too. (Note that in a bracket expression, the dot is a regular character). I wrapped the whole thing except the newline in parentheses - making it into a subexpression, which you can refer to with a backreference - which is what I did in the replacement part.
A few notes, though. This will join the lines even if the entire number sits at the end of one line and is not yet followed by the closing \. If the entire number is on one line but the closing \ is on the next line, you can change the sed command slightly to leave those alone (see the sketch below). On the other hand, this does not handle situations where, for example, one line ends in \H and the next line begins with F=304.222\. You only mentioned a "split number" in your problem statement; shouldn't you, though, also handle such cases, where the newline splits the \HF=...\ token just not in the number portion of it?
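For instance, a variant that only joins lines when digits (or a decimal point) actually continue on the next line, and therefore leaves already-complete numbers alone, might look like this (a sketch under the same assumptions; not tested against your full output files):
sed -E ':a;N;$!ba;s/(\\HF=-?[.0-9]*)\n([.0-9]+)/\1\2/g' "$i"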
It looks like your input lines start with a space. I have ignored them in this solution.
sed -rz 's/(AG\\HF=-[0-9]*)\n/\1/g' "$i"

How to append lots of variables to one variable with a simple command

I want to stick all the variables into one variable
A=('blah')
AA=('blah2')
AAA=('blah3')
AAB=('blah4')
AAC=('blah5')
#^^ let's pretend there are 100 more of these ^^
#Variable composition
#after AAA is AAB, then AAC, then AAD, etc. etc., and that continues 100 times
I want them all placed into this MASTER variable
#MASTER=${A}${AA}${AAA} (<-- insert AAB, AAC and 100 more variables here)
I obviously don't want to type 100 variables in this expression, because there's probably an easier way to do this. Plus I'm going to be doing more of these, so I need it automated.
I'm relatively new to sed and awk; is there a way to append those 100 variables into the master variable?
For this specific purpose I DO NOT want an array.
You can use a simple one-liner, quite straightforward, though more expensive:
master=$(set | grep -E '^(A|AA|A[A-D][A-D])=' | sort | cut -f2- -d= | tr -d '\n')
set lists all the variables in name=value format
grep filters out the variables we need
sort puts them in the right order (probably optional since set gives a sorted output)
cut extracts the values, removing the variable names
tr removes the newlines
Let's test it.
A=1
AA=2
AAA=3
AAB=4
AAC=5
AAD=6
AAAA=99 # just to make sure we don't pick this one up
master=$(set | grep -E '^(A|AA|A[A-D][A-D])=' | sort | cut -f2- -d= | tr -d '\n')
echo "$master"
Output:
123456
With my best guess, how about:
#!/bin/bash
A=('blah')
AA=('blah2')
AAA=('blah3')
AAB=('blah4')
AAC=('blah5')
# to be continued ..
for varname in A AA A{A..D}{A..Z}; do
    value=${!varname}
    if [ -n "$value" ]; then
        MASTER+=$value
    fi
done
echo $MASTER
which yields:
blahblah2blah3blah4blah5...
Although I'm not sure whether this is what the OP wants.
echo {a..z}{a..z}{a..z} | tr ' ' '\n' | head -n 100 | tail -n 3
adt
adu
adv
tells us that the names would run from AAA to ADV to reach 100 variables, or to ADY for 103.
echo A{A..D}{A..Z} | sed 's/ /}${/g'
AAA}${AAB}${AAC}${AAD}${AAE}${AAF}${AAG}${AAH}${AAI}${AAJ}${AAK}${AAL}${AAM}${AAN}${AAO}${AAP}${AAQ}${AAR}${AAS}${AAT}${AAU}${AAV}${AAW}${AAX}${AAY}${AAZ}${ABA}${ABB}${ABC}${ABD}${ABE}${ABF}${ABG}${ABH}${ABI}${ABJ}${ABK}${ABL}${ABM}${ABN}${ABO}${ABP}${ABQ}${ABR}${ABS}${ABT}${ABU}${ABV}${ABW}${ABX}${ABY}${ABZ}${ACA}${ACB}${ACC}${ACD}${ACE}${ACF}${ACG}${ACH}${ACI}${ACJ}${ACK}${ACL}${ACM}${ACN}${ACO}${ACP}${ACQ}${ACR}${ACS}${ACT}${ACU}${ACV}${ACW}${ACX}${ACY}${ACZ}${ADA}${ADB}${ADC}${ADD}${ADE}${ADF}${ADG}${ADH}${ADI}${ADJ}${ADK}${ADL}${ADM}${ADN}${ADO}${ADP}${ADQ}${ADR}${ADS}${ADT}${ADU}${ADV}${ADW}${ADX}${ADY}${ADZ
The final cosmetics are easily done by hand.
One-liner using a for loop:
for n in A AA A{A..D}{A..Z}; do str+="${!n}"; done; echo ${str}
Output:
blahblah2blah3blah4blah5
Say you have the input file inputfile.txt with arbitrary variable names and values:
name="Joe"
last="Doe"
A="blah"
AA="blah2
then do (sourcing the file first so the variables are actually set in the shell):
. inputfile.txt
master=$(eval echo $(grep -o "^[^=]\+" inputfile.txt | sed 's/^/\$/;:a;N;$!ba;s/\n/$/g'))
This will concatenate the values of all variables in inputfile.txt into master variable. So you will have:
>echo $master
JoeDoeblahblah2
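If the variable names only share a common prefix rather than following an exact pattern, bash's compgen builtin can enumerate them. A minimal sketch, assuming bash and assuming that every variable whose name starts with A should be concatenated:
master=""
for varname in $(compgen -v A); do   # every shell variable whose name starts with A
    master+="${!varname}"            # indirect expansion appends its value
done
echo "$master"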

BASH: I can echo string + grep + sed, but how do I add more strings on the same line?

Asking a question here is always my last resort. I tried everything, even the most embarrassing code, so I'm at a loss to explain what I tried without success. I have:
echo $output | grep -i -m 1 "Time:" | sed 's/.*\s\([0-9]*:[0-9]*:[0-9]*\).time.*/\1/'
it outputs:
23:25:31
Easy.
But I'd like to add one more string to the end, like " , $year", so that I have:
23:25:31 , 2013
The problem is that whatever I tried (printf, -n, -e, -ne, brackets, quotes, |, ;, &, /r, etc.) gives an error or goes to a new line anyway.
Any suggestion will be really appreciated.
Thanks
time=$(echo $output | grep -i -m 1 "Time:" | sed 's/.*\s\([0-9]*:[0-9]*:[0-9]*\).time.*/\1/')
echo "The time is ${time}, 2013"
Alternatives:
add tr -d '\n' at the end of the echo+grep+sed pipeline (see the sketch below), or
{ entire-echo-grep-sed-pipeline ; echo , 2013 ; } | xargs echo (this, however, will add a space before the comma).
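The first alternative, applied to the original pipeline, might look like this (a sketch; $output and $year are assumed to be set as in the question):
echo $output | grep -i -m 1 "Time:" | sed 's/.*\s\([0-9]*:[0-9]*:[0-9]*\).time.*/\1/' | tr -d '\n'; echo " , $year"
which prints 23:25:31 , 2013 on a single line.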

modify the contents of a file without a temp file

I have the following log file which contains lines like this
1345447800561|FINE|blah#13|txReq
1345447800561|FINE|blah#13|Req
1345447800561|FINE|blah#13|rxReq
1345447800561|FINE|blah#14|txReq
1345447800561|FINE|blah#15|Req
I am trying to extract the first field from each line and, depending on whether it belongs to blah#13, blah#14, or blah#15, I am creating the corresponding files using the following script, which seems quite inefficient in terms of the number of temp files it creates. Any suggestions on how I can optimize it?
cat newLog | grep -i "org.arl.unet.maca.blah#13" >> maca13
cat newLog | grep -i "org.arl.unet.maca.blah#14" >> maca14
cat newLog | grep -i "org.arl.unet.maca.blah#15" >> maca15
cat maca10 | grep -i "txReq" >> maca10TxFrameNtf_temp
exec<blah10TxFrameNtf_temp
while read line
do
echo $line | cut -d '|' -f 1 >>maca10TxFrameNtf
done
cat maca10 | grep -i "Req" >> maca10RxFrameNtf_temp
while read line
do
echo $line | cut -d '|' -f 1 >>maca10TxFrameNtf
done
rm -rf *_temp
Something like this?
for m in org.arl.unet.maca.blah#13 org.arl.unet.maca.blah#14 org.arl.unet.maca.blah#15
do
    grep -i "$m" newLog | grep "txReq" | cut -d'|' -f1 > log.$m
done
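Since the concern is efficiency, a single awk pass over the log avoids re-reading it once per pattern and needs no temp files at all. A minimal sketch, assuming the pipe-delimited layout of the sample lines and the macaNNTxFrameNtf naming from the question's script (adjust the pattern if your real lines carry the org.arl.unet.maca. prefix):
awk -F'|' 'tolower($3) ~ /blah#1[345]/ && /txReq/ {
    # derive NN from the blah#NN part of the third field
    n = $3; sub(/.*#/, "", n)
    # the first field (the timestamp) goes into the per-device file
    print $1 > ("maca" n "TxFrameNtf")
}' newLog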
I've found it useful at times to use ex instead of grep/sed to modify text files in place without using temps ... saves the trouble of worrying about uniqueness and writability to the temp file and its directory etc. Plus it just seemed cleaner.
In ksh I would use a code block with the edit commands and just pipe that into ex ...
{
# Any edit command that would work at the colon prompt of a vi editor will work
# This one was just a text substitution that would replace all contents of the line
# at line number ${NUMBER} with the word DATABASE ... which strangely enough was
# necessary at one time lol
# The wq is the "write/quit" command as you would enter it at the vi colon prompt
# which are essentially ex commands.
print "${NUMBER}s/.*/DATABASE/"
print "wq"
} | ex filename > /dev/null 2>&1
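The same pattern works from bash or any POSIX shell, with printf standing in for ksh's print and ex -s suppressing the prompts; ${NUMBER} and filename are placeholders as in the block above:
printf '%s\n' "${NUMBER}s/.*/DATABASE/" wq | ex -s filename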

Resources