I have a very large expression which contains many fractional numbers (a/b). I want to replace every fractional number with its LaTeX form, i.e. \frac{a}{b}, globally. Here is a part of the input file:
+ 93
- 220/3*zeta3
+ 536/9*zeta2
- 4/5*zeta2^2
My output would be something like:
+ 93
- \frac{220}{3}*zeta3
+ \frac{536}{9}*zeta2
- \frac{4}{5}*zeta2^2
I can do it manually in the vim editor (which is very time consuming). I was looking for a script which can do it globally for all such fractions. Is it possible to do this with a shell script?
With GNU sed:
sed -E 's|([0-9]+)/([0-9]+)|\\frac{\1}{\2}|g' file
Output:
+ 93
- \frac{220}{3}*zeta3
+ \frac{536}{9}*zeta2
- \frac{4}{5}*zeta2^2
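To modify the file in place rather than print to standard output, GNU sed's -i option can be added (a sketch; the .bak suffix keeps a backup of the original):
sed -i.bak -E 's|([0-9]+)/([0-9]+)|\\frac{\1}{\2}|g' file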
Assuming your fractions consist only of numbers, and they don't span multiple lines:
:%s!\v(\d+)\s*/\s*(\d+)!\\frac{\1}{\2}!g
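Since the question also asks about doing this from a shell script, the same substitution can be driven non-interactively. A minimal sketch, assuming ex is the one shipped with Vim and that the file contains at least one such fraction (otherwise the :s command errors out):
ex -s file <<'EOF'
" assumes Vim's ex; \v (very magic) and \d are Vim-only regex atoms
%s!\v(\d+)\s*/\s*(\d+)!\\frac{\1}{\2}!g
x
EOF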
Using a regex:
match : ^(\s+(\+|\-)\s+)(\d+)\/(\d+)(\*.*)$
replace : \1\\frac{\3}{\4}\5
Use a tool such as sed to edit the file, for example:
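Translated to GNU sed, that could look like the following sketch (the \s and \d shorthands become POSIX classes, and the leading whitespace is made optional in case the lines start in column one):
sed -E 's|^([[:space:]]*([+-])[[:space:]]+)([0-9]+)/([0-9]+)(\*.*)$|\1\\frac{\3}{\4}\5|' file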
Usually, when you think:
I can do it manually in the vim editor (which is very time consuming).
You're doing something wrong, since nothing should take long in the vim editor, thus:
Use a vim macro
First search for /:
<esc>/\/<cr>
Record the following macro:
nbi\frac{<esc>nvc}{<esc>ea}<esc>
Step by step
Start recording a macro in buffer q:
qq
Find next occurrence of /
n
Step in front of the numerator and insert \frac{:
bi\frac{<esc>
Find the division sign, select it and change it to }{:
nvc}{<esc>
Step over denominator and append }:
ea}<esc>
Stop recording macro:
q
Play the macro 1000 times if you like:
1000@q
For some context, I am trying to combine multiple files (in an ordered fashion) named FILENAME.xxx.xyz (xxx starts from 001 and increases by 1) into a single file (denoted as $COMBINED_FILE), then replace a number of lines of text in the $COMBINED_FILE taking values from another file (named $ACTFILE). I have two for loops to do this which work perfectly fine. However, when I have a larger number of files, this process tends to take a fairly long time. As such, I am wondering if anyone has any ideas on how to speed this process up?
Step 1:
for i in {001..999}; do
    [[ ! -f ${FILENAME}.${i}.xyz ]] && break
    cat ${FILENAME}.${i}.xyz >> ${COMBINED_FILE}
    mv -f ${FILENAME}.${i}.xyz ${XYZDIR}/${JOB_BASENAME}_${i}.xyz
done
Step 2:
for ((j=0; j<=${NUM_CONF}; j++)); do
let "n = 2 + (${j} * ${LINES_PER_CONF})"
let "m = ${j} + 1"
ENERGY=$(awk -v NUM=$m 'NR==NUM { print $2 }' $ACTFILE)
sed -i "${n}s/.*/${ENERGY}/" ${COMBINED_FILE}
done
I forgot to mention: there are other files named FILENAME.*.xyz which I do not want to append to the $COMBINED_FILE
Some details about the files:
FILENAME.xxx.xyz are molecular xyz files of the form:
Line 1: Number of atoms
Line 2: Title
Line 3-Number of atoms: Molecular coordinates
Line (number of atoms +1): same as line 1
Line (number of atoms +2): Title 2
... continues on (where line 1 through Number of atoms is associated with conformer 1, and so on)
The ACT file is a file containing the energies which has the form:
Line 1: conformer1 Energy
Line 2: conformer2 Energy2
Where conformer1 is in column 1 and the energy is in column 2.
The goal is to make each conformer's energy the title line for that conformer in the combined file.
If you know that at least one matching file exists, you should be able to do this:
cat -- ${FILENAME}.[0-9][0-9][0-9].xyz > ${COMBINED_FILE}
Note that this will match the 000 file, whereas your script counts from 001. If you know that 000 either doesn't exist or isn't a problem if it were to exist, then you should just be able to do the above.
However, renaming the files as they are moved into another directory does require a loop, or one of the less-than-highly-portable pattern-based renaming utilities.
If you could change your workflow so that the filenames are preserved, it could just be:
mv -- ${FILENAME}.[0-9][0-9][0-9].xyz ${XYZDIR}/${JOB_BASENAME}
where we now have a directory named after the job basename, rather than a path component fragment.
The Step 2 processing should be doable entirely in Awk, rather than a shell loop; you can read the file into an associative array indexed by line number, and have random access over it.
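As a sketch of that idea (assuming, as in the original loops, that each conformer occupies LINES_PER_CONF lines of the combined file with the title on the second line of each block, and that the ACT file lists one conformer per line with the energy in column 2):
awk -v n="$LINES_PER_CONF" '
    NR == FNR { energy[FNR] = $2; next }                  # first file: slurp the ACT energies
    FNR % n == 2 { print energy[int(FNR / n) + 1]; next } # every title line of the combined file
    { print }                                             # everything else passes through unchanged
' "$ACTFILE" "$COMBINED_FILE" > "${COMBINED_FILE}.new" && mv "${COMBINED_FILE}.new" "$COMBINED_FILE"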
Awk can also accept multiple files, so the following pattern may be workable for processing the individual files:
awk 'your program' ${FILENAME}.[0-9][0-9][0-9].xyz
for instance just before catenating and moving them away. Then you don't have to rely on a fixed LINES_PER_FILE and such. Awk has the FNR variable which is the record in the current file; condition/action pairs can tell when processing has moved to the next file.
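A minimal sketch of that FNR-based detection (the action itself is only illustrative):
awk 'FNR == 1 { print "--- starting " FILENAME }' ${FILENAME}.[0-9][0-9][0-9].xyz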
GNU Awk also has extensions BEGINFILE and ENDFILE, which are similar to the standard BEGIN and END, but are executed around each processed file; you can do some calculations over the record and in ENDFILE print the results for that file, and clear your accumulation variables for the next file. This is nicer than checking for FNR == 1, and having an END action for the last file.
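For example, a per-file summary along these lines (GNU Awk only; summing column 2 is purely illustrative):
gawk 'BEGINFILE { sum = 0; n = 0 }      # reset accumulators for each new file
      { sum += $2; n++ }                # accumulate over the current file
      ENDFILE { if (n) printf "%s: %d records, column-2 sum %g\n", FILENAME, n, sum }
' ${FILENAME}.[0-9][0-9][0-9].xyz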
If you really want to materialize all the file names without globbing, you can always jot them (it's like seq, but with more integer digits in its default mode before switching to scientific notation):
jot -w 'myFILENAME.%03d' - 0 999 |
mawk '_<(_+=(NR == +_)*__)' \_=17 __=91 # extracting fixed interval
# samples without modulo(%) math
myFILENAME.016
myFILENAME.107
myFILENAME.198
myFILENAME.289
myFILENAME.380
myFILENAME.471
myFILENAME.562
myFILENAME.653
myFILENAME.744
myFILENAME.835
myFILENAME.926
I'm looking for a bit of help here. I'm a complete newbie!
I need to look in a file for a code matching the pattern A00000_00_A and append a count to it, so the first time it appears it is replaced with A00000_00_A_001, second time A00000_00_A_002 etc. The output needs to be written back to the same file. Each file only contains 1 code, but it appears multiple times.
After some digging I have found-
perl -pi -e 's/Q\d{4,5}'_'\d{2}_./$&.'_'.++$A /ge' /users/documents/*.xml
but the issue is the counter does not reset in each file.
That is, the output of the first file is say Q00390_01_A_1 to Q00390_01_A_7, while the second file is Q00391_01_A_8 to Q00391_01_A_10.
What I want is Q00390_01_A_1 to Q00390_01_A_7 in the first file and Q00391_01_A_1 to Q00391_01_A_2 in the second.
Does anyone have any idea on how to edit the above code to make it do that? I'm a total newbie so ideally an edit to what I have would be brilliant. Thanks
cd /users/documents/
for f in *.xml; do
    perl -pi -e 's/facs=.(Q|M)\d{4,5}_\d{2}_\w/$&."_".sprintf("%04d",++$A)/ge' "$f"
done
This matches the string facs= and any character, then "Q" or "M" followed by either four or five digits, then an underscore, then two digits, another underscore, and a word character. The entire match is then concatenated with an underscore and the value of $A zero padded to four digits.
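An alternative sketch that avoids the shell loop entirely: reset the counter whenever perl reaches the end of the current input file, using the eof operator (same substitution as above, only the reset is added):
perl -pi -e 's/facs=.(Q|M)\d{4,5}_\d{2}_\w/$&."_".sprintf("%04d",++$A)/ge; $A = 0 if eof; # restart numbering per file' /users/documents/*.xml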
I've got some source code like the following where I call a function in C:
void myFunction (
&((int) table[1, 0]),
&((int) table[2, 0]),
&((int) table[3, 0])
);
...the only problem is that the function has >300 parameters (it's an auto-generated wrapper for initialising and calling a whole module; it was given to me and I cannot change it). And as you can see: I began accessing the array with a 1 instead of a 0... Great times, modifying all 300 parameters by hand, i.e. decreasing the first array index 300 times.
The solution I am looking for is how I could force sed to do the work for me ;)
EDIT: Please note that the syntax above for accessing a two-dimensional array in C is wrong anyway! Of course it should be [1][0]... (so don't just copy-and-paste ;))
Basically, the command I came up with, was the following:
sed -r 's/(.*)(table\[)([0-9]+)(,)(.*)/echo "\1\2$((\3-1))\4\5"/ge' inputfile.c > outputfile.c
Well, this does not look very intuitive at first sight, and I was missing good explanations for nearly every example I found.
So I will try to give a detailed explanation on this:
sed
--> basic command
-r
--> most examples you find use -e, which merely supplies the script; the -r parameter (GNU sed only) enables extended regular expressions and brings support for the + in a regex. It basically means "one or more matches".
's/input/output/ge'
--> this is the basic replacement syntax. It basically means "replace 'input' with 'output'". The /g is a "global" flag, i.e. sed will replace all occurrences and not only the first one. You can add an additional e to execute the result in the shell. This is what we want to do here to handle the calculation.
(.*)
--> this matches "everything" from the last match to the next match
(table\[)
--> the \ is there to escape the bracket. This part of the expression will match strings like table[
([0-9]+)
--> this matches a number of at least one digit; thanks to the +, it can also be a number with more than one digit.
(,)
--> this simply matches the comma ,
(.*)
--> and again: the rest of the line
And now the interesting part:
echo "\1\2$((\3-1))\4\5"
the echo is a bash command
the \n (you can use any value from \1 up to \9) is some kind of "variable" for the matches: \1 will contain the first match, \2 the second match, ... --> this helps you preserve parts of the input string
the $((1+1)) is simple bash arithmetic syntax to calculate the value of the expression inside the double parentheses (in the complete sed command above, the \3 will of course be automatically replaced by the 3rd match, i.e. the first index inside the square brackets that accesses the table's cells)
please note that we use quotation marks around the echo content to also be able to process lines with characters like & which would otherwise not work
The already mentioned e of /ge at the end will trigger the execution of the result in the shell. E.g. the first two lines of the example source code in the question would produce the following bash statements:
echo "void myFunction ("
echo " &((int) table[$((1-1)), 0]),"
which is being executed and results in the following output:
void myFunction (
&((int) table[0, 0]),
...which is exactly what I wanted :)
BTW:
text > output.c
is simple bash syntax to output text (or in this case the sed-processed source code) to a file called output.c.
Good links about this topic are:
sed basics
regular expressions basics
Ahh, and one more thing: you can also use sed in Git Bash on Windows, if you are "forced" to use Windows at work like me ;)
PS: In the meantime I could have easily done this by hand but using sed was a lot more fun ;)
Here's another way you could do it, using Perl:
perl -pe 's/(table\[)(\d+)(,)/$1.($2-1).$3/e' file.c
This uses the e modifier to execute an expression in the replacement. The capture groups are concatenated together but the middle group has 1 subtracted from its value.
This will output to standard output so you can check that it does what you want. When you're happy, you can add the -i switch to overwrite the original file.
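For example, once the output looks right (a sketch; the -i.bak variant keeps a backup copy of the original file):
perl -i.bak -pe 's/(table\[)(\d+)(,)/$1.($2-1).$3/e' file.c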
I have a text file:
Function Description
concat Returns the concatenation of the arguments.
contains Returns true if the first argument string contains the second argument string; otherwise returns false.
I'd like to wrap the text on column#2, the result should be:
Function Description
concat Returns the concatenation
of the arguments.
contains Returns true if the first
argument string contains
the second argument
string; otherwise returns
false.
How can I do this quickly in vim or the shell? Thank you for any suggestions.
The issue can be easily solved in Vim by using the indentexpr option. Set
it to the number of characters designated for the first column,
:set inde=16
then format the text as usual with the gq or gw families of commands.
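For the sample above that might look like the following sketch, where 16 and 60 are only guesses at the first-column width and the desired wrap width:
:set inde=16 tw=60
gggqG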
I don't think this qualifies as "quickly", and I hope someone out there has a better answer, but this is the best I could come up with in vim:
1) Set textwidth to the desired width of your second column:
:set tw=60
2) Mark the first-column words with something special (to be removed later - any non-normal text will do, I'm using jjj here) (using g!/^$/ to ignore empty lines):
:%g!/^$/s/^/jjj/
3) Put the second column text on a separate line:
:%s/ \</ \r/
4) Rewrap all the second-column lines to the desired width:
:%g!/^jjj/normal gqq
5) Join the first line of each second-column paragraph with its first-column word (should preserve the space that was after the first-column words at the beginning):
:%g/^jjj/join
6) Indent all the remaining second-column lines the appropriate amount to line them up (use however many >>s are needed - there may be a way to make vim check the length of the last first-column line and insert that number of spaces instead of using this method):
:%g!/^jjj/normal >>>>>>>>
7) Finally remove the first-column marker from the first columns:
:%s/^jjj//
Not worth it for your example, but if the file's large enough, it's better than doing it by hand...
:set tw=80 (or :set textwidth=80)
Would wrap text to 80 chars.
Then, in normal mode, you can type:
gg (go to the top)
and then
gqG (apply the reformat down to the end of the file)
Reference:
http://www.cs.swarthmore.edu/help/vim/reformatting.html
Emacs: C-U (79) # » a pretty 79 character length divider
VIM: 79-i-# » see above
Textmate: ????
Or is it just assumed that we'll make a Ruby call or have a snippet somewhere?
I would create a bundle command to do this.
You can take the editor selection as input to your script, then replace it with the result of execution. This command, for example, will take a selected number and print the character '#' that number of times.
python -c "print '#' * $TM_SELECTED_TEXT"
Of course this example doesn't allow you to specify the character, but it gives you an idea of what's possible.
By taking the
python -c "print '#' * $TM_SELECTED_TEXT"
a step further, you can duplicate the examples you gave in the question.
Just make a snippet, called divider or something, set the tab trigger field to something appropriate '--' for example, then enter something like:
`python -c "print '_' * $TM_COLUMNS"`
Then when you type --⇥ (dash dash tab), you should get a divider of the correct width.
True, you've lost some of the terseness that you get from vim, but this is far easier to reuse, and you only have to type it once. You can also use whatever language you like.
Inspired by the other answers. Make a snippet with the following:
`python -c "print ':'.join('$TM_SELECTED_TEXT'.split(':')[:-1]) * int('$TM_SELECTED_TEXT'.split(':')[-1])"`
and optionally assign a key sequence to it, e.g. CTRL-SHIFT-R
If you type -x:4, select it, and call the snippet (by its shortcut, for example), you'll get "-x-x-x-x".
You can also use ::4 to obtain "::::".
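To see what the expression does outside TextMate, you can run it with the selection hard-coded (illustrative only, same Python 2 one-liner style):
python -c "print ':'.join('-x:4'.split(':')[:-1]) * int('-x:4'.split(':')[-1])"
which prints -x-x-x-x.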
The string you repeat is enclosed in single quotes, so to repeat ', you have to use \'.