TextMate: remove trailing spaces and save - textmate

I'm using a script to remove trailing spaces and then save the file.
The problem is that all my code foldings expand when I use it. How do I change the command so it will keep the code foldings?

You can use foldingStartMarker & foldingStopMarker to indicate the folds to TextMate.
To define a block that starts with { as the last non-space character on the line and stops with } as the first non-space character on the line, we can use the following patterns:
foldingStartMarker = '\{\s*$';
foldingStopMarker = '^\s*\}';
Pressing the F1 key will fold any code folds present.
Reference: http://manual.macromates.com/en/navigation_overview#collapsing_text_blocks_foldings
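To see which lines the two patterns above would pick out, here is a minimal Ruby sketch. The method name fold_role and the sample lines are made up for illustration, and it only classifies single lines; it does not reproduce how TextMate pairs the markers into folds.

```ruby
# The two patterns from the answer, written as Ruby regexes.
FOLDING_START = /\{\s*$/   # "{" is the last non-space character on the line
FOLDING_STOP  = /^\s*\}/   # "}" is the first non-space character on the line

# Hypothetical helper: report which marker (if any) a single line matches.
def fold_role(line)
  return :start if line =~ FOLDING_START
  return :stop  if line =~ FOLDING_STOP
  :none
end

["function foo() {", "  return 1;", "}"].each do |line|
  puts "#{fold_role(line)}\t#{line}"
end
```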


Why does CMD ignore the character `;`?

I'm just wondering. When I type ; in cmd, it will just ignore it.
I can type ;;;;;;;;;;;;;;; and it will do the same thing but, if I do ;a it will say error.
Why is that?
; is a delimiter.
Delimiters separate one parameter from the next - they split the command line up into words.
More info on https://ss64.com/nt/syntax-esc.html
The semicolon is not ignored by cmd.exe; rather, it is specifically recognised as a token separator. Token separators are used to separate a command from its arguments and arguments from each other. Here are all such characters:
SPACE (code 0x20)
TAB (horizontal tabulator, code 0x09)
, (comma, code 0x2C)
; (semicolon, code 0x3B)
= (equal-to sign, code 0x3D)
VTAB (vertical tabulator, code 0x0B)
FF (form-feed or page-break, code 0x0C)
NBSP (non-breaking space, code 0xFF)
Note that multiple consecutive token separators are collapsed to a single one.
Command Prompt does not ignore the character ";". It is a delimiter, and cmd recognizes it as such: it reads it much like a space, so nothing happens when you type it alone.
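As a rough model of the behaviour described in these answers (not cmd.exe's actual parser), the Ruby sketch below splits a command line on the listed separators and collapses runs of them. The method name tokenize is hypothetical, and the NBSP entry is left out because its byte value depends on the code page.

```ruby
# Separators from the list above: space, tab, comma, semicolon, equals,
# vertical tab (0x0B) and form feed (0x0C). NBSP is omitted (assumption:
# its encoding depends on the active code page).
SEPARATORS = /[ \t,;=\x0B\x0C]+/

# Hypothetical helper: split a command line into tokens, treating any run
# of separators as a single break and dropping empty tokens.
def tokenize(command_line)
  command_line.split(SEPARATORS).reject(&:empty?)
end

p tokenize(";;;;;;;")   # no tokens at all, so there is nothing to run
p tokenize(";a")        # one token, "a", which is then looked up as a command
```

In this model a line of only semicolons yields no tokens, while ;a yields the token a, which matches the observation that the second form produces an error about an unknown command.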

Delete string from a file with beginning and ending tag like <ex></ex> in a shell script

I am writing a bash script to uncomment the code that sits between a beginning and an ending tag, e.g.:
/*<O33>*/
// here my code
/*</O33>*/
Here's my script, and I successfully uncomment it.
sed -i "/^[ \t]*\/\*<O33>\*\//,/^[ \t]*\/\*<\/O33>\*\//s/\/\///g" $Path/DebugVersion.c
and this is the result:
/*<O33>*/
here my code
/*</O33>*/
Now I'm trying to reverse the process, to re-comment the lines between the begin and end tags, but I don't know how to do it.
The task of reinserting the comments is a little trickier because you do not want to insert the // markers on the start and end lines. I ended up with a file, script.sed, that contained:
\#/\*<O33>\*/#, \#/\*</O33>\*/# {
\#/\*</\{0,1\}O33>\*/# ! s%\([[:space:]]*\)%&//%
}
I then ran:
$ sed -f script.sed data
/*<O33>*/
//here my code
/*</O33>*/
$
Your pattern matching is complicated by the presence of / and * in the material to be matched. One way around this is to change sed's search character by using, in this case, \# to tell it that # marks the end of the pattern. (You can choose any other character that's convenient: = or % work well too.) The first line has the form patt1, patt2{ where the first pattern looks for /*<O33>*/ (note that is a capital letter O, not a zero) and the second looks for /*</O33>*/. The { starts grouping the commands. It means that lines in the range from the first pattern to the second pattern will be subjected to the commands contained within the matching braces { … }.
The second line uses a slightly different variant on the pattern match to identify both begin and end lines (the /\{0,1\} is the classic sed way of finding 0 or 1 instances of /; if you use modern extended regexes, it is equivalent to /?). The ! operator inverts the sense of the match; only lines that do not match the begin or end tags will be subjected to the s%%% substitution. This avoids inserting // in front of /*<O33>*/, for example.
The s%\([[:space:]]*\)%&//% operation looks for zero or more leading white space characters (blanks or tabs), and adds // after them. Again, this avoids using s/// because / is part of the replacement text.
This can all be squished into one line rather than needing a separate script file, but I like script files because I don't have to fight the shell. That's not a big problem here; there are no quotes — single, double or back — to confuse things. Use single quotes around the script unless you need to interpolate shell variables into the script. If you must use double quotes, you have to double up the backslashes, and it rapidly gets fiddly. Avoid double quotes around the script if you can.
$ sed '\#/\*<O33>\*/#, \#/\*</O33>\*/# { \#/\*</\{0,1\}O33>\*/# ! s%\([[:space:]]*\)%&//%; }' data
/*<O33>*/
//here my code
/*</O33>*/
$
The trailing semicolon is optional in GNU sed; it is necessary in BSD (macOS) and classic sed.
I note in passing that in the decommenting sed script, you have:
s/\/\///g
The g is probably not a good idea. If you have:
/*<O33>*/
// printf("%s\n", __func__); // Identify the function!
/*</O33>*/
Then the uncommenting operation will leave:
/*<O33>*/
printf("%s\n", __func__); Identify the function!
/*</O33>*/
removing the second comment marker too. The code won't compile. Drop the g; you only want to remove the first comment marker on the lines. (Clearly, if you only use // comments to disable blocks of code, this isn't critical — but it is safer in the long run.)
Also note that if you're working in C or C++ with the C preprocessor available, you'd do better to use:
#ifdef O33
printf("%s\n", __func__); // Identify the function!
#endif
as you don't have to edit the file to enable or disable the code; you can simply recompile with or without -DO33 or equivalent.
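For comparison, the range-plus-negation logic of the sed script above can be sketched in Ruby. This is an illustrative equivalent, not part of the original answer: the method name recomment is hypothetical, and it assumes each tag sits on its own line, as in the question.

```ruby
# Tag patterns; %r{...} avoids having to escape the slashes.
START_TAG = %r{/\*<O33>\*/}
END_TAG   = %r{/\*</O33>\*/}

# Hypothetical helper: put "//" after the leading blanks of every line that
# lies between the start and end tags, leaving the tag lines untouched.
def recomment(text)
  inside = false
  text.lines.map do |line|
    if line =~ START_TAG
      inside = true
      line
    elsif line =~ END_TAG
      inside = false
      line
    elsif inside
      # Keep the indentation, then add the comment marker.
      line.sub(/\A[ \t]*/) { |indent| "#{indent}//" }
    else
      line
    end
  end.join
end

puts recomment("/*<O33>*/\n  here my code\n/*</O33>*/\n")
```

As in the sed version, only the first position after the indentation is touched, so any trailing // comment on a line is left alone.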
I learned from Jonathan's answer and SLePort's answer, and I got it working:
sed -i "/^[ \t]*\/\*<O33>\*\//,/^[ \t]*\/\*<\/O33>\*\//d" $Path/DebugVersion.c

Remove everything but brackets with sed, then indent

I have a huge file, a really huge file (some 600+MB of text). In fact they are jsons. Each json is on a new line and only comes in a few flavours.
They look like:
{"text":{"some nested words":"Some more","something else":"Yeah more stuff","some list":["itemA","ItemB","itemEtc"]},"One last object":{"a thing":"and it's value"}}
What I want is to go through it with sed, strip out the text, and put in some indent for each nested pair, so we get:
{
-{
--[]
-}
--{}
-}
}
(I'm not 100% sure I got the nesting right on the output, I think it's right)
Is this possible? I saw this, which was the closest I could imagine it being, but that gets rid of the brackets too.
I've noticed the answer there uses branching, so I think I need that, and I'll need to do some kind of s/pattern/newline+tab/space/g type command, but I can't figure out how or what to make that...
Could someone help please? It needn't be pure sed, but that is preferred.
This will not be pretty... =) Here is my solution as a sed script. Notice that it requires that the first line notifies the shell how to invoke sed to execute our script. As you can see, the "-n" flag is used so we force sed only to print what we explicitly command it to through the "p" or "P" commands. The "-f" option tells sed to read the commands from a file, with the name following the option. As the file name of the script is concatenated by the shell into the final command, it will properly read commands from the script (ie. if you run "./myscript.sed" the shell will execute "/bin/sed -nf myscript.sed").
#!/bin/sed -nf
s/[^][{}]//g
t loop
: loop
t dummy
: dummy
s/^\s*[[{]/&/
t open
s/^\s*[]}]/&\
/
t close
d
: open
s/^\(\s*\)[[]\s*[]]/\1[]\
/
s/^\(\s*\)[{]\s*[}]/\1{}\
/
t will_loop
b only_open
: will_loop
P
s/.*\n//
b loop
: only_open
s/^\s*[[{]/&\
/
P
s/.*\n//
s/[][{}]/ &/g
b loop
: close
s/ \([][{}]\)/\1/g
P
s/.*\n//
b loop
Before we start, we must first strip out everything except curly brackets and square brackets. That's the responsibility of the first "s" command. It tells sed to replace every character that isn't a bracket or a square bracket with nothing, i.e. remove it. Notice that the square brackets in the match represent a group of characters to match, but when the first character inside them is a "^", it will actually match any character except the ones specified after the "^". Because we want to match the closing square bracket, and we also need a closing square bracket to end the group of characters to ignore, we indicate that a closing square bracket belongs to the group by making it the first character following the "^". We can then specify the rest of the characters: opening square bracket, open bracket and close bracket (group of ignored characters: "][{}"), and then close the group with the closing square bracket. I tried to detail more here because this can be confusing.
Now for the actual logic. The algorithm is pretty simple:
while line isn't empty
    if line starts with optional spaces followed by [ or {
        if after the [ or { there are optional spaces followed by a respective ] or }
            print the respective pair, with only the indentation spaces, followed by a newline
        else
            print the opening square or normal bracket, followed by a newline
            remove what was printed from the pattern space (a.k.a. the buffer)
            add a space before every open or close bracket (normal or square)
        end-if
    else
        remove a space before every open or close bracket (normal or square)
        print the closing square or normal bracket, followed by a newline
        remove what was printed from the pattern space
    end-if
end-while
But there are a couple of quirks. First of all, sed doesn't support a "while" loop or an "if" statement directly. The closest we can get to is the "b" and "t" commands. The "b" command branches (jumps) to a predefined label, similar to a C goto statement. The "t" also branches to a predefined label, but only if a substitution has happened since the start of the script running on the current line or since the last "t" command. Labels are written with the ":" command.
Because it is very likely that the first command actually performs at least one substitution, the first "t" command that follows it will cause a branch. Because we need to test for some other substitutions, we need to make sure that the next "t" command won't automatically succeed because of that first command. That is why we start with a "t" command to a line just above it (ie. if it branches or not, it will still continue at the same point), so we can "reset" the internal flag used by "t" commands.
Because the "loop" label will be branched to from at least one "b" command, it is possible that the same flag will be set when the "b" is executed, because only "t" commands can clear it. Therefore, we need to do the same workaround to reset the flag, this time by using a "dummy" label.
We now start the algorithm by checking for the presence of an open square bracket or an open curly bracket. Because we only want to test for their presence, we must replace the match with itself, which is what "&" represents, and sed will automatically set the internal flag for the "t" command if the match succeeds. If the match succeeds, we use the "t" command to branch to the "open" label.
If it doesn't succeed, we need to see if we match a closing square or normal bracket. The command is nearly identical, but now we append a newline after the closing bracket. We do this by adding an escaped newline (i.e. a backslash followed by an actual newline) after where we place the match (i.e. after the "&"). Similarly to above, we use the "t" command to branch to the "close" label if the match succeeds. If it doesn't succeed, we will consider the line as invalid, and promptly empty the pattern space (buffer) and restart the script on the next line, all with the single "d" command.
Entering the "open" label, we will first handle the case of a pair of matching open and close brackets. If we do match them, we will print them with the indentation spaces preceding them, without any spaces between them, and ending with a newline. There is one specific command for each type of bracket pair (square or normal), but they are analogous. Because we have to keep track of how many indentation spaces there are we must store them in a special "variable". We do this by using the group capture, which will store the part of the match that starts after the "(" and ends before the ")". Therefore, we use it to capture the spaces after the start of the line and before the open bracket. We then proceed to match the open bracket followed by spaces and the respective close bracket. When we write the replacement, we make sure to reinsert the spaces by using the special variable "\1", which contains the data stored by the first group capture in the match. We then write the respective pair of open and close brackets and the escaped newline.
If we managed to do any of the replacements, we must print what we have just written, remove it from the pattern space and restart the loop with the remaining characters of the line. Because of this, we first branch with the "t" command to the "will_loop" label. Otherwise, we branch to the "only_open" label, which will handle the case of only an open bracket, without the consecutive respective close bracket.
Inside the "will_loop" label, we just print everything in the pattern space up to the first newline (which we manually added) with the "P" command. We then manually remove everything up to that first newline, so we can proceed with the rest of the line. This is similar to what the "D" command does, but without restarting the execution of the script. Finally we branch to the start of the loop again.
Inside the "only_open" label, we match an open bracket in a similar fashion as previously, but now we rewrite it appended with a newline. We then print that line and remove it from the pattern space. Now we replace all brackets (open or close, square or normal) with itself preceded by a single space character. This is so we can increment the indentation. Finally we branch to the beginning of the loop again.
The final label "close" will handle a closing bracket. We first remove every single space before a bracket, effectively decrementing the indentation. To do this, we need to use captures, because although we want to match the space and the bracket that follows, we only want to write back the bracket. Finally, print everything up to the newline that we manually added before entering the "close" label, remove what we printed from pattern space and restart the loop.
Some observations:
This doesn't check for the syntactic correctness of the code (ie. {{[}] would be accepted)
It will add and remove indentation as brackets are encountered, regardless of their type. This means that when it adds an indentation, it will remove it even if the encountered close bracket is not of the same type.
Hope this helps, and sorry for the long post =)
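For readers who find the sed control flow hard to follow, the same algorithm can be sketched with an explicit depth counter in Ruby. This is an illustrative rewrite, not the answer's script: the method name bracket_outline and the indent parameter are hypothetical, and like the sed version it neither validates nesting nor ignores brackets that appear inside JSON string values.

```ruby
# Hypothetical helper: reduce a line to its brackets and print one bracket
# (or one empty pair) per line, indented by nesting depth.
def bracket_outline(line, indent: "-")
  brackets = line.gsub(/[^\[\]{}]/, "")   # keep only [ ] { }
  out = []
  depth = 0
  i = 0
  while i < brackets.length
    ch  = brackets[i]
    nxt = brackets[i + 1]
    if ch == "[" || ch == "{"
      if (ch == "[" && nxt == "]") || (ch == "{" && nxt == "}")
        out << (indent * depth) + ch + nxt   # an empty pair stays on one line
        i += 2
        next
      end
      out << (indent * depth) + ch
      depth += 1                             # deeper after an opening bracket
    else
      depth -= 1 if depth > 0                # shallower before a closing bracket
      out << (indent * depth) + ch
    end
    i += 1
  end
  out.join("\n")
end

puts bracket_outline('{"a":{"b":[1,2]},"c":{}}')
```

A plain counter replaces the trick of adding and removing leading spaces in the pattern space, which is the only state the sed script has available.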
This might work for you (GNU sed):
sed 's/[^][{}]//g;s/./&\n/g;s/.$//' file |
sed -r '/[[{]/{G;s/(.)\n(.*)/\2\1/;x;s/^/\t/;x;b};x;s/.//;x;G;s/(.)\n(.*)/\2\1/' |
sed -r '$!N;s/((\{).*(\}))|((\[).*(\]))/\2\5\3\6/;P;D'
Explanation:
The first sed command produces a stream of curly/square brackets each on its own line
The second sed command indents each bracket
The third sed command reduces those paired brackets to a single line
If you're happy with correctly indented brackets, the third command can be omitted.
I think your expected output should look like:
{
-{
--[]
-}
-{
-}
}
Here's one way using GNU awk:
awk -f script.awk file.txt
Contents of script.awk:
BEGIN {
    FS=""
    flag = 0
}
{
    for (i=1; i<=NF; i++) {
        if ($i == "{" || $i == "[") {
            flag = flag + 1
            build_tree(flag, $i)
            printf (flag <=2) ? "\n" : ""
        }
        if ($i == "}" || $i == "]") {
            flag = flag - 1
            printf (flag >= 2) ? $i : \
                build_tree(flag + 1, $i); \
                printf "\n"
        }
    }
}
function build_tree (num, brace) {
    for (j=1; j<=num - 1; j++) {
        printf "-"
    }
    printf brace
}
I know this is an ancient thread and nobody is looking anyway, but there's a simpler way now.
cat file.txt | jq '.' | sed 's/  /-/g' | tr -dc '[[]{}()]\n-' | sed '/^-*$/d'
There are 2 spaces in the first sed.

How to remove the extra double quote?

In a malformed .csv file, there is a row of data with extra double quotes, e.g. the last line:
Name,Comment
"Peter","Nice singer"
"Paul","Love "folk" songs"
How can I remove the double quotes around folk and replace the string as:
Name,Comment
"Peter","Nice singer"
"Paul","Love _folk_ songs"
In Ruby 1.9, the following works:
result = subject.gsub(/(?<!^|,)"(?!,|$)/, '_')
Previous versions don't have lookbehind assertions.
Explanation:
(?<!^|,) # Assert that we're not at the start of the line or right after a comma
" # Match a quote
(?!,|$) # Assert that we're not at the end of the line or right before a comma
Of course this assumes that we won't run into pathological cases like
"Mary",""Oh," she said"
If you're not on Ruby 1.9, or just get tired of regexes sometimes, split the string on ,, strip the first/last quotes, replace remaining "s with _s, re-quote, and join with ,.
(We don't always have to worry about efficiency!)
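The split-and-rejoin approach just described could look like the following minimal sketch. The method name fix_inner_quotes is hypothetical, and it assumes no field contains a comma, which is the same limitation the regex has.

```ruby
# Hypothetical helper: for each comma-separated field wrapped in double
# quotes, replace any quotes inside the field with underscores.
def fix_inner_quotes(line)
  line.chomp.split(",", -1).map do |field|
    quoted = field.length >= 2 && field.start_with?('"') && field.end_with?('"')
    next field unless quoted           # leave unquoted fields untouched
    inner = field[1..-2]               # strip the first and last quote
    '"' + inner.gsub('"', "_") + '"'   # replace the rest, then re-quote
  end.join(",")
end

puts fix_inner_quotes('"Paul","Love "folk" songs"')
```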
$str = '"folk"';
$new = str_replace('"', '', $str);
/* now $new is only folk, without " */
Meta-strategy:
It's likely that the data was entered manually and inconsistently. CSVs get messy when people type either field terminators (double quotes) or separators (commas) into the field itself. If you can have the file regenerated, ask them to use an extremely unlikely field begin/end marker, like five tildes (~~~~~), and then you can split on "~~~~~,~~~~~" and get the correct number of fields every time.
Unless you have no other choice, get the file regenerated with correct escaping. Any other approach is asking for trouble, because the insertion of unescaped quotes is lossy, and thus cannot be reliably reversed.
If you can't get the file fixed from the source, then Tim Pietzcker's regex is better than nothing, but I strongly recommend that you have your script print all "fixed" lines and check them for errors manually.

Ruby gsub issues

I have a piece of text that resembles the following:
==EXCLUDE
#lots of lines of text
==EXCLUDE
#this is what I actually want
And so I was trying to remove the unwanted bit by doing:
str.gsub!(/==EX.*?==EXCLUDE/, '')
However, it's not working. When I tried to remove the \n chars first, it worked like a dream. The issue is that I can't actually remove the \n characters. How can I do a substitution like this while leaving newlines in place?
By default, the . does not match line break chars. If you enable the m modifier in Ruby (in other languages, this is the s modifier) it should work:
str.gsub!(/==EX.*?==EXCLUDE/m, '')
Here's a live demo on Rubular: http://rubular.com/r/YxLSB1Iq95
Try str.gsub!(/==EX.*?==EXCLUDE/m, '')
That should make it span new lines.
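A small self-contained check of the difference, using text shaped like the question's sample (the method names are hypothetical):

```ruby
SAMPLE = "==EXCLUDE\n#lots of lines of text\n==EXCLUDE\n#this is what I actually want\n"

# Without /m, "." stops at newlines, so nothing spanning lines can match.
def strip_single_line(text)
  text.gsub(/==EX.*?==EXCLUDE/, "")
end

# With /m, "." also matches newlines, so the lazy .*? can reach the second marker.
def strip_excluded(text)
  text.gsub(/==EX.*?==EXCLUDE/m, "")
end

puts strip_single_line(SAMPLE) == SAMPLE   # true: the text is unchanged
puts strip_excluded(SAMPLE)
```

Note that the newline following the second ==EXCLUDE is not part of the match, so the result starts with an empty line.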
