Sometimes I find myself editing a C source file that mixes two indentation styles: four spaces in some places and real tab characters in others.
Is there any tool that attempts to parse the file and "normalize" this, i.e. convert all occurrences of four spaces to regular tab, or all occurrences of tab to four spaces, to keep it consistent?
I assume something like this can be done even with just a simple vim one-liner?
There's :retab and :retab! which can help, but there are caveats.
It's easier if you're using spaces for indentation, then just set 'expandtab' and execute :retab, then all your tabs will be converted to spaces at the appropriate tab stops (which default to 8.) That's easy and there are no traps in this method!
If you want to use 4 space indentation, then keep 'expandtab' enabled and set 'softtabstop' to 4. (Avoid modifying the 'tabstop' option, it should always stay at 8.)
If you want to do the inverse and convert to tabs instead, you could set 'noexpandtab' and then use :retab! (which will also look at sequences of spaces and try to convert them back to tabs.) The main problem with this approach is that it won't just consider indentation for conversion, but also sequences of spaces in the middle of lines, which can cause the operation to affect strings inside your code, which would be highly undesirable.
Perhaps a better approach for replacing spaces with tabs for indentation is to use the following substitute command:
:%s#^\s\+#\=repeat("\t", indent('.') / &tabstop).repeat(" ", indent('.') % &tabstop)#
Yeah it's a mouthful... It matches whitespace at the beginning of each line, then uses the indent() function to find the total indentation (that function calculates indentation taking tab stops into consideration), then divides that by 'tabstop' to decide how many tabs and how many leftover spaces a specific line needs.
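For readers who want to see the arithmetic outside Vim, here is a minimal Ruby sketch of the same idea. The helper name retab_indent is hypothetical, and the code assumes indentation consists only of spaces and tabs; it is an illustration of the calculation, not a replacement for the Vim command.

```ruby
# Hypothetical helper (not part of Vim): rewrite a line's leading whitespace
# as tabs plus leftover spaces, assuming a tab stop every `tabstop` columns.
def retab_indent(line, tabstop = 8)
  leading = line[/\A[ \t]*/]
  # Display width of the indentation: a tab jumps to the next tab stop,
  # a space advances one column (roughly what Vim's indent() reports).
  width = 0
  leading.each_char do |ch|
    width = ch == "\t" ? (width / tabstop + 1) * tabstop : width + 1
  end
  # Quotient = number of tabs, remainder = spaces that don't fill a tab stop.
  tabs, spaces = width.divmod(tabstop)
  ("\t" * tabs) + (" " * spaces) + line[leading.length..-1]
end
```

Only the leading whitespace is rewritten, so spaces inside string literals further along the line are left alone.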
If this command works for you, you might want to consider adding a mapping or :command for it, to keep it handy. For example:
command! -range=% Retab <line1>,<line2>s#^\s\+#\=repeat("\t", indent('.') / &tabstop).repeat(" ", indent('.') % &tabstop)
This also allows you to "Retab" a range of the file, including one you select with a visual selection.
Finally, one last alternative to :retab is to ask Vim to "reformat" your code completely, using the = command, which will use the current 'indentexpr' or other indentation configurations such as 'cindent' to completely reindent the block. That typically respects your 'noexpandtab' and 'softtabstop' options, so it uses tabs and spaces for indentation consistently. The downside of this approach is that it will completely reformat your code, including changing indentation in places. The upside is that it typically has a semantic understanding of the language and will be able to take that into consideration when reindenting the code block.
I am trying to write a bash script that uses sed to modify lines in a config file not containing a specific string. To illustrate by example, I could have ...
/some/file/path1 ipAddress1/subnetMask(rw,sync,no_root_squash)
/some/file/path2 ipAddress1/subnetMask(rw,sync,no_root_squash,anonuid=-1)
/some/file/path3 ipAddress2/subnetMask(rw,sync,no_root_squash,anonuid=0)
/some/file/path4 ipAddress2/subnetMask(rw,sync,no_root_squash,anongid=-1)
/some/file/path5 ipAddress2/subnetMask(rw,sync,no_root_squash,anonuid=-1,anongid=-1)
And I want every line's parenthetical list to be changed such that it contains strings anonuid=-1 and anongid=-1 within its parentheses ...
/some/file/path1 ipAddress1/subnetMask(rw,sync,no_root_squash,anonuid=-1,anongid=-1)
/some/file/path2 ipAddress1/subnetMask(rw,sync,no_root_squash,anonuid=-1,anongid=-1)
/some/file/path3 ipAddress2/subnetMask(rw,sync,no_root_squash,anonuid=-1,anongid=-1)
/some/file/path4 ipAddress2/subnetMask(rw,sync,no_root_squash,anongid=-1,anonuid=-1)
/some/file/path5 ipAddress2/subnetMask(rw,sync,no_root_squash,anonuid=-1,anongid=-1)
As can be seen from the example, both anonuid and anongid may already exist within the parentheses, but it is possible that the original parenthetical list has one string but not the other (lines 2, 3, and 4), the list has neither (line 1), the list has both already set properly (line 5), or even one or both of them are set incorrectly (line 3). When either anonuid or anongid is set to a value other than -1, it must be changed to the proper value of -1 (line 3).
What would be the best way to edit my config file using sed such that anonuid=-1 and anongid=-1 is contained in each line's parenthetical list, separated by a comma delimiter of course?
I think this does what you want:
sed -e '/anonuid/{s/anonuid=[-0-9]*/anonuid=-1/;b gid;};s/)$/,anonuid=-1)/;:gid;/anongid/{s/anongid=[-0-9]*/anongid=-1/;b;};s/)$/,anongid=-1)/'
Basically, it has two nearly identical parts with the first dealing with anonuid and the second anongid, each with a bit of logic to decide if it needs to replace or add the appropriate values. (It doesn't bother to check if the value is already correct, that would just complicate things while not changing the results.)
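As a rough illustration of that replace-or-append logic, here is the same decision written out in Ruby. The helper names force_option and normalize_export are made up for this sketch, and it assumes, like the sed command, that each line ends with the closing parenthesis.

```ruby
# Hypothetical helper: make sure `name=value` appears in the option list.
# If the option is already present its value is overwritten; otherwise it
# is appended just before the closing parenthesis at the end of the line.
def force_option(line, name, value = "-1")
  if line.include?(name)
    line.sub(/#{name}=[-0-9]*/, "#{name}=#{value}")
  else
    line.sub(/\)$/, ",#{name}=#{value})")
  end
end

# Apply the same step twice, first for anonuid and then for anongid.
def normalize_export(line)
  force_option(force_option(line, "anonuid"), "anongid")
end
```

Because a missing option is appended at the end, a line that starts out with only anongid ends up with anongid first and anonuid second, which matches the expected output for path4 in the question.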
You can use sed to specify the lines you are interested in:
$ sed -n '/anonuid=..*,anongid=..*)$/!p' $file
The above prints (p) only the lines that don't match the regular expression between the two slashes: -n suppresses sed's default output, and the ! negates the match. This way, you skip the lines that already have both anonuid and anongid in them.
Now, you can work on the non-matching lines and editing those with the sed s command:
$ sed '/anonuid=..*,anongid=..*)$/!s/from/to/' $file
The manipulation might be fairly complex, and you might be passing multiple sed commands to get everything just right.
However, if the string no_root_squash appears in each line you want to change, why not take the simple way out:
$ sed 's/no_root_squash.*$/no_root_squash,anonuid=-1,anongid=-1)/' $file
This is looking for that no_root_squash string, and replacing everything from that string to the end of the line with the text you want. Are there lines you are touching that don't need to be edited? Yes, but you're not really changing those lines. You're basically substituting /no_root_squash,anonuid=-1,anongid=-1) with the same /no_root_squash,anonuid=-1,anongid=-1).
This may be faster even though it's replacing text that doesn't need replacing because there's less processing going on. Plus, it's easier to understand and support in the future.
Response
Thanks David! Yeah I was considering going that route, but I didn't want to rely 100% on every line containing no_root_squash. My current config file only ends in that string, but I'm just not 100% sure that won't potentially be different in the field. Do you think there would be a way to change that so it just overwrites from the end of the last string not containing anonuid=-1 or anongid=-1 onward?
What can you guarantee will be in each line?
You might be able to do a capture group:
sed 's/\(sync,[^,)]*\).*/\1,anonuid=-1,anongid=-1)/' $file
The \(..\) is a capture group. It captures that portion of the matching regular expression and lets you reuse it via \1. I'm capturing from the word sync through a run of characters that doesn't include a comma or a closing parenthesis. Then the replacement emits the captured text, a comma, and your anon uid and gid.
Will that work?
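If it helps to see the capture group in isolation, here is a small Ruby sketch of the same substitution. The method name rewrite_tail is invented for the example, and it relies on the same assumption as the sed command: every line contains "sync," followed by at least one more option.

```ruby
# Keep "sync," plus the option right after it (capture group 1), then
# replace everything that follows with the desired anonuid/anongid pair.
# Lines without "sync," are returned unchanged.
def rewrite_tail(line)
  line.sub(/(sync,[^,)]*).*/, '\1,anonuid=-1,anongid=-1)')
end
```

In Ruby the group is written with plain parentheses and referenced as \1 inside a single-quoted replacement string, which corresponds to \(..\) and \1 in sed's basic regular expressions.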
Maybe I am oversimplifying:
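sed 's/,*anonuid=[-0-9]*//g;s/,*anongid=[-0-9]*//g;s/(,/(/;s/)$/,anonuid=-1,anongid=-1)/' test.txt > test3.txt
This just drops any current instance of anonuid or anongid (together with the comma in front of it) and then adds the string ",anonuid=-1,anongid=-1" just before the closing parenthesis, so lines that had neither option still get a separating comma.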
I want to split a string suppressing all null fields
Command:
",1,2,,3,4,,".split(',')
Result:
["", "1", "2", "", "3", "4", ""]
Expected:
["1", "2", "3", "4"]
How to do this?
Edit
OK. Just to sum up all the good answers posted.
What I wanted was for the split method (or some other method) not to generate empty strings. It looks like that isn't possible.
So the solution is a two-step process: split the string as usual, and then delete the empty strings from the resulting array.
The second part is exactly this question
(and its duplicate)
So I would use
",1,2,,3,4,,".split(',').delete_if(&:empty?)
The solution proposed by Nikita Rybak and by user229426 is to use the reject method. According to the docs, reject returns a new array, while delete_if modifies the array in place, which suits me better since I don't need a copy. Using select, as proposed by Mark Byers, also builds a new array.
steenslag proposed to replace commas with space and then use split by space:
",1,2,,3,4,,".gsub(',', ' ').split(' ')
Actually, the documentation says that a single space stands for any whitespace. But results of "split(/\s/)" and "split(' ')" are not the same. Why's that?
Mark Byers proposed another solution: just use regular expressions. This seems to be what I need, although it implies you have to be a master of regexps. Still, it's a great solution! For example, to also skip fields that contain only whitespace while keeping spaces inside a field, I can rewrite it to
",1,2, ,3 3,4 4 4,,".scan(/\w+[\s*\w*]*/)
the result is:
["1", "2", "3 3", "4 4 4"]
But again, regexps are very unintuitive and take some experience.
Summary
I expect split to treat whitespace the same way as a comma, or even as a regexp. I expect it to do not produce empty strings. I think this is a bug in ruby or my misunderstanding.
Made it a community question.
There's a reject method in Array:
",1,2,,3,4,,".split(',').reject { |s| s.empty? }
Or if you prefer Symbol#to_proc:
",1,2,,3,4,,".split(',').reject(&:empty?)
Hoping to illuminate a bit here:
But results of "split(/\s/)" and "split(' ')" are not the same. Why's that?
If you look at the docs for String#split you'll see that split with ' ' is a special case:
If pattern is a single space, str is split on whitespace,
with leading whitespace and runs of contiguous whitespace characters ignored.
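A quick way to see that special case is to run both forms on the same string. The sample text below is made up for the demonstration.

```ruby
# A string with leading, doubled, and trailing spaces.
SAMPLE = " 1  2 3 "

# A single-space pattern: leading whitespace and runs of whitespace are ignored.
AWK_STYLE = SAMPLE.split(' ')

# A regexp matching one whitespace character: every single space is a
# separator, so empty fields show up (only the trailing ones are dropped).
REGEX_STYLE = SAMPLE.split(/\s/)

p AWK_STYLE    # ["1", "2", "3"]
p REGEX_STYLE  # ["", "1", "", "2", "3"]
```

Splitting on /\s+/ instead would collapse the doubled space, but it would still leave the leading empty string.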
You also mention:
I expect it to do not produce empty strings. I think this is a bug in ruby or my misunderstanding.
The problem probably lies between the keyboard and the chair. ;-)
split will happily produce empty strings as it should, because there are times when you would definitely want this ability, and there are plenty of easy ways to work around it. Consider if you were splitting a csv from an Excel file. Anywhere you see ',,' would be an empty column, not a column you should just get rid of.
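To make the CSV point concrete, here is a small sketch with an invented row. It also shows the optional limit argument of String#split: a negative limit keeps trailing empty fields as well.

```ruby
# A made-up CSV row whose second and last columns are empty.
ROW = "name,,city,"

# Default behaviour: interior empty fields are kept, trailing ones are dropped.
DEFAULT_FIELDS = ROW.split(',')

# A negative limit disables the suppression of trailing empty fields.
ALL_FIELDS = ROW.split(',', -1)

p DEFAULT_FIELDS  # ["name", "", "city"]
p ALL_FIELDS      # ["name", "", "city", ""]
```

With the negative limit every column position survives, which is what you would want when the number of columns matters.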
Regardless, you've seen a bunch of solutions - and here's another one that might show you the things you can do with ruby and split!
It seems you want to split up data between multiple commas, so why not try that and see what happens?
a = ",1,2,,3,4,,5,,,,6,,,".split(/,+/)
It's a simple enough regular expression: /,+/ means one or more commas, so we'll split on that.
This almost gives you what you want, except that you also want to ignore the leading empty field. You'll note that split ignores the empty field on the end because (from the String#split docs):
If the limit parameter is omitted, trailing null fields are suppressed.
So that means we can either use something that will remove that empty string at the front of the array or just remove the initial commas. We can use gsub for that:
a = ",1,2,,3,4,,5,,,,6,,,".gsub(/^,+/,'')
If you print that out you'll see that our leading empty "field" is now gone. So we can combine them all in one line:
a = ",1,2,,3,4,,5,,,,6,,,".gsub(/^,+/,'').split(/,+/)
And you have another solution!
And incidentally, this points out another possibility, that we can just cleanup our string entirely before sending it to split if we want a simple split. I'll leave it to you to figure out what this one is doing:
a = ",1,2,,3,4,,5,,,,6,,,".gsub(/,+/,',').gsub(/^,/,'').split(',')
There's lots of ways to do things in ruby. If it seems that ruby isn't doing what you want, then take a look at the docs and realize that it probably works the way that it does for a reason (there are plenty of people who would be upset if split wasn't able to spit out empty fields :)
Hope that helps!
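As a sanity check, the one-liners discussed so far can be run side by side on the input from the question. The constant names and hash keys here are just labels for this sketch.

```ruby
INPUT = ",1,2,,3,4,,"

# Each entry is one of the approaches from this thread.
RESULTS = {
  reject:      INPUT.split(',').reject(&:empty?),
  delete_if:   INPUT.split(',').delete_if(&:empty?),
  via_spaces:  INPUT.gsub(',', ' ').split(' '),
  strip_first: INPUT.gsub(/^,+/, '').split(/,+/),
  cleanup:     INPUT.gsub(/,+/, ',').gsub(/^,/, '').split(','),
}

# Print each label with the array it produced.
RESULTS.each { |name, fields| puts "#{name}: #{fields.inspect}" }
```

All five produce the same four-element array for this input. They can differ on other data: the via_spaces variant, for instance, would also split on spaces inside a field.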
You could use split followed by select:
",1,2,,3,4,,".split(',').select{|x|!x.empty?}
Or you could use a regular expression to match what you want to keep instead of splitting on the delimiter:
",1,2,,3,4,,".scan(/[^,]+/)
",1,2,,3,4,,".split(/,/).reject(&:empty?)
",1,2,,3,,,4,,".squeeze(",").sub(/^,*|,*$/,"").split(",")
String#split(pattern) behaves as desired when pattern is a single space (ruby-doc).
",1,2,,3,4,,".gsub(',', ' ').split(' ')
I have been looking at regular expressions to try and do this, but the most I can do is find the start of a line with ^, but not replace it.
I can then find the first characters on a line to replace, but can not do it in such a way with keeping it intact.
Unfortunately I don't have access to a tool like cut since I am on a Windows machine... so is there any way to do what I want with just regexp?
Use Notepad++. It offers a way to record a sequence of actions, which can then be repeated for all lines in the file.
Did you try replacing the regular expression ^ with the text you want to put at the start of each line? Also you should use the multiline option (also called m in some regex dialects) if you want ^ to match the start of every line in your input rather than just the first.
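using System;
using System.Text.RegularExpressions;
// RegexOptions.Multiline makes ^ match at the start of every line.
string s = "test test\ntest2 test2";
s = Regex.Replace(s, "^", "foo", RegexOptions.Multiline);
Console.WriteLine(s);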
Result:
footest test
footest2 test2
I used to program on the mainframe and got used to SPF panels. I was thrilled to find a Windows version of the same editor at Command Technology. Makes problems like this drop-dead simple. You can use expressions to exclude or include lines, then apply transforms on just the excluded or included lines and do so inside of column boundaries. You can even take the contents of one set of lines and overlay the contents of another set of lines entirely or within column boundaries which makes it very easy to generate mass assignments of values to variables and similar tasks. I use Notepad++ for most stuff but keep a copy of SPFSE around for special-purpose editing like this. It's not cheap but once you figure out how to use it, it pays for itself in time saved.