In .txt files, I want to write a paragraph, with a tab in the front.
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat.
However, in Sublime Text all subsequent lines are indented to match the first tab.
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat.
What are the settings to get the first format?
If you only want to have the indentation on the first line instead of the whole paragraph, then you should add
"indent_subsequent_lines": false,
to your user settings (Preferences -> Settings -> User). I don't know if there's a way to restrict this configuration to .txt files only.
As you know, _test.go files are ignored when a Go project is built, and the mock package is only imported by _test.go files. So if those files are not included in the built project, why include the mock package at all?
So I was wondering how to ignore the files inside it when building the project.
I tried adding the suffix _test.go to the files in the mock package, but got the error "MockStruct not declared by package mock" when it was used.
Also tried to use build constraints
//go:build ignore
Got the same error "MockStruct not declared by package mock"
Am I missing something here?
Is using build constraints the only way?
If your mock is used only in test files, it is not imported when building the project. The Go compiler does not include tests and their dependencies when building.
Try this as an example:
Build the following code;
Check its binary size;
Remove the sample_test.go file;
Build again and check its binary size;
The size should be the same before and after removing the test file, which shows that nothing from the tests is included in the build.
sample.go
package main

import "fmt"

type SampleInterface interface {
	DoSomething()
}

type Sample struct {
	Name string
}

func main() {
	s := Sample{}
	CallDoSomething(&s)
}

func (s *Sample) DoSomething() {
	fmt.Println("Do Something implementation ", s.Name)
}

func CallDoSomething(si SampleInterface) {
	si.DoSomething()
}
sample_test.go
package main

import (
	"fmt"
	"testing"
)

type sample_mock struct {
	Name string
}

func (s *sample_mock) DoSomething() {
	fmt.Println("Do Something implementation", s.Name)
}

func TestCallDoSomething(t *testing.T) {
	// The long string only exists to make the test file large; it must stay on one line.
	s := sample_mock{
		Name: "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.",
	}
	CallDoSomething(&s)
}
I have a text file of around 50 GB.
I used to process it with TextPipe, but at the moment only a Mac is available and I have no access to TextPipe.
Is it possible to run a regex search over this file and save each matching line to another file?
I was thinking about vim editor but have no sufficient knowledge on where to search for.
Would appreciate any suggestions.
As an example, let's assume that I have the text below in my initial.txt file and I want to save the lines containing "Lorem" to processed.txt.
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
For fixed strings use fgrep:
fgrep Lorem initial.txt > processed.txt
For regular expressions use grep or egrep (they have slightly different regexp syntax). fgrep and egrep are equivalent to grep -F and grep -E.
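If a scripting language is handier than grep, the same line filter can be sketched in Ruby. This is only an illustration and not something the answer above relies on; the method name filter_lines is made up. File.foreach reads one line at a time, so the whole 50 GB file is never loaded into memory.

```ruby
# Copy every line of `input` that matches `pattern` into `output`.
# File.foreach streams the file line by line, so memory use stays small.
# (filter_lines is a hypothetical helper name, not a library method.)
def filter_lines(input, output, pattern)
  matched = 0
  File.open(output, "w") do |out|
    File.foreach(input) do |line|
      next unless line.match?(pattern)
      out.write(line)
      matched += 1
    end
  end
  matched # number of lines written
end

# Usage, with the file names from the question:
# filter_lines("initial.txt", "processed.txt", /Lorem/)
```

For a plain "does the line contain X" filter, grep will most likely be faster; a script like this mainly pays off when each matching line needs further processing.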
Vim is "genetically" related to other text-processing tools, such as sed or grep. And it also has an embedded sophisticated scripting language, so it is perfectly capable of batch text processing.
But Vim is an interactive text editor, so it feels a little wrong to use it merely as a replacement of awk or grep. However, if you're going to learn and use it for both editing and scripting, it's elegant and powerful.
To get some taste of Vim, you can solve your problem as follows (typing ':' in normal mode will automatically switch into command mode):
:e initial.txt
:g/Lorem/.w! >>processed.txt
I was thinking about vim editor but have no sufficient knowledge on where to search for. Would appreciate any suggestions.
The main problem with Vim is that you have to start from the very beginning, i.e. learn how to open, edit and save files, and even how to properly exit the application. So you should download and install it and run vimtutor. Next, you should get used to Vim's embedded help system (:h user-manual), which is by far Vim's best feature.
If you look for more books and tutorials, you can start from here. IMHO, Steve Oualline's "Vi IMproved" is still the best for beginners; and Drew Neil's "Practical Vim" is highly recommended for advanced vimmers.
Is there a way to get Sphinx to generate superscripted links for footnotes that will be represented in HTML like this:
I tried:
Lorem ipsum dolor sit amet, consectetur :superscript:`1` adipiscing
elit, sed do eiusmod tempor :superscript:`[#footnote2]_`
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
But the result is just:
Lorem ipsum dolor sit amet, consectetur 1 adipiscing
elit, sed do eiusmod tempor [#footnote2]_
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
But what I want is:
Lorem ipsum dolor sit amet, consectetur 1 adipiscing
elit, sed do eiusmod tempor [4]
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
Can this be done?
This is done with footnotes, and explicitly numbering the footnotes. Footnote links are automatically superscripted.
Lorem ipsum dolor sit amet, consectetur :superscript:`1` adipiscing
elit, sed do eiusmod tempor [2]_
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
.. [2] Text of the second footnote.
I have to display multiple long strings (of different lengths), but I can only display them in chunks of between 275 and 295 characters.
So if I have a 3000-word string, it would be displayed in about 10 pieces.
I'm looking for a way to find the next blank.
For example:
if str[275] != " "
# find next blank
p str[0..next_blank]
else
p str[0..275]
end
I thought of finding the index of the next blank in the 275-295th characters range, but I couldn't find how to do it in Ruby.
Any help will be much appreciated !
Rails has a method word_wrap which uses a simple regular expression:
str = 'Lorem ipsum dolor sit amet, consectetur adipisici elit, sed eiusmod tempor incidunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquid ex ea commodi consequat. Quis aute iure reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint obcaecat cupiditat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.'
puts str.gsub(/(.{1,80})(\s+|$)/, "\\1\n")
Output:
Lorem ipsum dolor sit amet, consectetur adipisici elit, sed eiusmod tempor
incidunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud
exercitation ullamco laboris nisi ut aliquid ex ea commodi consequat. Quis aute
iure reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint obcaecat cupiditat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.
The regular expression matches (and captures) up to 80 characters (.{1,80}) that are followed by whitespace or end-of-line (\s+|$).
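For the 275 to 295 character window from the question, String#index accepts a start offset, which is one way to "find the next blank". The sketch below is only an illustration: the method name chunk_at_blanks and the fallback of cutting hard at the maximum when no blank falls inside the window are my assumptions, not part of word_wrap.

```ruby
# Split `str` into pieces that end at the first blank found at or after
# `min` characters. If no blank occurs by position `max`, cut at `max`.
# (chunk_at_blanks is a hypothetical helper, not a Rails or Ruby method.)
def chunk_at_blanks(str, min = 275, max = 295)
  pieces = []
  rest = str.strip
  while rest.length > max
    cut = rest.index(" ", min)          # next blank at or after position min
    cut = max if cut.nil? || cut > max  # no blank inside the window: hard cut
    pieces << rest[0...cut]
    rest = rest[cut..-1].lstrip         # drop the blank(s) at the cut point
  end
  pieces << rest unless rest.empty?
  pieces
end

# Usage:
# chunk_at_blanks(long_string).each { |piece| puts piece }
```

Every piece except the last ends up between min and max characters long; the last one is simply whatever remains.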
Without using regular expressions, tear the input apart and put it back together:
str = 'Lorem ipsum dolor sit amet, consectetur adipisici elit, sed eiusmod tempor incidunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquid ex ea commodi consequat. Quis aute iure reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint obcaecat cupiditat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.'
def reformat_wrapped(s, width=78)
  lines = []
  line = ""
  s.split(/\s+/).each do |word|
    if !line.empty? && line.size + word.size >= width
      # Appending " " + word would exceed the width, so start a new line.
      lines << line
      line = word
    elsif line.empty?
      line = word
    else
      line << " " << word
    end
  end
  lines << line unless line.empty?
  return lines.join "\n"
end
puts reformat_wrapped(str, 78)
Output:
Lorem ipsum dolor sit amet, consectetur adipisici elit, sed eiusmod tempor
incidunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquid ex ea commodi consequat.
Quis aute iure reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint obcaecat cupiditat non proident, sunt in culpa
qui officia deserunt mollit anim id est laborum.
I'm not sure how to do this, as I'm pretty new to regular expressions and can't seem to find the proper method to accomplish it. Say I have the following string (all tabs and newlines included):
1/2 cup
onion
(chopped)
How can I remove all the whitespace and replace each instance with just a single space?
This is a case where regular expressions work well, because you want to treat the whole class of whitespace characters the same and replace runs of any combination of whitespace with a single space character. So if that string is stored in s, then you would do:
fixed_string = s.gsub(/\s+/, ' ')
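As a quick check, here is that call applied to a string shaped like the one in the question. The exact tabs and newlines are an assumption, since the original whitespace did not survive here, and the helper name collapse_whitespace is made up. strip is added only to drop leading and trailing whitespace.

```ruby
# \s+ matches any run of spaces, tabs, and newlines; each run becomes one space.
# (collapse_whitespace is a hypothetical helper name.)
def collapse_whitespace(s)
  s.gsub(/\s+/, ' ').strip
end

# Assumed input: the mix of tabs and newlines is invented for illustration.
puts collapse_whitespace("1/2 cup\n\t\t\tonion\n\t\t\t(chopped)\n")
# prints: 1/2 cup onion (chopped)
```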
Within Rails you can use String#squish, which is an ActiveSupport extension.
require 'active_support'
require 'active_support/core_ext/string/filters'
s = <<-EOS
1/2 cup
onion
EOS
s.squish
# => "1/2 cup onion"
You want the squeeze method:
str.squeeze([other_str]*) → new_str
Builds a set of characters from the other_str parameter(s) using the procedure described for String#count. Returns a new string where runs of the same character that occur in this set are replaced by a single character. If no arguments are given, all runs of identical characters are replaced by a single character.
"yellow moon".squeeze                  #=> "yelow mon"
"  now   is  the".squeeze(" ")         #=> " now is the"
"putters shoot balls".squeeze("m-z")   #=> "puters shot balls"
The problem with the simplest solution, gsub(/\s+/, ' '), is that it is very slow: it replaces every whitespace run, even a single space. But usually there is exactly one space between words, and we only need to fix sequences of two or more whitespace characters.
A better solution is tr("\r\n\t", ' ').gsub(/ {2,}/, ' '): first replace the special whitespace characters with ordinary spaces (tr is faster than gsub for replacing single characters), then collapse spaces only where two or more occur consecutively.
require 'benchmark'
def method1(s) s.gsub!(/\s+/, ' '); s end
def method2(s) s.tr!("\r\n\t", ' '); s.gsub!(/ {2,}/, ' '); s end
Benchmark.bm do |x|
n = 100_000
x.report('method1') { n.times { method1("Lorem ipsum\n\n dolor \t\t\tsit amet, consectetur\n \n\t\n adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.") } }
x.report('method2') { n.times { method2("Lorem ipsum\n\n dolor \t\t\tsit amet, consectetur\n \n\t\n adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.") } }
end
#              user     system      total        real
# method1  2.907425   0.024254   2.931679 (  3.406144)
# method2  0.644329   0.011254   0.655583 (  0.658699)
The selected answer will not remove non-breaking space characters.
This should work in 1.9:
fixed_string = s.gsub(/(\s|\u00A0)+/, ' ')
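An alternative sketch, assuming the string is UTF-8: Ruby's POSIX bracket class [[:space:]] is Unicode-aware, so it also matches the non-breaking space U+00A0, which \s does not. The helper name below is hypothetical.

```ruby
# [[:space:]] covers Unicode whitespace (including U+00A0) in UTF-8 strings,
# whereas \s only matches ASCII whitespace.
# (collapse_unicode_whitespace is a made-up helper name.)
def collapse_unicode_whitespace(s)
  s.gsub(/[[:space:]]+/, ' ')
end

p collapse_unicode_whitespace("1/2\u00A0cup \u00A0\t onion")
# => "1/2 cup onion"
```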
If speed is a concern then your best bet is this.
.tr("\r\n\t", ' ').gsub(/ {2,}/, ' ')
This replaces whitespace characters with a space then replaces multiple spaces with a single space.
I saw the benchmark that Lev posted and compared variations of gsub, .squeeze, .tr and .squish. I expanded his benchmark to try them out. While .squeeze is the fastest, it does not answer the question, since it would only compress multiple tabs/newlines to a single tab/newline.
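require 'benchmark'
require 'active_support'
require 'active_support/core_ext/string/filters' # provides String#squish
# Replace multiple whitespace characters with a single space.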
def method1(s) s.gsub!(/\s+/, ' '); s end # (in place)
def method2(s) s = s.gsub(/\s+/, ' '); s end
# Replace characters with a space then replace multiple spaces with a single space.
def method3(s) s.gsub!(/[\r\n\t]/, ' '); s.gsub!(/ {2,}/, ' '); s end # (in place)
def method4(s) s = s.gsub(/[\r\n\t]/, ' ').gsub(/ {2,}/, ' '); s end
# Replace characters with a space then replace multiple spaces with a single space.
def method5(s) s.tr!("\r\n\t", ' '); s.gsub!(/ {2,}/, ' '); s end # (in place)
def method6(s) s = s.tr("\r\n\t", ' ').gsub(/ {2,}/, ' '); s end
# Replace multiple whitespace characters with a single space.
def method7(s) s.squish!; s end # (in place)
def method8(s) s = s.squish; s end
# Combines multiple spaces into a single space
def method9(s) s.squeeze!(" "); s end # (in place)
def method10(s) s = s.squeeze(" "); s end
Benchmark.bm do |x|
n = 100_000
x.report('.gsub! ') { n.times { method1("Lorem ipsum\n\n dolor \t\t\tsit amet, consectetur\n \n\t\n adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.") } }
x.report('.gsub ') { n.times { method2("Lorem ipsum\n\n dolor \t\t\tsit amet, consectetur\n \n\t\n adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.") } }
x.report('.gsub!.gsub!') { n.times { method3("Lorem ipsum\n\n dolor \t\t\tsit amet, consectetur\n \n\t\n adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.") } }
x.report('.gsub .gsub ') { n.times { method4("Lorem ipsum\n\n dolor \t\t\tsit amet, consectetur\n \n\t\n adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.") } }
x.report('.tr!.gsub! ') { n.times { method5("Lorem ipsum\n\n dolor \t\t\tsit amet, consectetur\n \n\t\n adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.") } }
x.report('.tr .gsub ') { n.times { method6("Lorem ipsum\n\n dolor \t\t\tsit amet, consectetur\n \n\t\n adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.") } }
x.report('.squish! ') { n.times { method7("Lorem ipsum\n\n dolor \t\t\tsit amet, consectetur\n \n\t\n adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.") } }
x.report('.squish ') { n.times { method8("Lorem ipsum\n\n dolor \t\t\tsit amet, consectetur\n \n\t\n adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.") } }
x.report('.squeeze! ') { n.times { method9("Lorem ipsum\n\n dolor \t\t\tsit amet, consectetur\n \n\t\n adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.") } }
x.report('.squeeze ') { n.times { method10("Lorem ipsum\n\n dolor \t\t\tsit amet, consectetur\n \n\t\n adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.") } }
end
Which gets these results
=>
#                  user     system      total        real
# .gsub!       2.019544   0.030325   2.049869 (  2.059379)
# .gsub        1.968179   0.011204   1.979383 (  1.988050)
# .gsub!.gsub! 0.770042   0.014097   0.784139 (  0.787055)
# .gsub .gsub  0.728955   0.011577   0.740532 (  0.742887)
# .tr!.gsub!   0.487014   0.008260   0.495274 (  0.496820)
# .tr .gsub    0.487231   0.007769   0.495000 (  0.497164)
# .squish!     2.005224   0.011673   2.016897 (  2.025851)
# .squish      2.043497   0.013331   2.056828 (  2.066794)
# .squeeze!    0.117615   0.002004   0.119619 (  0.120140)
# .squeeze     0.196301   0.012094   0.208395 (  0.209267)
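Since speed only matters if the output is right, it may be worth confirming which variants actually agree. A minimal sketch, with hypothetical helper names and a shortened version of the benchmark string, comparing the plain gsub with the tr plus gsub variant and with squeeze:

```ruby
# Three of the benchmarked approaches, as non-mutating helpers.
# (via_gsub, via_tr_gsub and via_squeeze are made-up names.)
def via_gsub(s)
  s.gsub(/\s+/, ' ')
end

def via_tr_gsub(s)
  s.tr("\r\n\t", ' ').gsub(/ {2,}/, ' ')
end

def via_squeeze(s)
  s.squeeze(" ")
end

sample = "Lorem ipsum\n\n dolor \t\t\tsit amet, consectetur\n \n\t\n adipiscing elit"

puts via_gsub(sample) == via_tr_gsub(sample)   # true: same result on this input
puts via_gsub(sample) == via_squeeze(sample)   # false: squeeze leaves \n and \t in place
```

Note that the tr variant only converts \r, \n and \t; other whitespace that \s matches, such as form feed, would be left untouched, so the two are equivalent only for input limited to those characters and spaces.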