Is there a method similar to the Rails truncate method that accepts an index where I can indicate the start of truncation and a separator parameter so that it does not start in the middle of a word or string?
For example:
"i love the taste of bubble tea after lunch."
I would like to grab a string of size 15 starting from index 9 so this should result in:
"the taste of bubble"
I don't think there is one function to do this, so you'll have to write your own. I'd recommend chopping off the start of the string first and then using truncate to handle the end. Something like this might do what you want:
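def truncate_beginning_and_end(str, beginning, length, separator)
  # Index of the last separator at or before `beginning` (nil if there is none)
  first_space_before_beginning = str[0..beginning].rindex(separator)
  start = first_space_before_beginning ? first_space_before_beginning + 1 : 0
  str_without_beginning = str[start..-1]
  # `truncate` is the Rails text helper; it cuts at the last separator within `length`
  truncate(str_without_beginning, length: length, separator: separator, omission: '')
end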
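For comparison, here is a minimal plain-Ruby sketch that does not rely on Rails and widens the slice to whole words on both sides, which is what the example in the question appears to expect. The method name word_aligned_slice and its parameters are made up for illustration; it is one possible reading of the requirement, not an existing library method.

```ruby
# Returns the part of `str` covering `length` characters from index `start`,
# widened so that it begins and ends on a separator boundary.
def word_aligned_slice(str, start, length, separator = ' ')
  # Move the start back to just after the previous separator (or to 0).
  prev_sep = str.rindex(separator, start)
  from = prev_sep ? prev_sep + separator.length : 0

  # Move the end forward to the next separator (or to the end of the string).
  next_sep = str.index(separator, start + length - 1)
  to = next_sep || str.length

  str[from...to]
end

word_aligned_slice("i love the taste of bubble tea after lunch.", 9, 15)
# => "the taste of bubble"
```

Note that this differs from the Rails-based version above, which shortens the tail to the last separator that fits within the length instead of extending it to the end of the word.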
I am wondering how to make something where if X=5 and Y=2, then have it output something like
Hello 2 World 5.
In Java I would do
String a = "Hello " + Y + " World " + X;
System.out.println(a);
So how would I do that in TI-BASIC?
You have two issues to work out, concatenating strings and converting integers to a string representation.
String concatenation is very straightforward and utilizes the + operator. In your example:
"Hello " + "World"
Will yield the string "Hello World".
Converting numbers to strings is not as easy in TI-BASIC, but a method for doing so compatible with the TI-83+/84+ series is available here. The following code and explanation are quoted from the linked page:
:"?
:For(X,1,1+log(N
:sub("0123456789",ipart(10fpart(N10^(-X)))+1,1)+Ans
:End
:sub(Ans,1,length(Ans)-1→Str1
With our number stored in N, we loop through each digit of N and store
the numeric character to our string that is at the matching position
in our substring. You access the individual digit in the number by
using iPart(10fPart(A/10^(X, and then locate where it is in the string
"0123456789". The reason you need to add 1 is so that it works with
the 0 digit.
In order to construct a string with all of the digits of the number, we first create a dummy string. This is what the "? is used
for. Each time through the For( loop, we concatenate the string from
before (which is still stored in the Ans variable) to the next numeric
character that is found in N. Using Ans allows us to not have to use
another string variable, since Ans can act like a string and it gets
updated accordingly, and Ans is also faster than a string variable.
By the time we are done with the For( loop, all of our numeric characters are put together in Ans. However, because we stored a dummy
character to the string initially, we now need to remove it, which we
do by getting the substring from the first character to the second to
last character of the string. Finally, we store the string to a more
permanent variable (in this case, Str1) for future use.
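To make the loop easier to follow, here is the same digit-by-digit idea sketched in Ruby rather than TI-BASIC. It is only an illustration of the algorithm: it uses integer division instead of iPart(/fPart(, assumes a non-negative integer, and the method name digits_to_string is made up.

```ruby
DIGITS = "0123456789"

# Builds the decimal string of a non-negative integer one digit at a time,
# mirroring the dummy-character trick from the TI-BASIC routine.
def digits_to_string(n)
  result = "?"                      # dummy seed, removed at the end
  remaining = n
  loop do
    digit = remaining % 10          # lowest remaining digit
    result = DIGITS[digit] + result # prepend its character
    remaining /= 10
    break if remaining.zero?
  end
  result[0...-1]                    # drop the dummy character
end
```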
Once converted to a string, you can simply use the + operator to concatenate your string literals with the converted number strings.
You should also take a look at a similar Stack Overflow question which addresses a similar issue.
For this issue you can use the toString( function which was introduced in version 5.2.0. This function translates a number to a string which you can use to display numbers and strings together easily. It would end up like this:
Disp "Hello "+toString(Y)+" World "+toString(X)
If you know the length of "Hello" and "World," then you can simply use Output() because Disp creates a new line after every statement.
For the following code, which according to the style guide should be wrapped at 80 chars:
opts.on('--scores_min <uint>', Integer, 'Drop reads if a single position in ',
                                        'the index have a quality score ',
                                        'below scores_main (default= ',
                                        "#{DEFAULT_SCORE_MIN})") do |o|
  options[:scores_min] = o
end
The resulting output is:
        --scores_min <uint>          Drop reads if a single position in
                                     the index have a quality score
                                     below scores_main (default=
                                     16)
Which wraps at 72 chars and looks wrong :o(
I really want it wrapped at 80 chars and aligned like this:
        --scores_min <uint>          Drop reads if a single position in the
                                     index have a quality score below
                                     scores_min (default=16)
How can this be achieved in a clever way?
The easiest solution in this case is to stack parameters like this:
opts.on('--scores_min <uint>',
        Integer,
        "Drop reads if a single position in the ",
        "index have a quality score below ",
        "scores_min (default= #{DEFAULT_SCORE_MIN})") do |o|
  options[:scores_min] = o
end
That results in a fairly pleasant output:
        --scores_min <uint>          Drop reads if a single position in the
                                     index have a quality score below
                                     scores_min (default= 16)
More generally, here docs can make it easier to format output strings in a way that looks good both in the code and in the output:
# Deeply nested code
puts <<~EOT
  Drop reads if a single position in the
  index have a quality score below
  scores_min (default= #{DEFAULT_SCORE_MIN})
EOT
But in this case it doesn't work so well since the description string is indented automatically.
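One way the two ideas might be combined is to write the description as a squiggly heredoc, split it into lines, and splat those lines into opts.on, which treats each extra string as one description line. This is only a sketch: the variable names description and parser are made up, and DEFAULT_SCORE_MIN is set to 16 here to match the output shown in the question.

```ruby
require 'optparse'

DEFAULT_SCORE_MIN = 16 # assumed value, taken from the sample output
options = {}

# Each heredoc line becomes one description line in the help text.
description = <<~EOT.lines.map(&:chomp)
  Drop reads if a single position in the
  index have a quality score below
  scores_min (default= #{DEFAULT_SCORE_MIN})
EOT

parser = OptionParser.new do |opts|
  opts.on('--scores_min <uint>', Integer, *description) do |o|
    options[:scores_min] = o
  end
end

help_text = parser.help
parser.parse(%w[--scores_min 20])
```

The line breaks in the help text then follow the heredoc exactly, so wrapping is controlled in one place instead of by hand-split string literals.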
So I think the solution is to follow the Ruby Style Guide:
When using heredocs for multi-line strings keep in mind the fact that
they preserve leading whitespace. It's a good practice to employ some
margin based on which to trim the excessive whitespace.
code = <<-END.gsub(/^\s+\|/, '')
  |def test
  |  some_method
  |  other_method
  |end
END
# => "def test\n  some_method\n  other_method\nend\n"
[EDIT] In Ruby 2.3 you can do (same ref):
code = <<~END
  def test
    some_method
    other_method
  end
END
I'm just learning Ruby and have been tackling small code projects to accelerate the process.
What I'm trying to do here is read only the alphabetic words from a text file into an array, then delete the words from the array that are less than 5 characters long. Then where the stdout is at the bottom, I'm intending to use the array. My code currently works, but is very very slow since it has to read the entire file, then individually check each element and delete the appropriate ones. This seems like it's doing too much work.
goal = File.read('big.txt').split(/\s/).map do |word|
word.scan(/[[:alpha:]]+/).uniq
end
goal.each { |word|
if word.length < 5
goal.delete(word)
end
}
puts goal.sample
Is there a way to apply the criteria to my File.read block to keep it from mapping the short words to begin with? I'm open to anything that would help me speed this up.
You might want to change your regex instead so that it only catches words of at least 5 characters to begin with:
goal = File.read('C:\Users\bkuhar\Documents\php\big.txt').split(/\s/).flat_map do |word|
  word.scan(/[[:alpha:]]{5,}/).uniq
end
Further optimization might be to maintain a Set instead of an Array, to avoid re-scanning for uniqueness:
require 'set'
goal = Set.new
File.read('C:\Users\bkuhar\Documents\php\big.txt').scan(/\b[[:alpha:]]{5,}\b/).each do |w|
  goal << w
end
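If reading the whole file into memory is itself a concern, a variation on the same idea is to stream it line by line with File.foreach and filter while scanning. The following is a rough sketch under that assumption; the method name long_words, the min_length parameter, and the temporary demo file are all made up for illustration.

```ruby
require 'set'
require 'tempfile'

# Collects the unique alphabetic words of at least `min_length` characters,
# reading the file one line at a time instead of all at once.
def long_words(path, min_length = 5)
  words = Set.new
  File.foreach(path) do |line|
    line.scan(/[[:alpha:]]{#{min_length},}/) { |w| words << w }
  end
  words
end

# Small demonstration on a throwaway file.
Tempfile.create('words') do |f|
  f.write("The quick brown foxes jumped over lazy dogs\nquick thinking\n")
  f.flush
  puts long_words(f.path).to_a.sample
end
```

Set has no sample method, hence the to_a before sampling.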
In this case, use the delete_if method:
# goal is your array
goal.delete_if { |w| w.length < 5 }
This modifies goal in place, deleting the words shorter than 5 characters, and returns the same array.
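A short sketch of the difference between delete_if and its non-mutating counterpart reject, using a made-up word list:

```ruby
goal = %w[tea bubble of taste lunch]

# reject leaves goal untouched and returns a new array.
kept = goal.reject { |w| w.length < 5 }

# delete_if removes the short words from goal itself and returns goal.
returned = goal.delete_if { |w| w.length < 5 }
```

Unlike calling delete inside each, delete_if is safe to use because the array does the iteration and removal itself.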
Hope this helps.
I really don't understand what a lot of the stuff you are doing in the first loop is for.
You take every chunk of text separated by white space, and map it to a unique value in an array generated by chunking together groups of letter characters, and plug that into an array.
This is way too complicated for what you want. Try this:
goal = File.readlines('big.txt', chomp: true).select do |word|
  word =~ /^[a-zA-Z]+$/ &&
    word.length >= 5
end
This makes it easy to add new conditions, too. If the word can't contain 'q' or 'Q', for example:
goal = File.readlines('big.txt', chomp: true).select do |word|
  word =~ /^[a-zA-Z]+$/ &&
    word.length >= 5 &&
    !word.upcase.include?('Q')
end
This assumes that each word in your dictionary is on its own line. You could go back to splitting it on white space, but it makes me wonder if the file you are reading in is written, human-readable text; a.k.a, it has 'words' ending in periods or commas, like this sentence. In that case, splitting on whitespace will not work.
Another note - map is the wrong array method to use. It transforms each value and builds a new array from the results. You want to select certain values from an array without changing them, and Array#select does exactly that.
Also, feel free to change the regex back to the [[:alpha:]] character class if you are expecting non-ASCII letter characters.
Edit: Second version
goal = File.readlines('big.txt').join(" ").scan(/[a-z][a-z']{4,}/i)
Explanation: Load a file, and join all the lines in the file together with a space. Capture all occurrences of a group of letters, at least 5 long and possibly containing but not starting with a '. String#scan puts all those occurrences into an array.
This works well, and it's only one line for your whole task, but it'll match
sugar'
in
I'd like some 'sugar', if you know what I mean
Like above, if your word can't contain q or Q, you could change the regex to
/([a-pr-z][a-pr-z']{4,})[ .'",]/i
And an idea - do another select on goal, removing all those entries that end with a '. This overcomes the limitations of my Regex
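That follow-up filter might look like the sketch below. The sample sentence and the variable names candidates and goal are made up; the pattern is the letters-and-apostrophes one described above, applied with String#scan.

```ruby
text = "Bring some 'sugar', please, before o'clock strikes"

# Words of at least 5 characters that start with a letter and may contain '.
candidates = text.scan(/[a-z][a-z']{4,}/i)

# Drop the entries that picked up a closing quote, such as "sugar'".
goal = candidates.reject { |w| w.end_with?("'") }
```

If losing the quoted word entirely is not acceptable, stripping the trailing apostrophe with w.delete_suffix("'") (Ruby 2.5+) instead of rejecting the entry is another option.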
I'm trying to write a regex that validates a string and accepts only a series of four comma-separated numbers, each up to 100. Something like this would be valid:
20,30,40,50
and these invalid:
120,0,20,0
20,30,40,ss
invalid_string
Any thoughts?
They're used for CMYK colours. We just need to store them here, not use them.
Number Range and Subroutine
In Ruby 2+, for a compact regex, use this:
^([0-9]|[1-9][0-9]|100)(?:,\g<1>){3}$
Explanation
The ^ anchor asserts that we are at the beginning of the string
The parentheses around ([0-9]|[1-9][0-9]|100) match a number from 0 to 100 and define subroutine #1
(?:,\g<1>) matches one comma and the expression defined by subroutine # 1
The {3} quantifier repeats that three times
The $ anchor asserts that we are at the end of the string
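A quick way to try the pattern against the examples from the question is sketched below. One deliberate change: this version uses \A and \z, which in Ruby anchor the whole string, whereas ^ and $ anchor each line and would let a multi-line string slip through. The constant name CMYK_PATTERN is made up.

```ruby
# Four comma-separated numbers, each 0..100; \g<1> re-runs group 1's pattern.
CMYK_PATTERN = /\A([0-9]|[1-9][0-9]|100)(?:,\g<1>){3}\z/

samples = ["20,30,40,50", "100,0,0,100", "120,0,20,0", "20,30,40,ss", "invalid_string"]
results = samples.to_h { |s| [s, CMYK_PATTERN.match?(s)] }
```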
I'd save myself the headache of using regex for a number related problem. Also the validation message will look awkward so it's better to make your own:
validate :that_string_has_only_4_numbers_upto_100
def that_string_has_only_4_numbers_upto_100
  parts = str.to_s.split(',', -1)
  # Exactly four parts, each made only of digits and between 0 and 100
  valid = parts.size == 4 && parts.all? { |n| n.match?(/\A\d+\z/) && (0..100).cover?(n.to_i) }
  errors.add(:str, 'is not valid.') unless valid
end
Unless you are a regex jedi guru like #zx81 :p.
^(?:(?:100|\d{1,2}),){3}(?:100|\d{1,2})$
Try this
I have a string in Ruby:
str = "<TAG1>Text 1<TAG1>Text 2"
I want to use gsub to get a string like this:
want = "<TAG2>Text 1</TAG2><TAG2>Text 2</TAG2>"
In other words, I want to save everything in between a <TAG1> and EITHER: 1) the next occurrence of a "<", or 2) the end of the string.
The best regex i could come up with was:
regex = /<TAG1>(.*)(?:<|$)/
But the problem with this is that it'll just match the entire str, whereas what I want is both matches within str. In other words, it seems like the end-of-string anchor ($) takes precedence over the "<" character. Is there a way to flip it around?
/<TAG1>([^<]*)/ will match that. If there's no < it'll go all the way to the end of the string. Otherwise it will stop when it hits a <. Your problem is that . matches < as well. An alternative way would be to do /<TAG1>(.*?)(?=<|$)/, which makes the * non-greedy and uses a lookahead so that the < of the next tag is not consumed by the match.
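Put together with gsub, the negated character class could be used roughly as follows. The replacement string with \1 and the lookahead variant are illustrations added here, not part of the question; the variable names are made up.

```ruby
str = "<TAG1>Text 1<TAG1>Text 2"

# Capture everything up to the next "<" (or the end) and wrap it in TAG2.
with_class = str.gsub(/<TAG1>([^<]*)/, '<TAG2>\1</TAG2>')

# Non-greedy variant; the lookahead checks for "<" without consuming it,
# so the following <TAG1> can still be matched.
with_lookahead = str.gsub(/<TAG1>(.*?)(?=<|$)/, '<TAG2>\1</TAG2>')

with_class
# => "<TAG2>Text 1</TAG2><TAG2>Text 2</TAG2>"
```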