How can I strip tab characters from a string in Ruby? - ruby

I have a program that loads some tab-separated lines into a MySQL table. One of the values has tabs in it, which is causing some problems. The data is created column by column, so I need to find a way to strip the tab character out of an individual field with gsub. I do not, however, want to get rid of anything else, like spaces.

It's really easy: \t is the tab character.
result = string.gsub /\t/, ''
or, in-place
string.gsub! /\t/, ''

\t is the escape sequence for a tab inside double-quoted strings. So you can just search for "\t" and replace it with a space or anything else you like.
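As a minimal sketch of both suggestions, the helpers below remove tabs from a single field or swap them for spaces. The method names and the sample field are hypothetical, chosen only for illustration; String#delete and String#tr are used here as regex-free alternatives to the gsub calls above.

```ruby
# Hypothetical helper: drop every tab from one field, leaving spaces alone.
def strip_tabs(field)
  field.delete("\t")
end

# Hypothetical variant: turn each tab into a single space instead.
def tabs_to_spaces(field)
  field.tr("\t", " ")
end

p strip_tabs("red\tgreen blue\t")    # prints "redgreen blue"
p tabs_to_spaces("red\tgreen blue")  # prints "red green blue"
```

Both helpers return a new string, so the original field is left unchanged.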

Related

Splitting clipboard import in ABAP

I'm using the CLPB_IMPORT function module to read the clipboard into an internal table, and that part works. I'm copying two columns of Excel data, so it fills the table with what looks like a '#' delimiter, like this:
4448#3000
4449#4000
4441#5000
The problem is splitting these strings. I'm trying:
LOOP AT foytab.
SPLIT foytab-tab AT '#' INTO temp1 temp2.
ENDLOOP.
But it doesn't split; it puts the whole line into temp1. I think the delimiter is not what I thought ('#'), because when I write a string manually with '#' as the delimiter, it splits.
Do you have any idea how to split this?
You should not use CLPB_IMPORT since it's explicitly marked as obsolete. Use CL_GUI_FRONTEND_SERVICES=>CLIPBOARD_IMPORT instead.
The data is probably not separated by # but by a tab character. You can check this in the hex view of the debugger. # is just a replacement symbol the UI uses for any unprintable character. If the delimiter is the tab character, you can use the constant CL_ABAP_CHAR_UTILITIES=>HORIZONTAL_TAB.

select only non space and word characters in ruby from a string

I'm coding my first program in Ruby.
I want to select all characters from a string except whitespace and non-word characters, so that I can compare the result to another string.
I know I can select non-whitespace characters by using \S on my string, and I can select word characters by using \w, but I can't find anywhere how to combine those two to select only non-whitespace word characters.
Every word character is also a non-whitespace character, so \w alone will suffice.
This online Ruby regex simulator can also be of help if you have a block of text to try.
http://rubular.com/
Also, Ruby supports $1, $2, $3, etc., like Perl and other languages.
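A small sketch of the \w approach, assuming the goal is to compare two strings while ignoring spaces and punctuation. The helper name and sample strings are hypothetical.

```ruby
# Hypothetical helper: keep only word characters (letters, digits, underscore).
# Whitespace is never matched by \w, so no separate \S check is needed.
def word_chars(text)
  text.scan(/\w/).join
end

p word_chars("Hello, World! 42")                  # prints "HelloWorld42"
p word_chars("foo bar") == word_chars("foo-bar")  # prints true
```

An equivalent form is text.gsub(/\W/, ""), which deletes everything that is not a word character.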

Bash for truncation

I have to make changes to a document in which there are two columns separated by a tab (\t) and each record is separated by a newline (\n). The entries in the document look like this:
/something/random/2345.txt
My aim is to remove the rest of the string and keep just the number, 2345 in this case. I used
sed 's/something/random//g' file.csv
but I do not know how to escape the /, because sed syntax uses / too. Also, not all records have the same words, so I would be looking for a regex of the type
/*/*.*
But each entry has a number as a part of the record and I would like to extract that.
Also there are a few records which do not contain any number, I would like to delete those records along with the corresponding entry in the next column for that record.
The file is in CSV format.
You can escape the forward slash with a backslash, or you can use a different character than forward slash to delimit your expression. Observe:
echo foobar | sed sIfooIcrowI
> crowbar
Of course, you probably shouldn't use an alphabetic character for the delimiter. I'm just using it here to make the point that pretty much any normal character can be substituted for the slash.
You could just remove all non-digit characters from the beginning of each line:
sed 's/[^0-9]*\(.*\)[\t]*/\1/g'

In Ruby, what's the easiest way to "chomp" at the start of a string instead of the end?

In Ruby, sometimes I need to remove the new line character at the beginning of a string. Currently what I did is like the following. I want to know the best way to do this. Thanks.
s = "\naaaa\nbbbb"
s.sub!(/^\n?/, "")
lstrip seems to be what you want (assuming trailing white space should be kept):
>> s = "\naaaa\nbbbb" #=> "\naaaa\nbbbb"
>> s.lstrip #=> "aaaa\nbbbb"
From the docs:
Returns a copy of str with leading whitespace removed. See also
String#rstrip and String#strip.
http://ruby-doc.org/core-1.9.3/String.html#method-i-lstrip
strip will remove all leading and trailing whitespace:
s = "\naaaa\nbbbb"
s.strip!
Little hack to chomp a leading newline:
str = "\nmy string"
chomped_str = str.reverse.chomp.reverse
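To be perfectly accurate, chomp can delete not only a trailing newline but also an arbitrary trailing string passed as an argument.
If the latter functionality is sought at the start of a string, one can use (Ruby 2.5 or later):
"\naaaa\nbbbb".delete_prefix("\n")
Unlike strip, this works for arbitrary strings, exactly like chomp. Note the double quotes: in a single-quoted string, \n is a literal backslash followed by n, so there would be no newline to remove.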
So, just for a bit of clarification, there are three ways that you can go about this: sub, reverse.chomp.reverse and lstrip.
I'd recommend against sub because it's a bit less readable, and because it builds a new string from your old one. Plus you need a regular expression for something that's fairly simple.
So then you're down to reverse.chomp.reverse and lstrip. Most likely, you want lstrip because it's a bit faster, but keep in mind that the strip operations are not the same as the chomp operations. lstrip will remove all leading newlines and whitespace:
"\n aaa\nbbb".reverse.chomp.reverse # => " aaa\nbbb"
"\n aaa\nbbb".lstrip # => "aaa\nbbb"
If you want to make sure you only remove one character and that it's definitely a newline, use the reverse.chomp.reverse solution. If you consider all leading newlines and whitespace garbage, go with lstrip.
The one case I can think of for using regular expressions would be if you have an unknown number of \r and \n characters at the beginning and want to trim them all while leaving other whitespace untouched. You could use a loop and more String methods for trimming, but it would just be uglier. The performance implications don't really matter that much.
s.sub(/^[\n\r]*/, '')
This removes leading newlines (carriage returns and line feeds, as in chomp), not any whitespace.
Not sure if it's the best way but you could try:
s.reverse.chomp.reverse
if you want to leave the trailing newline (if it exists).
This should work for you: s.strip.
A way to do this for whitespace or non-whitespace characters is like this:
s = "\naaaa\nbbbb"
s.slice!("\n") # returns "\n"; s now has its first "\n" removed (wherever it occurs, not only at the start)
puts s # shows s has the first newline removed
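To compare the approaches discussed in this thread side by side, here is a rough sketch. The helper names and the sample string are hypothetical; delete_prefix assumes Ruby 2.5 or later, and the regex uses \A (start of string) rather than ^ (start of any line) to make the intent explicit.

```ruby
# Remove exactly one leading newline, if present (Ruby 2.5+).
def drop_one_leading_newline(s)
  s.delete_prefix("\n")
end

# Remove every leading \r and \n, but keep other leading whitespace.
def drop_leading_newlines(s)
  s.sub(/\A[\r\n]+/, "")
end

# Remove all leading whitespace, newlines included.
def drop_leading_whitespace(s)
  s.lstrip
end

sample = "\n\n  aaaa\nbbbb"
p drop_one_leading_newline(sample)  # prints "\n  aaaa\nbbbb"
p drop_leading_newlines(sample)     # prints "  aaaa\nbbbb"
p drop_leading_whitespace(sample)   # prints "aaaa\nbbbb"
```

All three return a new string and leave newlines inside the string untouched, so the choice comes down to how much leading material counts as garbage.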

String#split in Ruby not behaving as expected

File.open(path, 'r').each do |line|
row = line.chomp.split('\t')
puts "#{row[0]}"
end
path is the path of a file with content like name, age, profession, hobby
I'm expecting the output to be the name only, but I am getting the whole line.
Why is that?
The question already has an accepted answer, but it's worth noting what the cause of the original problem was:
This is the problem part:
split('\t')
Ruby has several forms of quoted strings, and the differences between them are usually useful ones.
Quoting from Ruby Programming at wikibooks.org:
...double quotes are designed to interpret escaped characters such as new lines and tabs so that they appear as actual new lines and tabs when the string is rendered for the user. Single quotes, however, display the actual escape sequence, for example displaying \n instead of a new line.
Read further in the linked article to see the use of %q and %Q strings. Or Google for "ruby string delimiters", or see this SO question.
So '\t' is interpreted as "backslash+t", whereas "\t" is a tab character.
String#split will also take a Regexp, which in this case might remove the ambiguity:
split(/\t/)
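The difference can be checked directly. The sketch below assumes a tab-separated line like the one in the question; the helper name and the sample data are hypothetical.

```ruby
# Hypothetical helper: first field of a line for a given delimiter.
def first_field(line, delimiter)
  line.chomp.split(delimiter).first
end

line = "name\tage\tprofession\thobby\n"

p '\t'.length               # prints 2 (a backslash and the letter t)
p "\t".length               # prints 1 (a real tab character)

p first_field(line, '\t')   # prints "name\tage\tprofession\thobby"
p first_field(line, "\t")   # prints "name"
p first_field(line, /\t/)   # prints "name"
```

With the single-quoted delimiter, split looks for a literal backslash followed by t, finds none, and returns the whole line as the only element.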
Your question was not very clear.
split("\n") - if you want to split by lines
split - if you want to split by whitespace
And as far as I understand, you do not need chomp, because split without arguments already discards the trailing "\n".
