How Ruby Interprets $(foo) - ruby

When I read RbConfig::CONFIG['libdir'], it gives me the lib folder location. But in the rbconfig.rb file, CONFIG["libdir"] = "$(exec_prefix)/lib". How is the value interpreted here?

$(exec_prefix) refers to a key in RbConfig::CONFIG.
But that's not a Ruby feature. rbconfig.rb contains code to expand these values: every occurrence of $(key) is replaced with the corresponding value of RbConfig::CONFIG['key']
My rbconfig.rb contains these lines:
CONFIG["prefix"] = (TOPDIR || DESTDIR + "/Users/sos/.rubies/ruby-2.2.2")
CONFIG["exec_prefix"] = "$(prefix)"
CONFIG["libdir"] = "$(exec_prefix)/lib"
And their values are:
RbConfig::CONFIG["prefix"] #=> "/Users/sos/.rubies/ruby-2.2.2"
RbConfig::CONFIG["exec_prefix"] #=> "/Users/sos/.rubies/ruby-2.2.2"
RbConfig::CONFIG["libdir"] #=> "/Users/sos/.rubies/ruby-2.2.2/lib"
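The expansion described above can be sketched in a few lines. This is only an illustration of the idea, not the code shipped in rbconfig.rb: expand_refs is a hypothetical helper, and treating unknown keys as empty strings is an assumption of this sketch.

```ruby
# Hypothetical helper, not the code shipped in rbconfig.rb.
# Replaces each $(key) with the (recursively expanded) value of config[key].
def expand_refs(value, config)
  value.gsub(/\$\(([^()]+)\)/) do
    key = Regexp.last_match(1)
    # Assumption: unknown keys expand to an empty string in this sketch.
    expand_refs(config.fetch(key, ""), config)
  end
end

config = {
  "prefix"      => "/Users/sos/.rubies/ruby-2.2.2",
  "exec_prefix" => "$(prefix)",
  "libdir"      => "$(exec_prefix)/lib"
}

puts expand_refs(config["libdir"], config)
# prints /Users/sos/.rubies/ruby-2.2.2/lib
```

The recursion is what lets libdir resolve through exec_prefix down to prefix.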

Related

Sort two text files with its indented text aligned to it

I would like to compare two log files, generated before and after an implementation, to see whether it has impacted anything. However, the order of the log entries is not always the same. The log file also has multiple indented lines, so when I tried to sort it, every line was sorted individually. I would like to keep each child together with its parent. The indentation uses spaces, not tabs.
Any help would be greatly appreciated. I am fine with either a Windows or a Linux solution.
Example of the file:
#This is a sample code
Parent1 to be verified
Child1 to be verified
Child2 to be verified
Child21 to be verified
Child23 to be verified
Child22 to be verified
Child221 to be verified
Child4 to be verified
Child5 to be verified
Child53 to be verified
Child52 to be verified
Child522 to be verified
Child521 to be verified
Child3 to be verified
I am posting another answer here to sort it hierarchically, using Python.
The idea is to attach the parents to the children, so that the children under the same parent are sorted together.
See the Python script below:
"""Attach parent to children in an indentation-structured text"""
from typing import Tuple, List
import sys

# A unique separator to separate the parent and child in each line
SEPARATOR = '#'
# The indentation
INDENT = ' '


def parse_line(line: str) -> Tuple[int, str]:
    """Parse a line into indentation level and its content
    with indentation stripped

    Args:
        line (str): One of the lines from the input file, with newline ending

    Returns:
        Tuple[int, str]: The indentation level and the content with
            indentation stripped.

    Raises:
        ValueError: If the line is incorrectly indented.
    """
    # strip the leading white spaces
    lstripped_line = line.lstrip()
    # get the indentation
    indent = line[:-len(lstripped_line)]
    # Let's check if the indentation is correct
    # meaning it should be N * INDENT
    n = len(indent) // len(INDENT)
    if INDENT * n != indent:
        raise ValueError(f"Wrong indentation of line: {line}")
    return n, lstripped_line.rstrip('\r\n')


def format_text(txtfile: str) -> List[str]:
    """Format the text file by attaching the parent to it children

    Args:
        txtfile (str): The text file

    Returns:
        List[str]: A list of formatted lines
    """
    formatted = []
    par_indent = par_line = None
    with open(txtfile) as ftxt:
        for line in ftxt:
            # get the indentation level and line without indentation
            indent, line_noindent = parse_line(line)
            # level 1 parents
            if indent == 0:
                par_indent = indent
                par_line = line_noindent
                formatted.append(line_noindent)
            # children
            elif indent > par_indent:
                formatted.append(par_line +
                                 SEPARATOR * (indent - par_indent) +
                                 line_noindent)
                par_indent = indent
                par_line = par_line + SEPARATOR + line_noindent
            # siblings or dedentation
            else:
                # We just need first `indent` parts of parent line as our prefix
                prefix = SEPARATOR.join(par_line.split(SEPARATOR)[:indent])
                formatted.append(prefix + SEPARATOR + line_noindent)
                par_indent = indent
                par_line = prefix + SEPARATOR + line_noindent
    return formatted


def sort_and_revert(lines: List[str]):
    """Sort the formatted lines and revert the leading parents
    into indentations

    Args:
        lines (List[str]): list of formatted lines

    Prints:
        The sorted and reverted lines
    """
    sorted_lines = sorted(lines)
    for line in sorted_lines:
        if SEPARATOR not in line:
            print(line)
        else:
            leading, _, orig_line = line.rpartition(SEPARATOR)
            print(INDENT * (leading.count(SEPARATOR) + 1) + orig_line)


def main():
    """Main entry"""
    if len(sys.argv) < 2:
        print(f"Usage: {sys.argv[0]} <file>")
        sys.exit(1)
    formatted = format_text(sys.argv[1])
    sort_and_revert(formatted)


if __name__ == "__main__":
    main()
Let's save it as format.py, and we have a test file, say test.txt:
parent2
 child2-1
  child2-1-1
 child2-2
parent1
 child1-2
  child1-2-2
  child1-2-1
 child1-1
Let's test it:
$ python format.py test.txt
parent1
 child1-1
 child1-2
  child1-2-1
  child1-2-2
parent2
 child2-1
  child2-1-1
 child2-2
If you wonder how the format_text function formats the text, here are the intermediate results, which also explain why the file ends up sorted the way we want:
parent2
parent2#child2-1
parent2#child2-1#child2-1-1
parent2#child2-2
parent1
parent1#child1-2
parent1#child1-2#child1-2-2
parent1#child1-2#child1-2-1
parent1#child1-1
You can see that each child has its parents attached, all the way up to the root, so the children under the same parent are sorted together.
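The same parent-attaching idea can also be sketched in Ruby. This is a minimal illustration under stated assumptions, not a port of the script above: sort_outline is a hypothetical helper, it assumes one space per indentation level, and it keeps each line's ancestors in an array rather than joining them with a separator, so no unique separator character is needed.

```ruby
# Assumption: one space per indentation level; adjust INDENT to match the file.
INDENT = " "

# Hypothetical helper: returns the outline sorted level by level,
# with every child kept under its own parent.
def sort_outline(text)
  stack = [] # chain of ancestors for the line being processed
  paths = text.each_line.map(&:chomp).reject(&:empty?).map do |line|
    body = line.lstrip
    depth = (line.length - body.length) / INDENT.length
    # Keep only the ancestors above this depth, then add the line itself.
    stack = stack.first(depth) + [body]
    stack
  end
  # Arrays compare element by element, so a parent sorts before its children.
  paths.sort.map { |path| INDENT * (path.length - 1) + path.last }.join("\n")
end

sample = "parent2\n child2-1\n  child2-1-1\n child2-2\n" \
         "parent1\n child1-2\n  child1-2-2\n  child1-2-1\n child1-1\n"
puts sort_outline(sample)
```

Sorting arrays of ancestors plays the role of sorting the separator-joined strings in the intermediate results above.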
Short answer (Linux solution):
sed ':a;N;$!ba;s/\n /#/g' test.txt | sort | sed ':a;N;$!ba;s/#/\n /g'
Test it out:
test.txt
parent2
 child2-1
  child2-1-1
 child2-2
parent1
 child1-1
 child1-2
  child1-2-1
$ sed ':a;N;$!ba;s/\n /#/g' test.txt | sort | sed ':a;N;$!ba;s/#/\n /g'
parent1
 child1-1
 child1-2
  child1-2-1
parent2
 child2-1
  child2-1-1
 child2-2
Explanation:
The idea is to replace each newline that is followed by an indentation space with a non-newline character. That character has to be unique in your file (I used # here; if it already occurs in your file, use another character or even a string), because it must be turned back into a newline and indentation afterwards.
About the sed command:
:a create a label 'a'
N append the next line to the pattern space
$!ba if not on the last line ($!), branch (go to) label 'a' (ba)
s substitute
/\n / regex for a newline followed by a space
/#/ the unique character that replaces the newline and space
/g global match (replace every occurrence)

How to decode IFC using Ruby

In Ruby, I'm reading an .ifc file to get some information, but I can't decode it. For example, the file content:
"'S\X2\00E9\X0\jour/Cuisine'"
should be:
"'Séjour/Cuisine'"
I'm trying to encode it with:
puts ifcFileLine.encode("Windows-1252")
puts ifcFileLine.encode("ISO-8859-1")
puts ifcFileLine.encode("ISO-8859-5")
puts ifcFileLine.encode("iso-8859-1").force_encoding("utf-8")
But nothing gives me what I need.
I don't know anything about IFC, but based solely on the page Denis linked to and your example input, this works:
ESCAPE_SEQUENCE_EXPR = /\\X2\\(.*?)\\X0\\/

def decode_ifc(str)
  str.gsub(ESCAPE_SEQUENCE_EXPR) do
    $1.gsub(/..../) { $&.to_i(16).chr(Encoding::UTF_8) }
  end
end

str = 'S\X2\00E9\X0\jour/Cuisine'
puts "Input:", str
puts "Output:", decode_ifc(str)
All this code does is replace every sequence of four characters (/..../) between the delimiters, which will each be a Unicode code point in hexadecimal, with the corresponding Unicode character.
Note that this code handles only this specific encoding. A quick glance at the implementation guide shows other encodings, including an \X4 directive for Unicode characters outside the Basic Multilingual Plane. This ought to get you started, though.
See it on eval.in: https://eval.in/776980
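As a rough sketch of how the \X4 directive mentioned above might be handled alongside \X2, the version below assumes that \X4\ is followed by groups of eight hexadecimal digits and is closed by \X0\, just as \X2\ uses groups of four. That layout is my reading of the directive, so check it against the implementation guide. The names decode_hex_groups and decode_ifc_extended are hypothetical.

```ruby
# Assumption: \X2\ carries 4-hex-digit groups and \X4\ carries 8-hex-digit
# groups, each terminated by \X0\.
X2_EXPR = /\\X2\\((?:\h{4})+)\\X0\\/
X4_EXPR = /\\X4\\((?:\h{8})+)\\X0\\/

# Hypothetical helper: turns a run of fixed-width hex groups into characters.
def decode_hex_groups(hex, width)
  hex.scan(/\h{#{width}}/).map { |h| h.to_i(16).chr(Encoding::UTF_8) }.join
end

# Hypothetical helper: decodes both directives in one pass each.
def decode_ifc_extended(str)
  str.gsub(X2_EXPR) { decode_hex_groups(Regexp.last_match(1), 4) }
     .gsub(X4_EXPR) { decode_hex_groups(Regexp.last_match(1), 8) }
end

puts decode_ifc_extended('S\X2\00E9\X0\jour/Cuisine')
```

Unlike /..../, the \h class only matches hexadecimal digits, so malformed sequences are left untouched instead of being converted.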
If anyone is interested, here is some Python code I wrote that decodes three of the IFC encodings: \X\, \X2\ and \S\
import re


def decodeIfc(txt):
    # In regex "\" is hard to manage in Python... I use this workaround
    txt = txt.replace('\\', 'µµµ')
    txt = re.sub('µµµX2µµµ([0-9A-F]{4,})+µµµX0µµµ', decodeIfcX2, txt)
    txt = re.sub('µµµSµµµ(.)', decodeIfcS, txt)
    txt = re.sub('µµµXµµµ([0-9A-F]{2})', decodeIfcX, txt)
    txt = txt.replace('µµµ', '\\')
    return txt


def decodeIfcX2(match):
    # X2 encodes characters as groups of 4 hexadecimal digits.
    groups = re.findall('([0-9A-F]{4})', match.group(1))
    return ''.join(chr(int(x, 16)) for x in groups)


def decodeIfcS(match):
    return chr(ord(match.group(1)) + 128)


def decodeIfcX(match):
    # Sometimes IFC files were made on an old Mac, which uses MacRoman encoding.
    num = int(match.group(1), 16)
    if num <= 127 or num >= 160:
        return chr(num)
    else:
        return bytes.fromhex(match.group(1)).decode("macroman")

Ruby replace array list

I have two strings:
packages="linux-image-3.2.0-4-amd64 linux-libc-dev linux-headers-3.2.0-4-amd64 linux-headers-3.2.0-4-common dnsutils mysql-server-5.5"
exclusion="dnsutils mysql-server-5.5"
I need a string pkgs that has the content of packages without exclusion like this:
pkgs="linux-image-3.2.0-4-amd64 linux-libc-dev linux-headers-3.2.0-4-amd64 linux-headers-3.2.0-4-common"
I tried the following code:
pkgs = packages.gsub!( /(?<!^|,)#{exclusion}(?!,|$)/, '\1')
which does not seem to be working. What would be the best working solution in this case?
packages="linux-image-3.2.0-4-amd64 linux-libc-dev linux-headers-3.2.0-4-amd64 linux-headers-3.2.0-4-common dnsutils mysql-server-5.5"
exclusion="dnsutils mysql-server-5.5"
(packages.split - exclusion.split).join(" ") # => "linux-image-3.2.0-4-amd64 linux-libc-dev linux-headers-3.2.0-4-amd64 linux-headers-3.2.0-4-common"
You need your variables to be arrays, not strings. Then you can just use the - operator to "subtract" the items in exclusion from packages:
packages = [ "linux-image-3.2.0-4-amd64",
             "linux-libc-dev",
             "linux-headers-3.2.0-4-amd64",
             "linux-headers-3.2.0-4-common",
             "dnsutils",
             "mysql-server-5.5" ]
exclusion = [ "dnsutils", "mysql-server-5.5" ]
remaining = packages - exclusion
# => [ "linux-image-3.2.0-4-amd64",
#      "linux-libc-dev",
#      "linux-headers-3.2.0-4-amd64",
#      "linux-headers-3.2.0-4-common" ]
If you then need the values in a single string, join them together with the join method:
remaining_str = remaining.join(" ")
# => "linux-image-3.2.0-4-amd64 linux-libc-dev linux-headers-3.2.0-4-amd64 linux-headers-3.2.0-4-common"
If you want to keep it simple, you can always split these strings into arrays and join the difference.
(packages.split - exclusion.split).join ' '
String's split method splits on whitespace by default. This gives you two arrays; subtracting the second from the first removes every value of the first array that also appears in the second. You then join the resulting array with space characters.
Longer example:
packages="linux-image-3.2.0-4-amd64 linux-libc-dev linux-headers-3.2.0-4-amd64 linux-headers-3.2.0-4-common dnsutils mysql-server-5.5"
exclusion="dnsutils mysql-server-5.5"
one = packages.split
# >> ["linux-image-3.2.0-4-amd64", "linux-libc-dev", "linux-headers-3.2.0-4-amd64", "linux-headers-3.2.0-4-common", "dnsutils", "mysql-server-5.5"]
two = exclusion.split
# >> ["dnsutils", "mysql-server-5.5"]
difference = one - two
# >> ["linux-image-3.2.0-4-amd64", "linux-libc-dev", "linux-headers-3.2.0-4-amd64", "linux-headers-3.2.0-4-common"]
finished = difference.join ' '
# >> "linux-image-3.2.0-4-amd64 linux-libc-dev linux-headers-3.2.0-4-amd64 linux-headers-3.2.0-4-common"

Handling Multiline Cells in Ruby CSV

require 'csv'
input = CSV.read("test_first.csv", :encoding => 'ascii')[1 .. -1]
DOC = "test_final.csv"
profile = []
profile[0] = "Multiline"
profile[1] = "Standard"
CSV.open(DOC, mode = 'w', :force_quotes => true) do |me|
  me << profile
end
a = 0
b = input.length
while a < b
  temp = []
  temp = input[a]
  profile = []
  profile[0] = ' A text string with embedded newlines
as well as some substitution from the source file: '"#{temp[0]}"''
  profile[1] = temp[1]
  CSV.open(DOC, mode = "a", :force_quotes => true ) do |me|
    me << profile
  end
  a += 1
end
The resulting file test_final.csv retains the desired newlines within the cells; however, force_quotes isn't working as expected, and the embedded newlines aren't being escaped. So when test_final.csv is viewed in a text editor, it has more lines than intended, because each newline is treated as a new row.
I tried appending its own unique row separator to the offending column:
profile[0] = ' A text string with embedded newlines
as well as some substitution from the source file: '"#{temp[0]}"'~'
and assigning it in the options hash with :row_sep => "~", but this didn't seem to work.
Some clarification as to the desired input and output
Desired Input:
test_first.csv
1.testHeader1,testheader2
2.sub1,standard1
3.sub2,standard2
Desired Output:
test_final.csv
1.Multiline,Standard
2. A text string with embedded newlines
as well as some substitution from the source file: sub1,standard1
3. A text string with embedded newlines
as well as some substitution from the source file: sub2,standard2
What I'm getting right now instead is:
test_fail.csv
1.Multiline,Standard
2. A text string with embedded newlines
3. as well as some substitution from the source file: sub1,standard1
4. A text string with embedded newlines
5. as well as some substitution from the source file: sub2,standard2
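To make the difference between physical lines and CSV rows concrete, here is a minimal sketch. It assumes the goal is one logical row per record; build_csv is a hypothetical helper that builds the output in memory with CSV.generate instead of writing test_final.csv. A quoted field may contain raw newlines, so a text editor shows more lines than there are rows, while a CSV parser still reads one row per record.

```ruby
require "csv"

# Hypothetical helper: builds the output in memory instead of writing a file.
def build_csv(rows)
  CSV.generate(force_quotes: true) do |csv|
    csv << ["Multiline", "Standard"]
    rows.each do |sub, standard|
      text = "A text string with embedded newlines\n" \
             "as well as some substitution from the source file: #{sub}"
      csv << [text, standard]
    end
  end
end

out = build_csv([%w[sub1 standard1], %w[sub2 standard2]])
puts out.lines.length        # physical lines, as a text editor counts them
puts CSV.parse(out).length   # logical rows, as a CSV parser counts them
```

With the two sample records this prints 5 and then 3: each data row spans two physical lines but is still a single row when parsed.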

Is there a SnakeYaml DumperOptions setting to avoid double-spacing output?

I seem to see double-spaced output when parsing/dumping a simple YAML file with a pipe-text field.
The test is:
public void yamlTest()
{
    DumperOptions printOptions = new DumperOptions();
    printOptions.setLineBreak(DumperOptions.LineBreak.UNIX);
    Yaml y = new Yaml(printOptions);
    String input = "foo: |\n" +
                   "  line 1\n" +
                   "  line 2\n";
    Object parsedObject = y.load(new StringReader(input));
    String output = y.dump(parsedObject);
    System.out.println(output);
}
and the output is:
{foo: 'line 1
line 2
'}
Note the extra space between line 1 and line 2, and after line 2 before the end of the string.
This test was run on Mac OS X 10.6, java version "1.6.0_29".
Thanks!
Mark
In the original string you use literal style, which is indicated by the '|' character. When you dump your text, single-quoted style is used. In that style a single line break is folded into a space, so each '\n' has to be written as an extra empty line. That is why the output looks double-spaced.
Try setting a different style in DumperOptions:
// and others - FOLDED, DOUBLE_QUOTED
printOptions.setDefaultScalarStyle(DumperOptions.ScalarStyle.LITERAL);
