updating a yaml file with ruby truncating spaces and adding dash - ruby

I have a YAML file which contains the following data:
:books:
    :action1:
        :book_name: name1
        :book_author: author1
        :publish_date: 2009
    :action2:
        :book_name: name2
        :book_author: author2
        :publish_date: 2016
I am trying to update some of the values in this YAML file using a simple Ruby snippet, as below:
test = YAML::load_file('books_details.yml')
test[:books][:action1][:book_name] = "book x"
test[:books][:action1][:book_author] = "author y"
test[:books][:action1][:publish_date] = "2019"
File.open('books_details.yml', 'w') { |f| YAML.dump(test, f) }
This works, but I get the following output:
---
:books:
  :action1:
    :book_name: book x
    :book_author: author y
    :publish_date: 2019
  :action2:
    :book_name: name2
    :book_author: author2
    :publish_date: 2016
A --- is added to the top of the file and the original indentation is truncated.
Is there any other library I could use that would not remove the extra spaces or append --- to the file?
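If staying with the standard Psych library is acceptable, one workaround is to post-process the string returned by YAML.dump before writing it back: strip the leading document marker and widen the indentation. This is only a rough sketch; it assumes the original file used four-space indentation and that doubling Psych's two-space indent is what is wanted.

require 'yaml'

test = YAML.load_file('books_details.yml')
test[:books][:action1][:book_name]    = 'book x'
test[:books][:action1][:book_author]  = 'author y'
test[:books][:action1][:publish_date] = '2019'

dumped = YAML.dump(test)
dumped = dumped.sub(/\A---\n/, '')                  # drop the leading "---" document marker
dumped = dumped.gsub(/^ +/) { |spaces| spaces * 2 } # double Psych's 2-space indent to 4 spaces
File.write('books_details.yml', dumped)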

Related

Keep indentation for array value in parsed YAML frontmatter

I am using https://github.com/waiting-for-dev/front_matter_parser to parse and update values in the frontmatter of my markdown files.
The following code removes the original indentation of two spaces for array values:
require 'front_matter_parser'
require 'yaml'

class FrontMatterUpdater
  def self.run(path)
    parsed = FrontMatterParser::Parser.parse_file(path)
    front_matter = parsed.front_matter
    front_matter['redirect_from'] = Array(front_matter['redirect_from'])
    front_matter['redirect_from'] << 'new_entry2'
    front_matter['redirect_from'].uniq!
    new_content = [YAML.dump(front_matter), '---', "\n\n", parsed.content].join
    File.write(path, new_content)
  end
end

FrontMatterUpdater.run('test.md')
Content of my test.md file (same path):
---
redirect_from:
  - new_entry0
  - new_entry1
---
Script result (indentation has been removed by the parser):
---
redirect_from:
- new_entry0
- new_entry1
- new_entry2
---
YAML indentation for array in hash confirms that both versions (indented and not) are valid YAML syntax.
But I'd love to keep the indentation for readability.
Do you see any option to keep the indentation of 2 spaces for the array values?
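One option, since Psych itself has no setting for indenting block-sequence items, is to post-process the string produced by YAML.dump and prefix every sequence item with two extra spaces before the file is written back. This is a rough sketch built around the snippet above, not part of the front_matter_parser API:

require 'front_matter_parser'
require 'yaml'

class FrontMatterUpdater
  def self.run(path)
    parsed = FrontMatterParser::Parser.parse_file(path)
    front_matter = parsed.front_matter
    front_matter['redirect_from'] = Array(front_matter['redirect_from'])
    front_matter['redirect_from'] << 'new_entry2'
    front_matter['redirect_from'].uniq!

    yaml = YAML.dump(front_matter)
    yaml = yaml.gsub(/^(\s*)- /) { "#{$1}  - " } # re-indent every "- item" by two extra spaces
    File.write(path, [yaml, '---', "\n\n", parsed.content].join)
  end
end

Note that the gsub re-indents every block-sequence item, which is fine for flat front matter like this but would over-indent deeply nested lists.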

How to export pdf table data into csv?

I am using Rails 4.2, Ruby 2.2 and the 'pdf-reader' gem.
My application reads a PDF file that contains table data and exports it to CSV, which I have already done. But when I match the result, the table header and the table content end up in the wrong positions. This is because a PDF table is not an actual table, so some extra logic is needed, and that extra logic is what I am asking about.
marks.pdf has content similar to what is shown below:
School Name: ABC
Program: MicroBiology    Year: Second

| Roll No | Math |
|---------|------|
| 1000001 | 65   |
Any help would be appreciated.
The working code that reads the PDF and exports it to CSV is given below:
require 'pdf-reader'
require 'csv'

class ExportToCsv
  # method useful to export pdf to csv
  def convert_to_csv
    pdf_reader = PDF::Reader.new("public/marks.pdf")
    csv = CSV.open("output100.tsv", "wb", { :col_sep => "\t" })
    data_header = ""
    pdf_reader.pages.each do |page|
      page.text.each_line do |line|
        if /^[a-z|\s]*$/i =~ line
          # line with characters only: remember it as the current header
          data_header = line.strip
        else
          # line containing numbers: build a data row
          data_row = line.split(/[0-9]/).first
          csv_line = line.sub(data_row, '').strip.split(/[\(|\)]/)
          csv_line.unshift(data_row).unshift(data_header)
          csv << csv_line
        end
      end
    end
  end
end
I am not able to attach the original PDF here because of security, sorry for that. You can generate a PDF as per the screenshot below.
The screenshot of the PDF is given below:
The screenshot of the generated CSV is given below:
The desired output should look like the image below:
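As a rough sketch of the kind of extra logic that could keep the header and the values aligned: it assumes that the columns in the text extracted by pdf-reader are separated by pipes or by runs of two or more spaces, and it reuses the file names from the snippet above.

require 'pdf-reader'
require 'csv'

reader = PDF::Reader.new('public/marks.pdf')

CSV.open('output100.tsv', 'wb', col_sep: "\t") do |csv|
  reader.pages.each do |page|
    headers = nil
    page.text.each_line do |line|
      # split on column gaps (two or more spaces) or pipe separators
      cells = line.strip.split(/\s{2,}|\s*\|\s*/).reject(&:empty?)
      next if cells.empty?

      if cells.first =~ /Roll No/i                # the table header row
        headers = cells
        csv << headers
      elsif headers && cells.first =~ /\A\d+\z/   # a data row starting with a roll number
        csv << cells
      end
    end
  end
end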

When exporting XLIFF from Xcode, how to exclude dummy strings?

I'm using Xcode's Editor > Export For Localization... to export an XLIFF file for translation,
but the translations for Main.storyboard include a lot of unnecessary strings, mostly placeholders/dummies that are only useful at design time.
How do I exclude such strings from the XLIFF file?
I've written a script that excludes certain translations.
How it works:
cmd-line: python strip_loc.py input.xliff output.xliff exclude_list.txt [-v]
Example usage: python strip_loc.py en.xliff en-stripped.xliff exclude_words.txt -v
exclude_list.txt is a file with one string per line. The script parses this list and builds a list of banned words. If a translation whose source contains one of these strings is encountered, the whole translation unit is removed from the output xml/xliff.
Here is the solution, which works with the latest Python version:
import argparse
import xml.etree.ElementTree as ET


def log(string_to_log):
    if args.verbose:
        print(string_to_log)


parser = argparse.ArgumentParser(
    description="Process xliff file against banned words and output new xliff with stripped translation.",
    epilog="Example usage: strip_loc.py en.xliff en-stripped.xliff exclude_words.txt -v")
parser.add_argument('source', help="Input .xliff file containing all the strings")
parser.add_argument('output', help="Output .xliff file which will contain the stripped strings according to the exclude_list")
parser.add_argument('exclude_list', help="Multi-line text file where every line is a banned string")
parser.add_argument('-v', '--verbose', action="store_true", help="print script steps while working")
args = parser.parse_args()

banned_words = [line.strip().lower() for line in open(args.exclude_list, 'r')]

log("original file: " + args.source)
log("output file: " + args.output)
log("banned words: " + ", ".join(banned_words))
log("")

ET.register_namespace('', "urn:oasis:names:tc:xliff:document:1.2")
ns = {"n": "urn:oasis:names:tc:xliff:document:1.2"}

with open(args.source, 'r') as xml_file:
    tree = ET.parse(xml_file)
    root = tree.getroot()

counter = 1
for file_body in root.findall("./*/n:body", ns):
    for trans_unit in file_body.findall("n:trans-unit", ns):
        source = trans_unit.find("n:source", ns)
        if source.text is not None:
            source = source.text.encode("utf-8").lower()
            source = source.decode("utf-8")
            source = source.strip()
            for banned_word in banned_words:
                if source.find(banned_word) != -1:
                    log(str(counter) + ": removing <trans-unit id=\"" + trans_unit.attrib['id'] + "\">, banned: \"" + banned_word + "\"")
                    file_body.remove(trans_unit)
                    break
        counter += 1

tree.write(args.output, "utf-8", True)

log("")
print("DONE")
And the usage is the same:
python strip_loc.py en.xliff en-stripped.xliff exclude_words.txt -v
For me, I use this XLIFF Online Editor to edit the xliff file. It makes it easy to ignore the dummy text or anything else you need.

How to write some value to a text file in ruby based on position

I need some help with a somewhat unusual solution. I have a text file in which I have to replace some values based on their position. It is not a big file and will always contain 5 lines, each with a fixed length, at any given time. I have to replace text only at specific positions. Alternatively, I could put placeholder text at the required position and replace that text with the required value every time. I am not sure how to implement this. I have given an example below.
Line 1 - 00000 This Is Me 12345 trying
Line 2 - 23456 This is line 2 987654
Line 3 - This is 345678 line 3 67890
Consider the above as the file in which I have to replace some values. For example, in line 1 I have to replace '00000' with '11111', and in line 2 I have to replace 'This' with 'Line' or any required four-character text. The positions will always remain the same in the text file.
I have a solution which works, but it is for reading the file based on position, not for writing. Can someone please give a similar solution for writing based on position as well?
Solution for reading the file based on position:
def read_var file, line_nr, vbegin, vend
  IO.readlines(file)[line_nr][vbegin..vend]
end

puts read_var("read_var_from_file.txt", 0, 1, 3) # line 0, beginning at 1, ending at 3
#=> 308
puts read_var("read_var_from_file.txt", 1, 3, 6)
#=> 8522
I have also tried the solution below for writing. It works, but I need it to work based on position, or based on the text present in a specific line.
Explored solution to write to a file:
open(Dir.pwd + '/Files/Try.txt', 'w') { |f|
  f << "Four score\n"
  f << "and seven\n"
  f << "years ago\n"
}
I made you a working sample, anagraj.
in_file = "in.txt"
out_file = "out.txt"

=begin
=> contents of file in.txt
00000 This Is Me 12345 trying
23456 This is line 2 987654
This is 345678 line 3 67890
=end

def replace_in_file in_file, out_file, shreds
  File.open(out_file, "wb") do |file|
    File.read(in_file).each_line.with_index do |line, index|
      shreds.each do |shred|
        if shred[:index] == index
          line[shred[:begin]..shred[:end]] = shred[:replace]
        end
      end
      file << line
    end
  end
end

shreds = [
  { index: 0, begin: 0, end: 4, replace: "11111" },
  { index: 1, begin: 6, end: 9, replace: "Line" }
]

replace_in_file in_file, out_file, shreds

=begin
=> contents of file out.txt
11111 This Is Me 12345 trying
23456 Line is line 2 987654
This is 345678 line 3 67890
=end
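Since the lines keep a fixed layout, another option (a hedged sketch, not part of the sample above) is to open the file in read-write mode and overwrite the bytes in place with IO#seek and IO#write instead of rewriting the whole file. It assumes an ASCII file and a replacement of exactly the same length as the text it overwrites.

# Overwrite +replacement+ at a given line and column, in place.
def write_var(file, line_nr, vbegin, replacement)
  File.open(file, 'r+') do |f|
    line_nr.times { f.gets }       # advance to the start of the target line
    f.seek(vbegin, IO::SEEK_CUR)   # move to the target column within that line
    f.write(replacement)           # overwrite exactly replacement.length bytes
  end
end

write_var("in.txt", 0, 0, "11111") # replace '00000' with '11111' on the first line
write_var("in.txt", 1, 6, "Line")  # replace 'This' with 'Line' on the second line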

Programmatically get a list of characters a certain .ttf font file supports

Is there a way to programmatically get a list of the characters a .ttf file supports, using Ruby and/or Bash? I am trying to pipe the supported character codes into a text file for later processing.
(I would prefer not to use Font Forge.)
I found a Ruby gem called ttfunk, which can be found here.
After a gem install ttfunk, you can get all unicode characters by running the following script:
require 'ttfunk'

file = TTFunk::File.open("path/to/font.ttf")
cmap = file.cmap
chars = {}
unicode_chars = []

cmap.tables.each do |subtable|
  next if !subtable.unicode?
  chars = chars.merge(subtable.code_map)
end

unicode_chars = chars.keys.map { |dec| dec.to_s(16) }

puts "\n -- Found #{unicode_chars.length} characters in this font \n\n"
p unicode_chars
Which will output something like:
-- Found 2815 characters in this font
["20", "21", "22", "23", ... , "fef8", "fef9", "fefa", "fefb", "fefc", "fffc", "ffff"]
