Ruby - How to subtract numbers of two files and save the result in one of them on a specified position? - ruby

I have 2 txt files with different strings and numbers in them splitted with ;
Now I need to subtract the
((number on position 2 in file1) - (number on position 25 in file2)) = result
Now I want to replace the (number on position 2 in file1) with the result.
I tried my code below but it only appends the number in the end of the file and its not the result of the calculation which got appended.
def calc
f1 = File.open("./file1.txt", File::RDWR)
f2 = File.open("./file2.txt", File::RDWR)
f1.flock(File::LOCK_EX)
f2.flock(File::LOCK_EX)
f1.each.zip(f2.each).each do |line, line2|
bg = line.split(";").compact.collect(&:strip)
bd = line2.split(";").compact.collect(&:strip)
n = bd[2].to_i - bg[25].to_i
f2.print bd[2] << n
#puts "#{n}" Only for testing
end
f1.flock(File::LOCK_UN)
f2.flock(File::LOCK_UN)
f1.close && f2.close
end

Use something like this:
lines1 = File.readlines('file1.txt').map(&:to_i)
lines2 = File.readlines('file2.txt').map(&:to_i)
result = lines1.zip(lines2).map do |value1, value2| value1 - value2 }
File.write('file1.txt', result.join(?\n))
This code load all files in memory, then calculate result and write it to first file.
FYI: If you want to use your code just save result to other file (i.e. result.txt) and at the end copy it to original file.

Related

How to read by the number at each iteration of the loop in Ruby?

How to read by the number at each iteration of the loop? Dynamic work is important, (not once to read the entire line and convert to an array), at each iteration, take one number from the file string and work with it. How to do it right?
input.txt :
5
1 7 5 2 3
Work with 2nd line of the file.
fin = File.open("input.txt", "r")
fout = File.open("output.txt", "w")
n = fin.readline.to_i
heap_min = Heap.new(:min)
heap_max = Heap.new(:max)
for i in 1..n
a = fin.read.to_i #code here <--
heap_max.push(a)
if heap_max.size > heap_min.size
tmp = heap_max.top
heap_max.pop
heap_min.push(tmp)
end
if heap_min.size > heap_max.size
tmp = heap_min.top
heap_min.pop
heap_max.push(tmp)
end
if heap_max.size == heap_min.size
heap_max.top > heap_min.top ? median = heap_min.top : median = heap_max.top
else
median = heap_max.top
end
fout.print(median, " ")
end
If you're 100% sure that your file separate numbers by space you can try this :
a = fin.gets(' ', -1).to_i
Read the 2nd line of a file:
line2 = File.readlines('input.txt')[1]
Convert it to an array of integers:
array = line2.split(' ').map(&:to_i).compact

Prevent or delete duplicates in textscraper?

I have a code that parses through text files in a folder, and saves a predefined number of words around certain search words.
For example, it looks for words such as "date" and "year". If it finds both in the same sentence it will save the sentence twice. Furthermore, if it finds the same word used a few times in a sentence, it will also save it multiple times.
This way the scraper saves an enormous amount of unnecessary duplicate text.
I see two possible solutions:
If the next search-match is in the padding, in the group of words, of the previous one, it will not be saved.
If a group of, say, seven words of the search match is also part of the preceding group it will not be saved/deleted.
Everything I've tried has utterly failed thusfar:
#helper
def indices text, index, word
padding = 200
bottom_i = index - padding < 0 ? 0 : index - padding
top_i = index + word.length + padding > text.length ? text.length : index + word.length + padding
return bottom_i, top_i
end
#script
base_text = File.open("base.txt", 'w')
Dir::mkdir("summaries") unless File.exists?("summaries")
Dir.chdir("summaries")
Dir.glob("*.txt").each do |textfile|
whole_file = File.open(textfile, 'r').read
puts "Currently summarizing " + textfile + "..."
curr_i = 0
str = nil
whole_file.scan(Regexp.union(/firstword/, /secondword/).each do |match|
if i_match = whole_file.index(match, curr_i)
top_bottom = indices(whole_file, i_match, match)
base_text.puts(whole_file[top_bottom[0]..top_bottom[1]] + " : " + File.path(textfile))
curr_i += i_match
end
end
puts "Done summarizing " + textfile + "."
end
base_text.close
Why not do something where you keep track of what you're looking for:
search_words = %w( year date etc )
And then downcase the search string, and start an index.
def summarize(str)
search_str = str.downcase
ind = 0
Then find the smallest index offset of your search words in the search_str, and remove everything up to (ind + offset - delta), go up to (ind + delta) into matches, and continue in the while loop. Something like:
matches = []
while (offset = search_words.map{|w| search_str.index w }.min)
ind += offset
matches.push str[ind - delta, delta * 2]
search_str = search_str[offset + delta, ]
end
matches
end
Preferably something better than:
whole_file.scan(Regexp.union(/firstword/, /secondword/).each do |match|
if i_match = whole_file.index(match, curr_i)
top_bottom = indices(whole_file, i_match, match)
base_text.puts(whole_file[top_bottom[0]..top_bottom[1]] + " : " + File.path(textfile))
curr_i += i_match + 50
end
end

Radix sort not working in Lua

Firstly I should mention I've not been coding very long at all, although that much is probably obvious from my code :P
I'm having two problems, firstly the sort isn't functioning correctly but does sort the numbers by their length. Any help here would be appreciated.
Secondly it's changing both the table it grabs and the table it returns (not sure why). How do I prevent it changing the table it grabs?
I'd prefer if people didn't post a fully optisimised premade code as I'm not going to learn or understand anything that way.
function radix_sort(x)
pass, bucket, maxstring = 0, x, 2
while true do
pass = pass + 1
queue = {}
for n=#bucket,1,-1 do
key_length = string.len(bucket[n])
key = bucket[n]
if pass == 1 and key_length > maxstring then
maxstring = key_length
end
if key_length == pass then
pool = string.sub(key, 1,1)
if queue[pool + 1] == nil then
queue[pool + 1] = {}
end
table.insert(queue[pool + 1], key)
table.remove(bucket, n)
end
end
for k,v in pairs(queue) do
for n=1,#v do
table.insert(bucket, v[n])
end
end
if pass == maxstring then
break
end
end
return bucket
end
There's a lot of changes I made to get this working, so hopefully you can look through and pickup on them. I tried to comment as best I could.
function radix_sort(x)
pass, maxstring = 0, 0
-- to avoid overwriting x, copy into bucket like this
-- it also gives the chance to init maxstring
bucket={}
for n=1,#x,1 do
-- since we can, convert all entries to strings for string functions below
bucket[n]=tostring(x[n])
key_length = string.len(bucket[n])
if key_length > maxstring then
maxstring = key_length
end
end
-- not a fan of "while true ... break" when we can set a condition here
while pass <= maxstring do
pass = pass + 1
-- init both queue and all queue entries so ipairs doesn't skip anything below
queue = {}
for n=1,10,1 do
queue[n] = {}
end
-- go through bucket entries in order for an LSD radix sort
for n=1,#bucket,1 do
key_length = string.len(bucket[n])
key = bucket[n]
-- for string.sub, start at end of string (LSD sort) with -pass
if key_length >= pass then
pool = tonumber(string.sub(key, pass*-1, pass*-1))
else
pool = 0
end
-- add to appropriate queue, but no need to remove from bucket, reset it below
table.insert(queue[pool + 1], key)
end
-- empty out the bucket and reset, use ipairs to call queues in order
bucket={}
for k,v in ipairs(queue) do
for n=1,#v do
table.insert(bucket, v[n])
end
end
end
return bucket
end
Here's a test run:
> input={55,2,123,1,42,9999,6,666,999,543,13}
> output=radix_sort(input)
> for k,v in pairs(output) do
> print (k , " = " , v)
> end
1 = 1
2 = 2
3 = 6
4 = 13
5 = 42
6 = 55
7 = 123
8 = 543
9 = 666
10 = 999
11 = 9999
pool = string.sub(key, 1,1)
always looks at the first character; perhaps you meant string.sub(key, pass, 1)

ruby multiple loop sets but with limited rows per set

Alrightie, so I'm building an CSV file this time with ruby. The outer loop will run up to length of num_of_loops, but it runs for an entire set rather than up to the specified row. I want to change the first column of a CSV file to a new name for each row.
If I do this:
class_days = %w[Wednesday Thursday Friday]
num_of_loops = (num_of_loops / class_days.size).ceil
num_of_loops.times {
["Wednesday","Thursday","Friday"].each do |x|
data[0] = x
data[4] = classname()
# Write all to file
#
csv << data
end
}
Then the loop will run only 3 times for a 5 row request.
I'd like it to run the full 5 rows such that instead of stopping at Wed/Thurs/Fri it goes to Wed/Thurs/Fri/Wed/Thurs instead.
class_days = %w[Wednesday Thursday Friday]
num_of_loops.times do |i|
data[0] = class_days[i % class_days.size]
data[4] = classname
csv << data
end
The interesting part is here:
class_days[i % class_days.size]
We need an index into class_days that is between 0 and class_days.size - 1. We can get that with the % (modulo) operator. That operator yields the remainder after dividing i by class_days.size. This table shows how it works:
i i % 3
0 0
1 1
2 2
3 0
4 1
5 2
...
The other key part is that the times method yields indices starting with 0.

Ruby data extraction from a text file

I have a relatively big text file with blocks of data layered like this:
ANALYSIS OF X SIGNAL, CASE: 1
TUNE X = 0.2561890123390808
Line Frequency Amplitude Phase Error mx my ms p
1 0.2561890123391E+00 0.204316425208E-01 0.164145385871E+03 0.00000000000E+00 1 0 0 0
2 0.2562865535359E+00 0.288712798671E-01 -.161563284233E+03 0.97541196785E-04 1 0 0 0
(they contain more lines and then are repeated)
I would like first to extract the numerical value after TUNE X = and output these in a text file. Then I would like to extract the numerical value of LINE FREQUENCY and AMPLITUDE as a pair of values and output to a file.
My question is the following: altough I could make something moreorless working using a simple REGEXP I'm not convinced that it's the right way to do it and I would like some advices or examples of code showing how I can do that efficiently with Ruby.
Generally, (not tested)
toggle=0
File.open("file").each do |line|
if line[/TUNE/]
puts line.split("=",2)[-1].strip
end
if line[/Line Frequency/]
toggle=1
next
end
if toggle
a = line.split
puts "#{a[1]} #{a[2]}"
end
end
go through the file line by line, check for /TUNE/, then split on "=" to get last item.
Do the same for lines containing /Line Frequency/ and set the toggle flag to 1. This signify that the rest of line contains the data you want to get. Since the freq and amplitude are at fields 2 and 3, then split on the lines and get the respective positions. Generally, this is the idea. As for toggling, you might want to set toggle flag to 0 at the next block using a pattern (eg SIGNAL CASE or ANALYSIS)
file = File.open("data.dat")
#tune_x = #frequency = #amplitude = []
file.each_line do |line|
tune_x_scan = line.scan /TUNE X = (\d*\.\d*)/
data_scan = line.scan /(\d*\.\d*E[-|+]\d*)/
#tune_x << tune_x_scan[0] if tune_x_scan
#frequency << data_scan[0] if data_scan
#amplitude << data_scan[0] if data_scan
end
There are lots of ways to do it. This is a simple first pass at it:
text = 'ANALYSIS OF X SIGNAL, CASE: 1
TUNE X = 0.2561890123390808
Line Frequency Amplitude Phase Error mx my ms p
1 0.2561890123391E+00 0.204316425208E-01 0.164145385871E+03 0.00000000000E+00 1 0 0 0
2 0.2562865535359E+00 0.288712798671E-01 -.161563284233E+03 0.97541196785E-04 1 0 0 0
ANALYSIS OF X SIGNAL, CASE: 1
TUNE X = 1.2561890123390808
Line Frequency Amplitude Phase Error mx my ms p
1 1.2561890123391E+00 0.204316425208E-01 0.164145385871E+03 0.00000000000E+00 1 0 0 0
2 1.2562865535359E+00 0.288712798671E-01 -.161563284233E+03 0.97541196785E-04 1 0 0 0
ANALYSIS OF X SIGNAL, CASE: 1
TUNE X = 2.2561890123390808
Line Frequency Amplitude Phase Error mx my ms p
1 2.2561890123391E+00 0.204316425208E-01 0.164145385871E+03 0.00000000000E+00 1 0 0 0
2 2.2562865535359E+00 0.288712798671E-01 -.161563284233E+03 0.97541196785E-04 1 0 0 0
'
require 'stringio'
pretend_file = StringIO.new(text, 'r')
That gives us a StringIO object we can pretend is a file. We can read from it by lines.
I changed the numbers a bit just to make it easier to see that they are being captured in the output.
pretend_file.each_line do |li|
case
when li =~ /^TUNE.+?=\s+(.+)/
print $1.strip, "\n"
when li =~ /^\d+\s+(\S+)\s+(\S+)/
print $1, ' ', $2, "\n"
end
end
For real use you'd want to change the print statements to a file handle: fileh.print
The output looks like:
# >> 0.2561890123390808
# >> 0.2561890123391E+00 0.204316425208E-01
# >> 0.2562865535359E+00 0.288712798671E-01
# >> 1.2561890123390808
# >> 1.2561890123391E+00 0.204316425208E-01
# >> 1.2562865535359E+00 0.288712798671E-01
# >> 2.2561890123390808
# >> 2.2561890123391E+00 0.204316425208E-01
# >> 2.2562865535359E+00 0.288712798671E-01
You can read your file line by line and cut each by number of symbol, for example:
to extract tune x get symbols from
10 till 27 on line 2
to extract LINE FREQUENCY get
symbols from 3 till 22 on line 6+n

Resources