ruby gedcom parser EOF exception - ruby

i need to parse gedcom 5.5 files for a analyziation project.
The first ruby parser i found couses a stack level too deep error, so i tryed to find alternatives. I fount this project: https://github.com/jslade/gedcom-ruby
There are some samples included, but i don't get them to work either.
Here is the parser itself: https://github.com/jslade/gedcom-ruby/blob/master/lib/gedcom.rb
If i try the sample like this:
ruby ./samples/count.rb ./samples/royal.ged
i get the following error:
D:/rails_projects/gedom_test/lib/gedcom.rb:185:in `readchar': end of file reached (EOFError)
I wrote a "gets" in every method for better unterstanding, this is the output till the exception raises:
Parsing './samples/royal.ged'...
INIT
BEFORE
CHECK_PROC_OR_BLOCK
BEFORE
CHECK_PROC_OR_BLOCK
PARSE
PARSE_FILE
PARSE_IO
DETECT_RS
The exact line that causes the trouble is
while ch = io.readchar
in the detect_rs method:
# valid gedcom may use either of \r or \r\n as the record separator.
# just in case, also detects simple \n as the separator as well
# detects the rs for this string by scanning ahead to the first occurence
# of either \r or \n, and checking the character after it
def detect_rs io
puts "DETECT_RS"
rs = "\x0d"
mark = io.pos
begin
while ch = io.readchar
case ch
when 0x0d
ch2 = io.readchar
if ch2 == 0x0a
rs = "\x0d\x0a"
end
break
when 0x0a
rs = "\x0a"
break
end
end
ensure
io.pos = mark
end
rs
end
I hope someone can help me with this.

The readchar method of Ruby's IO class will raise an EOFError when it encounters the end of the file. http://www.ruby-doc.org/core-2.1.1/IO.html#method-i-readchar
The gedcom-ruby gem hasn't been touched in years, but there was a fork of it made a couple of years go to fix this very problem.
Basically it changes:
while ch = io.readchar
to
while !io.eof && ch = io.readchar
You can get the fork of the gem here: https://github.com/trentlarson/gedcom-ruby

Related

ignore ANSI colors in pexpect response

Can I use pexpect in a way that ignores ANSI escape codes (especially colors) in the output? I am trying to do this:
expect('foo 3 bar 5')
...but sometimes I get output with ANSI-colored numbers. The problem is I don't know which numbers will have ANSI colors and which won't.
Is there a way to use pexpect but have it ignore ANSI sequences in the response from the child process?
Here's a not entirely satisfying proposal, subclassing 2 routines of the pexpect classes pexpect.Expecter and pexpect.spawn so that incoming data can have the escape sequences removed before they get added to the buffer and tested for pattern match. It is a lazy implementation in that it assumes any escape sequence will always be read atomically, but coping with split reads is more difficult.
# https://stackoverflow.com/a/59413525/5008284
import re, pexpect
from pexpect.expect import searcher_re
# regex for vt100 from https://stackoverflow.com/a/14693789/5008284
class MyExpecter(pexpect.Expecter):
ansi_escape = re.compile(rb'\x1B[#-_][0-?]*[ -/]*[#-~]')
def new_data(self, data):
data = self.ansi_escape.sub(b'', data)
return pexpect.Expecter.new_data(self, data)
class Myspawn(pexpect.spawn):
def expect_list(self, pattern_list, timeout=-1, searchwindowsize=-1,
async=False):
if timeout == -1:
timeout = self.timeout
exp = MyExpecter(self, searcher_re(pattern_list), searchwindowsize)
return exp.expect_loop(timeout)
This assumes you use the expect() call with a list, and do
child = Myspawn("...")
rc = child.expect(['pat1'])
For some reason I had to use bytes rather than strings as I get the data before it is decoded, but that may just be because of a currently incorrect locale environment.
This workaround partially defeats the purpose of using pexpect but it satisfies my requirements.
The idea is:
expect anything at all (regex match .*) followed by the next prompt (which in my case is xsh $ - note the backslash in the "prompt" regex)
get the after property
trim off the prompt: [1:]
remove ANSI escape codes from that
compare the filtered text with my "expected" response regex
with pexpect.spawn(XINU_CMD, timeout=3, encoding='utf-8') as c:
# from https://stackoverflow.com/a/14693789/5008284
ansi_escape = re.compile(r"\x1B[#-_][0-?]*[ -/]*[#-~]")
system_prompt_wildcard = r".*xsh \$ " # backslash because prompt is "xsh $ "
# tests is {command:str, responses:[str]}
for test in tests:
c.sendline(test["cmd"])
response = c.expect([system_prompt_wildcard, pexpect.EOF, pexpect.TIMEOUT]) #=> (0|1|2)
if response != 0: # any error
continue
response_text = c.after.split('\n')[1:]
for expected, actual in zip(test['responses'], response_text):
norm_a = ansi_escape.sub('', norm_input.sub('', actual.strip()))
result = re.compile(norm_a).findall(expected)
if not len(result):
print('NO MATCH FOUND')

Assign File.readlines[n] to variable

I'm reading a text file which has instructions on each line. I want to assign the text on each line to it's own variable. When I do this, the value returned is nil but when I output the value of readlines[n] it is correct.
e.g.
# Using the variable (incorrect result)
puts current_zone_size
>
e.g.
# Using readlines after variable assignment (incorrect result)
current_zone_size = instructions.readlines[0]
instructions.readlines[0]
>
e.g.
# Using readlines (correct result)
instructions.readlines[0]
> 8 10
This is my code:
instructions = File.open("operator-input.txt", "r")
current_zone_size = instructions.readlines[0]
rover_init_location_orientation = instructions.readlines[1]
rover_movements = instructions.readlines[2]
This is the text in the file being read:
8 10
1 2 E
MMLMRMMRRMML
Edit:
Is the file being closed? Is this the reason I can't assign values from File.readlines[n] to variables if I'm not doing the variable assignment from within a block?
Also, the file will only ever have three lines which is why I'm not using a loop to read the lines.
IO#readlines reads all the lines in the file. It should not come as a surprise that, in order to read all the lines in the file, it has to read the entire file.
So, where is the file pointer after you read the entire file? It is at the end of the file.
What happens if you call IO#readlines the second time, when the file pointer is still at the end of the file? It will start reading at the position of the file pointer, which means it will read an empty file.
Therefore, if you want to do it the way you are doing it, you need to reset the file pointer to the beginning of the file every time you call IO#readlines:
instructions = File.open('operator-input.txt', 'r')
current_zone_size = instructions.readlines[0]
instructions.pos = 0
rover_init_location_orientation = instructions.readlines[1]
instructions.pos = 0
rover_movements = instructions.readlines[2]
Note also that you are leaking resources: you never close the file, so it will only by closed at the earliest by Ruby when the instructions variable gets out of scope and the File instance gets garbage-collected, and at the latest by the OS automatically when your Ruby process exits, which may be a long time later. So, your code should rather be:
instructions = File.open('operator-input.txt', 'r')
current_zone_size = instructions.readlines[0]
instructions.pos = 0
rover_init_location_orientation = instructions.readlines[1]
instructions.pos = 0
rover_movements = instructions.readlines[2]
instructions.close
In general, it is much better to use the block form of File::open, which closes the file handle automatically for you at the end of the block, and also ensures that this happens even in the case of complex control flow, errors, or exceptions:
File.open('operator-input.txt', 'r') do |instructions|
current_zone_size = instructions.readlines[0]
instructions.pos = 0
rover_init_location_orientation = instructions.readlines[1]
instructions.pos = 0
rover_movements = instructions.readlines[2]
end
Note, however, that what you want to do is horribly inefficient: you read the entire file, then take the first line, throw the rest away. Then you read the entire file again, take the second line, throw the rest away. Then you read the entire file again, take the third line, throw the rest away.
It makes much more sense to read the entire file once and then take the lines you need. Something like this:
File.open('operator-input.txt', 'r') do |instructions|
current_zone_size, rover_init_location_orientation, rover_movements =
instructions.readlines
end
However, in the case where all you do is open the file, read all lines, then immediately close it again, you should rather use the IO::readlines method instead of IO#readlines, since it does all three things for you in one call:
current_zone_size, rover_init_location_orientation, rover_movements =
File.readlines('operator-input.txt')
I ended up reading all the lines at once, now I'm able to set each variable outside of a block. Like this:
instructions = File.readlines "operator-input.txt"
current_zone_size = instructions[0]
rover_init_location_orientation = instructions[1]
rover_movements = instructions[2]
e.g.
puts current_zone_size
> 8 10

Get wifi BSSID programmatically using Ruby and ioctl

Using Getting essid via ioctl in ruby as a template I wanted to get the BSSID rather than the ESSID. However, not being a C developer, there are a few things that I don't understand.
What I have so far which does not work :( ...
NOTE I'm a bit confused because part of me thinks, according to some comments in wireless.h, that the BSSID can only be set via ioctl. However, the ioctl to get exists. That along with my almost complete lack of understanding of the more intermediate C type isms (structs, unions, and stuff ;) ), I simply don't know.
def _get_bssid(interface)
# Copied from wireless.h
# supposing a 16 byte address and 32 byte buffer but I'm totally
# guessing here.
iwreq = [interface, '' * 48,0].pack('a*pI')
sock = Socket.new(Socket::AF_INET, Socket::SOCK_DGRAM, 0)
# from wireless.h
# SIOCGIWAP 0x8B15 /* get access point MAC addresses */
sock.ioctl('0x8B15', iwreq) # always get an error: Can't convert string to Integer
puts iwreq.inspect
end
So, in the meantime, I'm using a wpa_cli method for grabbing the BSSID but I'd prefer to use IOCTL:
def _wpa_status(interface)
wpa_data = nil
unless interface.nil?
# need to write a method to get the src_sock_path
# programmatically. Fortunately, for me
# this is going to be the correct sock path 99% of the time.
# Ideas to get programmatically would be:
# parse wpa_supplicant.conf
# check process table | grep wpa_suppl | parse arguments
src_sock_path = '/var/run/wpa_supplicant/' + interface
else
return nil
end
client_sock_path = '/var/run/hwinfo_wpa'
# open Domain socket
socket = Socket.new(Socket::AF_UNIX, Socket::SOCK_DGRAM, 0)
begin
# bind client domain socket
socket.bind(Socket.pack_sockaddr_un(client_sock_path))
# connect to server with our client socket
socket.connect(Socket.pack_sockaddr_un(src_sock_path))
# send STATUS command
socket.send('STATUS', 0)
# receive 1024 bytes (totally arbitrary value)
# split lines by \n
# store in variable wpa_data.
wpa_data = socket.recv(1024)
rescue => e
$stderr.puts 'WARN: unable to gather wpa data: ' + e.inspect
end
# close or next time we attempt to read it will fail.
socket.close
begin
# remove the domain socket file for the client
File.unlink(client_sock_path)
rescue => e
$stderr.puts 'WARN: ' + e.inspect
end
unless wpa_data.nil?
#wifis = Hash[wpa_data.split(/\n/).map\
{|line|
# first, split into pairs delimited by '='
key,value = line.split('=')
# if key is camel-humped then put space in front
# of capped letter
if key =~ /[a-z][A-Z]/
key.gsub!(/([a-z])([A-Z])/,'\\1_\\2')
end
# if key is "id" then rename it.
key.eql?('id') && key = 'wpa_id'
# fix key so that it can be used as a table name
# by replacing spaces with underscores
key.gsub!(' ','_')
# lower case it.
key.downcase!
[key,value]
}]
end
end
EDIT:
So far nobody has been able to answer this question. I think I'm liking the wpa method better anyway because I'm getting more data from it. That said, one call-out I'd like to make is if anyone uses the wpa code, be aware that it will require escalated privileges to read the wlan socket.
EDIT^2 (full code snippet):
Thanks to #dasup I've been able to re-factor my class to correctly pull the bssid and essids using system ioctls. (YMMV given the implementation, age, and any other possible destabilization thing to your Linux distribution - the following code snippet works with the 3.2 and 3.7 kernels though.)
require 'socket'
class Wpa
attr_accessor :essid, :bssid, :if
def initialize(interface)
#if = interface
puts 'essid: ' + _get_essid.inspect
puts 'bssid: ' + _get_bssid.inspect
end
def _get_essid
# Copied from wireless.h
iwreq = [#if, " " * 32, 32, 0 ].pack('a16pII')
sock = Socket.new(Socket::AF_INET, Socket::SOCK_DGRAM, 0)
sock.ioctl(0x8B1B, iwreq)
#essid = iwreq.unpack('#16p').pop.strip
end
def _get_bssid
# Copied from wireless.h
# supposing a 16 byte address and 32 byte buffer but I'm totally
# guessing here.
iwreq = [#if, "\0" * 32].pack('a16a32')
sock = Socket.new(Socket::AF_INET, Socket::SOCK_DGRAM, 0)
# from wireless.h
# SIOCGIWAP 0x8B15 /* get access point MAC addresses */
sock.ioctl(0x8B15, iwreq) # always get an error: Can't convert string to Integer
#bssid = iwreq.unpack('#18H2H2H2H2H2H2').join(':')
end
end
h = Wpa.new('wlan0')
I'm not very much familiar with Ruby, but I spotted two mistakes:
The hex number for SIOCGIWAP should be given without quotes/ticks.
The initialization of the data buffer ends up with some trailing bytes after the interface name (debugged using gdb). The initialization given below works.
Be aware that your code will break if any of the data structures or constants change (IFNAMSIZ, sa_family, struct sockaddr etc.) However, I don't think that such changes are likely anytime soon.
require 'socket'
def _get_bssid(interface)
# Copied from wireless.h
# supposing a 16 byte address and 32 byte buffer but I'm totally
# guessing here.
iwreq = [interface, "\0" * 32].pack('a16a32')
sock = Socket.new(Socket::AF_INET, Socket::SOCK_DGRAM, 0)
# from wireless.h
# SIOCGIWAP 0x8B15 /* get access point MAC addresses */
sock.ioctl(0x8B15, iwreq) # always get an error: Can't convert string to Integer
puts iwreq.inspect
end
You'll get back an array/buffer with:
The interface name you sent, padded with 0x00 bytes to a total length of 16 bytes.
Followed by a struct sockaddr, i.e. a two-byte identifier 0x01 0x00 (coming from ARPHRD_ETHER?) followed by the BSSID padded with 0x00 bytes to a total length of 14 bytes.
Good luck!

Zlib inflate error

I am trying to save compressed strings to a file and load them later for use in the game. I kept getting "in 'finish': buffer error" errors when loading the data back up for use. I came up with this:
require "zlib"
def deflate(string)
zipper = Zlib::Deflate.new
data = zipper.deflate(string, Zlib::FINISH)
end
def inflate(string)
zstream = Zlib::Inflate.new
buf = zstream.inflate(string)
zstream.finish
zstream.close
buf
end
setting = ["nothing","nada","nope"]
taggedskills = ["nothing","nada","nope","nuhuh"]
File.open('testzip.txt','wb') do |w|
w.write(deflate("hello world")+"\n")
w.write(deflate("goodbye world")+"\n")
w.write(deflate("etc")+"\n")
w.write(deflate("etc")+"\n")
w.write(deflate("Setting: name "+setting[0]+" set"+(setting[1].class == String ? "str" : "num")+" "+setting[1].to_s)+"\n")
w.write(deflate("Taggedskill: "+taggedskills[0]+" "+taggedskills[1]+" "+taggedskills[2]+" "+taggedskills[3])+"\n")
w.write(deflate("etc")+"\n")
end
File.open('testzip.txt','rb') do |file|
file.each do |line|
p inflate(line)
end
end
It was throwing errors at the "Taggedskill:" point. I don't know what it is, but trying to change it to "Skilltag:", "Skillt:", etc. continues to throw a buffer error, while things like "Setting:" or "Thing:" work fine, while changing the setting line to "Taggedskill:" continues to work fine. What is going on here?
In testzip.txt, you are storing newline separated binary blobs. However, binary blobs may contain newlines by themselves, so when you open testzip.txt and split it by line, you may end up splitting one binary blob that inflate would understand, into two binary blobs that it does not understand.
Try to run wc -l testzip.txt after you get the error. You'll see the file contains one more line, than the number of lines you are putting in.
What you need to do, is compress the whole file at once, not line by line.

How do I block on reading a named pipe in Ruby?

I'm trying to set up a Ruby script that reads from a named pipe in a loop, blocking until input is available in the pipe.
I have a process that periodically puts debugging events into a named pipe:
# Open the logging pipe
log = File.open("log_pipe", "w+") #'log_pipe' created in shell using mkfifo
...
# An interesting event happens
log.puts "Interesting event #4291 occurred"
log.flush
...
I then want a separate process that will read from this pipe and print events to the console as they happen. I've tried using code like this:
input = File.open("log_pipe", "r+")
while true
puts input.gets #I expect this to block and wait for input
end
# Kill loop with ctrl+c when done
I want the input.gets to block, waiting patiently until new input arrives in the fifo; but instead it immediately reads nil and loops again, scrolling off the top of the console window.
Two things I've tried:
I've opened the input fifo with both "r" and "r+"--I have the same problem either way;
I've tried to determine if my writing process is sending EOF (which I've heard will cause the read fifo to close)--AFAIK it isn't.
SOME CONTEXT:
If it helps, here's a 'big picture' view of what I'm trying to do:
I'm working on a game that runs in RGSS, a Ruby based game engine. Since it doesn't have good integrated debugging, I want to set up a real-time log as the game runs--as events happen in the game, I want messages to show up in a console window on the side. I can send events in the Ruby game code to a named pipe using code similar to the writer code above; I'm now trying to set up a separate process that will wait for events to show up in the pipe and show them on the console as they arrive. I'm not even sure I need Ruby to do this, but it was the first solution I could think of.
Note that I'm using mkfifo from cygwin, which I happened to have installed anyway; I wonder if that might be the source of my trouble.
If it helps anyone, here's exactly what I see in irb with my 'reader' process:
irb(main):001:0> input = File.open("mypipe", "r")
=> #<File:mypipe>
irb(main):002:0> x = input.gets
=> nil
irb(main):003:0> x = input.gets
=> nil
I don't expect the input.gets at 002 and 003 to return immediately--I expect them to block.
I found a solution that avoids using Cygwin's unreliable named pipe implementation entirely. Windows has its own named pipe facility, and there is even a Ruby Gem called win32-pipe that uses it.
Unfortunately, there appears to be no way to use Ruby Gems in an RGSS script; but by dissecting the win32-pipe gem, I was able to incorporate the same idea into an RGSS game. This code is the bare minimum needed to log game events in real time to a back channel, but it can be very useful for deep debugging.
I added a new script page right before 'Main' and added this:
module PipeLogger
# -- Change THIS to change the name of the pipe!
PIPE_NAME = "RGSSPipe"
# Constant Defines
PIPE_DEFAULT_MODE = 0 # Pipe operation mode
PIPE_ACCESS_DUPLEX = 0x00000003 # Pipe open mode
PIPE_UNLIMITED_INSTANCES = 255 # Number of concurrent instances
PIPE_BUFFER_SIZE = 1024 # Size of I/O buffer (1K)
PIPE_TIMEOUT = 5000 # Wait time for buffer (5 secs)
INVALID_HANDLE_VALUE = 0xFFFFFFFF # Retval for bad pipe handle
#-----------------------------------------------------------------------
# make_APIs
#-----------------------------------------------------------------------
def self.make_APIs
$CreateNamedPipe = Win32API.new('kernel32', 'CreateNamedPipe', 'PLLLLLLL', 'L')
$FlushFileBuffers = Win32API.new('kernel32', 'FlushFileBuffers', 'L', 'B')
$DisconnectNamedPipe = Win32API.new('kernel32', 'DisconnectNamedPipe', 'L', 'B')
$WriteFile = Win32API.new('kernel32', 'WriteFile', 'LPLPP', 'B')
$CloseHandle = Win32API.new('kernel32', 'CloseHandle', 'L', 'B')
end
#-----------------------------------------------------------------------
# setup_pipe
#-----------------------------------------------------------------------
def self.setup_pipe
make_APIs
##name = "\\\\.\\pipe\\" + PIPE_NAME
##pipe_mode = PIPE_DEFAULT_MODE
##open_mode = PIPE_ACCESS_DUPLEX
##pipe = nil
##buffer = 0.chr * PIPE_BUFFER_SIZE
##size = 0
##bytes = [0].pack('L')
##pipe = $CreateNamedPipe.call(
##name,
##open_mode,
##pipe_mode,
PIPE_UNLIMITED_INSTANCES,
PIPE_BUFFER_SIZE,
PIPE_BUFFER_SIZE,
PIPE_TIMEOUT,
0
)
if ##pipe == INVALID_HANDLE_VALUE
# If we could not open the pipe, notify the user
# and proceed quietly
print "WARNING -- Unable to create named pipe: " + PIPE_NAME
##pipe = nil
else
# Prompt the user to open the pipe
print "Please launch the RGSSMonitor.rb script"
end
end
#-----------------------------------------------------------------------
# write_to_pipe ('msg' must be a string)
#-----------------------------------------------------------------------
def self.write_to_pipe(msg)
if ##pipe
# Format data
##buffer = msg
##size = msg.size
$WriteFile.call(##pipe, ##buffer, ##buffer.size, ##bytes, 0)
end
end
#------------------------------------------------------------------------
# close_pipe
#------------------------------------------------------------------------
def self.close_pipe
if ##pipe
# Send kill message to RGSSMonitor
##buffer = "!!GAMEOVER!!"
##size = ##buffer.size
$WriteFile.call(##pipe, ##buffer, ##buffer.size, ##bytes, 0)
# Close down the pipe
$FlushFileBuffers.call(##pipe)
$DisconnectNamedPipe.call(##pipe)
$CloseHandle.call(##pipe)
##pipe = nil
end
end
end
To use this, you only need to make sure to call PipeLogger::setup_pipe before writing an event; and call PipeLogger::close_pipe before game exit. (I put the setup call at the start of 'Main', and add an ensure clause to call close_pipe.) After that, you can add a call to PipeLogger::write_to_pipe("msg") at any point in any script with any string for "msg" and write into the pipe.
I have tested this code with RPG Maker XP; it should also work with RPG Maker VX and later.
You will also need something to read FROM the pipe. There are any number of ways to do this, but a simple one is to use a standard Ruby installation, the win32-pipe Ruby Gem, and this script:
require 'rubygems'
require 'win32/pipe'
include Win32
# -- Change THIS to change the name of the pipe!
PIPE_NAME = "RGSSPipe"
Thread.new { loop { sleep 0.01 } } # Allow Ctrl+C
pipe = Pipe::Client.new(PIPE_NAME)
continue = true
while continue
msg = pipe.read.to_s
puts msg
continue = false if msg.chomp == "!!GAMEOVER!!"
end
I use Ruby 1.8.7 for Windows and the win32-pipe gem mentioned above (see here for a good reference on installing gems). Save the above as "RGSSMonitor.rb" and invoke it from the command line as ruby RGSSMonitor.rb.
CAVEATS:
The RGSS code listed above is fragile; in particular, it does not handle failure to open the named pipe. This is not usually an issue on your own development machine, but I would not recommend shipping this code.
I haven't tested it, but I suspect you'll have problems if you write a lot of things to the log without running a process to read the pipe (e.g. RGSSMonitor.rb). A Windows named pipe has a fixed size (I set it here to 1K), and by default writes will block once the pipe is filled (because no process is 'relieving the pressure' by reading from it). Unfortunately, the RPGXP engine will kill a Ruby script that has stopped running for 10 seconds. (I'm told that RPGVX has eliminated this watchdog function--in which case, the game will hang instead of abruptly terminating.)
What's probably happening is the writing process is exiting, and as there are no other writing processes, EOF is sent to the pipe which causes gets to return nil, and so your code loops continually.
To get around this you can usually just open the pipe read-write at the reader end. This works for me (on a Mac), but isn't working for you (you've tried "r" and "r+"). I'm guessing this is to due with Cygwin (POSIX says opening a FIFO read-write is undefined).
An alternative is to open the pipe twice, once read-only and once write-only. You don't use the write-only IO for anything, it's just so that there's always an active writer attached to the pipe so it doesn't get closed.
input = File.open("log_pipe", "r") # note 'r', not 'r+'
keep_open = File.open("log_pipe", "w") # ensure there's always a writer
while true
puts input.gets
end

Resources