How to gsub an unicode 0083 with ruby?

I have loaded a string from an HTML file and written it to a YAML file with the ya2yaml plugin:
- title: 'What a wonderful day!'
  body: ... # main contents here
I then load the .yml file with the YAML::parse_file method.
However, "\n" in the string causes loading problems, so I tried to gsub every "\n" to "". There are still problems: a character '0083' (I can see it in the terminal) still breaks the line and causes a loading error:
in `load': syntax error on line 32, col 6: ` </strong><br>ok ' (ArgumentError)
from /home/croplio/.rvm/rubies/ruby-1.9.2-preview3/lib/ruby/1.9.1/syck.rb:178:in `parse'
from /home/croplio/.rvm/rubies/ruby-1.9.2-preview3/lib/ruby/1.9.1/syck.rb:203:in `block in parse_file'
from /home/croplio/.rvm/rubies/ruby-1.9.2-preview3/lib/ruby/1.9.1/syck.rb:202:in `open'
So what's wrong with the YAML or the character 0083?
And how can I avoid this problem?

0083 is the Unicode character 'NO BREAK HERE' (U+0083), a control character.
I don't know YAML::parse, but maybe you can switch it to handle Unicode, or restrict the text to pure ASCII.
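A minimal sketch of removing the character with gsub before writing the YAML. It assumes the string is UTF-8 and that dropping every C1 control character (U+0080 to U+009F) along with line breaks is acceptable; the sample text and variable names are hypothetical:

```ruby
# Hypothetical sample text containing U+0083 (NO BREAK HERE) and a newline.
body = "</strong><br>ok\u0083 and\nmore"

# Remove line breaks and all C1 control characters (U+0080..U+009F).
cleaned = body.gsub(/[\r\n\u0080-\u009F]/, "")

# To remove only U+0083, String#delete would also work.
only_0083 = body.delete("\u0083")

puts cleaned.inspect
puts only_0083.inspect
```

Whether the cleaned string then loads correctly depends on the YAML parser in use, which this sketch does not check.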

Related

Sphinx-autodoc with napoleon (Google Doc String Style): Warnings and Errors about Block quotes and indention

I am using Sphinx 4.4.0 with the napoleon extension (Google docstring style). I have these two problems:
WARNING: Block quote ends without a blank line; unexpected unindent.
ERROR: Unexpected indentation.
I found some information about them on the internet but cannot apply it to my code. My problem is that I do not even understand the messages, and I do not see where the problem could be.
This is the code:
def read_and_validate_csv(basename, specs_and_rules):
"""Read a CSV file with respect to specifications about format and
rules about valid values.
Hints: Do not use objects of type type (e.g. str instead of "str") when
specificing the column type.
specs_and_rules = {
'TEMPLATES': {
'T1l': ('Int16', [-9, ' '])
},
'ColumnA': 'str',
'ColumnB': ('str', 'no answer'),
'ColumnC': None,
'ColumnD': (
'Int16',
-9, {
'len': [1, 2, (4-8)],
'val': [0, 1, (3-9)]
}
}
Returns:
(pandas.DataFrame): Result.
"""
These are the original messages:
.../bandas.py:docstring of buhtzology.bandas.read_and_validate_csv:11: WARNING: Block quote ends without a blank line; unexpected unindent.
.../bandas.py:docstring of buhtzology.bandas.read_and_validate_csv:15: ERROR: Unexpected indentation.
.../bandas.py:docstring of buhtzology.bandas.read_and_validate_csv:17: ERROR: Unexpected indentation.
.../bandas.py:docstring of buhtzology.bandas.read_and_validate_csv:19: WARNING: Block quote ends without a blank line; unexpected unindent.
.../bandas.py:docstring of buhtzology.bandas.read_and_validate_csv:20: WARNING: Block quote ends without a blank line; unexpected unindent.

reStructuredText is not Markdown, and indentation alone is not enough to demarcate the code block. reStructuredText calls this a literal block. Although the use of :: is one option, you might want to explicitly specify the language (overriding the default) with the use of the code-block directive.
Also I noticed that you have invalid syntax in your code block—a missing ) and extra spaces in your indentation—which could have caused those errors.
Try this.
def read_and_validate_csv(basename, specs_and_rules):
    """Read a CSV file with respect to specifications about format and
    rules about valid values.

    Hints: Do not use objects of type type (e.g. str instead of "str") when
    specifying the column type.

    .. code-block:: python

        specs_and_rules = {
            'TEMPLATES': {
                'T1l': ('Int16', [-9, ' '])
            },
            'ColumnA': 'str',
            'ColumnB': ('str', 'no answer'),
            'ColumnC': None,
            'ColumnD': (
                'Int16',
                -9, {
                    'len': [1, 2, (4-8)],
                    'val': [0, 1, (3-9)]
                }
            )
        }

    Returns:
        (pandas.DataFrame): Result.
    """

Logstash getting syntax errors after upgrading to 7.13.3

So my company has me upgrading our Logstash version for our repository to 7.13.3 from 6.6.2.
After fixing some of the other errors with the upgrade, it seems the last piece is to change the ruby syntax in the config file.
However, I am not too familiar with the language and not sure why the syntax no longer works.
Here is an example of one of the syntax errors we get from the file.
[2021-07-21T16:10:22,524][ERROR][logstash.javapipeline ][main] Pipeline error {
:pipeline_id=>"main",
:exception=>#<RegexpError: unmatched range specifier in char-class: /(?<ucd_environment_name1>(?<=release_ucd_environment_name:)[\w-.]*)/m>,
:backtrace=>[
"org/jruby/RubyRegexp.java:965:in `initialize'",
"/Users/808451090/Desktop/app/logstash-7.13.3/vendor/bundle/jruby/2.5.0/gems/jls-grok-0.11.5/lib/grok-pure.rb:127:in `compile'",
"/Users/808451090/Desktop/app/logstash-7.13.3/vendor/bundle/jruby/2.5.0/gems/logstash-filter-grok-4.4.0/lib/logstash/filters/grok.rb:282:in `block in register'",
"org/jruby/RubyArray.java:1809:in `each'",
"/Users/808451090/Desktop/app/logstash-7.13.3/vendor/bundle/jruby/2.5.0/gems/logstash-filter-grok-4.4.0/lib/logstash/filters/grok.rb:276:in `block in register'",
"org/jruby/RubyHash.java:1415:in `each'",
"/Users/808451090/Desktop/app/logstash-7.13.3/vendor/bundle/jruby/2.5.0/gems/logstash-filter-grok-4.4.0/lib/logstash/filters/grok.rb:271:in `register'",
"org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:75:in `register'",
"/Users/808451090/Desktop/app/logstash-7.13.3/logstash-core/lib/logstash/java_pipeline.rb:228:in `block in register_plugins'",
"org/jruby/RubyArray.java:1809:in `each'",
"/Users/808451090/Desktop/app/logstash-7.13.3/logstash-core/lib/logstash/java_pipeline.rb:227:in `register_plugins'",
"/Users/808451090/Desktop/app/logstash-7.13.3/logstash-core/lib/logstash/java_pipeline.rb:586:in `maybe_setup_out_plugins'",
"/Users/808451090/Desktop/app/logstash-7.13.3/logstash-core/lib/logstash/java_pipeline.rb:240:in `start_workers'",
"/Users/808451090/Desktop/app/logstash-7.13.3/logstash-core/lib/logstash/java_pipeline.rb:185:in `run'",
"/Users/808451090/Desktop/app/logstash-7.13.3/logstash-core/lib/logstash/java_pipeline.rb:137:in `block in start'"
],
"pipeline.sources"=>["/Users/808451090/Desktop/app/logstash-7.13.3/devops-jenkins/jenkins.conf"],
:thread=>"#<Thread:0x4fbac51b run>"
}
The line this error references:
grok {
match => { "message_string" => "(?<ucd_environment_name1>(?<=release_ucd_environment_name:)[\w-.]*)" }
}
There are other lines where this error occurs, but they all have similar syntax to this line so I'm sure I can apply the same change to those ones too.
Can anyone point me to how I can change this line so that it checks for the same expression in a form that this Logstash version accepts syntactically?

The error is
unmatched range specifier in char-class
In character classes, you can define ranges of characters, e.g. [a-z]. When using a literal dash character there, you have to be careful either to escape it or to make sure it unambiguously denotes a single character rather than a range. In [\w-.], the dash sits between \w and . and is read as the start of a range.
In your example, you can just escape the dash in your regex to make sure it is treated as a single literal character:
grok {
match => { "message_string" => "(?<ucd_environment_name1>(?<=release_ucd_environment_name:)[\w\-.]*)" }
}
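The escaped pattern can be tried outside Logstash as a plain Ruby regex. This is only a sketch: the sample log line is hypothetical, and the second variant (dash placed last in the class) is a common alternative that I am assuming the Logstash regex engine also accepts.

```ruby
# Pattern with the dash escaped, as suggested above. Single quotes keep the backslashes intact.
escaped = Regexp.new('(?<ucd_environment_name1>(?<=release_ucd_environment_name:)[\w\-.]*)')

# Alternative: a dash placed last in the class cannot start a range.
dash_last = Regexp.new('(?<ucd_environment_name1>(?<=release_ucd_environment_name:)[\w.-]*)')

# Hypothetical log line for illustration.
line = "release_ucd_environment_name:qa-east.1 other text"

first  = escaped.match(line)[:ucd_environment_name1]
second = dash_last.match(line)[:ucd_environment_name1]

puts first
puts second
```

Both variants capture the same text here, so the other grok lines with the same [\w-.] class could be changed the same way.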

No such file or directory - ruby

I am trying to read the contents of a file from a local disk as follows:
content = File.read("C:\abc.rb","r")
When I execute the .rb file, I get an exception: Error: No such file or directory. What am I missing here?

In a double-quoted string, "\a" is the non-printable BEL (bell) character, just as "\n" is a newline. (I think these escapes originate from C.)
You don't have a file named "C:<BEL>bc.rb", which is why you get the error.
To fix it, use single quotes, where these escape sequences are not processed:
content = File.read('C:\abc.rb')
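The difference between the two quoting styles can be checked without touching the file system. A small sketch (the variable names are only for illustration):

```ruby
double_quoted = "C:\abc.rb"   # "\a" becomes one BEL character (byte 7)
single_quoted = 'C:\abc.rb'   # backslash and "a" stay as two characters

# Inspect the third character of each string.
puts double_quoted[2].ord     # code point of BEL
puts single_quoted[2]         # a literal backslash

puts double_quoted.length
puts single_quoted.length
```

The double-quoted path is one character shorter because the backslash and the "a" have been merged into a single control character.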

content = File.read("C:\/abc.rb","r")

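First, list the current directory to see what it contains (and therefore which directory the script is looking at):
Dir.glob("*")
Then open the file using forward slashes, which Ruby accepts on Windows. This assumes a is a string defined earlier that you are appending to:
open("C:/abc.rb", "rb") { |io| a = a + io.read }
EDIT: Unless you're concatenating files together, you could write it as: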
data = File.open("C:/abc.rb", "rb") { |io| io.read }

Problems catching unidecoder exceptions

I'm trying out the unidecoder gem and it's giving me problems with some strings:
require 'unidecoder'
str = "\u00A3"
str.to_ascii
#: (C:/Ruby193/lib/ruby/gems/1.9.1/gems/unidecoder-1.1.1/lib/unidecoder/data/x00.yml): found unknown escape character while parsing a quoted scalar at line 2 column 3
from C:/Ruby193/lib/ruby/1.9.1/psych.rb:203:in `parse'
from C:/Ruby193/lib/ruby/1.9.1/psych.rb:203:in `parse_stream'
from C:/Ruby193/lib/ruby/1.9.1/psych.rb:151:in `parse'
from C:/Ruby193/lib/ruby/1.9.1/psych.rb:127:in `load'
from C:/Ruby193/lib/ruby/1.9.1/psych.rb:297:in `block in load_file'
from C:/Ruby193/lib/ruby/1.9.1/psych.rb:297:in `open'
from C:/Ruby193/lib/ruby/1.9.1/psych.rb:297:in `load_file'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/unidecoder-1.1.1/lib/unidecoder.rb:8:in `block in '
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/unidecoder-1.1.1/lib/unidecoder.rb:78:in `yield'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/unidecoder-1.1.1/lib/unidecoder.rb:78:in `default'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/unidecoder-1.1.1/lib/unidecoder.rb:78:in `decode_char'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/unidecoder-1.1.1/lib/unidecoder.rb:39:in `block in decode'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/unidecoder-1.1.1/lib/unidecoder.rb:37:in `gsub'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/unidecoder-1.1.1/lib/unidecoder.rb:37:in `decode'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/unidecoder-1.1.1/lib/unidecoder.rb:16:in `to_ascii'
from (irb):21
from C:/Ruby193/bin/irb:12:in'>>
What's worse, I can't catch the error by doing:
foo = str.to_ascii rescue 'x'
Does anyone know what's happening here?

In a rescue clause with no exception list, the handled class defaults to StandardError. It looks like unidecoder raises some other kind of exception, but the stack trace seems to be incomplete (it should show the exception type).

Take a look at "C:/Ruby193/lib/ruby/gems/1.9.1/gems/unidecoder-1.1.1/lib/unidecoder/data/x00.yml". Line 2 is a YAML entry, - "\z", which is not a valid escape sequence in Ruby (it is a Regexp anchor that marks the end of the string). This might be a bug. You can edit this line to - "\x00".
However, "\u00A3" (£) is not an ASCII character, so I don't see the point of converting it to ASCII.
The exception raised is Psych::SyntaxError. You can catch that specific exception, as @mudasobwa commented.
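The behaviour of the rescue modifier can be reproduced without unidecoder. In this sketch, FakeParseError is a hypothetical stand-in that, by assumption, sits outside the StandardError hierarchy the way the failed rescue suggests Psych::SyntaxError did on that Ruby version:

```ruby
# Hypothetical error class that does NOT descend from StandardError.
class FakeParseError < ScriptError; end

def risky
  raise FakeParseError, "found unknown escape character"
end

# The rescue modifier only handles StandardError, so the error passes through it
# and is caught by the outer, explicit rescue instead.
modifier_result = begin
  risky rescue "x"
rescue ScriptError => e
  "not caught by modifier: #{e.class}"
end

# Naming the exception class explicitly catches it.
explicit_result = begin
  risky
rescue FakeParseError
  "x"
end

puts modifier_result
puts explicit_result
```

With unidecoder, the explicit form would name Psych::SyntaxError in place of the stand-in class.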

How to ignore "invalid byte sequence in US-ASCII" with AptanaStudio Ruby Debugger?

I'm not a good English speaker, so please forgive my English.
I am using AptanaStudio 3 with Ruby 1.9.2 on Windows 7.
When I use Win32 memcpy to get string data from shared memory in the Ruby debugger, the following problem occurs.
# encoding: utf-8
require 'windows/file_mapping'
require 'windows/msvcrt/buffer'
require 'windows/handle'
include Windows::FileMapping
include Windows::MSVCRT::Buffer
include Windows::Handle
buf1 = 0.chr * 256
@mh = OpenFileMapping(FILE_MAP_ALL_ACCESS, false, "TAG_NAME")
@address = MapViewOfFile(@mh, FILE_MAP_ALL_ACCESS, 0, 0, 0)
memcpy(buf1, @address, 256)
UnmapViewOfFile(@address)
CloseHandle(@mh)
puts buf1.unpack("Z*")
The error is shown below:
c:/Ruby192/lib/ruby/1.9.1/syck/rubytypes.rb:151:in `count'
c:/Ruby192/lib/ruby/1.9.1/syck/rubytypes.rb:151:in `is_binary_data?'
C:/Users/Zenbook/SkyDrive/AptanaStudio/workspace/best_practice/test.rb:19:in `<top (required)>'
c:/Ruby192/lib/ruby/gems/1.9.1/gems/ruby-debug-ide-0.4.16/lib/ruby-debug-ide.rb:112:in `debug_load'
c:/Ruby192/lib/ruby/gems/1.9.1/gems/ruby-debug-ide-0.4.16/lib/ruby-debug-ide.rb:112:in `debug_program'
c:/Ruby192/lib/ruby/gems/1.9.1/gems/ruby-debug-ide-0.4.16/bin/rdebug-ide:87:in `<top (required)>'
c:/Ruby192/bin/rdebug-ide:19:in `load'
c:/Ruby192/bin/rdebug-ide:19:in `<main>'
Uncaught exception: invalid byte sequence in US-ASCII
It doesn't occur when I don't set any breakpoint after memcpy, or when I copy only the actual string length. That is, when the char buffer size is 256 and the string length is 12, copying just 12 bytes does not trigger the problem.
I think this is because the debugger cannot read the text of a char buffer that includes an uninitialized region.
So I want to either ignore the error or allow the string to contain invalid text.
Can anyone help me?
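One possible direction, offered only as a sketch: mark the buffer as binary so that arbitrary bytes count as valid, then cut it at the first NUL. The buffer below is a hypothetical stand-in for the memory filled by memcpy, and whether this satisfies the AptanaStudio debugger is an assumption that has not been tested here.

```ruby
# Hypothetical buffer: 12 bytes of text, a NUL terminator, then garbage bytes,
# tagged as US-ASCII like the 0.chr * 256 buffer in the question.
buf = "hello world!\x00\xFF\xFE".b.force_encoding(Encoding::US_ASCII)

puts buf.valid_encoding?        # bytes above 0x7F are invalid in US-ASCII

# Re-tag a copy as binary (ASCII-8BIT); no bytes are changed.
binary = buf.dup.force_encoding(Encoding::BINARY)
puts binary.valid_encoding?

# Keep only the text before the first NUL byte.
text = binary.unpack("Z*").first
puts text
```

Re-tagging with force_encoding changes only the encoding label, so the garbage bytes remain in binary but are no longer reported as an invalid sequence.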
