prevalence of UCS2 vs UCS4 python chars - python-2.x

builtin functions - unichr says
The valid range for the argument depends how Python was configured – it may be either UCS2 [0..0xFFFF] or UCS4 [0..0x10FFFF]
and
builtin functions - ord says
If a unicode argument is given and Python was built with UCS2 Unicode, then the character’s code point must be in the range [0..65535] inclusive; otherwise the string length is two, and a TypeError will be raised.
Are there any stats on how widely used the two definitions of code-unit are in production python interpreters?
Any idea how prevalent python scripts are that use something like #!/usr/bin/env python and which run with different code-unit definitions based on the environment of the user running it?
Background:
I'm wondering how much work to put into making a parser generator backend for python 2.x produce a single library that works for both configurations given that Python 3 tightened this.
Specifically, am I likely bloating the generated code bundle unnecessarily by doing
# Module my_generated_parser
try
unichr(0x10000)
except ValueError:
from my_generated_parser_ucs2 import *
else:
from my_generated_parser_ucs4 import *
and including two generated parsers by default?

Related

Parallel processing in Julia throws errors

My understanding is that parallelization is included by default in a base Julia installation.
However, when I try to use it, I am getting errors that the functions and macros are not defined. For example:
nprocs()
Throws an error:
ERROR: UndefVarError: nprocs not defined
Stacktrace:
[1] top-level scope at none:0
Nowhere in any Julia documentation can I find mention of any packages that need to be included in order to use these functions. Am I missing something here?
I am using Julia version 1.0.5 inside the JuliaPro/Atom IDE
I figured it out. I'll leave this up for anyone else who is having this problem.
The solution is to import the Distributed package using:
using Distributed
Why this is not included in the documentation I do not know.
Once you know that nproc needs to be used, there exist a couple of options to find where it is defined.
A search through the documentation can help: https://docs.julialang.org/en/v1/search/?q=nprocs
Without leaving the Julia REPL, and even before nprocs gets imported in your session, you can use apropos in order to find more about it and determine that it is needed to import the Distributed package:
julia> apropos("nprocs")
Distributed.nprocs
Distributed.addprocs
Distributed.nworkers
An other way of using apropos is via the help REPL mode:
julia> # type `?` when the cursor is right after the prompt to enter help REPL mode
# note the use of double quotes to trigger "apropos" instead of a regular help query
help?> "nprocs"
Distributed.nprocs
Distributed.addprocs
Distributed.nworkers
Previous options work well in the case of nprocs because it is part of the standard library. JuliaHub is another option which allows looking for things more broadly, in the entire Julia ecosystem. As an example, looking for nprocs in JuliaHub's "Doc Search" tool also returns relevant results: https://juliahub.com/ui/Documentation?q=nprocs

Retrieve hostname -f within YAML(yml)

I've got a config .cfg file that has the hostname hard coded in it. I'm trying to find a way for the hostname to be gotten locally (dynamically) by running a command similar to hostname -f to have it configure the variable in the .cfg, without running a script, like python, to write the config file ahead time. Is it possible to run a 'yum' command that gets the hostname to use in the YAML/yml file?
Thanks to Wikipedia, I think I found out why no one is helping me with this:
Wiki YAML -> Security
Security
YAML is purely a data representation language and thus has no executable commands. While validation and safe parsing is inherently possible in any data language, implementation is such a notorious pitfall that YAML's lack of an associated command language may be a relative security benefit.
However, YAML allows language-specific tags so that arbitrary local objects can be created by a parser that supports those tags. Any YAML parser that allows sophisticated object instantiation to be executed opens the potential for an injection attack. Perl parsers that allow loading of objects of arbitrary class create so-called "blessed" values. Using these values may trigger unexpected behavior, e.g. if the class uses overloaded operators. This may lead to execution of arbitrary Perl code.
The situation is similar for Python or Ruby parsers. According to the PyYAML documentation

Golang complex fold grüßen

I'm trying to get case folding to be consistent between three languages (C++, Python and Golang) because I need to be able to check if a string matches the one saved no matter the language.
An example problematic word is the German word "grüßen" which in uppercase is "GRÜSSEN" (Note the 'ß' becomes two characters as 'SS').
C++ works well using boost::locale text conversion docs
Python 3 also works through str.casefold() casefold docs
However, Golang doesn't seem to have a way to do proper case folding. golang
playground example
Is there some way to do this that I'm missing, or does this bug at the end of unicode's documentation apply to all usages of text conversion in golang? If so, what are my options for case folding other than writing it in cgo?
Advanced (Unicode-enabled) text processing is not part of the Go stdlib,¹
and exists in the form of a host of ("blessed") third-party packages
under the golang.org/x/text/ umbrella.
As Shawn figured out by himself, one can do
import (
"golang.org/x/text/cases"
)
c := cases.Fold()
c.String("grüßen")
to get "grüssen" back.
¹ That's because whatever is shipped in the stdlib is subject to the
Go 1 compatibility promise,
and at the time Go 1 was shipped certain functionality wasn't available
or was incomplete or its APIs were in flux etc, so such bits were kept out
of the core to let them mature.

Converting a working code from double-precision to quadruple-precision: How to read quadruple-precision numbers in FORTRAN from an input file

I have a big, old, FORTRAN 77 code that has worked for many, many years with no problems.
Double-precision is not enough anymore, so to convert to quadruple-precision I have:
Replaced all occurrences of REAL*8 to REAL*16
Replaced all functions like DCOS() into functions like COS()
Replaced all built-in numbers like 0.d0 to 0.q0 , and 1D+01 to 1Q+01
The program compiles with no errors or warnings with the gcc-4.6 compiler on
operating system: openSUSE 11.3 x86_64 (a 64-bit operating system)
hardware: Intel Xeon E5-2650 (Sandy Bridge)
My LD_LIBRARY_PATH variable is set to the 64-bit library folder:
/gcc-4.6/lib64
The program reads an input file that has numbers in it.
Those numbers used to be of the form 1.234D+02 (for the double-precision version of the code, which works).
I have changed them so now that number is 1.234Q+02 , however I get the runtime error:
Bad real number in item 1 of list input
Indicating that the subroutine that reads in the data from the input file (called read.f) does not find the first number in the inputfile, to be compatible with what it expected.
Strangely, the quadruple-precision version of the code does not complain when the input file contains numbers like 1.234D+02 or 123.4 (which, based on the output seems to automatically be converted to the form 1.234D+02 rather than Q+02), it just does not like the Q+02 , so it seems that gcc-4.6 does not allow quadruple-precision numbers to be read in from input files in scientific notation !
Has anyone ever been able to read from an input file a quadruple-precision number in scientific notation (ie, like 1234Q+02) in FORTRAN with a gcc compiler, and if so how did you get it to work ? (or did you need a different compiler/operating system/hardware to get it to work ?)
Almost all of this is already in comments by #IanH and #Vladimi.
I suggest mixing in a little Fortran 90 into your FORTRAN 77 code.
Write all of your numbers with "E". Change your other program to write the data this way. Don't bother with "D" and don't try to use the infrequently supported "Q". (Using "Q" in constants in source code is an extension of gfortran -- see 6.1.8 in manual.)
Since you want the same source code to support two precisions, at the top of the program, have:
use ISO_FORTRAN_ENV
WP = real128
or
use ISO_FORTRAN_ENV
WP = real64
as the variation that changes whether your code is using double or quadruple precision. This is using the ISO Fortran Environment to select the types by their number of bits. (use needs to between program and implicit none; the assignment statement after implicit none.)
Then declare your real variables via:
real (WP) :: MyVar
In source code, write real constants as 1.23456789012345E+12_WP. The _type is the Fortran 90 way of specifying the type of a constant. This way you can go back and forth between double and quadruple precision by only changing the single line defining WP
WP == Working Precision.
Just use "E" in input files. Fortran will read according to the type of the variable.
Why not write a tiny test program to try it out?

Why do people say that Java can't have an expression evaluator?

I am aware that by default Java does not have the so-called eval (what I pronounce as "evil") method. This sounds like a bad thing—knowing you do not have something which so many others do. But even worse seems being notified that you can't have it.
My question is: What is solid reasoning behind it? I mean, Google'ing this just returns a massive amount of old data and bogus reasons—even if there is an answer that I'm looking for, I can't filter it from people who are just throwing generic tag-words around.
I'm not interested in answers that are telling me how to get around that; I can do that myself:
Using Bean Scripting Framework (BSF)
File sample.py (in py folder) contents:
def factorial(n):
return reduce(lambda x, y:x * y, range(1, n + 1))
And Java code:
ScriptEngine engine = new ScriptEngineManager().getEngineByName("jython");
engine.eval(new FileReader("py" + java.io.File.separator + "sample.py"));
System.out.println(engine.eval("factorial(932)"));
Using designed bridges like JLink
This is equivalent to:
String expr = "N[Integrate[E^(2 y^5)/(2 x^3), {x, 4, 7}, {y, 2, 3}]]";
System.out.println(MM.Eval(expr));
//Output: 1.5187560850359461*^206 + 4.2210685420287355*^190*I
Other methods
Using Dijkstras shunting-yard algorithm or alike and writing an expression evaluator from scratch.
Using complex regex and string manipulations with delegates and HashMultimaps.
Using Java Expressions Library
Using Java Expression Language
Using JRE compliant scripting language like BeanShell.
Using the Java Assembler and approach below or direct bytecode manipulation like Javaassist.
Using the Java Compiler API and reflections.
Using Runtime.getRuntime().exec as root
"eval" is only available in scripting languages, because it uses the same interpreter that runs the rest of the code; in such languages the feature is free and well integrated, as in scripting environment it makes little difference if you run a string or a "real" function.
In copiled languages, adding "eval" would mean bundling the whole compiler - which would defy the purpose of compiling. No compiled language I know (even dynamic ones, like ActionScrip3) has eval.
Incidentally, the easiest way to eval in Java is the one you forgot to mention: JRE 1.6 comes with Javascript engine, so you can eval any Javascript in two lines of code. You could even argue that the presuposition of your question is false. Java 1.6 bundles a very advanced expression evaluator.
As Daniel points out there is at least one limitation that eval-solutions face in java. The php eval for example executes the code as if it was part of the surrounding method with complete access to local variables, this is not possible to do in standard java. Without this feature eval alternatives require a lot more work and verbosity, which makes them a lot less attractive for "quick" and "easy" solutions.
eval() is mostly part of interpreted languages where the names of local variables and code structure(scopes) are available at runtime, making it possible to "insert" new code. Java bytecode no longer contains this information leaving eval() alternatives unable to map access to local variables. (Note: I ignore debug information as no program should rely on it and it may not be present)
An example
int i = 0;
eval("i = 1");
System.out.println(i);
required pseudocode for java
context.put("i",new Integer(0));
eval(context,"i = 1");
System.out.println(context.get("i"));
This looks nice for one variable used in the eval, try it for 10 in a longer method and you get 20 additional lines for variable access and the one or other runtime error if you forget one.
Because evaluation of arbitrary Java expressions depends on the context of it, of variable scopes etc.
If you need some kind of variable expression, just use the scripting framework, and badamm! you have lots of different kinds of expression evaluation. Just take one kind like JavaScript as a default, and there is your eval()!
Enterprisy as Java is, you are not constrained to one choice.
But even worse seems being notified that you can't have it.
I think you are misunderstanding what (most of) those articles are saying. Clearly, there are many ways to do expression evaluation in a Java application. They haven't always been available, but at least some of them have been around for a long time.
I think what people are trying to say is that expression evaluation is not available as native (i.e. as an intrinsic part of Java or the standard libraries) and is unlikely to be added for a number of good reasons. For example:
Native eval would have significant security issues if used in the wrong place. (And it does for other languages; e.g. you shouldn't use eval in Javascript to read JSON because it can be a route for injecting bad stuff into the user's browser.)
Native eval would have significant performance issues, compared with compiled Java code. We are talking of 100 to 10,000 times slower, depending on the implementation techniques and the amount of caching of "compiled" eval expressions.
Native eval would introduce a whole stack of reliability issues ... much as overuse / misuse of type casting and reflection to.
Native eval is "not Java". Java is designed to be a primarily static programming language.
and of course ...
There are other ways to do this, including all of the implementation approaches that you listed. The Java SE platform is not in the business of providing every possible library that anyone could possibly want. (JRE downloads are big enough already.)
For these reasons, and probably others as well, the Java language designers have decided not to support expression evaluation natively in Java SE. (Even so, some expression support has officially made it into Java EE; e.g. in the form of JSP Expression Language. The classes are in the javax.el package ... or javax.servlet.jsp.el for an older / deprecated version.)
I think you already put the solution to your answer - bundle the BeanShell jar with your application (or lobby for it to be included in the JRE sometime), and you have your Java expression evaluator. It will still need a Binding of the input variables, though.
(What I'm more curious about: How does sandboxing of such a script/expression work? I don't want my web users to execute dangerous code in my server.)

Resources