EBNF Syntax for imaginary language - ebnf

In school we have been studying metalanguages, in particular, railroad diagrams and EBNF. I received a question where an imaginary programming language (winston) was described in EBNF. Here it is:
Digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
LCase = a | b | c | d
UCase = A | B | C | D | E | F | G | H | I | J
Operator = + | - | * | /
Logical = < | > | <= | >= | <>
Constant = [-] <Digit>{<Digit>}
Identifier = <UCase>{<LCase> | <Digit>}
Assignment = Set <Identifier> to <Constant> | <Identifier
{<Operator>(<Constant> | <Identifier>)}
Condition = <Identifier> <Logical> (<Identifier> | <Constant>)
{(and | or) <Identifier> <Logical> (<Identifier> | <Constant>)}
When = (<Assignment> | <Condition> {<Assignment> | <Condition>})
Statement = <Input> | <Output> | <Assignment> | <Condition> | <When> | <Pretest> | <Posttest>
Program = Start <Statement> {! <Statement>} Stop
The program written below was made with winston but doesn't execute properly. Use the EBNF descriptions to identify the error.
Start
Input J1
Input J2
When (J1 = J2, Set A3 to 0), (J1 < J2, Set A3 to -1), Set A3 to 1
Output A3
Stop
My working so far: To me, this program seems legitimate. It is a program so if must start with "start" and end with "stop", which it does. The statements in the middle seemingly are allowed to be in there. Can someone point me in the right direction?
Also, can someone tell me what it means in the EBNF description of a program what this means:<statement>
I think it means statements like when and if but Im not too sure. Thanks for the help :)

When is comma-separated and the grammar doesn't specify commas at all.
J1 = J2 -- there is no = comparison op in the grammar (see Logical), so J1 = J2 is neither Assignment, nor Condition and is thus invalid.
<statement> -- the grammar wraps symbols in angle brackets on the right-hand side, e.g. Identifier on the left-hand and, later, <Identifier> in Assignment rule -- that doesn't look like a valid EBNF.

Related

How avoid a dataframe.collect() in a pyspark request for better perform

I have to need find evenements in two dataframe (sources and sons). I look for the KeySource == Key in an time interval before son times (the nearest time if son is unique.) I use a dataframe.collect() to compare first dataframe line to the second. Normally, the code (reproduced from my human memory) do the job, but it's very slow even for small tables. I'm new in spark. May I improve performance ? My platform is in pyspark 2.3.
Thanks!
More details :
Ev1
| Time | Key | Index |
| -------- | -------- | -------- |
| t1 | k1 | i1 |
| t2 | k2 | i2 |
| t3 | k3 | i3 |
| t4 | k1 | i4 |
Ev2
| Time | KeySource| Index |
| -------- | -------- | -------- |
| t1 | k1 | i1 |
| t2 | k3 | i2 |
| t3 | k1 | i3 |
| t4 | k5 | i4 |
I look for the KeySource == Key in an time interval before Ev2.Time (the nearest time if son is unique.)
DeltaT_max = 30000
Unique_Son = True or False
Ev2 = Ev2.withColumn("tmin", Ev2.Time -lit(DeltaT_max ) )
Ev1 = Ev2.withColumn("Son_Index", lit(None)) )
Ev1 = Ev1.sort(desc("Time"))
Ev2_without_father = dict()
for ev2 in Ev2.select("Time", "SourceIndex", "Index", "tmin") .collect() :
tmp_Ev1 = Ev1.filter( (Ev1.Key == ev2.KeySource ) & (Ev1.Time >= ev2.tmin) & (Ev1.Time <= ev2.Time) )
if tmp_Ev1.count() > 0 :
# I'm not sure withColumn here are a good idea but my answer is good...
if Unique_Son :
Ev1 = Ev1.withcolumn("Son_Index", when(ev2.Index == tmp_Ev1 .Index, lit(ev2.firs().Index) ).otherwise(Ev2.Son_Index))
else :
Ev1 = Ev1.withcolumn("Son_Index", when(ev2.Index == tmp_Ev1 .Index, lit(ev2.Index) ).otherwise(Ev2.Son_Index))
else :
# rare case but rebuilt a dataframe after for CSV export
Ev2_without_father[ev2.Index] = ev2.KeySource
Ev2 = Ev2.drop("tmin")
Result = Ev1.select("Index", "Son_Index")
Normally, the code (reproduced from my human memory) do the job, but it's very slow even for some hundred lines. I'm new in spark. May I improve performance ? My platform is in pyspark 2.3.
Thanks!

RST/Sphinx create inline target markup inside array

I have an array in sphinx / rst, and I would like to reference a line or cell from other part of my documentation.
How can I create an inline markup reference target in the array?
The array looks like this:
+-----------------+-------------------------+-------------------------------------------------------+
| e | c | p |
+=================+=========================+=======================================================+
| e1 | c1 | p1 |
+-----------------+-------------------------+-------------------------------------------------------+
| e2 | c2 | p2 |
+-----------------+-------------------------+-------------------------------------------------------+
I did not think this was possible, but this worked for me.
+----+----+-------------------------+
| e | c | .. _my-reference-label: |
| | | |
| | | p |
+====+====+=========================+
| e1 | c1 | p1 |
+----+----+-------------------------+
| e2 | c2 | p2 |
+----+----+-------------------------+
and then the link to the target would be:
:ref:`Link title <my-reference-label>`.
The formatting makes the targeted cell larger than it should be, but you can fiddle with the other column widths to get percentage widths close enough.

Print all possible path using haskell

I need to write a program to draw all possible paths in a given matrix that can be had by moving in only left, right and up direction.
One should not cross the same location more than once. Note also that on a particular path, we may or may not use motion in all possible directions.
Path will start in the bottom-left corner in the matrix and will reach the top-right corner.
Following symbols are used to denote the direction of the motion in the current position:
+---+
| > | right
+---+
+---+
| ^ | up
+---+
+---+
| < | left
+---+
The symbol * is used in the final location to indicate end of path.
Example:
For a 5x8 matrix, using left, right and up directions, 2 different paths are shown below.
Path 1:
+---+---+---+---+---+---+---+---+
| | | | | | | | * |
+---+---+---+---+---+---+---+---+
| | | > | > | > | > | > | ^ |
+---+---+---+---+---+---+---+---+
| | | ^ | < | < | | | |
+---+---+---+---+---+---+---+---+
| | > | > | > | ^ | | | |
+---+---+---+---+---+---+---+---+
| > | ^ | | | | | | |
+---+---+---+---+---+---+---+---+
Path 2
+---+---+---+---+---+---+---+---+
| | | | > | > | > | > | * |
+---+---+---+---+---+---+---+---+
| | | | ^ | < | < | | |
+---+---+---+---+---+---+---+---+
| | | | | | ^ | | |
+---+---+---+---+---+---+---+---+
| | | > | > | > | ^ | | |
+---+---+---+---+---+---+---+---+
| > | > | ^ | | | | | |
+---+---+---+---+---+---+---+---+
Can anyone help me with this?
I tried to solve using lists. It i soon realized that i am making a disaster. Here is the code i tried with.
solution x y = travel (1,1) (x,y)
travelRight (x,y) = zip [1..x] [1,1..] ++ [(x,y)]
travelUp (x,y) = zip [1,1..] [1..y] ++ [(x,y)]
minPaths = [[(1,1),(2,1),(2,2)],[(1,1),(1,2),(2,2)]]
travel startpos (x,y) = rt (x,y) ++ up (x,y)
rt (x,y) | odd y = map (++[(x,y)]) (furtherRight (3,2) (x,2) minPaths)
| otherwise = furtherRight (3,2) (x,2) minPaths
up (x,y) | odd x = map (++[(x,y)]) (furtherUp (2,3) (2,y) minPaths)
| otherwise = furtherUp (2,3) (2,y) minPaths
furtherRight currpos endpos paths | currpos == endpos = (travelRight currpos) : map (++[currpos]) paths
| otherwise = furtherRight (nextRight currpos) endpos ((travelRight currpos) : (map (++[currpos]) paths))
nextRight (x,y) = (x+1,y)
furtherUp currpos endpos paths | currpos == endpos = (travelUp currpos) : map (++[currpos]) paths
| otherwise = furtherUp (nextUp currpos) endpos ((travelUp currpos) : (map(++[currpos]) paths))
nextUp (x,y) = (x,y+1)
identify lst = map (map iden) lst
iden (x,y) = (x,y,1)
arrows lst = map mydir lst
mydir (ele:[]) = "*"
mydir ((x1,y1):(x2,y2):lst) | x1==x2 = '>' : mydir ((x2,y2):lst)
| otherwise = '^' : mydir ((x2,y2):lst)
surroundBox lst = map (map createBox) lst
bar = "+ -+"
mid x = "| "++ [x] ++" |"
createBox chr = bar ++ "\n" ++ mid chr ++ "\n" ++ bar ++ "\n"
This ASCII grids are much more confusing than enlightening. Let me describe a better way to represent each possible path.
Each non-top row will have exactly one cell with UP. I claim that once each of the UP cells has been chosen that the LEFT and RIGHT and EMPTY cells can be determined. I claim that all possible cells in each of the non-top rows can be UP in all combination.
Each path is thus isomorphic to a (rows-1) length list of numbers in the range (1..columns) that determine the UP cells. The number of allowed paths is thus columns^(rows-1) and enumerating the possible paths in this format should be easy.
Then you could make a printer that converts this format to the ASCII art. This may be annoying, depending on skill level.
Looks like a homework so I will try to give enough hints
Try first filling number of paths from a cell to your goal.
So
+---+---+---+---+---+---+---+---+
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | * |
+---+---+---+---+---+---+---+---+
| | | | | | | | |
+---+---+---+---+---+---+---+---+
| | | | | | | | |
+---+---+---+---+---+---+---+---+
| | | | | | | | |
+---+---+---+---+---+---+---+---+
| | | | | | | | |
+---+---+---+---+---+---+---+---+
The thing to note here is from the cell in the top level there will always be one path to the *.
Number of possible path from cells in the same row will be same. You can realize this as all the paths will ultimately have to move up as there is no down action so in any path the cell above the current row can be reached by any cell in the current row.
You can feel the all possible paths from the current cell has its relation with the possible paths from the cell left,right and above. But as we know we can find all possible paths from only one cell in a row and rest of cells' possible paths will be some movements in the same row followed by a suffix of possible paths from that cell.
Maybe I will give you a example
+---+---+---+
| 1 | 1 | * |
+---+---+---+
| | | |
+---+---+---+
| | | |
+---+---+---+
You know all possible paths from cells in the first row. You need to find the same in the second row. So a good strategy would be to do it for the right most cell
+---+---+---+
| > | > | * |
+---+---+---+
| ^ | < | < |
+---+---+---+
| | | |
+---+---+---+
+---+---+---+
| | > | * |
+---+---+---+
| | ^ | < |
+---+---+---+
| | | |
+---+---+---+
+---+---+---+
| | | * |
+---+---+---+
| | | ^ |
+---+---+---+
| | | |
+---+---+---+
Now finding this for rest of the cells in the same row is trivial using these as I have told before.
In the end if you have m X n matrix the number of paths from bottom-left corner to top-right corner will be n^(m-1).
Another way
This way is not very optimal but easy to implement. Consider m X n grid
Find the path of longest length. You dont need the exact path just the number of <,>,^.
You can find the direct formula in terms of m and n.
Like
^ = m - 1
< = (n-1) * floor((m-1)/2)
> = (n-1) * (floor((m-1)/2) + 1)
Any valid path will be a prefix of the permutations of this which you can search exhaustively. Use permutations from Data.List to get all possible permutations. Then make a function which given a path strips a valid path from this. map this over the list of permutations and remove duplicates. The thing to note is path will be a prefix of what you get from permutation, so there can be several permutations for the same path.
Can you create that matrix and define the "fields"? Even if you can't (a specific matrix is given), you can map an [(Int, Int)] matrix (which sounds reasonable for this kind of task) to your own representation.
Since you didn't specify what your skill level was, I hope you don't mind that I suggest that you first try to create some kind of a grid in order to have something to work on:
data Status = Free | Left | Right | Up
deriving (Read, Show, Eq)
type Position = (Int, Int)
type Field = (Position, Status)
type Grid = [Field]
grid :: Grid
grid = [((x, y), stat) | x <- [1..10], y <- [1..10], let stat = Free]
Of course there are other ways to achieve this. Afterwards you can define some movement, map Position to Grid index and Statuses to printable characters... Try to fiddle with it and you might get some ideas.

Data management with several variables

Currently I am facing the following problem, which I'm working in Stata to solve. I have added the algorithm tag, because it's mainly the steps that I'm interested in rather than the Stata code.
I have some variables, say, var1 - var20 that can possibly contain a string. I am only interested in some of these strings, let us call them A,B,C,D,E,F, but other strings can occur also (all of these will be denoted X). Also I have a unique identifier ID. A part of the data could look like this:
ID | var1 | var2 | var3 | .. | var20
1 | E | | | | X
1 | | A | | | C
2 | X | F | A | |
8 | | | | | E
Now I want to create an entry for every ID and for every occurrence of one of the strings A,B,C,E,D,F in any of the variables. The above data should look like this:
ID | var1 | var2 | var3 | .. | var20
1 | E | | | .. |
1 | | A | | |
1 | | | | | C
2 | | F | | |
2 | | | A | |
8 | | | | | E
Here we ignore every time there's a string X that is NOT A,B,C,D,E or F. My attempt so far was to create a variable that for each entry counts the number, N, of occurrences of A,B,C,D,E,F. In the original data above that variable would be N=1,2,2,1. Then for each entry I create N duplicates of this. This results in the data:
ID | var1 | var2 | var3 | .. | var20
1 | E | | | | X
1 | | A | | | C
1 | | A | | | C
2 | X | F | A | |
2 | X | F | A | |
8 | | | | | E
My problem is how do I attack this problem from here? And sorry for the poor title, but I couldn't word it any more specific.
Sorry, I thought the finally block was your desired output (now I understand that it's what you've accomplished so far). You can get the middle block with two calls to reshape (long, then wide).
First I'll generate data to match yours.
clear
set obs 4
* ids
generate n = _n
generate id = 1 in 1/2
replace id = 2 in 3
replace id = 8 in 4
* generate your variables
forvalues i = 1/20 {
generate var`i' = ""
}
replace var1 = "E" in 1
replace var1 = "X" in 3
replace var2 = "A" in 2
replace var2 = "F" in 3
replace var3 = "A" in 3
replace var20 = "X" in 1
replace var20 = "C" in 2
replace var20 = "E" in 4
Now the two calls to reshape.
* reshape to long, keep only desired obs, then reshape to wide
reshape long var, i(n id) string
keep if inlist(var, "A", "B", "C", "D", "E", "F")
tempvar long_id
generate int `long_id' = _n
reshape wide var, i(`long_id') string
The first reshape converts your data from wide to long. The var specifies that the variables you want to reshape to long all start with var. The i(n id) specifies that each unique combination of n and i is a unique observation. The reshape call provides one observation for each n-id combination for each of your var1 through var20 variables. So now there are 4*20=80 observations. Then I keep only the strings that you'd like to keep with inlist().
For the second reshape call var specifies that the values you're reshaping are in variable var and that you'll use this as the prefix. You wanted one row per remaining letter, so I made a new index (that has no real meaning in the end) that becomes the i index for the second reshape call (if I used n-id as the unique observation, then we'd end up back where we started, but with only the good strings). The j index remains from the first reshape call (variable _j) so the reshape already knows what suffix to give to each var.
These two reshape calls yield:
. list n id var1 var2 var3 var20
+-------------------------------------+
| n id var1 var2 var3 var20 |
|-------------------------------------|
1. | 1 1 E |
2. | 2 1 A |
3. | 2 1 C |
4. | 3 2 F |
5. | 3 2 A |
|-------------------------------------|
6. | 4 8 E |
+-------------------------------------+
You can easily add back variables that don't survive the two reshapes.
* if you need to add back dropped variables
forvalues i =1/20 {
capture confirm variable var`i'
if _rc {
generate var`i' = ""
}
}

Regex that matches valid Ruby local variable names

Does anyone know the rules for valid Ruby variable names? Can it be matched using a RegEx?
UPDATE: This is what I could come up with so far:
^[_a-z][a-zA-Z0-9_]+$
Does this seem right?
Identifiers are pretty straightforward. They begin with letters or an underscore, and contain letters, underscore and numbers. Local variables can't (or shouldn't?) begin with an uppercase letter, so you could just use a regex like this.
/^[a-z_][a-zA-Z_0-9]*$/
It's possible for variable names to be unicode letters, in which case most of the existing regexes don't match.
varname = "\u2211" # => "∑"
eval(varname + '= "Tony the Pony"') => "Tony the Pony"
puts varname # => ∑
local_variable_identifier = /Insert large regular expression here/
varname =~ local_variable_identifier # => nil
See also "Fun with Unicode" in either the Ruby 1.9 Pickaxe or at Fun with Unicode.
According to http://rubylearning.com/satishtalim/ruby_names.html a Ruby variable consists of:
A name is an uppercase letter,
lowercase letter, or an underscore
("_"), followed by Name characters
(this is any combination of upper- and
lowercase letters, underscore and
digits).
In addition, global variables begin with a dollar sign, instance variables with a single at-sign, and class variables with two at-signs.
A regular expression to match all that would be:
%r{
(\$|#{1,2})? # optional leading punctuation
[A-Za-z_] # at least one upper case, lower case, or underscore
[A-Za-z0-9_]* # optional characters (including digits)
}x
Hope that helps.
I like #aboutruby's answer, but just to complete it, here's the equivalent using POSIX bracket expressions.
/^[_[:lower:]][_[:alnum:]]*$/
Or, since a-z is actually shorter than [:lower:]:
/^[_a-z][_[:alnum:]]*$/
I think /^(\$){0,1}[_a-zA-Z][a-zA-Z0-9_]*([?!]){0,1}$/ is a bit closer to what you will need...
It depends on whether you want to match method names as well.
If you are trying to match a name that might be encountered in an expression, then it might start with $ and it might end with ? or !. If you know for sure that it is just a local variable then the rule will be much simpler.
i was trying to figure one out for a rails patch, and Matthew Draper wrote this one, using the ruby parser as a reference:
/\A(?![A-Z0-9])(?:[[:alnum:]_]|[^\0-\177])+\z/
And here it is, straight from the horse's mouth. (The horse in this case is the Draft ISO Ruby Specification):
local-variable-identifier → ( lowercase-character | _ ) identifier-character *
identifier-character → lowercase-character | uppercase-character | decimal-digit | _
uppercase-character → A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
lowercase-character → a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
decimal-digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
In Ruby 1.9, using named groups, you can translate this literally:
local_variable_identifier = %r{
(?<uppercase_character> A | B | C | D | E | F | G | H | I | J | K | L | M
| N | O | P | Q | R | S | T | U | V | W | X | Y | Z
){0}
(?<lowercase_character> a | b | c | d | e | f | g | h | i | j | k | l | m
| n | o | p | q | r | s | t | u | v | w | x | y | z
){0}
(?<decimal_digit> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9){0}
(?<identifier_character> \g<lowercase_character>
| \g<uppercase_character>
| \g<decimal_digit>
| _
){0}
( \g<lowercase_character> | _ ) \g<identifier_character>*
}x
Of course, this is not how you would really write it.

Resources