reading a "." in scheme R5RS - scheme

I need to be able to read user input in scheme for a project. For example, I need to be able to read the string 4 5 * .. I was implementing it using the (read) function but it gives an error when it reads a .. I would use a different symbol but it is specified by the project description. Is there a way to do this?

You cannot use read to input arbitrary text. The read procedure is only meant for inputting "S-expressions", a data format that can be used to represent a superset of Scheme source code expressions.
The reason you cannot read a . via the read procedure is that a period token has a special role in Scheme source: it is used for dotted pair notation. (C1 . C2) is the way that the pair of C1 and C2 is written as an S-expression. Note there is a crucial difference between the single pair (C1 . C2) and the list (C1 C2) (which is made from two pairs); and yet the only difference between the source text is the presence/absence of a single period.
The dotted pair notation is described in section 6.3.2 of the R5RS.
So, as suggested in the comments on your question by Dan D., you should consider using the read-char procedure to consume user input text. It described in section 6.6.2 of the R5RS. It may seem counter-intuitive, since read-char only consumes a single character while read consumes many characters (and builds a potentially large tree of structured data), but the reality is that you can build your own parser on top of read-char, by invoking it repeatedly in a loop, as suggested by Dan D.
In fact, some scheme systems implement read itself by making it a Scheme procedure that invokes read-char. See for example Larceny's reader source code, where read will call get-datum, which calls get-datum-with-source-locations, which calls read-char in a number of places.
Alternatively, you might have other ways of reading input from the user. The read-line procedure is quite common (and its also easy to write on top of read-char). Or you might look into a Parser-Generator (like the one that generated the source code for Larceny's reader, linked above.

Related

Difference between multiple values and plain tuples in Racket?

What is the difference between values and list or cons in Racket or Scheme? When is it better to use one over the other? For example, what would be the disadvantage if quotient/remainder returns (cons _ _) rather than (values _ _)?
Back in 2002 George Caswell asked that question in comp.lang.scheme.
The ensuing thread is long, but has many insights. The discussion
reveals that opinions are divided.
https://groups.google.com/d/msg/comp.lang.scheme/ruhDvI9utVc/786ztruIUNYJ
My answer back then:
> What are the motivations behind Scheme's multiple return values feature?
> Is it meant to reflect the difference in intent, or is there a
> runtime-practical reason?
I imagine the reason being this.
Let's say that need f is called by g. g needs several values from f.
Without multiple value return, f packs the values in a list (or vector),
which is passed to g. g then immediately unpacks the list.
With multple values, the values are just pushed on the stack. Thus no
packing and unpacking is done.
Whether this should be called an optimization hack or not, is up to you.
--
Jens Axel Søgaard
We don't need no side-effecting We don't need no allocation
We don't need no flow control We don't need no special-nodes
No global variables for execution No dark bit-flipping for debugging
Hey! did you leave the args alone? Hey! did you leave those bits alone?
(Chorus) -- "Another Glitch in the Call", a la Pink Floyd
They are semantically the same in Scheme and Racket. In both you need to know how the return looks like to use it.
values is connected to call-with-values and special forms like let-values are just syntax sugar with this procedure call. The user needs to know the form of the result to use call-with-values to make use of the result. A return is often done on a stack and a call is also on a stack. The only reason to favor values in Scheme would be that there are no overhead between the producer return and the consumer call.
With cons (or list) the user needs to know how the data structure of the return looks like. As with values you can use apply instead of call-with-values to do the same thing. As a replacement for let-values (and more) it's easy to make a destructuring-bind macro.
In Common Lisp it's quite different. You can use values always if you have more information to give and the user can still use it as a normal procedure if she only wants to use the first value. Thus for CL you wouldn't need to supply quotient as a variant since quotient/remainder would work just as well. Only when you use special forms or procedures that take multiple values will the fact that the procedure does return more values work the same way as with Scheme. This makes values a better choice in CL than Scheme since you get away with writing one instead of more procedures.
In CL you can access a hash like this:
(gethash 'key *hash* 't)
; ==> T; NIL
If you don't use the second value returned you don't know if T was the default value or the actual value found. Here you see the second value indicating the key was not found in the hash. Often you don't use that value if you know there are only numbers the default value would already be an indication that the key was not found. In Racket:
(hash-ref hash 'key #t)
; ==> #t
In racket failure-result can be a thunk so you get by, but I bet it would return multiple values instead if values did work like in CL. I assume there is more housekeeping with the CL version and Scheme, being a minimalistic language, perhaps didn't want to give the implementors the extra work.
Edit: Missed Alexis' comment on the same topic before posting this
One oft-overlooked practical advantage of using multiple return values over lists is that Racket's compose "just works" with functions that return multiple values:
(define (hello-goodbye name)
(values (format "Hello ~a! " name)
(format "Goodbye ~a." name)))
(define short-conversation (compose string-append hello-goodbye))
> (short-conversation "John")
"Hello John! Goodbye John."
The function produced by compose will pass the two values returned by hello-goodbye as two arguments to string-append. If you're writing code in a functional style with lots of compositions, this is very handy, and it's much more natural than explicitly passing values around yourself with call-with-values and the like.
It's also related to your programming style. If you use values, then it usually means you want to explicitly return n values. Using cons, list or vector usually means you want to return one value which contains something.
There are always pros/cons. For values: It may use less memory on some implemenentations. The caller need to use let-values or other multiple values specific syntax. (I wish I could use just let like CL.)
For cons or other types: You can use let or lambda to receive the returning value. You need to explicitly deconstruct it to get the value you want using car or other procedures.
Which to use and when? Again depending on your programming style and case by case but if the returning value can't be represented in one object (e.g. quotient and remainder), then it might be better to use values to make the procedure's meaning clearer. If the returning value is one object (e.g. name and age for a person), then it might be better to use cons or other constructor (e.g. record).

Pythonesque blocks and postfix expressions

In JavaScript,
f = function(x) {
return x + 1;
}
(5)
seems at a glance as though it should assign f the successor function, but actually assigns the value 6, because the lambda expression followed by parentheses is interpreted by the parser as a postfix expression, specifically a function call. Fortunately this is easy to fix:
f = function(x) {
return x + 1;
};
(5)
behaves as expected.
If Python allowed a block in a lambda expression, there would be a similar problem:
f = lambda(x):
return x + 1
(5)
but this time we can't solve it the same way because there are no semicolons. In practice Python avoids the problem by not allowing multiline lambda expressions, but I'm working on a language with indentation-based syntax where I do want multiline lambda and other expressions, so I'm trying to figure out how to avoid having a block parse as the start of a postfix expression. Thus far I'm thinking maybe each level of the recursive descent parser should have a parameter along the lines of 'we have already eaten a block in this statement so don't do postfix'.
Are there any existing languages that encounter this problem, and how do they solve it if so?
Python has semicolons. This is perfectly valid (though ugly and not recommended) Python code: f = lambda(x): x + 1; (5).
There are many other problems with multi-line lambdas in otherwise standard Python syntax though. It is completely incompatible with how Python handles indentation (whitespace in general, actually) inside expressions - it doesn't, and that's the complete opposite of what you want. You should read the numerous python-ideas thread about multi-line lambdas. It's somewhere between very hard to impossible.
If you want arbitrarily complex compound statements inside lambdas you can't use the existing rules for multi-line expressions even if you made all statements expressions. You'd have to change the indentation handling (see the language reference for how it works right now) so that expressions can also contain blocks. This is hard to do without breaking perfectly fine Python code, and will certainly result in a language many Python programmers will consider worse in several regards: Harder to understand, more complex to implement, permits some stupid errors, etc.
Most languages don't solve this exact problem at all. Most candidates (Scala, Ruby, Lisps, and variants of these three) have explicit end-of-block tokens. I know of two languages that have the same problem, one of which (Haskell) has been mentioned by another answer. Coffeescript also uses indentation without end-of-block tokens. It parses the transliteration of your example correctly. However, I could not find any specification of how or why it does this (and I won't dig through the parser source code). Both differ significantly from Python in syntax as well as design philosophy, so their solution is of little (if any) use for Python.
In Haskell, there is an implicit semicolon whenever you start a line with the same indentation as a previous one, assuming the parser is in a layout-sensitive mode.
More specifically, after a token is encountered that signals the start of a (layout-sensitive) block, the indentation level of the first token of the first block item is remembered. Each line that is indented more continues the current block item; each line that is indented the same starts a new block item, and the first line that is indented less implies the closure of the block.
How your last example would be treated depends on whether the f = is a block item in some block or not. If it is, then there will be an implicit semicolon between the lambda expression and the (5), since the latter is indented the same as the former. If it is not, then the (5) will be treated as continuing whatever block item the f = is a part of, making it an argument to the lamda function.
The details are a bit messier than this; look at the Haskell 2010 report.

Turn string into number in Racket

I used read to get a line from a file. The documentation said read returns any, so is it turning the line to a string? I have problems turning the string "1" to the number 1, or "500.8232" into 500.8232. I am also wondering if Racket can directly read numbers in from a file.
Check out their documentation search, it's complete and accurate. Conversion functions usually have the form of foo->bar (which you can assume takes a foo and returns a bar constructed from it).
You sound like you're looking for a function that takes a string and returns a number, and as it happens, string->number does exist, and does pretty much exactly what you're looking for.
Looks like this was answered in another question:
Convert String to Code in Scheme
NB: that converts any s-expression, not just integers. If you want just integers, try:
string->number
Which is mentioned in
Scheme language: merge two numbers
HTH

Two DrRacket/scheme questions

Programming language: DrRacket/scheme
Hey guys,
I am preparing for my first comp sci midterm, and have two quick questions that I'd love to get some input on:
(1) What exactly is the difference between a data definition and a
structure definition?
I know that for a data definition, I can have something like:
;; a student is a
;; - (make-student ln id height gradyear), where
;; - ln is last name, and
;; - id is ID number, and
;; - height is height in inches, and
;; -gradyear is graduation year
but what is a structure definition?
(2) What exactly are the alphas and betas in contracts that come before functions, i.e.
take : num α-list -> α-list
Thank you in advance!
Quote from How to Design Programs (HtDP):
A DATA DEFINITION states, in a mixture of English and Scheme, how we
intend to use a class of structures and how we construct elements of
this class of data.
Given a problem to solve you as a programmer must decide how your input data is represented as values. In order for others to understand your program it is important to document how this is done in detail.
Some input data are simple and can be represented by a single number (e.g. temperature, pressure, etc).
Other types of data can be represented as fixed number of numbers/strings. (e.g. a cd can be represented as an author name (string), a title (string) and a price (number)). To pack a fixed number of values as one value one can represent this a structure.
If one needs to represent an unknown number of something, say cds, then one must use a list.
The data definition is simply your description of how the data are represented in your program.
To explain what a structure definition is, I'll quote from HtDP:
A STRUCTURE DEFINITION is, as the term says, a new form of definition.
Here >is DrScheme's definition of posn:
(define-struct posn (x y))
Let us look at the cd example again. Since there are no builtin "cd values" in Racket, one must define what a cd value is. This is done with a structure definition:
(define-struct cd (author title price))
After the definition is made one can use make-cd to construct cd values.
In order to explain that autor and title are expected to be strings, and price is expected to be a number, you must write down a data definition that explains how make-cd is supposed to be used.
I forgot to answer your second question:
(2) What exactly are the alphas and betas in contracts that come
before functions, i.e.
take : num α-list -> α-list
The alpha is supposed to be replaced with a type.
If take get a integer-list (list of integers as input) then the output is an integer-list.
If take get a string-list (list of strings as input) then the output is an string-list.
In short if take gets a list of values with some type (alpha) as input, then the output is a list of values with the same type (alsp alpha).
Jens Axel Soegaard's answer is correct, but does not elaborate on the relationship between the two, which I would state as follows.
A DATA DEFINITION describes to the reader how a value is going to be represented using Racket values.
Sometimes, the "built-in" values aren't enough, and we need to define a new kind of data, like the "CD" that Jens refers to. In order to define a new kind of data, we often use a STRUCTURE DEFINITION.
Put differently: some data definitions require structure definitions. Some do not.
If I were to elaborate any more, I would just be badly recapitulating HtDP; if what I've said thus far does not make sense, go read HtDP. :)

Erlang pattern matching bitstrings

I'm writing code to decode messages from a binary protocol. Each message type is assigned a 1 byte type identifier and each message carries this type id. Messages all start with a common header consisting of 5 fields. My API is simple:
decoder:decode(Bin :: binary()) -> my_message_type() | {error, binary()}`
My first instinct is to lean heavily on pattern matching by writing one decode function for each message type and to decode that message type completely in the fun argument
decode(<<Hdr1:8, ?MESSAGE_TYPE_ID_X:8, Hdr3:8, Hdr4:8, Hdr5:32,
TypeXField1:32, TypeXFld2:32, TypeXFld3:32>>) ->
#message_x{hdr1=Hdr1, hdr3=Hdr3 ... fld4=TypeXFld3};
decode(<<Hdr1:8, ?MESSAGE_TYPE_ID_Y:8, Hdr3:8, Hdr4:8, Hdr5:32,
TypeYField1:32, TypeYFld2:16, TypeYFld3:4, TypeYFld4:32
TypeYFld5:64>>) ->
#message_y{hdr1=Hdr1, hdr3=Hdr3 ... fld5=TypeYFld5}.
Note that while the first 5 fields of the messages are structurally identical, the fields after that vary for each message type.
I have roughly 20 message types and thus 20 functions similar to the above. Am I decoding the full message multiple times with this structure? Is it idiomatic? Would I be better off just decoding the message type field in the function header and then decode the full message in the body of the message?
Just to agree that your style is very idiomatic Erlang. Don't split the decoding into separate parts unless you feel it makes your code clearer. Sometimes it can be more logical to do that type of grouping.
The compiler is smart and compiles pattern matching in such a way that it will not decode the message more than once. It will first decode the first two fields (bytes) and then use the value of the second field, the message type, to determine how it is going to handle the rest of the message. This works irrespective of how long the common part of the binary is.
So their is no need to try and "help" the compiler by splitting the decoding into separate parts, it will not make it more efficient. Again, only do it if it makes your code clearer.
Your current approach is idiomatic Erlang, so keep going this direction. Don't worry about performance, Erlang compiler does good work here. If your messages are really exactly same format you can write macro for it but it should generate same code under hood. Anyway using macro usually leads to worse maintainability. Just for curiosity why you are generating different record types when all have exactly same fields? Alternative approach is just translate message type from constant to Erlang atom and store it in one record type.

Resources