Can we just treat a symbol literal as an "interned string"? - ruby

Does anyone know whether other popular languages have a concept similar to Ruby's symbol literal? Can I consider it just an "interned string"?

Yes, symbols (sometimes referred to as atoms in other languages) can be considered interned strings.
There's a ton of information on Ruby symbols here: Question - Understanding Symbols In Ruby
And, as an afterthought, this question lists many examples of similar concepts in a couple of languages:
Lisp and Erlang Atoms, Ruby and Scheme Symbols. How useful are they?
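To make the "interned string" reading concrete, here is a minimal Ruby sketch. It only uses core methods (String#to_sym, Object#equal?); the strings are built with String.new so that the result does not depend on whether string literals happen to be frozen in a given setup.

```ruby
# Two separately built strings with the same characters are distinct objects.
a = String.new("status")
b = String.new("status")
puts a == b                      # true:  same content
puts a.equal?(b)                 # false: different objects

# A symbol with a given name is always one and the same object.
puts :status.equal?(:status)     # true
puts a.to_sym.equal?(b.to_sym)   # true: to_sym (alias: intern) looks the name up
puts :status.to_s == a           # true: the name is still available as a string
```

The comparison with equal? is an identity check, which is the property people usually mean when they call symbols interned.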

Does anyone know whether other popular languages have a concept similar to Ruby's symbol literal?
Sure, symbols in Ruby come from symbols in Smalltalk, which in turn got them from Lisp. Scala also has symbols, and Erlang's atoms are similar. Erlang probably got them from Prolog.
Can I consider it just an "interned string"?
You can consider it all sorts of things, but symbols are symbols. They aren't immutable strings or interned strings or whatever ... they are just symbols.

Related

What is the difference between an "interned" and an "uninterned" symbol

What is the difference between an "interned" and an "uninterned" symbol. Is it only Racket that has uninterned symbols or do other dialects of scheme or lisp have them?
Interned symbols are eq? if and only if they have the same name. Uninterned symbols are not eq? to any other symbol, so they are a kind of unique token with an attached string. Interned symbols are the kind that are produced by the default reader. Uninterned symbols can be used as identifiers when generating code in a macro; such an identifier cannot be shadowed by any other identifier. Most Lisp dialects have this concept. In Scheme it is rarer, since hygienic macros are supposed to reduce its usefulness.
Common Lisp has uninterned symbols. As Juho's answer says, an uninterned symbol is guaranteed not to be equal to any other value.
Common Lisp-style macros often require uninterned symbols in order to be written correctly (particularly macros whose expansion introduces and binds new variables), because any interned symbol you use in a macro expansion might capture or shadow a binding at its expansion site.
Scheme's hygienic macro systems, on the other hand, do not have this problem, so a Scheme system does not need to provide uninterned symbols. Still, many of them do. Why? Several reasons:
Some Scheme systems offer a Common Lisp-style defmacro capability.
In others, the hygienic macro system's implementation may use uninterned symbols internally, so the concept of an uninterned symbol may be exposed as well.
Uninterned symbols can be useful in many programs that use s-expressions to represent a language, and transform this language into another s-expr language. These sorts of tasks often benefit from an ability to generate an identifier guaranteed to be new.
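The "unique token with an attached string" idea can be sketched outside Lisp too. The following is a hypothetical Ruby illustration, not how any Scheme or Common Lisp implements it: Ruby's own Symbol values are always interned, so the uninterned kind is modelled by a small class (UninternedSymbol and fresh_identifier are made-up names).

```ruby
# Hypothetical model of an uninterned symbol: a plain object carrying a name.
class UninternedSymbol
  attr_reader :name

  def initialize(name)
    @name = name.to_s.dup.freeze
  end

  def inspect
    "#<uninterned #{name}>"
  end
  # ==, eql? and hash are inherited from Object, so they compare by identity.
end

# Returns a fresh token, in the spirit of Lisp's gensym.
def fresh_identifier(name)
  UninternedSymbol.new(name)
end

user_x = :x                          # interned: identical to every other :x
tmp_a  = fresh_identifier("x")
tmp_b  = fresh_identifier("x")

bindings = { user_x => 1, tmp_a => 2, tmp_b => 3 }
puts bindings.size                   # 3: three separate keys, all named "x"
puts tmp_a == tmp_b                  # false
puts tmp_a.name == tmp_b.name        # true
```

Because a fresh token is never equal to anything but itself, a generated binding keyed by it cannot collide with a user's :x, which is the capture problem described above.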
The other two excellent answers nevertheless fail to mention the virtue of interned values generally, which is that they can be compared in constant time. This typically means that these values are represented as pointers to a table without duplicates. In Racket, as of a few months ago[*], other values--floating point values and strings used as literals, for instance--will also be interned. In addition to allowing faster comparisons, I believe that this enables better compile-time optimizations, because these values can be compared for equality without running the code.
Are there other systems that do things like this? I bet there are.
[*] I'm sure someone will correct me if I'm wrong :).
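The "table without duplicates" can be sketched in a few lines. This is a simplified Ruby model under the assumption that only strings are interned; InternTable is a made-up name and says nothing about how Racket does it internally.

```ruby
# A minimal intern table: one canonical frozen object per distinct content.
class InternTable
  def initialize
    @table = {}
  end

  # Returns the canonical frozen copy for str, creating it on first use.
  def intern(str)
    @table.fetch(str) do
      canonical = str.dup.freeze
      @table[canonical] = canonical
    end
  end

  def size
    @table.size
  end
end

table = InternTable.new
a = table.intern(String.new("hello"))
b = table.intern(String.new("hello"))
c = table.intern(String.new("world"))

puts a.equal?(b)   # true:  same canonical object
puts a.equal?(c)   # false
puts table.size    # 2
```

The cost of hashing the content is paid once, when a value is interned. After that, equality of two interned values is an identity check (equal?), independent of their length.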

Why doesn't Haskell have symbols (a la ruby) / atoms (a la erlang)?

The two languages where I have used symbols are Ruby and Erlang and I've always found them to be extremely useful.
Haskell does have algebraic datatypes, but I still think symbols would be mighty convenient. An immediate use that springs to mind is that since symbols are isomorphic to integers you can use them where you would use an integral or a string "primary key".
The syntactic sugar for atoms can be minor - :something or <something> is an atom. All atoms are instances of a Type called Atom which derives Show and Eq. You can then use it for more descriptive error codes, for example
type ErrorCode = Atom
type Message = String
data Error = Error ErrorCode Message
loginError = Error :redirect "Please login first"
In this case :redirect is more efficient than using a string ("redirect") and easier to understand than an integer (404).
The benefit may seem minor, but I say it is worth adding atoms as a language feature (or at least a GHC extension).
So why have symbols not been added to the language? Or am I thinking about this the wrong way?
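For comparison, this is roughly what the error-code idea looks like in Ruby, where symbols do exist. It is only a sketch mirroring the hypothetical Haskell above; AppError, LOGIN_ERROR and redirect? are illustrative names.

```ruby
# An error value whose code is a symbol rather than a string or an integer.
AppError = Struct.new(:code, :message)

LOGIN_ERROR = AppError.new(:redirect, "Please login first")

# Checks the code by comparing symbols.
def redirect?(error)
  error.code == :redirect
end

puts redirect?(LOGIN_ERROR)                              # true
puts redirect?(AppError.new(:not_found, "No such page")) # false
```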
I agree with camccann's answer that it's probably missing mainly because it would have to be baked quite deeply into the implementation and it is of too little use for this level of complication. In Erlang (and Prolog and Lisp) symbols (or atoms) usually serve as special markers and serve mostly the same notion as a constructor. In Lisp, the dynamic environment includes the compiler, so it's partly also a (useful) compiler concept leaking into the runtime.
The problem is the following: symbol interning is impure (it modifies the symbol table). Because we never modify an existing object, it is still referentially transparent, but if implemented naïvely it can lead to space leaks in the runtime. In fact, as currently implemented in Erlang, you can actually crash the VM by interning too many symbols/atoms (the current limit is 2^20, I think), because they can never get garbage collected. It's also difficult to implement in a concurrent setting without a huge lock around the symbol table.
Both problems can be (and have been) solved, however. For example, see Erlang EEP 20. I use this technique in the simple-atom package. It uses unsafePerformIO under the hood, but only in (hopefully) rare cases. It could still use some help from the GC to perform an optimisation similar to indirection shortening. It also uses quite a few IORefs internally which isn't too great for performance and memory usage.
In summary, it can be done but implementing it properly is non-trivial. Compiler writers always weigh the power of a feature against its implementation and maintenance efforts, and it seems like first-class symbols lose out on this one.
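The space-leak concern is usually handled in application code by never creating symbols from unbounded input. Here is a hedged Ruby sketch of that pattern (KNOWN_ACTIONS and action_for are made-up names). Ruby has been able to garbage-collect dynamically created symbols since 2.2, so there this is more about validation than survival; Erlang offers list_to_existing_atom for a similar purpose.

```ruby
# The only symbols this code ever uses are the ones written in the source.
KNOWN_ACTIONS = {
  "start"  => :start,
  "stop"   => :stop,
  "status" => :status
}.freeze

# Returns the symbol for a known action name, or nil for anything else.
# No new symbol is created from the input string.
def action_for(input)
  KNOWN_ACTIONS[input.to_s.strip.downcase]
end

puts action_for(" Start ").inspect    # :start
puts action_for("rm -rf /").inspect   # nil
```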
I think the simplest answer is that, of the things Lisp-style symbols (which is where both Ruby and Erlang got the idea, I believe) are used for, in Haskell most are either:
Already done in some other fashion--e.g. a data type with a bunch of nullary constructors, which also behave as "convenient names for integers".
Awkward to fit in--things that exist at the level of language syntax instead of being regular data usually have more type information associated with them, but symbols would have to either be distinct types from each other (nearly useless without some sort of lightweight ad-hoc sum type) or all the same type (in which case they're barely different from just using strings).
Also, keep in mind that Haskell itself is actually a very, very small language. Very little is "baked in", and of the things that are, most are just syntactic sugar for other primitives. This is a bit less true if you include a bunch of GHC extensions, but GHC with -XAndTheKitchenSinkToo is not the same language as Haskell proper.
Also, Haskell is very amenable to pseudo-syntax and metaprogramming, so there's a lot you can do even without having it built in. Particularly if you get into TH and scary type metaprogramming and whatever else.
So what it mostly comes down to is that most of the practical utility of symbols is already available from other features, and the stuff that isn't available would be more difficult to add than it's worth.
Atoms aren't provided by the language, but can be implemented reasonably as a library:
http://hackage.haskell.org/package/simple-atom
There are a few other libs on hackage, but this one looks the most recent and well-maintained.
Haskell uses type constructors* instead of symbols so that the set of symbols a function can take is closed, and can be reasoned about by the type system. You could add symbols to the language, but it would put you in the same place that using strings would - you'd have to check all possible symbols against the few with known meanings at runtime, add error handling all over the place, etc. It'd be a big workaround for all the compile-time checking.
The main difference between strings and symbols is interning - symbols are atomic and can be compared in constant time. Both are types with an essentially infinite number of distinct values, though, which goes against the grain of Haskell's practice of specifying arguments and results with finite types.
* I'm more familiar with OCaml than Haskell, so "type constructor" may not be the right term. Things like None or Just 3.
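The runtime checking this answer warns about is easy to see in a language that does have open-ended symbols. A small Ruby sketch with a hypothetical describe function: nothing stops a caller from passing a misspelled symbol, so the unknown case has to be handled when the code runs.

```ruby
# The set of symbols a caller may pass is open, so unknown values must be
# rejected at runtime.
def describe(file_type)
  case file_type
  when :gzipped then "compressed with gzip"
  when :bzipped then "compressed with bzip2"
  when :plain   then "uncompressed"
  else
    raise ArgumentError, "unknown file type: #{file_type.inspect}"
  end
end

puts describe(:gzipped)    # compressed with gzip

begin
  describe(:gziped)        # typo: only detected when this line executes
rescue ArgumentError => e
  puts e.message           # unknown file type: :gziped
end
```

With a closed data type, the misspelled name would be rejected by the compiler instead.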
An immediate use that springs to mind is that since symbols are isomorphic to integers you can use them where you would use an integral or a string "primary key".
Use Enum instead.
data FileType = GZipped | BZipped | Plain
  deriving Enum

descr :: FileType -> String
descr ft = ["compressed with gzip",
            "compressed with bzip2",
            "uncompressed"] !! fromEnum ft

Are there good alternative Scheme syntaxes?

I imagine Scheme (and perhaps Lisp) could be made more `user friendly' by using a different syntax. For example, instead of nested S-expressions with ugly parentheses, one could devise some kind of syntax closer to some of the more widely used languages (e.g. Java-like without needing to define classes).
It's not necessarily a bad thing if it's more verbose. For example, the syntax may require line separators and commas in the places where many people will expect them, and expect explicit return statements. Also, it doesn't seem that difficult to allow some operators to be used infix style (just obey the generally accepted operator precedence rules).
And if it doesn't make things too messy, the syntax could even be backwards-compatible, so that in any place where an expression is expected, a normal S-expression between parentheses can be used.
What are your opinions and ideas about this? And does anything like this exist? (I expect it does, but "Scheme" is a worthless google term, I can't find anything!)
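As a rough illustration of the infix idea, here is a sketch in Ruby rather than Scheme, with nested arrays standing in for S-expressions. It uses precedence climbing over a flat token list. The names PRECEDENCE, infix_to_prefix and parse_infix are made up, only four left-associative binary operators are handled, and there is no support for parentheses or error reporting.

```ruby
PRECEDENCE = { :+ => 1, :- => 1, :* => 2, :/ => 2 }.freeze

# Converts a flat infix token list such as [1, :+, 2, :*, 3] into a nested
# prefix form such as [:+, 1, [:*, 2, 3]].
def infix_to_prefix(tokens)
  parse_infix(tokens.dup, 1)
end

# Consumes tokens from the front while operators bind at least as tightly
# as min_prec.
def parse_infix(tokens, min_prec)
  lhs = tokens.shift
  while (op = tokens.first) && PRECEDENCE.key?(op) && PRECEDENCE[op] >= min_prec
    tokens.shift
    rhs = parse_infix(tokens, PRECEDENCE[op] + 1)
    lhs = [op, lhs, rhs]
  end
  lhs
end

p infix_to_prefix([1, :+, 2, :*, 3])   # [:+, 1, [:*, 2, 3]]
p infix_to_prefix([8, :-, 3, :-, 2])   # [:-, [:-, 8, 3], 2]
```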
Originally, Lisp was planned to use a syntax called M-Expressions, with S-Expressions being only a transitional solution for easier compiler building. When M-Expressions were ready to be introduced, the programmers who had already taken on Lisp just stayed with what they had become accustomed to, and M-Expressions never caught on.
There is an infix notation in Guile, but it's rarely used. A good Lisp programmer doesn't even see the parens anymore, and prefix notation does have its merits...
I think "sweet expressions" might be one of the more thoughtful approaches to getting rid of the parentheses in Lisp. It apparently even supports macros.
http://www.dwheeler.com/readable/sweet-expressions.html
However, I think most people eventually get over the parentheses or use another language.
Take a look at "sweet-expressions", which provides a set of additional abbreviations for traditional s-expressions. They add syntactically-relevant indentation, a way to do infix, and traditional function calls like f(x). Unlike nearly all past efforts to make Lisps readable, sweet-expressions are backwards-compatible (you can freely mix well-formatted s-expressions and sweet-expressions), generic, and homoiconic.
Sweet-expressions were developed on http://readable.sourceforge.net and there is a sample implementation.
For Scheme there is a SRFI for sweet-expressions: http://srfi.schemers.org/srfi-110/
Try SRFI 49 for size. :-P
(Seriously, though, as Rafe commented, "I don't think anybody wants this".)
Some people consider Python to be a kind of Scheme with infix notation for operators, algebraic notation for functions and which uses a more "java-like" syntax for representing the language. I don't agree with that assessment, but I can see where the idea comes from.
The big problem with changing the notation for Scheme is that macros become very hard to write (to see how hard, take a look at the Nimrod language or Boo). Instead of working directly with the code as lists, you have to parse the input language first. This usually involves constructing an AST (abstract syntax tree) for the language from the input. When working directly with Scheme, this is unnecessary.
However, you might check out the SIX expression syntax in Gambit Scheme. There's a nice set of slides here which contains a discussion of this:
http://www.iro.umontreal.ca/~gambit/Gambit-inside-out.pdf
But don't tell anyone about it! (The inside joke is that someone suggests writing a Lisp without parentheses and with infix notation about once a day, and someone announces an implementation about once a month.)
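The point about macros working directly on code as lists can be shown with a small sketch. It is written in Ruby with nested arrays and symbols playing the role of S-expressions. expand_unless is a hypothetical, non-hygienic rewrite rule, not a real macro system.

```ruby
# Rewrites every [:unless, cond, body] form into [:if, [:not, cond], body].
# The "program" is ordinary data, so no parser or separate AST is needed.
def expand_unless(expr)
  return expr unless expr.is_a?(Array)

  expanded = expr.map { |sub| expand_unless(sub) }
  if expanded.first == :unless && expanded.size == 3
    _, condition, body = expanded
    [:if, [:not, condition], body]
  else
    expanded
  end
end

p expand_unless([:unless, [:ready?], [:wait]])
# [:if, [:not, [:ready?]], [:wait]]
```

With an infix or Java-like surface syntax, the same transformation would first need a parser that turns the text into such a tree.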
There are some languages that do exactly that. For instance: Dylan.

Question about ruby symbols [duplicate]

Possible Duplicates:
Why don't more projects use Ruby Symbols instead of Strings?
What's the difference between a string and a symbol in Ruby?
I am new to the Ruby language.
From what I have read, I must use symbols instead of strings wherever possible.
Is that correct?
You don't HAVE TO do anything. However, it is advisable to use symbols instead of strings in cases where you will be using the same string literal over and over again. The reason is that there is only ever one copy of a given symbol held in memory, while a string literal will be created anew whenever you assign it, potentially wasting memory.
Strings are mutable whereas symbols are not. Symbols are better performance-wise.
Read this article to better understand symbols, strings and their differences.
It's sort of up to you which one you use -- you can use a string anywhere you'd use a symbol, but not the other way around. Symbols do have a number of advantages, in a few cases.
Symbols give you a performance advantage because two symbols of the same name actually map to the same object in memory, whereas two strings with the same characters create different objects. Symbols are immutable and lightweight, which makes them ideal for elements that you won't be changing around at runtime; keys in a hash table, for example.
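A short sketch of the mutability difference and of symbols as hash keys, using only core Ruby methods. The variable names are illustrative.

```ruby
text = String.new("draft")
text << "!"                     # strings can be modified in place
puts text                       # draft!

puts :draft.frozen?             # true
puts :draft.respond_to?(:<<)    # false: symbols have no in-place append

# Symbols as hash keys
settings = { color: :blue, size: :large }
puts settings[:color]           # blue
puts settings.key?("color")     # false: "color" and :color are different keys
```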
Here's a nice excerpt from the Ruby Newbie guide to symbols:
The granddaddy of all advantages is also the granddaddy of advantages: symbols can't be changed at runtime. If you need something that absolutely, positively must remain constant, and yet you don't want to use an identifier beginning with a capital letter (a constant), then symbols are what you need.
The big advantage of symbols is that they are immutable (can't be changed at runtime), and sometimes that's exactly what you want.
Sometimes that's exactly what you don't want. Most usage of strings requires manipulation -- something you can't do (at least directly) with symbols.
Another disadvantage of symbols is they don't have the String class's rich set of instance methods. The String class's instance method make life easier. Much easier.
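When a symbol does need manipulating, the usual route is through its string form and back. A minimal sketch with hypothetical helper names (humanize, pluralize_key):

```ruby
# Turns a symbol such as :first_name into display text.
def humanize(sym)
  sym.to_s.tr("_", " ").capitalize
end

# Builds a new symbol from the string form of an existing one.
def pluralize_key(sym)
  "#{sym}s".to_sym
end

puts humanize(:first_name)                # First name
puts pluralize_key(:user).inspect         # :users
puts :first_name.respond_to?(:gsub)       # false: gsub is a String method
```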
Symbols have two main advantages:
They are like interned strings, so they can improve performance.
They are syntactically distinct, which can make your code more readable, particularly for programs that use hashtables as complex built-in data structures.

Why does Ruby expose symbols?

Why does Ruby expose symbols for explicit use? Isn't that the sort of optimisation that's usually handled by the interpreter/compiler?
Part of the issue is that Ruby strings are mutable. Since every string Ruby allocates must be independent (it can't cache short/common ones), it's convenient to have a Symbol type to let the programmer have what are essentially immutable, memory-efficient strings.
Also, they share many characteristics with enums, but with less pain for the programmer.
Ruby symbols are used in lieu of string constants in other similar languages. Besides the performance benefit, they can be used to semantically distinguish between string data and a more abstract symbol. Being syntactically different, they can clearly be distinguished in code.
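One place where the distinction between string data and an abstract name shows up is Ruby's own reflection API, which names methods and attributes with symbols. A small sketch (the Account class is made up; attr_accessor, public_send and the &:name shorthand are core Ruby):

```ruby
class Account
  attr_accessor :owner      # :owner names an attribute, it is not user data

  def initialize(owner)
    @owner = owner          # "alice" below is data, held in a string
  end
end

account = Account.new("alice")
puts account.public_send(:owner)        # alice
puts account.respond_to?(:owner=)       # true
puts %w[a b].map(&:upcase).inspect      # ["A", "B"]
```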
Have a look at this post on Ruby symbols.

Resources