Using Sets in Ruby


I'm building a simple Ruby on Rails plugin and I'm considering using the Set class. I don't see the Set class used all too often in other people's code.
Is there a reason why people choose to use (subclasses of) Array rather than a set? Would using a set introduce dependency problems for some people?

Set is part of the standard library, so it shouldn't pose any dependency problems. If it's the cleanest way to solve the problem, go for it.
Regarding its use (or lack thereof), I think there are probably two main reasons:
programmers not being aware of the library
programmers not realising when sets are the best way to reach a solution
programmers not knowing/remembering anything about sets at all.
Make that three main reasons.

Ruby arrays are very flexible anyway; there are plenty of methods that let you treat one like a set, a stack, or a queue, for example.
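
As a rough illustration of that point, here is a minimal sketch using only core Array methods. The variable names are made up for the example.

```ruby
# Array used as a set: remove duplicates and combine with set-style operators.
a = [1, 2, 2, 3]
b = [3, 4]
unique       = a.uniq   # duplicates removed
union        = a | b    # elements in either array, without duplicates
intersection = a & b    # elements present in both arrays
difference   = a - b    # elements of a that are not in b

# Array used as a stack (last in, first out).
stack = []
stack.push(:first)
stack.push(:second)
top = stack.pop         # takes the most recently pushed element

# Array used as a queue (first in, first out).
queue = []
queue.push(:first)
queue.push(:second)
front = queue.shift     # takes the oldest element
```

Note that membership checks with Array#include? scan the whole array, which is one practical reason to reach for Set when the collection is large.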

The way I've understood it from what I've read here is that Sets are like mathematical sets, i.e. order doesn't matter and isn't preserved.
In most programming applications, unless you're doing maths, they have limited use, because normally you want to preserve order.
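
For readers who have not used the class, a small sketch of what Set gives you: duplicates are dropped, and two sets compare equal regardless of the order in which elements were added. The tag names are invented for the example.

```ruby
require "set" # part of the standard library, no extra gem needed

tags = Set.new(["ruby", "rails", "ruby"])
tag_count = tags.size               # the duplicate "ruby" is stored once
tags.add("plugin")
has_rails = tags.include?("rails")  # membership test

# Equality ignores order for sets, but not for arrays.
sets_equal   = Set[1, 2, 3] == Set[3, 2, 1]
arrays_equal = [1, 2, 3] == [3, 2, 1]
```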

Related

How to use data structures in interviews

This question is about how to best approach a coding interview from a data structures point of view.
The way I see it, there are two different ways, I could implement a specific DS from scratch, initialise it and then use it to solve my problem, or simply use a library (I'm talking about Node.js here, but I guess this applies to other languages as well, at least those with some in-built support for DS) without worrying about the implementation and only focusing on how to use them to solve a problem.
In the first case, I'm also demonstrating that I can implement a specific DS from scratch, but at the same time I would need more time and there's some additional complexity. Instead, using a library would leave me more time to solve the actual problem, but some companies might take a dim view on this approach.
I know there's no silver bullet, and different companies will have different views, but what approach would you take if you could only pick one, and why?

It is usually best to use the library, but it is even better to also know how common library functions work, at least the basic ones.
For example, many interviews ask you to implement binary search instead of just calling the library function. Knowing the implementation teaches a concept you can reuse in general problem solving, for instance in other divide-and-conquer algorithms.
In production-level code we always look for fail-safe, properly tested library code.
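
To make the comparison concrete, here is a sketch in Ruby (chosen for consistency with the rest of this page, not Node.js) of a hand-written binary search next to the built-in Array#bsearch_index. The helper name binary_search is an assumption for the example.

```ruby
# Hand-written binary search over a sorted array.
# Returns the index of target, or nil if it is absent.
def binary_search(sorted, target)
  low = 0
  high = sorted.length - 1
  while low <= high
    mid = (low + high) / 2
    case sorted[mid] <=> target
    when 0  then return mid      # found it
    when -1 then low = mid + 1   # middle element is too small, search right half
    else         high = mid - 1  # middle element is too large, search left half
    end
  end
  nil
end

numbers = [1, 3, 5, 7, 9, 11]
by_hand  = binary_search(numbers, 7)
built_in = numbers.bsearch_index { |x| 7 <=> x } # library version, find-any mode
```

In an interview you might write the first form to show you understand the halving step; in production code the second is the safer default.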

You should pick available libraries first. If needed, customize the behavior of the libraries that are already available.

First time porting a library from one language to another

I am porting a library from C++ to Java. This is my first time and I am not sure
what "porting" really means. Specifically, what if the author named a variable
as 'A' and I think that a better name would be 'B'. Same for methods, classes and namespaces.
Also, what if I think something can be done better? Does porting mean that I should try
to keep as much of the original code spirit as possible, but still allow myself freedom
to improve stuff?
Thanks

It doesn't necessarily have to be a one-to-one translation (and in many cases, it can't be done). Porting is just rewriting a piece of software in a different language/environment/etc. Sometimes porting will require you to tweak things and implement them in different ways altogether, so I think the last sentence of your post pretty much captures the gist of things.
I view it as comparable to translating a book from English to another language. There will be instances where judgment calls need to be made in terms of how to express the intent/function of the source material.

When porting from System A to System B, the world is your oyster. You can pretty much change anything if you believe it's an improvement. The only caveats to that are when dealing with interfaces. Say you are porting an API, for example: it wouldn't be a good idea to rename externally-available methods, as that would break something down the road. Tracing naming issues across multiple classes is a major pain.
As someone who's done a fair bit of porting from language to language, I would recommend sticking to implementation details first and foremost. A good engineering principle is to change one thing at a time. That way, when things don't run as expected, you'll know that it's your implementation that is to blame, and not some silly naming issue. And when you do come to renaming, I suppose it goes without saying, be very careful and back up often. This is one case where version control may save you hours of time.
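
A tiny sketch of that interface caveat, written in Ruby to match the other examples on this page rather than C++ or Java. The Tokenizer class and both method names are hypothetical: the public name is kept exactly as callers know it, while the private helper is free to get a clearer name.

```ruby
# Hypothetical ported class, for illustration only.
class Tokenizer
  # Public and externally visible: callers of the original library
  # expect this name, so the port keeps it unchanged.
  def tok(text)
    split_on_whitespace(text)
  end

  private

  # Internal helper: safe to rename, since nothing outside the class calls it.
  def split_on_whitespace(text)
    text.split
  end
end

tokens = Tokenizer.new.tok("port  one thing")
```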

When "porting" a library from one platform to another, you are porting functionality. You are not porting style of code. It isn't like in literature, where one must maintain the style of the piece, keeping in mind metaphors and iambic pentameter or what have you.

Is inject the same thing as reduce in ruby?

I saw that they were documented together here. Are they the same thing? Why does Ruby have so many aliases (such as map/collect for arrays)? Thanks a lot.

Yes, and it's also called fold in many other programming languages and in Mathematics. Ruby aliases a lot in order to be intuitive to programmers with different backgrounds. If you want to use #length on an Array, you can. If you want to use #size, that's fine too!

More recent versions of the documentation of Enumerable#reduce specify it explicitly:
The inject and reduce methods are aliases. There is no performance benefit to either.
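
A quick way to convince yourself is to run both names on the same input. This is only a sketch; the same pattern works for other alias pairs such as map/collect and size/length.

```ruby
numbers = [1, 2, 3, 4]

# inject and reduce accept the same arguments and give the same result.
sum_inject = numbers.inject(0) { |acc, n| acc + n }
sum_reduce = numbers.reduce(0) { |acc, n| acc + n }

# Both also accept a symbol naming the operation.
product_inject = numbers.inject(:*)
product_reduce = numbers.reduce(:*)

# Other alias pairs behave the same way.
doubled_map     = numbers.map { |n| n * 2 }
doubled_collect = numbers.collect { |n| n * 2 }
```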

Are they the same thing?
Yes, aliases run the exact same code in the end.
Why does Ruby have so many aliases (such as map/collect for arrays)?
It boils down to the language's approach; different languages have different approaches.
Ruby does it in favor of developer productivity. Basically, aliases let programmers coming from different programming-language and human-language backgrounds write code more intuitively.
However, they can also help your code's clarity, because some things may have different semantic possibilities: the method midnight() can also be expressed as start_of_day or end_of_day. Those can be clearer depending on the context.
By the way, some programmers use inject and reduce to differentiate between different semantic situations too.

Writing wrappers for libraries

I'm trying to write a wrapper for the third-party graphics library I'm using. I'd like to make it general enough that I could switch libraries easily if I decide to port it over to another platform or OS.
The problem is I can't really find a good enough design. Besides the library I'm using, I'm also following the design of two other libraries to ensure a general enough design. But there always seems to be something one lib can do the others can't.
Do you have any tips as to how I should make my code more portable (easy switching of libraries)? Maybe you can suggest a design for a graphics wrapper that's worked for you in the past.

Do you have any tips as to how I should make my code more portable (easy switching of libraries)?
This is rarely of significant value.
If you think you must ensure portability you have three choices.
Least Common Feature Set. Take all the libraries you think you might want to use. Write down all the classes, methods and attributes and find the smallest common subset. You have to do some thinking to match up all the various names to be sure you've got the semantics as close as possible.
This will give you a minimal graphics implementation that must run everywhere.
However, it will be feature-poor and your application (or wrapper) will have to do a lot of programming to fill in the missing features in a uniform way.
Union of All Features. Take all the libraries you think you might want to use. Write down all the classes, methods and attributes and simply add each new thing to the ever-growing list of features. You have to do some thinking to match up all the various names to be sure you've got the semantics as close as possible to avoid adding duplicates.
This will present problems because a feature that's in one library must be implemented in all the other libraries. It's a lot of programming.
You're not the first person to have this thought and realize that it's really, really hard to do.
So what's the fallback?
Choice 3.
Pick your favorite library. Get something to work.
Based on customer demand, identify the most-demanded alternate library. Create the necessary wrapper so that your application can work with this library.
Iterate this last step until you're out of customers or the customer demand is so low that it's cheaper to fire them as a customer than it is to support them.
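
Choice 3 can be sketched roughly as follows. Everything here is hypothetical: LibraryA and LibraryB are stand-ins for real graphics libraries, and the single line method stands for whatever small interface your application actually needs. The idea is to write the application against one adapter first and add a second adapter only when demand appears.

```ruby
# Hypothetical backends standing in for real third-party graphics libraries.
class LibraryA
  def draw_line(x1, y1, x2, y2)
    "A:line(#{x1},#{y1})-(#{x2},#{y2})"
  end
end

class LibraryB
  def line(from, to)
    "B:line#{from.inspect}->#{to.inspect}"
  end
end

# Adapter for the library you started with.
class LibraryAAdapter
  def initialize(backend = LibraryA.new)
    @backend = backend
  end

  def line(from, to)
    @backend.draw_line(from[0], from[1], to[0], to[1])
  end
end

# Adapter added later, once customers ask for the alternate library.
class LibraryBAdapter
  def initialize(backend = LibraryB.new)
    @backend = backend
  end

  def line(from, to)
    @backend.line(from, to)
  end
end

# Application code depends only on the adapter interface.
def draw_diagonal(graphics)
  graphics.line([0, 0], [10, 10])
end

with_a = draw_diagonal(LibraryAAdapter.new)
with_b = draw_diagonal(LibraryBAdapter.new)
```

The adapter interface grows only as the application needs it, which avoids both the feature-poor common subset and the expensive union of all features.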

Should data be descriptive of itself? In what cases should it preferably be or not be?

I am not sure how to put this easily into a simple question.
I am just going to use an example.
Say I am sending some parameter to a web browser from a server. The JavaScript will know what to do with it. Say it was a setting for some page element that could have 4 different values. I could make it be 0-3, or I could make it be "bright", "dark", "transparent", "none". Do you see what I mean? In one case the data is descriptive.
Now step outside of the realm of web development. In fact, step away from any facet of programming that would NOT require one method or the other, and think of some that would prefer one over the other. Meaning it would be beneficial to the overall goals if it was done in a descriptive manner, or beneficial if it was done in a cryptic manner.
Can you think of some examples where you would want one over the other?
PS: I may need help with the tags on this one guys.

The benefit of the numeric variant is smaller data size. That can be useful if you are communicating a lot of data or communicating over a restricted-bandwidth channel. Also, comparing numbers is typically faster than comparing strings.
The alternative with meaningful names is beneficial when you need easy extensibility and maintainability. You can see what the value means without using any other translation table. Also you can enable others to add new values by defining some naming rules.
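
The "translation table" mentioned above might look like this in Ruby. This is a sketch under assumptions: the value names come from the question, while MODES, encode, and decode are names invented for the example.

```ruby
# Translation table between descriptive names and compact numeric codes.
MODES = %w[bright dark transparent none].freeze

# Compact form: send the position in the table instead of the name.
# Raises ArgumentError early if the name is not a known mode.
def encode(mode)
  MODES.index(mode) || raise(ArgumentError, "unknown mode: #{mode.inspect}")
end

# Descriptive form: recover the name from the code.
# Array#fetch raises IndexError for a code beyond the table, such as 4.
def decode(code)
  MODES.fetch(code)
end

code = encode("dark")
name = decode(2)
```

With the numeric form, every reader of the data needs this table; with the descriptive form, the value explains itself at the cost of a few more bytes.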

The benefits of using one strategy over the other are quite similar to the benefits of strong vs. weak typing. Values like "bright", "dark" etc. are strongly typed, while 0, 1, 2 are weakly typed.
The most important benefits of using strongly typed data are 1) that it is easy for other people to know what the value means and how to use it and 2) that you will get a meaningful error early if you use an illegal value.
The benefit of weak typing is that you may introduce new values without having to change intermediate modules. I.e. you could introduce "4" without changing intermediate modules that don't really have to understand what the value means.
I would definitely go for "bright", "dark" etc.
NB! Some would probably argue that "bright" is a string and so is weakly typed in the same way as "1", but this depends on the perspective.
