Reusable Priority queue implementation in google go - algorithm

How would one go about writing the code for a reusable priority queue in Google Go or is one expected to define the Less Push and Pop function everytime a priority queue implementation is needed?

The later case is what one has to do. As far as Go doesn't have generics, it's the only available option at the moment.

Haven't tried it, but maybe you could use reflection and struct tags, if your cases happened to fit certain restrictons. You would require that your heapable type be a struct with a tag like `pq:"Key"` on the field you use for ordering, and that that field type be < comparable. It's far less powerful than a Less method but it might meet your needs.
Sorry I have no example code for you. I don't think it wouldn be terribly hard, but it would take me some time. Left for an exercise.
I might try this technique if I had a situation where I needed to handle arbitrary structs and I could live with the simplistic key restriction. For a finite set of types though, I wouldn't do this. I would just do it by the book, implementing heap.Interface separately for each type. It's really not that much work nor that many lines of code.

There used to be vector types in the container/vector module of the standard library that implemented these methods, based on a resizable slice, and perfectly worked as the container to be used with the heap module. Unfortunately, they got rid of vector, which I never understood as it implemented a nice method-level abstraction over a slice variable. So now, every time I see someone using the heap module in Go, they have to basically re-implement part of vector -- writing Push, Pop, Length, etc. based on a slice variable.

I'm not sure I understand the question.
I've implemented a queue (and stack and ring) using interface{} as the stored type here:
https://github.com/iNamik/go_container
Using interface{} allows you to store any type you want - although it doesn't help you enforce types ala generics, it gets the job done.
I could see creating a priority queue without much hassle.
Am I missing something?
I'd be happy to add a priority queue container if you think you'd find it useful.

Related

Why do a Beam io need beam.AddFixedKey+beam.GroupByKey to work properly?

I'm working on a Beam IO for Elasticsearch in Golang and at the moment I have a working draft version but, only managed to make it work by doing something that's not clear to me why do I need it.
Basically I looked at existing IO's and found that writes only work if I add the following:
x := beam.AddFixedKey(s, pColl)
y := beam.GroupByKey(s, x)
A full example is in the existing BigQuery IO
Basically I would like to understand why do I need both AddFixedKey followed by a GroupByKey to make it work. Also checked the issue BEAM-3860, but doesn't have much more details about it.
Those two transforms essentially function as a way to group all elements in a PCollection into one list. For example, its usage in the BigQuery example you posted allows grouping the entire input PCollection into a list that gets iterated over in the ProcessElement method.
Whether to use this approach depends how you are implementing the IO. The BigQuery example you posted performs its writes as a batch once all elements are available, but that may not be the best approach for your use case. You might prefer to write elements one at a time as they come in, especially if you can parallelize writes among different workers. In that case you would want to avoid grouping the input PCollection together.

Is it bad to have many global functions?

I'm relatively new to software development, and I'm on my way to completing my first app for the iPhone.
While learning Swift, I learned that I could add functions outside the class definition, and have it accessible across all views. After a while, I found myself making many global functions for setting app preferences (registering defaults, UIAppearance, etc).
Is this bad practice? The only alternate way I could think of was creating a custom class to encapsulate them, but then the class itself wouldn't serve any purpose and I'd have to think of ways to passing it around views.
Global functions: good (IMHO anyway, though some disagree)
Global state: bad (fairly universally agreed upon)
By which I mean, it’s probably a good practice to break up your code to create lots of small utility functions, to make them general, and to re-use them. So long as they are “pure functions”
For example, suppose you find yourself checking if all the entries in an array have a certain property. You might write a for loop over the array checking them. You might even re-use the standard reduce to do it. Or you could write a re-useable function, all, that takes a closure that checks an element, and runs it against every element in the array. It’s nice and clear when you’re reading code that goes let allAboveGround = all(sprites) { $0.position.y > 0 } rather than a for…in loop that does the same thing. You can also write a separate unit test specifically for your all function, and be confident it works correctly, rather than a much more involved test for a function that includes embedded in it a version of all amongst other business logic.
Breaking up your code into smaller functions can also help avoid needing to use var so much. For example, in the above example you would probably need a var to track the result of your looping but the result of the all function can be assigned using let. Favoring immutable variables declared with let can help make your program easier to reason about and debug.
What you shouldn’t do, as #drewag points out in his answer, is write functions that change global variables (or access singletons which amount to the same thing). Any global function you write should operate only on their inputs and produce the exact same results every time regardless of when they are called. Global functions that mutate global state (i.e. make changes to global variables (or change values of variables passed to them as arguments by reference) can be incredibly confusing to debug due to unexpected side-effects they might cause.
There is one downside to writing pure global functions,* which is that you end up “polluting the namespace” – that is, you have all these functions lying around that might have specific relevance to a particular part of your program, but accessible everywhere. To be honest, for a medium-sized application, with well-written generic functions named sensibly, this is probably not an issue. If a function is purely of use to a specific struct or class, maybe make it a static method. If your project really is getting too big, you could perhaps factor out your most general functions into a separate framework, though this is quite a big overhead/learning exercise (and Swift frameworks aren’t entirely fully-baked yet), so if you are just starting out so I’d suggest leaving this for now until you get more confident.
* edit: ok two downsides – member functions are more discoverable (via autocomplete when you hit .)
Updated after discussion with #AirspeedVelocity
Global functions can be ok and they really aren't much different than having type methods or even instance methods on a custom type that is not actually intended to contain state.
The entire thing comes down mostly to personal preference. Here are some pros and cons.
Cons:
They sometimes can cause unintended side effects. That is they can change some global state that you or the caller forgets about causing hard to track down bugs. As long as you are careful about not using global variables and ensure that your function always returns the same result with the same input regardless of the state of the rest of the system, you can mostly ignore this con.
They make code that uses them difficult to test which is important once you start unit testing (which is a definite good policy in most circumstances). It is hard to test because you can't mock out the implementation of a global function easily. For example, to change the value of a global setting. Instead your test will start to depend on your other class that sets this global setting. Being able to inject a setting into your class instead of having to fake out a global function is generally preferable.
They sometimes hint at poor code organization. All of your code should be separable into small, single purpose, logical units. This ensures your code will remain understandable as your code base grows in size and age. The exception to this is truly universal functions that have very high level and reusable concepts. For example, a function that lets you test all of the elements in a sequence. You can also still separate global functions into logical units by separating them into well named files.
Pros:
High level global functions can be very easy to test. However, you cannot ignore the need to still test their logic where they are used because your unit test should not be written with knowledge of how your code is actually implemented.
Easily accessible. It can often be a pain to inject many types into another class (pass objects into an initializer and probably store it as a property). Global functions can often remove this boiler plate code (even if it has the trade off of being less flexible and less testable).
In the end, every code architecture decision is a balance of trade offs each time you go to use it.
I have a Framework.swift that contains a set of common global functions like local(str:String) to get rid of the 2nd parameter from NSLocalize. Also there are a number of alert functions internally using local and with varying number of parameters which makes use of NSAlert as modal dialogs more easy.
So for that purpose global functions are good. They are bad habit when it comes to information hiding where you would expose internal class knowledge to some global functionality.

Explain/Give example of "Hide pointer operations" in Code Complete 2

I am reading Code Complete 2, Chapter 7.1 and I don't understand the point author said below.
7.1 Valid Reasons to Create a Routine
Hide pointer operations
Pointer operations tend to be hard to read and error prone. By isolating them in routines (or a class, if appropriate), you can concentrate on the intent of the operation rather than the mechanics of pointer manipulation. Also, if the operations are done in only one place, you can be more certain that the code is correct. If you find a better data type than pointers, you can change the program without traumatizing the routines that would have used the pointers.
Please explain or give example of this purpose.
Essentially, the advice is a specific example of the data-hiding. It boils down to this -
Stick to Object-oriented design and hide your data within objects.
In case of pointers, the norm is to NEVER expose pointers of "internal" data-structures as public members. Rather make them private and expose ONLY certain meaningful manipulations that are allowed to be performed on the pointers as public member functions.
Portable / Easy to maintain
The added advantage (as explained in the section quoted) is that any change in the internal data structures never forces the external API to be changed. Only the internal implementation of the publicly exposed member functions needs to be modified to handle any changes.
Code re-use / Easy to debug
Also pointer manipulations are now NOT copy/pasted and littered all around the code with no idea what exactly they do. They are now limited to the member functions which are written keeping in mind how exactly the internal data structures are being manipulated.
For example if we have a table of data which the user is allowed to add rows into,
Do NOT expose
pointers to the head/tail of table.
pointers to the individual elements.
Instead create a table object that exposes the functions
addNewRowTop(newData)
addNewRowBottom(newData)
addNewRow(position, newData)
To take this further, we implement addNewRowTop() and addNewRowBottom() by simply calling addNewRow() with the proper position - another internal variable of the table object.

Vectored Referencing buffer implementation

I was reading code from one of the projects from github. I came across something called a Vectored Referencing buffer implementation. Can have someone come across this ? What are the practical applications of this. I did a quick google search and wasn't able to find any simple sample implementation for this.
Some insight would be helpful.
http://www.ibm.com/developerworks/library/j-zerocopy/
http://www.linuxjournal.com/article/6345
http://www.seccuris.com/documents/whitepapers/20070517-devsummit-zerocopybpf.pdf
https://github.com/joyent/node/pull/304
I think some more insight on your specific project/usage/etc would allow for a more specific answer.
However, the term is generally used to either change or start an interface/function/routine with the goal that it does not allocate another instance of its input in order to perform its operations.
EDIT: Ok, after reading the new title, I think you are simply talking about pushing buffers into a vector of buffers. This keeps your code clean, you can pass any buffer you need with minimal overhead to any function call, and allows for a better cleanup time if your code isn't managed.
EDIT 2: Do you mean this http://cpansearch.perl.org/src/TYPESTER/Data-MessagePack-Stream-0.07/msgpack-0.5.7/src/msgpack/vrefbuffer.h

Is 'handle' synonymous to pointer in WinAPI?

I've been reading some books on windows programming in C++ lately, and I have had some confusing understanding of some of the recurring concepts in WinAPI. For example, there are tons of data types that start with the handle keyword'H', are these supposed to be used like pointers? But then there are other data types that start with the pointer keyword 'P'. So I guess not. Then what is it exactly? And why were pointers to some data types given separate data types in the first place? For example, PCHAR could have easily designed to be CHAR*?
Handles used to be pointers in early versions of Windows but are not anymore. Think of them as a "cookie", a unique value that allows Windows to find back a resource that was allocated earlier. Like CreateFile() returns a new handle, you later use it in SetFilePointer() and ReadFile() to read data from that same file. And CloseHandle() to clean up the internal data structure, closing the file as well. Which is the general pattern, one api function to create the resource, one or more to use it and one to destroy it.
Yes, the types that start with P are pointer types. And yes, they are superfluous, it works just as well if you use the * yourself. Not actually sure why C programmers like to declare them, I personally think it reduces code readability and I always avoid them. But do note the compound types, like LPCWSTR, a "long pointer to a constant wide string". The L doesn't mean anything anymore, that dates back to the 16-bit version of Windows. But pointer, const and wide are important. I do use that typedef, not doing so will risk future portability problems. Which is the core reason these typedefs exist.
A handle is the same as a pointer only so far as both ID a particular item. Obviously a pointer is the address of the item so if you know it's structure you can start getting fields in the item. A handle may or may not be a pointer - basically if it is a pointer you don't know what it is pointing to so you can't get into the fields.
Best way to think of a handle is that it is a unique ID for something in the system. When you pass it to something in the system the system will know what to cast it to (if it is a pointer) or how to treat it (if it is just some id or index).

Resources