Difference between Control tables and Jump tables? - data-structures

What's the difference between the two? To my understanding they both control program flow, and the former is more loosely defined than the latter, but I can't see what distinguishes the two other than that.

A jump table is a type of control table; specifically, it's an assembler-level construct that is interpreted directly by the CPU.
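As a rough illustration (a minimal C++ sketch rather than real assembler, with made-up handler names): a jump table is essentially an array of code addresses indexed by a value, which execution can dispatch through directly, whereas a control table in general may carry any data that interpreting code acts on.

    #include <cstdio>

    // Made-up opcode handlers, purely for illustration.
    void op_load()  { std::puts("load");  }
    void op_store() { std::puts("store"); }
    void op_add()   { std::puts("add");   }

    // The jump table: an array of code addresses indexed by opcode.
    // At the assembler level this boils down to an indexed indirect jump.
    void (*const jump_table[])() = { op_load, op_store, op_add };

    int main() {
        int opcode = 2;        // pretend this was decoded from the input
        jump_table[opcode]();  // dispatch directly through the table: prints "add"
    }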
Thanks to Bergi and Jim Mischel.

Related

thrust: sorting within a threadblock

I am dispatching a kernel with around 5k blocks. At some point, we need to sort an array within each threadblock. If possible, we would like to use a library like thrust.
From the documentation I understand that how sort is executed in thrust depends on the specified execution_policy. However, I don't understand whether I can use execution policies to specify that I would like to use the threads of my current block for sorting. Can someone explain, or point me towards good documentation of execution policies, and tell me if what I intend to do is feasible?
It turns out that execution policies are basically a bridge design pattern that uses template specialization instead of inheritance to select the appropriate implementation of an algorithm, while exposing a stable interface to the user of the library and avoiding the overhead/necessity of virtual functions. Thank you robert-crovella for the great video.
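To make the idea concrete, here is a toy sketch (invented names, not Thrust's actual source) of how overloading on a policy tag type selects an implementation at compile time behind a single stable interface, with no virtual functions involved:

    #include <algorithm>
    #include <iostream>
    #include <iterator>

    // Invented policy tags standing in for execution policies.
    struct host_policy   {};
    struct device_policy {};

    // Same user-facing interface; the overload chosen by the tag type picks
    // the implementation at compile time.
    template <typename Iter>
    void my_sort(host_policy, Iter first, Iter last) {
        std::cout << "host path\n";
        std::sort(first, last);            // a CPU sort
    }

    template <typename Iter>
    void my_sort(device_policy, Iter first, Iter last) {
        std::cout << "device path\n";
        std::sort(first, last);            // a real library would launch GPU work here
    }

    int main() {
        int data[] = {3, 1, 4, 1, 5};
        my_sort(host_policy{},   std::begin(data), std::end(data));
        my_sort(device_policy{}, std::begin(data), std::end(data));
    }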
As for the actual implementation of sorting within a threadblock in thrust, talonmies is right. There simply is no implementation (currently?); I could not find anything in the source code.

To separate functions or not to separate functions? That is the question

At this point in time I am approaching the 1800 line mark in my code.
It contains a series of different forms and one big function which checks, validates, and determines the next step in the process. I have a total of 12 functions, and I'd like to know the programming philosophies and thoughts on whether (or when) to separate the functions into their own files and when to leave them all on the same page.
Any thoughts on both your style of programming and any links to established programming standards of a particular group or philosophy of programming?
Thanks
According to the book Code Complete, a function should contain one logical unit; if it contains more than one, break it into two functions. Another hint is a function name that is too cumbersome or long; that, too, suggests a function that can be refactored.
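For example (an illustrative C++ sketch with made-up names), a routine that both validates input and decides the next step is really two logical units and splits naturally in two:

    #include <string>

    // Hypothetical form data, for illustration only.
    struct Form { std::string email; bool complete; };

    // One logical unit: is the input valid?
    bool validate_form(const Form& f) {
        return !f.email.empty() && f.email.find('@') != std::string::npos;
    }

    // Another logical unit: what happens next?
    std::string next_step(const Form& f) {
        if (!validate_form(f)) return "show_errors";
        return f.complete ? "confirm" : "next_page";
    }

    int main() {
        Form f{"user@example.com", false};
        return next_step(f) == "next_page" ? 0 : 1;
    }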
Incidentally, Code Complete should be on the reading list of any serious software developer.

Should I write a Direct3D Shader Model language compiler using flex/yacc?

I am going to create a compiler for Direct3D's Shader Model language. The compiler's target platform and development environment are on Windows/VC++.
For those who are not familiar with the Shader Model Language, here are examples of the instructions the language consists of (some of the instructions are a bit outdated, but the syntax is basically the same as in the version I will be using).
Here
And here
I am considering flex/yacc as the framework for developing the compiler. Would these be suitable for the job? Is there any better framework for developing in native C++?
In my opinion, a normal lexer and/or parser generator usually won't help much in writing an assembler. They're mostly helpful in dealing with relatively complex grammars, but in the case of an assembler, the "grammar" is usually so trivial that such a generator is more hindrance than help.
A typical assembler is mostly table driven -- you start by creating a table of defined op-codes and the characteristics of the instruction each will generate (e.g. the number and types of registers that must be specified for it). You typically have a (smaller, in the case of shaders probably much smaller) table defining how to encode addressing modes and such.
Most of the assembler works by consulting that table -- i.e. it reads something from input, and attempts to look it up in the table. If it's not present, it gives an error message saying it's an unknown opcode. If it's found, it gets information from the table about the number of operands associated with that op-code. It attempts to read that many operands. If it can't, it gives an error saying something's wrong with the instruction. If it can, it encodes the instruction, and starts over.
There are a few places it has to handle a bit more than that, of course. Where/when you define something like a label, it has to record the name and position of that label in a symbol table. When it encounters something like a branch to that address, it has to look up the target and encode its address appropriately.
Only when/if you decide to support macros do you depart much from that basic model. Depending on how elaborate you get with them, it might be worthwhile to use a parser generator and such for a macro expansion facility. Then again, given that shaders are mostly pretty small, macros aren't likely to be a very high priority for such an assembler.
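A minimal C++ sketch of that table-driven shape (the opcodes, encodings, and operand counts here are invented; a real assembler would actually encode the operands, handle labels, and so on):

    #include <cstddef>
    #include <cstdint>
    #include <iostream>
    #include <map>
    #include <sstream>
    #include <string>
    #include <vector>

    // Invented opcode table: encoding byte and required operand count.
    struct OpInfo { std::uint8_t code; std::size_t operands; };

    const std::map<std::string, OpInfo> op_table = {
        {"mov", {0x01, 2}}, {"add", {0x02, 3}}, {"sub", {0x03, 3}},
    };

    bool assemble_line(const std::string& line, std::vector<std::uint8_t>& out) {
        std::istringstream in(line);
        std::string mnemonic;
        if (!(in >> mnemonic)) return true;               // blank line, nothing to do

        auto it = op_table.find(mnemonic);                // consult the table
        if (it == op_table.end()) {
            std::cerr << "unknown opcode: " << mnemonic << '\n';
            return false;
        }

        std::vector<std::string> operands;
        for (std::string tok; in >> tok; ) operands.push_back(tok);
        if (operands.size() != it->second.operands) {
            std::cerr << "wrong operand count for " << mnemonic << '\n';
            return false;
        }

        out.push_back(it->second.code);                   // emit the opcode byte
        for (std::size_t i = 0; i < operands.size(); ++i)
            out.push_back(0);                             // a real assembler encodes each operand here
        return true;
    }

    int main() {
        std::vector<std::uint8_t> image;
        assemble_line("add r0 r1 r2", image);
        assemble_line("frobnicate r0", image);            // reports "unknown opcode"
    }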
Edit: rereading it, I should probably clarify/correct one point. The use for a parser generator isn't so much when the grammar itself becomes complex, as when the grammar allows for statements that are complex. Consider a really trivial grammar:
expression := expression '+' value
            | expression '-' value
            | value
Even though this allows only addition and subtraction, it still defines statements that are arbitrarily complex (or at least arbitrarily long strings of values being added or subtracted). Of course, for even a fairly trivial real language, we'll normally have multiplication, division, function calls, etc.
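A hand-rolled evaluator for that toy grammar is still easy (a sketch, reading values and operators left to right), but every addition - precedence, parentheses, function calls - makes the by-hand version harder, which is when a generated parser starts to earn its keep:

    #include <sstream>
    #include <string>

    // Toy evaluator for the grammar above: value (('+'|'-') value)*.
    int eval(const std::string& text) {
        std::istringstream in(text);
        int result = 0, value = 0;
        char op = '+';
        while (in >> value) {
            result = (op == '+') ? result + value : result - value;
            if (!(in >> op)) break;
        }
        return result;
    }

    int main() { return eval("1 + 2 - 3 + 10") == 10 ? 0 : 1; }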
This is considerably different from a typical assembly language, where each instruction has a fixed format. For example, an addition or subtraction operation has exactly two source operands and one destination operand.

What does this software quote mean?

I was reading Code Complete (2nd Edition), and came across a quote in the margin on page 87 by Bertrand Meyer.
Ask not first what the system does; ask WHAT it does it to!
What exactly is the point Mr. Meyer is trying to get across here? I have some rough ideas, but I would like to make sure I really understand.
... So this is the second fallacy of teleology - to attribute goal-directed behavior to things that are not goal-directed, perhaps without even thinking of the things as alive and spirit-inhabited, but only thinking, X happens in order to Y. "In order to" is mentalistic language, even though it doesn't seem to name a blatantly mental property like "fearful" or "thinks it can fly".
— Eliezer Yudkowsky, artificial intelligence theorist concerned with self-improving AIs with stable goal systems
Bertrand Meyer's homily suggests that sound reasoning about systems is grounded in knowing what concrete entities are altered by the system; the purpose of the alterations is an emergent property.
I believe the point here is not what the system does, but what data it operates on and what those operations are.
This provides two major thinking shifts:
You think of the data and concepts first
You think of operations on that data
With those two "baselines" you will be better prepared to organize a system to achieve your goals, so that operations on data are well understood and make sense.
In effect, he is laying the groundwork for being able to write the "contracts" on the code you write.
A Google search turned up this passage from Art Gittleman's Computing With C# and the .Net Framework:
Bertrand Meyer gives an example of a payroll program, which produces paychecks from timecards. Management may later want to extend this program to produce statistics or tax information. The payroll function itself may need to be changed to produce weekly checks instead of biweekly checks, for example. The procedures used to implement the original payroll program would need to be changed to make any of these modifications. Meyer notes that any of these payroll programs will manipulate the same sort of data: employee records, company regulations, and so forth.
Focusing on the more stable aspect of such systems, Meyer states a principle: "Ask not first what the system does: Ask WHAT it does it to!"; and a definition: "Object-oriented design is the method which leads to software architectures based on the objects every system or subsystem manipulates (rather than "the" function it is meant to ensure)."
Today we take UML class diagrams and other OOAD approaches for granted, but they were something that was "discovered" along the way.
Also see Object-Oriented Design.
My opinion is that the quote is meant as a method for finding good abstractions in your software. The text next to this quote deals with finding real-world objects around which to design your classes.
A simple example would be something like this:
You are making software for a bank. Because your software works with bank accounts, it should have a class for an account. Then you start thinking about what properties accounts have and what interactions you can have with accounts.
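A toy sketch of that in C++ (the names and rules are purely illustrative): start from the thing the system acts on, then ask what can be done to it.

    #include <stdexcept>
    #include <string>
    #include <utility>

    class Account {
    public:
        explicit Account(std::string owner) : owner_(std::move(owner)) {}

        void deposit(double amount) { balance_ += amount; }

        void withdraw(double amount) {
            if (amount > balance_) throw std::runtime_error("insufficient funds");
            balance_ -= amount;
        }

        double balance() const { return balance_; }

    private:
        std::string owner_;
        double balance_ = 0.0;
    };

    int main() {
        Account a("Alice");
        a.deposit(100.0);
        a.withdraw(40.0);
        return a.balance() == 60.0 ? 0 : 1;
    }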
Of course, this quote makes more sense if the objects you are trying to model aren't as clear as this case.
Fred Brooks stated it this way:
"Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowcharts; they'll be obvious."
Domain-Driven Design... Understand the problem the software is designed to solve. What "domain" entities (data abstractions) does the system manipulate? And what does it do to those domain entities?

Is it stupid to write a large batch processing program entirely in PL/SQL?

I'm starting work on a program which is perhaps most naturally described as a batch of calculations on database tables, and will be executed once a month. All input is in Oracle database tables, and all output will be to Oracle database tables. The program should stay maintainable for many years to come.
It seems straightforward to implement this as a series of stored procedures, each performing a sensible transformation, for example distributing costs among departments according to some business rules. I can then write unit tests to check whether the output of each transformation is as I expected.
Is it a bad idea to do this all in PL/SQL? Would you rather do heavy batch calculations in a typical object-oriented programming language, such as C#? Isn't it more expressive to use a database-centric programming language such as PL/SQL?
You describe the following requirements:
a) Must be able to implement Batch Processing
b) Result must be maintainable
My Response:
PL/SQL was designed to achieve just what you describe. It's also important to note that there are efficiencies in PL/SQL that are not available in other tools. A stored procedure language puts the processing next to the data, which is where batch processing ought to sit.
It is easy enough to write poorly maintainable code in any language.
Having said the above, your implementation will depend on the available skills, a proper design and adherence to good quality processes.
To be efficient, your implementation must process data in batches (select in batches and insert/update in batches). The danger with an OO approach is that it is easy to be led towards a design that processes data row by row. This type of approach carries unnecessary overhead and will be significantly less efficient than a design that processes data in batches of rows.
It is possible to use both approaches successfully.
Mathew Butler
Something for other commenters to note - the question is about PL/SQL, not about SQL. Some of the answers have obviously been about SQL, not PL/SQL. PL/SQL is a fully functional database language, and it's mature as well. There are some shortcomings, but for the type of thing the poster wants to do, it's very good.
No, it isn't necessarily a bad idea. If the solution seems straightforward to you and allows you to test and verify each process, it sounds like it could be a good idea. OO platforms can be (though they don't have to be) bad for large data sets, as object creation and overhead can kill performance.
Oracle designed PL/SQL with problems like yours in mind. If there is sufficient corporate knowledge of the database and PL/SQL, this seems like a reasonable solution. Keep large batch sets in mind, as each call from PL/SQL to the actual SQL engine is a context switch, so single-record operations should be batched together where possible to improve performance.
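To illustrate the row-by-row versus batched shape with a toy C++ sketch (a made-up in-memory stand-in, not a real Oracle or PL/SQL API; only the call counts matter here):

    #include <iostream>
    #include <vector>

    // Made-up row type and database stand-in.
    struct Row { int id; double cost; };

    struct FakeDb {
        void update_one(const Row&)                { ++calls; }  // one round trip per row
        void update_batch(const std::vector<Row>&) { ++calls; }  // one round trip per batch
        int calls = 0;
    };

    int main() {
        std::vector<Row> rows(10000, Row{0, 100.0});
        FakeDb row_by_row, batched;

        // Row-by-row: pays the per-call overhead (context switch / round trip) every time.
        for (Row& r : rows) { r.cost *= 1.1; row_by_row.update_one(r); }

        // Batched: do the work on the whole set, then hand it over at once.
        for (Row& r : rows) { r.cost *= 1.1; }
        batched.update_batch(rows);

        std::cout << row_by_row.calls << " calls vs " << batched.calls << " call\n";
    }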
Just make sure you somehow log what is happening while it's working. Otherwise you'll have a black box and if it gets stuck somewhere for hours, you'll be wondering whether to stop it or let it work 'a little bit more'.
PL/SQL is a mature language that integrates well with SQL. With each version of Oracle it becomes more and more powerful.
Also, since Oracle 11, PL/SQL can optionally be compiled to native machine code.
Normally I say to put as little in PL/SQL as possible - it is typically a lot less maintainable. At one of my last jobs I really saw how messy and hard to work with it could get.
However, since it is batch processing, and since the input and output are both in the DB, it makes good sense to put the logic into PL/SQL to minimize "moving parts". However, if it were business logic, or components used by other pieces of your system, I would say don't do it.
I wrote a huge number of batch processing and report generation programs in both PL/SQL and Pro*C for one project. They generally preferred that I write in PL/SQL, as their own developers, who would maintain it in the future, found that easier to understand than Pro*C code.
It ended up being only the really funky processing or reports that were written in Pro*C.
It is not necessary to write these as stored procedures, as other people have alluded to; they can just be script files that are run as necessary, kind of like a shell script. That makes source code revision control and migration between test and production systems a heck of a lot easier, too.
As long as the calculations you need to perform can be adequately AND readably captured in PL/SQL, then using only PL/SQL would make the most sense.
The real catch is maintainability -- it's very easy to write unmaintainable SQL, if only because every RDBMS has a different syntax and a different function set once you step outside of simple SQL DML, and there are no real standards for formatting, commenting, etc.
I've created batch programs using C# and SQL.
Pros of C#:
You've got the full library of .NET and all the power of an OO language.
Cons of C#:
Batch program and db separate - this means, you'll have to manage your batch program separate from the database.
You need to escape all that dang SQL code.
Pros of SQL:
Integrates nicely with the DBMS. If this job only manipulates the database, it would make sense to include it with the database. You end up with a single db and all of its components in one package.
No need to escape SQL code.
Keeping it real - you are programming in your problem domain.
Cons of SQL:
It's SQL, and I personally just don't know it as well as C#.
In general, I would stick with using SQL because of the Pros outlined above.
This is a loaded question :)
There's a couple of database programming architecture designs you should know of, and what their costs/benefits are.
2 Tier generally means you have a client connecting to a DB, issuing direct SQL calls.
3 Tier generally means you have an "application server" that is issuing direct SQL calls to the DB, but the client is talking to the app server. Generally, this affords "scaling out".
Finally, you have 2 1/2 tier apps that employ a 2 Tier-like format, only the work is compartmentalized within stored procedures.
Your process sounds like a "back office" kind of thing, and clients/processes just need results that are being aggregated and cached on a once a month basis.
That is, there is no agent that connects, and connects often, and says "do these calculations". Instead you allude to a process that happens once in a while, and you can get away with non-real time.
Therefore, given those requirements, I'd say that generally it will be faster to be closer to the data and let the database server do all the calculations.
I think you'll find that proximity to the data will serve you well.
However, in performing these calculations, you may find that some calculations are not amenable to SQL. Take for example calculating the accrued interest of a bond, or any fixed income instrument. Not very pretty in SQL, and much more suited to a richer programming language. However, if you just have simple averages and other relatively sane aggregates, I'd stick to stored procedures, on the SQL side.
So again, there's not enough information as to the nature of your calculations, or what your house mandates in terms of SQL capabilities of devs for support, or what your boss says...but since I know my way around SQL, and like to stay close to the data, I'd stay pure SQL/Stored Procedures for a task like this.
YMMV :)
It's not usually more expressive because most stored procedure languages suck by design. But it will probably run faster than in an external app.
I guess it boils down to how familiar you are with PL/SQL, how much time you have to write this, how important performance is, and whether you can reasonably expect maintainers to be familiar enough with PL/SQL to maintain a big program written in it.
If speed is not relevant and maintainers will probably not be PL/SQL proficient, you might be better off using a 'traditional' language.
You could also use a hybrid approach, where you use PL/SQL to generate intermediate data (say, table joins and sums or whatever) and a separate application to control flow and check values and errors.
