Image Recognition - image

I'd like to do some work with the nitty-gritties of computer imaging. I'm looking for a way to read single pixels of data, analyze them programatically, and change them. What is the best language to use for this (Python, c++, Java...)? What is the best fileformat?
I don't want any super fancy software/APIs... I'm looking for the bare basics.

If you need speed (you'll probably always want speed with image processing) you definitely have to work with raw pixel data.
Java has some real disadvantages as you cannot access memory directly which makes pixel access quite slow compared to accessing the memory directly.
C++ is definitely the language of choice for production use image processing. But you can, for example, also use C# as it allows for unsafe code in specific areas. (Take a look at the scan0 pointer property of the bitmapdata class.)
I've used C# successfully for image processing applications and they are definitely much faster than their java counterparts.
I would not use any scripting language or java for such a purpose.

It's very east to manipulate the large multi-dimensional or complex arrays of pixel information that are pictures using high-level languages such as Python. There's a library called PIL (the Python Imaging Library) that is quite useful and will let you do general filters and transformations (change the brightness, soften, desaturate, crop, etc) as well as manipulate the raw pixel data.
It is the easiest and simplest image library I've used to date and can be extended to do whatever it is you're interested in (edge detection in very little code, for example).

I studied Artificial Intelligence and Computer Vision, thus I know pretty well the kind of tools that are used in this field.
Basically: you can use whatever you want as long as you know how it works behind the scene.
Now depending on what you want to achieve, you can either use:
C language, but you will lose a lot of time in bugs checking and memory management when implementing your algorithms. So theoretically, this is the fastest language to do that kind of job, but if your algorithms are not computationnally efficient (in terms of complexity) or if you lose too much time in bugs checking, this is clearly not worth it. So I would advise to first implement your application in another language, and then later you can always optimize small parts of your code with C bindings.
Octave/MatLab: very efficient language, almost as much as C, and you can make very elegant and succinct algorithms. If you are into vectorization, matrix and linear operations, you should go with that. However, you won't be able to develop a whole application with this language, it's more focused on algorithms, but then you can always develop an interface using another language later.
Python: all-in-one elegant and accessible language, used in gigantically large scale applications such as Google and Facebook. You can do pretty much everything you want with Python, any kind of application. It will be perfectly adapted if you want to make a full application (with client interaction and all, not only algorithms), or if you want to quickly draft a prototype using existent libraries since Python has a very large set of high quality libraries, like OpenCV. However if you only want to make algorithms, you should better use Octave/MatLab.
The answer that was selected as a solution is very biaised, and you should be careful about this kind of archaic comment.
Nowadays, hardware is cheaper than wetware (humans), and thus, you should use languages where you will be able to produce results faster, even if it's at the cost of a few CPU cycles or memory space.
Also, a lot of people tends to think that as long as you implement your software in C/C++, you are making the Saint Graal of speedness: this is just not true. First, because algorithms complexity matters a lot more than the language you are using (a bad algorithm will never beat a better algorithm, even if implemented in the slowest language in the universe), and secondly because high-level languages are nowadays doing a lot of caching and speed optimization for you, and this can make your program run even faster than in C/C++.
Of course, you can always do everything of the above in C/C++, but how much of your time are you willing to waste to reinvent the wheel?

Not only will C/C++ be faster, but most of the image processing sample code you find out there will be in C as well, so it will be easier to incorporate things you find.

if you are looking to numerical work on your images (think matrix) and you into Python check out http://www.scipy.org/PyLab - this is basically the ability to do matlab in python, buddy of mine swears by it.

(This might not apply for the OP who only wanted the bare basics -- but now that the speed issue was brought up, I do need to write this, just for the record.)
If you really need speed, it's better to forget about working on the pixel-by-pixel level, and rather see whether the operations that you need to perform could be vectorized. For example, for your C/C++ code you could use the excellent Intel IPP library (no, I don't work for Intel).

It depends a little on what you're trying to do.
If runtime speed is your issue then c++ is the best way to go.
If speed of development is an issue, though, I would suggest looking at java. You said that you wanted low level manipulation of pixels, which java will do for you. But the other thing that might be an issue is the handling of the various file formats. Java does have some very nice APIs to deal with the reading and writing of various image formats to file (in particular the java2d library. You choose to ignore the higher levels of the API)
If you do go for the c++ option (or python come to think of it) I would again suggest the use of a library to get you over the startup issues of reading and writing files. I've previously had success with libgd

What language do you know the best? To me, this is the real question.
If you're going to spend months and months learning one particular language, then there's no real advantage in using Python or Java just for their (to be proven) development speed.
I'm particularly proficient in C++ and I think that for this particular task I can be as speedy as a Java programmer, for example. With the aid of some good library (OpenCV comes to mind) you can create anything you need in a matter of a couple of lines of C++ code, really.

Short answer: C++ and OpenCV

Short answer? I'd say C++, you have far more flexibility in manipulating raw chunks of memory than Python or Java.

Related

What language would be more efficient to implement a graph library?

Brief Description
I have a college work where I have to implement a graph library (I'll have to give a presentation about this work later)
The basic idea is to write all the code of the data structures and their algorithms from scratch, using the tools provided by some programming language, like C/C++, Java, Python, doesn't really matter which one of them I'll pick at first.
But I should not use any built-in graph libraries in the language: the goal of the work is to make the students learn how these algorithms work. There are some test cases which my program will be later submitted to.
It is not really necessary but, if you wanna take a look, here is the homework assignment: http://pastebin.com/GdtvMTMR (I used Control-C Control-V plus google translate from a LaTeX text, this is why the formatting is poor).
The Question
So, my question is: which programming language would be more time efficient to implement this library?
It doesn't really matter if the language is functional, structured or object oriented. My priority is time efficiency and performance.
The better language is the one you know more.
But if you're looking for some performance, take a look at compiled languages with optimisations. Keep in mind that the code you write is the major component responsible in final performance, the language itself cant do miracles.
A more low level language give to you controls but requires deeply knowledge of the language and the machine you're running your code, so it's a tradeoff.
By a personal choose I would recommend C/C++ to implement a graph library. I've already done this in the past and I used vanilla ANSI C and the performance was awesome.
The one you feel more passionate about and feel more comfortable coding with.
This way you will rock your project.
Myself would pick Java.

Performance-focused desktop-program: Ruby or Go?

I currently don't know either of the two languages. Design of a piece of software is close to complete.
The intriguing:
Ruby: Enjoyable. Follows thought process. Made for humans.
Go: Good performance. Fast compile times.
I don't know about Ruby's performance. If it's a lot slower than Go, I'll go with the latter (talking about typical speed here).
I'll learn both eventually, but right now, this will determine which one first.
Update: It's a very basic image-editing program. Technical and especially perceived speed should be high. Startup time is especially important.
Sadly, neither language is appropriate for a desktop image editing program.
You haven't told us which desktop you have in mind, I'll assume it's either Windows or Mac.
Ruby is not appropriate because it fails 2 of your requirements:
it has a terrible startup time because at startup it has to initialize a rather complicated VM, which involves loading quite a big part of its standard library
it's very slow (compared to C/Java/Go) doing the kind of computations that image processing entails
Go is statically linked and is compiled to machine code, so its startup time is excellent and the speed is close to C (i.e. it's the fastest language you can hope to choose after C/C++).
However, Go has no support whatsoever for writing Mac desktop apps (i.e. it has no bridge to Objective-C/Cocoa runtime) and the support for writing Windows desktop apps is extremely poor.
If you're doing Windows, the only language that gives you fast startup time is C/C++/Delphi. C# might have acceptable startup time and it's fast enough for the task (very popular paint.net is written in C# and you can find an old version of the code which is BSD-licensed and re-use a lot of its code).
For Mac, I would recommend Objective C - it's the native language of the platform, best documented and with the best, free dev tools (XCode). You can use https://github.com/philippec/Pixen as a starting point.
You really need to give us some idea as to what you consider to be good and bad performance because it's a very subjective subject.
For example, people are usually willing to trade a certain amount of technical or perceived speed for a system that easier to work with or develope. Plus it also matters what you are tying to do. Each language has it's own strengths and weaknesses. Ruby may be faster at some things than Go. Then again, if you really need speed, perhaps you should be looking at a language that is closer to the metal such as C.
Sometimes though, requests for speed from users are subjective too. I once had a system that the users thought was taking too long to do a specific task. There was no way technically to speed it up, so I animated the "Processing ..." window. Because the users could now see something "happening" on the screen, they thought it was going faster. On a stop watch, it actually took a couple of seconds longer.
I think those languages are the worst you can choose for performance-critical application. I don't know much about Go, but Ruby is similar to Python (even slower) and Python is slow as hell. As i've been reading, Go is much faster than Ruby, but still is like two or three times slower compared to other programming languages... It depends on what are you trying to do, of course, ie. I wouldn't choose any of those for real-time physics or something like that.
http://shootout.alioth.debian.org/u32/performance.php?test=nbody
Why is go language so slow?
http://attractivechaos.github.com/plb/
I've been working with python for a couple of years and it's really slow and I'm sure you will hate it and Ruby is very similar to Python and it's slower but as Go is too new I don't really know much about it, I can't tell..

Is there any scripting language that's fast, easy to embed, and well-suited for high-level game-programming?

First off, I'm aware that there are many questions related to this, but none of them seemed to help my specific situation. In particular, lua and python don't fit my needs as well as I could hope. It may be that no language with my requirements exists, but before coming to that conclusion it'd be nice to hear a few more opinions. :)
As you may have guessed, I need such a language for a game engine I'm trying to create. The purpose of this game engine is to provide a user with the basic tools for building a game, while still giving her the freedom of creating many different types of games.
For this reason, the scripting language should be able to handle game concepts intuitively. Among other things, it should be easy to define a variety of types, sub-type them with slightly different properties, query and modify objects dynamically, and so on.
Furthermore, it should be possible for the game developer to handle every situation they come across in the scripting language. While basic components like the renderer and networking would be implemented in C++, game-specific mechanisms such as rotating a few hundred objects around a planet will be handled in the scripting language. This means that the scripting language has to be insanely fast, 1/10 C speed is probably the minimum.
Then there's the problem of debugging. Information about the function, stack trace and variable states that the error occurred in should be accessible.
Last but not least, this is a project done by a single person. Even if I wanted to, I simply don't have the resources to spend weeks on just the glue code. Integrating the language with my project shouldn't be much harder than integrating lua.
Examining the two suggested languages, lua and python, lua is fast(luajit) and easy to integrate, but its standard debugging facilities seem to be lacking. What's even worse, lua by default has no type-system at all. Of course you can implement that on your own, but the syntax will always be weird and unintuitive.
Python, on the other hand, is very comfortable to use and has a basic class system. However, it's not that easy to integrate, it's paradigm doesn't really involve type-checking and it's definitely not fast enough for more complex games. I'd again like to point out that everything would be done in python. I'm well aware that python would likely be fast enough for 90% of the code.
There's also Scala, which I haven't seen suggested so far. Scala seems to actually fulfill most of the requirements, but embedding the Java VM with C doesn't seem very easy, and it generally seems like java expects you to build your application around java rather than the other way around. I'm also not sure if Scala's functional paradigm would be good for intuitive game-development.
EDIT: Please note that this question isn't about finding a solution at any cost. If there isn't any language better than lua, I will simply compromise and use that(I actually already have the thing linked into my program). I just want to make sure I'm not missing something that'd be more suitable before doing so, seeing as lua is far from the perfect solution for me.
You might consider mono. I only know of one success story for this approach, but it is a big one: C++ engine with mono scripting is the approach taken in Unity.
Try the Ring programming language
http://ring-lang.net
It's general-purpose multi-paradigm scripting language that can be embedded in C/C++ projects, extended using C/C++ code and/or used as standalone language. The supported programming paradigms are Imperative, Procedural, Object-Oriented, Functional, Meta programming, Declarative programming using nested structures, and Natural programming.
The language is simple, trying to be natural, encourage organization and comes with transparent implementation. It comes with compact syntax and a group of features that enable the programmer to create natural interfaces and declarative domain-specific languages in a fraction of time. It is very small, fast and comes with smart garbage collector that puts the memory under the programmer control. It supports many programming paradigms, comes with useful and practical libraries. The language is designed for productivity and developing high quality solutions that can scale.
The compiler + The Virtual Machine are 15,000 lines of C code
Embedding Ring Interpreter in C/C++ Programs
https://en.wikibooks.org/wiki/Ring/Lessons/Embedding_Ring_Interpreter_in_C/C%2B%2B_Programs
For embeddability, you might look into Tcl, or if you're into Scheme, check out SIOD or Guile. I would suggest Lua or Python in general, of course, but your question precludes them.
Since noone seems to know a combination better than lua/luajit, I think I will leave it at that. Thanks for everyone's input on this. I personally find lua to be very lacking as a high-level language for game-programming, but it's probably the best choice out there. So to whomever finds this question and has the same requirements(fast, easy to use, easy to embed), you'll either have to use lua/luajit or make your own. :)

most suitable language for computationally and memory expensive algorithms

Let's say you have to implement a tool to efficiently solve an NP-hard problem, with unavoidable possible explosion of memory usage (the output size in some cases exponential to the input size) and you are particularly concerned about the performances of this tool at running time. The source code has also to be readable and understandable once the underlying theory is known, and this requirement is as important as the efficiency of the tool itself.
I personally think that 3 languages could be suitable for these three requirements: c++, scala, java.
They all provide the right abstraction on data types that makes it possible to compare different structures or apply the same algorithms (which is also important) to different data types.
C++ has the advantage of being statically compiled and optimized, and with function inlining (if the data structures and algorithms are designed carefully) and other optimisation techniques it's possible to achieve a performance close to that of pure C while maintaining a fairly good readability.
If you also put a lot of care in data representation you can optimise the cache performance, which can gain orders of magnitude in speed when the cache miss rate is low.
Java is instead JIT compiled, which allows to apply optimisations during runtime, and in this category of algorithms that could have different behaviours between different runs, that may be a plus. I fear instead that such an approach could suffer from garbage collector, however in the case of this algorithm it's common to continuously allocate memory and java heap performance is notoriously better than C/C++ and if you implement your own memory manager inside the language you could even achieve good efficiency.
This approach instead is not able to inline method invocation (which induces a huge performance penalty) and doesn't give you control over the cache performance. Among the pros there's a better and cleaner syntax than C++.
My concerns about scala are more or less the same as Java, plus the fact that I can't control how the language is optimised unless I have a deep knowledge on the compiler and the standard library. But well: I get a very clean syntax :)
What's your take on the subject? Have you had to deal with this already? Would you implement an algorithm with such properties and requirements in any of these languages or would you suggest something else? How would you compare them?
Usually I’d say “C++” in a heartbeat. The secret being that C++ simply produces less (memory) garbage that needs managing.
On the other hand, your observation that
however in the case of this algorithm it's common to continuously allocate memory
is a hint that Java / Scala may actually be more suited. But then you could use a small object heap in C++ as well. Boost has one that uses the standard allocator interface, if memory serves.
Another advantage of C++ is obviously the use of abstraction without penalty through templates – i.e. that you can easily create generic algorithmic components that can interact without incurring a runtime overhead due to abstraction. In fact, you noted that
it's possible to achieve a performance close to that of pure C while maintaining a fairly good readability
– this is looking at things the wrong way: Templates allow C++ to achieve performance superior to that of C while still maintaining high abstraction.
D might be worth a look, seeing as how it tries to be a better C++.
From a superficial glance, it has better source code readability than C++ does, so that's one of your points covered.
It also has memory management, which makes playing with algorithms a bit easier.
And templates
Here is a stackoverflow discussion comparing the performance of C++ and D
The languages you noticed were my first guesses as well.
Each language has a different take on how to handle specific issues like compilation, memory management and source code, but in theory, any of them should be fitting to your problem.
It is impossible to tell which is best, and there is likely no major difference if you are familiar enough with all of them to work around their respective quirks.
And obviously, if you actually find the need to optimize (I'm not sure if that's a given), that's possible in each language. Lower level languages obviously offer more options, but are also (far) more complex to actually improve.
A single note about C++ vs Java: This is really a holy war, and if you've followed the recent development you'll probably have your own opinion. I, for one, think Java offers enough good aspects to make up for its flaws, usually.
And a final note on C++ vs C: According to my knowledge, the difference usually amounts to a sufficiently low percentage to ignore this. It it doesn't make a difference for the source code, it's fine to go with C, if C++ could make for easier-to-read source code, go with C++. In any case, the choice is kind of negligible.
In the end, remember that money spent on a few hours of programming/optimizing this could as well go into slightly superior hardware to make up for missed tiny details.
It all boils down to: Any of your options is fine as long as you do it right (domain knowledge).
I would use a language which makes it very easy to work on the algorithm. Get the algorithm right and it could very easily outweigh any advantage from fine-tuning the wrong algorithm. Don't be scared to play around in a language normally thought of as slow in execution speed if that language makes it easier to express algorithmic ideas. It is usually much easier to transcribe the right algorithm into another language than it is to eek-out the last dregs of speed from the wrong algorithm in the fastest executing language.
So do it in a language you are comfortable with and which is expressive. You might surprise yourself and find that what is produced is fast enough!

Java or C for image processing

I am looking in to learning a programming language (take a course) for use in image analysis and processing. Possibly Bioinformatics too. Which language should I go for? C or Java? Other languages are not an option for me. Also please explain why either of the languages is a better option for my application.
You have to balance raw processing power and developer time. Java is getting pretty fast too and if you are finished a couple of days early, you have more time to process the data.
It all depends on volume.
More importantly, I suggest you look for the libraries and frameworks which already exist, see which fits closest to what needs to be done, and choose whatever language the library was written be it C, Java or Fortran.
For Java I found BioJava.org as a starting point.
Java isn't TOOO bad for image processing. If you manage your source objects appropriately, you ll have a chance at getting reasonable performance out of it. Some of the things I like with Java that relates to imaging:
Java Advanced Imaging
2D Graphics utilities (take a look at BufferedImages)
ImageJ, etc
Get it to work with JAMA
Ask someone in the field you're working in (ie, bioinformatics)
For solar images, the majority of the work is done in IDL, Fortran, Matlab, Python, C or Perl (PDL). (Roughly in that order ... IDL is definitely first, as the majority of the instrument calibration software is written in IDL)
Because of this, there's a lot of toolkits already written in those languages for our field. Frequently, with large reference data sets, the PI releases some software package as an example of how to interpret / interact with the data format. I can only assume that Bioinformatics would be similar.
If you end up going a different route than the rest of the field, you're going to have a much harder time working with other scientists as you can't share code as easily.
Note -- There are a number of the visualization tools that have been released in our field that were written in Java, but they assume that the images have already been prepped by some other process.
The most popular computer vision (image processing, image analysis) library is OpenCV which is written in C++, but can also be used with Python, and Java (official OpenCV4Android and non-official JavaCV).
There are Bioinformatic applications that are basically image processing, so OpenCV will take care of that. But there are also some which are not, they are, for example, based on Machine Learning, so if you need something other than image/video related you will need another Bioinformatic oriented library. Opencv also has a machine learning module but it is more focused for computer vision.
About the languages C vs Java, most has been said in the other answers. I should add that these libraries are now C++ based and not plain C. If your applications have real-time processing needs, C++ will probably be better for that, if not, Java will be more than enough as it is more friendly.
Ideally, you would use something like Java or (even better) Python for "high-level" stuff, and compile in C the routines that require a lot of processing power (for instance using Cython, etc).
Some scientific libraries exist for Python (SciPy and NumPy), and they are a good start, although it isn't yet straightforward to combine Python and C (you need to tweak things a bit).
just my two pence worth: java doesn't allow the use of pointers as opposed to C/C++ or C#. So if you are going to manipulate pixels directly, i.e. write your own image processing functions then they will be much slower than the equivalent in C++. On the otherhand C++ is a total nightmare of a language compared to java. it will take you at least twice as long to write the equivalent bit of code in c++. so with all the productivity gain you can probably afford to buy a computer that makes up for the difference in runtime ;-)
i know other languages aren't an option for you, but personally i can highly recommend c# for image processing or computer vision: it allows pointers and hence IP functions in c# are only half as slow as in C++ (an acceptable trade-off i think) and it has excellent integration with native C++ and a good wrapper library for opencv.
Disclaimer: I work for TunaCode.
If you have to make a choice between different languages to get started on Image Processing, I would recommend to start with C++. You can raw pointer access which is a must if you want to operate on individual pixels.
Next, what kind of Imaging are you interested in? Just for fun image filters or some heavy stuff like motion estimation, tracking and detection etc? For that I would recommend you take a look at CUVILib since sooner than later, you will need performance on Imaging functionality and that's what CUVI provides. You can use it as standalone if it serves your purposes or you can plug it with other libraries like Intel IPP, ITK, OpenCV etc.

Resources