We're writing a chess app for the terminal. We want the squares to be actual squares (or close enough) instead of the roughly 2:1 rectangles that regular monospace Unicode characters give us.
Right now we're thinking of taking one of the available open-source chess fonts and turning its pieces into two-character-wide ligatures. We want the custom font to be interoperable with the user's choice of terminal font, assuming it is monospace.
This seems problematic, as we would need to override the font only for specific character pairs.
Another way is to take an existing, widely used typeface with a wide character variety (say, JetBrains Mono), add the chess-piece ligatures to it, and package it as the canonical font.
However, this seems an overly complicated and opinionated route. Are we missing a simpler alternative?
We are using Python's Textual library for drawing the actual squares, and drawing the square outlines seems relatively straightforward.
Here's the opening position of the game as a reference for what we want to achieve:
BRBNBBBQBKBBBNBR
BPBPBPBPBPBPBPBP
................
................
................
................
WPWPWPWPWPWPWPWP
WRWNWBWQWKWBWNWR
Here one letter pair is one piece: the first letter signifies the color and the second the type of the piece. Obviously, the letter pairs are to be replaced with the corresponding chess-piece symbols.
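Whatever the font route, the mapping itself is simple. A minimal sketch, assuming the board layout strings above; the glyph choices and the `render_rank` helper are ours, and this does not address the square-aspect problem, only the letter-pair-to-glyph translation:

```python
# Map the two-letter piece codes to Unicode chess glyphs.
GLYPHS = {
    "WK": "\u2654", "WQ": "\u2655", "WR": "\u2656",
    "WB": "\u2657", "WN": "\u2658", "WP": "\u2659",
    "BK": "\u265A", "BQ": "\u265B", "BR": "\u265C",
    "BB": "\u265D", "BN": "\u265E", "BP": "\u265F",
}

def render_rank(rank: str) -> str:
    """Translate one 16-char rank ('BRBN...' or '................') to glyphs."""
    out = []
    for i in range(0, len(rank), 2):
        pair = rank[i:i + 2]
        # '..' marks an empty square; render it as a middle dot.
        out.append(GLYPHS.get(pair, "\u00B7"))
    return "".join(out)

print(render_rank("BRBNBBBQBKBBBNBR"))  # black back rank
print(render_rank("................"))  # empty rank
```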
I want to print a QR code with its middle portion blacked out, and then print variable data on the black square (which would be none of the QR code's business).
How can I achieve that? One way could be: while generating the QR code, I define my timing pattern or some configuration so that this centered black square is fixed.
I'll be using my own app to decode it, so I would know the configuration while decoding as well.
The concept of a 2D barcode with a "free" center area is certainly plausible: Snapcodes are one example, and Denso Wave (who invented QR) have a proprietary form called Frame QR.
That said, what you create should not be called a QR code, and decoders will not be compatible. The ones you see in the wild with artistic centers are using error correction to achieve the effect.
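As a rough sanity check on the error-correction route: QR's nominal recovery capacities are about 7/15/25/30% of codewords for levels L/M/Q/H. A back-of-envelope sketch, where the function and its caveats are ours, not part of any QR library:

```python
# Nominal QR error-correction recovery capacities per ECC level.
ECC_RECOVERY = {"L": 0.07, "M": 0.15, "Q": 0.25, "H": 0.30}

def max_cover_modules(total_data_modules: int, level: str) -> int:
    """Rough upper bound on how many data modules you can obscure.

    In practice you must stay well below this: damage is counted per
    codeword, not per module, and the finder/alignment/timing patterns
    must not be touched at all.
    """
    return int(total_data_modules * ECC_RECOVERY[level])

print(max_cover_modules(1000, "H"))  # -> 300
```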
I have an application that needs to separate Japanese characters one by one from an image.
Input: an image with ONE line of Japanese text. It can contain halfwidth Katakana, halfwidth numbers, fullwidth Katakana, Hiragana, and fullwidth numbers, and maybe halfwidth or fullwidth English characters as well (let's forget about English characters for the moment).
Issue:
I can easily separate the characters using adaptive thresholding, dilating, and eroding, but there is one big issue.
Some Japanese characters have a gap inside them, like 川, 体, 休, and 非, so simply looking for vertical white gaps doesn't help. Looking at the width doesn't help either, because there can be fullwidth (2-byte) or halfwidth (1-byte) characters. I seem to need a more refined way to do this.
Any idea how I should proceed? Any idea is a good idea :)
Here are a couple of sample images (the characters circled in red are the problematic ones):
http://imageshack.us/a/img833/3810/e31z.png
http://imageshack.us/a/img12/2395/7mqn.png
Don't expect to find one single simple algorithm able to do what you want; be prepared to combine a handful of techniques, including, but not limited to, those you already mentioned.
My personal advice, based on previous personal experience, would be to take a look at template-matching techniques.
Basically, here's what you'll need to do:
Select a few sample images of each symbol you want to identify to form your templates database.
Develop an algorithm to segment each individual character out of the image. I think you've accomplished that already.
Here it is important that you scale the characters and normalize their perspective so that they match the exact conditions under which the templates were generated. getPerspectiveTransform and warpPerspective might come in handy.
Compare each character against each of your templates using cv::matchTemplate for example.
Out of the top matches, do some fine selection using heuristics like those you mentioned yourself, namely checking for the existence of gaps in expected places and so on.
Test and retest, refining the heuristics for the closest cases till you reach the desired accuracy.
If you find yourself dealing with too much variety in terms of lighting conditions, character colors, fonts, sizes, and so on, you'll realize you need a huge template database to cover all the possibilities. In this case, it might help to use some transform that is invariant to the varying conditions. For character identification, I believe skeletonization could work well: take a look at topological skeletons and morphological skeletons.
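To make the matching step concrete, here is a toy pure-Python sum-of-squared-differences version on tiny binary grids. A real implementation would use cv2.matchTemplate on grayscale images; the 3x3 "templates" and labels below are made up purely for illustration:

```python
def ssd(a, b):
    """Sum of squared differences between two equal-size 2D grids."""
    return sum(
        (a[r][c] - b[r][c]) ** 2
        for r in range(len(a))
        for c in range(len(a[0]))
    )

def best_template(character, templates):
    """Return the label of the template with the lowest SSD score."""
    return min(templates, key=lambda label: ssd(character, templates[label]))

# Hypothetical 3x3 templates for two made-up symbols.
templates = {
    "bar":  [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "dash": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
}
segmented = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]  # a noisy vertical bar
print(best_template(segmented, templates))      # -> bar
```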
It sounds like OCR is what you need to do. As this link says, OpenCV doesn't support OCR, but there is another open-source engine, Tesseract, which will do this. Just check if it helps.
A few more links I found by googling:
OpenCV OCR
OCR example in OpenCV
Hope this helps!
I am trying to crop out the printer marks that are at the edges of a PDF.
The path I want to take to solve this problem is as follows:
Convert the PDF into a bitmap, traverse the bitmap to find the lines, and once the lines are found, find the coordinates of their edges and set the cropping coordinates to those.
However, the problems that pop up in my mind with this approach are how to know where the lines end and the actual page starts, and how to differentiate lines from letters.
How do I overcome these hurdles, or is there a better way to crop the printer marks out of a PDF?
There is no general answer that works for ALL PDF files. However, there are a few useful strategies implemented by existing graphic-arts solutions such as callas pdfToolbox (full disclosure: I'm associated with this product) or PitStop. The strategies center around a number of facts:
Trim and bleed marks are usually simple lines (though thin rectangles are sometimes used as well). They are short and straight (horizontal or vertical).
These marks are usually drawn in specific colours: either CMYK with the color set to 100%, 100%, 100%, 100%, or, more commonly, a special spot colour called "All". You're almost guaranteed of this because these marks need to show up on every printed separation (sorry for the technical printing terms if you're not familiar with them).
These marks are normally mirrored symmetrically. You're not looking for a single mark; you're looking for a set of them, and this typically helps recognition a lot. Watch out, however, that you're not confused by bad applications that don't place marks with absolute accuracy.
Lastly, and perhaps not important in your application: different regions actually work with different types of marks. Japanese trim and bleed marks, for example, look completely different from European or US marks.
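The heuristics above can be sketched as a simple filter over candidate line segments. The segment and colour representation here is hypothetical; a real tool would extract segments and their colours from the PDF content stream with a parsing library:

```python
def looks_like_trim_mark(seg, page_w, page_h, max_len=30, margin=40):
    """seg: dict with x1, y1, x2, y2 (in points) and a 'color' name.

    Heuristics from the answer: the segment must be short, axis-aligned,
    near a page edge, and drawn in registration colour (100% CMYK) or the
    'All' spot colour. Thresholds are guesses to be tuned per workflow.
    """
    dx, dy = abs(seg["x2"] - seg["x1"]), abs(seg["y2"] - seg["y1"])
    axis_aligned = dx == 0 or dy == 0
    short = max(dx, dy) <= max_len
    near_edge = (
        min(seg["x1"], seg["x2"]) <= margin
        or min(seg["y1"], seg["y2"]) <= margin
        or max(seg["x1"], seg["x2"]) >= page_w - margin
        or max(seg["y1"], seg["y2"]) >= page_h - margin
    )
    mark_colour = seg["color"] in ("All", "Registration", "CMYK 100/100/100/100")
    return axis_aligned and short and near_edge and mark_colour
```

A symmetry check (every mark should have a mirrored partner) would go on top of this per-segment filter.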
I have nothing useful to do and was playing with a jigsaw puzzle like this:
(image: http://manual.gimp.org/nl/images/filters/examples/render-taj-jigsaw.jpg)
and I was wondering if it'd be possible to make a program that assists me in putting it together.
Imagine that I have a small puzzle, say 4x3 pieces, but the little tabs and blanks are non-uniform: different pieces have tabs at different heights, of different shapes, of different sizes. What I'd do is take pictures of all of these pieces, let a program analyze them, and store their attributes somewhere. Then, when I pick up a piece, I could ask the program which pieces should be its 'neighbours'; or, if I have to fill in a blank, it would tell me what the wanted puzzle piece(s) should look like.
Unfortunately, I've never done anything with image processing or pattern recognition, so I'd like to ask you for some pointers: how do I recognize a jigsaw piece (basically a square with tabs and holes) in a picture?
Then I'd probably need to rotate it so it's in the right position, scale it to some proportion, and then measure the tab/blank on each side, as well as each side's slope, if present.
I know it would be too time-consuming to scan/photograph 1000 puzzle pieces and use them; this would just be a pet project where I'd learn something new.
Data acquisition
(This is known as Chroma Key, Blue Screen or Background Color method)
Find a well-lit room, with the least lighting variation across the room.
Find a color (hue) that is rarely used in the entire puzzle / picture.
Get colored paper of exactly that color.
Place as many puzzle pieces on the color paper as it'll fit.
You can categorize the pieces into batches and use that as a computer hint later on.
Make sure the pieces do not overlap or touch each other.
Do not worry about orientation yet.
Take picture and download to computer.
Color calibration may be needed because the Chroma Key background may have upset the built-in color balance of the digital camera.
Acquisition data processing
Get some computer vision software
OpenCV, MATLAB, C++, Java, Python Imaging Library, etc.
Perform connected-component analysis on the chroma-key color in the image.
Ask for the contours of the holes in the connected component; these are the puzzle pieces.
Fix errors in the detected list.
Choose the indexing vocabulary (cf. Ira Baxter's post) and measure the pieces.
If the pieces are rectangular, find the corners first.
If the pieces are slightly-off quadrilaterals, the side lengths (measured corner to corner) are also a valuable signature.
Search for "Shape Context" on SO or Google.
Finally, get the color histogram of the piece, so that you can query pieces by color later.
To make them searchable, put them in a database, so that you can query pieces with any combinations of indexing vocabulary.
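The connected-component step above can be illustrated in pure Python on a toy mask; a real pipeline would use OpenCV's connectedComponents or findContours on the chroma-key mask instead:

```python
from collections import deque

def label_pieces(mask):
    """Return a list of pixel sets, one per 4-connected component.

    mask: 2D list where 1 = puzzle-piece pixel (not the chroma-key
    colour) and 0 = background.
    """
    h, w = len(mask), len(mask[0])
    seen, pieces = set(), []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and (r, c) not in seen:
                queue, piece = deque([(r, c)]), set()
                seen.add((r, c))
                while queue:  # breadth-first flood fill
                    y, x = queue.popleft()
                    piece.add((y, x))
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                pieces.append(piece)
    return pieces

mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
print(len(label_pieces(mask)))  # -> 2 (top-left blob, bottom-right blob)
```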
A step back to the problem itself. The problem of building a puzzle can be easy (P) or hard (NP), depending on whether each piece fits only one neighbour or several. If there is only one fit for each edge, then you just find, for each piece and side, its neighbour, and you're done (O(#pieces * #sides)). If some pieces allow multiple fits with different neighbours, then, in order to complete the whole puzzle, you may need backtracking (because a wrong choice can get you stuck).
However, the first problem to solve is how to represent pieces. If you want to represent arbitrary shapes, then you can probably use transparency or masks to represent which areas of a tile are actually part of the piece. If you use square shapes then the problem may be easier. In the latter case, you can consider the last row of pixels on each side of the square and match it with the most similar row of pixels that you find across all other pieces.
You can use the second approach to actually help you solve a real puzzle, despite the fact that you use square tiles. Real puzzles are normally built upon a NxM grid of pieces. When scanning the image from the box, you split it into the same NxM grid of square tiles, and get the system to solve that. The problem is then to visually map the actual squiggly piece that you hold in your hand with a tile inside the system (when they are small and uniformly coloured). But you get the same problem if you represent arbitrary shapes internally.
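The last-row-of-pixels matching idea above can be sketched like this. The tile data and helper names are made up for illustration, and real tiles would be full grayscale or colour arrays:

```python
def right_edge(tile):
    """Last column of pixels (each row's rightmost value)."""
    return [row[-1] for row in tile]

def left_edge(tile):
    """First column of pixels (each row's leftmost value)."""
    return [row[0] for row in tile]

def edge_distance(e1, e2):
    """Sum of squared differences between two edge pixel columns."""
    return sum((a - b) ** 2 for a, b in zip(e1, e2))

def best_right_neighbour(tile, candidates):
    """Index of the candidate whose left edge best matches tile's right edge."""
    edge = right_edge(tile)
    return min(
        range(len(candidates)),
        key=lambda i: edge_distance(edge, left_edge(candidates[i])),
    )

a = [[10, 50], [20, 60]]       # right edge: [50, 60]
cands = [
    [[200, 5], [210, 5]],      # left edge: [200, 210] -> far
    [[52, 0], [59, 0]],        # left edge: [52, 59]   -> close
]
print(best_right_neighbour(a, cands))  # -> 1
```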
I've been researching about adding Arabic localisation into our software. I understand that mirroring of some controls is essential, such as the relationship of labels and textboxes: any labels that are to the left of textboxes in left-to-right languages need to be to the right of the textboxes in Arabic and other right-to-left languages. But I can't find a definite answer on whether the entire layout of the graphical user interface should be mirrored.
For example in our software we have a panel with a map in it on the left of a window and secondary information in a panel on the right. Should the position of these panels be reversed for right-to-left languages?
The main reason I ask is that Microsoft Windows takes the path of mirroring everything when in Arabic, but I just noticed in the new support for Arabic in the iPhone OS 3.0, virtually nothing is mirrored, not even label-textbox examples. I would have thought this to be quite bad on Apple's part, but is this sort of thing widely accepted in the Arab world?
In Arabic, we read and scan graphic elements right-to-left.
You will need to horizontally flip the whole display, except for the images and the text (the BiDi library, usually in the OS, handles that). You're only flipping the locations of the controls.
A control bounded by (x1,y1,x2,y2) on a w*h screen becomes (w-x2,y1,w-x1,y2). The contents of the rectangle are not flipped, just the position; the control keeps its dimensions.
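That transform is a one-liner; a minimal sketch (the helper name and pixel coordinate convention, origin at top-left, are ours):

```python
def mirror_rect(x1, y1, x2, y2, screen_w):
    """Mirror a control's bounding box horizontally for RTL layouts.

    Only the position flips; the width and height are preserved.
    """
    return (screen_w - x2, y1, screen_w - x1, y2)

print(mirror_rect(10, 20, 110, 60, 800))  # -> (690, 20, 790, 60)
```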
For example, compare Google News in English and in Arabic: notice the menu, the images, the news items.
Usually one exception is acceptable: media players are still left-to-right for historical reasons (tape recorders and VHS video devices were never localized).
It's important. Imagine using software written by an Arabic speaker that was translated into English, with some things not correctly flipped.
In general, Windows (Vista and Win7 in particular) and Office are very well translated for right-to-left languages. You can use them as an example of what to do; this is especially true for the top-level UI elements.
I would agree that purely graphical elements may not need to be flipped, but you should consider doing the right thing for anything that is generally scanned in reading order.
Mirroring is extremely important, not just because the text will be displayed properly, but because of the "mental model" of the user.
When a language is written right-to-left, the entire expectation of the user is shifted, but an even more interesting effect is the change of terminology. What we usually term positioning "left" and "right" actually end up being "beginning" and "end" (or "before" and "after") which makes mirroring the interface a little more elaborate.
On top of that, you will have elements that should not be mirrored. There are several examples of software that indiscriminately mirrors all buttons and creates hilarious (horrible...) results. A few examples:
"Undo" and "Redo" buttons: Undo means 'go backwards'; in English (LTR) this means go-to-the-left, but in Arabic or Hebrew it means go-to-the-right, so the button is flipped, and the same goes for the redo button. However, in RTL the order of those two buttons is also flipped, which means you "double flipped" their representation. This essentially gives you a pair of buttons in RTL that look exactly the same as the pair in LTR, except with different meanings, which can be confusing to implement.
Icon mirroring #1 (lists): For the most part, icons are mirrored. For instance, an icon representing a bulleted list will have bullets-on-the-left for LTR and bullets-on-the-right for RTL. Most software "simply" mirrors it, which is fairly reasonable. However, there are a few pieces of software (OpenOffice is one, unfortunately) that also mirror the numbered-list icon. This creates a situation where the numbers themselves are flipped (so the number 2 in that icon looks like a 5).
Icon mirroring #2 (question mark): Another example is a "help" button that is usually represented with a question mark. This is even more elaborate, because it differs between RTL languages. Some RTL languages, like Arabic, display the question mark reversed (؟), but others, like Hebrew, display it the same as English (?). This mirroring depends on the language, not just on RTL/LTR.
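The question-mark case boils down to a per-language lookup rather than a geometric flip. A sketch covering only the two languages named above; defaulting every other language to the Latin mark is our assumption:

```python
# Help-icon glyph per language code (ISO 639-1).
HELP_GLYPH = {
    "ar": "\u061F",  # Arabic question mark: ؟ (reversed)
    "he": "?",       # Hebrew uses the same mark as English
}

def help_glyph(lang_code: str) -> str:
    """Pick the help-button glyph; fall back to the Latin question mark."""
    return HELP_GLYPH.get(lang_code, "?")
```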
Another challenge is having a mixed-content software. For example, if your software displays some set content but allows the user to change the interface language, you may run into interesting flip-no-flip issues.
If you have some software that lets you read RSS feeds, for example, the content may consistently be LTR, but if your user switches their interface to RTL, then some interface elements should mirror while others shouldn't. If we go by the icon examples above, for instance, then "bullet lists" shouldn't flip in this case, because while the interface is RTL, the actual content you insert is expected LTR, so the bullet will practically be left-to-right, and the bullet represents that.
The challenge in these cases is deciding which piece of your software is "content" and which is "interface" and while it is doable, it's a challenging endeavor.
To sum up: mirroring interfaces is not as straightforward as it sounds, and many times it really depends on the purpose and context of your software. I would say it is extremely important, because RTL users' mental model (in terms of where things "start" and "end") is reversed; just as LTR users' eyes are automatically drawn to the top-left (which is why logos are usually placed there), RTL users' eyes are usually drawn to the top-right, and your interface should respect that. That said, you should analyze your interface and the way your content and interface are displayed in general, and beware of cases where flipping/mirroring is actually not suitable.
Hope this helps.