I have a point-of-sale application written in Perl/Tk. I use X11::GUITest to do automated testing of it, driving the app via hot-keys bound to the buttons and other widgets (it's normally touch-screen driven). However, X11::GUITest doesn't have a way to "read" text back from the screen, so I resort to augmenting the app to write temp files as well as putting data on the screen. The test scripts then look at the temp files, not the GUI. But I'd love to extend X11::GUITest or make a new CPAN module that can scrape text strings from X11 GUIs. I'm not after graphics-to-text conversion; it's my (faint) understanding that somewhere in the depths of the X window system, label text and such are stored as text strings and rendered to bitmap form late in the pipeline (?).
Anyone have guidance on how to do this, or pointers on where to start?
Yeah, I know I should've adhered to better MVC separation and not actually test at the GUI level, but just below it; however reality got in the way and it is what it is!
The best way to do this is to make your program work with an accessibility framework like ATK (used in GTK applications) and then use that to query for the strings as a screen reader would have to for text-to-speech translation. This is the approach taken by the Linux Desktop Testing Project and dogtail testing frameworks. You get the bonus of using existing, well tested code and making your application more usable by disabled users (as may be required by laws such as the Americans with Disabilities Act in the US and similar laws in many other countries).
If your application is using modern font frameworks, like libXft2, this may be your only choice, as those strings are only in the client application, not the X server, and the character to pixmap conversion is done in the client. (If your text is antialiased, it must be using these instead of the legacy X11 API's.)
Even with the legacy X11 API's though, the X server doesn't store the strings once the text to bitmap conversion is done, so there's no good way to query them short of intercepting them in that case.
The program listres lists resources in widgets, including label texts and I think the contents of text entry boxes. You may be able to use its output directly, extracting what you need, or you may need to look at the source and see how it's done.
Related
Before I get shot down on this one, I realize that the 'how' answer for this question might be slightly debatable, however I'm more interested in the 'what'.
In a nut shell I want to know which methods I can use to interact with a PC video game interface. I want to create a program that can extract data from a video game market interface.
My first initial thought was that I would need to programmatically take screen shots and then use some Optical Character Recognition software to extract the text. Then run whatever operation on the extracted text to derive my incites.
Then I was thinking it might just be easier to have a bunch of mini screen shots that I just use to find matches on certain sections of the screen. When a match is found, I would then know what the text is on the screen, without having to actually 'extract' it.
For those out there whom have done this, can you point me in one direction or the other? Perhaps there is a method that I am completely unaware of.
If its the case that this question is not suitable for this forum. It would be much appreciated if you could direct me elsewhere.
Edit: I should probably add that I'm not looking to spend a fortune on this project... so any free software would be the best. Perhaps that's a tall order.
I'm starting to think Sikuli is the direction I'm going to go. Open Source image recognition software, integrates with Python, Ruby, Java, JDBC, JavaScript and more.
-- Expanding on the question --
There are basically 3 categories of tools:
Recorder while you manually work along your workflow, a recorder tracks your mouse and keyboard actions. After stopping the recording, you might playback (autorun your worflow). The recordings can usually be edited and augmented with additional features.
GUI aware the tool allows to programmatically operate on GUI elements like buttons. This is based on the knowledge of internal structures and names of the GUI elements and their features. Some of these tools also have a recording feature.
Visually the tool “sees” images (usually retangular pixel areas) on the screen and allows to act on these images using mouse and keyboard simulation. There might be some recorder feture as well with such a tool.
SikuliX belongs to the 3rd category and currently does not have a recorder feature.
Answer in progress...
In games with moddable UIs, like many MMOs, you could create a mod that streams data through a series of black and white squares that could be read with optical sensors. From there, a microcontroller could deliver the data back to the PC via USB or wifi.
My approach as a noob. First determine if OCR 100% needed, I think this plays a role in speed.
if possible:
-run game in window (allows for trouble shooting and easy troubleshooting)
-is there a high contrast option for game? Will help Sikuli find things
then you plan out your scenarios:
You have to create different functions for different situations. A lot of gaming is "do you see this?" Then "do this" until that is gone.
Start with small parts you want to automate then build on them. Making sure your parts can scale in case small change need to happen, they will. For instance you want to open the menu if you see an object, lets say a tree.
Assume you have some sort of walking algorithm.
setROI(region1) #focus here for tree
if exists(tRee):
click(loCation) #you could hit the shortcut key to opening the menu
click(iTem) #if the item moves in the menu then you may need to scroll to find it first or you can change the ROI and start seeing if sikuli can differentiate your item from one you dont want to click.
You would get that to loop into other actions and proceed. Goodluck.
I tried to sign up, but I was unable; perhaps a problem from my side. Hopefully I'll get an answer as anonymous.
I apologize for the grammar/syntax, but English isn't my native language.
Recently I lost my job, so I have enough spare time to try something fun. I decided to create a simple text RPG game for me and some friends. It will very close to the board games like Talisman, Dungeon Run, and HeroQuest, using dice and a simple attribute/skill system. So no 3d graphics. The only 2d element, if I decide to include it, will be a map
that will allow the hero to move between locations. Currently I'm using Windows XP SP3, for the game I use wxDev-C++, and although cross platform would be cool, I don't really care.
I have some experience in C++ (currently using wxDev-C++), but I'm far from being called an expert or even a great programmer. I was about to start writing parts of the code, but I decided to check if creating a GUI for the game is possible. In some forums, many suggested I use Qt, CEGUI or wxWidgets, but most examples I saw are grey boxes that are
indifferent at best, when I want something that fits better in a fantasy setting. I don't claim I would do better, but I want a GUI that is more fantasy related.
What I want from the GUI:
1. A "cool" Gui with decent graphics. I could even create an image to serve as a mask in Photoshop, but the GUI builder will have to support imported images.
2. A relatively large textbox in the middle (with a scrollbar) that will display die rolls, damage and options.
3. The ability to display dynamically values (like the change in the health after each action without requiring to refresh manually)
4. Display an icon or a small image of the character in the area where I display stats/abilities.
5. Open new windows created with tha same GUI builder to allocate points, buy/sell things and open a map.
About the map in the game: I decided to create a map in photoshop. When the hero decides to move to another location, a new window will open showing the map. I thought of 2 possible ways to move between locations: 1) Create hotspots on the image and select one by clicking on the name of the location.(I dare not think about the complexity of this so we
move to idea #2) and 2) Have the image as a backgroung to a grid with vertical and horizontal coordinates. When the hero selects a new area to visit, he clicks on the area, but what he really does is click on the grid, which returns the two values (x,y) of the location and informs the game about the area the hero wants to visit.
Yeah, yeah, I know it's too much, so what I'm most interested in are the 1-3. I know that even if they are possible, it will propably take forever, but as I said I have spare time, and I like learning new things. I apologize for the size of the post, but I decided to post as many info as possible so you know what I want.
If any of you has used Qt, CEGUI or wxWidgets could you tell which covers most of my criteria? I saw some great stuff build with CEGUI, but I don't know if it is too hard to learn?
Thank in advance.
I know my answer comes pretty late, I only recently started using stackoverflow fairly recently, but maybe this response will help anybody.
CEGUI fully supports skinning widgets using XML. Our CEED editor (WYSIWYG) fully supports layout editing, but the skinning editor (LNF editor) is not finished as of now (11.11.2014), the development version supports exchanging images however and changing sizes and proportions, but more advanced adjustments have to be done in XML.
CEGUI has an imageset editor, fully supported by the CEED editor. Creating imagesets (sets of named subimages, with position and dimension inside a big texture atlas) is supported there. Additionally there is a way to create imagesets from just a bunch of jpg/png/... files using a tool. You would have to ask for specifics in the forum though because it is not integrated into CEED yet.
So basically with CEGUI you are free to make whatever fantasy GUI you want. Skinning simple elements like buttons and progress bars isn't much work in XML anyways. Without the finished editor, some more advanced widgets are more work to skin, but many skins have already been created done this way and some of them are even publically available in the forum and in the CEGUI stock files.
StaticText widgets supports what you want, you can even use images in there or change fonts and colours in the text if you want. Scrollbars are supported too.
I am not sure what you mean by this. You have to specify this.
A simple "Generic/Image" widget is available in CEGUI for this purpose. You can use precreated images or even RTT textures.
You can create and destroy windows in CEGUI without issues.
Regarding the map: I m not sure what you mean, but getting the position of a click in respect to an image (representing the map) is possible in CEGUI.
CEGUI is not particularly hard to learn. There is always the forums and the chat if you got questions. For an Open Source project it is quite well documented so if you read all of the API docu, and look at the supplied samples in the sample browser, you should already get quite far. And for everything additional there is the forum (search), the IRC chat and a community wiki (mind the targeted versions of an article there though)
For a project like yours, CEGUI seems perfectly suited (this is what it was created for in the first place). Qt is not really optimal for games for numerous reasons. wxWidgets I have never used.
Does anyone knows about any editor allowing to visually design a form (by form I do not mean DFM or Delphi form, but a "paper form", like those pre-printed forms that you fill with some info) and that generates pascal commands to draw that form in a Printer (or Image) canvas?
What I want is an easy way to draw/design this form visually, composed just by lines and text, and a way to convert this to Pascal commands that when run, will draw that form in a Canvas (Image or Printer), respecting the original layout and scale, doesn't matter the Canvas DPI where it is being drawn.
Update: Maybe I wasn't clear enough about what I need and why I need it. I developed an Open Source component called TFreeBoleto (freeboleto.sf.net). It is used to generate and print bank billets (a common method for billing people in Brazil). Right now, the component uses a TBitmap image containing the "billet" mask, and TextOut methods for the dynamic areas (ie: billet number, customer name, etc). It is fine when looked in the screen, but some people complains that the quality of the printed image is not good. The component uses a BltTBitmapAsDib procedure to maximize the quality of printing, but some people still think it is not good enough. So, my idea was to avoid using a bitmap image as the form layout, and draw everything direct in the canvas (both form and printer). Check here for a sample of what a bank billet looks like.
Of course ReportBuilder and/or FastReport could solve the problem, but they are not free, so I cannot include it in the component. I need "native" solution that any standard Delphi install would be able to compile.
You might get what you want out of the Fast Reports Report Designer which is a commercial reporting system for Delphi. Remember that a report is just a page. That page can be shown on the screen or printed on the printer.
You also might find that something like TRichView helps you.
Whether using TRichView in particular or not, I would look into using HTML to do what you want. I would use HTML+CSS to do both a screen and printer layout, that can also be viewed on the web. For simple text layout plus text boxes I think even bare HTML and HTML tables might be sufficient. To visually design simple text pages, using a Delphi application, I would use TRichView.
In both cases, you would be creating documents, not code. To create code that creates a page, without using any document system, would be very difficult indeed, and I am not sure what you would really do with that code, since you would need a compiler or interpreter to convert that code into something that you could use. Please clarify what you mean by "creating code", and what syntax you would want that code to be using. If HTML is code in your definition of "code" then maybe HTML is the best kind of "code" for your problem.
I do my form-work with WPTools. It is also a commercial product. The core is a very good wordprocessor and form-designer. The engine can render text and forms to any canvas (screen, printer, also create pdf) and is highly flexible. Output is mainly rtf and html.
I also see no advantage in creating pascal code to redraw the form. What you need, i think, is a good WYSIWYG-editor which creates a document that fits your needs.
Check out ReportBuilder # http://www.digital-metaphors.com/
It is a commercial reporting tool for Delphi - around a long time, very high quality, with all native Delphi source code packaged with it. I am using it for an important commercial project right now and I recommend it highly (I'm not working for them.) I've used MANY Delphi reporting tools over the years and this one is the best IMO.
RBuilder also has extensive support for paper form emulation see:
http://www.digital-metaphors.com/products/report_design/form_emulation.html
I haven't worked with that feature, but you can download a full-featured demo and try it.
Yoy can use Adobe Acrobat (full version) to create forms.
Then you can use free Acrobat Reader to display and print forms or other COM object in your application.
I think it is best solution for you.
PS
All tools for reports that are included in Delphi are free for you to design form and are free to distribute if user only preview and print already designed reports.
The same is valid for Adobe Acrobat (you may distribute forms) but you have added that you need to print form and some text over form. Maybe it is easier if you use reports but it is possible to do the same using PDF.
Most report engines are not open source but are free to distribute. There is many components for creating PDF - paid (one time), free, as well as open source.
PPS
I have read your updete for second time. Since you are using TBitmap and you can to TextOut so: You can use TMetafile. There is many editors for metafiles and it is free to distribute metafiles.
Ours is an open-source Mac application localized by volunteers. These volunteers will do their work on special localization builds of the software (with unstripped nibs), then send us the changes to integrate into the original xib and strings files.
The problem is that, while there is a way to integrate string changes without blowing away previous size changes, I can't see a way to integrate new string and size changes (as when we add or replace views).
The only way to do both that I can see is for localizers to work directly with the original xibs and send us diffs. That means they have to download the entire source code, not just a localizable version of the release, work in Xcode as well as IB, and either run the diff command themselves (per xib) or install and use Mercurial.
Is there any better way for a xib-based application?
UPDATE January 2014: Apple’s ‘autolayout’ code combined with their ‘Base’ localization stuff took most my ideas and improved on them. I recommend against using my old tools that I talk about in this answer. But, also, man was I right!
I strongly strongly STRONGLY recommend against frame changes in localizations. I know this runs counter to Apple's advice, but there are SO MANY problems with allowing frame changes - you end up with a billion edge cases.
Imagine you have 10 XIBs in your app, and you support 12 languages. You've got 120 different layouts to support, now. You just can't do this.
Change the strings, leave the views where they are. Make 'em bigger in ALL languages, if you need to. It sounds like this shouldn't work but it does. (I won three Apple Design Awards with an app that's localized in 10 or so languages this way.)
Specifics:
For radio and checkboxes, just let them extend far to the right, beyond the last English character. That also provides a nice big landing area for imprecise mousers.
For buttons, they should be wide anyhow, because it never looks good to have text cramped in the middle of the buttons.
For titles on tableview columns, you should autosize when you load 'em up, if needed.
For explanatory text, you should have some extra space to the right, and maybe an extra line. It just makes the English version of the XIB seem less cluttered. Sure, the Germans are going to see a slightly tighter XIB, but, hey, they're Germans -- they're probably used to that. There's probably even a German word for it. "Deutscheninterfakkenclutterlongen."
If a text field is centered, just add equal space on both sides. There's no reason not to.
I've combined this with scripts that suck all the strings out of my XIBs and put them in .strings files, and then dynamically put the strings back at run-time, so anyone can localize my app without any special tools. Just drop in a bunch of .strings files and run it!
Blog post including full source: [Lost in Translations]¹.
I confess that I'm not all that familiar with the process of localizing Mac apps. But I did run across a script that's part of the Three20 iPhone library that seems like it might be useful: diffstrings.py is a Python script that "compares your primary locale with all your other locales to help you determine which new strings need to be translated. It outputs XML files which can be translated, and then merged back into your strings files."
EDIT: As a companion to Wil Shipley's answer to this question, I'll add a link to a blog post he just wrote that goes into more detail about localization, and provides some of the tools that he's built to ease the process.
Where I used to work, we had this issue as well. Our app was getting translated into 10 different languages.
At first, we tried doing what Wil suggested, which is to make everything super wide and fit in every language. Unfortunately, "online backup" might be pretty short in English, but in other languages (especially Spanish), it's really long ("copia de seguridad" just means "backup"). Widening our UI made everything look pretty terrible.
One day, I was playing around with some Core Animation stuff and discovered the CAConstraint class. CAConstraint is basically a way to define a layout relationship between two CALayers. You give one layer a name (like "layerA") and then say "layerB is constrained [in such-and-such a way] to a sibling layer called layerA". Then, whenever the layer named layerA is repositioned or resized, layerB automatically moves as well. It's really neat, and it's just what we were looking for.
After a couple of days of work, I came up with what is now CHLayoutManager. It's basically a re-make of CAConstraint and friends, but for NSViews. Here's a simple example of how it works:
CHLayoutConstraint * centerHorizontal = [CHLayoutConstraint constraintWithAttribute:CHLayoutConstraintAttributeMidX relativeTo:#"superview" attribute:CHLayoutConstraintAttributeMidX];
CHLayoutConstraint * centerVertical = [CHLayoutConstraint constraintWithAttribute:CHLayoutConstraintAttributeMidY relativeTo:#"superview" attribute:CHLayoutConstraintAttributeMidY];
[aView addConstraint:centerHorizontal];
[aView addConstraint:centerVertical];
This will keep aView centered in its superview, regardless of how the superview is resized. Here's another:
[button1 setLayoutName:#"button1"];
[button2 addConstraint:[CHLayoutConstraint constraintWithAttribute:CHLayoutConstraintAttributeMinX relativeTo:#"button1" attribute:CHLayoutConstraintAttributeMaxX]];
[button2 addConstraint:[CHLayoutConstraint constraintWithAttribute:CHLayoutConstraintAttributeMaxY relativeTo:#"button1" attribute:CHLayoutConstraintAttributeMaxY]];
[button2 addConstraint:[CHLayoutConstraint constraintWithAttribute:CHLayoutConstraintAttributeWidth relativeTo:#"button1" attribute:CHLayoutConstraintAttributeWidth]];
This will keep button2 anchored to the right edge of button1, as well as keeping button2's Y position and width the same as button1's.
Internally, CHLayoutManager uses an NSValueTransformer to calculate the new positioning information. Some of the CHLayoutConstraint initializers accept an NSValueTransformer, so you can create arbitrarily complex layout manipulations.
We used this for constraining and laying out the entire UI, and then doing all of the localization in code (and subsequently calling -sizeToFit, with some modifications). Our UI would just flow into its final layout. It turned out to be extremely convenient. We'd just package up our .strings files, send them off to the translators, and then drop them in to place when we got them back, and our app would instantly be localized for that language.
CHLayoutManager isn't perfect. It doesn't resolve conflicts, but simply applies constraints in the order they're added. So you can constrain (for example) the MinX of a view 42 different ways, but only the last one will be used. Also, if you constrain the MinX and the MaxX, they'll also be applied in the order they're added and will not end up stretching or shrinking the width. In other words, constraining one attribute of a view will not affect the other attributes. It's compatible with 10.5+ (GC and non). However, due to some changes in Lion, it's unlikely that I'll address the shortcomings.
Despite these shortcomings, it's an extremely flexible framework, and (IMO) some pretty nifty code. (Plus, I swizzle -[NSView dealloc]! Yay!)
Update
Now that AppKit has this same functionality (via NSLayoutConstraint), I recommend using that system instead of CHLayoutManager. It is far more robust.
Bear in mind that XIB files are merely XML files in an obscure format, so you can translate those easily enough, provided that you can find what strings there are there to translate in the first place. For example, here's the snippet that creates a button called Jenson:
<object class="NSButtonCell" key="NSCell" id="41219959">
<int key="NSCellFlags">67239424</int>
<int key="NSCellFlags2">134217728</int>
<string key="NSContents">Jenson</string>
...
</object>
So you can get that string translated and then substitute occurrences of it in the XIB with your translated value. In order to verify it's working as expected, you could change the language to use random keys instead (like BUTTON_TITLE) which will make it easier to spot when one's missing.
However, the positions/sizes of the items are fixed, so you can have titles that overflow the space given in a different language. That's one of the reasons why Macs have separate XIB files for every language, to allow adjustments to be made on a language-by-language basis, however difficult it is to maintain.
Is there some standard way to make my applications skinnable?
By "skinnable" I mean the ability of the application to support multiple skins.
I am not targeting any particular platform here. Just want to know if there are any general guidelines for making applications skinnable.
It looks like skinning web applications is relatively easy. What about desktop applications?
Skins are just Yet Another Level Of Abstraction (YALOA!).
If you read up on the MVC design pattern then you'll understand many of the principles needed.
The presentation layer (or skin) only has to do a few things:
Show the interface
When certain actions are taken (clicking, entering text in a box, etc) then it triggers actions
It has to receive notices from the model and controller when it needs to change
In a normal program this abstraction is done by having code which connects the text boxes to the methods and objects they are related to, and having code which changes the display based on the program commands.
If you want to add skinning you need to take that ability and make it so that can be done without compiling the code again.
Check out, for instance, XUL and see how it's done there. You'll find a lot of skinning projects use XML to describe the various 'faces' of the skin (it playing music, or organizing the library for an MP3 player skin), and then where each control is located and what data and methods it should be attached to in the program.
It can seem hard until you do it, then you realize it's just like any other level of abstraction you've dealt with before (from a program with gotos, to control structures, to functions, to structures, to classes and objects, to JIT compilers, etc).
The initial learning curve isn't trivial, but do a few projects and you'll find it's not hard.
-Adam
Keep all your styles in a separate CSS file(s)
Stay away from any inline styling
It really depends on how "skinnable" you want your apps to be. Letting the user configure colors and images is going to be a lot easier than letting them hide/remove components or even write their own components.
For most cases, you can probably get away with writing some kind of Resource Provider that serves up colors and images instead of hardcoding them in your source file. So, this:
Color backgroundColor = Color.BLUE;
Would become something like:
Color backgroundColor = ResourceManager.getColor("form.background");
Then, all you have to do is change the mappings in your ResourceManager class and all clients will be consistent. If you want to do this in real-time, changing any of the ResourceManager's mappings will probably send out an event to its clients and notify them that something has changed. Then the clients can redraw their components if they want to.
Implementation varies by platform, but here are a few general cross-platform considerations:
It is good to have an established overall layout into which visual elements can be "plugged." It's harder (but still possible) to support completely different general layouts through skinning.
Develop a well-documented naming convention for the assets (images, HTML fragments, etc.) that comprise a skin.
Design a clean way to "discover" existing skins and add new ones. For example: Winamp uses a ZIP file format to store all the images for its skins. All the skin files reside in a well-known folder off the application folder.
Be aware of scaling issues. Not everyone uses the same screen resolution.
Are you going to allow third-party skin development? This will affect your design.
Architecturally, the Model-View-Controller pattern lends itself to skinning.
These are just a few things to be aware of. Your implementation will vary between web and fat client, and by your feature requirements. HTH.
The basic principle is that used by CSS in web pages.
Rather than ever specifying the formatting (colour / font / layout[to some extent]) of your content, you simply describe what kind of content it is.
To give a web example, in the content for a blog page you might mark different sections as being an:
Title
Blog Entry
Archive Pane
etc.
The Entry might be made of severl subsections such as "heading", "body" and "timestamp".
Then, elsewhere you have a stylesheet which specifies all the properties of each kind of element, size, alignment, colour, background, font etc. When rendering the page or srawing / initialising the componatns in your UI you always consult the current stylesheet to look up these properties.
Then, skinning, and indeed editing your design, becomes MUCH easier. You simple create a different stylesheet and tweak the values to your heat's content.
Edit:
One key point to remember is the distinction between a general style (like classes in CSS) and a specific style (like ID's in CSS). You want to be able to uniquely identify some items in your layout, such as the heading, as being a single identifiable item that you can apply a unique style to, whereas other items (such as an entry in a blog, or a field in a database view) will all want to have the same style.
It's different for each platform/technology.
For WPF, take a look at what Josh Smith calls structural skinning: http://www.codeproject.com/KB/WPF/podder2.aspx
This should be relatively easy, follow these steps:
Strip out all styling for your entire web application or website
Use css to change the way your app looks.
For more information visit css zen garden for ideas.
You shouldn't. Or at least you should ask yourself if it's really the right decision.
Skinning breaks the UI design guidelines. It "jars" the user because your skinned app operates and looks totally different from all the other apps their using. Things like command shortcut keys won't be consistent and they'll lose productivity. It will be less handicapped accessible because screen readers will have a harder time understanding it.
There are a ton of reasons NOT to skin. If you just want to make your application visually distinct, that's a poor reason in my opinion. You will make your app harder to use and less likely that people will ever go beyond the trial period.
Having said all that, there are some classes of apps where skinning is basically expected, such as media players and imersive full screen games. But if your app isn't in a class where skinning is largely mandated, I would seriously consider finding other ways to make your app better than your competition.
Depending on how deep you wish to dig, you can opt to use a 'formatting' framework (e.g. Java's PLAF, the web's CSS), or an entirely decoupled multiple tier architecture.
If you want to define a pluggable skin, you need to consider that from the very beginning. The presentation layer knows nothing about the business logic but it's API and vice versa.
It seems most of the people here refer to CSS, as if its the only skinning option.
Windows Media Player (and Winamp, AFAIR) use XML as well as images (if neccesary) to define a skin.
The XML references hooks, events, etc. and handles how things look and react. I'm not sure about how they handle the back end, but loading a given skin is really as simply as locating the appropriate XML file, loading the images then placing them where they need to go.
XML also gives you far more control over what you can do (i.e. create new components, change component sizes, etc.).
XML combined with CSS could give wonderful results for a skinning engine of a desktop or web application.