How to create custom single byte character set for Windows?


For non-Unicode applications, Windows uses an encoding table (a code page) to map characters from Unicode to a single-byte table. There are many predefined character sets, and the user can choose one in the Windows settings. I need to create a custom character set. Where can I find information about that process? I tried to Google it but had no luck; I guess few people do this.

AFAIK, you can't do that. I don't think there's even a way to write a kernel-mode "driver" for it, but I haven't looked into these things for a while, so maybe there is some way now.
In any case, you might be better off using a library you can change/update, such as libiconv.
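To illustrate the "do it in a library you control" idea, here is a minimal sketch of a custom single-byte character set defined entirely in user space. It uses Python's standard `codecs` charmap helpers rather than libiconv, and the table contents (Latin-1 with byte 0x80 remapped to the euro sign) and the function names are assumptions chosen only for the example. It does not change what Windows itself does for non-Unicode applications.

```python
import codecs

# Hypothetical custom table: start from Latin-1 (byte n -> U+00nn)
# and override one slot. A real table would list all 256 entries.
_table = [chr(i) for i in range(256)]
_table[0x80] = "\u20ac"  # byte 0x80 now means the euro sign
DECODING_TABLE = "".join(_table)

# Build the reverse (Unicode -> byte) map from the decoding table.
ENCODING_TABLE = codecs.charmap_build(DECODING_TABLE)


def decode_custom(data: bytes) -> str:
    """Decode bytes using the custom single-byte table."""
    return codecs.charmap_decode(data, "strict", DECODING_TABLE)[0]


def encode_custom(text: str) -> bytes:
    """Encode text using the custom single-byte table."""
    return codecs.charmap_encode(text, "strict", ENCODING_TABLE)[0]


if __name__ == "__main__":
    raw = encode_custom("10 \u20ac")
    print(raw)
    print(decode_custom(raw))
```

This only helps when you control the code that converts between bytes and Unicode, which is the point of the suggestion above.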
UPDATE:
Since you don't have the source code, you're in a very unfortunate position.
For all string resources (in the EXE or any DLLs or, though unlikely, in some other files), you can "read them out", figure out which code page they use, and change it (and the strings themselves), tweaking it in a way that achieves your purpose: having the right glyphs appear. (Yes, you might see different glyphs in Notepad, but who cares, as long as your application shows the right ones. FWIW, for such hacks it's best to use a hex editor.) Then, of course, "put" the changed resources back into the EXE/DLL. But it's quite possible that not all strings are in resources, and that's when the "real" problems start.
There's any number of hacks that could have been done here. Your best option is to use a good debugger (WinDbg or better) and figure out what's going on and how character sets are handled. Since you don't have the source code, it's going to be quite painful. You want to find out:
Are the default charsets used (OEM/ANSI), or some specific one (via NLS APIs)?
Whatever charset is used, is it a standard one or not? The charset here is the "code" Windows assigns to it. Look at the Windows lists of available charsets.
Is the application installing fonts? If it is, use a font tool to examine them; maybe a font supports a specific (non-standard?) code page.
Is the application installing any drivers? If it is, the only way to gain more insight is to use a kernel debugger (which is very tricky and annoying but, as already said, you're in an unfortunate situation).

It appears that those tables are located at C:\Windows\system32\*.nls. I'm not sure whether there's proper documentation for their structure. There's some information in Russian here. You might also want to tinker with the registry at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls

Related

Windows: redirect ReadFile to run process and pipe its stdout

I was wondering how hard it would be to create a set-up under Windows where a regular ReadFile on certain files is redirected by the file system to actually run (e.g. ShellExecute) those files, with the new process' stdout then used as the file content streamed back to the caller of ReadFile...
What I envision the set-up to look like, is that you can configure it to denote a certain folder as 'special', and that this extra functionality is then only available on that folder's content (so it doesn't need to be disk-wide). It might be accessible under a new drive letter, or a path parallel to the source folder; the location it is hooked up to is irrelevant to me.
To those of you that wonder if this is a classic xy problem: it might very well be ;) It's just that this idea has intrigued me, and I want to know what possibilities there are. In my particular case I want to employ it to #include content in my C++ code base, where the actual content included is being made up on the spot, different on each compile round. I could of course also create a script to create such content to include, call it as a pre-build step and leave it at that, but why choose the easy route.
Maybe there are already ready-made solutions for this? I did an extensive Google search for it, but came up empty-handed. But then I'm not sure I know all the keywords needed to do a good search...
When coding up something myself, I think a minifilter driver might be needed intercepting ReadFile calls, but then it must at that spot run usermode apps from kernel space - not a happy marriage I assume. Or use an existing file system driver framework that allows for usermode parts, but I found the price of existing solutions to be too steep for my taste (several thousand dollars).
And I also assume that a standard file system (minifilter) driver might be required to return a consistent file size for such files, although the actual data size returned through ReadFile would of course differ on each call. Not to mention negating any buffering that takes place.
All in all I think that a create-it-yourself solution will take quite some effort, especially when you have never done Windows driver development in your life :) Although I see myself quite capable of learning up on it, the time invested will be prohibitive I think.
Another approach might be to hook ReadFile calls from the process doing the ReadFile, via IAT hooking or via code injection. But I want this solution to work more 'out of the box', i.e. all ReadFile requests for these special files trigger the correct behavior, regardless of origin. In my case I'd need to intercept the behavior of my C++ compiler (G++), but that is invoked on the fly by the IDE, so I see no easy way to detect its startup and hook it quickly before it does its ReadFiles. And besides, I only want certain files to be special in this regard; intercepting all ReadFiles for a certain process is overkill.

You want something like FUSE (which I have used with profit many times), but for Windows. Apparently there's Dokan; I've never used it, but it seems to be well known enough (and, at the very least, it can serve as inspiration to see "how it's done").
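As a rough sketch of what the read side of such a user-mode file system would have to do, the following runs a command once when the "file" is opened and serves every later read from the captured stdout. Caching the output per handle is one way to get the consistent file size the question worries about. `GeneratedFile` and its methods are hypothetical names for illustration only; they are not part of FUSE or Dokan, and wiring this into either framework's callbacks is left out.

```python
import subprocess
import sys


class GeneratedFile:
    """Hypothetical handle whose content is a process's stdout."""

    def __init__(self, command):
        # Run the generator once at open time and keep its output,
        # so the reported size and all later reads stay consistent.
        result = subprocess.run(command, capture_output=True, check=True)
        self._data = result.stdout

    @property
    def size(self):
        """Size a file system would report for this open handle."""
        return len(self._data)

    def read(self, offset, length):
        """Return up to `length` bytes starting at `offset`."""
        return self._data[offset:offset + length]


if __name__ == "__main__":
    # Stand-in generator: a child Python process printing a #define.
    handle = GeneratedFile(
        [sys.executable, "-c", "print('#define BUILD_ID 42')"]
    )
    print(handle.size, handle.read(0, handle.size))
```

Each new open would create a new `GeneratedFile`, so the content can differ per compile round while staying stable within a single open.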

How do I reverse-engineer the "import file" feature of an abandoned pascal application?

first question I've asked and I'm not sure how to ask it clearly, or if there will be an answer that I want to hear ;)
tl;dr: "I want to import a file into my application at work but I don't know the input format. How can I discover it?"
Forgive any pending wordiness and/or redaction.
In my work I depend on an unsupported (and proprietary) application written in Pascal. I have no experience with pascal (yet...) and naturally have no source code access. It is an excellent (and very secret/NDA sort of deal I think) application that allows us to deal with inventory and financial issues in my employer's organization. It is quite feature-comprehensive, reasonably stable and robust, and kind of foistered (word?) on us by a higher power.
One excellent feature that it has is the ability to load up "schedules" into our corporate system. This feature should be saving us hundreds of hours in data entry.
But it isn't.
The problem is, the schedules we receive are written in a legacy format intended for human eyes. The "new" system can't interpret them.
Our current information (which I have to read and then re-enter into the database by hand) is sent in a sort of rich-text flat-file format, which would be easy to parse with the string library of probably any mainstream language.
So I want to write a converter to convert our data into a format that the new software can interpret.
By feeding certain assorted files into the system, I have learned a little bit about what kind of file it expects:
1. I "import" a zero-byte file. Nothing happens (same as printing a report with no data).
2. I "import" an XML file that I guess might look like what the system expects. It responds with an exception dialog and a stack trace. Apparently the string <?xml contains illegal characters or something.
3. I "import" a JPEG image: similar result to #2.
So I think that my target wants a flat-file itself. The file would need to contain a "document number" along with {entries with "incident IDs" and descriptions and numeric values}.
But I don't know this for certain.
Nobody is able to tell me exactly what these files should look like. Someone in the know said that they have seen the feature demonstrated -- somewhere out there is a utility that creates my importable schedules. But for now, the utility is lost and I am on my own.
What methods can I use to figure out the input file format? I know nothing about debugging pascal, but I assume that that is probably my best bet. Or do I have to keep on with brute force until I can afford a million monkey-operated typewriters? Do I have to decompile the target application? I don't know if I can get away with that, let alone read the decompiled source.
My google-fu has failed me.
Has anyone done something like this before or could they point me in the right direction? Are there any guides on this subject?
Thanks in advance.
PS: I am certain that I am not breaking any laws at this point, although I will have to check to find out if decompilation would get me into trouble or not, and that might be outside of my technical competence anyway.

If you have an example file, you can take a hexdump utility and see if there are things you can identify. Any additional info that you have (what should be in the file) helps with that. Better still, if you know a program that can edit the file, you can use that editor to make minimal changes and then compare the file before and after.
IOW standard tricks of binary file format reverse engineering.
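The two tricks above (dump the bytes, then compare a file before and after a minimal change) can be sketched in a few lines. This is only an illustration: the function names are made up for the example, and any hexdump or binary diff tool does the same job.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex values, and printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3}} {text_part}")
    return "\n".join(lines)


def diff_offsets(before: bytes, after: bytes) -> list[int]:
    """Return the offsets at which two versions of a file differ."""
    common = min(len(before), len(after))
    changed = [i for i in range(common) if before[i] != after[i]]
    # Bytes present in only one version count as differences too.
    changed.extend(range(common, max(len(before), len(after))))
    return changed


if __name__ == "__main__":
    old = b"DOC0001\x00\x05item"
    new = b"DOC0002\x00\x05item"
    print(hexdump(new))
    print(diff_offsets(old, new))
```

Changing one field at a time in the editor and looking at which offsets move is usually the quickest way to map fields to byte positions.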
...If you have no existing files whatsoever, then reverse engineering the binary is your only option, and that is not pretty. Decompilation of native binaries is a black art that requires considerable time and skill. Read the various decompilation FAQs on the net.
First and foremost, I would try to contact the authors of the program. Getting the source code is options 1, 2 and 3, and you only go with other options if there is really, really, really no hope whatsoever of obtaining the source or getting normal support.

Colorized output in the swipl-win (SWI-Prolog) window

What I'm Doing
I am currently working on creating a SWI-Prolog module that adds tab-completion capability to the swipl-win window. So far I've actually gotten it to where it reads a single character at a time without stopping/returning anything until a tab character is typed. I have also already written a predicate that returns all possible completions of an incompletely typed term by using substring-matching on a list of current terms (obtained via current_functor/2, current_arithmetic_function/1, current_predicate/2, etc [the predicate used will eventually be based off of context]).
If you want to see my code, it is here...just keep in mind that I'm not exactly a Prolog master yet (friendly tips are more than welcome).
What I'm Thinking
I've long abandoned any efforts at using XPCE to do popup-dropdown-completion in the swipl-win window (I'll eventually try to get that into Pce-Emacs [it won't be as polished as Visual Studio --picture something more like Python's IDLE], but I don't know if that's really even practical since I'm starting to use actual Emacs a lot more nowadays anyway), but is there any way to modify the output color in the swipl-win window? I know syntax highlighting has already been implemented in other Prolog systems' command-prompt windows, but I really just want to have it so that when tabber.pl suggests a completion, it also shows the arity (and perhaps the rest of the partially-typed term) of the suggested term in light gray. I know there is already color output from the system (like when it starts up), but I don't know how to hook into output stuff to control it myself. (Obviously, I'd probably define print/1 but...)
I know I could write my own SWI-Prolog console like one guy has done with C#, but I really wanted it so people (including me) could just load the tabber module somewhere in their config file and continue to use the swipl-win window, rather than having it be a completely different executable... Would I have to use some kind of C API?
Note:
The actual implementation will likely be influenced by the answers that I get to this question, because I'm going to base my decision on the use of strings and/or atoms in this project off of them.
What I'm Asking
Is there a way or something (even if it's really low-level) I can implement to colorize output in the swipl-win window?

AFAIU, the problem you have to deal with is avoiding calling the fontify function every second, as common buffers do. I.e., call it only once, when output has arrived, and restrict fontification to the previous prompt in the buffer.

Html entities in file names: Possible mine traps?

When I thought about resizing images and saving the new sizes parallel on the server, I came to the following question:
// Original size
DSC_18342.jpg
// New size: Use an "x" for "times"
DSC_18342_640x480px.jpg
// New size: Use the real "×" for "times"
DSC_18342_640×480px.jpg
The point is that the name is slightly easier to read with a real × instead of an x, because the unit px already contains an x.
Question: What problems could I run into when using the HTML entity in the file name?
Sidenote: I'm writing an open source, publicly available script, so the target server can be anything. I'm therefore also interested in (and will vote up) edge cases that I'm not aware of.
Thank you all!
You may have noticed that I'm aware I could simply avoid it (which I'll do anyway), but I'm interested in this issue and in learning about it, so please just take the above example as one possible case.

There are file systems that simply don't support Unicode. This may be less of a problem if you make Unicode support a requirement of your application.
Some considerations about different Unicode file systems are given in File Systems, Unicode, and Normalization.
A concluding remark (from the viewpoint of Solaris file systems) is:
Complete compatibility and seamless interoperability with
all other existing Unicode file systems appears not 100%
possible due to inherent differences.
I can imagine that there will be problems especially when migrating the application. Just storing files is probably no problem but if their names are stored in a database there might be a mismatch after migration.
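A small sketch of the kinds of mismatch described above, using only the Python standard library. The helper names are made up for the example. It shows that × cannot be stored in an ASCII-only name, that its byte form depends on the encoding, and that names which look identical can differ unless both sides are normalized the same way (é is used for that part, as an assumption, because × itself has no decomposed form).

```python
import html
import unicodedata

NAME = "DSC_18342_640\u00d7480px.jpg"  # uses the real multiplication sign


def encodable(text: str, encoding: str) -> bool:
    """Report whether the name can be represented in an encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def same_name(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def ascii_fallback(text: str) -> str:
    """Replace the multiplication sign with a plain x."""
    return text.replace("\u00d7", "x")


if __name__ == "__main__":
    print(html.unescape("&times;") == "\u00d7")        # entity vs. character
    print(encodable(NAME, "ascii"), encodable(NAME, "utf-8"))
    print("\u00d7".encode("utf-8"), "\u00d7".encode("cp1252"))
    print("\u00e9" == "e\u0301", same_name("\u00e9", "e\u0301"))
    print(ascii_fallback(NAME))
```

If names are also stored in a database, applying the same normalization on both the file system side and the database side before comparing avoids the post-migration mismatch mentioned above.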

Most suitable language for cheque/check printing on Windows Platform

I need to create a simple module/executable that can print checks (fill out the details). The details need to be retrieved from an existing Oracle 9i DB on Windows (XP or later).
Obviously, I will need to define the layout, i.e. where on the page the details (name, amount, etc.) are to be filled in.
The major constraint is that the client needs / strongly prefers an executable, not code that is either interpreted or uses a VM. This is so that installation is extremely simple. This requirement really cannot be changed.
Now, the question is: how do I do it?
(.NET, Java and Python are out of the question, unless there is a way around the VMs.)
I have never worked with MFC or other native Windows APIs. I am also unfamiliar with GDI.
Do I have any other option? Any language that can abstract the complexities and can be packed into an x86 binary?
Also, if not then any code help with GDI would be appreciated.

The most obvious possibilities would probably be C, C++, and Delphi. There are a few others such as Ada (e.g., Gnat), but offhand I don't see a lot of reason to favor them (especially for a job this small).
At least the way I'd write this, the language would be almost irrelevant. I'd have it run almost entirely by an external configuration file that gave the name of each field, and the location where it should be printed. I'd probably use something like MM_LOMETRIC mapping mode, so Windows will handle most of the translation to real-world coordinates (and use tenths of a millimeter in the configuration file, so you can use the coordinates without any translation).
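A minimal sketch of the configuration-driven layout, written in Python purely for illustration (the actual program would be in a natively compiled language, per the constraint above). The INI format, the field names, and the helper names are all assumptions. Coordinates are in tenths of a millimeter measured from the top-left corner. Since MM_LOMETRIC has y increasing upward, the distance down the page is negated before drawing.

```python
import configparser

# Hypothetical layout file: field name = x, y in tenths of a millimeter.
SAMPLE_CONFIG = """
[fields]
payee = 200, 350
amount = 1500, 350
date = 1500, 150
"""


def load_layout(text):
    """Parse the layout into {field name: (x, y)} in 0.1 mm units."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    layout = {}
    for name, value in parser["fields"].items():
        x, y = (int(part) for part in value.split(","))
        layout[name] = (x, y)
    return layout


def to_lometric(x, y):
    """Convert top-left-based coordinates to MM_LOMETRIC logical units."""
    return (x, -y)  # y grows upward in MM_LOMETRIC


def to_pixels(tenths_mm, dpi):
    """Manual conversion, for reference: 254 tenths of a mm per inch."""
    return round(tenths_mm * dpi / 254)


if __name__ == "__main__":
    for field, (x, y) in load_layout(SAMPLE_CONFIG).items():
        print(field, to_lometric(x, y), to_pixels(x, 300), to_pixels(y, 300))
```

With the mapping mode set, the manual pixel conversion is not needed; it is shown only to make the units concrete.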
Probably the more difficult part of this would/will be the database connectivity. There are various libraries around to help out with that, so this won't be terribly difficult, but it's still not (quite) as trivial as the drawing part.
