Opening an IStorage from an IStream - winapi

I'm implementing a property handler for a structured storage file and would like to initialize it with IInitializeWithStream given its benefits of isolation, handling slow transfer, etc. But I see no obvious way to open an IStorage from an IStream. I don't want to load the whole file into global memory, and the documentation doesn't state whether ILockBytes is necessarily implemented on the IStream passed to Initialize.
Implementing IInitializeWithFile would be easy, but then the handler cannot be isolated.
Any thoughts on how I might be able to get an IStorage from this IStream?
Note that I do not own this file type, but having these extra properties exposed would be helpful for my everyday work.

Providing a basic filesystem from a char driver

I have an existing Linux device driver that exposes a basic char device to userland. (I am not its original author, but I'm trying to modify it.)
Currently it provides a maze of ioctl functions to do various things (though also wrapped in a handy library so most user code doesn't need to deal with the details of it).
One of the things that it does is to provide a sub-stream interface, where given a bunch of device-specific identifying information (including a string and some numeric ids) it can read or write (but not both at once) some data (up to a small number of MB) in a strictly sequential manner. Currently it does this with explicit ioctls.
I'm wondering if there is a way to leverage the existing file_operations infrastructure or similar to provide either a virtual filesystem or just an ioctl that can return a new already-open fd that can then be used with read/write/close (but not lseek) from userland as you'd normally expect?
The device does have a concept of a filename (that's the string) but it is not possible to enumerate existing valid filenames (only to try to open a specific filename and see if it gives an error or not), and the filename is not sufficient to open a stream by itself, which is why I'm currently leaning more towards the "special open" ioctl on the parent device rather than trying to expose things directly in some userland-visible fs that can be opened directly. (Also there's no concept of subdirs and only basic write-protect permissions, so a full fs seems like overkill anyway.) But I'm willing to be persuaded otherwise if there's a better way to do it.
I have written basic char drivers from scratch myself before, so I'm reasonably confident that I can get the read/write ops and other supporting things to work; I'm just not sure how to best handle that initial step of opening the handle.
I'm currently targeting kernel 3.2+.
Edit: The main reason that I think making an actual filesystem (or trying to expose it via procfs or sysfs) wouldn't work is that there's no way to populate a directory -- the only ops available are "open for read" and "open for write", and there's no way to tell which names are valid prior to the open attempt (the files are stored in external hardware and accessed via a protocol I cannot change). If I'm missing something and it is possible to support this sort of thing, that would be useful to know as well.
You can most certainly create a file system where readdir() is not implemented, but the open() method is. It's normally not done because it's not particularly user-friendly, but it certainly is doable.
You're targeting really ancient kernels if you're looking at 3.2 -- the upstream kernel developers don't even bother trying to backport security fixes that far back, so I certainly wouldn't recommend shipping something as ancient as 3.2, but it's technically doable.
All you need to do is implement the lookup() method in the inode_operations structure for directories. You'll need to figure out some way of creating inodes with unique inode numbers that contain private information so you can identify the substream. The inode will have a file_operations structure that implements the read/write methods for reading and writing the substream.
You can try looking at a simple file system such as cramfs or minix to see how things are done.

Best way to cache an NSArray of text/dictionaries and have it useable across the entire app?

I am making a request for an array of perhaps 10-100 objects, all of which are JSON objects that I parse into NSDictionary objects. I want to cache this and use this data across the entire application. Is NSCache useful for this, or is it better to use NSUserDefaults? What is actually the most accepted way of persisting data across an entire app? CoreData? I'm an iOS newb and don't have too much experience in this.
What you are looking for is a way to access data across your app. This is typically the role a Model plays in MVC.
CoreData and NSUserDefaults are ways to save data so it is not lost when your app closes or is quit. They can be parts of a Model, but do not help in having that data be accessible throughout your app.
If you want an object that stores data and can be accessed anywhere in your code, you are probably looking for a Singleton.
As this excellent Stack Overflow answer explains:
Use a singleton class, I use them all the time for global data manager classes that need to be accessible from anywhere inside the application.
The author provides some sample code you might find helpful.
This would allow you to create a simple object accessible throughout your program that has your NSDictionaries. Because it is a singleton, other classes in your program can easily access it - meaning they can also easily access the NSDictionaries you've stored in it.
If you do decide you want to save data, that singleton object would also be an ideal location to write any load and save code.
Good luck!
Other good resources are:
Wikipedia's Entry on Singletons
What Should My Objective C Singleton Look Like?
Singletons and ARC/GCD

Extending functionality of existing program I don't have source for

I'm working on a third-party program that aggregates data from a bunch of different, existing Windows programs. Each program has a mechanism for exporting the data via the GUI. The most brain-dead approach would have me generate extracts by using AutoIt or some other GUI manipulation program to generate the extractions via the GUI. The problem with this is that people might be interacting with the computer when, suddenly, some automated program takes over. That's no good. What I really want to do is somehow have a program run once a day and silently (i.e. without popping up any GUIs) export the data from each program.
My research is telling me that I need to hook each application (assume these applications are always running) and inject a custom DLL to trigger each export. Am I remotely close to being on the right track? I'm a fairly experienced software dev, but I don't know a whole lot about reverse engineering or hooking. Any advice or direction would be greatly appreciated.
Edit: I'm trying to manage the availability of a certain type of professional. Their schedules are stored in proprietary systems. With their permission, I want to install an app on their system that extracts their schedule from whichever system they are using and uploads the information to a central server so that I can present that information to potential clients.
I am aware of four ways of extracting the information you want, each with its own advantages and disadvantages. Before you do anything, you need to be aware that any solution you create is not guaranteed to keep working, and is in fact very unlikely to, should the target application ever be updated. The reason is that in each case, you are relying on an implementation detail instead of a pre-defined interface through which to export your data.
Hooking the GUI
The first way is to hook the GUI as you have suggested. What you are doing in this case is simply reading off from what an actual user would see. This is in general easier, since you are hooking the WinAPI which is clearly defined. One danger is that what the program displays is inconsistent or incomplete in comparison to the internal data it is supposed to be representing.
Typically, there are two common ways to perform WinAPI hooking:
DLL Injection. You create a DLL which you load into the other program's virtual address space. This means that you have read/write access (writable access can be gained with VirtualProtect) to the target's entire memory. From here you can trampoline the functions which are called to set UI information. For example, to check if a window has changed its text, you might trampoline the SetWindowText function. Note every control has different interfaces used to set what they are displaying. In this case, you are hooking the functions called by the code to set the display.
SetWindowsHookEx. Under the covers, this works similarly to DLL injection and in this case is really just another method for you to extend/subvert the control flow of messages received by controls. What you want to do in this case is hook the window procedures of each child control. For example, when an item is added to a ComboBox, it would receive a CB_ADDSTRING message. In this case, you are hooking the messages that are received when the display changes.
One caveat with this approach is that it will only work if the target is using or extending WinAPI controls.
Reading from the GUI
Instead of hooking the GUI, you can alternatively use WinAPI to read directly from the target windows. However, in some cases this may not be allowed. There is not much to do in this case but to try and see if it works. This may in fact be the easiest approach. Typically, you will send messages such as WM_GETTEXT to query the target window for what it is currently displaying. To do this, you will need to obtain the exact window hierarchy containing the control you are interested in. For example, say you want to read an edit control, you will need to see what parent window/s are above it in the window hierarchy in order to obtain its window handle.
Reading from memory (Advanced)
This approach is by far the most complicated but if you are able to fully reverse engineer the target program, it is the most likely to get you consistent data. This approach works by you reading the memory from the target process. This technique is very commonly used in game hacking to add 'functionality' and to observe the internal state of the game.
Consider that as well as storing information in the GUI, programs often hold their own internal model of all the data. This is especially true when the controls used are virtual and simply query subsets of the data to be displayed. This is an example of a situation where the first two approaches would not be of much use. This data is often held in some sort of abstract data type such as a list or perhaps even an array. The trick is to find this list in memory and read the values off directly. This can be done externally with ReadProcessMemory or internally through DLL injection again. The difficulty lies mainly in two prerequisites:
Firstly, you must be able to reliably locate these data structures. The problem with this is that code is not guaranteed to be in the same place, especially with features such as ASLR. Colloquially, this is sometimes referred to as code-shifting. ASLR can be defeated by using the offset from a module base and dynamically getting the module base address with functions such as GetModuleHandle. As well as ASLR, a reason that this occurs is due to dynamic memory allocation (e.g. through malloc). In such cases, you will need to find a heap address storing the pointer (which would for example be the return of malloc), dereference that and find your list. That pointer would be prone to ASLR and instead of a pointer, it might be a double-pointer, triple-pointer, etc.
The second problem you face is that it would be rare for each list item to be a primitive type. For example, instead of a list of character arrays (strings), it is likely that you will be faced with a list of objects. You would need to further reverse engineer each object type and understand its internal layout (at the very least, determine the offsets from the object base of the primitive values you are interested in). More advanced methods revolve around actually reverse engineering the vtable of objects and calling their 'API'.
You might notice that I am not able to give information here which is specific. The reason is that by its nature, using this method requires an intimate understanding of the target's internals and as such, the specifics are defined only by how the target has been programmed. Unless you have knowledge and experience of reverse engineering, it is unlikely you would want to go down this route.
Hooking the target's internal API (Advanced)
This is similar to the above solution, but instead of digging for data structures, you dig for the internal API. I briefly covered this when discussing vtables earlier. Here, you would be attempting to find internal APIs that are called when the GUI is modified. Typically, when a view/UI is modified, instead of directly calling the WinAPI to update it, a program will have its own wrapper function which in turn calls the WinAPI. You simply need to find this function and hook it. Again this is possible, but requires reverse engineering skills. You may find that you discover functions which you want to call yourself. In this case, as well as being able to locate the function, you have to reverse engineer the parameters it takes and its calling convention, and you will need to ensure that calling the function has no side effects.
I would consider this approach to be advanced. It can certainly be done and is another common technique used in game hacking to observe internal states and to manipulate a target's behaviour, but is difficult!
The first two methods are well suited for reading data from WinAPI programs and are by far the easier ones. The latter two methods allow greater flexibility. With enough work, you are able to read anything and everything encapsulated by the target, but this requires a lot of skill.
Another point of concern which may or may not relate to your case is how easy it will be to update your solution to work should the target ever be updated. With the first two methods, it is more likely that no changes or only small changes have to be made. With the latter two methods, even a small change in source code can cause a relocation of the offsets you are relying upon. One method of dealing with this is to use byte signatures to dynamically generate the offsets. I wrote another answer some time ago which addresses how this is done.
What I have written is only a brief summary of the various techniques that can be used for what you want to achieve. I may have missed approaches, but these are the most common ones I know of and have experience with. Since these are large topics in themselves, I would advise you ask a new question if you want to obtain more detail about any particular one. Note that in all of the approaches I have discussed, none of them suffer from any interaction which is visible to the outside world so you would have no problem with anything popping up. It would be, as you describe, 'silent'.
This is relevant information about detouring/trampolining which I have lifted from a previous answer I wrote:
If you are looking for ways that programs detour execution of other processes, it is usually through one of two means:
Dynamic (Runtime) Detouring - This is the more common method and is what is used by libraries such as Microsoft Detours. Here is a relevant paper where the first few bytes of a function are overwritten to unconditionally branch to the instrumentation.
(Static) Binary Rewriting - This is a much less common method for rootkits, but is used by research projects. It allows detouring to be performed by statically analysing and overwriting a binary. An old (not publicly available) package for Windows that performs this is Etch. This paper gives a high-level view of how it works conceptually.
Although Detours demonstrates one method of dynamic detouring, there are countless methods used in the industry, especially in the reverse engineering and hacking arenas. These include the IAT and breakpoint methods I mentioned above. To 'point you in the right direction' for these, you should look at 'research' performed in the fields of research projects and reverse engineering.

How to store preferences for an application?

I am a newbie in Ruby coming from web development with mainly PHP/SQL. I was thinking about how I store preferences in my application. For instance, if I want to store a path as default_path and have that set also when the user restarts the application.
In the web world one would probably store this in a database or XML. A database seems overkill for a standalone application. But I am unsure whether XML/YAML/Other-Write-Format is the way to go. And if so, where should I store these preferences? Should they be, for instance on a Mac, in ~/Library/MyAppName?
I like using YAML because it's very easily read/written by a lot of languages, making it possible for several apps to share the same configuration info. It's a well documented standard so there should be very little chance of data falling into a hole with it.
Also, because it's easy for a human to understand, and doesn't take any special tools to change, it works nicely for any data that might occasionally change in an app, either for fine-tuning or to enable special behaviors.
A little creative coding on your part that periodically checks the last modified time of the YAML file could make it so your app would modify its behavior on the fly as the prefs file is tweaked. I had a big app I didn't want to shut down for changes and set up that behavior. It ran three weeks straight, and I tweaked its operating parameters via its config file. It would read the file every minute and inherit any changes to its parameters on the fly.
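The reload-on-change idea can be sketched roughly as follows. This is only a minimal illustration, not the code from the app described above: the YamlPrefs class, its method names, and the "default_path" key are all hypothetical, and it assumes the preferences are a flat hash with string keys.

```ruby
require "yaml"

# Hypothetical helper (not from the original answer): keeps preferences in a
# YAML file and re-reads the file only when its modification time changes.
class YamlPrefs
  def initialize(path, defaults = {})
    @path = path
    @defaults = defaults
    @data = defaults.dup
    @loaded_mtime = nil
    reload_if_changed
  end

  # Look up a preference by its string key.
  def [](key)
    @data[key]
  end

  # Set a preference and write the whole hash back to disk.
  def []=(key, value)
    @data[key] = value
    File.write(@path, YAML.dump(@data))
    @loaded_mtime = File.mtime(@path)
  end

  # Returns true if the file was (re)read, false if nothing changed.
  def reload_if_changed
    return false unless File.exist?(@path)

    mtime = File.mtime(@path)
    return false if mtime == @loaded_mtime

    loaded = YAML.safe_load(File.read(@path)) || {}
    @data = @defaults.merge(loaded)
    @loaded_mtime = mtime
    true
  end
end
```

A long-running app could call reload_if_changed from a timer or at the top of its main loop, for example once a minute, and pick up edits made to the file by hand.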
Databases are a good way to store parameters/preferences if it's a centralized server or web-based app. For something distributed that runs on individual machines it makes no sense.
Ruby gives you another method for storing data called Marshaling. This will let you store a class/object to a file and reconstitute it later. If all of your user preferences are stored in a single object (or you can create an object which can hold all of the data that you need), it may be easiest to marshal the data instead of writing import/export routines to a text-based format or trying to pull in an additional library or gem.
As to where on the disk to store the data, that's up to you. Most platforms have a standard location for storing application data based on whether it's available to a single user or all users. It's usually safest to follow the common practice on your target platform of choice.
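As a rough sketch of picking a per-user location, something like the following could work. The prefs_dir helper and its parameters are hypothetical, and the directories chosen are common conventions assumed for illustration (Application Support on macOS, %APPDATA% on Windows, XDG_CONFIG_HOME or ~/.config elsewhere), so check the guidelines for your target platform before relying on them.

```ruby
require "rbconfig"

# Hypothetical helper: guess a per-user preferences directory for app_name.
# The host_os, env and home parameters exist only so the choice can be
# overridden; by default they come from the running Ruby process.
def prefs_dir(app_name, host_os: RbConfig::CONFIG["host_os"], env: ENV, home: Dir.home)
  case host_os
  when /darwin/
    # Assumed macOS convention
    File.join(home, "Library", "Application Support", app_name)
  when /mswin|mingw|cygwin/
    # Assumed Windows convention, falling back to the home directory
    File.join(env["APPDATA"] || home, app_name)
  else
    # Assumed XDG-style convention on Linux and other Unix-like systems
    File.join(env["XDG_CONFIG_HOME"] || File.join(home, ".config"), app_name)
  end
end
```

The returned directory may not exist yet; an app would typically create it (for example with FileUtils.mkdir_p) before writing its preferences file there.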
Update: The simplest example of marshaling would probably be this: Say that you have a class called UserPrefs that you use to store all of your user preferences. You can use the following code to store the preferences data into a file:
my_prefs = UserPrefs.new
# ... Fill in the 'my_prefs' object with the user's preferences, etc ...
# Store the object into a file
File.open("user_prefs.data", "wb") do |file|
  Marshal.dump(my_prefs, file)
end
The next time that you load the application, you can restore those preferences using the following:
# Load prefs from file
my_prefs = nil
File.open("user_prefs.data", "rb") {|f| my_prefs = Marshal.load(f)}
At this point, the my_prefs object should be exactly the same as it was when the marshaling code was originally run. This essentially lets you take a 'snapshot' of an object at one point in time (say, when your program shuts down) and restore it later (say, when your program loads). Internally, all of the data in the structure is encoded into a single string and that string is what is stored to disk; the Marshal module simply takes care of the encoding and decoding for you.
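To see the "single string" encoding directly, Marshal.dump can be called without a file argument, in which case it returns the encoded data as a String. The hash below is just made-up sample data standing in for a preferences object.

```ruby
# Made-up sample data standing in for a preferences object
prefs = { "default_path" => "/home/me", "window_size" => [800, 600] }

# With no IO argument, Marshal.dump returns the encoded binary String
encoded = Marshal.dump(prefs)

# Marshal.load accepts that String and rebuilds an equal object
decoded = Marshal.load(encoded)
```

Only load marshaled data from files your own application wrote, since Marshal.load can instantiate arbitrary objects from its input.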
Here is another example of using marshaling to store and retrieve data.
The default encode/decode routines built into the Marshal module are usually sufficient for most data-storing classes. Particularly complex classes may have problems, and if that is the case then you can define your own encode and decode methods (the first link includes an example of defining custom methods).
Some types of data, however, cannot be marshaled (things like handles to open files, Proc objects, etc.) since they don't normally persist across Ruby sessions. If you need to marshal a class that includes members like this that Marshal doesn't like, you can use custom encode/decode functions to marshal the rest of the class and omit the problematic members.
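A minimal sketch of that approach, using Ruby's marshal_dump and marshal_load hooks, might look like this. The layout of the UserPrefs class here (its attributes and the open_log method) is invented for illustration; only the two hook method names come from Ruby itself.

```ruby
# Hypothetical preferences class with one member Marshal cannot serialize
# (an open log file) alongside ordinary data.
class UserPrefs
  attr_accessor :default_path, :recent_files
  attr_reader :log

  def initialize(default_path)
    @default_path = default_path
    @recent_files = []
    @log = nil
  end

  def open_log(path)
    @log = File.open(path, "a")
  end

  # Called by Marshal.dump: return only the data that can be serialized.
  def marshal_dump
    { "default_path" => @default_path, "recent_files" => @recent_files }
  end

  # Called by Marshal.load with whatever marshal_dump returned.
  def marshal_load(data)
    @default_path = data["default_path"]
    @recent_files = data["recent_files"]
    @log = nil # the file handle is not restored; reopen it if needed
  end
end
```

Without these two methods, dumping an instance that holds an open File would raise a TypeError; with them, the handle is simply left out and the rest of the object round-trips.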
I saw some applications using ruby gconf2

What can I access from a BackgroundWorker without "Cross Threading"?

I realise that I can't access Form controls from the DoWork event handler of a BackgroundWorker. (And if I try to, I get an Exception, as expected).
However, am I allowed to access other (custom) objects that exist on my Form?
For instance, I've created a "Settings" class and instantiated it in my Form and I seem to be able to read and write to its properties.
Is it just luck that this works?
What if I had a static class? Would I be able to access that safely?
@Engram:
You've got the gist of it - the cross-thread call check is just a nice feature MS put into the .NET Framework to prevent the "bonehead" type of parallel programming mistakes. It can be overridden, as I'm guessing you've already found out, by setting the static Control.CheckForIllegalCrossThreadCalls property to false (it is set on the class and not on an instance of the class, e.g. Control.CheckForIllegalCrossThreadCalls and not lblMyLabel.CheckForIllegalCrossThreadCalls).
But more importantly, you're right about the need to use some kind of locking mechanism. Whenever you have multiple threads of execution (be it threads, processes or whatever), you need to make sure that while one thread is reading or writing a variable, no other thread barges in and changes that value under the feet of the first thread.
The .NET Framework actually provides several other mechanisms which might be more useful, depending on circumstances, than locking in code. The first is to use a Monitor class, which has the effect of locking a particular object. When you use this, other threads can continue to execute, as long as they don't try to lock that same object. Another very useful and common parallel-programming idea is the Mutex (or Semaphore). The Mutex is basically like a game of Capture the Flag between your threads. If one thread grabs the flag, no other threads can grab it until the first thread drops it. (A Semaphore is just like a Mutex, except that there can be more than one flag in a game.)
Obviously, none of these concepts will work in every particular problem - but having a few more tools to help you out might come in handy some day :)
You should communicate to the user interface through the ProgressChanged and RunWorkerCompleted events (and never the DoWork() method as you have noted).
In principle, you could check InvokeRequired and call Invoke, but the designers of the BackgroundWorker class created the ProgressChanged callback event for the purpose of updating UI elements.
[Note: BackgroundWorker events are not marshaled across AppDomain boundaries. Do not use a BackgroundWorker component to perform multithreaded operations in more than one AppDomain.]
MSDN Ref.
Ok, I've done some more research on this and I think have an answer. (Let the votes decide if I'm right!)
The answer is.. you can access any custom object that's in scope, however your access will not be thread-safe.
To ensure that it is thread-safe you should probably be using lock. The lock keyword prevents more than one thread executing a particular piece of code. (Subject to actually using it properly!)
The Cross Threading Exception that occurs when you try to access a Control is a safety mechanism designed especially for Controls. (It's easier and probably more efficient to get the user to make thread-safe calls than it is to design the controls themselves to be thread-safe.)
You can't access controls that were created in one thread from another thread.
You can either use the Settings class that you mentioned, or use the InvokeRequired property and Invoke method of the control.
I suggest you look at the examples on those pages:
http://msdn.microsoft.com/en-us/library/ms171728.aspx
http://msdn.microsoft.com/en-us/library/system.windows.forms.control.invokerequired.aspx

Resources