How can I pass Selenium WebDriver objects between separate Ruby processes? - ruby

I want to pass an instance of an object between two Ruby processes. Specifically, I want to pass an instance of a Selenium WebDriver from one process to another process. The reason I want to do this is because it takes a lot of time for Ruby to create this object, but I want it to be used by the other process.
I've found some related questions here and here that seem to point towards using DRb, but I've been unable to find any useful examples or sample code.
Is there a tool other than DRb that I should be using? Does anyone have an example similar to this that I could copy from?

It looks like you're going to have to use DRb, although the documentation for it seems to be lacking. There is however an interesting article here. You might also want to consider purchasing The dRuby Book by Masatoshi Seki to get a better idea of how to do this effectively.
Another option to investigate, if you do not need simultaneous access and just want to send the object from one process to another, is to serialize the object (that is, encode it in a way that Ruby can read back) with YAML (for a human-readable file) or Marshal (for a binary-encoded file) and send it through a pipe. This was mentioned in another answer that has since been deleted.
Note that either of these solutions requires modifying the Selenium code heavily, since the objects you want to manipulate natively support neither copying nor simultaneous access.
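As a rough illustration of the serialize-and-pipe idea, here is a minimal sketch that sends a plain Ruby object through IO.pipe with Marshal. The SessionInfo class, its fields, and the URL are hypothetical stand-ins. A live WebDriver holds a connection to a browser, so it is unlikely to survive this round trip without the modifications mentioned above.

```ruby
# Hypothetical value object standing in for the state you want to hand over.
class SessionInfo
  attr_reader :server_url, :session_id

  def initialize(server_url, session_id)
    @server_url = server_url
    @session_id = session_id
  end
end

# Writes one marshaled object to the writer end of a pipe and closes it.
def send_object(writer, object)
  writer.binmode
  Marshal.dump(object, writer)
  writer.close
end

# Reads one marshaled object from the reader end of a pipe.
def receive_object(reader)
  reader.binmode
  Marshal.load(reader)
ensure
  reader.close
end

# Round trip through a pipe. In real use the two ends would live in
# different processes (for example after Process.fork or via IO.popen).
def pipe_round_trip(object)
  reader, writer = IO.pipe
  send_object(writer, object)
  receive_object(reader)
end

copy = pipe_round_trip(SessionInfo.new('http://localhost:4444/wd/hub', 'abc123'))
puts copy.session_id
```

Only call Marshal.load on data you produced yourself, since loading untrusted marshaled data can execute arbitrary code.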

TL;DR
Most queue or distributed processes are going to require some sort of serialization to work properly. If you want to pass objects rather than messages, then this will be a limiting factor in how you approach the problem.
DRb
I don't know if you can marshal a WebDriver object. If you can't, then DRb may be a good choice for your distributed Ruby programs because it supports DRbObject references for things that can't be marshaled. There are some examples provided in the DRb documentation.
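To give a feel for the DRbObject approach, here is a minimal sketch in which a server exposes an object and a client calls it through a proxy. FakeDriver is a hypothetical stand-in for the expensive object, and both ends run in one process here purely to keep the example self-contained. In a real setup the client would be a separate process connecting to a fixed, agreed-upon URI. The drb library ships with Ruby (as a bundled gem in recent versions).

```ruby
require 'drb/drb'

# Hypothetical stand-in for an object that is expensive to build and
# cannot be marshaled. It stays in the server process; clients only get
# a remote reference to it.
class FakeDriver
  include DRb::DRbUndumped

  def initialize
    @visited = []
  end

  def get(url)
    @visited << url
    true
  end

  def visited
    @visited.dup
  end
end

# Starts a DRb server exposing the driver and returns the URI that
# clients should connect to. Port 0 lets the OS pick a free port.
def start_driver_server(driver)
  DRb.start_service('druby://127.0.0.1:0', driver)
  DRb.uri
end

# Returns a proxy; method calls on it are forwarded to the server object.
def connect_to_driver(uri)
  DRbObject.new_with_uri(uri)
end

uri = start_driver_server(FakeDriver.new)
remote = connect_to_driver(uri)
remote.get('http://example.com')
puts remote.visited.inspect
DRb.stop_service
```

DRb performs no authentication by default, so it is best kept on localhost or a trusted network.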
Selenium Wire Protocol
Depending on what you're really trying to do, it may be worth taking a closer look at using the remote bindings for the Remote WebDriver client/server, or Selenium's JSON Wire Protocol as an alternative to passing objects between processes.
Other Alternatives: Fixtures, Factories, Stubs, and Mocks
Whether or not these work in your specific case will depend a lot on why you want to pass objects instead of simply driving the remote server. If it's largely an issue of how long it takes to build your object, then the serialization/de-serialization cycle may not necessarily be faster in all cases.
You might want to revisit why your object is so slow to create. If gathering and processing the data for it is what's taking too long, you can use some sort of test fixture or factory to trim that time, either by using a smaller set of fixed data, or using a pre-serialized object that's optimized for speed.
You might also consider whether you actually need real data or objects for your test at all. In many cases, you can speed up your tests a lot by stubbing methods or creating mock objects that will return the values you need for your integration tests without needing to perform expensive calculations or long-running operations.
There are certainly cases where you need to drive the full stack and perform acceptance tests on real data. Even then, you may be able to devise a set of fixture data that will take less time or memory to process. It's certainly worth at least thinking about.
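As a small sketch of the stubbing idea, the code under test below only needs an object that responds to get and title, so a hand-rolled stub can replace the real browser. StubDriver, page_title_for, and the URLs are hypothetical names made up for this example.

```ruby
# Code under test: it only needs something that responds to get and title.
def page_title_for(driver, url)
  driver.get(url)
  driver.title.strip
end

# Hand-rolled stub with canned answers; no browser is started.
# The method names mirror the navigation calls used above, but this
# class is purely illustrative.
class StubDriver
  attr_reader :requested_urls

  def initialize(titles)
    @titles = titles
    @requested_urls = []
    @current = nil
  end

  def get(url)
    @requested_urls << url
    @current = url
  end

  def title
    @titles.fetch(@current)
  end
end

stub = StubDriver.new('http://example.com/' => '  Example Domain ')
puts page_title_for(stub, 'http://example.com/')
```

Because the stub is created instantly, there is nothing expensive left to share between processes for tests written this way.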

Related

What's the best practice for NSPersistentContainer newBackgroundContext?

I'm familiarizing myself with NSPersistentContainer. I wonder if it's better to spawn an instance of the private context with newBackgroundContext every time I need to insert/fetch some entities in the background or create one private context, keep it and use for all background tasks through the lifetime of the app.
The documentation also offers convenience method performBackgroundTask. Just trying to figure out the best practice here.
I generally recommend one of two approaches. (There are other setups that work, but these are two that I have used, and tested and would recommend.)
The Simple Way
You read from the viewContext, you write to the viewContext, and you only use the main thread. This is the simplest approach and avoids a lot of the multithreading issues that are common with Core Data. The problem is that the disk access happens on the main thread, and if you are doing a lot of it, it could slow down your app.
This approach is suitable for small, lightweight applications. Any app that has fewer than a few thousand total entities and no bulk changes at once would be a good candidate for this. A simple todo list would be a good example.
The Complex Way
The complex way is to only read from the viewContext on the main thread and do all your writing using performBackgroundTask inside a serial queue. Every block inside the performBackgroundTask refetches any managedObjects that it needs (using objectIds) and all managedObjects that it creates are discarded at the end of the block. Each performBackgroundTask is transactional and saveContext is called at end of the block. A fuller description can be found here: NSPersistentContainer concurrency for saving to core data
This is a robust and functional core-data setup that can manage data at any reasonable scale.
The problem is that you must always make sure that the managedObjects are from the context you expect and are accessed on the correct thread. You also need a serial queue to make sure you don't get write conflicts. And you often need to use fetchedResultsController to make sure entities are not deleted while you are holding pointers to them.

Concept of 'serializing' complete memory of object

I would like to ask a very general question about a technical concept of which I do not know whether it exists or whether it is feasible at all.
The idea is the following:
I have an object in a garbage-collected language (e.g. C# or Java). The object may itself contain several objects, but there is no reference to any other object that is not a sub-element of the object (or the object itself).
Theoretically it would be possible to get the memory used by this object, which is most likely not one contiguous piece. Because I have some knowledge about the object, I can find all reference variables/properties and pointers that in the end point to another piece of the memory (probably indirectly, depending on the implementation of the programming language and virtual machine). I can take these pieces of memory and combine them into one bigger piece of memory (correcting the references/pointers so that they are still intact). This piece of memory, basically bytes, could be written to a storage, for example a database or a redis cache. On another machine I could theoretically load this object again and put it into the memory of the virtual machine (maybe again correcting the references/pointers if they are absolute and not relative). Then I should have the same object on the other VM. The object can be as complicated as I want, may also contain events or whatever, and I would be able to get the state of the object transferred to another VM (running on another computer). The only condition is that it does not contain references to something outside the object. And of course I have to know the class type of the object on the other VM.
I ask this question because I want to share the state of an object, and I think all this serialization work is just overhead; it would be very simple if I could just freeze the memory and transport it to another VM.
Is something like this possible? I'd say yes, though it might be complicated. Maybe it is not possible with some VMs due to their architecture. Does something like this exist in any programming language? Maybe even in non-garbage-collected languages?
NOTE: I am not sure what tags should be added to this question except from programming-language, also I am not sure if there might be a better place for such a question. So please forgive me.
EDIT:
Maybe the concept can be compared to the initrd on Linux or hibernation in general.
you will have to collect all references to other objects. including graphs of objects (cycles) without duplications. it would require some kind of 'stop the world' at least for the serializing thread. it's complicated to do effectively but possible - native serialization mechanisms in many languages (java) are doing it for the developer.
you will need some kind of VM to abstract from the byte order in different hardware architectures.
you will have to detach object from any kind of environment. you can't pass objects representing threads, files handles, sockets etc. how will you detect it?
in today's systems memory is virtual, so it will be impossible to simply copy addresses from one machine to another - you will have to translate them
objects are not only the data visible to the developer, it's also structure, sandboxing information, permissions, superclasses, which methods/types were already loaded and which are still not loaded because of optimizations and lazy loading, garbage collector metadata etc
version of your object/class. on one machine class A can be created from source ver 1, but on another machine there might already be objects of class A built from source of version 2
take performance into consideration. will it be faster than old-school serialization? what benefits will it have?
and probably many more things none of us thought about
so: i've never heard of such a solution. it seems theoretically doable, but for some reason no one has ever done that. everyone offers plain old programmatic serialization. maybe you will discover a new, better way, but keep in mind you'll be going against the crowd
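To make two of the points above concrete (cycles in the object graph, and objects that cannot be detached from their environment), here is a small sketch using Ruby's built-in Marshal as an example of a native serialization mechanism. The Node class is made up for the example; other languages' serializers differ in the details.

```ruby
# A small object graph with a cycle: a points to b and b points back to a.
class Node
  attr_accessor :name, :peer, :on_change

  def initialize(name)
    @name = name
  end
end

# Deep-copies a graph by serializing and deserializing it.
def snapshot(object)
  Marshal.load(Marshal.dump(object))
end

# Returns true when Marshal refuses to serialize the object.
def undumpable?(object)
  Marshal.dump(object)
  false
rescue TypeError
  true
end

a = Node.new('a')
b = Node.new('b')
a.peer = b
b.peer = a

# The cycle survives the round trip: the copy's peer points back at the copy.
copy = snapshot(a)
puts copy.peer.peer.equal?(copy)

# Attach a callback (a Proc) and the same graph can no longer be dumped,
# because a Proc is tied to the running program, not just to data.
a.on_change = proc { |node| puts node.name }
puts undumpable?(a)
```

The serializer walks references and records each object once, which is the bookkeeping a raw memory copy would have to reproduce by hand.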

SearchScope fetchRows vs fetchObjects (IBM FileNet CE API)

I've been using the SearchScope.fetchObjects() method until now, and then it just occurred to me that fetchRows might be the better choice in some cases (when you don't need metadata like class names, object stores etc). Something tells me it might be faster, but I didn't find any arguments about which method to use in which case, and why.
Here is SearchScope documentation.
The difference in performance between fetchRows() and fetchObjects() is negligible in most cases. If you process a significant volume of data and are still concerned about performance, I suggest running a simple test.
The only reason for existence of fetchRows() is the possibility to query disparate object classes using JOIN.

Extending functionality of existing program I don't have source for

I'm working on a third-party program that aggregates data from a bunch of different, existing Windows programs. Each program has a mechanism for exporting the data via the GUI. The most brain-dead approach would have me generate extracts by using AutoIt or some other GUI manipulation program to generate the extractions via the GUI. The problem with this is that people might be interacting with the computer when, suddenly, some automated program takes over. That's no good. What I really want to do is somehow have a program run once a day and silently (i.e. without popping up any GUIs) export the data from each program.
My research is telling me that I need to hook each application (assume these applications are always running) and inject a custom DLL to trigger each export. Am I remotely close to being on the right track? I'm a fairly experienced software dev, but I don't know a whole lot about reverse engineering or hooking. Any advice or direction would be greatly appreciated.
Edit: I'm trying to manage the availability of a certain type of professional. Their schedules are stored in proprietary systems. With their permission, I want to install an app on their system that extracts their schedule from whichever system they are using and uploads the information to a central server so that I can present that information to potential clients.
I am aware of four ways of extracting the information you want, each with its own advantages and disadvantages. Before you do anything, you need to be aware that any solution you create is not guaranteed to keep working, and is in fact very unlikely to, should the target application ever be updated. The reason is that in each case, you are relying on an implementation detail instead of a pre-defined interface through which to export your data.
Hooking the GUI
The first way is to hook the GUI as you have suggested. What you are doing in this case is simply reading off from what an actual user would see. This is in general easier, since you are hooking the WinAPI which is clearly defined. One danger is that what the program displays is inconsistent or incomplete in comparison to the internal data it is supposed to be representing.
Typically, there are two common ways to perform WinAPI hooking:
DLL Injection. You create a DLL which you load into the other program's virtual address space. This means that you have read/write access (writable access can be gained with VirtualProtect) to the target's entire memory. From here you can trampoline the functions which are called to set UI information. For example, to check if a window has changed its text, you might trampoline the SetWindowText function. Note every control has different interfaces used to set what they are displaying. In this case, you are hooking the functions called by the code to set the display.
SetWindowsHookEx. Under the covers, this works similarly to DLL injection and in this case is really just another method for you to extend/subvert the control flow of messages received by controls. What you want to do in this case is hook the window procedures of each child control. For example, when an item is added to a ComboBox, it would receive a CB_ADDSTRING message. In this case, you are hooking the messages that are received when the display changes.
One caveat with this approach is that it will only work if the target is using or extending WinAPI controls.
Reading from the GUI
Instead of hooking the GUI, you can alternatively use WinAPI to read directly from the target windows. However, in some cases this may not be allowed. There is not much to do in this case but to try and see if it works. This may in fact be the easiest approach. Typically, you will send messages such as WM_GETTEXT to query the target window for what it is currently displaying. To do this, you will need to obtain the exact window hierarchy containing the control you are interested in. For example, say you want to read an edit control, you will need to see what parent window/s are above it in the window hierarchy in order to obtain its window handle.
Reading from memory (Advanced)
This approach is by far the most complicated but if you are able to fully reverse engineer the target program, it is the most likely to get you consistent data. This approach works by you reading the memory from the target process. This technique is very commonly used in game hacking to add 'functionality' and to observe the internal state of the game.
Consider that as well as storing information in the GUI, programs often hold their own internal model of all the data. This is especially true when the controls used are virtual and simply query subsets of the data to be displayed. This is an example of a situation where the first two approaches would not be of much use. This data is often held in some sort of abstract data type such as a list or perhaps even an array. The trick is to find this list in memory and read the values off directly. This can be done externally with ReadProcessMemory or internally through DLL injection again. The difficulty lies mainly in two prerequisites:
Firstly, you must be able to reliably locate these data structures. The problem with this is that code is not guaranteed to be in the same place, especially with features such as ASLR. Colloquially, this is sometimes referred to as code-shifting. ASLR can be defeated by using the offset from a module base and dynamically getting the module base address with functions such as GetModuleHandle. As well as ASLR, a reason that this occurs is due to dynamic memory allocation (e.g. through malloc). In such cases, you will need to find a heap address storing the pointer (which would for example be the return of malloc), dereference that and find your list. That pointer would be prone to ASLR and instead of a pointer, it might be a double-pointer, triple-pointer, etc.
The second problem you face is that it would be rare for each list item to be a primitive type. For example, instead of a list of character arrays (strings), it is likely that you will be faced with a list of objects. You would need to further reverse engineer each object type and understand its internal layout (at the very least, you must be able to determine the offsets, from the object base, of the primitive values you are interested in). More advanced methods revolve around actually reverse engineering the vtable of objects and calling their 'API'.
You might notice that I am not able to give information here which is specific. The reason is that by its nature, using this method requires an intimate understanding of the target's internals and as such, the specifics are defined only by how the target has been programmed. Unless you have knowledge and experience of reverse engineering, it is unlikely you would want to go down this route.
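Although the specifics depend on the target, the address arithmetic behind the first prerequisite is generic and can be sketched. The example below rebases a static address onto a runtime module base and then follows a chain of pointers. All addresses and offsets are invented, and memory is simulated with a hash; in a real tool the block passed to follow_pointer_chain would wrap something like ReadProcessMemory.

```ruby
# Translates an address seen in a disassembler (relative to the module's
# preferred base) into the address valid for this run of the process.
def rebase(static_address, preferred_base, runtime_base)
  runtime_base + (static_address - preferred_base)
end

# Follows a chain of pointers: read the pointer stored at the current
# address, add the next offset, repeat. The block stands in for whatever
# actually reads memory (for example a wrapper around ReadProcessMemory).
def follow_pointer_chain(start_address, offsets)
  offsets.reduce(start_address) do |address, offset|
    yield(address) + offset
  end
end

# Simulated memory: address => pointer value stored there.
memory = {
  0x7FF6_1000_2040 => 0x0000_0200_0000_1000, # global holding a heap pointer
  0x0000_0200_0000_1010 => 0x0000_0200_0000_5000 # field pointing at the list
}

global = rebase(0x1_4000_2040, 0x1_4000_0000, 0x7FF6_1000_0000)
list_address = follow_pointer_chain(global, [0x10, 0x0]) { |addr| memory.fetch(addr) }
puts format('0x%X', list_address)
```

The runtime base is the only value that changes between runs; the offsets stay fixed until the target is rebuilt.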
Hooking the target's internal API (Advanced)
This is similar to the solution above, except that instead of digging for data structures, you dig for the internal API. I briefly covered this when discussing vtables earlier. Here, however, you would be attempting to find internal APIs that are called when the GUI is modified. Typically, when a view/UI is modified, instead of directly calling the WinAPI to update it, a program will call its own wrapper function, which in turn calls the WinAPI. You simply need to find this function and hook it. Again, this is possible, but requires reverse engineering skills. You may find that you discover functions which you want to call yourself. In this case, as well as being able to locate the function, you have to reverse engineer the parameters it takes and its calling convention, and you will need to ensure that calling the function has no side effects.
I would consider this approach to be advanced. It can certainly be done and is another common technique used in game hacking to observe internal states and to manipulate a target's behaviour, but is difficult!
The first two methods are well suited for reading data from WinAPI programs and are by far easier. The latter two methods allow greater flexibility. With enough work, you are able to read anything and everything encapsulated by the target, but this requires a lot of skill.
Another point of concern, which may or may not relate to your case, is how easy it will be to update your solution should the target ever be updated. With the first two methods, it is more likely that no changes or only small changes have to be made. With the latter two methods, even a small change in source code can cause a relocation of the offsets you are relying upon. One method of dealing with this is to use byte signatures to dynamically generate the offsets. I wrote another answer some time ago which addresses how this is done.
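A rough sketch of what a byte-signature scan can look like, assuming the code has already been read into a byte string: you search for a pattern of instruction bytes with wildcards where addresses are embedded, then pull the offset out of the match. The function name, the sample bytes, and the signature are invented for illustration.

```ruby
# Finds the first position in data where the signature matches.
# The signature is an array of byte values; nil marks a wildcard byte
# (typically an address or offset that changes between builds).
def find_signature(data, signature)
  bytes = data.bytes
  last_start = bytes.length - signature.length
  (0..last_start).find do |start|
    signature.each_with_index.all? do |expected, i|
      expected.nil? || bytes[start + i] == expected
    end
  end
end

# Hypothetical 32-bit x86 code bytes: nop; nop; mov eax, [imm32]; ret
code = "\x90\x90\xA1\x78\x56\x34\x12\xC3".b
signature = [0xA1, nil, nil, nil, nil, 0xC3]

offset = find_signature(code, signature)
# The four wildcard bytes hold the little-endian address we are after.
address = code.byteslice(offset + 1, 4).unpack1('V')
puts format('match at %d, address 0x%X', offset, address)
```

A signature needs to be long enough to be unique in the module, yet built from bytes that are unlikely to change between versions.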
What I have written is only a brief summary of the various techniques that can be used for what you want to achieve. I may have missed approaches, but these are the most common ones I know of and have experience with. Since these are large topics in themselves, I would advise you ask a new question if you want to obtain more detail about any particular one. Note that in all of the approaches I have discussed, none of them suffer from any interaction which is visible to the outside world so you would have no problem with anything popping up. It would be, as you describe, 'silent'.
This is relevant information about detouring/trampolining which I have lifted from a previous answer I wrote:
If you are looking for ways that programs detour execution of other processes, it is usually through one of two means:
Dynamic (Runtime) Detouring - This is the more common method and is what is used by libraries such as Microsoft Detours. Here is a relevant paper where the first few bytes of a function are overwritten to unconditionally branch to the instrumentation.
(Static) Binary Rewriting - This is a much less common method for rootkits, but is used by research projects. It allows detouring to be performed by statically analysing and overwriting a binary. An old (not publicly available) package for Windows that performs this is Etch. This paper gives a high-level view of how it works conceptually.
Although Detours demonstrates one method of dynamic detouring, there are countless methods used in the industry, especially in the reverse engineering and hacking arenas. These include the IAT and breakpoint methods I mentioned above. To 'point you in the right direction' for these, you should look at 'research' performed in the fields of research projects and reverse engineering.
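To illustrate the "overwrite the first few bytes with an unconditional branch" idea, here is a sketch that only computes the patch bytes for one common encoding, the 5-byte x86 near jump (opcode 0xE9 followed by a 32-bit relative displacement). This is an assumption about the encoding, not a description of what any particular library does, and the addresses are invented. Writing the bytes into a process and preserving the overwritten instructions in a trampoline are not shown.

```ruby
JMP_REL32_SIZE = 5 # one opcode byte (0xE9) plus a 32-bit displacement

# Builds the bytes of an x86 near jump placed at hook_address that
# transfers control to target_address. The displacement is relative to
# the address of the instruction following the jump.
def jmp_rel32(hook_address, target_address)
  displacement = target_address - (hook_address + JMP_REL32_SIZE)
  unless (-2**31...2**31).cover?(displacement)
    raise RangeError, 'target is out of rel32 range'
  end
  [0xE9, displacement].pack('Cl<')
end

patch = jmp_rel32(0x0040_1000, 0x1000_2000)
puts patch.unpack('C*').map { |b| format('%02X', b) }.join(' ')
```

On 64-bit targets the destination is often further away than a rel32 displacement can reach, which is one reason real detouring libraries use more elaborate sequences.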

TDD for a Device Communicator

I've been reading about TDD, and would like to use it for my next project, but I'm not sure how to structure my classes with this new paradigm. The language I'd like to use is Java, although the problem is not really language-specific.
The Project
I have a few pieces of hardware that come with an ASCII-over-RS232 interface. I can issue simple commands, and get simple responses, and control them as if from their front panels. Each one has a slightly different syntax and very different sets of commands. My goal is to create an abstraction/interface so I can control them all through a GUI and/or remote procedure calls.
The Problem
I believe the first step is to create an abstract class (I'm bad at names, how about 'Communicator'?) to implement all the stuff like Serial I/O, and then create a subclass for each device. I'm sure it will be a little more complicated than that, but that's the core of the application in my mind.
Now, for unit tests, I don't think I really need the actual hardware or a serial connection. What I'd like to do is hand my Communicators an InputStream and OutputStream (or Reader and Writer) that could be from a serial port, file, stdin/stdout, piped from a test function, whatever. So, would I just have the Communicator constructor take those as inputs? If so, it would be easy to put the responsibility of setting it all up on the testing framework, but for the real thing, who makes the actual connection? A separate constructor? The function calling the constructor again? A separate class whose job it is to 'connect' the Communicator to the correct I/O streams?
Edit
I was about to rewrite the problem section in order to get answers to the question I thought I was asking, but I think I figured it out. I had (correctly?) identified two different functional areas.
1) Dealing with the serial port
2) Communicating with the device (understanding its output & generating commands)
A few months ago, I would have combined it all into one class. My first idea towards breaking away from this was to pass just the IO streams to the class that understands the device, and I couldn't figure out who would then be responsible for creating the streams.
Having done more research on inversion of control, I think I have an answer. Have a separate interface and class that solve problem #1 and pass it to the constructor of the class(es?) that solve problem #2. That way, it's easy to test both separately. #1 by connecting to the actual hardware and allowing the test framework to do different things. #2 can be tested by being given a mock of #1.
Does this sound reasonable? Do I need to share more information?
With TDD, you should let your design emerge, start small, with baby steps and grow your classes test by test, little by little.
CLARIFIED: Start with a concrete class that sends one command, and unit test it with a mock or a stub. Once it works well enough (perhaps not with all options), test it against your real device, once, to validate your mock/stub/simulator.
Once the class for the first command is operational, start implementing a second command, the same way: first again your mock/stub, then once against the device for validation. Now if you're seeing similarities between your two classes, you can refactor to your abstract class based design - or to something different.
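A minimal sketch of that first step, written in Ruby for brevity even though the question mentions Java (the structure carries over directly). The transport is passed into the constructor, so the test can hand in a fake. FakeTransport, PowerSupply, and the VOLT? command syntax are all invented for this example.

```ruby
# Hypothetical transport that talks to nothing: it records what was sent
# and replies with canned lines, standing in for the serial port.
class FakeTransport
  attr_reader :sent

  def initialize(replies)
    @replies = replies.dup
    @sent = []
  end

  def write_line(line)
    @sent << line
  end

  def read_line
    @replies.shift
  end
end

# First concrete class: one device, one command. The command syntax
# ("VOLT?" answered by "VOLT 12.5") is invented for this example.
class PowerSupply
  def initialize(transport)
    @transport = transport
  end

  def voltage
    @transport.write_line('VOLT?')
    reply = @transport.read_line
    Float(reply.split.last)
  end
end

transport = FakeTransport.new(['VOLT 12.5'])
supply = PowerSupply.new(transport)
puts supply.voltage
```

A real serial transport would expose the same two methods, and whoever wires up the application (not the device class) decides which one to pass in.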
Sorry for being a little Linux centric ..
My favorite way of simulating gadgets is to write character device drivers that simulate their behavior. This also gives you fun abilities, like providing an ioctl() interface that makes the simulated device behave abnormally.
At that point .. from testing to real world, it only matters which device(s) you actually open, read and write.
It should not be too hard to mimic the behavior of your gadgets .. it sounds like they take very basic instructions and return very basic responses. Again, a simple ioctl() could tell the simulated device that it's time to misbehave, so you can ensure that your code is handling such events adequately. For instance, fail intentionally on every n'th instruction, where n is randomly selected upon the call to ioctl().
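The same idea can be tried out in user space before writing a driver. The sketch below is an in-process analog, not a character device: fail_every plays the role of the ioctl() that switches the simulated device into a faulty mode, and n is fixed rather than random so the behavior is repeatable. All class and method names are hypothetical.

```ruby
# Raised by the simulator when it has been told to misbehave.
class SimulatedFault < StandardError; end

# User-space stand-in for a simulated device. fail_every plays the role
# of the control call that switches the device into a faulty mode.
class SimulatedDevice
  def initialize
    @count = 0
    @fail_every = nil
  end

  # Fail on every n-th instruction; pass nil to behave normally again.
  def fail_every(n)
    @fail_every = n
    @count = 0
  end

  # Echoes an acknowledgement unless this instruction is due to fail.
  def execute(instruction)
    @count += 1
    if @fail_every && (@count % @fail_every).zero?
      raise SimulatedFault, "simulated failure on #{instruction}"
    end
    "OK #{instruction}"
  end
end

# Code under test: retries once when the device reports a fault.
def execute_with_retry(device, instruction)
  device.execute(instruction)
rescue SimulatedFault
  device.execute(instruction)
end

device = SimulatedDevice.new
device.fail_every(3)
puts %w[A B C D].map { |cmd| execute_with_retry(device, cmd) }.inspect
```

Tests can then assert that the calling code recovers, or gives up cleanly, when faults are injected.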
After seeing your edits I think you are heading in exactly the right direction. TDD tends to drive you towards a design composed of small classes with a well-defined responsibility. I would also echo tinkertim's advice - a device simulator which you can control and "provoke" into behaving in different ways is invaluable for testing.
