How to fix the inputshow message in a normal GUI in Java? - user-interface

To make the GUI more secure, I want to restrict the output to writing one and only one word.

Related

Python structure for working with variable log messages in a large file

I am trying to get data out of debug log messages created by a certain piece of open source software. It has many lines describing what it is doing during its stages. It does not have a specific structure, i.e. some data covers multiple lines with different indents and no separator, so it does not import nicely into a pandas data frame, which would usually be my go-to.
Is there a good way to structure a Python script that parses this data, one that can be reused in the future for the same job and also extended to extract different data? I have to do a bunch of different steps to extract the data. The other complication is that the file is much too big to store in memory (10^6 lines), so I need to iterate through the lines.
Could anyone give me some tips on how to do this? Is it best to do each step and save the result to a new file? Or my idea is to create a data object and store relevant line numbers as attributes in lists, generated in different methods; each subsequent method then only loads the lines from that list.
Or alternatively, maybe I am using the wrong tool entirely and I need to learn awk or regex commands to do it? I just know Python already, so I have a preference for it. I'm not necessarily looking for a specific answer; some tips and pointers would also be very useful!
To give some detail: I am trying to trace, on a FreeRADIUS server, the difference between the log messages for requests, accepts, and rejects of a MAC address, to see if I can find out why it is sometimes accepted and other times rejected, seemingly at random.
There are a lot of plugins running on the server setup from before I got to dealing with it, so the debug output is a massive wall of text, labelling each request with a number. I can split it into requests by that number, find the requests that mention the MAC, split those requests into different files, and then I want to filter out all the boilerplate info that comes with each message and get to the things that are different between them.
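Here is a minimal two-pass sketch of the "store line numbers, load them later" idea. The request-number regex is only an assumption about what the debug lines look like and would need adjusting to the real FreeRADIUS output:

import re
from collections import defaultdict

# Hypothetical pattern: assume each debug line starts with "(<number>)",
# e.g. "(12) Received Access-Request ..." -- adjust to the actual format.
REQUEST_ID = re.compile(r"^\((\d+)\)")

def index_requests(path):
    """First pass: map each request number to the line numbers it owns."""
    line_numbers = defaultdict(list)
    with open(path) as fh:
        for lineno, line in enumerate(fh, start=1):
            match = REQUEST_ID.match(line)
            if match:
                line_numbers[match.group(1)].append(lineno)
    return line_numbers

def extract_lines(path, wanted):
    """Second pass: pull out only the lines whose numbers are in `wanted`."""
    wanted = set(wanted)
    with open(path) as fh:
        return [line for lineno, line in enumerate(fh, start=1) if lineno in wanted]

Because both passes iterate the file line by line, memory use stays proportional to the index, not to the 10^6-line file itself.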

Is it bad idea to use Flex/Bison as parser for backend of terminal emulator?

I am implementing a terminal emulator similar to the default terminal emulator for OS X, Terminal.app.
I open the terminal connection with openpty and then use Flex to parse the incoming input (normal text and control sequences) and Bison, which calls callbacks based on the tokens produced by Flex (insert string, cursor forward sequence, etc.).
Along with normal text tokens I have implemented around 30 escape sequences without any outstanding problems.
I made Flex/Bison re-entrant because I needed multiple terminal windows to work simultaneously.
I did some workarounds to make Flex/Bison read continuous output, based on another question of mine: How to detect partial unfinished token and join its pieces that are obtained from two consequent portions of input?.
So far it looks like Flex/Bison do their job; however, I suspect that sooner or later I will run into something that shows Flex/Bison should not be used to parse terminal input.
The question is: what problems can Flex/Bison cause if they are used instead of a hand-written parser for terminal input? Can performance be a concern?
Abbreviations are sometimes helpful:
DPDA "deterministic push-down automata"
DFA "deterministic finite automata"
What #ejp is saying is that Bison is not needed in your solution, because there is only one way that the tokens from the lexical analyzer can be interpreted. The stack that was mentioned is used to save the state of the machine while it looks at the alternative ways the input could be interpreted.
In considering whether flex (and possibly bison) would be a good approach, I would be more concerned with how you are going to solve the problem of control characters which can be interspersed within control sequences.
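To make that concern concrete, here is a minimal sketch (in Python rather than C, purely for brevity) of the shape a hand-written terminal parser usually takes: a small state machine in which C0 control bytes are executed immediately even when they arrive in the middle of a CSI sequence. A real VT parser needs more states (OSC, DCS, intermediate bytes); this only shows the control flow:

# Simplified hand-written state machine for terminal input.
ESC, CSI_FINAL = 0x1B, range(0x40, 0x7F)

def parse(data, execute_control, dispatch_csi, print_text):
    state, params = "ground", []
    for byte in data:
        if byte < 0x20 and byte != ESC:          # interspersed control character
            execute_control(byte)                # handled out-of-band, any state
            continue
        if state == "ground":
            if byte == ESC:
                state = "escape"
            else:
                print_text(byte)
        elif state == "escape":
            state = "csi" if byte == ord("[") else "ground"
            params = []
        elif state == "csi":
            if byte in CSI_FINAL:                # final byte ends the sequence
                dispatch_csi(bytes(params), byte)
                state = "ground"
            else:
                params.append(byte)              # parameter/intermediate bytes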
Further reading:
Control Bytes, Characters, and Sequences

Best form of IPC for a decentralized roguelike?

I've got a project to create a roguelike that in some way abstracts the UI from the engine and the engine from map creation, line-of-sight, etc. To narrow the focus, I first want to just get the UI (player's client) and engine working.
My current idea is to make the client basically a program that decides what one character (player or monster) will do for its turn and waits until it can move again. So each monster has a client, and so does the player. The player's client prints the map, waits for input, sends it to the engine, and tells the player what happened. The monster's client does the same, except without printing the map and using AI instead of keyboard input.
Before I go any further: if this seems a somewhat obfuscated way of doing things, my goal is to learn, not to write a roguelike. It's the journey, not the destination.
And so I need to choose what form of IPC fits this model best.
My first attempt used pipes because they're simplest. I wrote a UI for the player and a program to pipe in instructions such as where to put the map and player. While this works, it only allows one client, communicating through stdin and stdout.
I've thought about making the engine a daemon that watches a spool where clients, when started, create unique-per-client temp files to give instructions to the engine and receive feedback.
Lastly, I've done a little introductory programming with sockets. They seem like they might be the way to go, and would allow the game to perhaps someday be run over a network. I'd like to use a simpler solution if possible, and since I'm unfamiliar with them, they're more error-prone.
I'm always open to suggestions.
I've been playing around with using these combinations for a similar problem (multiple clients talking via a single daemon on the local box, with much of the intelligence shoved off into the clients).
mmap for sharing large data blobs, with unix domain sockets, message queues, or named pipes for notification
same, but using individual files per blob instead of munging them all together in an mmap
same, but without the files or mmap (in other words, more like conventional messaging)
In general I like the idea of breaking things up into separate executables this way -- it certainly makes testing easier, for instance. I think the choice of method comes down to usage patterns -- how large are messages, how persistent does the data in them need to be, can you afford the cost of multiple trips through the network stack for a socket-based message, that sort of thing. The fact that you're sticking to Linux makes things easy in terms of what's available -- you don't need to worry about portability of message queues, for instance.
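For reference, a stripped-down sketch of the daemon/client split over a Unix domain socket (Python here; the socket path and the one-line-per-turn protocol are made up for the example, not part of the original question):

import socket, threading, os

SOCK_PATH = "/tmp/rogue-engine.sock"   # placeholder path

def engine():
    if os.path.exists(SOCK_PATH):
        os.unlink(SOCK_PATH)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(SOCK_PATH)
    server.listen()
    while True:                          # accept any number of clients
        conn, _ = server.accept()
        threading.Thread(target=handle_client, args=(conn,), daemon=True).start()

def handle_client(conn):
    with conn, conn.makefile("rw") as stream:
        for command in stream:           # e.g. "move north\n" from a client
            stream.write("ok\n")          # engine's reply for that turn
            stream.flush()

def client_turn(command):
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(SOCK_PATH)
        with sock.makefile("rw") as stream:
            stream.write(command + "\n")
            stream.flush()
            return stream.readline().strip()

The same client code works unchanged whether the "client" is the player's UI or a monster's AI loop, which fits the one-client-per-character design.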
This one's also applicable: https://stackoverflow.com/a/1428542/1264797

Is avoiding the T in ETL possible?

ETL is pretty commonplace. Data is out there somewhere, so you go get it. After you get it, it's probably in a weird format, so you transform it into something and then load it somewhere. The only problem I see with this method is that you have to write the transform rules. Of course, I can't think of anything better. I suppose you could load whatever you get into a blob (SQL) or into an object/document (non-SQL), but then I think you're just delaying the parsing. Eventually you'll have to parse it into something structured (assuming you want to). So is there anything better? Does it have a name? Does this problem have a name?
Example
Ok, let me give you an example. I've got a printer, an ATM and a voicemail system. They're all network enabled or I can give you connectivity. How would you collect the state from all these devices? For example, the printer dumps a text file when you type status over port 9000:
> status
===============
has_paper:true
jobs:0
ink:low
The ATM has a CLI after you connect on port whatever and you can type individual commands to get different values:
maint-mode> GET BILLS_1
[$1 bills]: 7
maint-mode> GET BILLS_5
[$5 bills]: 2
etc ...
The voicemail system requires certain key sequences to get any kind of information over a network port:
telnet> 7,9*
0 new messages
telnet> 7,0*
2 total messages
My thoughts
Printer - So this is pretty straight-forward. You can just capture everything after sending "status", split on lines and then split on colons or something. Pretty easy. It's almost like getting a crap-formatted result from a web service or something. I could avoid parsing and just dump the whole conversation from port 9000. But eventually I'll want to get rid of that equal signs line. It doesn't really mean anything.
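A minimal sketch of that printer case, assuming the whole conversation from port 9000 has already been captured into a string:

def parse_printer_status(raw):
    """Split the captured output on lines, then on the first colon."""
    state = {}
    for line in raw.splitlines():
        if ":" not in line:          # skip the "=====" banner and blank lines
            continue
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state

# parse_printer_status("===============\nhas_paper:true\njobs:0\nink:low")
# -> {'has_paper': 'true', 'jobs': '0', 'ink': 'low'}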
ATM - So this is a bit more of a pain because it's interactive. Now I'm approaching expect or a protocol territory. It'd be better if they had a service that I could query these values but that's out of scope for this post. So I write a client that gets all the values. But now if I want to collect all the data, I have to define what all the questions are. For example, I know that the ATM has more bills than $1 and $5 so I'd have a complete list like "BILLS_1 BILLS_5 BILLS_10 BILLS_20". If I ask all the questions then I have an inventory of the ATM machine. Of course, I still have to parse out the results and clean up the text if I wanted to figure out how much money is left in the ATM machine. So I could parse the results and figure out the total at data collection time or just store it raw and make sense of it later.
Voicemail - This is similar to the ATM machine where it's interactive. It's just a bit weirder because the key sequences/commands aren't "get key". But essentially it's the same problem and solution.
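A rough sketch of "speaking ATM" for the interactive cases; the host, port, command list, and reply format are all assumptions based on the example transcript above:

import re, socket

BILL_KEYS = ["BILLS_1", "BILLS_5", "BILLS_10", "BILLS_20"]   # the full question list
REPLY = re.compile(r"\[(.+?)\]:\s*(\d+)")                    # e.g. "[$1 bills]: 7"

def query_atm(host, port):
    counts = {}
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw")
        for key in BILL_KEYS:
            stream.write(f"GET {key}\n")      # ask each question in turn
            stream.flush()
            match = REPLY.search(stream.readline())
            if match:
                counts[key] = int(match.group(2))
    return counts

The voicemail system would be the same loop with different commands ("7,9*", "7,0*") and a different reply regex.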
Future Proof
Now what if I was going to give you an unknown device? Like a refrigerator. Or a toaster. Or anything? You'd have to write "connectors" ahead of time or write a parser afterwards against some raw field you stored earlier. Maybe in the case of these very limited examples there's no alternative. There's no way to future-proof. You just have to understand the new device and parse it at collection or parse it after the fact (your stored blob/object/document).
I was thinking that all these systems are text driven so maybe you could create a line iterator type abstraction layer that simply requires the device to split out lines. Then you could have a text processing piece that parses based on rules. For the ATM device, you'd have to write something that "speaks ATM" and turns it into lines which the iterator would then take care of. At this point, hopefully you'd be able to say "I can handle anything that has lines of text".
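One possible shape for that, sketched in Python: the device side only has to yield lines, and the parsing piece is just a per-device list of (pattern, field) rules:

import re

def parse_lines(lines, rules):
    """Apply regex rules to any iterable of lines, returning extracted fields."""
    record = {}
    for line in lines:
        for pattern, field in rules:
            match = re.search(pattern, line)
            if match:
                record[field] = match.group(1)
    return record

PRINTER_RULES = [
    (r"^has_paper:(\w+)", "has_paper"),
    (r"^jobs:(\d+)", "jobs"),
    (r"^ink:(\w+)", "ink"),
]

# Any source works as long as it yields lines: an open file, a generator over a
# captured telnet session, or a wrapper that "speaks ATM" and emits lines.
# parse_lines(open("printer_dump.txt"), PRINTER_RULES)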
But then what will you call these rules for parsing the text? "Printer rules" might as well be called "printer parser" which is the same to me as "printer transform". Is there a better term for all of this?
I apologize for this question being so open ended. :)
When your sources of information are as disparate as what you illustrate, you have no choice but to implement the Transform in order to bring the items into a common data repository. Usually your data sources won't be this extreme; the data will all be related in some way, but you may be retrieving it from different sources (some might come from a nicely structured database, some more might come from an Excel, XML, or text file, some more might come from a web service call, etc.).
When coding up a custom ETL application, a common pattern is the Provider model. This lets you write a whole bunch of custom providers to load/query and then transform the data. All the providers implement a common interface with some relatively common function definitions (for example QueryData(), TransformData()), but the implementation of those methods will be wildly different depending on the data source being dealt with; the interface just gives you a common way to deal with all the different providers. You can then use an XML configuration file to dictate which providers to run and any other initial settings they may require. Tools like SSIS abstract this stuff away for you by giving you a nice visual designer, but you can still get down and dirty and write your own code which it calls.
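A bare-bones sketch of that provider pattern in Python (the class and method names are illustrative, not a real library):

from abc import ABC, abstractmethod

class Provider(ABC):
    @abstractmethod
    def query_data(self):
        """Fetch raw data from the source (socket, file, web service, ...)."""

    @abstractmethod
    def transform_data(self, raw):
        """Turn the raw payload into the common record format."""

class PrinterProvider(Provider):
    def query_data(self):
        return "has_paper:true\njobs:0\nink:low"     # stubbed network call

    def transform_data(self, raw):
        return dict(line.split(":", 1) for line in raw.splitlines() if ":" in line)

PROVIDERS = {"printer": PrinterProvider}              # normally driven by a config file

def run(enabled):
    for name in enabled:                              # e.g. ["printer", "atm"]
        provider = PROVIDERS[name]()
        yield name, provider.transform_data(provider.query_data())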
Now what if I was going to give you an unknown device? Like a refrigerator. Or a toaster.
No problem, I would just write a new provider, which can sit in its very own assembly (DLL), so it can be shipped (or modified, upgraded, etc.) in isolation from any other providers I already have. Or if I was using SSIS then I would write a new DTS package.
I was thinking that all these systems are text driven so maybe you could create a line iterator type abstraction layer ... Then you could have a text processing piece that parses based on rules.
Absolutely - you can have a base class containing common functionality which several different providers can implement, and each provider can use its own set of rules which could be coded into it or they can be contained in an external configuration file.
So I could parse the results and figure out the total at data collection time or just store it raw and make sense of it later.
Use whichever approach makes sense for the data you are grabbing. It is also quite common for an ETL process to dump its data into a staging area (like some staging tables in a database) while the data is all being aggregated and accumulated, and then further process it to link related data and perform calculations. In the case of your ATM it may not be necessary to calculate a cash balance at ETL time because you can easily calculate it at any time in the future.

How to test-drive software that uses external command-line tools

I'm trying to figure out how to test-drive software that launches external processes which take file paths as input and, after lengthy processing, write output to stdout or to some file. Are there common patterns for writing tests in this kind of situation? It is hard to create fast-executing tests that verify correct usage of external tools without launching the actual tools in the tests and inspecting the results.
You could memoize (http://en.wikipedia.org/wiki/Memoization) the external processes. Write a wrapper in Ruby that computes the md5 sum of the input file and checks it against a database of known checksums. If it matches one, copy over the right output; otherwise, invoke the tool normally.
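A sketch of that memoizing wrapper, in Python rather than Ruby; the cache layout and the tool's command line ("the-real-tool") are placeholders:

import hashlib, shutil, subprocess
from pathlib import Path

CACHE = Path("tool_cache")

def run_tool(input_path, output_path):
    digest = hashlib.md5(Path(input_path).read_bytes()).hexdigest()
    cached = CACHE / digest
    if cached.exists():
        shutil.copy(cached, output_path)              # replay the known result
        return
    subprocess.run(["the-real-tool", input_path, "-o", output_path], check=True)
    CACHE.mkdir(exist_ok=True)
    shutil.copy(output_path, cached)                  # record it for next time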
Test right up to your boundaries. In your case, the boundary is the command-line that you construct to invoke the external program (which you can capture by monkey patching). If you're gluing yourself in to that program's stdout (or processing its result by reading files) that's another boundary. The test is whether your program can process that "input".
The 90%-case answer would be to mock the external command-line tools and verify that the right input is being passed to them at the dividing interface between the two. This helps keep the test suite fast. Also, you shouldn't have to bring in the command-line tools, since they are not 'your code under test'; doing so introduces the possibility that the unit test could fail either due to changes in your code or due to some change in behavior of the command-line utility.
But it seems like you're having trouble defining the 'right input', in which case optimizations like memoization (as Dave suggests) might give you the best of both worlds.
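For the mocking half, a sketch using Python's unittest.mock; the module under test (mytool), its convert function, and the ffmpeg command line are all invented for the example:

from unittest import mock
import mytool   # hypothetical module under test; it is assumed to call subprocess.run

def test_convert_builds_expected_command():
    with mock.patch("mytool.subprocess.run") as fake_run:
        mytool.convert("input.wav", "output.mp3")
        fake_run.assert_called_once_with(
            ["ffmpeg", "-i", "input.wav", "output.mp3"],   # the boundary under test
            check=True,
        )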
Assuming the external programs are well-tested, you should just test that your program is passing the correct data to them.
I think you are getting into a common issue with unit testing, in that correctness is really determined by whether the integration works, so how does the unit test help you?
The basic answer is that the unit test verifies that the parameters you intend to pass to the command-line tool are in fact getting passed that way, and that the results you anticipate getting back are in fact processed the way you intend to process them.
Then there is a second level of tests, which may or may not be automated (preferably they are, but it depends on whether that is practical), at the functional level, where the real utilities are called so that you can see that what you intend to pass and what you anticipate getting back match what actually happens.
There would also be nothing wrong with a set of tests which "tests" the external tools (which perhaps run on a different schedule, or only when you upgrade those tools) which establish your assumptions, passing in the raw input and asserting that you get back the raw output. That way if you upgrade the tool you can catch any behavior changes which may affect you.
You have to decide if those last set of tests are worthwhile or not. It very much depends on the tools involved.
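A sketch of what that last level might look like with pytest; the tool name, its flags, and the expected output marker are placeholders for whatever assumptions your code actually relies on:

import subprocess, pytest

@pytest.mark.external_tool     # custom marker: run on upgrades/nightly, not every commit
def test_tool_still_behaves_as_assumed(tmp_path):
    sample = tmp_path / "sample.txt"
    sample.write_text("raw input\n")
    result = subprocess.run(
        ["the-real-tool", str(sample)],        # placeholder command line
        capture_output=True, text=True, check=True,
    )
    assert "expected marker" in result.stdout   # pin down the behaviour you depend on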
