passing rather huge arguments to ruby script, problems? - ruby

ruby somescript.rb somehugelonglistoftextforprocessing
Is this a bad idea? Should I instead create a separate flat file containing the somehugelonglistoftextforprocessing and let somescript.rb read it?
Does it matter if the script argument is very long text (1KB~300KB)? What problems can arise, if any?

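As long as the limits of your command-line handling code (e.g., bash or ruby itself) and of the operating system are not exceeded, you should have no technical problems in doing this. Those limits are within reach at the sizes you mention: Linux, for example, typically caps a single argument at 128 KiB, so a 300KB argument would be rejected with "Argument list too long".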
Whether it's a good idea is another matter. Do you really want to have to type in a couple of hundred kilobytes every single time you run your program? Do you want to have to remember to put quotes around your data if it contains spaces?
There are a number of ways I've seen this handled which you may want to consider (this list is by no means exhaustive):
Change your code so that, if there are no arguments, it reads the information from standard input. This will allow you to do either
ruby somescript.rb myData
or
ruby somescript.rb <myFile.txt
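Use a special leading character to indicate file input (I've seen # used in this way, although most shells treat an unquoted # as the start of a comment, so the argument has to be quoted; @ is a common alternative that avoids this). So,
ruby somescript.rb myData
would use the data supplied on the command line whilst
ruby somescript.rb "#myFile.txt"
would get the data from the file.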
My advice would be to use the file-based method for that size of data and allow an argument to be used if specified. This covers both possible scenarios:
Lots of data, put it in a file so you won't have to retype it every time you want to run your command.
Not much data, allow it to be passed as an argument so that you don't have to create a file for something that's easier to type in on the command line.
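As a rough sketch of how the options above could be combined in one script: the helper name read_input and the FILE_MARKER constant are made up for illustration and are not part of the original answer. It assumes the # marker described above, which needs quoting on the command line.

```ruby
# Hypothetical helper (not from the original answer): decide where the
# script's data comes from, given the argument list and an input stream.
FILE_MARKER = "#" # assumed marker character; quote it in most shells

def read_input(args, stdin = $stdin)
  # No arguments at all: fall back to standard input
  # (ruby somescript.rb <myFile.txt).
  return stdin.read if args.empty?

  first = args.first
  if first.start_with?(FILE_MARKER)
    # Marker form: the rest of the argument is taken as a file name.
    File.read(first.delete_prefix(FILE_MARKER))
  else
    # Plain form: the arguments themselves are the data.
    args.join(" ")
  end
end
```

Inside somescript.rb this would be called as data = read_input(ARGV). Taking the stream as a parameter is only there to make the helper easy to exercise without a real terminal.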

Related

pasting input to command line with multiple carriage returns

I'm new to ruby and have written a program that takes several lines of data (actually a JSON) and converts it into a table in the command line. Everything works fine with the JSON data embedded in the program but I would like to have it prompt the user to paste the data into the command line. I know about gets and chomp, but since a JSON is formatted with multiple lines/carriage returns, when I paste in the command line it takes each line as a separate entry. I feel like the answer is simple but I'm having a hard time finding info online. I just want it to take everything I paste all at once and ignore all carriage returns.
Any suggestions?
If you're taking input via the console, $stdin in technical parlance, and you want to receive multiple, independent multi-line objects you'll need to have some kind of delimiter.
This could be as simple as one or more blank lines between each JSON object, or it could be a marker like END or --.
It depends on how your JSON data is formatted, as blank lines within a JSON object are valid, yet are not normally emitted by most JSON generators.
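A minimal sketch of the delimiter idea, assuming a line containing only END closes each pasted object. The helper name read_until_marker is hypothetical, and the marker is just one of the choices mentioned above.

```ruby
require "json"

# Hypothetical helper: collect lines from `io` until a line that consists
# only of `marker`, then return everything read so far as one string.
def read_until_marker(io, marker = "END")
  lines = []
  io.each_line do |line|
    break if line.chomp == marker # delimiter reached: stop, drop the marker
    lines << line
  end
  lines.join
end
```

Called as JSON.parse(read_until_marker($stdin)), this accepts a multi-line paste followed by END on its own line. Calling it again picks up the next object from the same stream.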
Don't forget that the UNIX model strongly encourages you to be able to do things like this:
processor < input.json
Or things like this:
processor *.json
Where you can receive multiple files via ARGV and process those sequentially. That avoids a lot of this mess.
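One way this might look in Ruby, as a sketch only: load_documents is a hypothetical helper that assumes each file (or the whole of standard input) holds exactly one JSON document.

```ruby
require "json"

# Hypothetical helper: parse one JSON document per source. The sources are
# the file names in `paths`, or `fallback` (standard input) if none are given.
def load_documents(paths, fallback = $stdin)
  # Reading to end-of-file keeps a multi-line document in one piece.
  return [JSON.parse(fallback.read)] if paths.empty?

  paths.map { |path| JSON.parse(File.read(path)) }
end
```

With load_documents(ARGV), both processor < input.json and processor *.json work. When the data is pasted into a terminal instead, the read ends once end-of-file is signalled (Ctrl-D on Unix-like systems, Ctrl-Z then Enter on Windows). Ruby's built-in ARGF offers similar files-or-stdin behaviour if the files do not need to be kept apart.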

How to reference and run a python document from the python interpreter

I just want to be able to run a python script from the interpreter, so that I can work on my changes to my script in notepad or other editor, save, and then interactively test changed code in the python interpreter.
Also, IDLE is not a solution. I'm operating on a government computer that is blocking the port it uses to communicate interaction between console and module.
To clear up any confusion, here's a demonstration of what I'm trying to do:
So, how do I do it?
EDIT:
Okay so I found a statement that does what I want. exec(open('dir').read()). The problem I think is that the directory I want to refer to contains periods. But I'm sure this will work, because open('dir').read() produces a string of the contents of a document specified, as long as I reference the likes of C:\myTest.py, and exec() obviously runs strings as input. So how can I reference files from the location I want?
Okay so the problem seems to be that Windows paths often contain what python sees as escape sequences. They start with \ and are followed by a character (\n, \t and \U are among them), and there are enough of them to use up a good part of the alphabet. There are a few solutions but only one is worth a damn for this application. I came across a prefix that can be put on string literals: r (or R if you prefer, interestingly) placed immediately before the opening quote tells the interpreter to take the backslashes in the string 'literally' and not treat them as the start of an escape.
One would think that the quotes would be enough to express this, but they aren't and I'll probably eventually find out why. But for now, here's the answer to my question. I hope someone else finds it useful:
In plain text: >>> exec(open(R'C:\Users\First.Last\Desktop\myScript.py').read())

How to split a large csv file into multiple files in GO lang?

I am a novice Go programmer trying to learn Go's features. I want to split a large csv file into multiple files in Go, with each file containing the header. How do I do this? I have searched everywhere but couldn't find the right solution. Any help in this regard will be greatly appreciated.
Also please suggest me a good book for reference.
Thanking You
Depending on your shell fu this problem might be better suited for common shell utilities but you specifically mentioned go.
Let's think through the problem.
How big is this csv file? Are we talking 100 lines or is it 5G ?
If it's smallish I typically use this:
http://golang.org/pkg/io/ioutil/#ReadFile
However, this package also exists:
http://golang.org/pkg/encoding/csv/
Regardless - let's return to the abstraction of the problem. You have a header (which is the first line) and then the rest of the document.
So what we probably want to do (if ignoring csv for the moment) is to read in our file.
Then we want to split the file body by all the newlines in it.
You can use this to do so:
http://golang.org/pkg/strings/#Split
You didn't mention but do you know how many files you want to split by or would you rather split by the line count or byte count? What's the actual limitation here?
Generally it's not going to be file count but if we pretend it is we simply want to divide our line count by our expected file count to give lines/file.
Now we can take slices of the appropriate size and write the file back out via:
http://golang.org/pkg/io/ioutil/#WriteFile
A trick I sometimes use to help me think through these things is to write down the mission statement.
"I want to split a large csv file into multiple files in go"
Then I start breaking that up into pieces but take the divide/conquer approach - don't try to solve the entire problem in one go - just break it up to where you can think about it.
Also - make gratuitous use of pseudo-code until you can comfortably write the real code itself. Sometimes it helps to just write a short comment inline with how you think the code should flow and then get it down to the smallest portion that you can code and work from there.
By the way - many of the golang.org packages have example links where you can literally run in your browser the example code and cut/paste that to your own local environment.
Also, I know I'll catch some haters with this - but as for books - imo - you are going to learn a lot faster just by trying to get things working rather than reading. Action trumps passivity always. Don't be afraid to fail.
Here is a package that might help. You can set the chunk size you need in bytes, and the file will be split into the appropriate number of chunks.

Writing hash information to file and reloading it automatically on program startup?

I wrote a little program that creates a hash called movies. Then I can add, update, delete, and display all current movies in the hash by typing the title.
Instead of having it start a new hash each time, I want it to save anything added to a file and, when a movie is updated or deleted, update or delete the key/value pair in the file. I also want the program to auto-load the file on startup and create it if it doesn't exist.
I have no idea how to go about doing this.
After reading a lot of the comments I have decided that maybe I should do this with SQL instead, seems like a much better approach!
You can't store Ruby objects directly on the disk; you will first need to convert them to some sequence of bytes (i.e. a string). This is called serialization, and there are several different ways to do it and several different formats the data could be in. I think I would recommend JSON, but you might also want to try YAML or Marshal.
Any of those libraries will allow you to convert your hash into a string and allow you to convert that same string back into a hash. Then you can use Ruby's File class to save and load that string from the disk.
This should get you pointed in the right direction. From here you can search for more specific things like "how do I convert a hash to JSON" or "how do I write a string to a file".
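As a minimal sketch of that round trip using JSON: the file name movies.json and the helper names load_movies and save_movies are invented for illustration. Note that JSON turns symbol keys into strings, so a hash keyed by symbols comes back keyed by strings.

```ruby
require "json"

# Hypothetical file name, for illustration only.
MOVIES_FILE = "movies.json"

# Load the saved hash, or start with an empty one if no file exists yet.
def load_movies(path = MOVIES_FILE)
  return {} unless File.exist?(path)

  JSON.parse(File.read(path))
end

# Serialize the whole hash to a string and write it to disk,
# creating the file if needed and replacing any previous contents.
def save_movies(movies, path = MOVIES_FILE)
  File.write(path, JSON.pretty_generate(movies))
end
```

The program would call movies = load_movies once at startup and save_movies(movies) after every add, update, or delete. Rewriting the whole file each time is the simplest approach and is fine for a small hash.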
You can serialize your data in a few ways.
YAML or JSON both ship with Ruby's standard library and only need a require. There is also the built-in Marshal.
RI tells us:
Marshal
(from ruby site)
-----------------------------------------------------------------------------
The marshaling library converts collections of Ruby objects into a
byte stream, allowing them to be stored outside the currently active
script. This data may subsequently be read and the original objects
reconstituted.
Marshaled data has major and minor version numbers stored along with
the object information. In normal use, marshaling can only load data
written with the same major version number and an equal or lower minor
version number. If Ruby's ``verbose'' flag is set (normally using -d,
-v, -w, or --verbose) the major and minor numbers must match exactly.
Marshal versioning is independent of Ruby's version numbers. You can
extract the version by reading the first two bytes of marshaled data.
And I will leave it at that for Marshal. But there is a bit more documentation there.
You can also use IO#puts to write to a file, and then modify that file to load later, which I use sometimes for config settings. Why use YAML or another external source, when Ruby is easy enough to have a user modify? You use YAML when it needs to be more generally accessible, as the Tin Man points out.
For example this file is the sample file, but is intended for interactive editing (with constraints, of course) but it is simply valid Ruby. And it gets read by a Ruby program, and is a valid object (in this case a Hash stored in a constant.)

What exactly is going on in Proc::Background?

I am trying to write a script that automates other perl scripts. Essentially, I have a few scripts that rollup data for me and need to be run weekly. I also have a couple that need to be run on the weekend to check things and email me if there is a problem. I have the email worked out and everything but the automation. Judging by an internet search, it seems as though using Proc::Background is the way to go. I tried writing a very basic script to test it and can't quite figure it out. I am pretty new to Perl and have never automated anything before (other than through windows task scheduler), so I really don't understand what the code is saying.
My code:
use Proc::Background;
$command = "C:/strawberry/runDir/SendMail.pl";
my $proc1 = Proc::Background -> new($command);
I receive an error that says no executable program located at C:... Can someone explain to me what exactly the code (Proc::Background) is doing? I will then at least have a better idea of how to accomplish my task and debug in the future. Thanks.
I did notice on Proc::Background's documentation the following:
The Win32::Process module is always used to spawn background processes
on the Win32 platform. This module always takes a single string
argument containing the executable's name and any option arguments.
In addition, it requires that the absolute path to the executable is
also passed to it. If only a single argument is passed to new, then
it is split on whitespace into an array and the first element of the
split array is used at the executable's name. If multiple arguments
are passed to new, then the first element is used as the executable's
name.
So, it looks like it requires an executable, which a Perl script would not be, but "perl.exe" would be.
I typically specify the "perl.exe" in my Windows tasks as well:
C:\dwimperl\perl\bin\perl.exe "C:\Dropbox\Programming\Perl\mccabe.pl"
