I am a new Ruby programmer. Can someone give an example of opening a file in r+, w+, and a+ mode in Ruby? What is the difference between these modes and r, w, and a?
Please explain, and provide an example.
The file open modes are not really specific to Ruby; they are part of IEEE Std 1003.1 (Single UNIX Specification). You can read more about them here:
http://pubs.opengroup.org/onlinepubs/009695399/functions/fopen.html
r or rb
Open file for reading.
w or wb
Truncate to zero length or create file for writing.
a or ab
Append; open or create file for writing at end-of-file.
r+ or rb+ or r+b
Open file for update (reading and writing).
w+ or wb+ or w+b
Truncate to zero length or create file for update.
a+ or ab+ or a+b
Append; open or create file for update, writing at end-of-file.
Any mode that contains the letter 'b' opens the file as a binary file. If the 'b' is not present, the file is treated as a 'plain text' file.
The difference between 'open' and 'open for update' is indicated as:
When a file is opened with update mode ( '+' as the second or third character in the mode argument), both input and output may be performed on the associated stream. However, the application shall ensure that output is not directly followed by input without an intervening call to fflush() or to a file positioning function ( fseek(), fsetpos(), or rewind()), and input is not directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file.
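As a minimal sketch of how the three update modes behave in Ruby, the following runs each one against the same scratch file. The helper name demo_update_modes and the file name modes.txt are made up for illustration. Each block calls rewind between writing and reading, which is the "file positioning function" the specification asks for.

```ruby
require 'tmpdir'

# Hypothetical helper: exercises w+, r+ and a+ on one scratch file and
# returns what each mode reads back after writing.
def demo_update_modes(path)
  results = {}

  # w+ truncates (or creates) the file, then allows reading it back.
  File.open(path, 'w+') do |f|
    f.write("hello\n")
    f.rewind                 # reposition before switching from output to input
    results[:w_plus] = f.read
  end

  # r+ keeps the existing content; writing starts at offset 0 and overwrites in place.
  File.open(path, 'r+') do |f|
    f.write('J')
    f.rewind
    results[:r_plus] = f.read
  end

  # a+ keeps the existing content; every write lands at end-of-file.
  File.open(path, 'a+') do |f|
    f.write("world\n")
    f.rewind
    results[:a_plus] = f.read
  end

  results
end

Dir.mktmpdir do |dir|
  p demo_update_modes(File.join(dir, 'modes.txt'))
end
```

With plain r, w, or a, the same calls would fail on the unsupported direction: reading from a file opened with w or a, or writing to one opened with r, raises IOError. Note also that r and r+ raise Errno::ENOENT if the file does not exist, whereas the w and a families create it.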
Related
I'm developing software that stores its data in a binary file format. However, as a courtesy to innocent shell users who might use cat to inspect the contents of such a file, I'm thinking of putting an ASCII-compatible "magic string" at the start of the file that states the name and version of the binary format.
I'm thinking of having at least ten lines (\n) in the message so that head, with its default settings, doesn't reach the binary part.
Now, I wonder if there is any control character or escape code that would hint to the shell that the following content isn't interpretable as printable text, and should be just ignored? I tried 0x00 (the null byte) and 0x04 (ctrl-D) but they seem to be just ignored when catting the file.
cat does not interpret the file's contents; it writes out every byte. There is no way to trigger an end-of-file from within the file, since EOF is not actually a character.
The other way around works, of course: specify a format whose reader only starts treating the data as binary after a certain character.
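A rough sketch of that idea in Ruby, assuming the text header never contains a NUL byte, so the first NUL can serve as the delimiter. The delimiter choice, the method names, and the "MYFORMAT 1.0" banner are all hypothetical, not part of any real format.

```ruby
require 'stringio'

# Hypothetical delimiter: the first NUL byte ends the human-readable header.
HEADER_END = "\x00".b

# Writes the text banner, the delimiter, then the binary payload.
def write_container(io, banner, payload)
  io.write(banner.b)
  io.write(HEADER_END)
  io.write(payload.b)
end

# Returns [banner, payload]. Everything after the first delimiter is treated
# as binary, so the payload itself may contain NUL bytes.
def read_container(io)
  banner = io.gets(HEADER_END) # reads up to and including the delimiter
  raise ArgumentError, 'missing header delimiter' unless banner&.end_with?(HEADER_END)
  [banner.chomp(HEADER_END), io.read]
end

# Usage with an in-memory buffer standing in for a file opened in binary mode.
io = StringIO.new(String.new)
write_container(io, "MYFORMAT 1.0\n" * 10, [0, 1, 255].pack('C*'))
io.rewind
text, data = read_container(io)
puts text.lines.first
p data.bytes
```

This does not stop cat from printing the binary part; it only makes the boundary unambiguous for a program that reads the format.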
I am looking for a way to check if a PDF is missing its end-of-file marker. So far I have found I can use the pdf-reader gem and catch the MalformedPDFError exception, or of course I could simply open the whole file and check whether it ends with the EOF marker. I need to process lots of potentially large PDFs and I want to load as little into memory as possible.
Note: all the files I want to detect will be lacking the EOF marker, so I feel like this is a more specific scenario than detecting general PDF "corruption". What is the best, fastest way to do this?
TL;DR
Looking for %%EOF, with or without related structures, is relatively speedy even if you scan the entirety of a reasonably-sized PDF file. However, you can gain a speed boost if you restrict your search to the last kilobyte, or the last 6 or 7 bytes if you simply want to validate that %%EOF\n is the only thing on the last line of a PDF file.
Note that only a full parse of the PDF file can tell you if the file is corrupted, and only a full parse of the File Trailer can fully validate the trailer's conformance to standards. However, I provide two approximations below that are reasonably accurate and relatively fast in the general case.
Check Last Kilobyte for File Trailer
This option is fairly fast, since it only looks at the tail of the file, and uses a string comparison rather than a regular expression match. According to Adobe:
Acrobat viewers require only that the %%EOF marker appear somewhere within the last 1024 bytes of the file.
Therefore, the following will work by looking for the file trailer instruction within that range:
def valid_file_trailer?(filename)
  File.open(filename, 'rb') do |f|
    f.seek(-[f.size, 1024].min, IO::SEEK_END) # clamp so files under 1 KB don't raise Errno::EINVAL
    f.read.include?('%%EOF')
  end
end
A Stricter Check of the File Trailer via Regex
However, the ISO standard is both more complex and a lot more strict. It says, in part:
The last line of the file shall contain only the end-of-file marker, %%EOF. The two preceding lines shall contain, one per line and in order, the keyword startxref and the byte offset in the decoded stream from the beginning of the file to the beginning of the xref keyword in the last cross-reference section. The startxref line shall be preceded by the trailer dictionary, consisting of the keyword trailer followed by a series of key-value pairs enclosed in double angle brackets (<< … >>) (using LESS-THAN SIGNs (3Ch) and GREATER-THAN SIGNs (3Eh)).
Without actually parsing the PDF, you won't be able to validate this with perfect accuracy using regular expressions, but you can get close. For example:
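def valid_file_trailer?(filename)
  # Anchored at the end of the file: startxref, a byte offset, then %%EOF as the last line.
  pattern = /^startxref\n\d+\n%%EOF\n\z/m
  # scrub replaces invalid byte sequences so the match does not raise on binary content.
  File.open(filename) { |f| !!(f.read.scrub =~ pattern) }
end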
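The two approaches can be combined: apply a trailer pattern to only the tail of the file, so that large PDFs are never read in full. The following is a sketch rather than a validator. The name trailer_in_tail? and the 1024-byte default window are assumptions for illustration, and the pattern is deliberately looser than the ISO wording: it accepts any whitespace (LF, CR, or CRLF) between the tokens and an optional end-of-line after %%EOF.

```ruby
# Hypothetical helper: checks for "startxref <offset> %%EOF" at the very end
# of the file, reading at most `window` bytes.
def trailer_in_tail?(filename, window = 1024)
  tail = File.open(filename, 'rb') do |f|
    # Clamp the offset so files smaller than the window are read from the start.
    f.seek(-[f.size, window].min, IO::SEEK_END)
    f.read
  end
  # \s+ tolerates LF, CR or CRLF line endings; \s*\z allows a trailing end-of-line.
  !!(tail =~ /startxref\s+\d+\s+%%EOF\s*\z/)
end
```

Because the file is opened in binary mode and the pattern is pure ASCII, no scrub call is needed here. A true result only means the tail looks like a file trailer; it says nothing about whether the byte offset is correct.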
Using the Read from Text File function, I am able to easily read the first line of my file. However, I now want it to read the second line. It would be great to just use a for loop or something similar if I could specify the line number somewhere. Is there a way to do so? Thanks!
First, you can read the entire file as lines by right-clicking on the Read From Text File node and selecting "Read Lines". One read will return an array containing one element for each line and you can work with the lines with regular array handling methods. If you want to read each line individually, you can by wiring a 1 into the Count input and looping. Each iteration will return an array with one element (the current line read). You can get/set the offset (in bytes) to specify where in the file you want to read, but that's not necessary if I read your question correctly.
I am using SAS's FILE statement to output a text file with a fixed record format (RECFM=F).
I would like each row to end in an end-of-line control character (or characters), such as linefeed/carriage return. I tried the FILE statement's option TERMSTR=CRLF, but I still see no end-of-line control characters in the output file.
I think I could use the PUT statement to insert the desired linefeed and carriage return control characters, but would prefer a cleaner method. Is it a reasonable thing to expect of the FILE statement? (Is it a reasonable expectation for outputting fixed format data?)
(Platform: Windows v6.1.7600, SAS for Windows v9.2 TS Level 2M3 W32_VSPRO platform)
Do you really need to use RECFM=F? You can still get fixed length output with V:
data _null_;
  file 'c:\temp\test.txt' lrecl=12 recfm=V;
  do i=1 to 5;
    x=rannor(123);
    put @1 i @4 x 6.4;  /* @n moves the pointer to column n */
  end;
run;
By specifying the columns where you want the data to go (@1 and @4) and the format (6.4), along with lrecl, you will get fixed-length output.
There may be a work-around, but I believe SAS won't output a line-ending with the Fixed format.
Using Ruby, I am reading a file line by line, using IO.gets to incrementally read the next line of the file. Under certain circumstances I want to do the opposite (look at the previous line by decrementing). The way I tried to accomplish this was...
IO.lineno = int
IO.gets
It seems that no matter what I set "lineno" to equal it still just reads the next line when I follow up by calling "gets". How should I go about reading previous lines in the file?
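Setting lineno only changes the line counter that Ruby reports; it does not move the read position, so gets still returns the next line. Instead, use
IO.readlines("myfile")
This returns the file as an array of strings, which you can then iterate over by index. A stream has no built-in way to go back one line.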
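As a small sketch of both options, the first method below loads every line into an array so that any earlier line is reachable by index, and the second streams the file while remembering only the previous line, which may suit large files better. The method names are made up for illustration.

```ruby
# Hypothetical helper: returns [previous_line, current_line] pairs using an
# in-memory array, so lines[i - 1] (or any other index) is always available.
def lines_with_previous(path)
  lines = File.readlines(path).map(&:chomp)
  lines.each_with_index.map do |line, i|
    previous = i.zero? ? nil : lines[i - 1]
    [previous, line]
  end
end

# Hypothetical helper: streams the file and yields the previous line alongside
# the current one, keeping only one earlier line in memory.
def stream_with_previous(path)
  previous = nil
  File.foreach(path) do |raw|
    line = raw.chomp
    yield previous, line
    previous = line
  end
end
```

For example, stream_with_previous("myfile") { |prev, cur| puts "#{prev.inspect} -> #{cur}" } prints each line next to the one before it, with nil as the predecessor of the first line.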