What does \n\r mean?

When reading from a pseudo-terminal via Java, I'm seeing "\n\r" in the text. What does that represent? Note it's not "\r\n", which I'm familiar with.

\n is a line feed (ASCII code 10), \r is a carriage return (ASCII code 13).
Different operating systems use different combinations of these characters to represent the end of a line of text. Unix-like operating systems (Linux, Mac OS X) usually use only \n. MS-DOS and Windows use \r\n (carriage return, followed by a line feed).
The code you're using uses \n\r (line feed, carriage return). There are operating systems that use that sequence, but probably it's a mistake and it should have been \r\n.
See Newline on Wikipedia.
If you're programming in Java and you want to know what the newline sequence is for the operating system that your program is running on, you can get the system property line.separator:
String newline = System.getProperty("line.separator");
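To check which sequences actually occur in captured text, a small tally can help. The following is a minimal sketch in Ruby; `terminator_counts` and the sample string are hypothetical names made up for illustration, not part of any library. It tries the two-character sequences before the single characters, so an ambiguous run such as "\n\r\n" is counted as "\n\r" followed by "\n".

```ruby
# Hypothetical helper: tallies which line-terminator sequences occur in a string.
def terminator_counts(text)
  counts = Hash.new(0)
  # Alternation order matters: the two-character sequences are tried first,
  # so "\r\n" and "\n\r" are not split into two single-character matches.
  text.scan(/\r\n|\n\r|\n|\r/) { |seq| counts[seq] += 1 }
  counts
end

# Made-up sample containing one of each sequence.
sample = "one\n\rtwo\r\nthree\nfour\r"

p "\n".ord  # 10 (line feed)
p "\r".ord  # 13 (carriage return)
p terminator_counts(sample)
```

For the sample above, each of the four sequences is counted exactly once.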

Related

How is '\x1A' special in Windows?

Reading some references [1, 2], I learned that the modifier b in the second argument of fopen(3) has no effect on POSIX systems, while on Windows it prevents special handling of \n and \x1A (see below).
I know well how \n (LF) is special on Windows, since text files use CRLF for line breaks (i.e. printf("\n") actually prints \r\n), but how is \x1A (SUB) special?
fopen("D:\\foo.txt", "rb");
                       ^
\x1A is Ctrl+Z, which used to be used as the end-of-file marker in MS-DOS (maybe even as far back as CP/M).
The Microsoft documentation makes no mention of Ctrl+Z under the "b" mode (only under the "t" mode), so this could be cargo cult programming. I don't have a Windows box handy right now, so I can't easily check.

Do I need to add CHAR(13) in put_line() statement in Oracle to use fflush()?

I've read that to use the fflush() function in Oracle, every line in the output should end with a newline character. Will put_line() automatically add the newline character that fflush() needs in order to work?
Which newline character does fflush() need (\r\n, \n, or does it depend on the OS)? And which newline character (\r\n, \n, or OS-dependent) does put_line() add, if it adds one at all?
Yes, put_line() adds the required new line character(s). From the documentation for put_line():
This procedure writes the text string stored in the buffer parameter to the open file identified by the file handle. The file must be open for write operations. PUT_LINE terminates the line with the platform-specific line terminator character or characters.
That's really the difference between put() and put_line():
No line terminator is appended by PUT; use NEW_LINE to terminate the line or use PUT_LINE to write a complete line with a line terminator.
It's slightly confusing that the description of fflush() refers to just "a newline character" while put_line() refers to "line terminator character or characters", but they do mean the same thing: for fflush() to flush it, the buffered data must end with the operating-system line terminator character(s).
Note that it means the database server's operating system, not your client operating system, since utl_file (and all PL/SQL) is on the server and doesn't know anything about the client environment. It's generally safer to use put_line() or new_line() than to manually add \n or \r\n; even if you know the OS your database is running on now, it may move to a different OS one day.

Why does carriage return come before new line

There are lots of questions asking what the correct order of the carriage return and new line characters is on Windows (it's \r\n) but I have not found any real explanation as to why this is the case.
\n is the new line character, and \r is carriage return. So, if you have \r first, which returns the cursor to the beginning of the current line - and then \n afterwards, wouldn't that logically insert the \n at the beginning of the current line and just move the current line down one instead of creating a line after?
I mean I understand that when simply writing these to a file it doesn't really matter, but when parsing/reading and outputting the text, it seems backwards to me.
The order is a homage to the typewriter days.
Early mechanical printers were too slow to return the carriage in the time it took to process one character. Therefore the time spent sending the line feed was not wasted (often several more characters had to be sent to ensure the carriage return had happened before sending a printing character). This is why the carriage return was always sent first.
Link: http://en.wikipedia.org/wiki/Carriage_return

Other encodings for line breaks?

I have some database records in a Rails app that I'm trying to export to CSV, avoiding line breaks in the values. I run something to the effect of this:
File.open(new_file_path, 'w+') do |f|
  bio = c.biography.gsub("\n", "->")
  f.print "\"#{bio}\","
end
And I see results like this:
"Katcho Achadjian was first elected to the Assembly in 2010.
->
->Prior to being elected to the Legislature, Achadjian served as a member of the San Luis Obispo County Board of Supervisors from 1998 to 2010.
->
->Achadjian graduated from Cal Poly San Luis Obispo with a bachelor’s degree in business administration. Achadjian and his wife have two adult children and reside in Arroyo Grande.",
You'll notice that the substituted character sequence appears with line breaks as well. Is there another encoding for line break that I'm somehow missing?
It depends on the platform.
On Windows you'll have \r\n (carriage return, new line).
On Linux, OS X, and other Unix-like systems, it is just \n (new line).
On Classic Mac OS (up to version 9) it is just \r (carriage return).
Probably you are using Windows. The best way to deal with cross-platform text is this:
substitute \r\n with \r
substitute \r with \n
This way you'll end up with just \n, whatever platform the text originated from. You also get uniform line endings, which matters because some editors do not touch unmodified lines, so on saving you can end up with mixed line endings, which are even worse than this.
Try modifying your gsub to this:
gsub(/\r?\n/, "->")
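The two substitution steps above can be sketched as follows. This is only an illustration: `normalize_newlines`, `normalize_newlines_two_step`, and the sample text are hypothetical, not something from the question. The single pattern /\r\n?/ is an assumed shortcut that covers both "\r\n" and a lone "\r" in one pass. Note that /\r?\n/ on its own leaves a lone "\r" untouched, so normalizing first may be the safer order if Classic Mac line endings are possible.

```ruby
# Hypothetical helper: the two steps from the answer, applied literally.
def normalize_newlines_two_step(text)
  text.gsub("\r\n", "\r").gsub("\r", "\n")
end

# Hypothetical helper: the same result in one pass.
# /\r\n?/ matches "\r\n" as well as a lone "\r"; a lone "\n" is left as it is.
def normalize_newlines(text)
  text.gsub(/\r\n?/, "\n")
end

# Made-up sample mixing Windows, Classic Mac, and Unix line endings.
sample = "first\r\n\r\nsecond\rthird\nfourth"

normalized = normalize_newlines(sample)
flattened = normalized.gsub("\n", "->")

puts flattened  # first->->second->third->fourth
```

Once the text contains only "\n", the original gsub("\n", "->") from the question works unchanged.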

Universal newline support in Ruby that includes \r (CR) line endings

In a Rails app, I'm accepting and parsing CSV files that may come formatted with any of three possible line termination characters: \n (LF), \r\n (CR+LF), or \r (CR). Ruby's File and CSV libraries seem to handle the first two cases just fine, but the last case ("Mac classic" \r line endings) isn't handled as a newline. It's important to be able to accept this format as well as the others, since Microsoft Excel for Mac (running on OS X) seems to use it when exporting to "Comma Separated Values" (although exporting to "Windows Comma Separated" produces the easier-to-handle \r\n).
Python has "universal newline support" and will handle any of these three formats without a problem. Is there something similar in Ruby that will accept all three without knowing the format in advance?
You could use :row_sep => :auto:
:row_sep
The String appended to the end of each row. This can be set to the special :auto setting, which requests that CSV automatically discover this from the data. Auto-discovery reads ahead in the data looking for the next "\r\n", "\n", or "\r" sequence.
There are some caveats of course, see the manual linked to above for details.
You could also manually clean up the EOLs with a bit of gsubing before handing the data to CSV for parsing. I'd probably take this route and manually convert all \r\ns and \rs to single \ns before attempting to parse the CSV. OTOH, this won't work that well if there is embedded binary data in your CSV where \rs mean something. On the gripping hand, this is CSV we're dealing with so who knows what sort of crazy broken nonsense you'll end up dealing with.
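Both routes can be sketched side by side. This is a minimal illustration that assumes the standard csv library is available and uses a made-up, CR-only sample. The pattern /\r\n?/ is one possible way to do the manual cleanup, not the only one. As noted above, the cleanup also rewrites any "\r" inside quoted fields.

```ruby
require "csv"

# Made-up sample using "Mac classic" CR-only line endings.
raw = "name,city\rAda,London\rLinus,Helsinki\r"

# Route 1: let CSV discover the row separator from the data.
rows_auto = CSV.parse(raw, row_sep: :auto)

# Route 2: convert every "\r\n" and lone "\r" to "\n" before parsing.
cleaned = raw.gsub(/\r\n?/, "\n")
rows_clean = CSV.parse(cleaned)

p rows_auto
p rows_clean
```

With this sample both routes should yield the same three rows, so the choice mostly comes down to whether the data may contain meaningful "\r" characters inside fields.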
