I'm executing the command "grep bruno < bash.txt", which gives me the expected output "bruno" along with garbage: "\f0\fs24 \cf0".
I'm in the command shell on Mac OS X v10.6.8, and I'm pretty sure I should be getting just the line containing the word I searched for, not garbage.
This is the Output:
Mobile-Devs-MacBook-Pro:Screenshots Poupe mdev$ grep bruno < bash.txt
\f0\fs24 \cf0 bruno\
In bash.txt I have only written "bruno", but if I output it with "cat bash.txt" I also get the following garbage:
Mobile-Devs-MacBook-Pro:Screenshots Poupe mdev$ cat bash.txt
{\rtf1\ansi\ansicpg1252\cocoartf1038\cocoasubrtf360
{\fonttbl\f0\fswiss\fcharset0 Helvetica;}
{\colortbl;\red255\green255\blue255;}
\paperw11900\paperh16840\margl1440\margr1440\vieww9000\viewh8400\viewkind0
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\ql\qnatural\pardirnatural
\f0\fs24 \cf0 bruno\
If I run "echo bruno > bash.txt" and then "cat bash.txt", I get clean output. Why am I not seeing clean output when I write the file by hand?
Your file isn't a plain text file. It is RTF. grep is giving you the line containing "bruno", along with the rich text formatting.
When you do:
echo bruno > bash.txt
bash.txt contains only "bruno".
When you "edit the file by hand", your editor is saving as RTF. You need to save as plain text.
That isn't a plain text file; it looks like RTF. grep works line by line, and its job is to output the entire line on which the search text is found.
I cannot tell from your formatting, but I have to believe the "garbage" you are seeing is on the same line as the "bruno" text.
As others have pointed out, the problem is that the file is in RTF format, and contains formatting information. If you want to create a plain text file in TextEdit, use the menu option Format > Make Plain Text before saving it. Better yet, don't use TextEdit at all -- my favorite for plain text editing is TextWrangler, but there are plenty of other options.
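A quick way to confirm this diagnosis from the shell is to look at the first few bytes of the file, since RTF documents begin with the signature {\rtf. The sketch below assumes a POSIX-style shell; is_rtf is a hypothetical helper name, not a standard command.

```shell
# is_rtf FILE: succeed if FILE starts with the RTF signature "{\rtf".
# (is_rtf is a hypothetical helper name used only for this illustration.)
is_rtf() {
  [ "$(head -c 5 "$1")" = '{\rtf' ]
}

# Example use:
#   if is_rtf bash.txt; then echo "bash.txt is RTF, not plain text"; fi
#
# On macOS, textutil can convert an RTF file to plain text, for example:
#   textutil -convert txt -output bash-plain.txt bash.txt
# (bash-plain.txt is just an example output name.)
```

After converting or re-saving as plain text, grep should print only the line that was actually typed.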
I'm creating a README file using Bash. When adding description in the file, I want the text to appear as 2 paragraphs. How can I create a line break after para one? I tried "\n" but nothing happened.
Continuing from my comments. What you want, to be able to write formatted blocks of text out to a file (or to the terminal, /dev/stdout), is a heredoc. A heredoc writes the lines out exactly as formatted between an opening and a closing tag. (EOF is traditionally used, but it can be anything you like.) The form is:
cat << EOF
Your text goes here
and here
and here, etc...
EOF
If you want to write to a file, then use cat >filename << EOF as the opening. If you have variables in your text that you do not want expanded (e.g. you want $myvar written out as $myvar and not as what it holds), quote the opening tag, e.g. 'EOF'.
In your case if you want to write to a filename from within your script, then just use the form above. You can use default initialization to write to the terminal if no filename is given as an argument to your script, e.g.
#!/bin/bash
fname="${1:-/dev/stdout}" # set filename to write to (stdout by default)
# heredoc
cat >"$fname" << EOF
My dog has fleas and my cat has none. Lucky cat. My snake has
scales and can't have fleas. Lucky snake.
If the animals weren't animals could they still have fleas?
EOF
If called with no argument, the heredoc is printed to the terminal (/dev/stdout). If given a filename, then the heredoc output is redirected to the filename, e.g.
$ bash write-heredoc.sh README
Fills the README file with the heredoc contents, e.g.
$ cat README
My dog has fleas and my cat has none. Lucky cat. My snake has
scales and can't have fleas. Lucky snake.
If the animals weren't animals could they still have fleas?
You can include blank lines as you like. If you want to append to your README file using multiple heredocs, then just use cat >>filename << EOF to append instead of truncate.
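To tie this back to the original question about two paragraphs, here is a minimal sketch in which the blank line inside the heredoc becomes the paragraph break in the file. The function name write_readme and the paragraph text are placeholders, and the tag is quoted so that $HOME is written literally rather than expanded.

```shell
# write_readme FILE: write two paragraphs to FILE, separated by a blank line.
# (write_readme and the paragraph text are illustrative placeholders.)
write_readme() {
  cat > "$1" << 'EOF'
This is the first paragraph of the description.

This is the second paragraph. The tag is quoted, so $HOME is not expanded.
EOF
}
```

Calling write_readme README produces a three-line file whose second line is empty, which is what renders as a paragraph break.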
I created a text file named hello_world.rtf containing only the following two lines:
Hello
World
and I am trying to read that file with the bash script below, run from the terminal:
while test= read -r line; do
    echo "The text read from file is: $line"
done < hello_world.rtf
and it returns the following:
The text read from file is: {\rtf1\ansi\ansicpg1252\cocoartf1671\cocoasubrtf500
The text read from file is: {\fonttbl\f0\fswiss\fcharset0 Helvetica;}
The text read from file is: {\colortbl;\red255\green255\blue255;}
The text read from file is: {\*\expandedcolortbl;;}
The text read from file is: \paperw12240\paperh15840\margl1440\margr1440\vieww10800\viewh8400\viewkind0
The text read from file is: \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0
The text read from file is:
The text read from file is: \f0\fs24 \cf0 Hello\
Any suggestions as to what is wrong here and how I can get a clean result?
RTF means Rich Text Format. It is a text-formatting language developed and used mostly by Microsoft, and it has not been actively developed for some time.
The text inside the file looks as you can see in the output of your code. It contains the words "Hello" and "World" but also formatting instructions.
Save the file as plain text, not RTF, and it will contain only the text you typed in it.
test= in front of read does not have any effect in this context. You can remove it.
Make sure the last line of the file ends with a newline character. read returns a non-zero exit status (which means false) when it reaches the end of the file, so your code exits the while loop without displaying the last value that read stored in line. If the file ends with a newline character, that final value is empty, so nothing is lost.
It is a recommended practice for text files to always end with a newline character.
Alternatively you can print the value of line again after the loop. It contains the last line of the file (from the last end-of-line character until the end of file).
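A common way to handle both cases inside the loop itself is to add a second test to the loop condition. The sketch below assumes a POSIX-style shell; print_lines is a hypothetical helper name, and the message text is copied from the question.

```shell
# print_lines FILE: print every line of FILE, including a final line that
# has no trailing newline. (print_lines is a hypothetical helper name.)
print_lines() {
  # read fails at end of file, but "$line" still holds any unterminated
  # text, so the extra test lets the loop body run one last time.
  while IFS= read -r line || [ -n "$line" ]; do
    printf '%s\n' "The text read from file is: $line"
  done < "$1"
}
```

With this condition the loop prints the last line whether or not the file ends with a newline, so no separate print after the loop is needed.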
This is a common issue I have and my solution is a bit brash. So I'm looking for a quick fix and explanation of the problem.
The problem is that when I save a spreadsheet in Excel (Mac 2011) as a tab-delimited file, it seems to work perfectly fine, until I try to parse the file line by line using Perl. For some reason Perl slurps the whole document as one line.
My brutish solution is to open the file in a web browser and copy and paste the information into the tab delimited file in TextEdit (I never use rich text format). I tried introducing a newline in the end of the file before doing this fix and it does not resolve the issue.
What's going on here? An explanation would be appreciated.
~Thanks!~
The problem is the actual character codes that define new lines on different systems. Windows systems commonly use a CarriageReturn+LineFeed (CRLF), *NIX systems use only a LineFeed (LF), and classic Mac OS used only a CarriageReturn (CR).
These characters can be represented in RegEx as \r\n, \n, and \r (respectively).
Sometimes, to hash through a text file, you need to parse New Line characters. Try this for DOS-to-UNIX in perl:
perl -pi -e 's/\r\n/\n/g' input.file
or, for UNIX-to-DOS using sed:
$ sed 's/$'"/`echo \\\r`/" input.txt > output.txt
or, for DOS-to-UNIX using sed (type the ^M as Ctrl-V followed by Ctrl-M, not as a caret and an M):
$ sed 's/^M$//' input.txt > output.txt
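The commands above target CRLF endings. If the exported file instead uses bare CR endings (an assumption, but one that would explain Perl reading the whole file as a single line), it helps to check which bytes are present before converting. The helper names count_eol and cr_to_lf below are hypothetical.

```shell
# count_eol FILE: report how many CR and LF bytes FILE contains.
# (count_eol and cr_to_lf are hypothetical helper names.)
count_eol() {
  cr=$(tr -cd '\r' < "$1" | wc -c | tr -d ' ')
  lf=$(tr -cd '\n' < "$1" | wc -c | tr -d ' ')
  echo "CR=$cr LF=$lf"
}

# cr_to_lf FILE: write FILE to stdout with every CR turned into LF.
# Only appropriate for CR-only files; on a CRLF file it would produce
# an extra blank line after each record.
cr_to_lf() {
  tr '\r' '\n' < "$1"
}
```

A result such as CR=120 LF=0 from count_eol would indicate CR-only endings, in which case cr_to_lf input.txt > output.txt gives a file that can be read line by line.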
Found a pretty simple solution to this. Copy the data from Excel to the clipboard and paste it into a Google spreadsheet. Download the Google spreadsheet as "tab-separated values (.tsv)". This gets around the problem, and you have tab delimiters with an end of line for each line.
Yet another solution ...
For a tab-delimited file, save the document as the "Windows Formatted Text (.txt)" file type.
For a comma-separated file, save the document as the "Windows Comma Separated (.csv)" file type.
Perl has a useful regex pattern \R which will match any common line ending. It actually matches any vertical whitespace -- the same as \v -- or the CR LF combination, so it's the same as \r\n|\v.
This is useful here because you can slurp your entire file into a single scalar and then split /\R/, which will give you a list of file records, already chomped (if you want to keep the line terminators you can split /\R\K/ instead).
Another option is the PerlIO::eol module. It provides a new Perl IO layer that will normalize line endings no matter what the contents of the file are.
Once you have loaded the module with use PerlIO::eol, you can use it in an open statement:
open my $fh, '<:eol(LF)', 'myfile.tsv' or die $!;
or you can use the open pragma to set it as the default layer for all input file handles:
use open IN => ':raw:eol(LF)';
which will work fine with an input file from any platform.
I have a bash script that runs and outputs to a text file. However, the colour codes it uses are also included in the file, and I'd like to know how to remove them, i.e.
^[[38;1;32mHello^[[39m
^[[38;1;31mUser^[[39m
So I just want to be left with Hello and User; something like sed -r "special characters" from file A, saved to file B.
sed 's/\^\[\[[^m]*m//g'
This removes every part of a line that starts with ^[[ and runs up to the first m that follows.
Something like this:
awk '{sub(/\^\[\[38;1;[0-9][0-9]m/,x);sub(/\^\[\[39m/,x)}1'
Hello
User
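One caveat: the patterns above match a literal caret followed by two brackets. If the file contains real escape bytes, which many tools display as ^[, the pattern needs the ESC character itself. A minimal sketch under that assumption, using a hypothetical helper named strip_ansi:

```shell
# strip_ansi: read stdin and remove ANSI colour (SGR) sequences of the
# form ESC [ digits-and-semicolons m. (strip_ansi is a hypothetical name.)
strip_ansi() {
  esc=$(printf '\033')              # the real ESC byte, shown as ^[ by many tools
  sed "s/${esc}\[[0-9;]*m//g"
}
```

Usage would look like strip_ansi < fileA > fileB, which leaves only the visible text in fileB.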
I am trying to pass a file into a program for data processing with bash, and I am wondering if I have the correct syntax:
/home/mumps/CS3150/Script/HW1/textfiles/CardioAndPulmonary.txt | /home/mumps/Medline2012/getDocs.mps > /home/mumps/CS3150/Scripts/HW1/textfiles/Titles.txt
The text files I am sending in are all valid and correctly formatted, but I am just getting back a file error from getDocs.mps. (I should note that getDocs does work properly, because it was something that my teacher passed out along with the Debian VDI, and other people aren't having an issue with it.)
getDocs does, however, call a text file that is also located in Medline2012, which is where I believe the error is coming from.
Or just use bash redirection throughout without cat.
/home/mumps/Medline2012/getDocs.mps < /home/mumps/CS3150/Script/HW1/textfiles/CardioAndPulmonary.txt > /home/mumps/CS3150/Scripts/HW1/textfiles/Titles.txt
or
~/Medline2012/getDocs.mps < ~/CS3150/Script/HW1/textfiles/CardioAndPulmonary.txt > ~/CS3150/Scripts/HW1/textfiles/Titles.txt
or even
< ~/CS3150/Script/HW1/textfiles/CardioAndPulmonary.txt ~/Medline2012/getDocs.mps > ~/CS3150/Scripts/HW1/textfiles/Titles.txt
You either need to cat your .txt file, to pass the contents of it to the script via the pipe,
cat /home/mumps/CS3150/Script/HW1/textfiles/CardioAndPulmonary.txt | /home/mumps/Medline2012/getDocs.mps > output
or, depending on what's in the script, it might need to go as a command line parameter, i.e.
/home/mumps/Medline2012/getDocs.mps /home/mumps/CS3150/Script/HW1/textfiles/CardioAndPulmonary.txt > output
You are trying to execute your data file and feed the results to your script.
Try:
cat /home/mumps/CS3150/Script/HW1/textfiles/CardioAndPulmonary.txt | /home/mumps/Medline2012/getDocs.mps > /home/mumps/CS3150/Scripts/HW1/textfiles/Titles.txt
If you are still having trouble, cd to the Medline2012 directory before you execute getDocs.mps. The reason is that getDocs.mps opens the osu.medline database when it runs, and the call in getDocs.mps does not include the path to osu.medline, so running it from another directory causes a "file error".
EDIT: A lot of people are telling you that you need to "cat", which is wrong. getDocs.mps has its own printing; if it didn't, it wouldn't be printing "file error" for you. I also saw that you said it is breaking after the loop. Did you test to make sure it isn't failing at the opening of the file? You can check by adding an identifying word between the quotes in the first printing of "file error", changing it to something like "file error 1." I realize you probably know that; I just like to be thorough.
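The advice to cd first can be wrapped in a small function so the caller's working directory is left alone. This is only a sketch: run_in_dir is a hypothetical helper, and it assumes the program opens its data file by a relative path, as described above.

```shell
# run_in_dir DIR CMD [ARGS...]: run CMD from inside DIR so that relative
# paths it opens resolve there. The subshell keeps the caller's working
# directory unchanged. (run_in_dir is a hypothetical helper name.)
run_in_dir() {
  dir=$1
  shift
  ( cd "$dir" && "$@" )
}

# Example use with the paths from the question:
#   run_in_dir ~/Medline2012 ./getDocs.mps \
#     < ~/CS3150/Script/HW1/textfiles/CardioAndPulmonary.txt \
#     > ~/CS3150/Scripts/HW1/textfiles/Titles.txt
```

The redirections are set up by the calling shell before the function changes directory, so the input and output paths are resolved from wherever the command was typed.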