Can't properly print file in Bash

I'm trying to echo the contents of the file at this link, and it exhibits what is, to me, bizarre behavior.
git#gud:/home/git$ URL="https://raw.githubusercontent.com/fivethirtyeight/data/master/births/US_births_1994-2003_CDC_NCHS.csv"
git#gud:/home/git$ content=$(wget $URL -q -O -)
git#gud:/home/git$ echo $content
2003,12,31,3,12374_month,day_of_week,births
I expected this code to print the contents as I see them when I open the link in a browser. But instead, the output, in its entirety, is 2003,12,31,3,12374_month,day_of_week,births; that's it.
I actually see this behavior locally as well, after downloading the file. I tried it both with curl and by simply copying and pasting into a text editor and saving the file; they all exhibit the same behavior. The same happens with cat, cut, head, tail, and even awk.
This doesn't happen with other files, and it works fine in Python. What am I missing? How do I get it to work?
I realize that the file doesn't end with a newline character, but adding one doesn't fix it.
I'm on Ubuntu 18.04.1 LTS and the shell I'm using is Bash 4.4.19(1).

The data file uses Mac-style end-of-line markers (carriage return only). When you echo the content, or just cat the file, the lines all print over each other. If you were to view the file with less or vim, you would see the complete content.
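You can reproduce the overprinting effect in isolation with printf: the carriage return sends the cursor back to column one, so the second string overwrites the first:
$ printf 'first line here\rSECOND\n'
SECONDline here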
Try this:
$ URL="https://raw.githubusercontent.com/fivethirtyeight/data/master/births/US_births_1994-2003_CDC_NCHS.csv"
$ curl -o data.csv "$URL"
The wc command thinks that the file has zero lines (wc -l counts newline characters, and the file contains none):
$ wc -l data.csv
0 data.csv
Now let's translate those end-of-line markers:
$ tr '\r' '\n' < data.csv > data-modified.csv
wc now sees a more reasonable number of lines:
$ wc -l data-modified.csv
3652 data-modified.csv
And if we were to cat the file:
$ cat data-modified.csv
.
.
.
2003,12,28,7,7645
2003,12,29,1,12823
2003,12,30,2,14438
2003,12,31,3,12374
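Equivalently, you can convert the line endings on the fly without the intermediate file, using the same tr invocation in the download pipeline (assuming the URL variable from the question):
$ wget "$URL" -q -O - | tr '\r' '\n' > data-modified.csv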

Related

How can I pipe output into another command?

I have a script located at /usr/local/bin/gq, which is returned by the command whereis gq (well, almost: what is actually returned is gq: /usr/local/bin/gq). The following gives me just the file path (with some leading whitespace):
whereis gq | cut -d ":" -f 2
What I'd like to do is be able to pipe that into cat, so I can see the contents. However, an ordinary pipe isn't working. Any suggestions?
If you want to cat the contents of gq, then how about:
cat "$(which gq)"
The command which gq will result in /usr/local/bin/gq, and the cat command will act on that.
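If you specifically want to keep the pipeline, one option is xargs, which turns the filename arriving on stdin into a command-line argument for cat (its word splitting also discards the leading whitespace left behind by cut):
$ whereis gq | cut -d ":" -f 2 | xargs cat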

Writing a Bash script that takes a text file as input and pipes the text file through several commands

I keep text files with definitions in a folder. I like to convert them to spoken word so I can listen to them. I already do this manually by running a few commands to insert some pre-processing codes into the text files and then convert the text to spoken word like so:
sed 's/\..*$/[[slnc 2000]]/' input.txt inserts a control code after the first period
sed 's/$/[[slnc 2000]]/' input.txt inserts a control code at the end of each line
cat input.txt | say -v Alex -o input.aiff
Instead of having to retype these each time, I would like to create a Bash script that pipes the output of these commands to the final product. I want to call the script with the script name, followed by an input file argument for the text file. I want to preserve the original text file so that if I open it again, none of the control codes are actually inserted, as the only purpose of the control codes is to insert pauses in the audio file.
I've tried writing
#!/bin/bash
FILE=$1
sed 's/$/ [[slnc 2000]]/' FILE -o FILE
But I get hung up immediately as it says sed: -o: No such file or directory. Can anyone help out?
If you just want to use foo.txt to generate foo.aiff with control characters, you can do:
#!/bin/sh
for file; do
  # Skip arguments that don't end in .txt
  test "${file%.txt}" = "$file" && continue
  sed -e 's/\..*$/[[slnc 2000]]/' "$file" |
    sed -e 's/$/[[slnc 2000]]/' |
    say -v Alex -o "${file%.txt}.aiff"
done
Call the script with your .txt files as arguments (e.g., ./myscript *.txt) and it will generate the .aiff files. Be warned: if say overwrites files, then this will as well. You don't really need two sed invocations, and the sed you're calling can be cleaned up, but I don't want to distract from the core issue here, so I'm leaving it as you have it.
This will:
a) Make a list of your text files to process in the current directory, with find.
b) Apply your sed commands to each text file in the list, but only for the current use, allowing you to preserve the originals intact.
c) Call say with the edited text.
I don't have say, so I can't test that or the control codes; but as long as you have ed, the loop works. I've used it many times. I learned it as a result of exposure to FORTH, a language that still permits unterminated loops. I used to have problems with remembering to invoke next at the end of the script in order to start it, but I got over that by defining my words (functions) first, in FORTH style, and then always placing my single-use commands at the end.
#!/bin/sh
next() {
  # Keep going while the stack file still has entries.
  [ -s stack ] && main
  end
}
main() {
  # Read the first line of the stack: the next filename to process.
  line=$(ed -s stack < edprint+.txt)
  # Apply the control-code edits to a copy of the text, in memory only.
  infile=$(sed 's/\..*$/[[slnc 2000]]/' "$line" | sed 's/$/[[slnc 2000]]/')
  say "${infile}" -v Alex -o input.aiff
  # Pop the processed filename off the stack.
  ed -s stack < edpop+.txt
  next
}
end() {
  # Clean up the work files.
  rm -v ./stack
  rm -v ./edprint+.txt
  rm -v ./edpop+.txt
  exit 0
}
find *.txt -type f > stack
cat >> edprint+.txt << EOF
1
q
EOF
cat >> edpop+.txt << EOF
1d
wq
EOF
next

What is wrong with this sed command?

I am facing a strange problem. An answer to what I want to do already exists here. I am trying to remove trailing commas from each line of a file containing thousands of lines.
This is my command:
sed -i 's/,*$//g' file_name.csv
However, the output I get is exactly the same as the input, and the trailing commas are not removed.
I think sed is not matching the pattern and is thus failing to replace the commas. To check whether there are any hidden characters in the file, I used Vim's :set list option.
It shows only a $ at the end of each line, which is just what is expected.
I can't understand why the command is failing.
I can suggest two options.
The first one is my favorite:
dos2unix file
# works for huge files as well
then try running your command again.
The other way to do this:
cat file | tr -d '\r' > file
# careful: the redirection truncates file before cat reads it, so this can empty the file
A safer variant writes to a temporary file first:
tr -d '\r' < file > file.tmp ; mv file.tmp file
# works for huge files as well
then run your command.
Thanks to Nahuel for suggesting the last command.
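If you'd rather fix it without a separate conversion step, GNU sed understands \r, so you can strip the carriage returns and the trailing commas in one pass (this assumes GNU sed, and converts the file to Unix line endings as a side effect):
sed -i -e 's/\r$//' -e 's/,*$//' file_name.csv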

wc output differs inside/outside vim

I'm working on a text file that contains normal text with LaTeX-style comments (lines starting with a %). To determine the non-comment word count of the file, I was running this command in Bash:
grep -v "^%" filename | wc -w
which returns about the number of words I would expect. However, if from within vim I run this command:
:r! grep -v "^%" filename | wc -w
it outputs a word count that includes the comments, and I cannot figure out why.
For example, with this file:
%This is a comment.
This is not a comment.
Running the command from outside vim returns 5, but opening the file in vim and running the :r! version prints 9.
I also was having issues getting vim to prepend a "%" to the command's output, but if the output is wrong anyways, that issue becomes irrelevant.
The % character is special in vi: on the command line, it is replaced with the name of the current file.
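You can watch the substitution happen for yourself: this reads the output of echo into the buffer, and what appears is the current file's name rather than a literal %:
:r! echo %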
Try this:
:r! grep -v "^\%" filename | wc -w
Same as before but backslash-escaping the %. In my testing just now, your example :r! command printed 9 as it did for you, and the above printed 5.

Unix: How can I prepend output to a file?

Specifically, I'm using a combination of >> and tee in a custom alias to store new Homebrew updates in a text file, as well as output on screen:
alias bu="echo `date "+%Y-%m-%d at %H:%M"` \
>> ~/Documents/Homebrew\ Updates.txt && \
brew update | tee -a ~/Documents/Homebrew\ Updates.txt"
Question: What if I wish to prepend this output in my textfile, i.e. placed at the beginning of the file as opposed to appending it to the end?
Edit 1: As someone suggested in the answers below, using temp files might be a good approach, and it at least partially worked for me:
targetLog="~/Documents/Homebrew\ Updates.txt"
alias bu="(brew update | cat - $targetLog \
> /tmp/out1 && mv /tmp/out1 $targetLog \
&& echo `date "+%Y-%m-%d at %H:%M":%S` | \
cat - $targetLog > /tmp/out2 \
&& mv /tmp/out2 $targetLog)"
But the problem is the output to stdout (previously provided by tee), which I'm not sure can be incorporated into this temp-file approach …?
sed will happily do that for you, using -i to edit in place, e.g.:
sed -i -e "1i `date "+%Y-%m-%d at %H:%M"`" some_file
This works by creating an output file:
Let's say we have the initial contents in file.txt:
echo "first line" > file.txt
echo "second line" >> file.txt
So, file.txt is our 'bottom' text file. Now prepend into a new 'output' file:
echo "add new first line" | cat - file.txt > output.txt # <--- Just this command
Now, output.txt has the contents the way we want. If you need your old name back:
mv output.txt file.txt
cat file.txt
The only simple and safe way to modify an input file using shell tools is to use a temp file; for example, sed -i uses a temp file behind the scenes (though to be really robust, sed needs more than that).
Some of the methods used here have a subtle trap: when you run your command on a symbolic link (to the file you intend to modify) rather than on the real data file, and the method doesn't cater for that, it can break the link. The link is converted into a regular file which receives the modifications, leaving the original file without the intended modifications and without the symlink, and with no error exit code to tell you anything went wrong.
To avoid this with sed, you need to use the --follow-symlinks option.
For other methods, just be aware that they need to follow symlinks when you act on such a link.
Using a temp file and then renaming it over the original also works only if "file" is not a symlink.
One safe way is to use sponge from package moreutils
"Unlike a shell redirect, sponge soaks up all its input before opening the output file. This allows for constructing pipelines that read from and write to the same file."
sponge is a good general way to handle this type of situation.
Here is an example, using sponge
hbu=~/'Documents/Homebrew Updates.txt'
{ date "+%Y-%m-%d at %H:%M"; cat "$hbu"; } | sponge "$hbu"
Simplest way IMO would be to use echo and cat:
echo "Prepend" | cat - inputfile > outputfile
Or, for your example, basically replace the tee -a ~/Documents/Homebrew\ Updates.txt with cat - ~/Documents/Homebrew\ Updates.txt > ~/Documents/Homebrew\ Updates.txt
Edit: As stated by hasturkun, this won't work; try:
echo "Prepend" | cat - file | tee file
But this is still racy (tee truncates the file while cat may not have finished reading it), and it isn't the most efficient way of doing it...
Similar to the accepted answer; however, if you are coming here because you want to prepend to the first line, rather than prepend an entirely new line, then use this command.
sed -i "1 s/^/string_replacement/" some_file
The -i flag does the replacement within the file (rather than writing out a new file).
The 1 restricts the replacement to line 1.
Finally, the s command is used, which has the syntax s/find/replacement/flags; in our case we don't need any flags. The ^ (caret) matches the very start of the line, so the replacement is inserted before the existing text.
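A quick demonstration on a plain stream (no -i, so nothing on disk is modified):
$ echo "world" | sed "1 s/^/hello /"
hello world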
Try this: http://www.unix.com/shell-programming-scripting/42200-add-text-beginning-file.html
There is no direct operator or command, AFAIK. You use echo, cat, and mv to get the effect.
{ date; brew update | tee /dev/tty; cat updates.txt; } > updates.txt.new
mv updates.txt.new updates.txt
I've no idea why you want to do this. It's pretty standard that logs like this have later entries appearing, well, later in the file.
