Bash: replace specific newline with space - bash

I have numerous files with extension .awesome containing lines like the following:
something =
[51,42,12]
Where something = is in all the files, as well as [ (the numbers vary).
I would like to get rid of the newline, but don't know how. I came across tr, but worry it would replace all newlines. My files contain multiple newlines that I would like to retain (I only want to change this one). I've been able to find and replace successfully with sed in the past, but I am having trouble specifically with the special characters (\n and =). In addition, I'm reading that sed works line by line and cannot handle something like this.
Any guidance would be appreciated.

GNU sed solution:
Sample test.awesome file contents:
some text
another text
something =
[51,42,12]
text
text
The job:
sed '/something =/{N; s/\n/ /;}' test.awesome
The output:
some text
another text
something = [51,42,12]
text
text
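Why this works: N appends the next input line to the pattern space, so the newline between the two lines becomes visible to the s command. Below is a slightly more guarded sketch, assuming the line to join always ends in "something =" and the following line starts with "[". The scratch directory, the sample data, and the $!N guard (skip N on the last line) are additions for illustration, not part of the answer above.

```shell
# Build a small sample file in a scratch directory (hypothetical data).
workdir=$(mktemp -d)
printf '%s\n' 'some text' 'something =' '[51,42,12]' 'text' > "$workdir/test.awesome"

# On lines ending in "something =", pull in the next line (unless this is
# the last line) and replace the newline only when "[" follows it.
joined=$(sed '/something =$/{$!N; s/\n\[/ [/;}' "$workdir/test.awesome")
printf '%s\n' "$joined"
```

All other newlines in the file are left alone, because the substitution only runs on the two-line pattern space built for the matching line.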

Related

Programmatically delete all text between 2 characters in osx terminal

I have a thousand txt files
1.txt
2.txt
3.txt
In each file, tags appear several times among my text:
{somethinghere...blablabla} then the text I want to keep, then again {somethinghere...blablabla}
I'm not very practiced with the Mac OS X command line. Can someone help me write a command that opens each file, parses it, and deletes all text enclosed between "{" and "}"?
To be clear:
First I need to open each file, then parse the text. When the loop finds a "{" it starts deleting until it finds a "}". When done parsing, it saves and closes the file. That's what I need to do.
$ sed -i.bak -e 's#{[^}]*}##g' *.txt
-i.bak makes a backup copy of each modified file. If you don't want backups, on OS X use -i '' (an empty suffix passed as a separate argument); on Linux with GNU sed, plain -i is enough.
In substitutions, the delimiter can be a character other than /. Here I chose #, so: s#<REGEX>#<REPLACEMENT># (the basic form of a substitution is s///).
In the regex, we search for a literal {, then any characters that are not } with [^}]; * means 0 or more occurrences. Finally, we match the closing } and replace the matched part with nothing, which deletes it.
The g modifier at the end means to replace all matches, not only the first one on each line.
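A quick way to try the command safely is to run it on a throwaway copy first. The sketch below uses a scratch directory and a made-up sample file (both are assumptions for illustration) and then checks that the backup still holds the original text.

```shell
# Create a scratch directory with one hypothetical sample file.
workdir=$(mktemp -d)
printf '%s\n' '{tag one}keep this{tag two} and this' > "$workdir/1.txt"

# Delete every {...} group in place, keeping a .bak copy of each file.
sed -i.bak -e 's#{[^}]*}##g' "$workdir"/*.txt

cleaned=$(cat "$workdir/1.txt")
backup=$(cat "$workdir/1.txt.bak")
printf 'cleaned: %s\nbackup:  %s\n' "$cleaned" "$backup"
```

Because [^}]* cannot cross a newline-free "}" boundary, each tag is removed separately and the text between two tags survives.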

Excel saves tab delimited files without newline (UNIX/Mac os X)

This is a common issue I have and my solution is a bit brash. So I'm looking for a quick fix and explanation of the problem.
The problem is that when I decide to save a spreadsheet in excel (mac 2011) as a tab delimited file it seems to do it perfectly fine. Until I try to parse the file line by line using Perl. For some reason it slurps the whole document in one line.
My brutish solution is to open the file in a web browser and copy and paste the information into the tab delimited file in TextEdit (I never use rich text format). I tried introducing a newline in the end of the file before doing this fix and it does not resolve the issue.
What's going on here? An explanation would be appreciated.
~Thanks!~
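The problem is the actual character codes that define new lines on different systems. Windows systems commonly use a CarriageReturn+LineFeed (CRLF) and *NIX systems use only a LineFeed (LF). Classic Mac OS used a lone CarriageReturn (CR); a file with CR-only line endings contains no LF at all, which would explain why Perl reads the whole document as one line.
These characters can be represented in RegEx as \r\n, \n, or \r (respectively).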
Sometimes, to hash through a text file, you need to parse New Line characters. Try this for DOS-to-UNIX in perl:
perl -pi -e 's/\r\n/\n/g' input.file
or, for UNIX-to-DOS using sed:
$ sed 's/$'"/`echo \\\r`/" input.txt > output.txt
or, for DOS-to-UNIX using sed (enter ^M as a literal carriage return by typing Ctrl-V then Ctrl-M):
$ sed 's/^M$//' input.txt > output.txt
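If the file turns out to use lone CR line endings rather than CRLF (an assumption worth checking first, for example with od -c), the CRLF commands above will not match anything. A minimal sketch for that case, using tr on a simulated file (the file names and contents are made up for illustration):

```shell
workdir=$(mktemp -d)
# Simulate a two-row tab-delimited export that ends each row with a lone CR.
printf 'a\tb\rc\td\r' > "$workdir/mac.tsv"

# Count LF characters before the conversion.
before=$(tr -cd '\n' < "$workdir/mac.tsv" | wc -c | tr -d ' ')

# Turn every CR into an LF, writing to a new file.
tr '\r' '\n' < "$workdir/mac.tsv" > "$workdir/unix.tsv"

# Count LF characters after the conversion.
after=$(tr -cd '\n' < "$workdir/unix.tsv" | wc -c | tr -d ' ')
printf 'LF count before: %s, after: %s\n' "$before" "$after"
```

Do not use this on CRLF files: each CRLF pair would become two line feeds, producing blank lines.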
Found a pretty simple solution to this. Copy data from Excel to clipboard, paste it into a google spreadsheet. Download google spreadsheet file as a 'tab-separated values .tsv'. This gets around the problem and you have tab delimiters with an end of line for each line.
Yet another solution ...
for a tab-delimited file, save the document as a Windows Formatted Text (.txt) file type
for a comma-separated file, save the document as a 'Windows Comma Separated (.csv)' file type
Perl has a useful regex pattern \R which will match any common line ending. It actually matches any vertical whitespace -- the same as \v -- or the CR LF combination, so it's the same as \r\n|\v
This is useful here because you can slurp your entire file into a single scalar and then split /\R/, which will give you a list of file records, already chomped (if you want to keep the line terminators you can split /\R\K/ instead).
Another option is the PerlIO::eol module. It provides a new Perl IO layer that will normalize line endings no matter what the contents of the file are.
Once you have loaded the module with use PerlIO::eol you can use it in an open statement
open my $fh, '<:eol(LF)', 'myfile.tsv' or die $!;
or you can use the open pragma to set it as the default layer for all input file handles
use open IN => ':raw:eol(LF)';
which will work fine with an input file from any platform

Remove colour code special characters from bash file

I have a bash script that runs and outputs to a text file; however, the colour codes it uses are also included. What I'd like to know is how to remove them from the file, i.e.
^[[38;1;32mHello^[[39m
^[[38;1;31mUser^[[39m
So I just want to be left with Hello and User; something like sed -r "special characters" from file A, saved to file B.
sed 's/\^\[\[[^m]*m//g'
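This removes every part of a line starting with the literal characters ^[[ up to the first m. Note that editors and pagers display the ESC byte as ^[, so if the file contains real escape bytes rather than a literal caret and bracket, the pattern has to match ESC instead.
Something like this: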
awk '{sub(/\^\[\[38;1;[0-9][0-9]m/,x);sub(/\^\[\[39m/,x)}1'
Hello
User
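If the file holds real escape bytes (which is what a script writing colours normally produces), a variant along these lines may be closer to what is needed. It is a sketch under that assumption: the sample file is generated here for illustration, and the ESC byte is produced with printf so that the sed expression stays portable.

```shell
workdir=$(mktemp -d)
# Write two coloured lines using real ESC bytes (octal 033).
printf '\033[38;1;32mHello\033[39m\n\033[38;1;31mUser\033[39m\n' > "$workdir/colored.txt"

# Capture a literal ESC byte, then strip ESC [ <digits and semicolons> m.
esc=$(printf '\033')
plain=$(sed "s/${esc}\[[0-9;]*m//g" "$workdir/colored.txt")
printf '%s\n' "$plain"
```

This only covers colour (SGR) sequences ending in m; other terminal control sequences would need a broader pattern.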

search a pattern in each line and append it at the end of that line

I have a file with the following entries:
folder1/a_b.csv folder1/generated/
folder2/folder3/a_b1.csv folder12/generated/
folder4/b_c.csv folder123/generated/
folder5/d.csv folder1/new_folder/generated/
folder6/12.csv folder/anotherfolder/morefolder/evenmorefolder/generated/
I want to copy the csv file name from each line, paste them at the end of that line and append it with ".org". Hence, the changed file would look like
folder1/a_b.csv folder1/generated/a_b.csv.org
folder2/folder3/a_b1.csv folder12/generated/a_b1.csv.org
folder4/b_c.csv folder123/generated/b_c.csv.org
folder5/d.csv folder1/new_folder/generated/d.csv.org
folder6/12.csv folder/anotherfolder/morefolder/evenmorefolder/generated/12.csv.org
Basically, I am looking for a command in vim or sed using which I can search a pattern in each line and append it at the end of that line. Is it possible?
Thanks in advance.
Vim
Here's how to do this in Vim:
:%s/\([^/]*\.csv\)\( .*\)/&\1.org/
This global (:%) substitution matches the filename (characters that don't contain /, ending in .csv), and captures \(...\) it. It then matches the rest of the line, and captures that, too.
As a replacement, first keep the original match & (or \0), then append the first capture (\1) with the additional suffix.
sed
Though the regular expression syntax is somewhat different than in Vim, the identical expression can be used with sed:
sed -e 's/\([^/]*\.csv\)\( .*\)/&\1.org/' input
Alternatives
It looks like you want to do file renaming in batches. On Linux, the mmv command-line tool is well suited for that; you'll probably find many similar tools on the web, too.
This might work for you (GNU sed):
sed -r 's|([^/ ]*) .*|&\1.org|' file
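To check the capture-and-append approach before touching the real file, the basic-regex sed command from the answer above can be run on a couple of the sample lines, including one with a nested source folder. The scratch directory and file name are assumptions for illustration.

```shell
workdir=$(mktemp -d)
# Two sample lines from the question, one with a nested source folder.
printf '%s\n' \
  'folder1/a_b.csv folder1/generated/' \
  'folder2/folder3/a_b1.csv folder12/generated/' > "$workdir/input"

# \1 captures the file name (no "/" allowed inside it); & keeps the
# matched text, and the captured name plus ".org" is appended after it.
result=$(sed -e 's/\([^/]*\.csv\)\( .*\)/&\1.org/' "$workdir/input")
printf '%s\n' "$result"
```

Only the last path component is appended, because [^/]* cannot match across a directory separator.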

bash templating

I have a template with a placeholder LINKS
and a data file, links.txt, with one URL per line.
How can I substitute LINKS with the content of links.txt in bash?
If I do
#!/bin/bash
LINKS=$(cat links.txt)
sed "s/LINKS/$LINKS/g" template.xml
I get two problems:
$LINKS has the content of links.txt without newlines
sed: 1: "s/LINKS/http://test ...": bad flag in substitute command: '/'
sed is not escaping the // in the links.txt file
Thanks
Use some better language instead. I'd write a solution for bash + awk... but that's simply too much effort to go into. (See http://www.gnu.org/manual/gawk/gawk.html#Getline_002fVariable_002fFile if you really want to do that)
Just use any language where you don't have to mix control and content text. For example in python:
#!/usr/bin/env python3
# Read the links and the template, then substitute the placeholder.
with open('links.txt') as f:
    links = f.read()
with open('template.xml') as f:
    template = f.read()
print(template.replace('LINKS', links))
Watch out if you're trying to force a sed solution with some other separator: you'll get into the same problems unless you find something disallowed in URLs (but are you verifying that?). If you aren't, you already have another problem: links can contain < and > and break your XML.
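You can do this using ed:
ed template.xml <<EOF
/LINKS/ka
'ar links.txt
'ad
w output.txt
EOF
The first command finds the line containing LINKS and marks it as a.
The second command reads the contents of links.txt into the buffer right after the marked line.
The third command deletes the marked placeholder line. (Deleting first and then reading at the current line would put the links one line too low, because after a delete the current line is the one that followed the deleted line.)
The last command writes the result to output.txt (if you omit output.txt the edits will be saved to template.xml).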
Try running sed twice. On the first run, replace / with \/. The second run will be the same as what you currently have.
The character following the 's' in the sed command is used as the separator, so you'll want to use a character that is not present in the value of $LINKS. For example, you could try a comma:
sed "s,LINKS,${LINKS}\n,g" template.xml
Note that I also added a \n to add an additional newline.
Another option is to escape the forward slashes in $LINKS, possibly using sed. If you don't have guarantees about the characters in $LINKS, this may be safer.
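The escaping idea can be sketched as follows. This is an illustration under assumptions, not a complete templating solution: it handles a single link (the first line of links.txt), the template and URL are made-up samples, and it escapes the three characters that are special in a sed replacement when / is the delimiter (backslash, / and &).

```shell
workdir=$(mktemp -d)
# Hypothetical template and link file.
printf '%s\n' '<links>' 'LINKS' '</links>' > "$workdir/template.xml"
printf '%s\n' 'http://test.example/a?x=1&y=2' > "$workdir/links.txt"

# Take one link and put a backslash in front of every backslash, "/" and "&".
# "|" is used as the delimiter here so the bracket expression stays simple.
link=$(head -n 1 "$workdir/links.txt")
escaped=$(printf '%s\n' "$link" | sed 's|[\\/&]|\\&|g')

# The escaped value can now be used safely with "/" as the delimiter.
rendered=$(sed "s/LINKS/$escaped/g" "$workdir/template.xml")
printf '%s\n' "$rendered"
```

A multi-line replacement would additionally need each newline escaped, and the result is still not XML-escaped, so the earlier warning about < and > (and & in XML) continues to apply.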
