Outputting text file to csv using bash

I am new to bash and I'm not sure how to do the following:
I have a text file in the format:
123
John
234
Sally
456
Lucy
...
I want to output it to a csv file in the form:
123,John
234,Sally
456,Lucy
...

A good job for sed:
sed '/[0-9]/{N;s/\n/,/}' txtfile
It looks for lines containing digits and, when one is found, appends the next line (N) and replaces the newline between them with a comma.
If you also want to get rid of the blank lines in-between,
sed '/[0-9]/{N;s/\n/,/;n;d}' txtfile
Notice that if your file is as regular as the sample you gave, you don't even need the regex; 'N;s/\n/,/;n;d' would suffice.
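If the file really is strict number/name pairs with no blank lines in between, paste can do the same join (a minimal sketch, reusing the txtfile name from above):
paste -d, - - < txtfile
It reads two lines at a time from standard input and joins each pair with a comma, giving 123,John and so on.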

Related

Searching a file for a string and printing the lines containing the string

I want to search a file line by line for the string 12345678, and print the full line(s) containing the string.
For instance, if the input file was
09298213 YYYY
12345678 NYNY
12173217 YYNN
Then the output should be
12345678 NYNY
You are just looking for grep. Specifically, in your case, the following will do the job:
grep '12345678' yourfile
and you can play with all of grep's options that you like to print some context lines, to add coloring, and to do other fancy stuff.
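For instance, to highlight the match and print one line of context around it (a quick sketch; yourfile is the placeholder name used above):
grep --color -C 1 '12345678' yourfile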

Use sed to replace everything after match that is between two characters

I am grepping logs for the word "line1" and need to replace the text following that word that is between the characters : and ,
Possible results would be:
xxx,\"line1\":\"C/O FRED FLINSTONE, MD\",xxx
xxx,\n line1: 'C/O FRED FLINSTONE, MD',xxx
xxx,\\\"line1\\\":\\\"C\\\\/O FRED FLINSTONE\\\,MD",xxx
I want to replace "C/O FRED FLINSTONE, MD" with "Redacted-Address1" so the end result would look something like:
xxx,\"line1\":Redacted-Address1,xxx
xxx,\n line1:Redacted-Address1,xxx
xxx,\\\"line1\\\":Redacted-Address1,xxx
I don't necessarily need to use sed but thought that was a good place to start. The xxx represents the rest of the line (not actual xxx), so we can't search by that, and I want to leave it untouched.
A more complete example of the data would be:
,\"object\":{\"address\":[{\"city\":\"Bedrock\",\"line1\":\"C/O FRED FLINSTONE\, MD\",\"line2\":\"55101 Main St\",\"state\":\"TX\",\"use\":\"H\",\"zip\":69162}],
And the desired result would be:
,\"object\":{\"address\":[{\"city\":\"Bedrock\",\"line1\":Redacted-Address1,\"line2\":\"55101 Main St\",\"state\":\"TX\",\"use\":\"H\",\"zip\":69162}],
Using sed
sed -r '/line1/{s/([\]"line1[\]":)[\]"[^"]+",/\1Redacted-Address1,/}'
example
echo ',\"object\":{\"address\":[{\"city\":\"Bedrock\",\"line1\":\"C/O FRED FLINSTONE\, MD\",\"line2\":\"55101 Main St\",\"state\":\"TX\",\"use\":\"H\",\"zip\":69162}], '|sed -r '/line1/{s/([\]"line1[\]":)[\]"[^"]+",/\1Redacted-Address1,/}'
output will be
,\"object\":{\"address\":[{\"city\":\"Bedrock\",\"line1\":Redacted-Address1,\"line2\":\"55101 Main St\",\"state\":\"TX\",\"use\":\"H\",\"zip\":69162}],
It sounds like you want to get everything between the colon following line1 and the comma that immediately precedes the next colon. The following regex should accomplish that by replacing everything but the capture groups:
sed 's/\(line1[^:]*:\)[^:]*\(,[^,:]*:\)/\1 Redacted-Address1\2/'
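As a quick check against the fuller sample line from the question (a sketch; note the extra space after the colon comes from the space before Redacted-Address1 in the replacement text):
echo ',\"object\":{\"address\":[{\"city\":\"Bedrock\",\"line1\":\"C/O FRED FLINSTONE\, MD\",\"line2\":\"55101 Main St\",\"state\":\"TX\",\"use\":\"H\",\"zip\":69162}],' | sed 's/\(line1[^:]*:\)[^:]*\(,[^,:]*:\)/\1 Redacted-Address1\2/'
,\"object\":{\"address\":[{\"city\":\"Bedrock\",\"line1\": Redacted-Address1,\"line2\":\"55101 Main St\",\"state\":\"TX\",\"use\":\"H\",\"zip\":69162}],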
You can use this sed with greedy regex :.*, which will match from : to last ,:
sed 's/:.*,/:Redacted Name,/' file
xxx,\"line1\":Redacted Name,xxx
xxx,\n line1:Redacted Name,xxx
xxx,\\"line1\\":Redacted Name,xxx
As per comments below:
sed "s/:..*['\"],/:Redacted Name,/" file
xxx,\"line1\":Redacted Name,xxx
xxx,\n line1:Redacted Name,xxx
xxx,\\"line1\\":Redacted Name,xxx
This might work for you (GNU sed):
sed 's/\(line1[^:]*:\).*,/\1Redacted Name,/' file
This matches the pattern line1, then any following characters that are not a :, then a :; after that it uses greed (all characters to the end of the line, backtracking until a , is found). The match is then replaced by the back reference (everything up to and including the first :), the required string, and a ,.
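Since the question mentions grepping logs first, the same substitution can be applied in a pipeline (a sketch; logfile is a hypothetical file name):
grep 'line1' logfile | sed 's/\(line1[^:]*:\).*,/\1Redacted Name,/'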

Append LINE FEED in VIM repeatedly after a line containing a unique String

Is there a way to append a LINEFEED after every line containing the string
"Specialty"? Here are examples of some lines:
"XYZ_Specialty=1122"
"Specialty_123=AABB"
"Specialty_MOD=ZZZZ"
Now, all the above three lines contain "Specialty" and I would like to append a LINEFEED in Linux after the end of each such line, as in right after:
1122
AABB
ZZZZ
If not in vim, perhaps in perl, or bash/awk?
I am not sure if it is what you need, but you can use sed in bash. I also assume that your data is stored in a file called temptest2:
sed "s/\(.*Specialty.*\)\"/\1\n/g" temptest2
Use -i to save the output back to temptest2.
The escaped parentheses \( \) capture a group, which is referenced as \1 in the replacement; sed then appends \n to it (the trailing " that was matched outside the group is dropped).
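If all you need is a blank line after every matching line, with the closing quote left intact, sed's G command is a common idiom (a sketch on the same assumed temptest2 file):
sed '/Specialty/G' temptest2
G appends the (empty) hold space to the pattern space, so an empty line is printed after each line containing Specialty.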

Use sed to extract ascii hex string from a file

I have a file that looks like this:
$ some random
$ text
00ab2c3f03$ and more
random text
1a2bf04$ more text
blah blah
and the code that looks like this:
sed -ne 's/\(.*\)$ and.*/\1/p' "file.txt" > "output1.txt"
sed -ne 's/\(.*\)$ more.*/\1/p' "file.txt" > "output2.txt"
That gives me 00ab2c3f03 and 1a2bf04 respectively.
So it extracts everything from the beginning of the line up to the shell prompt and stores it in a file, once for each of the two instances.
The problem is that the file sometimes looks like this:
/dir # some random
/dir # text
00ab2c3f03/dir # and more
random text
345fabd0067234234/dir # more text
blah blah
And I want to make a universal extractor that either:
extracts data from the beginning of the line to the '$' OR '/' characters
intelligently extracts random amount of random hex data from the beginning of the line up to the first non-hex digit
But I'm not good enough with sed to come up with an easy solution myself...
I think you want output like this:
$ cat file
$ some random
$ text
00ab2c3f03$ and more
random text
1a2bf04$ more text
blah blah
/dir # some random
/dir # text
00ab2c3f03/dir # and more
random text
345fabd0067234234/dir # more text
blah blah
$ sed -ne 's/\([a-f0-9]*\).* and more.*/\1/p' file
00ab2c3f03
00ab2c3f03
$ sed -ne 's/\([a-f0-9]*\).* more text.*/\1/p' file
1a2bf04
345fabd0067234234
You could also try the GNU sed command below. Because / is present in your input, I changed the sed delimiter to ~:
$ sed -nr 's~([a-f0-9]*)\/*\$*.* and more.*~\1~p' file
00ab2c3f03
00ab2c3f03
$ sed -nr 's~([a-f0-9]*)\/*\$*.* more text.*~\1~p' file
1a2bf04
345fabd0067234234
Explanation:
([a-f0-9]*) - Captures all the hex digits and stores them in a group.
The OP said there may be a / or $ symbol just after the hex digits, so the regex has \/*\$* (/ zero or more times, $ zero or more times) after the capturing group.
The first command only works on the lines which contain the string and more.
And the second one only works on the lines which contain more text, because the OP wants the two outputs in two different files.
This seems better to me:
sed -nr 's#([[:xdigit:]]+)[$/].*#\1#p' file
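To reproduce the original two-output-file workflow with this universal pattern (a sketch; the file names match those in the question):
sed -nr 's#([[:xdigit:]]+)[$/].* and more.*#\1#p' "file.txt" > "output1.txt"
sed -nr 's#([[:xdigit:]]+)[$/].* more text.*#\1#p' "file.txt" > "output2.txt"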

How to remove a long line with special characters from a big file in bash

I have a file with 200 lines and I want to remove 5 long lines (each line contains special characters).
$cat abc
............
comments[asci?_203] part of jobs where to delete
5 similar lines
.....
I tried sed to remove these 5 lines, using line numbers (nl) on the file, but it did not work.
Thanks
Have you tried to remove the lines with awk? This is untested with special characters, but maybe it could work
awk '{if (length($0)<55) print $0}' < abc
Replace 55 with the maximum line length you want to keep.
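Since the question mentions line numbers, sed can also delete the lines by number once you know them (a sketch; 12, 45, 78, 101 and 150 are hypothetical line numbers, and -i.bak keeps a backup of abc):
sed -i.bak '12d;45d;78d;101d;150d' abc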
