I'm trying to output all the lines of a file that contain a specific word/pattern, even if other characters appear between its letters.
Let's say we have a bunch of domain names and we want to pick out all those that contain "paypal". I would like to get this kind of output:
pay-pal-secure.com
payppal.net
etc...
I was wondering if this is possible with grep, or whether there is something else that can do it.
Many thanks !
Replace paypal with the regexp p.*a.*y.*p.*a.*l to allow any number of arbitrary characters between the letters.
Update:
Use the extended regular expression p.{0,2}a.{0,2}y.{0,2}p.{0,2}a.{0,2}l to limit the number of characters between the letters to between zero and two.
Example: grep -E 'p.{0,2}a.{0,2}y.{0,2}p.{0,2}a.{0,2}l' file
See: The Stack Overflow Regular Expressions FAQ
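For anyone who wants to try the bounded-gap idea outside grep, here is a minimal Python sketch. The helper name gapped_pattern and the sample domain list are made up for illustration; only the regex itself comes from the answer above.

```python
import re

def gapped_pattern(word, max_gap=2):
    """Build a regex that allows up to max_gap arbitrary characters between letters."""
    gap = ".{0,%d}" % max_gap
    return re.compile(gap.join(re.escape(ch) for ch in word))

# Hypothetical sample data, not taken from a real file.
domains = ["pay-pal-secure.com", "payppal.net", "example.org", "p---aypal.com"]

pattern = gapped_pattern("paypal")
matches = [d for d in domains if pattern.search(d)]
print(matches)  # ['pay-pal-secure.com', 'payppal.net']
```

The last sample domain is rejected because three characters separate the first p and a, which is more than the allowed gap of two.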
Alternatively you could use agrep (approximate grep):
$ agrep -By paypal file
agrep: 2 words match within 1 error
pay-pal-secure.com
payppal.net
I'm trying to convert TXT files into pipe-delimited text files.
Let's say I have a file called sample.csv:
aaa",bbb"ccc,"ddd,eee",fff,"ggg,hhh,iii","jjj kkk","lll"" mmm","nnn"ooo,ppp"qqq",rrr" sss,"ttt,""uuu",Z
I'd like to convert this into an output that looks like this:
aaa"|bbb"ccc|ddd,eee|fff|ggg,hhh,iii|jjj kkk|lll" mmm|"nnn"ooo|ppp"qqq"|rrr" sss|ttt,"uuu|Z
Now after tons of searching, I have come the closest using this sed command:
sed -r 's/""/\v/g;s/("([^"]+)")?,/\2\|/g;s/"([^"]+)"$/\1/;s/\v/"/g'
However, the output that I received was:
aaa"|bbb"ccc|ddd,eee|fff|ggg,hhh,iii|jjj kkk|lll" mmm|"nnn"ooo|pppqqq|rrr" sss|ttt,"uuu|Z
Where the expected for the 9th column should have been ppp"qqq" but the result removed the double quotes and what I got was pppqqq.
I have been playing around with this for a while, but to no avail.
Any help regarding this would be highly appreciated.
As suggested in the comments, sed (or any other line-oriented Unix tool) is not recommended for this kind of complex CSV string. It is much better to use a dedicated CSV parser, like this one in PHP:
$s = 'aaa",bbb"ccc,"ddd,eee",fff,"ggg,hhh,iii","jjj kkk","lll"" mmm","nnn"ooo,ppp"qqq",rrr" sss,"ttt,""uuu",Z';
echo implode('|', str_getcsv($s));
aaa"|bbb"ccc|ddd,eee|fff|ggg,hhh,iii|jjj kkk|lll" mmm|nnnooo|ppp"qqq"|rrr" sss|ttt,"uuu|Z
The problem with sample.csv is that it mixes non-quoted fields (containing quotes) with fully quoted fields (that should be treated as such).
You can't have both at the same time. Either all fields are (treated as) unquoted and quotes are preserved, or all fields containing a quote (or separator) are fully quoted and the quotes inside are escaped with another quote.
So, sample.csv should become:
"aaa""","bbb""ccc","ddd,eee",fff,"ggg,hhh,iii","jjj kkk","lll"" mmm","""nnn""ooo","ppp""qqq""","rrr"" sss","ttt,""uuu",Z
to give you the desired result (using a csv parser):
aaa"|bbb"ccc|ddd,eee|fff|ggg,hhh,iii|jjj kkk|lll" mmm|"nnn"ooo|ppp"qqq"|rrr" sss|ttt,"uuu|Z
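As a quick check of the claim above, here is a small sketch using Python's standard csv module on the corrected line. It assumes the default dialect (comma separator, doubled quotes as the escape), and the plain join does not escape any pipe characters that might occur inside a field.

```python
import csv
import io

# The corrected sample line, with every field that contains a quote or
# a comma fully quoted and inner quotes doubled.
line = '"aaa""","bbb""ccc","ddd,eee",fff,"ggg,hhh,iii","jjj kkk","lll"" mmm","""nnn""ooo","ppp""qqq""","rrr"" sss","ttt,""uuu",Z'

row = next(csv.reader(io.StringIO(line)))
piped = "|".join(row)
print(piped)
```

This prints the desired pipe-delimited line, with ppp"qqq" intact in the 9th column.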
I had the same problem.
I got the right result with https://www.papaparse.com/demo
It is FOSS and available on GitHub, so you can check how it works.
With the source of [ "aaa""","bbb""ccc","ddd,eee",fff,"ggg,hhh,iii","jjj kkk","lll"" mmm","""nnn""ooo","ppp""qqq""","rrr"" sss","ttt,""uuu",Z ]
The result appears in the browser console (screenshot: https://i.stack.imgur.com/OB5OM.png).
I have numerous files with extension .awesome containing lines like the following:
something =
[51,42,12]
Where "something =" is in all the files, as well as "[" (the numbers vary).
I would like to get rid of the newline, but don't know how. I came across tr, but worry it would replace all newlines. My files contain multiple newlines that I would like to retain (I only want to change this one). I've been able to find and replace successfully in the past with sed, but am having trouble specifically with the special characters (\n and =). In addition, I'm reading that sed works line by line and cannot handle something like this.
Any guidance would be appreciated.
GNU sed solution:
Sample test.awesome file contents:
some text
another text
something =
[51,42,12]
text
text
The job:
sed '/something =/{N; s/\n/ /;}' test.awesome
The output:
some text
another text
something = [51,42,12]
text
text
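If sed is not available, the same join can be sketched in Python. The function name join_after_marker is made up for this example, and like the sed command it simply glues the line following the marker onto it with a space.

```python
def join_after_marker(text, marker="something ="):
    """Join each line containing marker with the line that follows it."""
    lines = text.split("\n")
    out = []
    i = 0
    while i < len(lines):
        if marker in lines[i] and i + 1 < len(lines):
            # Same effect as sed's N; s/\n/ /: merge the pair with a space.
            out.append(lines[i] + " " + lines[i + 1])
            i += 2
        else:
            out.append(lines[i])
            i += 1
    return "\n".join(out)

sample = "some text\nanother text\nsomething =\n[51,42,12]\ntext\ntext\n"
print(join_after_marker(sample), end="")
```

All other newlines in the input are left untouched.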
I have about a thousand txt files:
1.txt
2.txt
3.txt
In each file, I have tags in several places among my text:
{somethinghere...blablabla} then the text I want to keep then again {somethinghere...blablabla}
I'm not very experienced with the Mac OS X command line. Can someone help me write a command that opens each file, parses it, and deletes all text enclosed between "{" and "}"?
To be clear:
First of all I need to open each file, then parse the text. When the loop finds a "{" it starts deleting until it finds a "}". When it is done parsing, it saves and closes the file. That's what I need to do.
$ sed -i.bak -e 's#{[^}]*}##g' *.txt
-i.bak makes a backup copy of each modified file. If you don't want backups, on OS X use -i '' (a separate empty argument); with GNU sed on Linux, plain -i is enough.
In substitutions, the delimiter can be a character other than /. Here I chose #, so: s#<REGEX>#<REPLACEMENT># (the basic form for substitutions is s///).
In the regex, we match a literal {, then any run of characters other than } with [^}]* (the * means 0 or more occurrences). Finally, we match the closing } and replace the whole match with nothing, which deletes it.
The g modifier at the end means that all matches on a line are replaced, not only the first one.
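To see the regex in isolation, here is a tiny Python sketch of the same substitution. The function name strip_braced and the sample string are only for illustration.

```python
import re

def strip_braced(text):
    """Remove every {...} group: a {, any characters except }, then a }."""
    return re.sub(r"\{[^}]*\}", "", text)

sample = "{tag one} text I want to keep {tag two}"
print(repr(strip_braced(sample)))  # ' text I want to keep '
```

To apply it to real files you would read each one, call the function, and write the result back, ideally keeping a backup copy as -i.bak does.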
I am new to writing in bash and I just finished a long script, but I made the mistake of not adding quotation marks around all the variables beginning with $. Adding all the quotation marks by hand is going to take a while. Is there a shortcut I can use so that all the words in the text file beginning with $ get quotation marks around them? So if a line in the file looks like:
python myProgram.py $car1 $car2 $speed1 $speed2
Then after the shortcut it will appear as
python myProgram.py "$car1" "$car2" "$speed1" "$speed2"
I am writing the script using nano.
Use global search and replace with the expression (\$\w+).
Switch to search and replace mode with C-\.
Switch to regex mode with Alt-R.
Type the expression (\$\w+). Hit Enter.
Type in the replacement expression "\1" to wrap the captured expression in quotation marks. Hit Enter.
On the match, hit A for All.
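The expression (\$\w+) can also be tried outside the editor. Below is a minimal Python sketch of the same replacement; the function name quote_variables is an assumption for this example, and it would double-quote variables that are already quoted, so treat it as a starting point only.

```python
import re

def quote_variables(script_text):
    """Wrap every $name token in double quotes."""
    # \g<0> stands for the whole match, i.e. the $ plus the word characters.
    return re.sub(r"\$\w+", r'"\g<0>"', script_text)

line = "python myProgram.py $car1 $car2 $speed1 $speed2"
print(quote_variables(line))
# python myProgram.py "$car1" "$car2" "$speed1" "$speed2"
```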
Given your need, it doesn't seem mandatory to provide a solution based on that editor. If you have access to a shell you might try this simple sed command:
sed -i.bak -r 's/\$\w+/"&"/g' my-script.sh
This is far from perfect, but it should do the job in your particular case. In the above command:
-i.bak will perform the replacement "in place", that is, it modifies the original file and makes a backup with the .bak extension
-r enables extended regular expressions in GNU sed, which is what makes + mean "one or more"
s/..../..../g is the usual sed command to search and replace using a pattern. The search pattern is between the first two /. The replacement is between the last two /
\$\w+ this pattern corresponds to a $ followed by one or more word characters, i.e. letters, digits, or underscores (\w+). The backslash before $ is needed because that character normally has a special meaning in a search pattern.
"&" is the replacement string. In there, the & is replaced by the string matched by the search pattern. Broadly speaking, this puts quotes around any string matching the search pattern.