Bash - removing characters other than specific string [closed] - bash

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
I use wget -O to save a site's text to a specific file.
The whole file is saved as ONE line of about 2 million characters. Afterwards, I would like to remove everything that does not match the format "directory= ************" (including the quotation marks; **** stands for any text up to the second quotation mark).
Is there a smart way to accomplish that?

grep -o '"directory= [^"]*"' file

The smartest way would be to pipe the output to a more capable interpreter, Python for instance.
Python has a great regular expression library at your disposal.
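As a minimal sketch of that suggestion, the same pattern used in the grep answer can be applied with Python's re module. The function name extract_directories and the sample string are made up for illustration; in practice the text would be read from the file written by wget.

```python
import re

# Same pattern as the grep -o answer: a quote, the literal text
# 'directory= ', then everything up to the next quote.
PATTERN = re.compile(r'"directory= [^"]*"')


def extract_directories(text):
    """Return every '"directory= ..."' match found in text, in order."""
    return PATTERN.findall(text)


# Hypothetical one-line input standing in for the downloaded page.
sample = 'junk"directory= /a/b"more junk"directory= c d"tail'
for match in extract_directories(sample):
    print(match)
```

Because the file is a single line, reading it in one go and calling findall once avoids any line-based processing.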

Related

Windows cmd line rename removes zeros from filenames? [closed]

I have a bunch of files that I need to rename:
XX_20_0001.pdf
XX_20_0001.xml
XX_20_0023.pdf
XX_20_0023.jpg, etc.
I made a bat file with this code in it:
rename "XX_20_*" "XX_083_*"
But for some reason it only leaves three digits at the end of the file name:
XX_083_001.pdf
Why? And how can I change this?
Thank you!
Maria
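For DOS ren SourceMask TargetMask, the TargetMask is processed against the source name strictly from left to right with no backtracking. Each literal character in the TargetMask takes the place of one character of the source name, and * then copies whatever remains. Because XX_083_ is one character longer than XX_20_, it consumes the first 0 of 0001, so ren "eats" one digit and produces XX_083_001.pdf.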
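The effect can be reproduced with a small sketch. The first function is a simplified model of this one case only (a literal prefix followed by *), not a full implementation of ren's mask rules; the second shows one way to compute the intended name by swapping the prefix explicitly. Both function names are hypothetical.

```python
def ren_literal_then_star(source_name, literal):
    """Simplified model of a target mask of the form '<literal>*'.

    The literal overwrites the source name position by position,
    then '*' copies the rest of the source name unchanged.
    """
    return literal + source_name[len(literal):]


def swap_prefix(source_name, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix, keeping everything after it."""
    if not source_name.startswith(old_prefix):
        return source_name  # leave unrelated names untouched
    return new_prefix + source_name[len(old_prefix):]


print(ren_literal_then_star("XX_20_0001.pdf", "XX_083_"))  # what ren produced
print(swap_prefix("XX_20_0001.pdf", "XX_20_", "XX_083_"))  # what was wanted
```

The new names from swap_prefix could then be passed to os.rename, one file at a time.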

bash sort numerically correct [closed]

I'm having a problem with sorting that -V isn't solving. I have a very well structured set of folder names (shown below as ordered by sort -V) that I want to version-sort while treating floating point numbers as numbers (see the first two entries, where 97.4 comes before 97). Any advice?
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_3_um_V_97.4_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_3_um_V_97_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_6_um_V_90_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_6_um_V_153_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_7_um_V_70_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_8_um_V_70_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_8_um_V_80_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_9_um_V_192_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_10_um_V_90_V
If you want the voltage sorted as a floating point number, you can apply a numeric sort to the 15th _-separated field and keep the version sort for fields 1 to 14:
$ cat test.txt | sort -t _ -V -k 1,14 -k 15n
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_3_um_V_97_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_3_um_V_97.4_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_6_um_V_90_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_6_um_V_153_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_7_um_V_70_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_8_um_V_70_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_8_um_V_80_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_9_um_V_192_V
Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_10_um_V_90_V
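An alternative that does not depend on field positions is a sort key that treats every number in the name, with or without a decimal part, as a float. This is only a sketch: the function name float_aware_key is made up, and it assumes all names share the same text/number layout, as they do here.

```python
import re

# A run of digits, optionally followed by a decimal part.
NUMBER = re.compile(r"(\d+(?:\.\d+)?)")


def float_aware_key(name):
    """Split name into text and number pieces; compare numbers as floats."""
    parts = NUMBER.split(name)
    # With one capturing group, re.split puts the text at even indices
    # and the captured numbers at odd indices.
    return [float(p) if i % 2 else p for i, p in enumerate(parts)]


names = [
    "Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_3_um_V_97.4_V",
    "Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_3_um_V_97_V",
    "Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_10_um_V_90_V",
    "Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_6_um_V_153_V",
    "Xe_phi_3.5_eV_tOn_0.5_ns_tOff_20_ns_d_6_um_V_90_V",
]
for name in sorted(names, key=float_aware_key):
    print(name)
```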

In Windows OS why is a single line without any newline character is being displayed in two lines in notepad? [closed]

I'm just trying to understand the below scenario.
I'm entering some data into Notepad without any newline character, but it is being displayed on two lines as shown below (I want the output on a single line only).
Is Windows adding a newline character to the data after a certain limit? What is the reason for displaying it on two lines?
It might look funny, but please help me understand.
Notepad is not adding a newline character to the data. Notepad has a limit on how many characters it will display on a single line, even with word-wrap mode turned off. This does not affect the actual characters in the file.

How does `Alt` decide which string is a word in bash? [closed]

I can use Alt+f, Alt+b and a series of other Alt commands to do things with words. But the concept of a word is vague to me.
From my experience, . , _ are valid word separators, while ' is not,
but is there clear documentation that says which separators are valid for words in bash?
Bash (among other command line applications) uses a library called Readline to let the user enter commands. The shortcuts you mentioned are Readline shortcuts (specifically, Readline in Emacs mode). You can check out the Readline documentation for the details.
The answer to your specific question is that words are "composed of letters and digits".
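To make that rule concrete, here is a rough sketch of how an Alt+f style move behaves if a word is taken to be a run of ASCII letters and digits. It is an illustration of the quoted definition, not Readline's actual code, and the function name forward_word_end is made up.

```python
import re

# A "word" in the sense quoted above: letters and digits only (ASCII assumed).
WORD = re.compile(r"[A-Za-z0-9]+")


def forward_word_end(line, cursor):
    """Return the cursor position after moving forward to the end of the next word."""
    match = WORD.search(line, cursor)
    return match.end() if match else len(line)


line = "git commit file_name.txt"
print(WORD.findall(line))  # the pieces that count as words
position = 0
while position < len(line):
    position = forward_word_end(line, position)
    print(position, repr(line[:position]))
```

Under this definition both _ and . end a word, which is why file_name.txt takes three moves to cross.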

Why do bash's man pages have double letters when written to file? [closed]

I've been wondering about this for a while. Whenever I run a command of the form "man [command] >> file.txt", there's a weird phenomenon where words that were bold when I viewed the man page in Terminal appear in the text file with each letter repeated twice.
For example,
DESCRIPTION
The find utility recursively descends the directory tree for each path listed
turns into
DDEESSCCRRIIPPTTIIOONN
The ffiinndd utility recursively descends the directory tree for each path
listed
Why is this? I can't think of a file encoding for which this makes sense. (If it encoded "bold" as "write each letter twice", then the band Abba would show up as A b a, among other problems...)
Try man -7 [command] >> file.txt or man --ascii [command] >> file.txt, which will export the man page as ASCII.
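As for the cause: man formats bold text the way a typewriter would, by printing a character, a backspace, and the same character again (underlining is an underscore, a backspace, then the character). A viewer that drops the backspaces shows every bold letter twice. The sketch below removes each "character plus backspace" pair, which is roughly what filtering through col -b does; the function name strip_overstrike and the sample text are made up for illustration.

```python
import re


def strip_overstrike(text):
    """Remove every 'character + backspace' pair, keeping the final character."""
    return re.sub(r".\x08", "", text)


# Build a sample the way bold text is emitted: X, backspace, X.
bold_find = "".join(ch + "\b" + ch for ch in "find")
sample = "The " + bold_find + " utility"

print(sample.replace("\b", ""))  # backspaces dropped: letters appear doubled
print(strip_overstrike(sample))  # overstrike removed: plain text
```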
