How to sort titles in a toctree using glob?

I need to sort my toctree by title. I use the :glob: option, but it sorts only by filename.
Is there a solution?
Example:
.. toctree::
   :maxdepth: 1
   :titlesonly:
   :glob:

   myfile1.rst
   myfile2.rst
myfile1.rst:
BBBB
=====
myfile2.rst:
AAAA
=====
I obtain in my HTML page :
BBBB
AAAA
and I would like to have :
AAAA
BBBB

glob is unnecessary because you don't use a globbing expression.
You can manually sort the files in the toctree directive. File extensions are unnecessary.
.. toctree::
   :maxdepth: 1
   :titlesonly:

   myfile2
   myfile1
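If there are too many files to order by hand, the entry list can be generated. The following is only a sketch, not a Sphinx feature: the function name toctree_entries_by_title is made up for illustration, and it assumes that every .rst file carries its title on the first line (no overline). It prints the document names ordered by title, ready to be pasted into the toctree.

```shell
# Hypothetical helper: list document names ordered by their titles.
# Assumption: the title is the first line of each .rst file.
toctree_entries_by_title() {
    dir=$1
    for f in "$dir"/*.rst; do
        # Emit "title<TAB>docname" so that sort orders by title first.
        printf '%s\t%s\n' "$(head -n 1 "$f")" "$(basename "$f" .rst)"
    done | sort | cut -f 2
}

# Sample data mirroring the question, in a temporary directory.
demo=$(mktemp -d)
printf 'BBBB\n=====\n' > "$demo/myfile1.rst"
printf 'AAAA\n=====\n' > "$demo/myfile2.rst"

toctree_entries_by_title "$demo"
```

With the two sample files this prints myfile2 before myfile1, because AAAA sorts before BBBB.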

Related

Is it possible to work with 'for loop grep' commands?

I have lots of files, in one directory per year,
and each file contains long sentences like this, for example:
home/2001/2001ab.txt
the AAAS kill every one not but me and you and etc
the A1CF maybe color of full fill zombie
home/2002/2002ab.txt
we maybe know some how what
home/2003/2003ab.txt
Mr, Miss boston, whatever
aaas will will will long long
and in the home directory I have home/reference.txt (a word list file):
A1BG
A1CF
A2M
AAAS
I'd like to count how many of the words in reference.txt occur in each year's file.
This is my code, which I run in every year directory
(home/2001/, home/2002/, home/2003/):
# awk
function search () {
    awk -v pattern="$1" '$0 ~ pattern {print}' *.txt > $1
}
# load custom.txt
for i in $(cat reference.txt)
do
    search $i
done
# word count
wc -l * > line-count.txt
this is my result
home/2001/A1BG
$cat A1BG
0
home/2001/A1CF
$cat A1CF
1
home/2001/A2M
$cat A2M
0
home/2001/AAAS
$cat AAAS
1
home/2001/line-count.txt
$cat line-count.txt
2021ab.txt 2
A1BG
A1CF 1
A2M 0
AAAS 1
The resulting line-count.txt file has all the information I want,
but I have to repeat this work manually:
cd into a directory,
run my code,
and then cd into the next directory.
I have around 500 directories and files, so this is not easy.
The second problem is the wasteful bunch of files:
it creates lots of files and takes too much time.
Because of this, I wanted to use the grep command at first,
but I don't know how to use a list of words from a file instead of a single word.
That is why I used awk.
How can I do this more simply?
at first I'd like to use the grep command but I don't know how to use a list of
words from a file instead of a single word
You can use the --file=FILE option for that purpose; the selected file should hold one pattern per line.
How can I do it more simply
You can use the --count option to avoid the need for wc -l. Consider the following simple example. Let the content of file.txt be
123
456
789
and file1.txt content be
abc123
def456
and file2.txt content be
ghi789
xyz000
and file3.txt content be
xyz000
xyz000
then
grep --count --file=file.txt file1.txt file2.txt file3.txt
gives output
file1.txt:2
file2.txt:1
file3.txt:0
Observe that no files are created and that a file without matches still appears in the output. Disclaimer: this solution assumes file.txt does not contain any character with special meaning for GNU grep; if that does not hold, do not use this solution.
(tested in GNU grep 3.4)
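Applied to the layout from the question, a single grep call can cover every year directory, so no cd is needed. This is a sketch under assumptions: the temporary directory stands in for home/, and the -F option (fixed strings) is added so that special characters in reference.txt are taken literally. Note that -c counts matching lines per file, not individual word occurrences.

```shell
# Sample layout mirroring the question (temporary stand-in for home/).
home=$(mktemp -d)
mkdir "$home/2001" "$home/2002" "$home/2003"
printf '%s\n' 'the AAAS kill every one not but me and you and etc' \
              'the A1CF maybe color of full fill zombie' > "$home/2001/2001ab.txt"
printf '%s\n' 'we maybe know some how what' > "$home/2002/2002ab.txt"
printf '%s\n' 'Mr, Miss boston, whatever' \
              'aaas will will will long long' > "$home/2003/2003ab.txt"
printf '%s\n' A1BG A1CF A2M AAAS > "$home/reference.txt"

# -c: count matching lines per file
# -F: treat each line of reference.txt as a fixed string, not a regex
# -f: read the patterns from a file
counts=$(cd "$home" && grep -c -F -f reference.txt */*.txt)
printf '%s\n' "$counts"
```

The matching is case sensitive, so the lowercase aaas in 2003ab.txt is not counted; adding -i would make the match case insensitive.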

Shell script: Insert multiple lines into a file ONLY after a specified pattern appears for the FIRST time. (The pattern appears multiple times)

I want to insert multiple lines into a file using shell script. Let us consider my original file: original.txt:
aaa
bbb
ccc
aaa
bbb
ccc
aaa
bbb
ccc
.
.
.
and my insert file: toinsert.txt
111
222
333
Now I have to insert the three lines from the 'toinsert.txt' file ONLY after the line 'ccc' appears for the FIRST time in the 'original.txt' file. Note: the 'ccc' pattern appears more than one time in my 'original.txt' file. After inserting ONLY after the pattern appears for the FIRST time, my file should change like this:
aaa
bbb
ccc
111
222
333
aaa
bbb
ccc
aaa
bbb
ccc
.
.
.
I should do the above insertion using a shell script. Can someone help me?
Note2: I found a similar case, with a partial solution:
sed -i -e '/ccc/r toinsert.txt' original.txt
which actually does the insertion multiple times (for every time the ccc pattern shows up).
Use ed, not sed, to edit files:
printf "%s\n" "/ccc/r toinsert.txt" w | ed -s original.txt
It inserts the contents of the other file after the first line containing ccc, but unlike your sed version, only after the first.
This might work for you (GNU sed):
sed '0,/ccc/!b;/ccc/r insertFile' file
Use a range:
If the current line is not in the range from the start of the file to the first occurrence of ccc (that is, it comes after that first occurrence), break from further processing and implicitly print as usual.
Otherwise, if the current line does contain ccc, insert lines from insertFile.
N.B. This uses the address 0 which allows the regexp to occur on line 1 and is specific to GNU sed.
or:
sed -e '/ccc/!b;r insertFile' -e ':a;n;ba' file
Use a loop:
If a line does not contain ccc, no further processing and print as usual.
Otherwise, insert lines from insertFile and then using a loop, fetch/print the remaining lines until the end of the file.
N.B. The r command insists on being delimited from other sed commands by a newline. The -e option simulates this effect and thus the sed commands are split across two -e options.
or:
sed 'x;/./{x;b};x;/ccc/!b;h;r insertFile' file
Use a flag:
If the hold space is not empty (the flag has already been set), no further processing and print as usual.
Otherwise, if the line does not contain ccc, no further processing and print as usual.
Otherwise, copy the current line to the hold space (set the flag) and insert lines from insertFile.
N.B. In all cases the r command inserts lines from insertFile after the current line is printed.
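Where GNU sed is not available, the same "flag" idea can be written in awk. This is an alternative sketch, not one of the answers above: the function name insert_after_first and the variable names are made up for illustration, and the result goes to standard output instead of editing the file in place.

```shell
# Sample files mirroring the question, in a temporary directory.
work=$(mktemp -d)
printf '%s\n' aaa bbb ccc aaa bbb ccc > "$work/original.txt"
printf '%s\n' 111 222 333 > "$work/toinsert.txt"

# Hypothetical helper: $1 = pattern (awk regex), $2 = file to insert,
# $3 = file to read.
insert_after_first() {
    awk -v pat="$1" -v ins="$2" '
        { print }                      # always print the current line
        !inserted && $0 ~ pat {        # first matching line only
            while ((getline line < ins) > 0) print line
            close(ins)
            inserted = 1               # flag: never insert again
        }
    ' "$3"
}

result=$(insert_after_first ccc "$work/toinsert.txt" "$work/original.txt")
printf '%s\n' "$result"
```

To update the file itself, the output can be redirected to a temporary file and then moved over the original.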

How to remove lines that begin with the same character?

I'm trying to clean up the output from someone else's script by removing the headers that have no content.
Output currently looks like this:
====== Header1 ======
====== Header2 ======
====== Header3 ======
information
I'm trying to remove the lines for Header1 and Header2, but not Header3. I found an awk command that removes all duplicate lines but the last that begin with the same character, so that helps for this issue, but causes a new problem when the 'information' bit is numerous lines that also begin with the same character (usually tabs).
Desired output post cleanup:
====== Header3 ======
information
Thanks
This awk might work for you:
$ awk '/^===/{h=$0;p=0;next}!p{print h};{p=1}1' file
====== Header3 ======
information
Or as Glenn pointed out, this also works:
awk '/^===/{h=$0;next}h{print h;h=0}1' file
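To check the case from the question where the information spans several lines that all begin with a tab, the second command can be wrapped in a function and run on sample input. This is a sketch: the name drop_empty_headers and the sample lines are made up, and the flag is reset to an empty string rather than 0, which behaves the same here.

```shell
# Hypothetical wrapper around the awk command above.
drop_empty_headers() {
    # Remember the latest header; print it only once a content line follows.
    awk '/^===/   { h = $0; next }
         h != ""  { print h; h = "" }
                  { print }' "$@"
}

# Sample input: two empty headers, then a header with two tab-indented lines.
input=$(mktemp)
printf '====== Header1 ======\n====== Header2 ======\n' > "$input"
printf '====== Header3 ======\n\tinformation line 1\n\tinformation line 2\n' >> "$input"

cleaned=$(drop_empty_headers "$input")
printf '%s\n' "$cleaned"
```

Only Header3 and its two information lines remain, since the decision depends on the === prefix of the header lines and not on the first character of the content lines.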

shell scripts how to replace string between 2 characters in all lines start with specific string?

I have a text file like the one below. I want to replace the old string between two characters (in this case ^ and |) with a new string (in this case, old string ^ old string) if the line starts with a specific string (in this example, MMX).
text file original:
General start, this is a test file.
TAG okay, this line not need to be processed.
MMX ABCD ^string1|other strings abc
CCF ABCD ^string2|other strings cde, skip line
MMX CDEE ^String3|other strings aaa
MMX AAAA ^String4|other strings bbb
CCD BBBB ^String5|other strings ccc, skip line
text file after modify should be:
General start, this is a test file.
TAG okay, this line not need to be processed.
MMX ABCD ^string1^String1|other strings abc
CCF ABCD ^string2|other strings cde, skip line
MMX CDEE ^String3^String3|other strings aaa
MMX AAAA ^String4^String4|other strings bbb
CCD BBBB ^String5|other strings ccc, skip line
How can I use shell scripts to perform this job?
Here's one way using sed:
sed '/^MMX/s/\(\^[^|]*\)/\1\1/' file.txt
Results:
General start, this is a test file.
TAG okay, this line not need to be processed.
MMX ABCD ^string1^string1|other strings abc
CCF ABCD ^string2|other strings cde, skip line
MMX CDEE ^String3^String3|other strings aaa
MMX AAAA ^String4^String4|other strings bbb
CCD BBBB ^String5|other strings ccc, skip line
Just for completeness:
$ awk '/^MMX/{sub(/\^[^|]+/,"&&")}1' file
General start, this is a test file.
TAG okay, this line not need to be processed.
MMX ABCD ^string1^string1|other strings abc
CCF ABCD ^string2|other strings cde, skip line
MMX CDEE ^String3^String3|other strings aaa
MMX AAAA ^String4^String4|other strings bbb
CCD BBBB ^String5|other strings ccc, skip line
but I'd use one of the posted sed solutions since this is a simple substitution on a single line which is what sed is good at.
You can provide sed with an "address", which is a filter for the lines that the command is executed on:
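sed -E '/^MMX/s/\^([^|]*)\|/^\1^\1|/' file.txt
In this case, the address is /^MMX/, the command is s///, and it replaces \^([^|]*)\| with ^\1^\1|, where \1 is the part in parentheses. The -E option selects extended regular expressions, in which ( ) form a group and \| matches a literal pipe.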
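The effect of the address is easiest to see by running the same substitution with and without it. This is a sketch on a shortened copy of the sample file; the variable names are made up, and -E (extended regular expressions) is assumed to be supported by the installed sed.

```shell
# Shortened sample input in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
General start, this is a test file.
MMX ABCD ^string1|other strings abc
CCF ABCD ^string2|other strings cde, skip line
MMX CDEE ^String3|other strings aaa
EOF

# With the /^MMX/ address: only lines starting with MMX are changed.
with_address=$(sed -E '/^MMX/s/\^([^|]*)\|/^\1^\1|/' "$tmp")

# Without an address: every line that contains ^...| is changed.
without_address=$(sed -E 's/\^([^|]*)\|/^\1^\1|/' "$tmp")

printf '%s\n' "$with_address"
```

In with_address the CCF line is left untouched, while in without_address it becomes ^string2^string2| like the others.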
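With Perl (in a POSIX shell, use single quotes instead of the outer double quotes so that the shell does not expand $_, $1 and so on):
perl -plne "if(/^MMX/){$_=~s/([^\^]*)([^\|]*)(.*)/$1$2$2$3/g;}" your_file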
tested below:
>perl -plne "if(/^MMX/){$_=~s/([^\^]*)([^\|]*)(.*)/$1$2$2$3/g;}" new.txt
General start, this is a test file.
TAG okay, this line not need to be processed.
MMX ABCD ^string1^string1|other strings abc
CCF ABCD ^string2|other strings cde, skip line
MMX CDEE ^String3^String3|other strings aaa
MMX AAAA ^String4^String4|other strings bbb
CCD BBBB ^String5|other strings ccc, skip line
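To ensure capitalization in the new string (GNU sed only, since \+ and \u are GNU extensions):
sed '/^MMX/s/\^\([^|]\+\)/^\1^\u\1/'
Another option, using extended regular expressions:
sed -E 's/^MMX([^^]*)\^([^|]*)\|(.+)/MMX\1^\2^\2|\3/' fileName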

Merge all files in a directory into one using bash

I have a directory with several *.js files. Quantity and file names are unknown. Something like this:
js/
|- 1.js
|- 2.js
|- blabla.js
I need to merge all the files in this directory into one merged_dmYHis.js. For example, if files contents are:
1.js
aaa
bbb
2.js
ccc
ddd
eee
blabla.js
fff
The merged_280120111257.js would contain:
aaa
bbb
ccc
ddd
eee
fff
Is there a way to do it using bash, or does such a task require a higher-level programming language, like Python or similar?
cat 1.js 2.js blabla.js > merged_280120111257.js
A general solution would be:
cat *.js > merged_`date +%d%m%Y%H%M`.js
Just out of interest - do you think it is a good idea to name the files with DDMMYYYYHHMM? It may be difficult to sort the files chronologically (within the shell). How about the YYYYMMDDHHMM pattern?
cat *.js > merged_`date +%Y%m%d%H%M`.js
You can sort the incoming files as well. The default is alphabetical order, but this example goes from the oldest to the newest by file modification timestamp:
cat `ls -tr *.js` > merged_`date +%Y%m%d%H%M`.js
In this example cat takes the list of files from the ls command, and -t sorts by timestamp, and -r reverses the default order.
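One thing to watch with cat *.js is that the merged file also ends in .js, so a second run in the same directory would pull in the result of the first. The following is a sketch of one way around that, assuming the output name always starts with merged_; the function name merge_js is made up for illustration. Quoting each file name also keeps names with spaces intact, which the ls-based variant does not.

```shell
# Hypothetical helper: $1 = directory with the .js files,
# $2 = output file name (created inside $1, assumed to start with merged_).
merge_js() {
    (
        cd "$1" || exit 1
        for f in *.js; do
            # Skip earlier merge results and the file being written.
            case $f in merged_*) continue ;; esac
            cat -- "$f"
        done > "$2"
    )
}

# Sample files mirroring the question, in a temporary directory.
js=$(mktemp -d)
printf 'aaa\nbbb\n' > "$js/1.js"
printf 'ccc\nddd\neee\n' > "$js/2.js"
printf 'fff\n' > "$js/blabla.js"

out="merged_$(date +%Y%m%d%H%M).js"
merge_js "$js" "$out"
merge_js "$js" "$out"   # a second run does not duplicate the content
cat "$js/$out"
```

The files are concatenated in the shell's glob order, which is alphabetical as in the cat *.js answers.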
