pandoc to make each directory a chapter

I have a lot of markdown files in various directories, each with the same format (# title, then ## sub-title).
Can I make --toc respect the folder layout, so that each folder becomes the name of a chapter and each markdown file inside it becomes that chapter's content?
So far pandoc totally ignores my folder names; it behaves the same as if all the markdown files were in one folder.

My approach is to create an index file in each folder containing a first-level heading, and to downgrade the headings in all other files by one level.
I keep the files in Git with first-level headings as the default structure; when I want to generate an ebook with pandoc, I modify the files via an automated Linux shell script, and afterwards revert the changes via Git.
Here's the script:
find ./docs/*/ -name "*.md" ! -name "*index.md" -exec perl -pi -e 's/^(#)+\s/#$&/g' {} \;
./docs/*/ means I'm looking only for files inside subfolders of the docs directory, like docs/foo/file1.md or docs/bar/file2.md.
I'm also interested only in *.md files, excluding *index.md files.
In index.md files (which I usually name 00-index.md so they sort first), I put a first-level heading #, and because those files are excluded by the find portion of the script, their headings aren't downgraded.
Next comes a Perl search-and-replace with the regular expression s/^(#)+\s/#$&/g, which finds every line starting with one or more # followed by whitespace and prepends another # to it ($& is the matched text).
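As a quick sanity check of that substitution (the sample heading is made up for illustration):
> echo '## sub-title' | perl -pe 's/^(#)+\s/#$&/g'
### sub-title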
In the end, I run pandoc with --toc-depth=2 so the table of contents contains only first- and second-level headings.
pandoc ./docs/**/*.md --verbose --fail-if-warnings --toc-depth=2 --table-of-contents -o ./ebook.epub
To revert all the changes made to the files, I restore them in the Git repo:
git restore .
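Putting the three steps together, the whole workflow can be a single script. A minimal sketch based on the commands above (run it from the repo root with a clean working tree, since the last step discards local modifications; globstar is needed for bash to expand the ./docs/**/*.md glob):
#!/usr/bin/env bash
set -e
shopt -s globstar

# 1. Downgrade headings one level in every non-index markdown file
find ./docs/*/ -name "*.md" ! -name "*index.md" -exec perl -pi -e 's/^(#)+\s/#$&/g' {} \;

# 2. Build the epub; 00-index.md sorts first in each folder, so its
#    first-level heading becomes the chapter title
pandoc ./docs/**/*.md --verbose --fail-if-warnings --toc-depth=2 --table-of-contents -o ./ebook.epub

# 3. Revert the heading changes
git restore .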

Related

Extracting contents of many zipped folders into a single directory

Kind of an easy question, but I can't find the answer. I want to extract the contents of multiple zip archives into a single directory. I am using the bash console, which is the only tool available on the particular website I am using.
For example, I have two archives: a.zip (which contains a1.txt and a2.txt) and b.zip (which contains b1.txt and b2.txt). I want to extract all four text files into a single directory.
I have tried
unzip \*.zip -d \newdirectory
But it creates two directories (a and b) with two text files in each.
I also tried concatenating the two zip archives into one big archive and extracting it, but it still creates two directories, even when I specify a new directory.
I can't figure out what I am doing wrong. Any help?
Thanks in advance!
Use the -j parameter to ignore any directory structure.
unzip -j -d /path/to/your/directory '*.zip'
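With the example archives above, a run might look like this (the combined directory name is illustrative; the quotes keep the shell from expanding the glob so unzip iterates over the matching archives itself):
unzip -j -d combined '*.zip'
ls combined
# a1.txt  a2.txt  b1.txt  b2.txt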

Append part of folder name to all .gz within

I have a folder of data folders with the following structure:
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/data1.gz
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/data2.gz
sampleName2-randomNumbers/subfolder1/subfolder2/subfolder3/data1.gz
I want to rename all the .gz data files within each sample folder by prefixing them with the sample name, but not the random numbers, to get:
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/sampleName1_data1.gz
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/sampleName1_data2.gz
sampleName2-randomNumbers/subfolder1/subfolder2/subfolder3/sampleName2_data1.gz
It seems like this should be a simple mv for loop, but I haven't been able to figure out how to pull part of a folder name out using basename.
for i in */Data/Intensities/BaseCalls/*.gz; do mv "$i" "fastq/${i%%-*}.$(basename "$i")"; done
I couldn't figure out how to make the files stay in their original folders, but for my purposes it works to have all the files go to a new folder ("fastq").
I suppose the "sampleName" part doesn't include dashes. In that case, use the standard pattern-removal expansion %%: if your full path (relative to the directory root) is stored in $path, then ${path%%-*} extracts the "sampleName" part. Search for %% in the Bash Reference Manual for more details. As a simple example:
> path=sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/data1.gz
> echo ${path%%-*}
sampleName1
Otherwise, you could also use more advanced substring extraction based on regex. See BashFAQ/100 or Manipulating Strings from the TLDP Advanced Bash Scripting Guide.
Update. Here's the full command to perform the job described, and it is entirely native to the shell:
for file in */Data/Intensities/BaseCalls/*.gz; do
mv "$file" "${file%/*}/${file%%-*}_${file##*/}"
done
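To see how the three expansions combine, here is the example path from above run through each one (echo serves as a dry run):
> file=sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/data1.gz
> echo "${file%/*}"      # directory part
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3
> echo "${file%%-*}"     # everything before the first dash
sampleName1
> echo "${file##*/}"     # basename
data1.gz
> echo "${file%/*}/${file%%-*}_${file##*/}"
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/sampleName1_data1.gz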

Finding and Removing Unused Files Through Command Line

My website's file structure has gotten very messy over the years from uploading random files to test different things out. I have a list of all my files, such as this:
file1.html
another.html
otherstuff.php
cool.jpg
whatsthisdo.js
hmmmm.js
Is there any way I can input my list of files via the command line, search the contents of all the other files on my website, and output a list of the files that aren't mentioned anywhere in my other files?
For example, if cool.jpg and hmmmm.js weren't mentioned in any of my other files then it could output them in a list like this:
cool.jpg
hmmmm.js
And then any of those other files mentioned above aren't listed because they are mentioned somewhere in another file. Note: I don't want it to just automatically delete the unused files, I'll do that manually.
Also, of course I have multiple folders so it will need to search recursively from my current location and output all the unused (unreferenced) files.
I'm thinking the command line would be the fastest/easiest way, unless someone knows of another. Thanks in advance for any help you can give!
Yep! This is pretty easy to do with grep. In this case, you would run a command like:
$ while read -r orphan; do
    echo "Checking for presence of ${orphan} in present directory..."
    grep -rlF "$orphan" .
  done < orphans.txt
And orphans.txt would look like your list of files above, one file per line. You can add -i to the grep above if you want to match case-insensitively (the -F makes the file names match literally, so the dot in cool.jpg isn't treated as a regex wildcard). You would want to run the command in /var/www or wherever your distribution keeps its webroots. If a "Checking for..." line has no matches listed below it, nothing references that file.
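If you want the unreferenced files printed directly, as in the question, invert the test: grep -q exits successfully when a match exists, so print the name only on failure. A sketch under the same assumptions (keep orphans.txt outside the searched tree, or every name will match itself):
while read -r orphan; do
  # print the name only if no file under . mentions it
  grep -rqF "$orphan" . || echo "$orphan"
done < orphans.txt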

Unable to create the md5sum file I need to create. Manually doing it would be far too labour-intensive

I need to create/recreate an md5sum file for all files in a directory and all files in all sub-directories of that directory.
I am using a RocketTheme template that requires a valid md5sum document, and I have made changes to the files, so the originally included md5sum file is no longer valid.
There are over 300 files that need to be checksummed, with each MD5 hash added to a single file.
The basic structure of the file is as follows:
1555599f85c7cd6b3d8f1047db42200b admin/forms/fields/imagepicker.php
8a3edb0428f11a404535d9134c90063f admin/forms/fields/index.html
8a3edb0428f11a404535d9134c90063f admin/forms/index.html
8a3edb0428f11a404535d9134c90063f admin/index.html
8a3edb0428f11a404535d9134c90063f admin/presets/index.html
b6609f823ffa5cb52fc2f8a49618757f admin/presets/preset1.png
7d84b8d140e68c0eaf0b3ee6d7b676c8 admin/presets/preset2.png
0de9472357279d64771a9af4f8657c2a admin/presets/preset3.png
5bda28157fe18bffe11cad1e4c8a78fa admin/presets/preset4.png
2ff2c5c22e531df390d2a4adb1700678 admin/presets/preset5.png
4b3561659633476f1fd0b88034ae1815 admin/presets/preset6.png
8a3edb0428f11a404535d9134c90063f admin/tips/index.html
2afd5df9f103032d5055019dbd72da38 admin/tips/overview.xml
79f1beb0ce5170a8120ba65369503bdc component.php
caf4a31db542ca8ee63501b364821d9d css/grid-responsive.css
8a3edb0428f11a404535d9134c90063f css/index.html
8697baa2e31e784c8612e2c56a1cd472 css/master-gecko.css
0857bc517aa15592eb796553fd57668b css/master-ie10.css
a4625ce5b8e23790eacb7704742bf735 css/master-ie8.css
This is just a snippet, but the logic is there.
hash path/to/file/relative/to/MD5SUM_file
Can anyone help me write a shell script (bash) that I can add to my path, which will generate a file called "MD5SUM_new"? I want the output file named "MD5SUM_new" so I can review the content before issuing mv MD5SUM_new MD5SUM.
FYI, the MD5SUM_new file needs to be saved in the root level of the template.
Thanks
This is quite easy, really. To hash all files under the current directory:
find . -type f -print0 | xargs -0 md5sum > md5sums
Then, you can make sure it's correct:
md5sum -c md5sums
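To match the exact requirements above (output named MD5SUM_new, paths relative to the template root without a leading ./, and the checksum files themselves excluded), here is a sketch to run from the template root, assuming GNU md5sum's two-space output format:
#!/usr/bin/env bash
# Hash every file except the MD5SUM files themselves
find . -type f ! -name 'MD5SUM*' -print0 \
  | xargs -0 md5sum \
  | sed 's|  \./|  |' > MD5SUM_new   # strip the ./ prefix that find adds
Review MD5SUM_new, then promote it with mv MD5SUM_new MD5SUM as described.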

Replacing a file in a zip archive

Using Ruby (1.9.3) I need to replace a single file in a zip archive.
The situation is as follows. I have ~1000 zip archives that need to be updated, specifically one file in each of them needs to be replaced. The archives are all of the same structure. Is there a quick and dirty way for Ruby, or a library/gem for Ruby, to simply say "replace the file in this zip archive with this file on the filesystem"?
I'll work on a solution of my own in the meantime.
You can use the zip command, called from Ruby, which will probably be the best solution. From the zip manpage:
-d
--delete
Remove (delete) entries from a zip archive. For example:
zip -d foo foo/tom/junk foo/harry/\* \*.o
will remove the entry foo/tom/junk, all of the files that start with foo/harry/, and all of the files that end with .o (in any path). Note that shell pathname expansion has been inhibited with backslashes, so that zip can see the asterisks, enabling zip to match on the contents of the zip archive instead of the contents of the current directory. (The backslashes are not used on MSDOS-based platforms.) Can also use quotes to escape the asterisks, as in
zip -d foo foo/tom/junk "foo/harry/*" "*.o"
Not escaping the asterisks on a system where the shell expands wildcards could result in the asterisks being converted to a list of files in the current directory, and that list being used to delete entries from the archive.
Under MSDOS, -d is case sensitive when it matches names in the zip archive. This requires that file names be entered in upper case if they were zipped by PKZIP on an MSDOS system. (We considered making this case insensitive on systems where paths were case insensitive, but it is possible the archive came from a system where case does matter, and the archive could include both Bar and bar as separate files.) But see the new option -ic to ignore case in the archive.
If you want a pure Ruby solution, take a look at ZipFileSystem.
Zip::ZipFile looks promising; it appears to have a way to delete and add files to a zip archive.
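Called from Ruby via system, or run directly in the shell, the whole batch job can be a short loop using the -d flag described above. A sketch (the entry name config.xml and the replacement path are hypothetical; substitute the real entry's name and path inside your archives):
for archive in *.zip; do
  # remove the old entry, then add the replacement at the archive root
  zip -d "$archive" config.xml
  zip -j "$archive" /path/to/new/config.xml   # -j stores it without its local directory path
done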
