Sphinx documentation: custom images dir and _static directory for HTML docs - python-sphinx

I am a relative beginner developing a Python package. At the root of the repository there are two important directories: images and docs. The former contains some PNG and SVG files I would like to include in the documentation; the latter is where I ran sphinx-quickstart. I cannot change that layout, so I have to tell Sphinx to use the top-level images directory when building the docs.
Based on what I found on the internet, I adjusted conf.py to contain:
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static', '../images']
And in the index.rst I have to point to the image file itself:
.. image:: ../images/scheme.svg
   :width: 500
   :alt: schematic
   :align: center
With these two set up, I run make html and get clean logs, but the output directory looks a little strange. Once the build is finished I have a docs/_build/html directory which contains _static and _images subdirectories (among many others). What I find strange is that docs/_build/html/_static contains a copy of everything in the root-level images directory, whereas docs/_build/html/_images contains only scheme.svg. So this one file is duplicated in both subdirectories.
This does not look very clean to me. How should I adjust this setup?
Reply to the comment of bad_coder:
Below I will paste a tree with the dir structure (kept only the relevant elements):
.
├── docs
│   ├── Makefile
│   ├── _build
│   │   └── html
│   │       ├── _images
│   │       │   └── scheme.svg
│   │       └── _static
│   │           └── scheme.svg
│   ├── conf.py
│   ├── index.html
│   └── index.rst
└── images
    └── scheme.svg
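A minimal sketch of one possible cleanup, assuming the images are only ever pulled in through the image directive (as in the index.rst shown above). Sphinx copies every image referenced by a directive into _images on its own, while html_static_path copies whole directories into _static whether or not anything references them. Under that assumption the extra entry can be dropped from conf.py:

```python
# docs/conf.py (fragment)
# List only directories whose files are NOT referenced through directives,
# such as custom style sheets. Images used via ".. image::" are copied to
# _images automatically, so '../images' is assumed to be unnecessary here.
html_static_path = ['_static']
```

The image directive in index.rst stays as it is, with the path given relative to the document.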

Related

Why does Sphinx duplicate images in HTML output directory?

I am using Sphinx to generate HTML documentation and have a structure similar to the following:
docs/
├── _static/
├── _templates/
├── guide/
│ ├── index.rst
│ └── image.jpg
├── conf.py
├── index.rst
├── make.bat
└── Makefile
I reference docs/guide/index.rst from docs/index.rst and embed docs/guide/image.jpg in docs/guide/index.rst using the image directive.
What I notice is that after running make html, the build directory created has duplicated image.jpg, once in a build/html/_static/ folder, and once in a build/html/_images/ folder. Is there any reason for this or way to not duplicate the image? It seems that the generated HTML is only referencing the image using the build/html/_images/ path.
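To see exactly which files end up in both places, a small diagnostic sketch may help. It only compares file names between the two output subdirectories; duplicated_names is a hypothetical helper, and the build/html path in the usage line is taken from the question rather than from any Sphinx setting:

```python
from pathlib import Path


def duplicated_names(html_dir):
    """Return file names present under both _static and _images."""
    html_dir = Path(html_dir)
    # Collect bare file names found anywhere below each subdirectory.
    static = {p.name for p in (html_dir / "_static").rglob("*") if p.is_file()}
    images = {p.name for p in (html_dir / "_images").rglob("*") if p.is_file()}
    return sorted(static & images)
```

Calling duplicated_names("build/html") after make html lists the overlap. If only images show up, a likely (but unconfirmed) cause is an html_static_path entry in conf.py that covers the directory holding them.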

Pandoc --resource-path not finding assets

I am using the pandoc/latex docker container and having some issues with --resource-path. My project is using a latex template which references an image, and I cannot get latex to find the image.
My file tree looks like this:
/app/tmp/
├── 42
│   ├── galley.md
│   └── img
│       ├── 1.jpg
│       ├── 5.jpg
│       ├── 7.jpg
│       └── 9.jpg
And I have user data dirs like this:
/home/worker/.pandoc
├── assets
│   └── logo.png
└── templates
    └── sow.latex
The command I'm using looks like
pandoc --from markdown --to latex /app/tmp/42/galley.md --output=/app/tmp/42/2/out.pdf --template sow.latex --resource-path=.:/home/worker/.pandoc/assets
As mentioned the latex template does make a reference to logo.png:
\includegraphics[width=0.75\textwidth]{logo.png}\par\vspace{1cm}
Whenever I run pandoc I get the following error:
! Package pdftex.def Error: File `logo.png' not found: using draft setting.
If I comment out the reference to logo.png then everything works fine, including the other images referenced within galley.md. Likewise, if I copy logo.png into the directory with galley.md everything works fine. That is acceptable as a workaround, but it feels quite clunky, so I would rather sort out how to reference logo.png from where it sits in assets.
Is there something I've missed about how --resource-path interacts with templates?

Correct directory structure for Puppet RSpec testing

I'm having some issues creating unit tests for my Puppet control repository.
I mostly work with roles and profiles with the following directory structure:
[root#puppet]# tree site
site
├── profile
│   ├── files
│   │   └── demo-website
│   │       └── index.html
│   └── manifests
│       ├── base.pp
│       ├── ci_runner.pp
│       ├── docker.pp
│       ├── gitlab.pp
│       ├── logrotate.pp
│       └── website.pp
├── role
│   └── manifests
│       ├── gitlab_server.pp
│       └── nginx_webserver.pp
Where do I need to place my spec files and what are the correct filenames?
I tried placing them here:
[root#puppet]# cat spec/classes/profile_ci_runner_spec.rb
require 'spec_helper'
describe 'profile::ci_runner' do
...
But I get an error:
Could not find class ::profile::ci_runner
The conventional place for a module's spec tests is in the module, with the spec/ directory in the module root. So site/profile/spec/classes/ci_runner_spec.rb, for example.
You could consider installing PDK, which can help you set up the structure and run tests, among other things.
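As an illustration of the naming convention described above, here is a small sketch that maps a class name to the spec file location inside its module. spec_path_for is a hypothetical helper and "site" is the directory from the question. The handling of deeper namespaces (sub-directories under spec/classes) is an assumption and not something stated in the answer:

```python
def spec_path_for(class_name, site_dir="site"):
    """Map e.g. 'profile::ci_runner' to its conventional spec file path."""
    module, *rest = class_name.lstrip(":").split("::")
    if not rest:
        raise ValueError("expected a name like 'module::class'")
    # Assumption: nested namespaces become sub-directories of spec/classes.
    return "/".join([site_dir, module, "spec", "classes", *rest]) + "_spec.rb"


print(spec_path_for("profile::ci_runner"))
```

For profile::ci_runner this gives the same path as the example in the answer, site/profile/spec/classes/ci_runner_spec.rb.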

Altering snakemake workflow to anticipate and accommodate different data-structures

I have an existing snakemake RNAseq workflow that works fine with a directory tree as below. I need to alter the workflow so that it can accommodate another layer of directories. Currently, I use a Python script that runs os.walk on the parent directory and creates a JSON file for the sample wildcards (the JSON file is also included below). I am not very familiar with Python. It seems to me that adapting the code for an extra layer of directories shouldn't be too difficult, and I was hoping someone would be kind enough to point me in the right direction.
RNAseqTutorial/
├── Sample_70160
│ ├── 70160_ATTACTCG-TATAGCCT_S1_L001_R1_001.fastq.gz
│ └── 70160_ATTACTCG-TATAGCCT_S1_L001_R2_001.fastq.gz
├── Sample_70161
│ ├── 70161_TCCGGAGA-ATAGAGGC_S2_L001_R1_001.fastq.gz
│ └── 70161_TCCGGAGA-ATAGAGGC_S2_L001_R2_001.fastq.gz
├── Sample_70162
│ ├── 70162_CGCTCATT-ATAGAGGC_S3_L001_R1_001.fastq.gz
│ └── 70162_CGCTCATT-ATAGAGGC_S3_L001_R2_001.fastq.gz
├── Sample_70166
│ ├── 70166_CTGAAGCT-ATAGAGGC_S7_L001_R1_001.fastq.gz
│ └── 70166_CTGAAGCT-ATAGAGGC_S7_L001_R2_001.fastq.gz
├── scripts
├── groups.txt
└── Snakefile
{
    "Sample_70162": {
        "R1": [
            "/gpfs/accounts/SlurmMiKTMC/Sample_70162/Sample_70162.R1.fq.gz"
        ],
        "R2": [
            "/gpfs/accounts/SlurmMiKTMC/Sample_70162/Sample_70162.R2.fq.gz"
        ]
    }
}
The structure I need to accommodate is below
RNAseqTutorial/
├── part1
│   ├── 030-150-G
│   │   ├── 030-150-GR1_clipped.fastq.gz
│   │   └── 030-150-GR2_clipped.fastq.gz
│   ├── 030-151-G
│   │   ├── 030-151-GR1_clipped.fastq.gz
│   │   └── 030-151-GR2_clipped.fastq.gz
│   ├── 100T
│   │   ├── 100TR1_clipped.fastq.gz
│   │   └── 100TR2_clipped.fastq.gz
├── part2
│   ├── 030-025G
│   │   ├── 030-025GR1_clipped.fastq.gz
│   │   └── 030-025GR2_clipped.fastq.gz
│   ├── 030-131G
│   │   ├── 030-131GR1_clipped.fastq.gz
│   │   └── 030-131GR2_clipped.fastq.gz
│   ├── 030-138G
│   │   ├── 030-138R1_clipped.fastq.gz
│   │   └── 030-138R2_clipped.fastq.gz
├── part3
│   ├── 030-103G
│   │   ├── 030-103GR1_clipped.fastq.gz
│   │   └── 030-103GR2_clipped.fastq.gz
│   ├── 114T
│   │   ├── 114TR1_clipped.fastq.gz
│   │   └── 114TR2_clipped.fastq.gz
├── scripts
├── groups.txt
└── Snakefile
The main script that generates the json file for the sample wildcards is below
for root, dirs, files in os.walk(args):
    for file in files:
        if file.endswith("fq.gz"):
            full_path = join(root, file)
            # R1 will be forward reads, R2 will be reverse reads
            m = re.search(r"(.+).(R[12]).fq.gz", file)
            if m:
                sample = m.group(1)
                reads = m.group(2)
                FILES[sample][reads].append(full_path)
I just can't seem to wrap my head around a way to accommodate that extra layer. Is there another module or function other than os.walk? Could I somehow force os.walk to skip a directory and merge the part and sample prefixes? Any suggestions would be helpful!
Edited to add:
I wasn't clear in describing my problem, and I noticed that the second example wasn't representative because that tree was taken from a directory processed by someone else, so I have fixed the examples accordingly. The data I get comes in two forms. The first is samples of only one tissue, where the directory consists of WD, sample folders, and fastq files, and the fastq files have the same prefix as the sample folders they reside in. The second is samples from two tissues, which must be processed separately from each other. Both tissue types can appear in separate "Parts", and tissues of the same type from different "Parts" must be processed together. It would help if I could get os.walk to return four-element tuples, or even use
root,dirs,files*=os.walk('Somedirectory')
where the * would append the rest of the directory string to the files variable. Unfortunately, this method does not reach the file level for the third child directory 'root/part/sample/fastq'. In an ideal world, the same snakemake pipeline would handle both scenarios with minimal input from the user. I understand that this may not be possible, but I figured I would ask whether there is a module that could return all portions of each sample directory string.
It seems to me that your problem doesn't have much to do with how to accommodate the second layer. Instead, the question is about the specification of the directory trees and file names you expect.
In the first case, it seems you can extract the sample name from the first part of the file name. In the second case, file names are all the same and the sample name comes from the parent directory. So, either you implement some logic that tells which naming scheme you are parsing (and this depends on who/what provides the files) or you always extract the sample name from the parent directory as this should work also for the first case (but again, assuming you can rely on such naming scheme).
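For the first option, here is a minimal sketch of telling the two naming schemes apart from the file name alone. The two patterns are assumptions read off the example trees in the question (an _R1_001.fastq.gz suffix in the first layout, R1_clipped.fastq.gz in the second), and parse_fastq_name is a hypothetical helper name:

```python
import re

# Assumed scheme 1: <sample>_<barcodes>_..._R1_001.fastq.gz
ILLUMINA = re.compile(r"^(?P<sample>[^_]+)_.+_(?P<read>R[12])_001\.fastq\.gz$")
# Assumed scheme 2: <sample>R1_clipped.fastq.gz
CLIPPED = re.compile(r"^(?P<sample>.+)(?P<read>R[12])_clipped\.fastq\.gz$")


def parse_fastq_name(name):
    """Return (scheme, sample, read), or None if no pattern matches."""
    for scheme, pattern in (("illumina", ILLUMINA), ("clipped", CLIPPED)):
        m = pattern.match(name)
        if m:
            return scheme, m.group("sample"), m.group("read")
    return None
```

Note that in the second tree the files under 030-138G are named 030-138R1_clipped.fastq.gz, so a sample name derived from the file would differ from the directory name there.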
If you want to go for the second option, something like this should do:
import os

FILES = {}
for root, dirs, files in os.walk('RNAseqTutorial'):
    for file in files:
        if file.endswith("fastq.gz"):
            # The sample name is the directory that directly holds the file
            sample = os.path.basename(root)
            full_path = os.path.join(root, file)
            if sample not in FILES:
                FILES[sample] = {}
            if 'R1' in file:
                reads = 'R1'
            elif 'R2' in file:
                reads = 'R2'
            else:
                raise Exception('Unexpected file name')
            if reads not in FILES[sample]:
                FILES[sample][reads] = []
            FILES[sample][reads].append(full_path)
Not sure if I understand correctly, but here you go:
for root, dirs, files in os.walk(args):
    for file in files:
        if file.endswith("fq.gz"):
            full_path = join(root, file)
            reads = 'R1' if 'R1' in file else 'R2'
            sample = root.split('/')[-1]
            FILES[sample][reads].append(full_path)
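On the question of getting "all portions" of each sample directory string, pathlib may be worth a look. The sketch below is one way to handle both layouts, assuming the sample name is always the directory that directly contains the fastq files. collect_fastq is a hypothetical name, and the 'R1'/'R2' substring test is the same simplification used in the answers above:

```python
import json
from collections import defaultdict
from pathlib import Path


def collect_fastq(top):
    """Map sample name -> read ('R1' or 'R2') -> list of file paths."""
    top = Path(top)
    files = defaultdict(lambda: defaultdict(list))
    for path in sorted(top.rglob("*.fastq.gz")):
        # Components below top, e.g. ('Sample_70160', 'x.fastq.gz')
        # or ('part1', '030-150-G', 'x.fastq.gz').
        parts = path.relative_to(top).parts
        if len(parts) < 2:
            continue  # file sits directly under top, no sample directory
        sample, name = parts[-2], parts[-1]
        if "R1" in name:
            read = "R1"
        elif "R2" in name:
            read = "R2"
        else:
            raise ValueError(f"Unexpected file name: {path}")
        files[sample][read].append(str(path))
    return files
```

Something like json.dumps(collect_fastq("RNAseqTutorial"), indent=2) would then produce the sample file. If the "part" prefix is needed, it is available as parts[:-2] inside the loop.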

What files of an Xcode project should I store in version control?

I'm new to Xcode and just found out that it stores a bunch of user information and other stuff in the project directory that I don't really need in version control or want to put up on Github.
This is what an Xcode project basically looks like:
AppName/
├── AppName
│   ├── Base.lproj
│   │   ├── LaunchScreen.xib
│   │   └── Main.storyboard
│   ├── Images.xcassets
│   │   └── AppIcon.appiconset
│   │       └── Contents.json
│   ├── AppDelegate.swift
│   ├── Info.plist
│   └── ViewController.swift
├── AppName.xcodeproj
│   ├── project.xcworkspace
│   │   ├── xcuserdata
│   │   │   └── user1.xcuserdatad
│   │   │       └── UserInterfaceState.xcuserstate
│   │   └── contents.xcworkspacedata
│   ├── xcuserdata
│   │   └── user1.xcuserdatad
│   │       └── xcschemes
│   │           ├── AppName.xcscheme
│   │           └── xcschememanagement.plist
│   └── project.pbxproj
└── AppNameTests
    ├── AppNameTests.swift
    └── Info.plist
My inclination is to just commit the AppName/ and AppNameTests/ and exclude the AppName.xcodeproj/ directory. What's the recommended way of doing this?
You'll want to use a .gitignore file to specify which files you don't want to store in GitHub.
Here is how to create the file, and here's what should go in that .gitignore file.
A better question is what should go in your .gitignore file. This is a link to the GitHub repo containing the file you need:
https://github.com/github/gitignore/blob/master/Global/Xcode.gitignore
Make sure you start with this file so the files are properly ignored; if you don't, some files may already have been added and you will have to remove them manually.
The "recommended way" really depends on what you want to do with the project. Typically, there are three choices:
check in only those files which are necessary to build the project
also add files that reflect development customizations (such as project files that store the names of the currently-visible files in editors)
also add generated files, to make a complete snapshot of the project state.
With the last, you can get into problems with timestamps (while git can be told to know something about commit-times — see Checking out old file WITH original create/modified timestamps — few people do it). Without a system that retrieves files using their original timestamps, you end up with a set of files that demand recompilation each time you do a commit.
Even saving the customization files can be problematic, if you move the files to another part of the filesystem (or attempt to share the files with others).
So... use .gitignore to filter out files not needed to build. But check that you can successfully build using a fresh checkout.
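To make the "needed to build" split concrete for the tree in the question, here is a small sketch that separates per-user files from the rest. It assumes that anything under an xcuserdata directory is user-specific state, while project.pbxproj and the sources are required for a build. is_user_specific is a hypothetical helper and not part of git or Xcode:

```python
from pathlib import PurePosixPath


def is_user_specific(path):
    """True if the path lies inside any xcuserdata directory."""
    return "xcuserdata" in PurePosixPath(path).parts


# A few paths taken from the tree in the question.
paths = [
    "AppName/AppDelegate.swift",
    "AppName.xcodeproj/project.pbxproj",
    "AppName.xcodeproj/project.xcworkspace/contents.xcworkspacedata",
    "AppName.xcodeproj/xcuserdata/user1.xcuserdatad/xcschemes/AppName.xcscheme",
]
for p in paths:
    print("ignore" if is_user_specific(p) else "commit", p)
```

Under that assumption, excluding the whole AppName.xcodeproj directory would drop project.pbxproj as well, so a fresh checkout would presumably no longer build.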
