Directory layout for pure Ruby project

Directory layout for pure Ruby project - ruby

I'm starting to learn ruby. I'm also a day-to-day C++ dev.
For C++ projects I usually go with following dir structure
/
-/bin <- built binaries
-/build <- build time temporary object (eg. .obj, cmake intermediates)
-/doc <- manuals and/or Doxygen docs
-/src
--/module-1
--/module-2
-- non module specific sources, like main.cpp
- IDE project files (.sln), etc.
What dir layout for Ruby (non-Rails, non-Merb) would you suggest to keep it clean, simple and maintainable?

As of 2011, it is common to use jeweler instead of newgem as the latter is effectively abandoned.

Bundler includes the necessary infrastructure to generate a gem:
$ bundle gem --coc --mit --test=minitest --exe spider
Creating gem 'spider'...
MIT License enabled in config
Code of conduct enabled in config
create spider/Gemfile
create spider/lib/spider.rb
create spider/lib/spider/version.rb
create spider/spider.gemspec
create spider/Rakefile
create spider/README.md
create spider/bin/console
create spider/bin/setup
create spider/.gitignore
create spider/.travis.yml
create spider/test/test_helper.rb
create spider/test/spider_test.rb
create spider/LICENSE.txt
create spider/CODE_OF_CONDUCT.md
create spider/exe/spider
Initializing git repo in /Users/francois/Projects/spider
Gem 'spider' was successfully created. For more information on making a RubyGem visit https://bundler.io/guides/creating_gem.html
Then, in lib/, you create modules as needed:
lib/
spider/
base.rb
crawler/
base.rb
spider.rb
require "spider/base"
require "crawler/base"
Read the manual page for bundle gem for details on the --coc, --exe and --mit options.

The core structure of a standard Ruby project is basically:
lib/
foo.rb
foo/
share/
foo/
test/
helper.rb
test_foo.rb
HISTORY.md (or CHANGELOG.md)
LICENSE.txt
README.md
foo.gemspec
The share/ is rare and is sometimes called data/ instead. It is for general purpose non-ruby files. Most projects don't need it, but even when they do many times everything is just kept in lib/, though that is probably not best practice.
The test/ directory might be called spec/ if BDD is being used instead of TDD, though you might also see features/ if Cucumber is used, or demo/ if QED is used.
These days foo.gemspec can just be .gemspec --especially if it is not manually maintained.
If your project has command line executables, then add:
bin/
foo
man/
foo.1
foo.1.md or foo.1.ronn
In addition, most Ruby project's have:
Gemfile
Rakefile
The Gemfile is for using Bundler, and the Rakefile is for Rake build tool. But there are other options if you would like to use different tools.
A few other not-so-uncommon files:
VERSION
MANIFEST
The VERSION file just contains the current version number. And the MANIFEST (or Manifest.txt) contains a list of files to be included in the project's package file(s) (e.g. gem package).
What else you might see, but usage is sporadic:
config/
doc/ (or docs/)
script/
log/
pkg/
task/ (or tasks/)
vendor/
web/ (or site/)
Where config/ contains various configuration files; doc/ contains either generated documentation, e.g. RDoc, or sometimes manually maintained documentation; script/ contains shell scripts for use by the project; log/ contains generated project logs, e.g. test coverage reports; pkg/ holds generated package files, e.g. foo-1.0.0.gem; task/ could hold various task files such as foo.rake or foo.watchr; vendor/ contains copies of the other projects, e.g. git submodules; and finally web/ contains the project's website files.
Then some tool specific files that are also relatively common:
.document
.gitignore
.yardopts
.travis.yml
They are fairly self-explanatory.
Finally, I will add that I personally add a .index file and a var/ directory to build that file (search for "Rubyworks Indexer" for more about that) and often have a work directory, something like:
work/
NOTES.md
consider/
reference/
sandbox/
Just sort of a scrapyard for development purposes.

#Dentharg: your "include one to include all sub-parts" is a common pattern. Like anything, it has its advantages (easy to get the things you want) and its disadvantages (the many includes can pollute namespaces and you have no control over them). Your pattern looks like this:
- src/
some_ruby_file.rb:
require 'spider'
Spider.do_something
+ doc/
- lib/
- spider/
spider.rb:
$: << File.expand_path(File.dirname(__FILE__))
module Spider
# anything that needs to be done before including submodules
end
require 'spider/some_helper'
require 'spider/some/other_helper'
...
I might recommend this to allow a little more control:
- src/
some_ruby_file.rb:
require 'spider'
Spider.include_all
Spider.do_something
+ doc/
- lib
- spider/
spider.rb:
$: << File.expand_path(File.dirname(__FILE__))
module Spider
def self.include_all
require 'spider/some_helper'
require 'spider/some/other_helper'
...
end
end

Why not use just the same layout? Normally you won't need build because there's no compilation step, but the rest seems OK to me.
I'm not sure what you mean by a module but if it's just a single class a separate folder wouldn't be necessary and if there's more than one file you normally write a module-1.rb file (at the name level as the module-1 folder) that does nothing more than require everything in module-1/.
Oh, and I would suggest using Rake for the management tasks (instead of make).

I would stick to something similar to what you are familiar with: there's no point being a stranger in your own project directory. :-)
Typical things I always have are lib|src, bin, test.
(I dislike these monster generators: the first thing I want to do with a new project is get some code down, not write a README, docs, etc.!)

So I went with newgem.
I removed all unnecessary RubyForge/gem stuff (hoe, setup, etc.), created git repo, imported project into NetBeans. All took 20 minutes and everything's on green.
That even gave me a basic rake task for spec files.
Thank you all.

Related

Where should I install myproj-config.cmake and myproj-version-config.cmake?

Suppose you're developing some library, myproj, using CMake for build configuration; supporting the cmake --install (using install() commands); and supporting use of myproj with CMake config mode, i.e. by making relevant .cmake files accessible to dependent projects.
Now, ,given an install root directory - where should I install my project's configuration .cmake files? Is there an idiomatic standard(ish) location?
Sorush Khajepor's R&D blog suggests ${LIB_INSTALL_DIR}/cmake/myproj - and it's the newest.
Foonathan's blog suggests placing the config .cmake files in ${LIB_INSTALL_DIR}/. So does Falkor's blog.
The documentation page for the CMakePackageConfigHelpers module suggests: ${LIB_INSTALL_DIR}/myproj/cmake.
What's the most popular/idiomatic choice? And what are its pros and cons relative to the other ones?

I advocate for setting a cache variable to override this and defaulting it to <LIBDIR>/cmake/ProjName (as you suggest in your answer):
cmake_minimum_required(VERSION 3.21) # for saner CACHE variables
project(ProjName VERSION 0.1.0)
# ...
include(GNUInstallDirs)
include(CMakePackageConfigHelpers)
set(ProjName_INSTALL_CMAKEDIR "${CMAKE_INSTALL_LIBDIR}/cmake/ProjName"
CACHE STRING "Path to ProjName CMake files")
install(EXPORT ProjName_Targets
DESTINATION "${ProjName_INSTALL_CMAKEDIR}"
NAMESPACE ProjName::
FILE ProjNameConfig.cmake
COMPONENT ProjName_Development)
write_basic_package_version_file(
ProjNameConfigVersion.cmake
COMPATIBILITY SameMajorVersion)
install(FILES
"${CMAKE_CURRENT_BINARY_DIR}/ProjNameConfigVersion.cmake"
DESTINATION "${ProjName_INSTALL_CMAKEDIR}"
COMPONENT ProjName_Development)
I wrote a blog post with an expanded version of this a while back: https://alexreinking.com/blog/building-a-dual-shared-and-static-library-with-cmake.html
In general, setting an install() destination to anything other than "${SOME_CACHE_VARIABLE}" is bound to cause headaches for some package maintainer. Where GNUInstallDirs doesn't provide a valid configuration point, you must create your own.

I'll argue in favor of ${LIB_INSTALL_DIR}/cmake/myproj.
If you're installing to some library-specific install location, e.g. /opt/myproj - then it doesn't really matter all that much anyway. But think about what happens when you install to, say, /usr/local.
If you place the scripts in ${LIB_INSTALL_DIR}, that library now becomes full of foo-config.cmake and foo-version-config.cmake, instead of just library files (and some subdirs). Less fun for browsing and searching.
If you place the scripts in ${LIB_INSTALL_DIR}/myproj/cmake, then - the same thing happens, but with per-project subdirs instead of sets of files. Better, perhaps, but instead - why don't we just replace the path elements of myproj and cmake, and that way we would get a cmake/ directory with many subdirs, instead. That's cleaner and more convenient IMHO.

Can a go module have no go.mod file?

I ran into a repo that seems to be a Go module, but there's no go.mod file in it: github.com/confluentinc/confluent-kafka-go.
Is it ok for a go module to have no go.mod file with dependencies, or the authors of that library just didn't migrate to modules yet?

Dependency modules do not need to have explicit go.mod files.
The “main module” in module mode — that is, the module containing the working directory for the go command — must have a go.mod file, so that the go command can figure out the import paths for the packages within that module (based on its module path), and so that it has a place to record its dependencies once resolved.
In addition, any modules slotted in using replace directives must have go.mod files (in order to reduce confusion due to typos or other errors in replacement paths).
However, in general a module that lacks an explicit go.mod file is valid and fine to use. Its effective module path is the path by which it was required, which can be a bit confusing if the same repository ends up in use via multiple paths. Since a module with no go.mod file necessarily doesn't specify its own dependencies, consumers of that module will have to fill in those dependencies themselves (go mod tidy will mark them as // indirect in the consumer's go.mod file).

SHORT SUMMARY OF THE DISCUSSION:
The answer is "No"!
This project contains a set of go packages, but it is not a Go module as it doesn't contain go.mod file (although, it used to be a multi-module repo (Go) previously).
go get can run in both ways: module-aware mode and legacy GOPATH mode (as of Go 1.16).
To read more about this, refer to docs by using the go command:
$ go help gopath-get
and
$ go help module-get
It'd tell about how go get works in both cases.
Also, I noticed that it can download any repository and would treat it as a Go package, even if it contains an arbitrary Python project.
I did a simple test to demonstrate the same:
$ go get github.com/mongoengine/mongoengine
And it surprisingly worked.

Modules are defined by their go.mod file. Without a go.mod file, it is not a module.
See this from the Go Modules Reference
A module is a collection of packages that are released, versioned, and distributed together. Modules may be downloaded directly from version control repositories or from module proxy servers.
A module is identified by a module path, which is declared in a go.mod file, together with information about the module's dependencies. The module root directory is the directory that contains the go.mod file.
And
A module is defined by a UTF-8 encoded text file named go.mod in its root directory.

Go Modules for microservices

I am developing multiple microservices, which require different modules (which shall be available like modules on github, but private)
My first tests with Go were all located within one package, which gets quite messy after some time
I'm coming from the Java side of programming - with loads of packages - which keep stuff clear and clean.
(This also works with public modules like on github.com/xyz/module1 github.com/xyz/module2 github.com/xyz/module3)
I just need this for private modules - how can I do so?
This is what I tried yet:
Directory of my go sources:
C:\my\path\to\go\src\
In this dir I have multiple subdirs containing modules (actually more than listed here)
my-module1
my-module2
my-module3
For each folder I called go mod init but I get the message that
package my-module1/util is not in GOROOT (c:\go\src\my-module1\util)
which is obviously right, as my private libraries reside in C:\my\path\to\go\src\
Importing packages from github with go get ... is working without troubles (those packages will be loaded but copied to c:\go\src)
Working with all files in one folder works but is not the desired solution (I need to create multiple microservices therefore I want to be able to create different projects with custom executeables and or tests)
What am I doing wrong?
If more information is needed I will provide it - just let me know what
NOTE: packages without go file in package main cannot be installed via go install. This system looks pretty confusing to me - as modules cannot be found...

finally i created a multimodule
means i have a root folder (lets call it root) which contains a go.mod file
this defines the root module
it requires all my submodules
and it replaces the absolute path into a relative.
root
* aa
* ab
* bc
main gets defined as
module root
go 1.15
require (
root/aa v0.0.0
root/ab v0.0.0
root/bc v0.0.0
)
replace (
root/aa => ./aa
root/ab => ./ab
root/bc => ./bc
)
all the submodules have a very simple definition (this is equal for all, as long as you do not have any dependencies)
module root/aa

Are pysa users expected to copy configuration files?

Facebook's Pysa tool looks useful, in the Pysa tutorial exercises they refer to files that are provided in the pyre-check repository using a relative path to include a path outside of the exercise directory.
https://github.com/facebook/pyre-check/blob/master/pysa_tutorial/exercise1/.pyre_configuration
{
"source_directories": ["."],
"taint_models_path": ["."],
"search_path": [
"../../stubs/"
],
"exclude": [
".*/integration_test/.*"
]
}
There are stubs provided for Django in the pyre-check repository which if I know the path where pyre check is installed I can hard-code in my .pyre_configuration and get something working but another developer may install pyre-check differently.
Is there a better way to refer to these provided stubs or should I copy them to the repository I'm working on?

Many projects have a standard development environment, allowing for hard coded paths in the .pyre_configuration file. These will usually point into the venv, or some other standard install location for dependencies.
For projects without a standard development environment, you could trying incorporating pyre init into your setup scripts. pyre init will setup a fresh .pyre_configuration file with paths that correspond to the current install of pyre. For additional configuration you want to add on top of the generated .pyre_configuration file (such as a pointer to local taint models), you can hand write a .pyre_configuration.local, which will act as an overlay and overwrite/add to the content of .pyre_configuration.

Pyre-check looks for the stubs in the directory specified by the typeshed directive in the configuration file.
The easiest way is to move stubs provided for Django in the pyre-check repository to the typeshed directory that is in the pyre-check directory.
For example, if you have installed pyre-check to the ~/.local/lib directory, move the django directory from ~/.local/lib/pyre_check/stubs to ~/.local/lib/pyre_check/typeshed/third_party/2and3/ and make sure your .pyre_configuration file will look like this:
{
"source_directories": ["~/myproject"],
"taint_models_path": "~/myproject/taint",
"typeshed": "~/.local/lib/pyre_check/typeshed"
}
In this case, your Django stubs directory will be ~/.local/lib/pyre_check/typeshed/third_parth/2and3/django
Pyre-check uses the following algorithm to traverse across the typeshed directory:
If it contains the third_party subdirectory, it uses a legacy method: enters just the two subdirectories: stdlib and third_party and there looks for any subdirectory except those with names starting with 2 but not 2and3, and looks for the modules in those subdirectories like 2and3, e.g. in third_party/2and3/
Otherwise, it enters the subdirectories stubs and stdlib, and looks for modules there, e.g. in stubs/, but not in stubs/2and3/.
That's why specifying multiple paths may be perplexing and confusing, and the easiest way is to setup the typeshed directory to ~/.local/lib/pyre_check/typeshed/ and move django to third_parth/2and3, so it will be ~/.local/lib/pyre_check/typeshed/third_parth/2and3/django.
Also don't forget to copy the .pysa files that you need to the taint_models_path directory. Don't set it up to the directory of the Pyre-check, create your own new directory and copy only those files that are relevant to you.

How to generate multiple packages with Rake?

The task i'm using looks something like this:
Rake::PackageTask.new("deploy", "0.1.2") do |p|
p.need_tar = true
p.package_files.include("build/**/*")
end
This generates a deploy-0.1.2.zip file. I would like to be able to generate a different package for each folder contained in the build, for example:
build/
|— en/
|— es/
|— de/
|— fr/
Should generate deploy-en-0.1.2.zip, deploy-es-0.1.2.zip, deploy-de-0.1.2.zip, deploy-fr-0.1.2.zip files.

The preconised concept is to locales, not make specific packages
Work with i18n, create a determining code if needed or assuming the locales of the workstation, server how code is executed (in case of CLI, library, etc...)
Otherwise you could create n clones for each languages of your project and create a Rakefile in the root folder how do explicit or recursive call of rake target for each gem building...
you need to do n specific gemspec (one for a specific language) , you also need to write a opy target in your Rakefile to centralized packages because of the differents pkg folders build in each language specific gem folder you create?
Personnaly i think it's not a good idea, you have to maintained n similar copy of the same application ....

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio