How to export variables from multiple files using module manifests in PowerShell? - powershell-4.0

I am trying to package multiple modules in one manifest module with PowerShell 4.0. Basically, I have 3 Setup Modules that do some stuff. I package these three using my manifest module. I then export only some functions and variables from all three modules. However, only my functions are callable from outside, my variables are nowhere to be seen. Can anyone help me out here? I basically followed this guide.
Here is my code:
Setup.ps1 (Startup-Script):
$currentDirectory = Split-Path -Parent $MyInvocation.MyCommand.Definition
$setupScriptPath = $currentDirectory + "\Setup.psd1"
Import-module $setupScriptPath
# This has to be set for every environment
$firstVariable # Not defined?
# $secondVariable= "http://url/"
# $thirdVariable= "http://url/"
Start-FirstSetup
Remove-Module -Name Setup
Setup.psd1 (Manifest module):
#
# Module manifest for module 'Setup'
#
# Generated by: Name
#
# Generated on: 1/7/2017
#
#{
# Script module or binary module file associated with this manifest.
# RootModule = ''
# Version number of this module.
ModuleVersion = '1.0'
# ID used to uniquely identify this module
GUID = 'guid'
# Author of this module
Author = 'Author name'
# Company or vendor of this module
CompanyName = 'Company'
# Copyright statement for this module
Copyright = '(c) 2017 Company'
# Description of the functionality provided by this module
Description = 'Starts three setups.'
# Minimum version of the Windows PowerShell engine required by this module
# PowerShellVersion = ''
# Name of the Windows PowerShell host required by this module
# PowerShellHostName = ''
# Minimum version of the Windows PowerShell host required by this module
# PowerShellHostVersion = ''
# Minimum version of Microsoft .NET Framework required by this module
# DotNetFrameworkVersion = ''
# Minimum version of the common language runtime (CLR) required by this module
# CLRVersion = ''
# Processor architecture (None, X86, Amd64) required by this module
# ProcessorArchitecture = ''
# Modules that must be imported into the global environment prior to importing this module
# RequiredModules = #()
# Assemblies that must be loaded prior to importing this module
# RequiredAssemblies = #()
# Script files (.ps1) that are run in the caller's environment prior to importing this module.
# ScriptsToProcess = #()
# Type files (.ps1xml) to be loaded when importing this module
# TypesToProcess = #()
# Format files (.ps1xml) to be loaded when importing this module
# FormatsToProcess = #()
# Modules to import as nested modules of the module specified in RootModule/ModuleToProcess
NestedModules = #('FirstSetup.psm1', 'SecondSetup.psm1', 'ThirdSetup.psm1')
# Functions to export from this module
FunctionsToExport = #('Start-FirstSetup', 'Start-SecondSetup', 'Start-ThirdSetup')
# Cmdlets to export from this module
CmdletsToExport = '*'
# Variables to export from this module
VariablesToExport = #('firstVariable', 'secondVariable', 'thirdVariable')
# Aliases to export from this module
AliasesToExport = '*'
# List of all modules packaged with this module
ModuleList = #('FirstSetup.psm1', 'SecondSetup.psm1', 'ThirdSetup.psm1')
# List of all files packaged with this module
# FileList = #()
# Private data to pass to the module specified in RootModule/ModuleToProcess
# PrivateData = ''
# HelpInfo URI of this module
# HelpInfoURI = ''
# Default prefix for commands exported from this module. Override the default prefix using Import-Module -Prefix.
# DefaultCommandPrefix = ''
}
FirstSetup.psm1 (First Module):
$firstVariable # This is not getting exported. Why?
function Start-FirstSetup{
Register-FirstVariables
echo "First setup started..."
}
Second and third setups: Same as first, only varables and functions are named firstVariable, thirdVariable, Start-SecondSetup, etc.
So my specific problem is, when I try to access $firstVariable from Setup.psm1, I get an error that it's not defined. But I marked it for export in my manifest module. So what did I miss here? When Start-FirstSetup is called, it goes through without any problem and I can even debug my module, but even then my $firstVariable is undefined.

You need to do 3 things to get a variable exported from a module:
Define the variable explicitly:
New-Variable -Name firstvariable
An implicit definition ($firstvariable) does not suffice.
Export the variable in the module (the .psm1 file):
Export-ModuleMember -Variable firstvariable
Define the exported variables in the manifest (the .psd1 file). A wildcard should normally suffice here:
VariablesToExport = '*'
but you can also provide a list:
VariablesToExport = #('firstvariable', 'secondvariable', 'thirdvariable')
This setting is basically a filter for defining which of the exported variables will actually be exposed. If a variable does not have a match here it will not be exported from the module, even if it's exported in a .psm1 file.

Related

Declare additional dependency to sphinx-build in an extension

TL,DR: From a Sphinx extension, how do I tell sphinx-build to treat an additional file as a dependency? In my immediate use case, this is the extension's source code, but the question could equally apply to some auxiliary file used by the extension.
I'm generating documentation with Sphinx using a custom extension. I'm using sphinx-build to build the documentation. For example, I use this command to generate the HTML (this is the command in the makefile generated by sphinx-quickstart):
sphinx-build -b html -d _build/doctrees . _build/html
Since my custom extension is maintained together with the source of the documentation, I want sphinx-build to treat it as a dependency of the generated HTML (and LaTeX, etc.). So whenever I change my extension's source code, I want sphinx-build to regenerate the output.
How do I tell sphinx-build to treat an additional file as a dependency? That is not mentioned in the toctree, since it isn't part of the source. Logically, this should be something I do from my extension's setup function.
Sample extension (my_extension.py):
from docutils import nodes
from docutils.parsers.rst import Directive
class Foo(Directive):
def run(self):
node = nodes.paragraph(text='Hello world\n')
return [node]
def setup(app):
app.add_directive('foo', Foo)
Sample source (index.rst):
.. toctree::
:maxdepth: 2
.. foo::
Sample conf.py (basically the output of sphinx-quickstart plus my extension):
import sys
import os
sys.path.insert(0, os.path.abspath('.'))
extensions = ['my_extension']
templates_path = ['_templates']
source_suffix = '.rst'
master_doc = 'index'
project = 'Hello directive'
copyright = '2019, Gilles'
author = 'Gilles'
version = '1'
release = '1'
language = None
exclude_patterns = ['_build']
pygments_style = 'sphinx'
todo_include_todos = False
html_theme = 'alabaster'
html_static_path = ['_static']
htmlhelp_basename = 'Hellodirectivedoc'
latex_elements = {
}
latex_documents = [
(master_doc, 'Hellodirective.tex', 'Hello directive Documentation',
'Gilles', 'manual'),
]
man_pages = [
(master_doc, 'hellodirective', 'Hello directive Documentation',
[author], 1)
]
texinfo_documents = [
(master_doc, 'Hellodirective', 'Hello directive Documentation',
author, 'Hellodirective', 'One line description of project.',
'Miscellaneous'),
]
Validation of a solution:
Run make html (or sphinx-build as above).
Modify my_extension.py to replace Hello world by Hello again.
Run make html again.
The generated HTML (_build/html/index.html) must now contain Hello again instead of Hello world.
It looks like the note_dependency method in the build environment API should do what I want. But when should I call it? I tried various events but none seemed to hit the environment object in the right state. What did work was to call it from a directive.
import os
from docutils import nodes
from docutils.parsers.rst import Directive
import sphinx.application
class Foo(Directive):
def run(self):
self.state.document.settings.env.note_dependency(__file__)
node = nodes.paragraph(text='Hello done\n')
return [node]
def setup(app):
app.add_directive('foo', Foo)
If a document contains at least one foo directive, it'll get marked as stale when the extension that introduces this directive changes. This makes sense, although it could get tedious if an extension adds many directives or makes different changes. I don't know if there's a better way.
Inspired by Luc Van Oostenryck's autodoc-C.
As far as I know app.env.note_dependency can be called within the doctree-read to add any file as a dependency to the document currently being read.
So in your use case, I assume this would work:
from typing import Any, Dict
from sphinx.application import Sphinx
import docutils.nodes as nodes
def doctree-read(app: Sphinx, doctree: nodes.document):
app.env.note_dependency(file)
def setup(app: Sphinx):
app.connect("doctree-read", doctree-read)

Rendering discrepancy between ReadTheDocs and localhost

I have just uploaded my documentation from Github to ReadTheDocs and I have found that it renders completely different on ReadTheDocs and my local machine. I am using the latest sphinx_rtd_theme on my local machine.
Here is the display on my local machine:
and here is the rendering on ReadTheDocs:
I have tried on Chrome, Firefox and Microsoft Edge with the same results so it does not appear to be a browser problem.
Here is a copy of my conf.py:
# -*- coding: utf-8 -*-
#
# Configuration file for the Sphinx documentation builder.
#
# This file does only contain a selection of the most common options. For a
# full list see the documentation:
# http://www.sphinx-doc.org/en/master/config
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import sys
# sys.path.insert(0, os.path.abspath('.'))
import os
# -- Project information -----------------------------------------------------
project = 'OrderedTree'
project_title = 'Ordered Tree'
copyright = '2018, Jonathan Gossage'
author = 'Jonathan Gossage'
# The short X.Y version
version = '0.0'
# The full version, including alpha/beta/rc tags
release = '0.0.1'
# -- General configuration ---------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'sphinx.ext.autodoc',
'sphinx.ext.intersphinx',
'sphinx.ext.todo',
'sphinx.ext.mathjax',
'sphinx.ext.ifconfig',
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
# source_suffix = ['.rst', '.md']
source_suffix = '.rst'
# The master toctree document.
master_doc = 'index'
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path .
exclude_patterns = []
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# Elements to be included at yhe start of each document file
rst_prolog = """
.. |br| raw:: html
<br />
.. |pn| replace:: {}
.. |pt| replace:: {}
""".format(project, project_title)
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
#html_theme = 'sphinx_rtd_theme'
on_rtd = os.environ.get('READTHEDOCS') == 'True'
if on_rtd:
html_theme = 'default'
else:
html_theme = 'sphinx_rtd_theme'
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
if on_rtd:
html_static_path = []
else:
html_static_path = ['_static']
html_context = { # Specify the css file to use
'css_files': ['_static/theme_overrides.css',]
}
# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
#
# The default sidebars (for documents that don't match any pattern) are
# defined by theme itself. Builtin themes are using these templates by
# default: ``['localtoc.html', 'relations.html', 'sourcelink.html',
# 'searchbox.html']``.
#
# html_sidebars = {}
# -- Options for HTMLHelp output ---------------------------------------------
# Output file base name for HTML help builder.
htmlhelp_basename = '{}doc'.format(project)
# -- Options for LaTeX output ------------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#
# 'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#
# 'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#
# 'preamble': '',
# Latex figure (float) alignment
#
# 'figure_align': 'htbp',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(master_doc, '{}.tex'.format(project), '{} Documentation'.format(project_title),
'{}'.format(author), 'manual'),
]
# -- Options for manual page output ------------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
(master_doc, '{}'.format(project), '{} Documentation'.format(project_title),
[author], 1)
]
# -- Options for Texinfo output ----------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(master_doc, '{}'.format(project), '{} Documentation'.format(project_title),
author, '{}'.format(project), 'One line description of project.',
'Miscellaneous'),
]
# -- Options for Epub output -------------------------------------------------
# Bibliographic Dublin Core info.
epub_title = project_title
epub_author = author
epub_publisher = author
epub_copyright = copyright
# The unique identifier of the text. This can be a ISBN number
# or the project homepage.
#
# epub_identifier = ''
# A unique identification for the text.
#
# epub_uid = ''
# A list of files that should not be packed into the epub file.
epub_exclude_files = ['search.html']
# -- Extension configuration -------------------------------------------------
# -- Options for intersphinx extension ---------------------------------------
# Example configuration for intersphinx: refer to the Python standard library.
intersphinx_mapping = {'https://docs.python.org/': None}
# -- Options for todo extension ----------------------------------------------
# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = True
I have determined what is causing the problem but I have no idea why it is happening or how to fix it. I am using a boiler-plate fragment of HTML from Creative Commons which identifies the license governing use of the documentation. I took the base sphinx_rtd_theme Footer.html and added this fragment to it and used the modified Footer.html to override the base copy. The fragment follows:
<br /> <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a>
<br />
<p>
This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>
</p>
The fragment actually works as this HTML is correctly rendered on ReadTheDocs but the rest of the formatting disappears.
What do I need to do to get the local display on ReadTheDocs?
Sounds like static assets are not getting copied over on RTD. Let's look at the build log, under python /home/docs/checkouts/readthedocs.org/user_builds/orderedtree/envs/latest/bin/sphinx-build -T -E -b readthedocs -d _build/doctrees-readthedocs -D language=en . _build/html:
copying static files... WARNING: html_static_path entry '/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx/_static' does not exist
You know what? That error sounds strangely familiar for some reason....
Let's see the setting in your conf.py:
html_static_path = ['_static']
Try changing that to:
html_static_path = []
Seems to work for about a dozen other users.

Sphinx pdfbuilder format not resolved, probably missing URL

I am trying to generate a pdf using rst2pdf, but keep getting the error "format not resolved, probably missing URL or undefined destination for target".
I use sphinx to generate the .rst files and am able to generate the HTML output just fine.
To generate the pdf I followed the instructions at: https://gist.github.com/alfredodeza/7fb5c667addb1c6963b9
Directory Structure:
- sphinxtext
* docs
** source
** build
* test.py
Conf.py:
# -*- coding: utf-8 -*-
#
# Test documentation build configuration file, created by
# sphinx-quickstart on Fri Jun 30 12:08:40 2017.
#
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
sys.path.insert(0, os.path.abspath('..\..'))
# -- General configuration ------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['sphinx.ext.autodoc',
'sphinx.ext.coverage','rst2pdf.pdfbuilder']
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
# source_suffix = ['.rst', '.md']
source_suffix = '.rst'
# The master toctree document.
master_doc = 'index'
# General information about the project.
project = u'Test'
copyright = u'2017, m.a.'
author = u'm.a.'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = u'1.0.0'
# The full version, including alpha/beta/rc tags.
release = u'1.0.0'
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This patterns also effect to html_static_path and html_extra_path
exclude_patterns = []
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = False
# -- Options for HTML output ----------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'alabaster'
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# -- Options for HTMLHelp output ------------------------------------------
# Output file base name for HTML help builder.
htmlhelp_basename = 'Testdoc'
# -- Options for LaTeX output ---------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#
'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#
'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#
'preamble': '',
# Latex figure (float) alignment
#
# 'figure_align': 'htbp',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(master_doc, 'Test.tex', u'Test Documentation',
u'm.a.', 'manual'),
]
# -- Options for manual page output ---------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
(master_doc, 'test', u'Test Documentation',
[author], 1)
]
# -- Options for Texinfo output -------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(master_doc, 'Test', u'Test Documentation',
author, 'Test', 'One line description of project.',
'Miscellaneous'),
]
# -- Options for pdf output -------------------------------------------
pdf_documents = [('index', u'rst2pdf', u'Sample rst2pdf doc', u'Your Name'),]
# index - master document
# rst2pdf - name of the generated pdf
# Sample rst2pdf doc - title of the pdf
# Your Name - author name in the pdf
test.py
class Test(object):
'''
Test class
'''
def __init__(self,**kwargs):
"""
Constructor
"""
def some_func(self,a,b,c,d,e,f,g):
"""
Function that does something TEST
:param a: Identifier
:param b: b
:param c: c
:param d: d
:param e: e
:param f: f
:param g: g
:type a: int
:type b: float
:type c: float
:type d: float
:type e: float
:type f: float
:type h: float
:returns: Test obj
:Example:
>>> test=Test()
>>> test.some_func(10, 42164., 0.0, 0., -132.0, 0.0, 0.0)
"""
self.a=a
self.b = b
self.c = c
self.d = d
self.e = e
self.f = f
self.g = g
return self
When building as html I see the following warning: checking consistency... C:\PATH\docs\source\modules.rst: WARNING: document isn't included in any toctree
My index.rst file:
.. Test documentation master file, created by
sphinx-quickstart on Fri Jun 30 12:08:40 2017.
You can adapt this file completely to your liking, but it should at least
contain the root toctree directive.
Welcome to Test's documentation!
================================
.. toctree::
:maxdepth: 2
:caption: Contents:
test
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
My test.rst:
Test Module!
================================
.. automodule:: test
:members:
This has now been resolved in rst2pdf via PR 619.
You can install the version of rst2pdf that has this fix by doing this:
$ cd {some directory where you keep 3rd-party projects, e.g. ~/projects}
$ git clone https://github.com/rst2pdf/rst2pdf.git
$ cd rst2pdf
$ python setup.py install
If you are still getting this message in 2021, it likely means rst2pdf can't figure out where one or more of your links is pointed. For example, I had in-page nav links in my .rst code like this that were ok when building HTML but causing the error referred to by the OP during PDF generation:
`Back to top ↑ <#top>`_
The most reliable solution seems to be to remove these in-page links.
Alternatively, you could just convert these implicit links to raw html blocks like below, which could be a solution if you can get rst2pdf to render raw html using xhtml2pdf (this isn't working for me, but perhaps someone else can get it).
.. |top| raw:: html
Back to top ↑
|top|
Note that in this alternative solution, the in-page links probably will not work in most PDF viewers, so this is really only a solution to get .rst files to compile to both PDF and HTML (as the OP asked), rather than one that will allow identical behavior.

Perl Image::OCR::Tesseract module on Windows

Anyone out there know of a graceful way to install the "Image::OCR::Tesseract" module on Windows? The module fails to install on Windows via CPAN due to a *NIX only module dependency called "LEOCHARRE::CLI". This module does not seem to be required to run "Image::OCR::Tesseract" itself.
I've managed to get the module working by first manually installing the dependency modules listed in the makefile.pl (except for "LEOCHARRE::CLI") and then by moving the module file to the correct directory structure under "C:\Perl\site\lib\Image\OCR". The final part of getting it to work was to alter the section of code that calls the ImageMagick and Tesseract executables from the command line to put quotes around the program names when the executables are called by module.
This works, but I'd really feel better about doing a PPM or CPAN install on a production system from a repo that works on Windows.
Never mind, I got it, though I can't decide what is the better solution.
To get the installer to work on Windows via the traditional "perl makefile.pl, make, make test, make install" routine requires an edit to the Makefile.pl script, including the missing Windows install module (Devel::AssertOS::MSWin32), and patch to AssertEXE.pm to use "File::Which" rather than the built in shell "which" command that Windows lacks. All this still requires that The "Image::OCR::Tesseract" be patched to put quotes around program names when executing "convert" and "tesseract" from the command line.
Given the number of steps involved to make the installer work on Windows, and the fact the module does not create a binary component for the module to link to, I'd say the best option for installing and getting the Tesseract module working on windows would be to first install the following binary packages:
ImageMagick
Link
Tesseract
http://code.google.com/p/tesseract-ocr/downloads/list
Next, locate your Perl module directory - on my system it is "C:\Perl\site\lib". Create a folder "Image", if you don't have one. Next, open the Image folder and create a folder called "OCR". Open the OCR folder. At this point, your path should be something along the lines of "C:\Perl\site\lib\Image\OCR". Create a new text file called "Tesseract.pm", and copy in the following content...
package Image::OCR::Tesseract;
use strict;
use Carp;
use Cwd;
use String::ShellQuote 'shell_quote';
use Exporter;
use vars qw(#EXPORT_OK #ISA $VERSION $DEBUG $WHICH_TESSERACT $WHICH_CONVERT %EXPORT_TAGS #TRASH);
#ISA = qw(Exporter);
#EXPORT_OK = qw(get_ocr get_hocr _tesseract convert_8bpp_tif tesseract);
$VERSION = sprintf "%d.%02d", q$Revision: 1.24 $ =~ /(\d+)/g;
%EXPORT_TAGS = ( all => \#EXPORT_OK );
BEGIN {
use File::Which 'which';
$WHICH_TESSERACT = which('tesseract');
$WHICH_CONVERT = which('convert');
if($^O=~m/MSWin/) {
$WHICH_TESSERACT='"'.$WHICH_TESSERACT.'"';
$WHICH_CONVERT='"'.$WHICH_CONVERT.'"';
}
$WHICH_TESSERACT or die("Is tesseract installed? Cannot find bin path to tesseract.");
$WHICH_CONVERT or die("Is convert installed? Cannot find bin path to convert.");
}
END {
scalar #TRASH or return;
if ( $DEBUG ){
print STDERR "Debug on, these are trash files:\n".join("\n",#TRASH) ;
}
else {
unlink #TRASH;
}
}
sub DEBUG { Carp::cluck("Image::OCR::Tesseract::DEBUG() deprecated") }
sub get_hocr {
my ($abs_image,$abs_tmp_dir,$lang)= #_;
-f $abs_image or croak("$abs_image is not a file on disk");
my $hocr="hocr";
if(defined $abs_tmp_dir){
-d $abs_tmp_dir or die("tmp dir arg $abs_tmp_dir not a dir on disk.");
$abs_image=~/([^\/]+)$/ or die("cant match filename in path arg '$abs_image'");
my $abs_copy = "$abs_tmp_dir/$1";
# TODO, what if source and dest are same, i want it to die
require File::Copy;
File::Copy::copy($abs_image, $abs_copy)
or die("cant make copy of $abs_image to $abs_copy, $!");
# change the image to get ocr from to be the copy
$abs_image = $abs_copy;
# since it's a copy. erase that on exit
push #TRASH, $abs_image;
}
my $tmp_tif = convert_8bpp_tif($abs_image);
push #TRASH, $tmp_tif; # for later delete
_tesseract($tmp_tif,$lang,$hocr) || '';
}
sub get_ocr {
my ($abs_image,$abs_tmp_dir,$lang)= #_;
-f $abs_image or croak("$abs_image is not a file on disk");
if(defined $abs_tmp_dir){
-d $abs_tmp_dir or die("tmp dir arg $abs_tmp_dir not a dir on disk.");
$abs_image=~/([^\/]+)$/ or die("cant match filename in path arg '$abs_image'");
my $abs_copy = "$abs_tmp_dir/$1";
# TODO, what if source and dest are same, i want it to die
require File::Copy;
File::Copy::copy($abs_image, $abs_copy)
or die("cant make copy of $abs_image to $abs_copy, $!");
# change the image to get ocr from to be the copy
$abs_image = $abs_copy;
# since it's a copy. erase that on exit
push #TRASH, $abs_image;
}
my $tmp_tif = convert_8bpp_tif($abs_image);
push #TRASH, $tmp_tif; # for later delete
_tesseract($tmp_tif,$lang) || '';
}
sub convert_8bpp_tif {
my ($abs_img,$abs_out) = (shift,shift);
defined $abs_img or die('missing image arg');
$abs_out ||= $abs_img.'.tmp.'.time().(int rand(9000)).'.tif';
my #arg = ( $WHICH_CONVERT, $abs_img, '-compress','none','+matte', $abs_out );
#die (join(" ", #arg));
system(#arg) == 0 or die("convert $abs_img error.. $?");
$DEBUG and warn("made $abs_out 8bpp tiff.");
$abs_out;
}
# people expect tesseract to automatically convert
*tesseract = \&_tesseract;
sub _tesseract {
my ($abs_image,$lang,$hocr) = #_;
defined $abs_image or croak('missing image path arg');
$abs_image=~/\.tif+$/i or warn("Are you sure '$abs_image' is a tif image? This operation may fail.");
#my #arg = (
# $WHICH_TESSERACT, shell_quote($abs_image), shell_quote($abs_image),
# (defined $lang and ('-l', $lang) ), '2>/dev/null'
#);
my $cmd =
( sprintf '%s %s %s',
$WHICH_TESSERACT,
shell_quote($abs_image),
shell_quote($abs_image)
) .
( defined $lang ? " -l $lang" : '' ) .
( defined $hocr ? " hocr" : '' ) .
" 2>/dev/null";
$DEBUG and warn "command: $cmd";
system($cmd); # hard to check ==0
my $txt = $abs_image.($hocr?".html":".txt");
unless( -f $txt ){
Carp::cluck("no text output for image '$abs_image'. (No text file '$txt' found on disk)");
return;
}
$DEBUG and warn "Found text file '$txt'";
my $content = (_slurp($txt) || '');
$DEBUG and warn("content length of text in '$txt' from image '$abs_image' is ". length $content );
push #TRASH, $txt;
$content;
}
sub _slurp {
my $abs = shift;
open(FILE,'<', $abs) or die("can't open file for reading '$abs', $!");
local $/;
my $txt = <FILE>;
close FILE;
$txt;
}
1;
__END__
#sub _force_imgtype {
# my $img = shift;
# my $type = shift;
# my $delete_original = shift;
# $delete_original ||=0;
#
#
# if($img=~/\.$type$/i){
# return $img;
# }
#
# my $img_out= $img;
# $img_out=~s/\.\w{1,5}$/\.$type/ or die("cant get file ext for $img");
#
#
#
#}
Save and close. Close the command line session and open a new one if you've had one open from before you did the ImageMagick and Tesseract binary installs. Test the module with the following script:
use Image::OCR::Tesseract;
my $image = 'SomeImageFileThatContainsText.jpg';
my $text = Image::OCR::Tesseract::get_ocr($image);
print "Text...\n";
print $text."\n";
print "Normal Exit\n";
exit;
That's it. Messy, I know, but there's no good way around the fact that the module installer really needs to be updated to support Windows (and other) systems even though the actual module code almost runs without modification. Really, if Tesseract and ImageMagick were installed to paths without spaces then the "Image::OCR::Tesseract" module code would not need any changes, but this minor tweak lets the supporting executables be installed anywhere, including the default locations.

windows django AttributeError: 'tuple' object has no attribute 'split

I am using the following command on WinXP and getting an error, but works fine on MacOS and Linux, thank you very very much for any help.
C:\Documents and Settings\Administrator\Sites\team_track>manage.py syncdb --settings=local_settings
Creating tables ...
Creating table auth_permission
Creating table auth_group_permissions
Creating table auth_group
Creating table auth_user_user_permissions
Creating table auth_user_groups
Creating table auth_user
Creating table auth_message
Creating table django_content_type
Creating table django_session
Creating table django_site
Creating table django_admin_log
Creating table app_player
Creating table app_team_players
Creating table app_team
Creating table app_game
Creating table app_gameparticipant
You just installed Django's auth system, which means you don't have any superusers defined.
Would you like to create one now? (yes/no): yes
Username (Leave blank to use 'administrator'):
E-mail address: kam#kam.com
Password:
Password (again):
Superuser created successfully.
Installing custom SQL ...
Traceback (most recent call last):
File "C:\Documents and Settings\Administrator\Sites\team_track\manage.py", line 19, in <module>
execute_manager(team_tracker.settings)
File "c:\Python27\lib\site-packages\django\core\management\__init__.py", line 438, in execute_mana
ger
utility.execute()
File "c:\Python27\lib\site-packages\django\core\management\__init__.py", line 379, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "c:\Python27\lib\site-packages\django\core\management\base.py", line 191, in run_from_argv
self.execute(*args, **options.__dict__)
File "c:\Python27\lib\site-packages\django\core\management\base.py", line 220, in execute
output = self.handle(*args, **options)
File "c:\Python27\lib\site-packages\django\core\management\base.py", line 351, in handle
return self.handle_noargs(**options)
File "c:\Python27\lib\site-packages\django\core\management\commands\syncdb.py", line 121, in handl
e_noargs
custom_sql = custom_sql_for_model(model, self.style, connection)
File "c:\Python27\lib\site-packages\django\core\management\sql.py", line 166, in custom_sql_for_mo
del
backend_name = connection.settings_dict['ENGINE'].split('.')[-1]
AttributeError: 'tuple' object has no attribute 'split'
Here is what my manage.py looks like:
#!/usr/bin/env python
import sys
import os.path
from django.core.management import execute_manager
try:
import team_tracker.settings # Assumed to be in the same directory.
except ImportError:
import sys
sys.stderr.write("Error: Can't find the file 'settings.py' in the directory containing %r. It appears you've customized things.\nYou'll have to run django-admin.py, passing it your settings module.\n(If the file settings.py does indeed exist, it's causing an ImportError somehow.)\n" % __file__)
sys.exit(1)
if __name__ == "__main__":
execute_manager(team_tracker.settings)
And my local_settings.py resides in root dir:
from team_tracker.settings import *
DEBUG = True
#DATABASE_ENGINE = 'sqlite3'
#DATABASE_NAME = 'caktus_website.db'
DATABASE_ENGINE = 'sqlite3', # Add 'postgresql_psycopg2', 'postgresql', 'mysql', 'sqlite3' or 'oracle'.
DATABASE_NAME = 'team_track.db' # Or path to database file if using sqlite3.
And finally my team_tracker/settings.py is here:
# Django settings for team_tracker project.
import os.path
DEBUG = True
TEMPLATE_DEBUG = DEBUG
ADMINS = (
# ('Your Name', 'your_email#example.com'),
)
SITE_ROOT = os.path.realpath(os.path.dirname(__file__))
MANAGERS = ADMINS
#DATABASES = {
# 'default': {
# 'ENGINE': 'django.db.backends.sqlite3', # Add 'postgresql_psycopg2', 'postgresql', 'mysql', 'sqlite3' or 'oracle'.
# 'NAME': 'team_track.db', # Or path to database file if using sqlite3.
# 'USER': '', # Not used with sqlite3.
# 'PASSWORD': '', # Not used with sqlite3.
# 'HOST': '', # Set to empty string for localhost. Not used with sqlite3.
# 'PORT': '', # Set to empty string for default. Not used with sqlite3.
# }
#}
# Local time zone for this installation. Choices can be found here:
# http://en.wikipedia.org/wiki/List_of_tz_zones_by_name
# although not all choices may be available on all operating systems.
# On Unix systems, a value of None will cause Django to use the same
# timezone as the operating system.
# If running in a Windows environment this must be set to the same as your
# system time zone.
TIME_ZONE = 'America/Chicago'
# Language code for this installation. All choices can be found here:
# http://www.i18nguy.com/unicode/language-identifiers.html
LANGUAGE_CODE = 'en-us'
SITE_ID = 1
# If you set this to False, Django will make some optimizations so as not
# to load the internationalization machinery.
USE_I18N = True
# If you set this to False, Django will not format dates, numbers and
# calendars according to the current locale
USE_L10N = True
# Absolute filesystem path to the directory that will hold user-uploaded files.
# Example: "/home/media/media.lawrence.com/media/"
MEDIA_ROOT = '/Users/kamilski81/Sites/team_tracker/media/'#os.path.join(SITE_ROOT, 'appmedia')
# URL that handles the media served from MEDIA_ROOT. Make sure to use a
# trailing slash.
# Examples: "http://media.lawrence.com/media/", "http://example.com/media/"
MEDIA_URL = '/media/'
# Absolute path to the directory static files should be collected to.
# Don't put anything in this directory yourself; store your static files
# in apps' "static/" subdirectories and in STATICFILES_DIRS.
# Example: "/home/media/media.lawrence.com/static/"
STATIC_ROOT = '/Users/kamilski81/Sites/team_tracker/static/'
# URL prefix for static files.
# Example: "http://media.lawrence.com/static/"
STATIC_URL = '/static/'
# URL prefix for admin static files -- CSS, JavaScript and images.
# Make sure to use a trailing slash.
# Examples: "http://foo.com/static/admin/", "/static/admin/".
ADMIN_MEDIA_PREFIX = '/static/admin/'
# Additional locations of static files
STATICFILES_DIRS = (
# Put strings here, like "/home/html/static" or "C:/www/django/static".
# Always use forward slashes, even on Windows.
# Don't forget to use absolute paths, not relative paths.
)
# List of finder classes that know how to find static files in
# various locations.
STATICFILES_FINDERS = (
'django.contrib.staticfiles.finders.FileSystemFinder',
'django.contrib.staticfiles.finders.AppDirectoriesFinder',
# 'django.contrib.staticfiles.finders.DefaultStorageFinder',
)
# Make this unique, and don't share it with anybody.
SECRET_KEY = 'v8#s)7gw-^#zp&6**g7rz$uj!#3v4a36so_uw!_#0pa$h4)b-s'
# List of callables that know how to import templates from various sources.
TEMPLATE_LOADERS = (
'django.template.loaders.filesystem.Loader',
'django.template.loaders.app_directories.Loader',
# 'django.template.loaders.eggs.Loader',
)
MIDDLEWARE_CLASSES = (
'django.middleware.common.CommonMiddleware',
'django.contrib.sessions.middleware.SessionMiddleware',
# #kamtodo: find out how to truly use this and the best way if we have many forms
# 'django.middleware.csrf.CsrfViewMiddleware',
'django.contrib.auth.middleware.AuthenticationMiddleware',
'django.contrib.messages.middleware.MessageMiddleware',
)
ROOT_URLCONF = 'urls'
TEMPLATE_DIRS = (
# Put strings here, like "/home/html/django_templates" or "C:/www/django/templates".
# Always use forward slashes, even on Windows.
# Don't forget to use absolute paths, not relative paths.
os.path.join(SITE_ROOT, 'templates').replace('\\','/'),
)
INSTALLED_APPS = (
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.sites',
'django.contrib.messages',
'django.contrib.staticfiles',
# Uncomment the next line to enable the admin:
'django.contrib.admin',
# Uncomment the next line to enable admin documentation:
'django.contrib.admindocs',
'app',
)
# A sample logging configuration. The only tangible logging
# performed by this configuration is to send an email to
# the site admins on every HTTP 500 error.
# See http://docs.djangoproject.com/en/dev/topics/logging for
# more details on how to customize your logging configuration.
LOGGING = {
'version': 1,
'disable_existing_loggers': False,
'handlers': {
'mail_admins': {
'level': 'ERROR',
'class': 'django.utils.log.AdminEmailHandler'
}
},
'loggers': {
'django.request': {
'handlers': ['mail_admins'],
'level': 'ERROR',
'propagate': True,
},
}
}
In local_settings.py:
DATABASE_ENGINE = 'sqlite3',
The comma here makes DATABASE_ENGINE a tuple with one element instead of a string. Remove it and it should work.

Resources