Problem with get_children() method when using TreeManager. Wrong result - django-mptt

I have encountered an strange problem with the use of TreeManager
Here is my code:
# other imports
from mptt.models import MPTTModel, TreeForeignKey
from mptt.managers import TreeManager
class SectionManager(TreeManager):
def get_queryset(self):
return super().get_queryset().filter(published=True)
class Section(MPTTModel):
published = models.BooleanField(
default=True,
help_text="If unpublished, this section will show only"
" to editors. Else, it will show for all."
)
objects = TreeManager()
published_objects = SectionManager()
When I test it. I get the following correct results:
# show all objects
Section.objects.count() # result is correct - 65
Section.objects.root_nodes().count() # result is correct - 12
# show published objects, just one is not published.
Section.published_objects.count() # result is correct - 64
Section.published_objects.root_nodes().count() # result is corrct - 12
But one child of the roots is unpublished and it does not show in the results. Here is the test:
for root in Section.objects.root_nodes():
print(f"root_section_{root.id} has {root.get_children().count()} children")
# results ...
root_section_57 has 13 children # correct - 13 items
# ... more results
for root in Section.published_objects.root_nodes():
print(f"root_section_{root.id} has {root.get_children().count()} children")
# results ...
root_section_57 has 13 children # WRONG - should be only 12 children
# ... more results
I may not understand something, or I may have hit a bug??
Any ideas?
NOTE: This issue has been posted on the django-mptt github issues page at: https://github.com/django-mptt/django-mptt/issues/689

https://github.com/django-mptt/django-mptt/blob/master/mptt/managers.py
You override wrong function call. root_nodes() call ._mptt_filter()
#delegate_manager
def root_nodes(self):
"""
Creates a ``QuerySet`` containing root nodes.
"""
return self._mptt_filter(parent=None)
And your _mptt_filter does not got any given qs.
#delegate_manager
def _mptt_filter(self, qs=None, **filters):
"""
Like ``self.filter()``, but translates name-agnostic filters for MPTT
fields.
"""
if qs is None:
qs = self
return qs.filter(**self._translate_lookups(**filters))
Now you need to customized based on your use case.
Hope it can be some help

Related

MyST-Parser: Auto linking / linkifying references to bug tracker issues

I use sphinx w/ MyST-Parser for markdown, and
I want GitHub or GitLab-style auto linking (linkfying) for references.
Is there a way to have MyST render the reference:
#346
In docutils-speak, this is a Text node (example)
And behave as if it was:
[#346](https://github.com/vcs-python/libvcs/pull/346)
So when rendered it'd be like:
#346
Not the custom role:
{issue}`1` <- Not this
Another example: Linkifying the reference #user to a GitHub, GitLab, StackOverflow user.
What I'm currently doing (and why it doesn't work)
Right now I'm using the canonical solution docutils offers: custom roles.
I use sphinx-issues (PyPI), and does just that. It uses a sphinx setting variable, issues_github_path to parse the URL:
e.g. in Sphinx configuration conf.py:
issues_github_path = 'vcs-python/libvcs'
reStructuredText:
:issue:`346`
MyST-Parser:
{issue}`346`
Why custom roles don't work
Sadly, those aren't bi-directional with GitHub/GitLab/tools. If you copy/paste MyST-Parser -> GitHub/GitLab or preview it directly, it looks very bad:
Example of CHANGES:
Example issue: https://github.com/vcs-python/libvcs/issues/363
What we want is to just be able to copy markdown including #347 to and from.
Does a solution already exist?
Are there any projects out there of docutils or sphinx plugins to turn #username or #issues into links?
sphinx (at least) can demonstrable do so for custom roles - as seen in sphinx-issues usage of issues_github_path - by using project configuration context.
MyST-Parser has a linkify extension which uses linkify-it-py
This can turn https://www.google.com into https://www.google.com and not need to use <https://www.google.com>.
Therefore, there may already be a tool out there.
Can it be done through the API?
The toolchain for myst, sphinx and docutils is robust. This is a special case.
This needs to be done at the Text node level. Custom role won't work - as stated above - since it'll create markdown that can't be copied between GitLab and GitHub issues trivially.
The stack:
MyST-Parser API (Markdown-it-py API) > Sphinx APIs (MySTParser + Sphinx) > Docutils API
At the time of writing, I'm using Sphinx 4.3.2, MyST-Parser 0.17.2, and docutils 0.17.1 on python 3.10.2.
Notes
For the sake of an example, I'm using an open source project of mine that is facing this issue.
This is only about autolinking issues or usernames - things that'd easily be mappable to URLs. autodoc code-linking is out of scope.
There is a (defunct) project that does this: sphinxcontrib-issuetracker.
I've rebooted it:
conf.py:
import sys
from pathlib import Path
cwd = Path(__file__).parent
project_root = cwd.parent
sys.path.insert(0, str(project_root))
sys.path.insert(0, str(cwd / "_ext"))
extensions = [
"link_issues",
]
# issuetracker
issuetracker = "github"
issuetracker_project = "cihai/unihan-etl" # e.g. for https://github.com/cihai/unihan-etl
_ext/link_issues.py:
"""Issue linking w/ plain-text autolinking, e.g. #42
Credit: https://github.com/ignatenkobrain/sphinxcontrib-issuetracker
License: BSD
Changes by Tony Narlock (2022-08-21):
- Type annotations
mypy --strict, requires types-requests, types-docutils
Python < 3.10 require typing-extensions
- TrackerConfig: Use dataclasses instead of typing.NamedTuple and hacking __new__
- app.warn (removed in 5.0) -> Use Sphinx Logging API
https://www.sphinx-doc.org/en/master/extdev/logging.html#logging-api
- Add PendingIssueXRef
Typing for tracker_config and precision
- Add IssueTrackerBuildEnvironment
Subclassed / typed BuildEnvironment with .tracker_config
- Just GitHub (for demonstration)
"""
import dataclasses
import re
import sys
import time
import typing as t
import requests
from docutils import nodes
from sphinx.addnodes import pending_xref
from sphinx.application import Sphinx
from sphinx.config import Config
from sphinx.environment import BuildEnvironment
from sphinx.transforms import SphinxTransform
from sphinx.util import logging
if t.TYPE_CHECKING:
if sys.version_info >= (3, 10):
from typing import TypeGuard
else:
from typing_extensions import TypeGuard
logger = logging.getLogger(__name__)
GITHUB_API_URL = "https://api.github.com/repos/{0.project}/issues/{1}"
class IssueTrackerBuildEnvironment(BuildEnvironment):
tracker_config: "TrackerConfig"
issuetracker_cache: "IssueTrackerCache"
github_rate_limit: t.Tuple[float, bool]
class Issue(t.NamedTuple):
id: str
title: str
url: str
closed: bool
IssueTrackerCache = t.Dict[str, Issue]
#dataclasses.dataclass
class TrackerConfig:
project: str
url: str
"""
Issue tracker configuration.
This class provides configuration for trackers, and is passed as
``tracker_config`` arguments to callbacks of
:event:`issuetracker-lookup-issue`.
"""
def __post_init__(self) -> None:
if self.url is not None:
self.url = self.url.rstrip("/")
#classmethod
def from_sphinx_config(cls, config: Config) -> "TrackerConfig":
"""
Get tracker configuration from ``config``.
"""
project = config.issuetracker_project or config.project
url = config.issuetracker_url
return cls(project=project, url=url)
class PendingIssueXRef(pending_xref):
tracker_config: TrackerConfig
class IssueReferences(SphinxTransform):
default_priority = 999
def apply(self) -> None:
config = self.document.settings.env.config
tracker_config = TrackerConfig.from_sphinx_config(config)
issue_pattern = config.issuetracker_issue_pattern
title_template = None
if isinstance(issue_pattern, str):
issue_pattern = re.compile(issue_pattern)
for node in self.document.traverse(nodes.Text):
parent = node.parent
if isinstance(parent, (nodes.literal, nodes.FixedTextElement)):
# ignore inline and block literal text
continue
if isinstance(parent, nodes.reference):
continue
text = str(node)
new_nodes = []
last_issue_ref_end = 0
for match in issue_pattern.finditer(text):
# catch invalid pattern with too many groups
if len(match.groups()) != 1:
raise ValueError(
"issuetracker_issue_pattern must have "
"exactly one group: {0!r}".format(match.groups())
)
# extract the text between the last issue reference and the
# current issue reference and put it into a new text node
head = text[last_issue_ref_end : match.start()]
if head:
new_nodes.append(nodes.Text(head))
# adjust the position of the last issue reference in the
# text
last_issue_ref_end = match.end()
# extract the issue text (including the leading dash)
issuetext = match.group(0)
# extract the issue number (excluding the leading dash)
issue_id = match.group(1)
# turn the issue reference into a reference node
refnode = PendingIssueXRef()
refnode["refdomain"] = None
refnode["reftarget"] = issue_id
refnode["reftype"] = "issue"
refnode["trackerconfig"] = tracker_config
reftitle = title_template or issuetext
refnode.append(
nodes.inline(issuetext, reftitle, classes=["xref", "issue"])
)
new_nodes.append(refnode)
if not new_nodes:
# no issue references were found, move on to the next node
continue
# extract the remaining text after the last issue reference, and
# put it into a text node
tail = text[last_issue_ref_end:]
if tail:
new_nodes.append(nodes.Text(tail))
# find and remove the original node, and insert all new nodes
# instead
parent.replace(node, new_nodes)
def is_issuetracker_env(
env: t.Any,
) -> "TypeGuard['IssueTrackerBuildEnvironment']":
return hasattr(env, "issuetracker_cache") and env.issuetracker_cache is not None
def lookup_issue(
app: Sphinx, tracker_config: TrackerConfig, issue_id: str
) -> t.Optional[Issue]:
"""
Lookup the given issue.
The issue is first looked up in an internal cache. If it is not found, the
event ``issuetracker-lookup-issue`` is emitted. The result of this
invocation is then cached and returned.
``app`` is the sphinx application object. ``tracker_config`` is the
:class:`TrackerConfig` object representing the issue tracker configuration.
``issue_id`` is a string containing the issue id.
Return a :class:`Issue` object for the issue with the given ``issue_id``,
or ``None`` if the issue wasn't found.
"""
env = app.env
if is_issuetracker_env(env):
cache: IssueTrackerCache = env.issuetracker_cache
if issue_id not in cache:
issue = app.emit_firstresult(
"issuetracker-lookup-issue", tracker_config, issue_id
)
cache[issue_id] = issue
return cache[issue_id]
return None
def lookup_issues(app: Sphinx, doctree: nodes.document) -> None:
"""
Lookup issues found in the given ``doctree``.
Each issue reference in the given ``doctree`` is looked up. Each lookup
result is cached by mapping the referenced issue id to the looked up
:class:`Issue` object (an existing issue) or ``None`` (a missing issue).
The cache is available at ``app.env.issuetracker_cache`` and is pickled
along with the environment.
"""
for node in doctree.traverse(PendingIssueXRef):
if node["reftype"] == "issue":
lookup_issue(app, node["trackerconfig"], node["reftarget"])
def make_issue_reference(issue: Issue, content_node: nodes.inline) -> nodes.reference:
"""
Create a reference node for the given issue.
``content_node`` is a docutils node which is supposed to be added as
content of the created reference. ``issue`` is the :class:`Issue` which
the reference shall point to.
Return a :class:`docutils.nodes.reference` for the issue.
"""
reference = nodes.reference()
reference["refuri"] = issue.url
if issue.title:
reference["reftitle"] = issue.title
if issue.closed:
content_node["classes"].append("closed")
reference.append(content_node)
return reference
def resolve_issue_reference(
app: Sphinx, env: BuildEnvironment, node: PendingIssueXRef, contnode: nodes.inline
) -> t.Optional[nodes.reference]:
"""
Resolve an issue reference and turn it into a real reference to the
corresponding issue.
``app`` and ``env`` are the Sphinx application and environment
respectively. ``node`` is a ``pending_xref`` node representing the missing
reference. It is expected to have the following attributes:
- ``reftype``: The reference type
- ``trackerconfig``: The :class:`TrackerConfig`` to use for this node
- ``reftarget``: The issue id
- ``classes``: The node classes
References with a ``reftype`` other than ``'issue'`` are skipped by
returning ``None``. Otherwise the new node is returned.
If the referenced issue was found, a real reference to this issue is
returned. The text of this reference is formatted with the :class:`Issue`
object available in the ``issue`` key. The reference title is set to the
issue title. If the issue is closed, the class ``closed`` is added to the
new content node.
Otherwise, if the issue was not found, the content node is returned.
"""
if node["reftype"] != "issue":
return None
issue = lookup_issue(app, node["trackerconfig"], node["reftarget"])
if issue is None:
return contnode
else:
classes = contnode["classes"]
conttext = str(contnode[0])
formatted_conttext = nodes.Text(conttext.format(issue=issue))
formatted_contnode = nodes.inline(conttext, formatted_conttext, classes=classes)
assert issue is not None
return make_issue_reference(issue, formatted_contnode)
return None
def init_cache(app: Sphinx) -> None:
if not hasattr(app.env, "issuetracker_cache"):
app.env.issuetracker_cache: "IssueTrackerCache" = {} # type: ignore
return None
def check_project_with_username(tracker_config: TrackerConfig) -> None:
if "/" not in tracker_config.project:
raise ValueError(
"username missing in project name: {0.project}".format(tracker_config)
)
HEADERS = {"User-Agent": "sphinxcontrib-issuetracker v{0}".format("1.0")}
def get(app: Sphinx, url: str) -> t.Optional[requests.Response]:
"""
Get a response from the given ``url``.
``url`` is a string containing the URL to request via GET. ``app`` is the
Sphinx application object.
Return the :class:`~requests.Response` object on status code 200, or
``None`` otherwise. If the status code is not 200 or 404, a warning is
emitted via ``app``.
"""
response = requests.get(url, headers=HEADERS)
if response.status_code == requests.codes.ok:
return response
elif response.status_code != requests.codes.not_found:
msg = "GET {0.url} failed with code {0.status_code}"
logger.warning(msg.format(response))
return None
def lookup_github_issue(
app: Sphinx, tracker_config: TrackerConfig, issue_id: str
) -> t.Optional[Issue]:
check_project_with_username(tracker_config)
env = app.env
if is_issuetracker_env(env):
# Get rate limit information from the environment
timestamp, limit_hit = getattr(env, "github_rate_limit", (0, False))
if limit_hit and time.time() - timestamp > 3600:
# Github limits applications hourly
limit_hit = False
if not limit_hit:
url = GITHUB_API_URL.format(tracker_config, issue_id)
response = get(app, url)
if response:
rate_remaining = response.headers.get("X-RateLimit-Remaining")
assert rate_remaining is not None
if rate_remaining.isdigit() and int(rate_remaining) == 0:
logger.warning("Github rate limit hit")
env.github_rate_limit = (time.time(), True)
issue = response.json()
closed = issue["state"] == "closed"
return Issue(
id=issue_id,
title=issue["title"],
closed=closed,
url=issue["html_url"],
)
else:
logger.warning(
"Github rate limit exceeded, not resolving issue {0}".format(issue_id)
)
return None
BUILTIN_ISSUE_TRACKERS: t.Dict[str, t.Any] = {
"github": lookup_github_issue,
}
def init_transformer(app: Sphinx) -> None:
if app.config.issuetracker_plaintext_issues:
app.add_transform(IssueReferences)
def connect_builtin_tracker(app: Sphinx) -> None:
if app.config.issuetracker:
tracker = BUILTIN_ISSUE_TRACKERS[app.config.issuetracker.lower()]
app.connect(str("issuetracker-lookup-issue"), tracker)
def setup(app: Sphinx) -> t.Dict[str, t.Any]:
app.add_config_value("mybase", "https://github.com/cihai/unihan-etl", "env")
app.add_event(str("issuetracker-lookup-issue"))
app.connect(str("builder-inited"), connect_builtin_tracker)
app.add_config_value("issuetracker", None, "env")
app.add_config_value("issuetracker_project", None, "env")
app.add_config_value("issuetracker_url", None, "env")
# configuration specific to plaintext issue references
app.add_config_value("issuetracker_plaintext_issues", True, "env")
app.add_config_value(
"issuetracker_issue_pattern",
re.compile(
r"#(\d+)",
),
"env",
)
app.add_config_value("issuetracker_title_template", None, "env")
app.connect(str("builder-inited"), init_cache)
app.connect(str("builder-inited"), init_transformer)
app.connect(str("doctree-read"), lookup_issues)
app.connect(str("missing-reference"), resolve_issue_reference)
return {
"version": "1.0",
"parallel_read_safe": True,
"parallel_write_safe": True,
}
Mirrors
https://gist.github.com/tony/05a3043d97d37c158763fb2f6a2d5392
https://github.com/ignatenkobrain/sphinxcontrib-issuetracker/issues/25
Mypy users
mypy --strict docs/_ext/link_issues.py work as of mypy 0.971
If you use mypy: pip install types-docutils types-requests
Install:
https://pypi.org/project/types-docutils/
https://pypi.org/project/types-requests/
https://pypi.org/project/typing-extensions/ (Python <3.10)
Example
via unihan-etl#261 / v0.17.2 (source, view, but page may be outdated)

Formatting error when I make sphinx-build docs for classes that inherits param.Parameterized

I am trying to create documentation for some software I am making and I have ran into the following problem. Any of my classes that inherits from param.Parameterized repo here end up having what looks like a "mini" documentation of param written in plaintext after the Docstring stuff I have written.
As seen in the screenshot above, with a small example this is what is produced:
Parameters
exampleType
Example parameter
Attributes
exampleType
Example attribute
[1;32mParameters of ‘Example’
=======================
[0m
[1;31mParameters changed from their default values are marked in red.[0m
[1;36mSoft bound values are marked in cyan.[0m
C/V= Constant/Variable, RO/RW = ReadOnly/ReadWrite, AN=Allow None
[1;34mName Value Type Bounds Mode [0m
test False Boolean (0, 1) V RW
[1;32mParameter docstrings:
=====================[0m
[1;34mtest: < No docstring available >[0m
My conf.py contains:
import os
import sys
sys.path.insert(0, os.path.abspath("."))
project = "example"
copyright = ""
author = ""
extensions = [
"sphinx.ext.napoleon",
"sphinx.ext.autodoc",
"sphinx.ext.todo",
"sphinx.ext.coverage",
"numpydoc",
]
numpydoc_show_class_members = False
numpydoc_show_inherited_class_members = False
autodoc_default_options = {
"show-inheritance": False,
}
templates_path = ["_templates"]
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
html_theme = "sphinx_rtd_theme"
html_static_path = ["_static"]
and my index.rst is:
Welcome to Documentation!
========================================
.. autoclass:: example.Example
:members: example_method
and my example.py contains:
import param
class Example(param.Parameterized):
"""Example Class.
Parameters
----------
example : Type
Example parameter
Attributes
----------
example : Type
Example attribute
"""
test = param.Boolean()
def __init__(self):
testing = True
def example_method(self, example):
"""Example Method.
Parameters
----------
example : Type
Example parameter
Returns
----------
None
"""
testing = True
I am not sure whether the problem lies with the param class or sphinx, so if anyone knows how to turn this extra part of the documentation off I'd really appreciate it.
The built documentation:

how to "include" another file as part of a Jenkins Pipeline definition

We have a large project that has multiple separate declarative pipeline file definitions. This is used to build different apps and installers from the single code base.
Right now, all of these files contain a large block of "code" used to generate the email body and JIRA update messages. examples:
// Get a JIRA's to add Comments to
// Return map of JIRA id to comment text from all commits for that JIRA
#NonCPS
def getJiraMap() {
a bunch of stuff
return jiraset
}
// Get the body text for the emails
def getMailBody1() {
return "See: ${BUILD_URL}\n\nChanges:\n" + getChangeString() + "\n" + testStatuses()
}
etc...
What I would like to do is have all these common methods in a separate file that all the other pipeline files can include. This seems like it SHOULD be easy, but all examples I've found appear to be rather complex involving a separate SCM - which is NOT what I want.
Updates:
Going through the various suggestions given in that link, I make the following file - BuildTools.groovy: Note that this file is in the same directory as the jenkins pipeline file that uses it.
import hudson.tasks.test.AbstractTestResultAction
import hudson.model.Actionable
Class BuildTools {
// Get a JIRA's to add Comments to
// Return map of JIRA id to comment text from all commits for that JIRA
#NonCPS
def getJiraMap() {
def jiraset = [:]
.. whole bunch of stuff ..
Here are the various things I've tried, and the results.
File sourceFile = new File("./AutomatedBuild/BuildTools.groovy");
Class gcl = new GroovyClassLoader(getClass().getClassLoader()).parseClass(sourceFile);
GroovyObject bt = (GroovyObject) gcl.newInstance();
Fails with:
org.jenkinsci.plugins.scriptsecurity.sandbox.RejectedAccessException: Scripts not permitted to use method java.lang.Class getClassLoader
evaluate(new File("./AutomatedBuild/BuildTools.groovy"))
def bt = new BuildTools()
Fails with:
15:29:07 WorkflowScript: 8: unable to resolve class BuildTools
15:29:07 # line 8, column 10.
15:29:07 def bt = new BuildTools()
15:29:07 ^
import BuildTools
def bt = new BuildTools()
Fails with:
15:35:58 WorkflowScript: 16: unable to resolve class BuildTools (note that BuildTools.groovy is in the same folder as this script)
15:35:58 # line 16, column 1.
15:35:58 import BuildTools
15:35:58 ^
GroovyShell shell = new GroovyShell()
def bt = shell.parse(new File("./AutomatedBuild/BuildTools.groovy"))
Fails with:
org.jenkinsci.plugins.scriptsecurity.sandbox.RejectedAccessException: Scripts not permitted to use new groovy.lang.GroovyShell

How to include the source line number everywhere in html output in Sphinx?

Let's say I'm writing a custom editor for my RestructuredText/Sphinx stuff, with "live" html output preview. Output is built using Sphinx.
The source files are pure RestructuredText. No code there.
One desirable feature would be that right-clicking on some part of the preview opens the editor at the correct line of the source file.
To achieve that, one way would be to put that line number in every tag of the html file, for example using classes (e.g., class = "... lineno-124"). Or use html comments.
Note that I don't want to add more content to my source files, just that the line number be included everywhere in the output.
An approximate line number would be enough.
Someone knows how to do this in Sphinx, my way or another?
I decided to add <a> tags with a specific class "lineno lineno-nnn" where nnn is the line number in the RestructuredText source.
The directive .. linenocomment:: nnn is inserted before each new block of unindented text in the source, before the actual parsing (using a 'source-read' event hook).
linenocomment is a custom directive that pushes the <a> tag at build time.
Half a solution is still a solution...
import docutils.nodes as dn
from docutils.parsers.rst import Directive
class linenocomment(dn.General,dn.Element):
pass
def visit_linenocomment_html(self,node):
self.body.append(self.starttag(node,'a',CLASS="lineno lineno-{}".format(node['lineno'])))
def depart_linenocomment_html(self,node):
self.body.append('</a>')
class LineNoComment(Directive):
required_arguments = 1
optional_arguments = 0
has_content = False
add_index = False
def run(self):
node = linenocomment()
node['lineno'] = self.arguments[0]
return [node]
def insert_line_comments(app, docname, source):
print(source)
new_source = []
last_line_empty = True
lineno = 0
for line in source[0].split('\n'):
if line.strip() == '':
last_line_empty = True
new_source.append(line)
elif line[0].isspace():
new_source.append(line)
last_line_empty = False
elif not last_line_empty:
new_source.append(line)
else:
last_line_empty = False
new_source.append('.. linenocomment:: {}'.format(lineno))
new_source.append('')
new_source.append(line)
lineno += 1
source[0] = '\n'.join(new_source)
print(source)
def setup(app):
app.add_node(linenocomment,html=(visit_linenocomment_html,depart_linenocomment_html))
app.add_directive('linenocomment', LineNoComment)
app.connect('source-read',insert_line_comments)
return {
'version': 0.1
}

pyyaml parse data with tag

I have yaml data like the input below and i need output as key value pairs
Input
a="""
--- !ruby/hash:ActiveSupport::HashWithIndifferentAccess
code:
- '716'
- '718'
id:
- 488
- 499
"""
ouput needed
{'code': ['716', '718'], 'id': [488, 499]}
The default constructor was giving me an error. I tried adding new constructor and now its not giving me error but i am not able to get key value pairs.
FYI, If i remove the !ruby/hash:ActiveSupport::HashWithIndifferentAccess line from my yaml then it gives me desired output.
def new_constructor(loader, tag_suffix, node):
if type(node.value)=='list':
val=''.join(node.value)
else:
val=node.value
val=node.value
ret_val="""
{0}
""".format(val)
return ret_val
yaml.add_multi_constructor('', new_constructor)
yaml.load(a)
output
"\n [(ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'code'), SequenceNode(tag=u'tag:yaml.org,2002:seq', value=[ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'716'), ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'718')])), (ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'id'), SequenceNode(tag=u'tag:yaml.org,2002:seq', value=[ScalarNode(tag=u'tag:yaml.org,2002:int', value=u'488'), ScalarNode(tag=u'tag:yaml.org,2002:int', value=u'499')]))]\n "
Please suggest.
This is not a solution using PyYAML, but I recommend using ruamel.yaml instead. If for no other reason, it's more actively maintained than PyYAML. A quote from the overview
Many of the bugs filed against PyYAML, but that were never acted upon, have been fixed in ruamel.yaml
To load that string, you can do
import ruamel.yaml
parser = ruamel.yaml.YAML()
obj = parser.load(a) # as defined above.
I strongly recommend following #Andrew F answer, but in case you
wonder why your code did not get the proper result, that is because
you don't correctly process the node under the tag in your tag
handling.
Although the node's value is a list (of tuples with key value pairs),
you should test for the type of the node itself (using isinstance)
and then hand it over to the "normal" mapping processing routine as
the tag is on a mapping:
import yaml
from yaml.loader import SafeLoader
a = """\
--- !ruby/hash:ActiveSupport::HashWithIndifferentAccess
code:
- '716'
- '718'
id:
- 488
- 499
"""
def new_constructor(loader, tag_suffix, node):
if isinstance(node, yaml.nodes.MappingNode):
return loader.construct_mapping(node, deep=True)
raise NotImplementedError
yaml.add_multi_constructor('', new_constructor, Loader=SafeLoader)
data = yaml.load(a, Loader=SafeLoader)
print(data)
which gives:
{'code': ['716', '718'], 'id': [488, 499]}
You should not use PyYAML's yaml.load(), it is documented to be potentially unsafe
and above all it is not necessary. Just add the new constructor to the SafeLoader.

Resources