How to accumulate errors with ValidatedNec?

I am trying to do minimal error accumulation with Cats' ValidatedNec, but it fails to compile. Here is the code I tried:
import cats.data._
import cats.implicits._
// doesn't work with or without this line: import cats.syntax.applicative._

val one: ValidatedNec[String, Int] = Validated.valid(42)
val two: ValidatedNec[String, Boolean] = Validated.valid(true)

(one, two).mapN { (one, two) =>
  println(one)
}
The error is: value mapN is not a member of (cats.data.ValidatedNec[String,…
Am I missing something?

Yes. When you import cats.implicits._, you are already importing all of the syntax extensions, so there is no need to import cats.syntax.applicative._ as well. To make things worse, when you import the same implicits twice in Scala, they clash and leave you with neither, since the compiler can't choose which of the two to use.
If you remove that syntax import, it should work with no problems.
See the import guide for more on this: https://typelevel.org/cats/typeclasses/imports.html

How to use kde_kws parameters for seaborn.histplot()?

I am trying to use sns.histplot() instead of sns.distplot() since I got the following message in colab:
FutureWarning: distplot is a deprecated function and will be removed
in a future version. Please adapt your code to use either displot (a
figure-level function with similar flexibility) or histplot (an axes-level function for histograms).
Code:
import pandas as pd
import seaborn as sns
df = sns.load_dataset('tips')
sns.histplot(df['tip'], kde=True, kde_kws={'fill' : True});
I got an error when passing kde_kws parameters inside sns.histplot():
TypeError: __init__() got an unexpected keyword argument 'fill'
From the documentation, kde_kws= is intended to pass arguments "that control the KDE computation, as in kdeplot()." It is not entirely explicit which arguments those are, but they seem to be the ones like bw_method= and bw_adjust= that change the way the KDE is computed, rather than displayed. If you want to change the appearance of the KDE, you can use line_kws=, but, as the name implies, the KDE is represented only by a line and therefore cannot be filled.
If you want both a histogram and a filled KDE, you need to combine histplot() and kdeplot() on the same axes:
sns.histplot(df['tip'], stat='density')
sns.kdeplot(df['tip'], fill=True)

Sphinx nit-picky mode but only for links I explicitly wrote

I tried turning on Sphinx's nit-picky mode (-n) to catch any broken links I might have accidentally made. However, it spews out errors for all the places where I've documented types. In some cases I've described types semantically (e.g. "3D array"), but it does it even for types extracted from type hints (even with intersphinx set up to pull Python types). For example, for this module
from typing import Callable

def foo(x: Callable[..., int]):
    pass
I get the error docstring of myproj.foo:: WARNING: py:class reference target not found: Callable[..., int]. That's with only sphinx.ext.autodoc and sphinx.ext.intersphinx extensions and a freshly-generated conf.py.
Is there some way to prevent Sphinx from trying to generate links for type information, or at least stop it complaining when they don't exist while still telling me about bad links in my hand-written documentation?
I'm using Sphinx 3.0.3.
Perhaps nitpick_ignore will do what you want? In your conf.py, something like this:
nitpick_ignore = [
    ("py:class", "Callable"),
]
I'm not sure of the exact values in the tuple that should be used, but I got the idea from this issue and a linked commit.
I had success solving a similar problem by writing a custom sphinx transform. I only wanted warnings for cross-references to my own package's python documentation. The following can be saved as a python file and added to extensions in conf.py once it is on the python path.
from contextlib import suppress

from docutils import nodes
from sphinx import addnodes
from sphinx.errors import NoUri
from sphinx.transforms.post_transforms import SphinxPostTransform
from sphinx.util import logging

logger = logging.getLogger(__name__)


class MyLinkWarner(SphinxPostTransform):
    """
    Warns about broken cross-reference links, but only for my_package_name.

    This is very similar to the sphinx option ``nitpicky=True`` (see
    :py:class:`sphinx.transforms.post_transforms.ReferencesResolver`), but there
    is no way to restrict that option to a specific package.
    """

    # this transform needs to happen before ReferencesResolver
    default_priority = 5

    def run(self):
        for node in self.document.traverse(addnodes.pending_xref):
            target = node["reftarget"]

            if target.startswith("my_package_name."):
                found_ref = False

                with suppress(NoUri, KeyError):
                    # let the domain try to resolve the reference
                    found_ref = self.env.domains[node["refdomain"]].resolve_xref(
                        self.env,
                        node.get("refdoc", self.env.docname),
                        self.app.builder,
                        node["reftype"],
                        target,
                        node,
                        nodes.TextElement("", ""),
                    )

                # warn if resolve_xref did not return or raised
                if not found_ref:
                    logger.warning(
                        f"API link {target} is broken.", location=node, type="ref"
                    )


def setup(app):
    app.add_post_transform(MyLinkWarner)
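For example, if the transform is saved as link_warner.py next to conf.py (a hypothetical filename), conf.py could wire it up like this:

```python
# conf.py
import os
import sys

# make the directory containing the transform module importable
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))

extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.intersphinx",
    "link_warner",  # hypothetical module name defining MyLinkWarner
]
```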

Is it possible to replace one directive with another one

I would like to create a substitution (or similar) that transforms one directive into another.
For example:
In our Sphinx-based documentation, we use admonitions to create certain note and warning boxes.
However, if we use
.. note:: This is a Note
The title of the box is Note, and This is a Note becomes the first paragraph.
In contrast, this directive
.. admonition:: This is a Note
   :class: note
produces a note box with the desired title.
To make it easier for other editors, I would like to create a substitution that replaces the first form with the second.
Is there any way this can be done in Sphinx?
Yes, it can be done. You have to add a custom directive to Sphinx. Create a Python module (like mydirectives.py next to conf.py) with the following:
import docutils.nodes
import docutils.parsers.rst


class AbstractDirective(docutils.parsers.rst.Directive):
    has_content = True
    required_arguments = 0
    optional_arguments = 0
    option_spec = {}
    final_argument_whitespace = False
    node_class = docutils.nodes.container

    def run(self):
        self.assert_has_content()
        text = '\n'.join(self.content)
        admonition_node = self.node_class(rawsource=text)
        self.state.nested_parse(self.content, self.content_offset,
                                admonition_node)
        admonition_node.set_class("abstract")
        return [admonition_node]


def setup(app):
    app.add_directive('abstract', AbstractDirective)
def setup(app):
app.add_directive('abstract', AbstractDirective)
There must be some way to add the title as well; perhaps you need to add a title node yourself. The documentation is lacking there, so best look at the source for admonitions and you will get a feel for the docutils.
With a custom text node you should be able to make up your own note directive.

Cross validation of dataset separated on files

The dataset that I have is split across different files, grouped into samples that know each other, i.e., they were created under similar conditions at a similar time.
The balance of the train-test split is important, so all the samples in a file have to go entirely into train or entirely into test; they cannot be separated. That makes plain KFold awkward to use in my scikit-learn code.
Right now, I am using something similar to LOO (leave-one-out), doing something like:
train ~> cat ./dataset/!(1.txt)
test ~> cat ./dataset/1.txt
which is not comfortable and not very useful if I want test folds made up of several files, i.e. a "real" CV.
How would it be possible to do a proper CV to check for real overfitting?
Looking at this answer, I realized that pandas can concatenate dataframes. I checked that the process is 15-20% slower than the cat command line, but it makes it possible to build folds the way I was expecting.
Anyway, I am quite sure there should be some better way than this one:
import glob

import pandas as pd
from sklearn.model_selection import KFold  # sklearn.cross_validation was removed in modern scikit-learn

allFiles = glob.glob("./dataset/*.txt")

kf = KFold(n_splits=3, shuffle=True)
for train_files, cv_files in kf.split(allFiles):
    # each file lands entirely on either the train or the test side
    dataTrain = pd.concat(pd.read_csv(allFiles[idTrain], header=None)
                          for idTrain in train_files)
    dataTest = pd.concat(pd.read_csv(allFiles[idTest], header=None)
                         for idTest in cv_files)
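The fold logic itself only needs the standard library, since folding happens at file granularity. A minimal sketch, assuming files is the list of paths returned by glob (the paths below are made up for illustration):

```python
import random


def file_kfold(files, n_folds=3, seed=0):
    """Yield (train_files, test_files) pairs; each file stays whole on one side."""
    shuffled = list(files)
    random.Random(seed).shuffle(shuffled)
    # deal the shuffled files round-robin into n_folds buckets
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [f for i, fold in enumerate(folds) if i != k for f in fold]
        yield train, test


files = [f"./dataset/{i}.txt" for i in range(1, 10)]
splits = list(file_kfold(files))
```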

What is the correct syntax for using the max function

Still using bloody OpenOffice Writer to customize my sale_order.rml report.
In my sale order I have 6 order lines with 6 different lead time to delivery. I need to show the maximum out of the six values.
After many attempts I have abandoned using the reduce function, as it works erratically or not at all most of the time. I have never seen anything like it.
So I thought I'd give a try using max encapsulating a loop such as:
[[ max(repeatIn(so.order_line.delay,'d')) ]]
My maximum lead time being 20, I would expect to see 20 (yes well that would be too easy, wouldn't it!).
It returns
{'d': 20.0}
At least it contains the value I am after.
But; if I try and manipulate this result, it disappears altogether.
I have tried:
int(re.findall(r'[0-9]+', max(repeatIn(so.order_line.delay,'d')))[0])
which works great from the python window, but returns absolutely nothing in OpenERP.
I import re in my sale_order.py file, which I have recompiled into sale_order.pyo:
import time
import re
from datetime import datetime, timedelta

from report import report_sxw


class order(report_sxw.rml_parse):
    def __init__(self, cr, uid, name, context=None):
        super(order, self).__init__(cr, uid, name, context=context)
        self.localcontext.update({
            'time': time,
            'datetime': datetime,
            'timedelta': timedelta,
            're': re,
        })
I have of course restarted the server many times. My test install sits on windows.
So can anyone tell me what I am doing wrong, because I can make it work from Python but not from OpenOffice Writer!
Thanks for your help!
EDIT 1:
The format
{'d': 20.0}
is, according to Python, a dictionary. Still in Python, to extract the value from a dictionary you can do it like so:
>>> dict={'d': 20.0}
>>> print(dict['d'])
20.0
But how can I transpose this to OpenOffice Writer?
I have managed to get the result I wanted by importing functools and declaring the reduce function within the parameters of the sale_order.py file.
I then simply used a combination of the reduce and max functions, and it works exactly as expected.
The correct syntax is as follows:
repeatIn(objects,'o')
reduce(lambda x, y: max(x, y.delay), o.order_line, 0)
Nothing else is required.
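The same reduce-plus-max pattern can be checked in plain Python; Line here is just a hypothetical stand-in for an order line with a delay field:

```python
from functools import reduce


class Line:
    def __init__(self, delay):
        self.delay = delay


order_lines = [Line(5), Line(20), Line(12)]

# the accumulator x is a plain number, y is the current order line
max_delay = reduce(lambda x, y: max(x, y.delay), order_lines, 0)

# a generator expression does the same job more directly
max_delay_alt = max(line.delay for line in order_lines)
```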
Enjoy!
