Validating uniqueness of struct field values with cue

I'm evaluating using a cue schema to replace bespoke validation code for an existing YAML format that looks something like this:
items:
  - name: a
    description: something
  - name: b
    description: other thing
Following the tutorials has made it pretty easy to get the basics working: enforcing the required and optional fields on each item, and their types and value constraints.
However, a feature of the existing validation code that I'd like to replicate is the ability to enforce that no two entries in the items list share the same value for name. It's not obvious to me from the documentation whether or how this might be possible with cue. Is it?
(I know that I could and maybe should just use a map here instead of a list, and promote the name field into a key in the map, but I'd like to avoid changing the YAML format for the benefit of the validation code / tool if possible.)
Here's a specific example of the kind of thing that I'd want to fail validation (because the name a is reused):
items:
  - name: a
    description: something
  - name: a
    description: other thing

Disclaimer: I'm VERY new to CUE and there's probably a better solution to this.
The solution I could think of is a guard-like hidden field, _uniqueName: true, leveraging the fact that CUE fails validation when the same field is given conflicting values. You can then also set _uniqueName to the result of list.UniqueItems, after massaging the items list into a list of names first.
See https://cuelang.org/play/?id=WZyVeYCEsEM#cue#export#yaml
Granted, the error message is not very descriptive and doesn't show which value is duplicated, but it does what you asked for.
To improve it a bit, you could rename the field to something like _namesInItemsMustBeUnique, so that the name reported in the error explains the problem.
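For comparison, here is a minimal Python sketch of the same check written as bespoke validation code, which also reports which names are duplicated. It assumes the YAML has already been loaded into a plain dict; the function name find_duplicate_names is hypothetical and not part of any library.

```python
from collections import Counter


def find_duplicate_names(document):
    """Return the sorted `name` values that occur more than once in `items`."""
    # `document` is assumed to be the already-parsed YAML: a dict with an `items` list.
    counts = Counter(item["name"] for item in document.get("items", []))
    return sorted(name for name, count in counts.items() if count > 1)


# The failing example from the question: the name "a" is reused.
doc = {
    "items": [
        {"name": "a", "description": "something"},
        {"name": "a", "description": "other thing"},
    ]
}

duplicates = find_duplicate_names(doc)
if duplicates:
    print("duplicate names:", ", ".join(duplicates))
```

Unlike the guard-field approach, this names the offending values, but it lives outside the schema, which is exactly the kind of code the question hopes to retire.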

Related

YAML : Use mapped list vs array

I am creating a configuration file for my application. To do it, I decided to use YAML for its simplicity and reliability.
I am currently designing a special part of my application: In this part, I have to list and configure all datasets I want to use in a module. To do that I wrote this :
// Other stuff
datasets:
  rate_variation:
    name: Rate variation over time # Optional
    description: Description here # Optional
    type: POINTS_2D
    options:
      REFRESH_TIME: 5 # Time of refresh in second
  frequency_variation:
    name: Frequency variation over time
    description: Description here # Optional
    type: POINTS_2D
But, after some reflection, I have some doubts about it. Because maybe something like this is better :
datasets:
  - id: rate_variation
    name: Rate variation over time # Optional
    description: Description here # Optional
    type: POINTS_2D
    options:
      REFRESH_TIME: 5 # Time of refresh in second
  - id: frequency_variation
    name: Frequency variation over time
    description: Description here # Optional
    type: POINTS_2D
I use the ID to identify each dataset in my scripts (two datasets must have a different id) and generate output files for each of them.
But now, I really don't know what is the best solution...
What would you recommend to use? And for what reason?
Quick Answer (TL;DR)
YAML can be normalized quite cleanly and in a straightforward manner using YAML ddconfig format
Using this approach can simplify construction and maintenance of configuration files, and make them highly flexible for later use by many types of consuming applications.
Detailed Answer
Context
Data normalization (aka YAML schema definition) with YAML ddconfig format
(tag:dreftymac#dreftymac.org,2017:ddconfig)
dmid://uu773yamldata1620421509
Problem
Scenario: Developer graille_stentiplub is creating a configuration file format for use with YAML.
the data structure (i.e., schema) for the YAML must be flexible for use in many contexts.
the schema should be amenable to arbitrary and flexible queries where the structure of the YAML does not "get in the way".
the schema should be easy to read and understand by humans.
the schema should be easily manipulated by any programming environment capable of processing standard YAML.
Special considerations: graille_stentiplub wants an easy way to determine when to use lists vs mappings.
Example
the following is a simple config file using YAML ddconfig format
dataroot:
  file_metadata_str: |
    ### <beg-block>
    ### - caption: "my first project"
    ### notes: |
    ### * href="//home/sm/docs/workup/my_first_project.txt"
    ### <end-block>
  project_info:
    prj_name_nice: StackOverflow Demo Answer Project
    prj_name_mach: stackoverflow_demo_001a
    prj_sponsor_url: https://stackoverflow.com/questions/54349286
    prj_dept_url: https://demo-university.edu/dept/basketweaving
  dataset_recipient_list:
    - graille_stentiplub#example.org
    - dreftymac_lufcrom#demo-university.edu
    - nobody_knows_who_you_are#example.com
  dataset_variations_table:
    - dvar_id: rate_variation
      dvar_name: Rate variation over time # Optional
      dvar_description: Description here # Optional
      dvar_type: POINTS_2D
      dvar_opt_refresh_per_second: 5 # Time in seconds
    - dvar_id: frequency_variation
      dvar_name: Frequency variation over time
      dvar_description: Description here # Optional
      dvar_type: POINTS_2D
Explanation
The entire data structure is nested under a toplevel key called dataroot (this is optional).
Inclusion of the dataroot key makes the YAML structure more addressable but is not necessary.
Using a filesystem analogy, you can think of dataroot as a root-level directory.
Using an XML analogy, you can think of this as the root-level XML tag.
The entire data structure consists of a YAML mapping (aka dictionary) (aka associative-array).
every mapping key is a first-level child of dataroot (or else a toplevel key if dataroot is omitted).
There are different types of mapping keys:
String: (suffix _str) indicates that the mapped value is a string (aka scalar) value.
List: (suffix _list) indicates the mapped value is a list (aka sequence).
Info: (suffix _info) indicates the mapped value is a mapping (aka dictionary) (aka associative-array).
Table: (suffix _table) indicates the mapped value is a sequence-of-mappings (aka table).
Tree: (suffix _tree) indicates a composite structure with support for one or more nested parent-child relationships.
Rationale
The YAML ddconfig format coincides nicely with many different contexts and tools.
This allows for simplified decision making when laying out the configuration file format, as well as simplified programming when parsing the file.
Simplicity
a _list mapping consists of a sequence of scalar-value items with no nesting.
a _info mapping consists of a scalar-key and a scalar-value (name-value pairs) with no nesting.
a _table mapping is simply a sequence of _info mappings.
nesting of arbitrary depth can be accomplished through YAML anchors and aliases, thus supporting the _tree composite data structure.
Similarity to relational databases
You can think of a ddconfig _info mapping as a single record from a standard table in a relational database.
You can think of a ddconfig _table mapping as a standard table in a relational database.
This similarity makes it extremely straightforward to transmit YAML to a database if and where necessary.
Anchors and aliases
The YAML ddconfig format works well with YAML anchors and aliases.
One or more _info mappings can be easily converted to a _table mapping by way of aliases.
Multiple _info mappings can be combined together into another _info mapping by way of YAML merge keys.
See also
github link https://github.com/dreftymac/trypublic/search?q=uu773yamldata1620421509
With the first option, YAML enforces that there are no duplicate IDs. Therefore, an editor supporting YAML may support your user by showing an error in this case. With the second option, you need to check uniqueness in your code and the user only sees the error when loading the syntactically correct YAML into your application.
However, there are other factors to consider. For example, you may have a preference for the resulting in-memory data structures. If you use standard YAML implementations that deserialize to native data structures (PyYAML, SnakeYAML etc), the YAML structure imposes the type of the in-memory data structure (you can customize by writing custom constructors, but that's not trivial). For example, if you want to ask a dataset object for its ID, that is only directly doable with the second structure – if you use the first structure, you would need to search the parent table for the dataset value you have to get its ID.
So, final answer is (as always): It depends. Think about what you want to do with it. For simple configuration files, my second argument may be weaker than my first one, but I don't know what exactly you want to do with the data.
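To illustrate the trade-off, here is a minimal Python sketch of what the second (list) structure asks of your code: it turns the loaded list into a dict keyed by id and performs the uniqueness check that the first (mapping) structure gives you for free. The helper name index_by_id is hypothetical, and the data is assumed to be already deserialized into native lists and dicts.

```python
def index_by_id(datasets):
    """Turn a list of dataset dicts into a dict keyed by `id`, rejecting duplicates."""
    indexed = {}
    for dataset in datasets:
        dataset_id = dataset["id"]
        if dataset_id in indexed:
            # With the mapping layout, a duplicate key would already be a YAML-level error.
            raise ValueError(f"duplicate dataset id: {dataset_id}")
        indexed[dataset_id] = dataset
    return indexed


# The second layout from the question, as it would look after loading.
datasets = [
    {"id": "rate_variation", "type": "POINTS_2D", "options": {"REFRESH_TIME": 5}},
    {"id": "frequency_variation", "type": "POINTS_2D"},
]

by_id = index_by_id(datasets)
print(sorted(by_id))
```

Each dataset still carries its own id here, so you get both lookup by id and the ability to ask a dataset object for its id.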

Improved way of scaling in saltstack

I have a problem with Jinja2 templating in Salt: I want to break a one-line string over multiple lines when writing a state (or anything else in Salt). My exact case is a list of machines that I would like to write one after the other, as a list, instead of on one really long line.
What I am trying to say is that I want to achieve this:
nodegroups:
  - group: 'L@adsdasdadas' +
           'dasdasdasdas'
           .............->imagine 10.000 names coming here
           'adsasdasddsa'
Compared to the approach that I have to do now:
nodegroups:
  - group: 'L@adsdasdadas,dasdsadasdsa,dasdsadasdsa,......,asdqwe'
Is there a better way to do it? Is there a better way to handle thousands of machines?
You could say grains, and I thought about it, but I was wondering if there's a better and more elegant way of doing it.
Any thoughts or opinions would help me a lot.
[Edit1]:
I wrote a script that takes a list of hostnames and adds them to the master config file in the nodegroups section. For now it might work.
Choice of data source
I would recommend targeting with pillars, because they are managed centrally from the Master (convenient), rather than static custom grains, which are configured separately on each Minion (inconvenient) - see comparison summary here.
Limitations of configuration files
The nodegroups are specified in Salt configuration file /etc/salt/master which is not a Jinja template (it has pure YAML format). So, you don't have an option to use Jinja to join external input with list of strings.
Possible solution
Why is joining even mentioned? You can turn the problem of "breaking a one line string over multiple lines" into the solution of using lists right away - no need to break anything (and if you need a "one line string" somewhere, joining list items is easy).
In other words, you could define nodegroups via pillar (avoiding long strings as in your example). Pillars, in turn, are rendered by Jinja. Therefore, using the same list of Minions defined somewhere, you could generate a derived product in pillars through Jinja (be it a joined string of them or the list as is). There is a trick which allows reusing the same external data in multiple pillar files.
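The "joining list items is easy" part can be sketched in a few lines of Python, for example in a script that generates the long string from a maintained list of hostnames. This is only an illustration, assuming the L@ list-matcher prefix used in Salt compound targets; the function name list_matcher and the minion IDs are made up.

```python
def list_matcher(hostnames):
    """Join minion IDs into one list-matcher string of the form 'L@a,b,c'."""
    return "L@" + ",".join(hostnames)


# Hypothetical minion IDs, kept one per line so the list stays readable and diffable.
minions = [
    "web01",
    "web02",
    "db01",
]

print(list_matcher(minions))
```

The list is the source of truth; the one-line string is just a derived product, which is the same idea as generating it through Jinja in a pillar.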
First of all, I would like to thank uvsmtid for the wonderful idea. Sorry for the confusion I created, too.
So, what I did was create a pillar containing the name of each minion (which happens to be its id), and then in a state I compared each value from that list to the actual id of the minion:
{% for item in salt['pillar.get']('info') %}
{% if grains['id'] == item %}
something:
  cmd.run:
    - name: touch something
{% endif %}
{% endfor %}
I hope this solution will help someone the same way it helped me

Is there any way to clear out the "Testers" field in Microsoft Test Manager (or in TFS)

Seems like once you set the Testers field on a Test Case in MTM, it will not allow you to clear it. You are only allowed to change it to a different value. Has anyone found a way to clear this field?
It is important not to confuse the "Assigned" Tester field with the Assigned To field. They are distinct fields on a Test Case work item. One reason why someone might want the Tester field to be blank is if a team all pitches in to help with testing during an iteration: leaving the assigned tester blank lets the team know that no one has "picked up" this test to execute. A team member could then assign the test to himself or herself and execute it.
The template pulls the Assigned Tester Values from a list using the 'AllowedValues' tag. Instead, simply change this to 'SuggestedValues'. That will allow for empty values - easy as that.
If you'd rather not allow free entry, you can also add a default value such as 'None' and use that rather than blank.
Unless you've customized your work item type, this field should never be blank after being saved. It defaults to the person who created the bug in all Microsoft supported process templates, and a value is required by default.
That being said, why would you want to change it to blank?
If you really, really want to be able to blank it out (which I don't think is a good idea at all), you'll need to customize your template. See the below guidance:
http://msdn.microsoft.com/en-us/library/ms243849(v=vs.110).aspx

What is a proper way for naming of message properties in i18n?

We have a website which should be translated into different languages. Some of the wording is already in message properties files, ready for translation. I now want to add the rest of the text to these files.
What is a good way to name the text blocks?
<view>.<type>.<name>
We mostly have webpages and some of the elements/modules are repeating on some sites.
As far as I know, no "standard" exists. Therefore it is pretty hard to tell what is proper and what is improper way of naming resource keys. However, based on my experience, I could recommend this way:
property file name: <module>.properties
resource keys: <view or dialog>[.<sub-context>].<control-type>.<name>
We may discuss whether it is proper to put all the strings from one module into one properties file - it could be right if updates don't happen often and there are not too many messages. Otherwise you might think about one file per view.
As for key naming strategy: it is important for the Translator (sounds like a film with the honorable governor Arnold S., doesn't it?) to have a Context. Translation may actually depend on it, e.g. in Polish you would translate a message one way if it is a page/dialog/whatever title and in a totally different way if it is text on a button.
One example of such resource key could be:
preferences.password_area.label.username=User name
It gives enough hints to the Translator about what it actually is, which could result in correct translation...
We have come up with the following key naming convention (Java, btw) using dot notation and camel case:
Label Keys (form labels, page/form/app titles, etc...i.e., not full sentences; used in multiple UI locations):
If the label represents a Java field (i.e., a form field) and matches the form label: label.nameOfField
Else: label.sameAsValue
Examples:
label.firstName = First Name
label.lastName = Last Name
label.applicationTitle = Application Title
label.editADocument = Edit a Document
Content Keys:
projectName.uiPath.messageOrContentType.n.*
Where:
projectName is the short name of the project (or a derived name from the Java package)
uiPath is the UI navigation path to the content key
messageOrContentType (e.g., added, deleted, updated, info, warning, error, title, content, etc.) should be added based on the type of content. Example messages: (1) The page has been updated. (2) There was an error processing your request.
n.* handles the following cases: when there are multiple content areas on a single page (e.g., when the content is separated by an image), when content is in multiple paragraphs, or when content is in an (un)ordered list, a numeric identifier should be appended. Example: ...content.1, ...content.2
When there are multiple content areas on a page and one or more need to be further broken up (based on the HTML example above), a secondary numeric identifier may be appended to the key. Example: ...content.1.1, ...content.1.2
Examples:
training.mySetup.myInfo.content.1 = This is the first sentence of content 1. This is the second sentence of content 1. This content will be surrounded by paragraph tags.
training.mySetup.myInfo.content.2 = This is the first sentence of content 2. This is the second sentence of content 2. This content will also be surrounded by paragraph tags.
training.mySetup.myInfo.title = My Information
training.mySetup.myInfo.updated = Your personal information has been updated.
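As a rough illustration of how such content keys are assembled, here is a small Python sketch (the convention itself is for Java resource bundles; Python is used only to keep the example short). The helper name content_key is hypothetical.

```python
def content_key(project_name, ui_path, content_type, *indexes):
    """Build a key of the form projectName.uiPath.messageOrContentType[.n[.m]]."""
    # `ui_path` is the list of UI navigation segments; `indexes` are the optional
    # numeric identifiers appended when a page has several content areas.
    parts = [project_name, *ui_path, content_type, *(str(i) for i in indexes)]
    return ".".join(parts)


print(content_key("training", ["mySetup", "myInfo"], "content", 1))
print(content_key("training", ["mySetup", "myInfo"], "content", 1, 2))
print(content_key("training", ["mySetup", "myInfo"], "title"))
```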
Advantages / Disadvantages:
+ Label keys can easily be reused; location is irrelevant.
+ For content keys that are not reused, locating the page on the UI will be simple and logical.
- It may not be clear to translators where label keys reside on the UI. This may be a non-issue for translators who do not navigate the pages, but may still be an issue for developers.
- If content keys must be used in more than one location on the UI (which is highly likely), the key name choice will not make sense in the other location(s). In our case, management is not concerned with a duplication of values for content areas, so we will be using different keys (to demonstrate the location on the UI) in this case.
Feedback on this convention - especially feedback that will improve it - would be much appreciated since we are currently revamping our resource bundles! :)
I'd propose the below convention
functionalcontext.subcontext.key
logicalcontext.subcontext.key
This way you can logically group all the common messages in a super context (id in the below example). There are few things that aren't specific to any functional context (like lastName etc) which you can group into logical-context.
order.id=Order Id
order.submission.submit=Submit Order
name.last=Last Name
The method that I have personally used and liked most so far is using the sentence to localise as the key (please replace T with the right syntax, depending on the language). For example:
print(T("Hello world"))
in this case T will search for a key "Hello world". If it is not found then the key is returned, otherwise the value of the key.
This way, you do not need to edit the message (in your default language) unless you need to use parameters. It saved me a LOT of dev time.
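A minimal Python sketch of this lookup-with-fallback behaviour is below. The names make_translator and the sample catalog are hypothetical; real i18n libraries offer richer versions of the same idea.

```python
def make_translator(catalog):
    """Return a T() function that looks up a sentence and falls back to the sentence itself."""

    def T(sentence):
        # If the sentence has no translation, the key (the default-language text) is returned.
        return catalog.get(sentence, sentence)

    return T


# Hypothetical catalog for one target language.
french = {"Hello world": "Bonjour le monde"}
T = make_translator(french)

print(T("Hello world"))
print(T("Goodbye"))
```

The second call has no entry in the catalog, so the original sentence is shown, which is why untranslated text never breaks the page.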

Is this valid YAML?

So for my text parsing in C# question, I got directed at YAML. I'm hitting a wall with this library I was recommended, so this is a quickie.
heading:
  name: A name
  taco: Yes
  age: 32
heading:
  name: Another name
  taco: No
  age: 27
And so on. Is that valid?
Partially. YAML supports the notion of multiple consecutive "documents". If this is what you are trying to do here, then yes, it is correct - you have two documents (or document fragments). To make it more explicit, you should separate them with three dashes, like this:
---
heading:
  name: A name
  taco: Yes
  age: 32
---
heading:
  name: Another name
  taco: No
  age: 27
On the other hand if you wish to make them part of the same document (so that deserializing them would result in a list with two elements), you should write it like the following. Take extra care with the indentation level:
- heading:
    name: A name
    taco: Yes
    age: 32
- heading:
    name: Another name
    taco: No
    age: 27
In general YAML is concise and human readable / editable, but not really human writable, so you should always use libraries to generate it. Also, be aware that there are some breaking changes between different versions of YAML, which can bite you if you are using libraries in different languages which conform to different versions of the standard.
Well, it appears YAML is gone out the window then. I want something both human writable and readable. Plus, this C# implementation...I have no idea if it's working or not, the documentation consists of a few one line code examples. It barfs on their own YAML files, and is an old student project. The only other C# YAML parser I've found uses the MS-PL which I'm not really comfortable using.
I might just end up rolling my own format. Best practices be damned, all I want to do is associate a key with a value.
Try this (Online YAML parser).
You don't have to download anything or do something. Just go there, and copy & paste. That's it.
There appears to be a YAML validator called Kwalify which should give you the answer. You shoulda just gone with the String tokenizing, man. Writing parsers is fun :)
There is another YAML library for .NET which is under development. Right now it supports reading YAML streams. It has been tested on Windows and Mono. Write support is currently being implemented.
CodeProject has one at:
http://www.codeproject.com/KB/recipes/yamlparser.aspx
I haven't tried it too much, but it's worth a look.
You can see the output in the online yaml parser :
http://yaml-online-parser.appspot.com/?yaml=heading%3A%0D%0A+name%3A+A+name%0D%0A+taco%3A+Yes%0D%0A+age%3A+32%0D%0A%0D%0Aheading%3A%0D%0A+name%3A+Another+name%0D%0A+taco%3A+No%0D%0A+age%3A+27%0D%0A&type=json
As you can see, there is only one heading node created.
Just to make an explicit comment about it: You have a duplicate mapping key issue. A YAML processor will resolve this as a !!map, which prohibits duplicate keys. Not all processors enforce this constraint, though, so you might get an incorrect result if you pass an incorrect YAML stream to a processor.
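Since not every processor enforces this, a quick pre-check can be useful. The following Python sketch is deliberately naive: it is not a YAML parser, it only scans for unindented `key:` lines and treats the text as a single document. The function name duplicate_top_level_keys is hypothetical.

```python
import re
from collections import Counter

# An unindented line that starts a mapping entry, e.g. "heading:" or "name: value".
# Lines starting with whitespace, "#", ":" or "-" are ignored.
TOP_LEVEL_KEY = re.compile(r"^([^\s#:-][^:]*):(\s|$)")


def duplicate_top_level_keys(text):
    """Return the sorted top-level keys that appear more than once in `text`."""
    keys = []
    for line in text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            keys.append(match.group(1).strip())
    counts = Counter(keys)
    return sorted(key for key, count in counts.items() if count > 1)


text = """\
heading:
 name: A name
 taco: Yes
 age: 32

heading:
 name: Another name
 taco: No
 age: 27
"""

print(duplicate_top_level_keys(text))
```

For anything beyond a sanity check, a real YAML library configured to reject duplicate keys is the better tool.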
