plone.scale annotation bloated with (useless?) scales - performance

While investigating a ConflictError (see this previous question) I saw a lot of persistent.mapping.PersistentMapping conflicts.
Looking at a specific one it turned out to be a PersistentMapping for plone.scale.
It turns out that a random object with just one image has 562 keys on it, so it is no wonder it gets a conflict error...
Some context on the object that holds this plone.scale annotation:
- dexterity content type
- one of its behaviors has an image field (plone.namedfile.field.NamedBlobImage)
The code to inspect it is as follows:
Start a debugging instance: ./bin/instance debug
from ZODB.utils import p64
OID = 0x568428 # got from zeo client logs
mapping = app._p_jar[p64(OID)]
len(mapping) # that returns 562
The mysterious part is that only 4 keys on that persistent mapping are tuples, while the other 558 are just hashes.
A brief look at the scale method seems to imply that there should be a one-to-one relation between the tuple keys and the hash keys on the persistent mapping.
Investigating the elements further reveals that, indeed, if you look at the width and height properties of all elements there are only 4 different combinations (the ones from the tuples themselves).
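The breakdown described above can be reproduced with a small helper. This is only a sketch: summarize_scales is a hypothetical name, and it assumes that every value stored under a non-tuple key behaves like a dict with "width" and "height" entries, which is what the inspection above suggests but is not verified against plone.scale itself.

```python
from collections import Counter


def summarize_scales(mapping):
    """Count tuple keys, non-tuple (hash) keys and distinct sizes.

    Assumption: each value stored under a non-tuple key is dict-like
    and exposes "width" and "height".
    """
    tuple_keys = [key for key in mapping if isinstance(key, tuple)]
    hash_keys = [key for key in mapping if not isinstance(key, tuple)]
    sizes = Counter(
        (mapping[key]["width"], mapping[key]["height"]) for key in hash_keys
    )
    return len(tuple_keys), len(hash_keys), sizes


# Tiny stand-in for the real PersistentMapping, just to show the shape.
example = {
    ("width", 10): {"width": 10, "height": 20},
    "hash-1": {"width": 10, "height": 20},
    "hash-2": {"width": 10, "height": 20},
    "hash-3": {"width": 5, "height": 5},
}
print(summarize_scales(example))
```

In the debug session, the same function could be called on the mapping loaded from the OID; a small number of distinct sizes spread over hundreds of hash keys would match the observation above.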
As a new scale is generated whenever the modified time is bigger (see the scale method pointed above) and plone.namedfile.scaling.ImageScaling.modified uses context as the source for modified, does that mean that a new scale will be generated at every single update of the object?
So two questions arise from the previous:
Is my assumption true that only 4 scales are really used and the other 558 are old and useless?
If so, shouldn't they be cleaned up?

You may be right, but surely the correct place to report this is


Why is the EntryID Changing in VSTO? The MailItem is not moving folders

I'm writing some code in C# that matches a pattern from the subject and then ingests the email. To initialize my datastore, I go through the current Microsoft.Office.Interop.Outlook.Table.
while (!table.EndOfTable)
{
    Row row = table.GetNextRow();
    string entryId = row["EntryID"].ToString();
    this.SaveInXML(entryId, row);
}
It seems pretty simple. Well, I also have an event (Application.ItemLoad) that I'm watching, too. I notice that in the event the MailItem's EntryID is completely different than the Table's EntryID. In fact, the string lengths are not even the same (See example below). Why is this? Shouldn't they be the same? The item has not moved folders, so I'd assume it's the same. Thank you, all.
Example code:
NameSpace ns = this.Folder.Application.GetNamespace("MAPI");
var mi = ns.GetItemFromID("EF0000003E65593F1D361C44AFBFA24E6F365D6E04782F00") as MailItem;
string entryId = mi.EntryID;
// Output Produced:
// EF0000003E65593F1D361C44AFBFA24E6F365D6E04782F00
// 000000003E65593F1D361C44AFBFA24E6F365D6E0700CC348F1AD97A224B9898503750437E4700000000010C0000CC348F1AD97A224B9898503750437E470000F59160590000
// Notice that the second WriteLine isn't even remotely close to the EntryID that I requested.
Entry identifiers come in two types: short-term and long-term.
Short-term entry identifiers are faster to construct, but their uniqueness is guaranteed only over the life of the current session on the current workstation.
Long-term entry identifiers have a more prolonged lifespan. Short-term entry identifiers are used primarily for rows in tables and entries in dialog boxes, whereas long-term entry identifiers are used for many objects such as messages, folders, and distribution lists.
Use the MailItem.EntryID property if you need to get a long-term entry identifier.
Entry identifiers cannot be compared directly because one object can be represented by two different binary values. Use the NameSpace.CompareEntryIDs method to determine whether two entry identifiers represent the same object.
As Eugene noted, you have two kinds of entry ids - long term and short term. Even for long-term entry ids, they can be different depending on how the item was opened. Long term entry ids always start with "00000000". Short term entry ids can only be used in the current MAPI session and therefore should not be persisted to be used across different sessions.
You must treat entry ids as black boxes and never compare them directly - always use Namespace.CompareEntryIDs.

Gensim's FastText KeyedVector out of vocab

I want to use the read-only version of Gensim's FastText Embedding to save some RAM compared to the full model.
After loading the KeyedVectors version, I get the following error when fetching a vector:
IndexError: index 878080 is out of bounds for axis 0 with size 761210
The error occurs when using words that should be out-of-vocabulary e.g. "lawyerxy" instead of "lawyer". The full model returns a vector for both.
from gensim.models import KeyedVectors
model = KeyedVectors.load("model.kv")
model.wv.__getitem__("lawyerxy")
So, my assumption is that the KeyedVectors do not offer FastText's out-of-vocabulary function - a key feature for my use case. This limitation is not given in the documentation:
Can anyone confirm that assumption and/or name a fix to allow vectors for "lawyerxy" etc.?
The KeyedVectors name is (as of gensim-3.8.0) just an alias for class Word2VecKeyedVectors, which only maintains a simple word (as key) to vector (as value) mapping.
You shouldn't expect FastText's advanced ability to synthesize vectors for out-of-vocabulary words to appear in any model/representation that doesn't explicitly claim to offer that ability.
(I would expect a lookup of an out-of-vocabulary word to give a clearer KeyError rather than the IndexError you've reported. But, you'd need to show exactly what code created the file you're loading, and triggered the error, and the full error stack, to further guess what's going wrong in your case.)
Depending on how your model.kv file was saved, you might be able to load it, with retained OOV-vector functionality, by using the class FastTextKeyedVectors instead of plain KeyedVectors.
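To make the difference concrete, here is a toy sketch of the general idea behind synthesizing a vector for an unseen word from character n-grams. It is not gensim's implementation: the function names, the tiny bucket table and the plain FNV-1a hash are assumptions chosen for illustration. The point is that the lookup goes through hashed n-gram buckets rather than a word-to-row mapping, which a plain word-keyed structure does not carry.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of the word wrapped in '<' and '>'."""
    wrapped = "<" + word + ">"
    grams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


def fnv1a_32(text):
    """32-bit FNV-1a hash; stands in for whatever hash the library uses."""
    value = 2166136261
    for byte in text.encode("utf-8"):
        value ^= byte
        value = (value * 16777619) & 0xFFFFFFFF
    return value


def oov_vector(word, bucket_vectors, min_n=3, max_n=6):
    """Average the bucket vectors selected by the word's n-grams."""
    dim = len(bucket_vectors[0])
    grams = char_ngrams(word, min_n, max_n)
    if not grams:
        return [0.0] * dim
    total = [0.0] * dim
    for gram in grams:
        row = bucket_vectors[fnv1a_32(gram) % len(bucket_vectors)]
        for i, component in enumerate(row):
            total[i] += component
    return [component / len(grams) for component in total]


# Hypothetical bucket table with 8 buckets of dimension 2.
buckets = [[float(i), float(-i)] for i in range(8)]
print(char_ngrams("ab", 3, 3))
print(oov_vector("lawyerxy", buckets))
```

Because "lawyerxy" shares most of its n-grams with "lawyer", a model that keeps the n-gram buckets can produce related vectors for both, while a structure that only stores whole-word vectors cannot.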

What API / Database Does Elasticsearch Use to Generate Random ENV Names

When launching new 'unnamed' elasticsearch nodes (?) I see a unique name displayed in the debugging output, in this case the node is called: Riot, other gems include: "Oneg the Prober"
org.elasticsearch.env: [Riot] max file descriptors [10240] for elasticsearch process likely too low, consider increasing to at least [65536]
There is always a clever unique name coming from somewhere? I've looked for this line in the source-code and can only find a reference to Locale.ROOT - but I cannot find the call to fetch a new unique name and I think they're always funny and would like to use a similar generator (:
The list of names is in elastic/elasticsearch/core/src/main/resources/config/names.txt
That file was removed, in the context of making node names persistent. The change was merged to master in 2016, quite some time before this question was asked.
You can find "Riot" on line 2131 and "Oneg the Prober" on line 1898.
Came across this list which has 2938 character names being used, not sure if this is the original source.
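If the goal is just a similar generator, a minimal sketch is to pick a random non-empty line from such a list. How Elasticsearch itself chose the name is not shown here, so treat this as an assumption; pick_node_name is a hypothetical helper, and the two sample names are the ones quoted in the question.

```python
import random


def pick_node_name(names, rng=None):
    """Pick one name at random from an iterable of candidate lines."""
    rng = rng or random.Random()
    candidates = [line.strip() for line in names if line.strip()]
    if not candidates:
        raise ValueError("no candidate names supplied")
    return rng.choice(candidates)


# In practice the lines would come from a names file opened for reading.
sample_names = ["Riot\n", "Oneg the Prober\n"]
print(pick_node_name(sample_names))
```

Passing a seeded random.Random instance as rng makes the choice reproducible, which is handy in tests.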

SNMP OID with non-unique node names

I am writing an extension to my company's existing SNMP MIB. I have a whole list of objects, with the same properties on each. I want to be able to get and set these through SNMP.
So for example, consider my object has name, desc, arg0, arg1. What I want is to be able to refer to these as:
However, leaf nodes apparently must have unique names, so I am unable to define this.
I can use a SNMP table to produce:
But there is nowhere to look up that 2 means ObjectB. This leaves it open to user error looking up the wrong value and setting the wrong thing.
At the moment the best solution I can see is:
which involves defining name for every object (there are 20 or so of them). The set of objects is fixed, so this is ok...just not very tidy.
Is there some way to define names for index in the table?
Is there some way of defining a container type?
Is there some way of allowing leaf nodes to be non-unique?
Any other ideas?
You should definitely use SNMP tables to accomplish what is required. This is the only way.
MIB object names must be unique within the entire MIB file.
You can easily use an object of OCTET STRING type as the table index. Each byte/symbol/char of the OCTET STRING value is then translated to its corresponding numeric ASCII code in the OID.
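As a rough illustration of that translation, the sketch below builds the instance OID for a string-indexed row. The helper names and the column OID are placeholders, not part of any real MIB. It assumes the usual SMIv2 encoding, where a variable-length string index is preceded by its length unless the index is declared IMPLIED.

```python
def octet_string_index(value, implied=False):
    """Return the OID sub-identifiers for an OCTET STRING index value."""
    data = value.encode("ascii")
    sub_ids = list(data)
    if not implied:
        # Variable-length index without IMPLIED: length comes first.
        sub_ids.insert(0, len(data))
    return sub_ids


def instance_oid(column_oid, index_value, implied=False):
    """Append the encoded index to a column OID given in dotted form."""
    suffix = [str(sub_id) for sub_id in octet_string_index(index_value, implied)]
    return ".".join([column_oid] + suffix)


# Placeholder column OID, used only for illustration.
print(instance_oid("1.3.6.1.4.1.99999.1.1.2", "ObjectB"))
```

With an index like this, the row for ObjectB is addressed by its name rather than by an arbitrary number such as 2, at the cost of longer OIDs.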
I ended up just using a naming convention and adding each of the settings directly into the MIB.
Not really the answer I wanted, but it means that all of the settings show up in the MIB, and that reduces the chance of users setting the wrong setting.

How can I retrieve object keys from a sequence in freemarker?

I have a list of objects that are returned as a sequence, I would like to retrieve the keys of each object so as to be able to display the object correctly. At the moment I try data?first?keys which seems to get something like the queries that return the objects (Not sure how to explain that last sentence either but img below shows what I'm trying to explain).
The number of objects returned is correct (7), but my aim is to display the keys for each object. The macro that attempts this is here (from the Apache OFBiz development book, chapter 8).
It seems my sequence is a list of hashes, as explained by Daniel Dekany in this post:
The original problem is that, someHash[key] expects a
string as key. Because, the hash type of FTL, by definition, maps
string keys to arbitrary values. It's not the same as Java's Map.
(Note that to further complicate the matters, in FTL
someSequenceOrString[index] expects an integer index. So, the [] thing
is used for that too.) Now someBeanWrappedMap(key) has technically
nothing to do with all the []-s, it's just a method call, so it
accepts all kind of keys. If you have a Map with non-string keys, you
must use that.
Thanks D Dekany if you're on stack, this ended my half day frustration with the ftl template.
