What is the number prepended to the Sublime Text "Goto Anything" search? - sublimetext

Whenever I use the Goto Anything search in Sublime Text and start typing to search the files in my current project I get a whole bunch of results based on Sublime Text's fuzzy-search algorithm, each prepended with a number.
I assume this is some sort of score for the search "strength", but I just wanted to confirm this. What is this number based on?

It seems like the numbers are indeed representative of match strength, as you assumed.
I noticed an odd effect when testing your hypothesis, so I created the dummy files CustomCompletions.CustomCompletions and CustomCompletions (a file with no extension) for further comparison.
Here are the results:
As you can see,
CustomCompletions has the highest ranking with 1524
CustomCompletions.py & CustomCompletions.todo share a rank of 1507
CustomCompletions.CustomCompletions & CustomCompletions.sublime-settings share a rank of 1490
All of the remaining files, which contain additional text in the base name, continue to receive lower rankings.
What I found odd was that the 2nd & 3rd groups had different rankings, despite sharing a base file name that exactly matches the query.
I figured that it might be due to the number of characters in the file extension, so I tested that assumption by creating the following files:
CustomCompletions.a
CustomCompletions.ab
CustomCompletions.abc
CustomCompletions.abcd
CustomCompletions.abcde
CustomCompletions.abcdef
CustomCompletions.abcdefg
CustomCompletions.abcdefgh
CustomCompletions.abcdefghi
CustomCompletions.abcdefghij
CustomCompletions.1
CustomCompletions.12
CustomCompletions.123
CustomCompletions.1234
CustomCompletions.12345
CustomCompletions.123456
CustomCompletions.1234567
CustomCompletions.12345678
CustomCompletions.123456789
CustomCompletions.1234567890
But it turns out they all ranked at 1507, the same ranking as the 2nd group.
Because of that outcome, I am still unsure which criteria affect the ranking of files that share a base name exactly matching the Goto Anything query but have differing file extensions.


To build a flow using Power Automate to download linked csv report in gmail

I'm trying to create a flow using Power Automate (which I'm quite new to) that can get the link/URL in an email I receive daily, then download the .csv file that normally a click to the link would do, and then save the file to a given local folder.
An example of the email I get:
Screenshot of the email I get daily
I searched the Power Automate Community and found an insightful post (LINK) whose answer almost solved it. However, after I followed the steps and built the flow, it kept failing at the Compose step.
Screenshot of the Flow & Error Message
The flow
Error message
Expression used:
substring(body('Html_to_text'),add(indexOf(body('Html_to_text'),'here'),5),sub(indexOf(body('Html_to_text'),'Name'),5))
It seems the expression can't actually extract the URL/link. I'm not sure, and I searched but couldn't find any other posts that help.
Please share any insights on approaches or workarounds that you think may help me solve the problem. Thank you!
PPPPPPPPisces
We need to break down the parts of the function here, which needs 3 pieces of information:
substring(1 text to search, 2 starting position of the text you want, 3 length of text)
For example, if you were trying to return an unknown number from the text dog 4567 bird
Our function would have 3 parts.
body('Html_to_text'), this bit gets the text we are searching for
add(indexOf(body('Html_to_text'),'dog'),4), this bit finds the position in the text 4 characters after the start of the word dog (3 letters for dog + the space)
sub(sub(indexOf(body('Html_to_text'),'bird'),1),add(indexOf(body('Html_to_text'),'dog'),4)), I've changed the structure of your code here because this part needs to return the length of the URL, not the ending position. So here, we take the position just past the end of the URL (the position of the word bird minus 1, for the space before it) and subtract from it the position of the start of the URL (the position of the word dog plus 4) to get the length.
In your HTML to text output, you need to check what the text looks like, find a word just before the URL starts and a word just after the URL ends, and count the exact number of characters between each word and the URL. You can then put those words and counts into your code.
More generally, when you have a complicated problem that you need to troubleshoot, you can break it down into steps. For example, rather than putting that big mess of code into a single block, you can put each chunk of the code in its own Compose, and then add one final Compose to bring them all together. That way, when you run it you can see what each part outputs or where it is failing, and experiment from there to discover what is wrong.
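To make the start/length arithmetic concrete, here is a rough Python sketch of the same idea using the dog 4567 bird example. It is only an illustration, not Power Automate code: the helper names (substring, extract_between) and the offsets are made up for this example, and the end position is treated as the index just past the last wanted character. Each intermediate value gets its own variable, which mirrors the one-Compose-per-step advice.

```python
def substring(text, start, length):
    """Mimics the substring(text, startIndex, length) signature."""
    return text[start:start + length]


def extract_between(text, before, after, skip_before, back_from_after):
    """Return the text between two marker words (hypothetical helper).

    skip_before: characters to skip from the start of `before`
                 (its length plus any separator).
    back_from_after: characters to step back from the start of `after`
                     (the separator in front of it).
    """
    start = text.index(before) + skip_before    # first wanted character
    end = text.index(after) - back_from_after   # just past the last wanted character
    length = end - start                        # the third argument is a length
    return substring(text, start, length)


sample = "dog 4567 bird"
number = extract_between(sample, "dog", "bird", skip_before=4, back_from_after=1)
print(number)
```

Printing start, end and length separately while experimenting shows quickly whether a marker word was found where you expected it.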

Reporting Multiple Values & Sorting

Having a bit of an issue and unsure if it's actually possible to do.
I'm working on a file in which I enter target progression versus the actual target and report the % outcome.
PAGE 1
¦NAME ¦TAR 1 %¦TAR 2 %¦TAR 3 %¦TAR 4 %¦OVERALL¦SUB 1¦SUB 2¦SUB 3¦
¦NAME1¦   114%¦   121%¦   100%¦   250%¦   146%¦    2¦    0¦   0%¦
¦NAME2¦    88%¦   100%¦    90%¦    50%¦    82%¦    0¦    1¦   0%¦
¦NAME3¦    82%¦    54%¦    64%¦   100%¦    75%¦    6¦    6¦  15%¦
¦NAME4¦   103%¦    64%¦    56%¦    43%¦    67%¦    4¦    4¦  24%¦
¦NAME5¦    87%¦    63%¦    89%¦     0%¦    60%¦    3¦    2¦  16%¦
I already have it sorting all rows by the Overall % column so I can see the ranking at a glance, but I am creating a second page on which I need to reference specific figures.
So on the second page I would like to sort and reference different columns, for example:
PAGE 2
TOP TAR 1¦Name of top %¦Top %¦
TOP TAR 2¦Name of top %¦Top %¦
Is something like this possible to do?
Essentially I'm creating an Employee of the Month form that automatically works out who has topped what.
I'm willing to drop a paypal donation for whoever can figure this out for me as I've been doing it manually every month and would appreciate the time saved
I don't think a complicated array formula is necessary for this - I am suggesting a fairly standard Index/Match approach.
First set up the row titles - you can just copy and transpose them from Page 1, or use a formula in A2 of Page 2 like
=transpose('Page 1'!B1:E1)
Then use them in an index/match to get the data in the corresponding column of the main sheet and find its maximum (in C2)
=max(index('Page 1'!A:E,0,match(A2,'Page 1'!A$1:E$1,0)))
Finally look up the maximum in the main sheet to find the corresponding name:
=index('Page 1'!A:A,match(C2,index('Page 1'!A:E,0,match(A2,'Page 1'!A$1:E$1,0)),0))
If you think there could be a tie for first place with two or more people getting the same score, you could use a filter to get the different names:
So if the max score is in B8 this time (same formula)
=max(index('Page 1'!A:E,0,match(A8,'Page 1'!A$1:E$1,0)))
the different names could be spread across the corresponding row using transpose (in C8)
=ArrayFormula(TRANSPOSE(filter('Page 1'!A:A,index('Page 1'!A:E,0,match(A8,'Page 1'!A$1:E$1,0))=B8)))
I have changed the test data slightly to show these different scenarios
Results
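For anyone who wants to check the logic outside the spreadsheet, here is a small Python sketch of the same idea: for each target column, find the maximum and then collect every name that has that value, so ties are handled the way the filter version does. The data is the Page 1 table from the question; the function name top_per_target is just something I made up for the illustration.

```python
# Page 1 data from the question: name followed by TAR 1..4 percentages.
HEADERS = ["TAR 1 %", "TAR 2 %", "TAR 3 %", "TAR 4 %"]
ROWS = [
    ("NAME1", 114, 121, 100, 250),
    ("NAME2", 88, 100, 90, 50),
    ("NAME3", 82, 54, 64, 100),
    ("NAME4", 103, 64, 56, 43),
    ("NAME5", 87, 63, 89, 0),
]


def top_per_target(headers, rows):
    """Map each target header to (top value, [names with that value])."""
    result = {}
    for col, header in enumerate(headers, start=1):
        best = max(row[col] for row in rows)                      # like MAX(INDEX(...))
        names = [row[0] for row in rows if row[col] == best]      # like FILTER(...)
        result[header] = (best, names)
    return result


tops = top_per_target(HEADERS, ROWS)
for header, (value, names) in tops.items():
    print(header, value, ", ".join(names))
```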

Need an algorithm that detects diffs between two files for additions and reorders

I am trying to figure out if there are existing algorithms that can detect changes between two files in terms of additions but also reorders. I have an example below:
1 - User1 commit
processes = 1
a = 0
allactive = []
2 - User2 commit
processes = 2
a = 0
allrecords = range(10)
allactive = []
3 - User3 commit
a = 0
allrecords = range(10)
allactive = []
processes = 2
I need to be able to say that for example user1 code is the three initial lines of code, user 2 added the "allrecords = range(10)" part (as well as a number change), and user 3 did not change anything since he/she just reordered the code.
Ideally, at commit 3, I want to be able to look at the code and say that from character 0 to 20 (this is user1's code), 21-25 user2's code, 26-30 user1's code etc.
I know there are two popular algorithms, Longest common subsequence and longest common substring but I am not sure which one can correctly count additions of new code but be able also to identify reorders.
Of course this still leaves out the question of having the same substring existing twice in a text. Are there any other algorithms that are better suited to this problem?
Each "diff" algorithm defines a set of possible code-change edit types, and then (typically) tries to find the smallest set of such changes that explains how the new file resulted from the old. Usually such algorithms are defined purely syntactically; semantics are not taken into account.
So what you want, based on your example, is an algorithm that allows "change line", "insert line", "move line" (and presumably "delete line" [not in your example but necessary for a practical set of edits]). Given this, you ought to be able to define a dynamic programming algorithm to find a smallest set of edits that explains how one file differs from another. Note that this set is defined in terms of edits to whole lines, rather like classical "diff"; of course classical diff does not have "change line" or "move line", which is why you are looking for something else.
You could pick different types of deltas. Your example explicitly noted "number change"; if narrowly interpreted, this is NOT an edit on lines, but rather within lines. Once you start to allow partial line edits, you need to define how much of a partial line edit is allowed ("unit of change"). (Will your edit set allow "change of digit"?)
Our Smart Differencer family of tools defines the set of edits over well-defined sub-phrases of the targeted language; we use formal language grammar (non)terminals as the unit of change. [This makes each member of the family specific to the grammar of some language] Deltas include programmer-centric concepts such as "replace phrase by phrase", "delete listmember", "move listmember", "copy listmember", "rename identifier"; the algorithm operates by computing a minimal tree difference in terms of these operations. To do this, the SmartDifferencer needs (and has) a full parser (producing ASTs) for the language.
You didn't identify the language for your example. But in general, for a language looking like that, the SmartDifferencer would typically report that User2 commit changes were:
Replaced (numeric literal) "1" in line 1 column 13 by "2"
Inserted (statement) "allrecords = range(10)" after line 2
and that User3 commit changes were:
Move (statement) at line 1 after line 4
If you know who contributed the original code, with the edits you can straightforwardly determine who contributed which part of the final answer. You have to decide the unit of reporting; e.g., whether you want to report such contributions on a line-by-line basis for easy readability, or whether you really want to track that Mary wrote the code but Joe modified the number.
Detecting that User3's change is semantically null can't be done with a purely syntax-driven diff tool of any kind. To do this, the tool has to be able to compute the syntactic deltas somehow, and then compute the side effects of all statements (well, "phrases"), requiring a full static analyzer of the language to interpret the deltas and see if they have such null effects. Such a static analyzer requires a parser anyway, so it makes sense to do a tree-based differencer, but it also requires a lot more than just a parser [We have such language front ends and have considered building such tools, but haven't gotten there yet].
Bottom line: there is no simple algorithm for determining "that user3 did not change anything". There is reasonable hope that such tools can be built.
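As a point of comparison, here is a deliberately naive whole-line sketch in Python of the "insert line" / "move line" attribution idea. It is not how the Smart Differencer works and it is not a minimal-edit algorithm: it simply treats a line that already existed anywhere in the previous version as kept or moved (so it keeps its author), and any unseen line as inserted or changed by the current committer. The function name attribute_lines is hypothetical. Duplicate lines are matched in order of appearance, which is only a rough answer to the "same substring twice" problem.

```python
from collections import defaultdict, deque


def attribute_lines(commits):
    """commits: list of (author, text) in order.

    Returns [(line, author)] for the final version.
    """
    previous = []  # (line, author) pairs of the previous version
    for author, text in commits:
        # For each distinct line of the previous version, queue up its authors.
        pool = defaultdict(deque)
        for line, owner in previous:
            pool[line].append(owner)

        current = []
        for line in text.splitlines():
            if pool[line]:
                # Line existed before (possibly elsewhere): kept or moved.
                current.append((line, pool[line].popleft()))
            else:
                # Unseen line: inserted, or changed as a whole line.
                current.append((line, author))
        previous = current
    return previous


commits = [
    ("User1", "processes = 1\na = 0\nallactive = []"),
    ("User2", "processes = 2\na = 0\nallrecords = range(10)\nallactive = []"),
    ("User3", "a = 0\nallrecords = range(10)\nallactive = []\nprocesses = 2"),
]
for line, author in attribute_lines(commits):
    print(f"{author}: {line}")
```

Note that because the unit of change here is the whole line, "processes = 2" is credited entirely to User2, even though only the number changed. User3 ends up owning nothing, since the reorder introduces no new lines.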

Wiktionary/MediaWiki Search & Suffix Filtering

I'm building an application that will hopefully use Wiktionary words and definitions as a data source. In my queries, I'd like to be able to search for all Wiktionary entries that are similar to user provided terms in either the title or definition, but also have titles ending with a specified suffix (or one of a set of suffixes).
For example, I want to find all Wiktionary entries that contain the words "large dog", like this:
https://en.wiktionary.org/w/api.php?action=query&list=search&srsearch=large%20dog
But further filter the results to only contain entries with titles ending with "d". So in that example, "boarhound", "Saint Bernard", and "unleashed" would be returned.
Is this possible with the MediaWiki search API? Do you have any recommendations?
This is mostly possible with ElasticSearch/CirrusSearch, but disabled for performance reasons. You can still use it on your wiki, or attempt smart search queries.
Usually for Wiktionary I use yanker, which can access the page table of the database. Your example (one-letter suffix) would be huge, but for instance .*hound$ finds:
Afghan_hound
Bavarian_mountain_hound
Foxhound
Irish_Wolfhound
Mahound
Otterhound
Russian_Wolfhound
Scottish_Deerhound
Tripehound
basset_hound
bearhound
black_horehound
bloodhound
boarhound
bookhound
boozehound
buckhound
chowhound
coon_hound
coonhound
covert-hound
covert_hound
coverthound
deerhound
double-nosed_andean_tiger_hound
elkhound
foxhound
gazehound
gorehound
grayhound
greyhound
harehound
heckhound
hell-hound
hell_hound
hellhound
hoarhound
horehound
hound
limehound
lyam-hound
minkhound
newshound
nursehound
otterhound
powder_hound
powderhound
publicity-hound
publicity_hound
rock_hound
rockhound
scent_hound
scenthound
shag-hound
sighthound
sleuth-hound
sleuthhound
slot-hound
slowhound
sluthhound
smooth_hound
smoothhound
smuthound
staghound
war_hound
whorehound
wolfhound
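If neither option is available to you, a fallback (my suggestion, not something the search API does for you) is to run the ordinary search and filter the returned titles by suffix on the client. A minimal Python sketch follows, assuming the standard action=query&list=search endpoint from the question plus format=json and srlimit; the helper names search_url and filter_by_suffix are made up, and the sample titles are the ones mentioned in the question plus one non-matching title added for illustration.

```python
from urllib.parse import urlencode

API = "https://en.wiktionary.org/w/api.php"


def search_url(terms, limit=50):
    """Build a full-text search URL for the MediaWiki search API."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": terms,
        "srlimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def filter_by_suffix(titles, suffixes):
    """Keep titles ending with any of the given suffixes (case-insensitive)."""
    wanted = tuple(s.lower() for s in suffixes)
    return [t for t in titles if t.lower().endswith(wanted)]


print(search_url("large dog"))

# Titles as they might come back in query.search[].title of the JSON response.
sample_titles = ["boarhound", "Saint Bernard", "unleashed", "mastiff"]
print(filter_by_suffix(sample_titles, ["d"]))
```

The obvious drawback is that the filter only sees the pages of results you actually fetch, so for a rare suffix you may have to page through many results.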

Logic for parsing names

I am wanting to solve this problem, but am kind of unsure how to correctly structure the logic for doing this. I am given a list of user names and I am told to find an extracted name for that. So, for example, I'll see a list of user names such as this:
jason
dooley
smith
rob.smith
kristi.bailey
kristi.betty.bailey
kristi.b.bailey
robertvolk
robvolk
k.b.dula
kristidula
kristibettydula
kristibdula
kdula
kbdula
alexanderson
caesardv
joseluis.lopez
jbpritzker
jean-luc.vey
dvandewal
malami
jgarciathome
christophertroethlisberger
How can I then turn each user name into an extracted name? The only parameter I am given is that every user name is guaranteed to have at least a partial person's name.
So for example, kristi.bailey would be turned into "Kristi Bailey"
alexanderson would be turned into "Alex Anderson"
So, the pattern I see is that if there is one period, I split the user name into two strings (possibly a first and last name). If there are two periods, it is first, middle, and last. The case I can't find the logic for is when the name is clumped together, like alexanderson or jgarciathome. How can I turn that into an extracted name? I was thinking of separating the names whenever I see 2 consonants and a vowel in a row, but I don't think that will work.
Any ideas?
I'd use a string.StartsWith method and a string.EndsWith method and determine the maximum overlap on each. As long as it's more than 2 characters, call that the common name. Sort them into buckets based on the common name. It's a naive implementation, but that's where I'd start.
Example:
string name1 = "kristi.bailey";
string name2 = "kristi.betty.bailey";
// The first 6 characters ("kristi") overlap for the first name:
bool sameFirstName = name2.StartsWith(name1.Substring(0, 6)); // true
// The last 6 characters ("bailey") overlap for the last name:
bool sameLastName = name2.EndsWith(name1.Substring(7)); // true
HTH!
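To find the maximum overlap without hard-coding the 6 and 7, something like the following Python sketch could work. It is only an illustration of the overlap idea above: the function name common_affixes is made up, and it uses os.path.commonprefix, which compares plain strings character by character despite its name.

```python
import os


def common_affixes(a, b):
    """Return the longest shared prefix and longest shared suffix of a and b."""
    prefix = os.path.commonprefix([a, b])
    # Reverse both strings to reuse the prefix routine for suffixes.
    suffix = os.path.commonprefix([a[::-1], b[::-1]])[::-1]
    return prefix, suffix


prefix, suffix = common_affixes("kristi.bailey", "kristi.betty.bailey")
print(repr(prefix), repr(suffix))
```

The raw overlaps include the separators and any shared initial, so you would probably want to trim at the last period of the prefix and the first period of the suffix before bucketing.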
