FPC JSON-unit: don't re-sort - pascal

Every time I save a file using the JSONConf unit, the JSON file is written with its keys in a new order. E.g. "myname": {...} can end up at the end of the file or somewhere in the middle of the key order.
How can I force a single, permanent sort order, e.g. alphabetical? That way a text editor can reopen the file at the same key at the same offset.
Lazarus 1.5 SVN, FPC 3.0.
Example of the desired sort order:
{
  "aa": 1,
  "dd": 1,
  "tt": 1,
  "z": 1
}

Related

Reversing order in ng-repeat output using TamperMonkey

So I have this line on a page that provides a list of serial numbers as they are scanned in:
<div class='animate' ng-repeat='serialIdEntry in scan.scannedSerialIds track by $index'>
Presently, each newly scanned serial is appended to the bottom of the list, and it would be far more efficient and helpful if the order were reversed.
Meaning, I need the most recent scan to be at the top, and the whole of the list (as displayed) in reverse order.
I believe I can accomplish that by adding | orderBy:'':true after the track by $index argument, or by simply replacing that line with one that includes the orderBy argument.
I suspect the page updates via AJAX/Angular, and I'm wondering how best to modify the output by way of a TamperMonkey script. I've been reading up on waitForKeyElements, but that seems to largely speak to appending a new line of code rather than inserting a new snippet inline, or replacing a single line with a modified one.

Copying file details from Explorer as tabular text

I am looking for a way to easily copy the file details that appear in Windows Explorer (details view) and paste them as tabular text.
Ideally, the procedure would be to select some files in an Explorer window, make a choice in the context menu (or use a keyboard shortcut), and the list would be copied to the clipboard. When pasting, the tabular format would be preserved, so that Excel would recognize the columns or Word would keep the tabs (or create a table).
I would like a solution that transfers the columns actually displayed, and not just a predefined set of details such as name + size + date.
Do you think there is an easy way to achieve this functionality? I am ready to program in any language if necessary, but I need a path to follow. I also need a way to integrate it into Windows (Vista and later) so that a few clicks suffice.
1) Create a context menu shell extension. It must implement IShellExtInit, IContextMenu(2,3) and IObjectWithSite. Register your shell extension under the HKCR\AllFilesystemObjects key.
2) Before Explorer calls IContextMenu.InvokeCommand, it calls IObjectWithSite.SetSite. Save the Site value.
3) Inside IContextMenu.InvokeCommand:
Site.QueryInterface(IServiceProvider, ServiceProvider)
ServiceProvider.QueryService(SID_SFolderView, IColumnManager, ColumnManager)
ColumnManager.GetColumnCount(CM_ENUM_VISIBLE, Count)
GetMem(Keys, SizeOf(PPropertyKey) * Count)
ColumnManager.GetColumns(CM_ENUM_VISIBLE, Keys, Count)
Now you have an array of all visible columns.
4) Extract the IShellFolder of the current folder from the IDataObject passed to your handler in IShellExtInit.Initialize.
5) Extract the PItemIDList of every file in the IDataObject.
6) For every PItemIDList:
6.1) Call ShellFolder.BindToObject(Child, nil, IPropertyStore, PropertyStore) to get the PropertyStore of the item.
6.2) For every PropertyKey in the Keys array:
6.2.1) Call PropertyStore.GetValue(PropertyKey, Value);
6.2.2) Convert Value to a string with the PropVariantToStringAlloc function.
6.2.3) Store the string representation of Value in your internal text storage.
7) Copy your text storage to the clipboard.
8) Free all resources.
Update 1
You can also try IShellFolder2.GetDetailsEx instead of IPropertyStore.
Update 2
When using IPropertyStore, you can additionally call IPropertySystem.FormatForDisplayAlloc to format the value. For example, for PKEY_Size, PropertyStore.GetValue returns "100000", but PropertySystem.FormatForDisplayAlloc will format the value as "100 KB".
Update 3
It was quite an interesting task, so I created my own shell extension which copies the details to the clipboard. It can be downloaded from http://www.shellace.com/bin/CopyDetails.zip

Editing a CSV file in place, row by row

I have a long CSV file with two columns of numbers:
1,2
2,5
7,3
etc...
I would like to add a third column equal to the sum of the first two:
1,2,3
2,5,7
7,3,10
The following code solves the problem, but it makes a copy of the input file with the third column appended. Instead, I would like to operate on the input file line by line, writing the third column to each line as I go along. If the process were interrupted by an error for some reason, the results for the first half of the file would already be saved and would not need to be recalculated.
I can't come up with a good way to do this using Ruby's CSV class. Here's my current solution with the copied file:
require 'csv'

# build a small example input file
CSV.open("big_file.csv", "w") do |csv|
  csv << %w{1 2}
  csv << %w{2 5}
  csv << %w{3 8}
end

big_csv_file = CSV.open("big_file.csv", 'r')

# I'm creating a copy of big_file.csv here
# I'd rather edit it in place
CSV.open("copy_with_extra_column.csv", "w") do |csv|
  big_csv_file.each do |row|
    row << (row[0].to_i + row[1].to_i)  # append the sum of the first two columns
    csv << row
  end
end
To put this another way: there is no way, at the fundamental file level, to "insert" the sum into the file. In your example:
1,2
2,5
7,3
If we ignore the whole notion of a "CSV" file (which is really just a concept layered on top of a plain text stream), then to "insert" the text ",3" at the end of the first line we need to do all of these things:
move the "\n" after the 2, and all the text that follows it, two positions later in the file (leaving some junk in its place)
overwrite the junk with ",3"
Then you would repeat this process for each additional row.
This is obviously very inefficient. In simple terms, the CSV file format is not designed for efficient insertion of data.
Your two options are:
Load the file into memory (e.g. as an array of lines), operate on it there, and then write it all back out over the existing file. Assuming your file only grows, this will work fine, but you'll need to be willing to allocate enough memory to read and operate on the whole file.
Write to a temporary file as you work through the data, and then move the temporary file in place of the original when you're done.
Updating the file "in place" is not practical.
A file is like one long string, for example:
1,2\n2,5
However, unlike a string, you can only overwrite characters in a file. In the example above, there are 7 characters. You can overwrite any of those characters with any characters you choose. So for instance, if you put the sum of the numbers at position 0 and position 2 into position 3, the result is:
1,232,5
That's probably not what you want, because it looks like the first two numbers are 1 and 232 and their sum is 5. However, that is all you can do when editing a file in place: you can only overwrite characters with other characters.
For a large file, you can read in one line, then write the altered line to a new file. When you are done, you can delete the original file, and then you can rename the new file to the old file name. You can use the Tempfile class to avoid name clashes for the new file name.
Instead of CSV.open(), try CSV.read(). It's obviously a little ugly, but for example:
big_csv_file = CSV.read("big_file.csv")

big_csv_file.each do |row|
  row << (row[0].to_i + row[1].to_i)  # append the sum to every row
end

CSV.open("copy_with_extra_column.csv", "w") do |csv|
  big_csv_file.each do |row|
    csv << row
  end
end
If you need the file on disk to always be up to date, the alterations and the writing will obviously need to happen inside a loop.

OpenTBS Replace Text

Is it possible to replace text without having to create a TAG?
I'm using a template to fill in some data in a Word document. One of the things I need to fill in is the revision number, something like "1º Revision" ([onshow.rev]).
But the generated file can be re-submitted several times, and I need to increment the revision each time; since the tag [onshow.rev] is gone after the first merge, I can't do it any more.
Is it possible to replace just the text "1º Revision" with something like "2º Revision"?
Thanks
With TinyButStrong, you can directly modify the contents using $TBS->Source.
But with OpenTBS on top of TBS, you have to take care that only the current sub-file is actually accessible through $TBS->Source.
For example, if you are working on an MS Excel sheet, then you have to call $TBS->PlugIn(OPENTBS_SELECT_SHEET, $Sheet) in order to select the sheet to modify.
Here is how to replace your item in the template:
$TBS->Source = str_replace("1º Revision", "2º Revision", $TBS->Source);

sed optimization (large file modification based on smaller dataset)

I have to deal with very large plain text files (over 10 gigabytes; yeah, I know it depends on what we call large), with very long lines.
My most recent task involves some line editing based on data from another file.
The data file (which should be modified) contains 1,500,000 lines, each of them e.g. 800 characters long. Each line is unique and contains exactly one identity number, and each identity number is unique.
The modifier file is e.g. 1800 lines long; each line contains an identity number plus an amount and a date which should be modified in the data file.
I just transformed the modifier file (with Vim regexes) into sed commands, but it's very inefficient.
Let's say I have a line like this in the data file:
(some 500 characters)id_number(some 300 characters)
And I need to modify the data in the 300-character part.
Based on the modifier file, I come up with sed lines like this:
/id_number/ s/^\(.\{650\}\).\{20\}/\1CHANGED_AMOUNT_AND_DATA/
So I have 1800 lines like this.
But I know that even on a very fast server, if I do
sed -i.bak -f modifier.sed data.file
it's very slow, because it has to check every pattern against every line.
Isn't there a better way?
Note: I'm not a programmer and have never learnt (in school) about algorithms.
I can use awk, sed, and an outdated version of Perl on the server.
My suggested approaches (in order of desirability) would be to process this data as:
A database (even a simple SQLite-based DB with an index will perform much better than sed/awk on a 10GB file)
A flat file containing fixed record lengths
A flat file containing variable record lengths
Using a database takes care of all those little details that slow down text-file processing (finding the record you care about, modifying the data, storing it back to the DB). Take a look at DBD::SQLite in the case of Perl.
If you want to stick with flat files, you'll want to maintain an index manually alongside the big file so you can more easily look up the record numbers you'll need to manipulate. Or, better yet, perhaps your ID numbers are your record numbers?
If you have variable record lengths, I'd suggest converting to fixed record lengths (since it appears only your ID is of variable length). If you can't do that, perhaps any existing data will never move around in the file? Then you can maintain the previously mentioned index and add new entries as necessary, with the difference that instead of the index pointing to a record number, it now points to an absolute position in the file.
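To make the database option more concrete, here is a minimal DBD::SQLite sketch. The table name records, the columns id, amount and mod_date, and a tab-separated modifier format of id, amount, date are all assumptions chosen for illustration; the real schema depends on how you split the 800-character records when loading them.
use strict;
use warnings;
use DBI;

# connect to (or create) the SQLite database file
my $dbh = DBI->connect('dbi:SQLite:dbname=data.db', '', '',
                       { RaiseError => 1, AutoCommit => 0 });

# assumed schema, created beforehand when importing the data file:
# CREATE TABLE records (id TEXT PRIMARY KEY, amount TEXT, mod_date TEXT, rest TEXT);
my $upd = $dbh->prepare('UPDATE records SET amount = ?, mod_date = ? WHERE id = ?');

open my $mf, '<', 'modifier_file.txt' or die "Can't open modifier file: $!";
while (<$mf>) {
    chomp;
    my ($id, $amount, $date) = split /\t/;   # assumed modifier-file layout
    $upd->execute($amount, $date, $id);
}
close $mf;

$dbh->commit;   # one transaction for all 1800 updates keeps this fast
Because id is the primary key, each UPDATE is an index lookup rather than a scan of the whole file.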
I suggest a program written in Perl (as I am not a sed/awk guru and I don't know exactly what they are capable of).
Your "algorithm" is simple: first of all, you need to construct a hashmap which gives you the new data string to apply for each ID. This is achieved by reading the modifier file, of course.
Once this hashmap is populated, you can walk through each line of your data file, read the ID in the middle of the line, and generate the new line as you've described above.
I am not a Perl guru either, but I think the program is quite simple. If you need help writing it, ask for it :-)
With Perl you should use substr to get id_number, especially if id_number has a constant width.
my $id_number = substr($str, 500, $id_number_length);
After that, if $id_number is one of the IDs you need to modify, you should use substr to replace the remaining text.
substr($str, -300, 300, $new_text);
Perl's regular expressions are very fast, but not in this case.
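Tying those two substr calls together, a minimal sketch of such a filter could look like the following. The modifier-file name, its tab-separated layout, and the ID width of 10 are assumptions for illustration; the 500/300 offsets come from the example line layout above.
use strict;
use warnings;

# map each id_number to the replacement text for the trailing 300 characters
my %new_text;
open my $mf, '<', 'modifier_file.txt' or die "Can't open modifier file: $!";
while (<$mf>) {
    chomp;
    my ($id, $replacement) = split /\t/, $_, 2;
    $new_text{$id} = $replacement;
}
close $mf;

my $id_length = 10;   # assumed constant width of id_number

while (my $line = <>) {
    chomp $line;
    my $id_number = substr($line, 500, $id_length);       # id starts after the first 500 chars
    if (exists $new_text{$id_number}) {
        substr($line, -300, 300, $new_text{$id_number});  # overwrite the last 300 chars
    }
    print "$line\n";
}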
My suggestion is: don't use a database. A well-written Perl script will outperform a database by an order of magnitude on this sort of task. Trust me, I have a lot of practical experience with it. You will not even have finished importing the data into the database by the time the Perl script is done.
1,500,000 lines of 800 characters comes to about 1.2 GB. With a very slow disk (30 MB/s) you will read it in about 40 seconds; with a better one, 50 MB/s -> 24 s, 100 MB/s -> 12 s, and so on. But Perl hash lookup (like a DB join) speed on a 2 GHz CPU is above 5M lookups/s. It means that the CPU-bound work will take seconds and the IO-bound work will take tens of seconds. If it is really 10 GB the numbers change, but the proportion stays the same.
You have not specified whether the modification changes the data size or not (i.e. whether the modification can be done in place), so we will not assume it and will treat the script as a filter. You have not specified the format of your "modifier file" or what sort of modification to make. Assume that it is tab-separated, something like:
<id><tab><position_after_id><tab><amount><tab><data>
We will read data from stdin and write to stdout, and the script can be something like this:
use strict;
use warnings;

my $modifier_filename = 'modifier_file.txt';

open my $mf, '<', $modifier_filename or die "Can't open '$modifier_filename': $!";
my %modifications;
while (<$mf>) {
    chomp;
    my ($id, $position, $amount, $data) = split /\t/;
    $modifications{$id} = [$position, $amount, $data];
}
close $mf;

# make the matching regexp (use quotemeta to escape regexp metacharacters)
my $id_regexp = join '|', map quotemeta, keys %modifications;
$id_regexp = qr/($id_regexp)/;   # compile regexp

while (<>) {
    next unless m/$id_regexp/;
    next unless $modifications{$1};
    my ($position, $amount, $data) = @{ $modifications{$1} };
    substr $_, $+[1] + $position, $amount, $data;
}
continue { print }
On my laptop it takes about half a minute for 1.5 million rows, 1800 lookup IDs, and 1.2 GB of data. For 10 GB it should not take more than 5 minutes. Is that reasonably quick for you?
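Since the script works as a filter, a typical invocation (with modify.pl as a purely hypothetical name for the script above) could be:
perl modify.pl < data.file > data.file.new && mv data.file.new data.file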
If you start to think you are not IO-bound (for example if you use some NAS) but CPU-bound, you can sacrifice some readability and change it to this:
my $mod;
while (<>) {
    next unless m/$id_regexp/;
    $mod = $modifications{$1};
    next unless $mod;
    substr $_, $+[1] + $mod->[0], $mod->[1], $mod->[2];
}
continue { print }
You should almost certainly use a database, as MikeyB suggested.
If you don't want to use a database for some reason, then if the list of modifications will fit in memory (as it currently will at 1800 lines), the most efficient method is a hashtable populated with the modifications as suggested by yves Baumes.
If you get to the point where even the list of modifications becomes huge, you need to sort both files by their IDs and then perform a list merge -- basically:
Compare the ID at the "top" of the input file with the ID at the "top" of the modifications file
Adjust the record accordingly if they match
Write it out
Discard the "top" line from whichever file had the (alphabetically or numerically) lowest ID and read another line from that file
Goto 1.
Behind the scenes, a database will almost certainly use a list merge if you perform this alteration using a single SQL UPDATE command.
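A rough Perl sketch of that merge loop is shown below. The sorted file names, the tab-separated, ID-first record layout, and the string comparison on IDs are assumptions for illustration; the actual record adjustment is left as a commented placeholder.
use strict;
use warnings;

# both files must already be sorted by ID
open my $data, '<', 'data.sorted' or die "data.sorted: $!";
open my $mods, '<', 'mods.sorted' or die "mods.sorted: $!";

sub next_mod {
    my ($fh) = @_;
    my $line = <$fh>;
    chomp $line if defined $line;
    return $line;
}

my $mod = next_mod($mods);
while (my $record = <$data>) {
    my ($rec_id) = split /\t/, $record;

    # discard modification entries whose ID sorts before the current record
    $mod = next_mod($mods)
        while defined $mod && (split /\t/, $mod)[0] lt $rec_id;

    if (defined $mod) {
        my ($mod_id, $new_fields) = split /\t/, $mod, 2;
        if ($mod_id eq $rec_id) {
            chomp $record;
            # adjust the record however your format requires;
            # appending the new fields here is only a placeholder
            $record = "$record\t" . (defined $new_fields ? $new_fields : '') . "\n";
        }
    }
    print $record;   # every record is written out, changed or not
}
With both files sorted, each file is read exactly once, so memory use stays constant regardless of file size.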
Good deal on the sqlloader or datadump decision. That's the way to go.

Resources