Is it possible to create a custom sorting key at the column level in a MediaWiki table? - sorting

Using https://www.mediawiki.org/wiki/Help:Sorting and other similar guides, I see that it's possible to assign a custom sorting key to specific rows using data-sort-value. Is it possible to create a key at the column level that will sort all data in that column by that key?
The end goal is to take a column of alphanumeric data and sort it in a way that makes sense in context, as opposed to the default alphabetic or numeric options available with data-sort-type. Since many tables on the site will use the same sorting key for this particular column, it makes more sense to apply it at the column level than to each row of every table.
Alternatively, if MediaWiki doesn't have the above sorting feature, is it possible to create a custom data-sort-type in MediaWiki? If so, where would I start?

Related

DynamoDb delete with sort key

I have the following fields in a DynamoDB table:
event_on -- string type
user_id -- number type
event name -- string type
Since this table may have multiple records per user_id, and event_on is the only field that can be unique, I made event_on the primary key and user_id the sort key.
Now I want to delete all records of a user, so my code is:
response = dynamodb.delete_item(
    TableName=events,
    Key={
        "user_id": {"N": str(userId)}
    }
)
It throws an error:
Exception occured An error occurred (ValidationException) when calling
the DeleteItem operation: The provided key element does not match the
schema
Also, is there any way to delete by range?
Can someone suggest what I should change in the DynamoDB table structure to make this code work?
Thanks,
It sounds like you've modeled your data using a composite primary key, which means you have both a partition key and a sort key. Here's an example of what that looks like with some sample data.
In DynamoDB, the most efficient way to access items (aka "rows" in RDBMS language) is by specifying either the full primary key (getItem) or the partition key (query). If you want to search by any other attribute, you'll need to use the scan operation. Be very careful with scan, since it can be a costly way (both in performance and money) to access your data.
When it comes to deletion, you have a few options.
deleteItem - Deletes a single item in a table by primary key.
batchWriteItem - Puts or deletes multiple items in one or more tables. A single call to BatchWriteItem can write up to 16 MB of data, which can comprise as many as 25 put or delete requests.
TimeToLive - You can use DynamoDB's Time To Live (TTL) feature to delete items you no longer need. Keep in mind that TTL only marks your items for deletion, and actual deletion could take up to 48 hours.
In order to use any of these options effectively, you'll first need to identify which items you want to delete. Because you want to fetch using the value of the sort key alone, you have two options:
1. Use scan to find the items of interest. This is not ideal but is an option if you cannot change your data model.
2. Create a global secondary index (GSI) that swaps your partition key and sort key values. This pattern is called an inverted index. This would allow you to identify all items with a given user_id.
If you choose option 2, your data would look like this
This would allow you to fetch all items for a given user, which you could then delete using one of the methods I outlined above.
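A minimal sketch of option 2 in Python, under stated assumptions: `client` is expected to behave like a boto3 DynamoDB client (for example `boto3.client("dynamodb")`), the table uses event_on as partition key and user_id as sort key as in the question, and the GSI name passed as `index_name` is hypothetical. The function name `delete_events_for_user` is also made up for illustration. It queries the inverted index page by page, then deletes the matching items by their full primary key in batches of at most 25.

```python
def delete_events_for_user(client, table_name, index_name, user_id):
    """Delete every item for one user; returns the number of items deleted.

    Assumes a GSI (index_name, hypothetical) whose partition key is user_id.
    """
    query_args = {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "user_id = :uid",
        "ExpressionAttributeValues": {":uid": {"N": str(user_id)}},
        # Only the table's key attributes are needed to build delete requests.
        "ProjectionExpression": "event_on, user_id",
    }
    deleted = 0
    while True:
        page = client.query(**query_args)
        keys = [
            {"event_on": item["event_on"], "user_id": item["user_id"]}
            for item in page.get("Items", [])
        ]
        # BatchWriteItem accepts at most 25 put/delete requests per call.
        for start in range(0, len(keys), 25):
            chunk = keys[start:start + 25]
            pending = {table_name: [{"DeleteRequest": {"Key": key}} for key in chunk]}
            # Resubmit anything DynamoDB reports as unprocessed.
            # A production version would add a backoff between retries.
            while pending:
                response = client.batch_write_item(RequestItems=pending)
                pending = response.get("UnprocessedItems") or {}
            deleted += len(chunk)
        # Follow pagination until the query reports no further pages.
        if "LastEvaluatedKey" not in page:
            return deleted
        query_args["ExclusiveStartKey"] = page["LastEvaluatedKey"]
```

Passing the client in as a parameter keeps the sketch independent of how credentials and regions are configured, and makes it easy to exercise with a stub.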
As you can see here, delete_item needs the full primary key, not just the sort key. You would have to do a full scan and delete every item that contains the given sort key.
If you created a DynamoDB table with both a primary key and a sort key, you must provide both values to remove an item from that table.
If no sort key was added to the primary key when the table was created, a record can be removed by the primary key alone.
How I solved it.
I chose not to add a sort key when creating the table, and I use indexes for sorting and retrieving items.

Is it possible to create the compound table using p:dataTable?

This is a rather conceptual question. I have a compound table to implement and I am not sure whether I should use <p:dataTable> for it.
The structure holds values on a weekly basis, and cumulative values that depend on them have to be calculated and shown as part of the same table. Is this possible using <p:dataTable>, or will I have to build the structure using panel grids with rows and columns? Any suggestions?
Structure: (the value are arbitrary for now)
please see here

Fastest way to find records that end with key

I'm looking for an optimal way to search through millions of records for serial numbers, stored in a varchar column, that end with a specified string key.
I was using EndsWith; however, performance is rather poor when several queries are sent.
Is there a better way to do it?
EDIT:
Since the search key is of variable length, I can't create a column that holds a cut-off value of the serial number. However, I've done some tests using Substring and Equals instead of EndsWith, and I've reduced execution time to 40% of that of EndsWith.
I'm still looking for better solution though :)
Unfortunately, searching for strings ending with a particular pattern is difficult on most databases+, because searching for string suffixes cannot use an index. This results in full table scans, which may be slow on tables with millions of rows.
If your database supports reverse indexes, add one for your string key column; otherwise, you can improve performance by simulating reverse indexes:
Add a column for storing your string key in reverse
If your RDBMS supports computed columns, add one for the reversed key
Otherwise, define a trigger that populates the reversed column from the key column
Create an index on the reversed column
Use the reversed column for your searches by passing in the reversed suffix that you are looking for.
For example, if you have data like this
key
-----------
01-02-3-xyz
07-12-8-abc
then the augmented table would have
key rev_key
----------- -----------
01-02-3-xyz zyx-3-20-10
07-12-8-abc cba-8-21-70
and your search for ENDS_WITH(key, '3-xyz') would ask for STARTS_WITH(rev_key, 'zyx-3'). Since string indexes speed up lookups by prefix, the "starts with" lookup would go much faster.
+ One notable exception is Oracle, which provides reverse key indexes specifically for situations like this.
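The reversed-column approach above can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions, not the asker's setup: the table name `serials` and the helper names are hypothetical, the application fills rev_key itself instead of using a computed column or trigger, and whether the optimizer actually uses the index for a given LIKE depends on the database engine and collation.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a reversed copy of the key and an index on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE serials (key TEXT PRIMARY KEY, rev_key TEXT NOT NULL)")
    conn.execute("CREATE INDEX idx_serials_rev_key ON serials (rev_key)")
    return conn


def insert_serial(conn, key):
    """Store the key together with its reversed form."""
    conn.execute(
        "INSERT INTO serials (key, rev_key) VALUES (?, ?)", (key, key[::-1])
    )


def find_by_suffix(conn, suffix):
    """Find keys ending with suffix by searching rev_key for the reversed prefix."""
    prefix = suffix[::-1]
    # Escape LIKE wildcards so the suffix is matched literally.
    escaped = (
        prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    # Note: SQLite's LIKE is case-insensitive for ASCII letters by default.
    rows = conn.execute(
        "SELECT key FROM serials WHERE rev_key LIKE ? ESCAPE '\\' ORDER BY key",
        (escaped + "%",),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    for key in ("01-02-3-xyz", "07-12-8-abc"):
        insert_serial(conn, key)
    print(find_by_suffix(conn, "3-xyz"))
```

With the two sample rows from the answer, searching for the suffix 3-xyz turns into a prefix search for zyx-3 on rev_key.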

Sort by key in Cassandra

Let's assume I have a keyspace with a column family that stores user objects and the key of these objects is the username.
How can I use Hector to get a list of users sorted by username?
I tried to use a RangeSlicesQuery, paging works fine with this query, but the results are not sorted in any way.
I'm an absolute Cassandra beginner, can anyone point me to a simple example that shows how to sort a column family by key? Please ask if you need more details on my efforts.
Edit:
The result was not sorted because I used the default RandomPartitioner instead of the OrderPreservingPartitioner in cassandra.yaml.
Probably it's better not to rely on the sorting by key but to use a secondary index.
Quoting Cassandra - The Definitive Guide
Column names are stored in sorted order according to the value of compare_with. Rows,
on the other hand, are stored in an order defined by the partitioner (for example,
with RandomPartitioner, they are in random order, etc.)
I guess you are using RandomPartitioner which
... return data in an essentially random order.
You should probably use OrderPreservingPartitioner (OPP) where
Rows are therefore stored
by key order, aligning the physical structure of the data with your sort order.
Be aware of the inefficiency of OPP.
(edit on Mar 07, 2014)
Important:
This answer is very old now.
It is a system-wide setting. You can set it in cassandra.yaml. See this doc. Again, OPP is highly discouraged. This document is for version 1.1, and you can see it is deprecated there. It has likely been removed from the latest version. If you do want to use OPP, you may want to revisit the architecture.
Or create a row called "meta:userNames" in the same column family and store all user names in it as a lookup hash. Something like this:
Users {
key: "meta:userNames" {david:david, paolo:paolo, victor:victor},
key: "paolo" {password:"*****", locale:"it_it"},
key: "david" {password:"*****", locale:"en_us"},
key: "victor" {password:"*****", locale:"en_uk"}
}
First query the columns of the meta:userNames row (which are sorted) and use them to get the user rows. Don't try to get everything via a single DB query as in SQL-driven databases. Use Cassandra as a huge hash map that provides rapid random access to its data.
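The two-step lookup can be illustrated with plain Python dicts standing in for the column family. This is only a model of the access pattern and uses no Cassandra or Hector client: `sorted()` imitates the comparator ordering of column names, and the function name, `start_after` and `limit` parameters are hypothetical.

```python
META_ROW_KEY = "meta:userNames"

# In-memory stand-in for the Users column family shown above.
users = {
    META_ROW_KEY: {"david": "david", "paolo": "paolo", "victor": "victor"},
    "paolo": {"password": "*****", "locale": "it_it"},
    "david": {"password": "*****", "locale": "en_us"},
    "victor": {"password": "*****", "locale": "en_uk"},
}


def list_users_sorted(column_family, start_after=None, limit=10):
    """Return one page of (username, row) pairs ordered by username."""
    # Step 1: read the column names of the meta row. Cassandra would return
    # them already sorted; sorted() imitates that here.
    names = sorted(column_family[META_ROW_KEY])
    if start_after is not None:
        names = [name for name in names if name > start_after]
    # Step 2: fetch each user row by key, like a hash-map lookup.
    return [(name, column_family[name]) for name in names[:limit]]
```

Paging works by passing the last username of the previous page as `start_after`.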

Enumerate indexes on a Extensible Storage Engine (ESENT) table

Background
I'm writing an adapter for ESE to .NET and LINQ in a Google Code project called eselinq. One important function I can't seem to figure out is how to get a list of indexes defined for a table. I need to be able to list available indexes so the LINQ part can automatically determine when indexes can be used. This will allow much more efficient plans for user queries if appropriate indexes can be found.
There are two related functions for querying index information:
JetGetTableIndexInfo - get index information by tableID
JetGetIndexInfo - get index information by tableName
These only differ in how the related table is specified (name or tableid). It sounds like these would support the function I want but all the info levels seem to require that I already have a certain index to query information for. The only exception is JET_IdxInfoCount, but that only counts how many indexes are present.
JET_IdxInfo with its JET_INDEXLIST sounds plausible but it only lists the columns on a specific index.
Alternatives
I am aware that I could get the index information another way, like annotations on .NET types corresponding to database tables, or by requiring that an index mapping be provided ahead of time. I think there's enough introspection implemented to make everything else work out of the box without the user supplying extra information, except for this one function.
Another option may be to examine the system tables to find related index objects, but this would mean depending on an undocumented interface.
To satisfy this question, I want a supported method of enumerating the indexes (just the name would be sufficient) on a table.
You are correct about JetGetTableIndexInfo, JetGetIndexInfo and JET_IdxInfo. The twist is that the data is returned in a somewhat complex form: a temporary table is returned containing a row for the index and then a row for each column in the table. To get just the index names you will need to skip the column rows (the column count is given by the value of the columnidcColumn column in the first row).
For a .NET example of how to decipher this, look at the ManagedEsent project. In the MetaDataHelpers.cs file there is a method called GetIndexInfoFromIndexlist that extracts all the data from the temporary table.
