Having a string array field as an index, good or bad? - dynamics-ax-2009

My gut feeling is that setting a string field (with array elements) as an index on a table will be bad for performance. The bulk of the operations on this table are inserts and updates; it holds transactional data, and its current size is approximately 20 million records.
The string extends a type with 4 array elements, not all of which are always populated. I need to justify why this field should not be set as one of the indexes. I've tried searching for answers: reading Kimberly Tripp's blog, going through best practices regarding indexes on MSDN (which only mentions that indexes work best on numeric fields first, then string fields), and so on. But none of these mention indexing a table on a field of an array type. What reasons can I give to justify not indexing on the string-array field? And if my gut feeling is totally wrong and indexes work well on array fields, why so?

A Memo or Container field cannot be part of an index in AX.
Furthermore, columns consisting of the ntext, text, or image data types cannot be specified as columns for an index in SQL Server.

Let's say you have an extended data type ArrElement with 3 additional array elements: ArrElement2, ArrElement3, and ArrElement4. Creating an index with a field of the ArrElement type in AX will effectively create an index with 4 fields (ArrElement, ArrElement2, ArrElement3, and ArrElement4, in that order) in SQL Server. You cannot change the order of the array elements in the index, but in my opinion there is nothing wrong with having such an index if it really serves your purpose. Hope that answers your question.

As 10p noted, adding, say, Dimension as the only field will create an index over all of the array elements: Dimension, Dimension2_, Dimension3_ (which are the names of the SQL table fields).
The value of such an index depends on the queries performed. If only Dimension[3] is queried, the index is of no use, because the leading index fields Dimension[1] and Dimension[2] are not known.
This could be solved by creating an index for each of the array elements, for example:
Dim1Idx: Dimension[1] (maybe append more fields)
Dim2Idx: Dimension[2] (maybe append more fields)
Dim3Idx: Dimension[3] (maybe append more fields)
Individual array elements can be selected by using the combo-box on the index field.
The value of such indexes should be weighed against the added cost of insertion (and of updates, if the array values change).

Related

Get raw Low Cardinality values in Clickhouse

Is there a way to retrieve the underlying values of LowCardinality types in ClickHouse? I would also need to retrieve a mapping (in a separate query) of the underlying values to the logical values. I've tried using lowCardinalityIndices and lowCardinalityKeys, but it appears that the indices -> keys mapping returned by those functions is many-to-many.
Thank you!
The question rests on a misconception: a column with LowCardinality does not have a single dictionary. Each data part has its own dictionaries for a given LowCardinality column, which is why you observe this lowCardinalityIndices/lowCardinalityKeys behaviour: the indices are only meaningful within a single part, so across the whole table the mapping looks many-to-many.

Elasticsearch query on string representation of number

Good day:
I have an indexed field called amount, which is of string type. The value of amount can be either one or 1. Say we have amount=1 in an indexed document; if I search for one, Elasticsearch will not return the document unless I use 1 in the search query. Thoughts on how I can get this to work? I'm thinking a tokenizer is what's needed.
Thanks.
You probably don't want this for sevenmillionfourhundredfifteenthousandtwohundredfourteen and the like, but only for a small number of values.
At index time I would convert everything to a proper number and store it in a numeric field, which even allows sorting, if you need it. Apart from this I would use synonyms at index and at query time and map everything to the digit strings, but in a general text field that is searched by default.
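For illustration, a minimal index-definition sketch along those lines, assuming Elasticsearch 7.x with the official Node.js client; the index, field, analyzer, and filter names are made up, and the synonym list would need to cover whatever number words you actually expect:

const { Client } = require('@elastic/elasticsearch');
const client = new Client({ node: 'http://localhost:9200' });

// A synonym filter maps number words to digit strings at index and
// query time; a separate numeric field keeps sorting cheap.
client.indices.create({
  index: 'amounts',
  body: {
    settings: {
      analysis: {
        filter: {
          number_words: { type: 'synonym', synonyms: ['one, 1', 'two, 2', 'three, 3'] }
        },
        analyzer: {
          amount_analyzer: { tokenizer: 'standard', filter: ['lowercase', 'number_words'] }
        }
      }
    },
    mappings: {
      properties: {
        amount_text: { type: 'text', analyzer: 'amount_analyzer' }, // searched by default
        amount_num: { type: 'long' }                                // for sorting
      }
    }
  }
}).then(() => console.log('index created'));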

Categorising documents in elasticsearch

I've got a bunch of ES documents that I'd like to put into "collections". Each document has a unique integer as an ID. Each collection also needs to have a unique integer as an ID.
I need to be able to run queries to get a list of docs in a collection, and easily add an existing doc to a collection.
What would be the most efficient and logical way of approaching this:
An index of collections, where each collection has an array of document IDs, or
For each document have an array of integers (or a single integer) indicating to which collections it belongs?
Thank you.
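For concreteness, a minimal sketch of the second option under assumed names (a 'docs' index whose documents carry an integer array field 'collections'; Elasticsearch 7.x Node.js client), since it keeps both operations cheap:

const { Client } = require('@elastic/elasticsearch');
const client = new Client({ node: 'http://localhost:9200' });

// List all documents in collection 42: a term query on the integer array.
client.search({
  index: 'docs',
  body: { query: { term: { collections: 42 } } }
}).then(res => console.log(res.body.hits.hits));

// Add existing document 7 to collection 42 with a scripted update.
client.update({
  index: 'docs',
  id: '7',
  body: {
    script: {
      source: 'if (!ctx._source.collections.contains(params.c)) { ctx._source.collections.add(params.c); }',
      params: { c: 42 }
    }
  }
}).then(() => console.log('document added to collection'));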

Sort by a different index's values

Given two indexes, I'm trying to sort the first based on values of the second.
For example, Index 1 ('Products') has fields id, name. Index 2 ('Prices') has fields id, price.
Struggling to figure out how to sort 'Products' by 'Prices'.price, assuming the ids match. The reason for this question is that, hypothetically, the 'Products' index becomes very large (with duplicate ids), and updating all documents becomes expensive.
Elasticsearch is a document store rather than a column store. What you're looking for is a way to JOIN the two indices, but this is not supported in Elasticsearch. The 'Elasticsearch way' of storing these documents is to have one index that contains all the relevant data. If you're worried about update procedures taking very long, look into creating an index with an alias: when you need to do a major update, do it in a new index, and only when you're done, switch the alias target to the new index. This allows you to update your data seamlessly.
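As a minimal sketch of that alias switch (the index and alias names are made up; assuming the Elasticsearch 7.x Node.js client):

const { Client } = require('@elastic/elasticsearch');
const client = new Client({ node: 'http://localhost:9200' });

// After rebuilding into products_v2, atomically repoint the 'products'
// alias so searches never see a half-updated index.
client.indices.updateAliases({
  body: {
    actions: [
      { remove: { index: 'products_v1', alias: 'products' } },
      { add: { index: 'products_v2', alias: 'products' } }
    ]
  }
}).then(() => console.log('alias now points at products_v2'));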

MongoDB: Two-field index vs document field index

I need to index a collection by two fields (a unique index), say field1 and field2. Which approach is better in terms of performance:
Create a regular two-column index
-or -
Combine those two fields into a single document field {field1 : value1, field2 : value2} and index that field?
Note: I will always be querying by those two fields together.
You can keep the fields separate and create a single compound index, which will speed up queries on both fields together.
db.things.ensureIndex({field1:1, field2:1});
http://www.mongodb.org/display/DOCS/Indexes#Indexes-CompoundKeysIndexes
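Since the question asks for a unique index, note that the same compound index can be made unique through the options document; a shell sketch using the question's field names:

// Enforce uniqueness of the (field1, field2) pair and speed up the
// combined query at the same time.
db.things.ensureIndex({field1: 1, field2: 1}, {unique: true});
db.things.find({field1: "a", field2: "b"}); // served by the index above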
Putting the two fields inside a single embedded document provides no performance increase, because you must index them the same way:
db.things.ensureIndex({"fields.field1": 1, "fields.field2": 1});
http://www.mongodb.org/display/DOCS/Indexes#Indexes-EmbeddedKeys
Or you can index the entire embedded document:
db.things.ensureIndex({fields: 1});
http://www.mongodb.org/display/DOCS/Indexes#Indexes-DocumentsasKeys
There could be a possible performance increase, but probably not much of one. Use a test database: create test data and benchmark some queries to figure it out. We would love to hear your results.
I'd create a compound index over both fields. This will take up less disk space, because you won't need to store the extra combined field, and it gives you the bonus of an additional index over the first field: an index over { a:1, b:1 } is also an index over { a:1 }.
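To illustrate that prefix property, a small shell sketch with hypothetical names:

db.things.ensureIndex({a: 1, b: 1});
db.things.find({a: 5, b: 7}); // uses the compound index
db.things.find({a: 5});       // also uses it: 'a' is a prefix of the index
db.things.find({b: 7});       // cannot seek on it: 'b' is not a leading field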
