Will guid generation logic ever generate 00000000-0000-0000-0000-000000000000? - random

I have run into a scenario where a customer may supply a nullable GUID for me to compare against my database. I could rewrite my call stack to make the object nullable, but I'd rather not. I'm considering doing a null check at the data ingestion point and coalescing to 00000000-0000-0000-0000-000000000000, but I'm not sure that's "safe", because I don't know whether a GUID generator could actually produce that value randomly.
(Yes, I know that I could pick any random GUID and the odds of a generator picking it are stupidly low, but I'm really more curious whether the 0-value GUID is reserved.)
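For what it's worth, RFC 4122 reserves the all-zero value as the "nil UUID", and version 4 (random) UUIDs always have fixed version and variant bits set, so a generator that follows the RFC cannot emit all zeros. Below is a minimal Python sketch of the coalescing idea; the helper name coalesce_guid is hypothetical, and it assumes the generator in use produces RFC 4122 version 4 values.

```python
import uuid

# The nil UUID from RFC 4122: all 128 bits are zero.
NIL_UUID = uuid.UUID(int=0)

def coalesce_guid(value):
    """Map a missing customer GUID to the nil UUID (hypothetical helper)."""
    if value is None:
        return NIL_UUID
    return value if isinstance(value, uuid.UUID) else uuid.UUID(str(value))

# A version 4 UUID always carries version bits 0100 and variant bits 10,
# so at least two of its bits are non-zero and it can never equal NIL_UUID.
generated = uuid.uuid4()
print(generated.version, generated == NIL_UUID)
print(coalesce_guid(None))
```

The sentinel is only safe if nothing else in the system already uses the nil value to mean something different, which is worth checking before adopting it.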

Related

What would be the most appropriate data structure given these requirements?

We are building a Search API at our company for some of our entities (events, leagues and sports), each of which has a name property, and we are having difficulty implementing the business requirements.
TL;DR: What data structure would address these business requirements better than a basic red-black tree does?
What are the business requirements?
(1) The data structure needs to be sorted, which makes the following requirements easier to implement, so insertion must not break the ordering.
(2) The data structure needs to hold information about its entities: the node key (the entity's name property) is used for searching, and the node needs to hold all the entities whose name property starts with the node key value.
(3) The data structure needs to support deletion by id. Id is also a property of every entity.
(4) It needs to support index search (up to 3 characters), so if someone searches for "aaa", every node with a key between "aaaa.." and "aaaz" should appear. (ex. query = "aaa", index = "aaa", "aaab", "aaaab", "aaaz", result should be "aaa", "aaab", "aaaab").
(5) We need to search by localized node key.
What have we done so far?
We started our first iteration with the built-in red-black tree (SortedSet in C#). Each node was a structure holding the name property of the entity and all events related to that name. With one helper method we satisfied business requirements (1), (2) and (4).
In our second iteration we had to support deletion, so we created a map (Dictionary) from entity ids to references to the entity objects stored in the SortedSet. We do that because a deletion request carries only the id and we cannot recreate the entity from the id, so we have to build this map at insertion time. (Maybe augmentation can help?) With this we covered requirement (3).
Now we need to support (5). However, with every iteration (every new business requirement we receive) the implementation gets harder, and I almost feel that we need to change our data structure in order to address the business criteria better.
What's the problem with localization?
We can create a new SortedSet and reuse the implementation, but this comes with a huge trade-off. Let me elaborate.
We have hundreds of clients, each of which has around 7-8 supported languages. Languages in our system are unique per client, so translations for one customer do not interfere with another's (if someone wants to call it Soccer rather than Football, fine, let it be). Besides that, we have base languages (global for every client), which are basically the default settings for newly created languages, so we can safely say that a very large portion of a client-specific language (let's say English) is the same as the base one. Having said all that, if we want accurate search for each client and locale, we need an index for each client and locale individually, which in turn introduces massive amounts of duplication.
What have I thought of so far?
I am not an expert in data structures myself, but I really want to get this right. Of course everything is possible with enough coding and hardware, but that's not the point.
I thought about implementing some balanced binary tree (could be AVL, red-black, 2-3-4 etc.) and augmenting it to meet the requirements better than the built-in SortedSet does. This would hopefully remove a lot of the issues and workarounds we have had so far and, as I said, address future requirements better, so that implementation is faster and more accurate. However, I am not an expert in data structures and, sadly, I am unable to map these business requirements to a data structure in the time frame I have. So, without further ado, do you have any suggestions?
My suggestion here would be for your primary data structure to be a dictionary, keyed by product id, and the value is the product data. That gives you very quick insertion, and removal by product id.
For searching, provide a separate data structure that contains the product names and associated product ids.
class IndexEntry
{
    // One entry per searchable name; several entries may share a ProductId.
    public string ProductName;
    public string ProductId; // or int, if ProductId is an integer
}
Since you allow customer-specific names, you'll have to add all those customer names to this index. Not a problem, but when you remove something by ID, you'll also have to remove the associated items from the other data structure. This will require a sequential search of the name index data structure to ensure that you get all the names associated with a particular product. That could be expensive, even if you use a tree structure.
To speed things up, you could have a "deleted" flag for those index entries, and then rebuild the structure periodically to remove the deleted items. That way, a deletion just requires a sequential scan. That's less than ideal, but if insertions and deletions are infrequent, quite acceptable.
The key, though, is to make your primary data structure that holds the product information indexed by product id. You can then build secondary indexes any way you want.
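A minimal sketch of this layout, written in Python for brevity: a primary dictionary keyed by id, plus a sorted list of (name, id) pairs as the secondary index. The class and method names (SearchIndex, add, remove, search) are hypothetical, and each entity is assumed to have a single name. Because the primary dictionary remembers each name, this variation finds the index entry by binary search on removal rather than scanning or flagging it.

```python
import bisect

class SearchIndex:
    """Primary dict keyed by id plus a sorted (name, id) secondary index."""

    def __init__(self):
        self.by_id = {}   # primary store: id -> name
        self.names = []   # secondary index: sorted list of (name, id)

    def add(self, entity_id, name):
        # Replace any existing entry so the index never holds stale names.
        if entity_id in self.by_id:
            self.remove(entity_id)
        self.by_id[entity_id] = name
        bisect.insort(self.names, (name, entity_id))

    def remove(self, entity_id):
        name = self.by_id.pop(entity_id, None)
        if name is None:
            return False
        # The stored name lets us locate the exact index entry directly.
        pos = bisect.bisect_left(self.names, (name, entity_id))
        del self.names[pos]
        return True

    def search(self, prefix):
        # (prefix,) sorts before every (name, id) whose name starts with prefix.
        pos = bisect.bisect_left(self.names, (prefix,))
        hits = []
        while pos < len(self.names) and self.names[pos][0].startswith(prefix):
            hits.append(self.names[pos])
            pos += 1
        return hits

index = SearchIndex()
index.add(1, "aaa")
index.add(2, "aaab")
index.add(3, "abc")
index.add(4, "aaaab")
print(index.search("aaa"))
index.remove(4)
print(index.search("aaa"))
```

Insertion and deletion in a Python list still shift elements, so a balanced tree or a sorted container would be the natural swap-in for large indexes; the split between primary store and secondary index stays the same.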

Change length of name fields

We are using Dynamics CRM 2016 on-premise. When you create a custom entity, you get a default "name" field which is a string with 100 characters. You can change that datatype during entity creation but we didn't do that.
We have now learned that 100 characters are not enough for our use case; we would need 120 or 150.
The solution designer allows changing the string length but when we save the changes we get a generic database error.
Question: Is there a known workaround to change the string length of the main field?
Obviously, it is possible to create a new entity and copy the data from the old to the new entity. Since we have many views, forms and references between entities, this is not really feasible.
This is not possible using any conventional solution (i.e. through the UI) due to constraints in the database. The default name field is the primary key of the table. I encourage you to remake the entity and migrate the existing data to the new entity.
If this is really not feasible, you can try to change the length of the column directly in the SQL database, but that is unsupported, so it might break the environment. If you want to try this, be sure to test it in a disposable environment first.
I have never done it so I don't know the outcome but that is something that I would try.

find object real name by guid?

I am using Windows 7, and I see 3ab22e31-8264-4b4e-9af5-a8d2d8e33e62[1] & [25] in the properties of many entries in my device list. Is there a way to find an object's name by GUID?
What do {3ab22e31-8264-4b4e-9af5-a8d2d8e33e62}[1] & [25] refer to?
thanks
Practically speaking, google it.
GUIDs are used in thousands of contexts, and often there's not even a concept of a name. For instance, GUIDs are commonly used in databases to uniquely identify each row in a table. The table may have a name, but the row doesn't. And how would you even know which database to search? A GUID is effectively just a 128-bit number, with almost no indication of where it came from (some can be traced back to your computer, but that's about it).
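As a small illustration of how little a GUID reveals, the Python sketch below parses the value from the question with the standard uuid module and reads the only metadata it carries, the version and variant fields. Version 1 UUIDs embed a timestamp and a node id, which is the traceable case; version 4 values are random apart from those fixed fields.

```python
import uuid

# The standard-library parser accepts the braced form shown in the question.
guid = uuid.UUID("{3ab22e31-8264-4b4e-9af5-a8d2d8e33e62}")

# The version and variant fields are the only self-describing parts.
print("version:", guid.version)
print("variant:", guid.variant)

# Everything else is just the remaining bits of a 128-bit integer.
print("as integer:", guid.int)
print("fits in 128 bits:", guid.int.bit_length() <= 128)
```

For this particular value the version field is 4, so the remaining bits are random and carry no name, machine or timestamp to look up.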

Sorting dynamic localized data

I've been working on a Java EE 6 project (EJB 3.1, JSF 2, JPA 2) for some time now and I cannot figure out a good way to sort dynamic localized data. By dynamic localized data I mean a user-created object with a name in one or several languages. To be more concrete, imagine a database table called X containing just an id (there are more columns, of course; this is just to simplify it), another table containing the available languages, and one table binding the two together by giving X a name for every language.
[x]
id integer PRIMARY KEY
[lang]
id integer PRIMARY KEY
[x_names]
x_id integer FOREIGN KEY REFERENCES x.id
lang_id integer FOREIGN KEY REFERENCES lang.id
name
PRIMARY KEY (x_id, lang_id)
[X.java]
@Id private Integer id;
@OneToMany(mappedBy="x", cascade=CascadeType.ALL, orphanRemoval=true)
private List<XName> names;
Now I'd like to present these objects to a user sorted by name (while preserving the MVC model), in a language chosen by that user. If the x-object lacks a name in this language, the name in a default language should be shown. X-objects will always have a name in at least one language (although there is no way to enforce this at the database level).
Sorting in Java seems more or less out of the question, since the x-entity should not know which language is currently in use and thus compareTo cannot do its job, so it feels like sorting in the database makes the most sense. My first idea looked something like this:
SELECT DISTINCT x FROM X x LEFT JOIN x.names n ORDER BY n.name
This, however, makes no sense: since no language is specified, it will sometimes sort on one language and sometimes on another.
This is the best that I've been able to come up with:
SELECT DISTINCT x FROM X x LEFT JOIN x.names n LEFT JOIN n.language l ORDER BY l.id, n.name
This could work in theory, always selecting the names from the lowest language id, which would let me work in priority order. The problem is that DISTINCT does not work here: I get one x-object for every name in the system. Could I perhaps pass the chosen language down to the model layer and use it somehow while still getting ALL x-objects, not just the ones that have a name in that language? Any ideas would be appreciated.
The reason for all this is a view where a user can organize these x-objects and easily fill in a name in any and all languages, while still seeing x-objects that don't have a name in the user's language so that they can add one.
JPA provider is EclipseLink.
I now realize that it makes more sense to sort in the controller layer and skip database sorting completely. You can implement your own Comparator and pass it to Collections.sort; the comparator can have access to the currently selected language and sort by that.
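The same idea, sketched in Python rather than Java to keep it short: the sort key is computed outside the entity, from the selected language, with a fallback to the default language and then to any available name. The function names (display_name, sort_by_language) and the dict-based entity shape are assumptions made for illustration, not part of the original project.

```python
def display_name(names, lang, default_lang):
    """Pick the name for lang, else the default language, else any name."""
    if lang in names:
        return names[lang]
    if default_lang in names:
        return names[default_lang]
    # Every object has at least one name; pick deterministically.
    return names[min(names)]

def sort_by_language(objects, lang, default_lang="en"):
    # The entities stay language-agnostic; only the key function knows lang.
    return sorted(
        objects,
        key=lambda obj: display_name(obj["names"], lang, default_lang).casefold(),
    )

objects = [
    {"id": 1, "names": {"en": "Zebra", "sv": "Apa"}},
    {"id": 2, "names": {"en": "Monkey"}},
    {"id": 3, "names": {"de": "Katze"}},
]

print([obj["id"] for obj in sort_by_language(objects, "sv")])
print([obj["id"] for obj in sort_by_language(objects, "en")])
```

Objects without a name in the selected language still appear in the result, which matches the requirement that users can see them and add the missing translation. A locale-aware collation would be needed for correct ordering of accented characters; casefold here is only a simplification.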

Random ID generation on Sign Up - Database Performance

I am making a site where each account will have an ID.
But I don't want the IDs to be incremental, meaning:
id=1
id=2
...
id=1000
What I want is to have random IDs:
id=2355
id=5647734
id=23532
...
(The reason is to prevent robots from checking all account profiles by just incrementing an ID in the URL, and maybe other reasons, but that is not the question.)
But, I am worried about performance on registration.
It will be something like this:
RANDOM_ID = generate new RANDOM_ID
while (RANDOM_ID is taken): RANDOM_ID = generate new RANDOM_ID
Each time I generate an ID for a new account, I will query the database (MySQL) to check whether that ID already exists.
Is there any better solution for this?
Are there any disadvantages to using random IDs?
Thanks in advance.
There are many, many reasons not to do this:
Your solution, as written, is not transactionally-safe; two transactions at the same time could both generate the same "random" ID.
If you serialize the transaction in order to make it safe, you will slaughter performance because the query will keep every single collision row locked until it finds a spare ID.
Using a random ID as the primary key will fragment the hell out of your clustered index. This is bad enough with uuids - the whole point of an auto-generated identity column is so you can generate a safe sequence out of it.
Why not use a regular primary key, but just don't use that in any of your URLs? Generate a secondary non-sequential ID along with it - such as a uuid - index it, and use this column in any public-facing segments of your application instead of the primary key if you are really worried about security.
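A minimal sketch of that arrangement, using Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The table and column names (accounts, public_id) and the helper functions are hypothetical. The auto-incremented id stays internal, while the indexed UUID column is what would appear in URLs.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # internal sequential key
    " public_id TEXT NOT NULL UNIQUE,"        # indexed, exposed in URLs
    " username TEXT NOT NULL)"
)

def create_account(username):
    # No existence check loop: the UNIQUE constraint guards the rare collision.
    public_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO accounts (public_id, username) VALUES (?, ?)",
        (public_id, username),
    )
    conn.commit()
    return public_id

def find_account(public_id):
    # Look up by the public identifier only; the numeric id never leaves the server.
    return conn.execute(
        "SELECT id, username FROM accounts WHERE public_id = ?",
        (public_id,),
    ).fetchone()

alice = create_account("alice")
bob = create_account("bob")
print(find_account(alice))
print(find_account(bob))
```

If an insert ever did violate the UNIQUE constraint, the database would reject it and the application could simply retry with a fresh value, which keeps the check inside a single atomic statement.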
You can use UUIDs. A UUID is a unique identifier; some versions are generated partly from a timestamp, others from random data. For practical purposes it is guaranteed to be unique, so you don't have to run a query to check.
I do not know what language you're using, but most languages have a library or sample code for this.
Yes, you can use a UUID, but keep your auto_increment field. Just add a new field and set it to something like md5(microtime(true).rand()), or whatever other method you like, and use that unique key throughout the site to build links instead of exposing the primary key in URLs.
