Building a multi-record flat file - linq

I'm struggling to find a proper solution for generating a flat file.
Here are some criteria I need to take care of:
The file has a header with a summary of the records that follow.
There can be multiple Collection Header Records, each with multiple Batch Header Records, which in turn contain multiple records of different types.
All records within a batch have a checksum, which has to be added to the batch checksum. That one has to be added to the Collection Header checksum, and that again to the file checksum. Also, each entry in the file has a counter value.
So my plan was to create a class for each record. But what now? I have the records and the "summary records"; the next step would be to bring them all in order, compute the sums, and then set the counters.
How should I proceed from here? Should I put everything in one big SortedList? If so, how do I know where to add the latest record (it has to be added to its corresponding batch summary)?
My first idea was to do something like this:
SortedList<HeaderSummary, SortedList<BatchSummary, SortedList<string, object>>>();
But it is hard to navigate through the HeaderSummaries and BatchSummaries to add an object to the inner SortedList, bearing in mind that I may have to create and add a HeaderSummary / BatchSummary first.
Having several separate ArrayLists (one for headers, one for batches, and one for the rest) gives me problems when combining them into a flat file, because I have to keep the order and still set the counters.
Do you have any clever solution for such a flat file?

Consider using classes to represent levels of your tree structure.
interface iBatch {
    int checksum { get; set; }
    void WriteRecord();
}
class BatchSummary {
    int batchChecksum;
    List<iBatch> records = new List<iBatch>();
    public void WriteBatch() {
        WriteBatchHeader(); // header writer not shown
        foreach (var record in records)
            record.WriteRecord();
    }
    public void Add(iBatch rec) {
        records.Add(rec);
    }
}
class CollectionSummary {
    int collectionChecksum;
    List<BatchSummary> batches = new List<BatchSummary>();
    public void WriteCollection() {
        WriteCollectionHeader(); // header writer not shown
        foreach (var batch in batches)
            batch.WriteBatch();
    }
    public void Add(int whichBatch, iBatch rec) {
        batches[whichBatch].Add(rec); // or however you find the appropriate batch
    }
}
class FileSummary {
    // ... file summary info
    int fileChecksum;
    List<CollectionSummary> collections = new List<CollectionSummary>();
    public void WriteFile() {
        WriteFileHeader(); // header writer not shown
        foreach (var collection in collections)
            collection.WriteCollection();
    }
    public void Add(int whichCollection, int whichBatch, iBatch rec) {
        collections[whichCollection].Add(whichBatch, rec); // or however you find the appropriate collection
    }
}
Of course, you could use a common Summary class to be more DRY, if not necessarily more clear.
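With a tree like this, the checksums and counters from the question can be filled in with two passes just before writing. Below is a minimal sketch in Java (used here only for illustration; the same shape carries over to C#). The names Node, rollUp, and assignCounters are hypothetical, and the sketch assumes that a checksum is a plain sum and that the counter is a 1-based running entry number, which the real file format may define differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node: a leaf is a data record; an inner node is a file,
// collection, or batch header that summarizes its children.
class Node {
    final String name;
    final List<Node> children = new ArrayList<>();
    long checksum; // leaf: its own value; header: sum of its children
    int counter;   // position of this entry in the written file

    Node(String name, long leafChecksum) {
        this.name = name;
        this.checksum = leafChecksum;
    }

    Node add(Node child) {
        children.add(child);
        return this;
    }

    // Post-order pass: each header's checksum becomes the sum of its children.
    long rollUp() {
        if (!children.isEmpty()) {
            checksum = 0;
            for (Node c : children) {
                checksum += c.rollUp();
            }
        }
        return checksum;
    }

    // Pre-order pass: number the entries in the order they will be written.
    // Returns the next free counter value.
    int assignCounters(int next) {
        counter = next++;
        for (Node c : children) {
            next = c.assignCounters(next);
        }
        return next;
    }
}
```

Calling rollUp() and then assignCounters(1) on the file node right before writing means records can be added in any order beforehand, since nothing is summed or numbered until the tree is complete.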

Related

Best Key to use when storing GameObjects in Hashtable? - Unity, C#

I'm working towards writing a script to take a "snapshot" of the initial attributes of all children of a GameObject. Namely at startup I want to save the position, orientation & color of all these objects in a Hashtable. The user has the ability to move & modify these objects during runtime, and I want to update the Hashtable to keep track of this. This will allow me to create an Undo last action button.
I found that gameObject.name isn't a good Key for my Hashtable entries because sometimes multiple game objects have the same name (like "cube"). So what would make a better Key? It's clear that Unity differentiate between two identical game objects with the same name, but how? I don't want to have to manually Tag every game object. I want to eventually bring in a large CAD file with hundreds of parts, and automatically record them all in a Hashtable.
For example, the code below works fine, unless I have multiple game objects with the same name. Then I get this error ArgumentException: Item has already been added. Key in dictionary: 'Cube' Key being added: 'Cube'
public class GetAllObjects : MonoBehaviour
{
public Hashtable allObjectsHT = new();
void Start()
{
Debug.Log("--Environment: GetAllObjects.cs <<<<<<<<<<");
foreach (Transform child in transform)
{
allObjectsHT.Add(child.gameObject.name, child);
}
}
}
Thanks, Chuck, this is what I wanted, and it solved my problem:
public class GetAllObjects : MonoBehaviour
{
UnityEngine.Vector3 startPosition;
UnityEngine.Quaternion startRotation;
public Hashtable allObjectsHT = new();
void Start()
{
Debug.Log("--Environment: GetAllObjects.cs <<<<<<<<<<");
foreach (Transform child in transform)
{
startPosition = child.position;
startRotation = child.rotation;
Hashtable objHT = new();
objHT.Add("position", startPosition);
objHT.Add("rotation", startRotation);
allObjectsHT.Add(child, objHT);
}
}
}
It's good to use meaningful keys you can refer to, otherwise you'd just use a collection without keys like a List. You could use an editor script to name all of the objects you import and use the names as keys. e.g.
int i = 0;
foreach(GameObject g in Selection.gameObjects)
{
g.name = "Object_" + i.ToString();
i++;
}
You could make the naming more sophisticated and meaningful of course, this is just an example.

How to retrieve data by property in Couchbase Lite?

My documents have a docType property that separates them by purpose, in this case template or audit. However, when I do the following:
document.getProperty("docType").equals("template");
document.getProperty("docType").equals("audit");
The results of them are always the same, it returns every time all documents stored without filtering them by the docType.
Below, you can check the query function.
public static Query getData(Database database, final String type) {
View view = database.getView("data");
if (view.getMap() == null) {
view.setMap(new Mapper() {
@Override
public void map(Map<String, Object> document, Emitter emitter) {
if(String.valueOf(document.get("docType")).equals(type)){
emitter.emit(document.get("_id"), null);
}
}
}, "4");
}
return view.createQuery();
}
Any hint?
This is not a valid way to do it. Your view function must be pure (it cannot reference external state such as "type"). Once that is created you can then query it for what you want by setting start and end keys, or just a set of keys in general to filter on.
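As a rough illustration of that advice, here is a plain-Java model of the idea rather than real Couchbase Lite code: the map step looks only at the document and emits docType as the key, and the wanted type is supplied later as a query key. DocTypeView, map, and queryByKey are hypothetical names. In Couchbase Lite 1.x the corresponding pieces would, I believe, be emitter.emit(document.get("docType"), null) inside the Mapper and setKeys(...) (or setStartKey/setEndKey) on the Query returned by createQuery().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of a view index; this is NOT the Couchbase Lite API.
class DocTypeView {
    // Index built by the map step: emitted key -> ids of the documents that emitted it.
    private final TreeMap<String, List<String>> index = new TreeMap<>();

    // Pure map step: depends only on the document, never on what a caller wants.
    void map(Map<String, Object> document) {
        Object docType = document.get("docType");
        if (docType != null) {
            index.computeIfAbsent(docType.toString(), k -> new ArrayList<>())
                 .add(String.valueOf(document.get("_id")));
        }
    }

    // Query step: the filter value is supplied here, as a key into the index.
    List<String> queryByKey(String key) {
        return index.getOrDefault(key, Collections.emptyList());
    }
}
```

Because the index is built once and reused, anything the map function captures from outside (such as type in the question) is fixed at the time the view is first indexed, which is why filtering has to happen through keys at query time.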

Parallel Access to Elements in a Groovy List

This is a simple efficiency question about the Groovy language. I have a Customer object that contains an id, and I would like to copy those IDs into another list. In my view each copy is independent of the others, so the work could be parallelized.
e.g. linear execution
public List<Long> extractIds(List<Customer> customerList) {
List<Long> customerIds = new ArrayList<Long>();
customerList.each { it -> customerIds.add(it.id) }
return customerIds
}
Question: What is the most efficient way to transfer the IDs in the above example when holding a large volume of customers?
The simplest method would be:
public List<Long> extractIds(List<Customer> customerList) {
customerList.id
}
Or, if you want to do it in a multi-threaded fashion, you can use gpars:
import static groovyx.gpars.GParsPool.withPool
public List<Long> extractIds(List<Customer> customerList) {
withPool {
customerList.collectParallel { it.id }
}
}
But you may find the first brute-force method is quicker for this simple example (rather than spinning up a thread pool, and synchronizing the collection of results from different threads)
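For comparison, the same trade-off can be tried on the JVM without gpars by using Java 8 streams. This is only a sketch in plain Java; the Customer class with a getId() accessor and the CustomerIds holder class are assumptions, not taken from the question.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the Customer class from the question.
class Customer {
    private final long id;

    Customer(long id) {
        this.id = id;
    }

    long getId() {
        return id;
    }
}

class CustomerIds {
    // Sequential version: usually sufficient when the work per element is a plain getter.
    static List<Long> extractIds(List<Customer> customers) {
        return customers.stream()
                .map(Customer::getId)
                .collect(Collectors.toList());
    }

    // Parallel version: same result in the same order, at the cost of fork/join overhead.
    static List<Long> extractIdsParallel(List<Customer> customers) {
        return customers.parallelStream()
                .map(Customer::getId)
                .collect(Collectors.toList());
    }
}
```

Both methods return the IDs in list order, so they can be swapped and timed against each other on realistic data before committing to the parallel one.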

Applications of linked lists

What are some good examples of an application of a linked list? I know that it's a good idea to implement queues and stacks as linked lists, but is there a practical and direct example of a linked list solving a problem that specifically takes advantage of fast insert time? Not just other data structures based on linked lists.
Hoping for answers similar to this question about priority queues: Priority Queue applications
I have found one myself: an LRU (least recently used) cache implemented with a hash table and a linked list.
There's also the example of the Exception class having an InnerException.
What else is there?
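For reference, the LRU cache mentioned above can be sketched in a few lines of Java, because java.util.LinkedHashMap already combines a hash table with a doubly linked list of its entries. The class name LruCache and the fixed capacity are assumptions made for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache on top of LinkedHashMap: the hash table gives fast lookup,
// the internal linked list keeps entries ordered by most recent access.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // true = order entries by access, not by insertion
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true evicts the least recently used entry.
        return size() > capacity;
    }
}
```

Moving an entry to the "most recent" end on every access is cheap only because unlinking and relinking a list node takes constant time, which is the linked-list property the question asks about.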
I work as a developer at a "large stock market" in the US. Part of what makes us operate at very fast speed is we don't do any heap allocation/de-allocation after initialization (before the start of the day on the market). This technique isn't unique to exchanges, it's also common in most real time systems.
First of all, for us, Linked lists are preferred to array based lists because they do not require heap allocation when the list grows or shrinks. We use linked lists in multiple applications on the exchange.
One application is to pre-allocate all objects into pools (which are linked lists) during initialization; so whenever we need a new object we can just remove the head of the list.
Another application is in order processing; every Order object implements a linked list entry interface (has a previous and next reference), so when we receive an order from a customer, we can remove an Order object from the pool and put it into a "to process" list. Since every Order object implements a Linked List entry, adding at any point in the list is as easy as populating a previous and next references.
Example off the top of my head:
interface IMultiListEntry {
    IMultiListEntry getPrev(MultiList list);
    void setPrev(MultiList list, IMultiListEntry entry);
    IMultiListEntry getNext(MultiList list);
    void setNext(MultiList list, IMultiListEntry entry);
}
class MultiListEntry implements IMultiListEntry {
    // One prev/next slot per list, so a single entry can belong to several lists at once.
    private final IMultiListEntry[] prev = new IMultiListEntry[MultiList.MAX_LISTS];
    private final IMultiListEntry[] next = new IMultiListEntry[MultiList.MAX_LISTS];
    public IMultiListEntry getPrev(MultiList list) {
        return prev[list.number];
    }
    public void setPrev(MultiList list, IMultiListEntry entry) {
        prev[list.number] = entry;
    }
    public IMultiListEntry getNext(MultiList list) {
        return next[list.number];
    }
    public void setNext(MultiList list, IMultiListEntry entry) {
        next[list.number] = entry;
    }
}
class MultiList {
    static final int MAX_LISTS = 3; // visible to MultiListEntry for sizing its arrays
    private static int LISTS = 0;
    public final int number = LISTS++; // this list's slot in each entry's prev/next arrays
    private IMultiListEntry head = null;
    private IMultiListEntry tail = null;
    public IMultiListEntry getHead() {
        return head;
    }
    public void add(IMultiListEntry entry) {
        if (head == null) {
            head = entry;
        } else {
            entry.setPrev(this, tail);
            tail.setNext(this, entry);
        }
        tail = entry;
    }
    public IMultiListEntry getPrev(IMultiListEntry entry) {
        return entry.getPrev(this);
    }
    public IMultiListEntry getNext(IMultiListEntry entry) {
        return entry.getNext(this);
    }
}
Now all you have to do is either extend MultiListEntry or implement IMultiListEntry and delegate the interface methods to an internal reference to a MultiListEntry object.
There could be infinitely many answers, and "good example" is a subjective term, so the answer to your question is highly debatable. Of course there are examples; you just have to think about where fast insertion is needed.
For example you have a task list and you have to solve all the tasks. When you go through the list, when a task is solved you realize that a new task has to be solved urgently so you insert the task after the task you just solved. It is not a queue, because the list might be needed in the future for reviewing, so you need to keep your list intact, no pop method is allowed in this case.
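The task-list scenario can be sketched in Java with a LinkedList and its ListIterator, which inserts at the current position in constant time. The class and parameter names (TaskList, trigger, urgent) are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class TaskList {
    // Walks the tasks once. When the task named by "trigger" has been solved,
    // "urgent" is inserted directly after it. Returns the tasks in the order solved.
    static List<String> process(LinkedList<String> tasks, String trigger, String urgent) {
        List<String> solved = new ArrayList<>();
        boolean inserted = false;
        ListIterator<String> it = tasks.listIterator();
        while (it.hasNext()) {
            String task = it.next();
            solved.add(task);
            if (!inserted && task.equals(trigger)) {
                it.add(urgent); // constant-time insert at the cursor
                it.previous();  // step back so the new task is visited next
                inserted = true;
            }
        }
        return solved;
    }
}
```

The list itself is never popped, so it is still complete afterwards and can be reviewed later, as the example requires.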
To give you another example: You have a set of names ordered in alphabetical order. Let's suppose that somehow you can determine quickly the object which has its next pointing to the object where a particular name is stored. If you want to quickly delete a name, you just go to the previous item of the object to be deleted. Deletion is also quicker than in the case of stacks or queues.
Finally, imagine a very big set of items which must stay in storage after your insertion or deletion. In this case it is far quicker to just search for the item to be deleted, or the item before the position where your item should be inserted, and then do your operation, than to copy your whole large set.
HashMap in Java uses a linked-list representation internally.
When more than one key hashes to the same bucket, a collision occurs, and the colliding entries are chained together as a linked list.
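A minimal sketch of that chaining idea follows. It is a teaching model with made-up names (ChainedMap, chainLength), not the actual java.util.HashMap implementation, which among other differences resizes itself and, since Java 8, turns very long chains into trees.

```java
import java.util.LinkedList;

// Minimal hash map with separate chaining: each bucket is a linked list of entries.
class ChainedMap<K, V> {
    private static final class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final LinkedList<Entry<K, V>>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedMap(int bucketCount) {
        buckets = new LinkedList[bucketCount];
        for (int i = 0; i < bucketCount; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    private LinkedList<Entry<K, V>> bucketFor(K key) {
        return buckets[Math.floorMod(key.hashCode(), buckets.length)];
    }

    void put(K key, V value) {
        for (Entry<K, V> e : bucketFor(key)) {
            if (e.key.equals(key)) {
                e.value = value; // key already present: overwrite
                return;
            }
        }
        bucketFor(key).add(new Entry<>(key, value)); // collision or empty bucket: append to the chain
    }

    V get(K key) {
        for (Entry<K, V> e : bucketFor(key)) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    // Number of entries sharing this key's bucket.
    int chainLength(K key) {
        return bucketFor(key).size();
    }
}
```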

SubSonic 3 ActiveRecord generated code with warnings

While using SubSonic 3 with ActiveRecord T4 templates, the generated code shows many warnings about CLS-compliance, unused items, and lack of GetHashCode() implementation.
In order to avoid them, I did the following modifications:
// Structs.tt
[CLSCompliant(false)] // added
public class <#=tbl.CleanName#>Table: DatabaseTable
{ ...
// ActiveRecord.tt
[CLSCompliant(false)] // added
public partial class <#=tbl.ClassName#>: IActiveRecord
{
#region Built-in testing
#pragma warning disable 0169 // added
static IList<<#=tbl.ClassName#>> TestItems;
#pragma warning restore 0169 // added
...
public override Int32 GetHashCode() // added
{
return this.KeyValue().GetHashCode();
}
...
Is there a better way to get rid of the warnings? Or a better GetHashCode() implementation?
Currently, the only way to get rid of the warnings is to update your t4 templates and submit a bug/fix to Rob. Or wait until somebody else does.
As for the GetHashCode implementation, I don't think you're going to find a good way to do this through templates. Hash code generation is very dependent on what state your object contains. And people with lots of letters after their name work long and hard to come up with hash code algorithms that are fast and return results with low chances of collision. Doing this from within a template that may generate a class with millions of different permutations of the state it may hold is a tall order to fill.
Probably the best thing Rob could have done would be to provide a default implementation that calls out to a partial method, checks the result and returns it if found. Here's an example:
public partial class Foo
{
public override int GetHashCode()
{
int? result = null;
TryGetHashCode(ref result);
if (result.HasValue)
return result.Value;
return base.GetHashCode();
}
partial void TryGetHashCode(ref int? result);
}
public partial class Foo
{
partial void TryGetHashCode(ref int? result)
{
result = 5;
}
}
If you compile this without the implementation of TryGetHashCode, the compiler completely omits the call to TryGetHashCode and you go from the declaration of result to the check to see if it has value, which it never will, so the default implementation of the hash code is returned.
I wanted a quick solution for this as well. The version that I am using does generate GetHashCode for tables that have a primary key that is a single int.
As our simple tables use text as their primary keys this didn't work out of the box. So I made the following change to the template near line 273 in ActiveRecord.tt
<# if(tbl.PK.SysType=="int"){#>
public override int GetHashCode() {
return this.<#=tbl.PK.CleanName #>;
}
<# }#>
<# else{#>
public override int GetHashCode() {
throw new NotImplementedException();
}
<# }#>
This way GetHashCode is generated for all the tables, which stops the warnings, but it will throw an exception if called (which we never do).
We use this for a testing application, not a website or anything like that, and this approach may not be valid for many situations.
