In an application I am building at work, I have a large database with a table, say "People", with 100,000+ rows. The entries in this table contain two types of data:
Parent type and Child type, where each Child type entry has the database id of its parent in a special "Child_OF" column.
In memory, both entry types are represented by corresponding classes, "TParent" and "TChild", where each parent object has a field "children: TList".
What is the fastest way, using ADO, to:
- create a list of Parents and correctly assign their children to them?
The way I see it, one can go about the problem as follows:
1) retrieve all parents in bulk (with one SQL query) and create the parent list with empty children lists;
2) retrieve all children in bulk and, for each parent, find his/her children in the corresponding dataset.
Here is an example of what I have in mind for the assignment stage of the program...
procedure assignParentsTheirChildren(parentList: TList<TParent>;
  ma_people: TADOTable);
var
  i: Integer;
  qry: TADOQuery;
  aChild: TChild;
  aParent: TParent;
begin
  // create the query
  qry := TADOQuery.Create(nil);
  try
    qry.Connection := ma_people.Connection;
    // set the sql statement to fetch all children ...
    qry.SQL.Clear;
    qry.SQL.Add('Select * from ' + ma_people.TableName + ' WHERE ChildOF <> ' +
      QuotedStr(''));
    // supposedly do some optimization ---
    qry.CursorLocation := clUseClient; // load the whole recordset into memory
    qry.DisableControls;
    // DisableControls ensures that no dataset-bound control is updated while iterating the recordset
    qry.CursorType := ctStatic; // set cursor to static
    // open dataset
    qry.Open;
    // ***EDIT*** for completeness I add the suggestion made by Agustin Seifert below
    qry.Recordset.Fields['ChildOF'].Properties.Item['Optimize'].Value := True;
    for i := 0 to parentList.Count - 1 do
    begin
      // get daddy
      aParent := parentList[i];
      qry.Filter := 'ChildOF = ' + QuotedStr(IntToStr(aParent.parentID));
      qry.Filtered := True;
      while not qry.Eof do
      begin
        aChild := TChild.Create;
        getChildFromQuery(aChild, qry); // fills in the fields of the TChild instance...
        aParent.children.Add(aChild);
        qry.Next;
      end;
    end;
  finally
    qry.Free;
  end;
end;
I guess the biggest bottleneck of the above code is that I am re-filtering the data for every new parent. Is there a faster rework using Seek() or Locate/Find? Basically, one can assume that my dataset is static (during the time the parent list is created) and that network latency is infinite :) (that is, I first want to do the child-to-parent assignment from memory).
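For what it's worth, the rework I have been toying with (an untested sketch only; it assumes parentID values are unique and uses TDictionary from Generics.Collections) skips filtering entirely and makes a single pass over the already-opened children query:

procedure assignChildrenSinglePass(parentList: TList<TParent>; qry: TADOQuery);
var
  lookup: TDictionary<Integer, TParent>; // requires Generics.Collections in the uses clause
  aParent: TParent;
  aChild: TChild;
begin
  lookup := TDictionary<Integer, TParent>.Create;
  try
    // index the parents by their database id
    for aParent in parentList do
      lookup.Add(aParent.parentID, aParent);
    // one pass over all children: look up each child's parent in O(1)
    qry.First;
    while not qry.Eof do
    begin
      if lookup.TryGetValue(qry.FieldByName('ChildOF').AsInteger, aParent) then
      begin
        aChild := TChild.Create;
        getChildFromQuery(aChild, qry); // fills in the fields of the TChild instance
        aParent.children.Add(aChild);
      end;
      qry.Next;
    end;
  finally
    lookup.Free;
  end;
end;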
Many thanks!
btw I am using Microsoft SQL Server 2012.
If you don't want to change your code/logic, there's a way to optimize Filter, Find and Sort operations in ADO.
Access the recordset and optimize the involved fields:
var
  qry: TADOQuery;
  rs: _Recordset;
  ...
begin
  ...
  // after qry.Open;
  rs := qry.Recordset;
  rs.Fields['YourField'].Properties.Item['Optimize'].Value := True; // YourField = ChildOF in your case
This will create an index for the field. Building it takes a small amount of time compared with the time it takes to filter many times without an index.
MSDN: Optimize Property-Dynamic (ADO)
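If you are also willing to tweak the loop itself, the same optimized field speeds up Locate too. A rough, untested sketch reusing the names from your code (a drop-in replacement for the for-loop in your procedure: sort once, then walk the contiguous children of each parent instead of re-filtering):

qry.Filtered := False;
qry.Sort := 'ChildOF'; // children of one parent become contiguous
for i := 0 to parentList.Count - 1 do
begin
  aParent := parentList[i];
  if qry.Locate('ChildOF', aParent.parentID, []) then
    while (not qry.Eof) and
          (qry.FieldByName('ChildOF').AsInteger = aParent.parentID) do
    begin
      aChild := TChild.Create;
      getChildFromQuery(aChild, qry);
      aParent.children.Add(aChild);
      qry.Next;
    end;
end;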
Let's say I have a collection of UDTs. I populate it as below:
Public Type udtEmp
    Id As Long
    Name As String
End Type

Dim col As New Collection
Dim empRec As udtEmp, empDummy As udtEmp
Dim n As Long

For n = 1 To 100000
    empRec = empDummy ' reset record
    empRec.Id = n
    empRec.Name = "Name " & n
    col.Add empRec, CStr(empRec.Id)
Next
Now I want to loop through it. I am using a Long data type as the index to .Item()
Dim n As Long

For n = 1 To 100000
    empRec = col.Item(n)
Next
The code above works, but it's really slow - it takes 10,000 milliseconds to iterate. If I access the collection via a key, it's much faster - 78 milliseconds.
For n = 1 To 100000
    empRec = col.Item(CStr(n))
Next
The problem is that when I iterate over the collection, I don't have the keys. If I had a collection of objects instead of UDTs, I could do For Each obj In col, but with UDTs it won't let me iterate in that manner.
One of my thoughts was to have a secondary collection of indexes and keys to point to the main collection, but I am trying not to complicate the code unless I absolutely have to.
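(If it helps to clarify that idea, it would look roughly like this - a second collection that only stores the keys in insertion order:)

Dim keys As New Collection ' position -> key into the main collection
Dim n As Long

' while loading, store the key in both collections
For n = 1 To 100000
    empRec = empDummy
    empRec.Id = n
    empRec.Name = "Name " & n
    col.Add empRec, CStr(empRec.Id)
    keys.Add CStr(empRec.Id)
Next

' later: iterate through the stored keys (fast keyed lookups)
For n = 1 To keys.Count
    empRec = col.Item(keys(n))
Next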
So what are my options?
Elegance of the code versus performance is a serious decision you have to make, and the choice should be based on the impact of the results. For Each is elegant but slow, and it goes with objects and classes; if speed is what matters, use UDTs and arrays.
In your case, I think an array of UDTs is best suited to your situation (see the sketch below). To gain even more speed, try accessing the arrays through the SafeArray API (which you can google for); the results are quite impressive.
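For illustration, an untested sketch of the array-based version, reusing the udtEmp type from the question:

Dim emps() As udtEmp
Dim n As Long

ReDim emps(1 To 100000) ' size the array once up front

' load
For n = 1 To 100000
    emps(n).Id = n
    emps(n).Name = "Name " & n
Next

' iterate - plain indexed access into a UDT array is fast
For n = LBound(emps) To UBound(emps)
    Debug.Print emps(n).Name
Next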
You can use a collection of a user-defined class. It'll provide the For Each iteration ability with great performance.
Easiest way to make that happen is through the Class Builder Utility (https://msdn.microsoft.com/en-us/library/aa442930(v=vs.60).aspx). You might need to first run the Add-in Manager and load the Class Builder Utility. (I think there were install options regarding these features when you installed VB6/VS6, so if you don't see the Class Builder Utility in the Add-in Manager, that could be why.)
To match your udt sample, using the Class Builder Utility, first add a class (eg: Employee), with two properties (eg: EmpId and EmpName, long and string types respectively). Then add a collection (eg: Employees) based on the Employee class. Save it to the project (that will create two new class modules) and close the Utility.
Now you can create the new Employees collection, load it up, and iterate through it via index, key or For Each. (Note: don't use a pure number for the key - requesting an item by a key that is a pure number, even as a string, will be interpreted as an index request; it'll be slow and you probably won't get the desired item.)
Also - once the new classes have been created, you can add customized properties and methods to them to handle whatever kinds of fancy stuff you may have requirements for.
Dim i As Long
Dim Emp As Employee
Dim colEmp As New Employees
Dim name As String

' Loading
For i = 1 To 100000
    colEmp.Add i, "name" & CStr(i), "key" & CStr(i)
Next i

' iterate with index
For i = 1 To 100000
    Set Emp = colEmp(i)
    name = Emp.EmpName
Next i

' iterate with key
For i = 1 To 100000
    Set Emp = colEmp("key" & i)
    name = Emp.EmpName
Next i

' iterate with for-each
For Each Emp In colEmp
    name = Emp.EmpName
Next Emp
Timings
On my system for the above code:
Loading time: 1 second
Index time: 20 seconds
Key time: 0.29 seconds
For-each time: 0.031 seconds
In my Delphi application, I use lookup fields, but in an unusual way: I actually want to update the field in the underlying dataset, just as if it were in the same table.
Existing guides say there is no problem - just join the table and voila... I envy them if they really accomplished this task with such a simple solution; I did not. BTW, I think I'm getting close to my goal. I have one question left: how on earth can I get the value I just entered into the DBGrid cell?
I tried DBGrid[FieldName].EditValue and .DisplayText, but they show the same value as Field.Value, which doesn't change after exiting the column, because it is a lookup field. Sender.NewValue is null. I'm using this procedure to update the lookup table:
procedure TKDGridForm.LookupFieldChange(Sender: TField);
begin
  if not Assigned(Sender) then
    Exit;
  Sender.OnChange := nil;
  if not Assigned(Sender.LookupDataSet) then
    Exit;
  if Sender.LookupDataSet.Locate(Sender.LookupKeyFields, Sender.DataSet[Sender.KeyFields], []) then
    Sender.LookupDataSet.Edit
  else
    Sender.LookupDataSet.Append;
  // how do I get the value I just entered?
  Sender.Value := KDGrid3[Sender.FieldName].DisplayText;
  Sender.LookupDataSet.FieldValues[Sender.LookupResultField] := Sender.Value;
  Sender.LookupDataSet.Post;
  Sender.OnChange := LookupFieldChange;
end;
Here is the SQL I used before I ended up with lookup fields:
select det.*,
od1.T_EQ T_SHABLON_EQ,
od1.T_NV T_SHABLON_NV,
od1.T_PRIM T_SHABLON_PRIM,
od2.T_EQ T_PRAVKA_EQ,
od2.T_NV T_PRAVKA_NV,
od2.T_PRIM T_PRAVKA_PRIM,
od3.T_EQ T_VALCOV_EQ,
od3.T_NV T_VALCOV_NV,
od3.T_PRIM T_VALCOV_PRIM,
od4.T_EQ T_REZKA2_EQ,
od4.T_NV T_REZKA2_NV,
od4.T_PRIM T_REZKA2_PRIM
from CMKNEW.details det
left join CMKNEW.OperDetails od1
ON det.nrec = od1.cdetail
and 81 = od1.coper
left join CMKNEW.OperDetails od2
ON det.nrec = od2.cdetail
and 82 = od2.coper
left join CMKNEW.OperDetails od3
ON det.nrec = od3.cdetail
and 83 = od3.coper
left join CMKNEW.OperDetails od4
ON det.nrec = od4.cdetail
and 84 = od4.coper
where det.ckd=:CKD order by det.NREC
Hope this explains my task more clearly. If you want an MCVE, I can extend this, though I think it's not essential.
My database is Oracle, connected through ADO. I'd like the solution to be as simple as possible.
I assume you're talking about a standard TDBGrid and that what you're asking is how to get the text displayed in a grid cell while you are typing into it, but before the grid's dataset has been updated. At that point, the current row indicator in the leftmost column will have changed from the default right-pointing triangle to an I-beam.
If so, the snippet below shows you how to get this text value. The point is that, in the situation I've described, what's in the cell hasn't yet been posted back to the underlying dataset field. What happens is that when you start editing, an InplaceEditor (a TCustomMaskEdit descendant) is dynamically created, and it is this editor which holds the text value being edited.
Add a TTimer and a TMemo to your form and then run the code below to see what I mean.
type
  TMyGrid = class(TDBGrid);

procedure TMyForm.Timer1Timer(Sender: TObject);
var
  S: String;
  Grid: TMyGrid;
begin
  Grid := TMyGrid(DBGrid1);
  if Grid.InplaceEditor <> nil then
    S := Grid.InplaceEditor.Text
  else
    S := IntToStr(Grid.Col) + ':' + IntToStr(Grid.Row);
  Grid.Invalidate;
  Memo1.Lines.Insert(0, S);
end;
I need some help creating a Master/Detail report in Fast Reports for Delphi XE2.
I have a simple form which accepts 2 dates and 2 times from a user. I then have 2 Oracle datasets on the form with which to retrieve my data. When the user presses the print button, the program takes the values entered by the user and passes them to the first Oracle dataset, which retrieves the first value; this value, along with the user-entered values, is then passed to the second dataset to print the detail pertaining to the value retrieved.
For each dataset I have a corresponding frxDBDataset component, which is assigned to the frxReport1 component. Within the report, I have created a Master band assigned to dataset1 and a Detail band assigned to dataset2. When I run my report, dataset1 brings back all the records, but dataset2 only brings back the records for the first value and duplicates them for every record in dataset1.
Below is the code I am trying to execute:
opr_operator_ods.Close;
opr_operator_ods.SetVariable('DATEFROM', opr_datefrom_dtp.Date);
opr_operator_ods.SetVariable('DATETO', opr_dateto_dtp.Date);
opr_operator_ods.SetVariable('TIMEFROM', opr_timefrom_dtp.Text);
opr_operator_ods.SetVariable('TIMETO', opr_timeto_dtp.Text);
opr_operator_ods.Open;

if opr_operator_ods.RecordCount > 0 then
begin
  while not opr_operator_ods.Eof do
  begin
    opr_operatorcount_ods.Close;
    opr_operatorcount_ods.SetVariable('DATEFROM', opr_datefrom_dtp.Date);
    opr_operatorcount_ods.SetVariable('DATETO', opr_dateto_dtp.Date);
    opr_operatorcount_ods.SetVariable('TIMEFROM', opr_timefrom_dtp.Text);
    opr_operatorcount_ods.SetVariable('TIMETO', opr_timeto_dtp.Text);
    opr_operatorcount_ods.SetVariable('OPERATOR',
      opr_operator_ods.FieldByName('opr_code').AsString);
    opr_operatorcount_ods.Open;
    while not opr_operatorcount_ods.Eof do
    begin
      frxReport1.PrepareReport(false);
      opr_operatorcount_ods.Next;
    end;
    frxReport1.PrepareReport(true);
    opr_operator_ods.Next;
  end;

  DecodeDate(opr_datefrom_dtp.Date, tyear, tmonth, tday);
  StartDate := '''' + IntToStr(tday) + '/' + IntToStr(tmonth) + '/' + IntToStr(tyear) + '''';
  DecodeDate(opr_dateto_dtp.Date, tyear, tmonth, tday);
  EndDate := '''' + IntToStr(tday) + '/' + IntToStr(tmonth) + '/' + IntToStr(tyear) + '''';
  frxReport1.Variables['StartDate'] := StartDate;
  frxReport1.Variables['EndDate'] := EndDate;
  //frxReport1.PrepareReport(True);
  frxReport1.ShowPreparedReport;
end;
How do I get the second dataset to move on to the next record's values?
This report used to work perfectly in Delphi 2005 with RaveReports 6, but there we used code-based report development, which was easier to manipulate with a 'writeln', rather than the visual approach of Fast Reports.
When creating the preview, FastReport does something like this:
while not MasterBand.DataSet.Eof do
begin
  ...Do special FastReport's work :)
  while not DetailBand.DataSet.Eof do
  begin
    ...Do special FastReport's work :)
    DetailBand.DataSet.Next;
  end;
  MasterBand.DataSet.Next;
end;
In your code:
while not opr_operatorcount_ods.Eof do
begin
  frxReport1.PrepareReport(false);
  opr_operatorcount_ods.Next; // <-- here opr_operatorcount_ods is already at its last position after PrepareReport
end;
Data bands may be of master or detail type, but they only control the positioning of data on the output page
(the order and the number of times it is displayed).
The data displayed by the objects in the bands depends on the relationship between the two (or more) datasets.
So you should set up that relation.
The relationship can be established in several ways.
If you want to use parameters, you can do it as follows:
Place a DataSource component.
Connect it to dataset1 (opr_operator_ods) via its DataSet property: DataSet = opr_operator_ods.
In the DataSource.OnDataChange event, write:
opr_operatorcount_ods.Close;
......
// Set parameter (the relation between opr_operator (Master) and opr_operatorcount (Detail))
opr_operatorcount_ods.Params.ParamByName('opr_code').AsString :=
  opr_operator_ods.FieldByName('opr_code').AsString;
opr_operatorcount_ods.Open;
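Put together, the handler might look roughly like this (a sketch only; the form and DataSource names are assumed, and I use SetVariable as in your code - ParamByName is the equivalent if your dataset exposes Params):

procedure TReportForm.dsOperatorDataChange(Sender: TObject; Field: TField);
begin
  opr_operatorcount_ods.Close;
  // pass the unchanged user selections again, as in your original code
  opr_operatorcount_ods.SetVariable('DATEFROM', opr_datefrom_dtp.Date);
  opr_operatorcount_ods.SetVariable('DATETO', opr_dateto_dtp.Date);
  opr_operatorcount_ods.SetVariable('TIMEFROM', opr_timefrom_dtp.Text);
  opr_operatorcount_ods.SetVariable('TIMETO', opr_timeto_dtp.Text);
  // the master-detail relation: the current master row drives the detail query
  opr_operatorcount_ods.SetVariable('OPERATOR',
    opr_operator_ods.FieldByName('opr_code').AsString);
  opr_operatorcount_ods.Open;
end;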
And then prepare and print report as:
procedure Print;
begin
  // Prepare the Master dataset (parameters, close/open etc.), e.g.:
  opr_operator_ods.Close;
  opr_operator_ods.SetVariable('DATEFROM', opr_datefrom_dtp.Date);
  opr_operator_ods.SetVariable('DATETO', opr_dateto_dtp.Date);
  opr_operator_ods.SetVariable('TIMEFROM', opr_timefrom_dtp.Text);
  opr_operator_ods.SetVariable('TIMETO', opr_timeto_dtp.Text);
  opr_operator_ods.Open;
  ...
  frxReport1.PrepareReport;
  frxReport1.ShowPreparedReport;
end;
I need to re-sequence the majority of child nodes at one level within my document.
The document has a structure that looks (simplified) like this:
sheet
  table
    row
      parameters
    row
      parameters
    row
      parameters
    row
      cell
        header string
      cell
        header string
      cell
        header string
    data row A
      cell
        data
      cell
        data
      cell
        data
    data row B
      cell
        data
      cell
        data
      cell
        data
    data row C
      cell
        data
      cell
        data
      cell
        data
    data row D
      cell
        data
      cell
        data
      cell
        data
    data row E
      cell
        data
      cell
        data
      cell
        data
    row
      parameters
    row
      parameters
    row
      parameters
    row
      parameters
    row
      parameters
I'm using pugixml to load, parse, traverse and access the large XML file, and I'm ultimately working out a new sequence for the data rows. I know I'm parsing everything correctly, and looking at the resequencing results I can see that the reading and processing is correct. The resequencing solution, after all my optimizing and processing, is a list of indices in a revised order, like { D, A, E, C, B } for the example above. So now I need to actually resequence the rows into this new order and then output the resulting XML to a new file. The actual data is about 16 MB, with several hundred data row nodes and more than a hundred data elements in each row.
I've written a routine to swap two data rows, but something I'm doing is destroying the xml structural consistency during the swaps. I'm sure I don't understand the way pugi is moving nodes around and/or invalidating node handles.
I create and set aside node handles -- pugi::xml_node -- to the "table" level node, to the "header" row node, and to the "first data" row node, which in the original form above would be node "data row A". I know these handles give me correct access to the right data -- I can pause execution and look into them during the optimization and resequencing calculations and examine the rows and their siblings and see the input order.
The "header row" is always a particular child of the table, and the "first data row" is always the sibling immediately after the "header row". So I set these up when I load the file and check them for data consistency.
My understanding of node::insert_copy_before is this:
pugi::xml_node new_node_handle_in_document = parentnode.insert_copy_before( node_to_be_copied_to_child_of_parent , node_to_be_copied_nodes_next_sibling );
My understanding is that a deep recursive clone of node_to_be_copied_to_child_of_parent, with all children and attributes, will be inserted as the sibling immediately before node_to_be_copied_nodes_next_sibling, where both are children of parentnode.
Clearly, if node_to_be_copied_nodes_next_sibling is also the "first data row", then the node handle to the first data row may still be valid after the operation, but will no longer actually be a handle to the first data row. But will using insert_copy on the document force updates to individual node handles in the vicinity of the changes - or not?
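(For what it's worth, the fallback I am considering, if pairwise swapping turns out to be the culprit, is to insert copies of the data rows in the final order and then remove the originals. An untested sketch, written as a free function taking the table and first-data-row handles:)

// Untested sketch: rebuild the data rows in the new order instead of swapping them in place.
// newOrder holds the original zero-based row indices, e.g. { 3, 0, 4, 2, 1 } for { D, A, E, C, B }.
#include <vector>
#include "pugixml.hpp"

void resequenceRows(pugi::xml_node table, pugi::xml_node firstDataRow,
                    const std::vector<size_t>& newOrder)
{
    // collect handles to the original data rows first, so later edits cannot confuse the walk
    std::vector<pugi::xml_node> oldRows;
    pugi::xml_node row = firstDataRow;
    for (size_t i = 0; i < newOrder.size() && row; ++i, row = row.next_sibling())
        oldRows.push_back(row);

    // 'row' now refers to the first sibling after the data rows (it may be a null handle)
    pugi::xml_node afterLast = row;

    // insert a copy of each row, in the desired order, just before the trailing siblings
    for (size_t idx : newOrder)
    {
        if (afterLast)
            table.insert_copy_before(oldRows[idx], afterLast);
        else
            table.append_copy(oldRows[idx]);
    }

    // then remove the original rows
    for (pugi::xml_node old : oldRows)
        table.remove_child(old);
}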
So let's look at the code I'm trying to make work:
// a method to switch data rows
bool switchDataRows( int iRow1 , int iRow2 )
{
    // temp vars
    int iloop;
    // navigate to the first row and create a handle that can move along siblings until we find the target
    pugi::xml_node xmnRow1 = m_xmnFirstDataRow;
    for ( iloop = 0 ; iloop < iRow1 ; iloop++ )
        xmnRow1 = xmnRow1.next_sibling();
    // navigate to the second row and create another handle that can move along siblings until we find the target
    pugi::xml_node xmnRow2 = m_xmnFirstDataRow;
    for ( iloop = 0 ; iloop < iRow2 ; iloop++ )
        xmnRow2 = xmnRow2.next_sibling();
    // ok.... so now get convenient handles on the locations of the two nodes by creating handles to the nodes AFTER each
    pugi::xml_node xmnNodeAfterFirstNode = xmnRow1.next_sibling();
    pugi::xml_node xmnNodeAfterSecondNode = xmnRow2.next_sibling();
    // at this point I know all the handles I've created are pointing towards the intended data.
    // now copy the second to the location before the first
    pugi::xml_node xmnNewRow2 = m_xmnTableNode.insert_copy_before( xmnRow2 , xmnNodeAfterFirstNode );
    // here's where my concern begins. Does this copy do what I want it to do, moving a copy of the second target row into the position under the table node
    // as the child immediately before xmnNodeAfterFirstNode? If it does, might this operation invalidate other handles to data row nodes? Are all bets off as
    // soon as we do an insert/copy in a list of siblings, or will handles to other nodes in that list of children remain valid?
    // now copy the first to the spot before the second
    pugi::xml_node xmnNewRow1 = m_xmnTableNode.insert_copy_before( xmnRow1 , xmnNodeAfterSecondNode );
    // clearly, if other handles to data row nodes have been invalidated by the first insert_copy, then these handles aren't any good any more...
    // now delete the old rows
    bool bDidRemoveRow1 = m_xmnTableNode.remove_child( xmnRow1 );
    bool bDidRemoveRow2 = m_xmnTableNode.remove_child( xmnRow2 );
    // this is my attempt to remove the original data row nodes after they've been copied to their new locations
    // we have to update the first data row!!!!!
    bool bDidRowUpdate = updateFirstDataRow(); // a routine that starts with the header row node and finds the first sibling, the first data row
    // as before, if using the insert_copy methods results in many of the handles moving around, then I won't be able to base an update of the "first data row node"
    // handle on the "known" handle to the header data row node.
    // return the result
    return ( bDidRemoveRow2 && bDidRemoveRow1 && bDidRowUpdate );
}
As I said, this destroys the structural consistency of the resulting xml. I can save it, but nothing will read it except notepad. The table ends up being somewhat garbled. If I try to use my own program to read it, the reader reports an "element mismatch" error and refuses to load it, understandably.
So I'm doing one or more things wrong. What are they?
My datasource has a column that contains a comma-separated list of numbers.
I want to create a dataset that takes those numbers and turns them into groupings to use in a bar chart.
Requirements:
- numbers will be between 0-17 inclusive
- groupings: 0-2, 3-5, 6-10, 11-17
- x-axis labels have to be the groupings
- y-axis is the percent of rows that contain that grouping
- note that because each row can contribute to multiple columns, the percentages can add up to > 100%
Any help you can offer would be awesome... I'm very new to BIRT and have been stuck on this for a couple of days now.
Not sure that I understand the requirements exactly, but your basic question "split dataset column into multiple rows" can be solved either using a scripted dataset or with pure SQL (depending on your DB).
Either way, you will need a second dataset (i.e. your data model is master-detail), and in your layout you will need something like:
- a Table/List "Master" bound to the master DS
- a Table/List "Detail" bound to the detail DS
The detail DS needs the comma-separated result column from the master DS as an input parameter of type "String".
Doing this with a scripted dataset is quite easy IFF you understand Javascript AND you understand how scripted datasets work: Create a report variable "myValues" of type object with a default value of null and a second report variable "myValuesIndex" of type integer with a default value of 0.
(Note: this is all untested!)
Create the dataset "detail" as a scripted DS, with one input parameter "csv" of type String and one output parameter "value" of type String.
In the open event of the scripted DS, code:
vars["myValues"] = this.getInputParameterValue("csv").split(",");
vars["myValuesIndex"] = 0;
In the fetch event, code:
var i = vars["myValuesIndex"];
var len = vars["myValues"].length;
if (i < len) {
    row["value"] = vars["myValues"][i];
    vars["myValuesIndex"] = i + 1;
    return true;
} else {
    return false;
}
For example, for the master DS result row with csv = "1,2,3-4,foo", the detail DS will result in 4 rows with
value = "1"
value = "2"
value = "3-4"
value = "foo"
Using an Oracle DB, this can be done without Javascript. The detail DS (with the same input parameter as above) would then look like:
select t.value as value from table(split(?)) t
For the definition of the split function, see RedFilter's answer on
Is there a function to split a string in PL/SQL?
If you get ORA-22813, you should change the original definition
create or replace type split_tbl as table of varchar2(32767);
to
create or replace type split_tbl as table of varchar2(4000);
as mentioned on https://community.oracle.com/thread/2288603?tstart=0
It's also possible with pure SQL in 11g using regexp_substr (see the same page).
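For example, a common 11g pattern for the detail DS query (a sketch only; note that the same CSV string has to be bound to both placeholders) is:

select regexp_substr(?, '[^,]+', 1, level) as value
from dual
connect by regexp_substr(?, '[^,]+', 1, level) is not null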
Create parameters in the scripted dataset. You then have to pass (link) the actual dataset values to the scripted dataset's parameters through the DataSet Parameter Binding, after assigning the scripted dataset to the table.