How to understand the dynamic nature of SAS views

Or, in practice, how does the dynamic nature help when processing data in a PROC step?
I found this link http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000975382.htm. But it doesn't help much.

"A data file is static; a SAS view is dynamic. ...." - the dynamic aspect here is meant, IMHO, such that if the underlying data members (on which the view is based) changes, the view (the data it returns) is automatically updated, it returns fresh/latest data without the needed for a "refresh". This is simply because the view does not contain/store data, it's like a compiled data step that is "run" each time the view is accessed.
The actual sentence from docs is somewhat "promising", but there's not much special behind it once you understand the nature of views.
I'd say the statement could also be read as a small warning - if you change, lose, or damage the underlying data, the view will no longer return the original data, so the dynamic nature can also be less safe.
Please note that if the underlying structure changes (adding, dropping, or modifying column properties), you need to recreate the view (both DATA step and SQL views) to keep it valid and pick up the changes in the underlying data.
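As a minimal sketch of the dynamic behaviour described above (data set names are assumptions):
data work.base;                        /* underlying table */
    x = 1;
run;

data work.v_base / view=work.v_base;   /* stores the program, not the data */
    set work.base;
run;

data work.base;                        /* replace the underlying table */
    do x = 1 to 3;
        output;
    end;
run;

proc print data=work.v_base;           /* the view re-runs and returns 3 rows - no refresh needed */
run;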

It allows you to avoid writing code to process the data every time.
Random example: if my IT department chose to store my data in monthly files that followed a naming convention such as:
Y2014_M01
Y2014_M02
I could theoretically write a view like this:
data all_y2014 / view=all_y2014;   /* the view's own name must not match the Y2014: prefix */
set Y2014:;
run;
And then when I needed to process the data I could simply refer to ALL_Y2014 as my data set. Then when it was updated monthly I wouldn't need to update my code. A bit of a contrived example, but I hope it helps to explain.
data class_F;
    set sashelp.class;
    where sex='F';
run;

data class_M;
    set sashelp.class;
    where sex='M';
run;

data all_class / view=all_class;   /* again, a name that does not match the class: prefix */
    set class:;
run;

proc means data=all_class;
run;


Identify if objects are the same

I want to check certain objects against each other and identify if they are the same.
For example, I need to verify that the total cost on one page is the same as on another page. I developed a script that works; however, the total cost changes every day, so I have to update the object properties in maintenance mode every day.
Is there a way for UFT to automatically recognize that this object changes and update it?
Could you elaborate on your question? For now, you can use the regular expression .* if certain property values of the object keep changing. Alternatively, you can store the values in an Excel sheet and change them every day depending on the requirement.
If this is not helpful, let me know.
It sounds like you actually want to compare the values shown in two different objects, and see if those values are the same. (I assume this because you say they are on two different pages)
Also, you mention maintenance mode, so I assume you are using checkpoints to store their expected values.
I would suggest: instead of storing the expected values in a checkpoint, you could read the value of the first object (GetROProperty), store it in a variable (DataTable field, environment variable, etc.), then navigate to the other page, read the same property from the other object, and compare the two.
i.e.
If {browser,page,object...}.GetROProperty({whateverPropertyYouNeed}) = Environment({storedFirstValue}) Then
    Reporter.ReportEvent micPass, "compare step", "{details here}"
End If
* Replace the stuff inside {} with your code - I don't know what it is.
If you need to actually store the total cost externally, you could use a DataTable field and export the sheet at the end of the run, then import the same sheet at the beginning of the next run. That would save the data to an Excel sheet on a drive.
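A rough sketch of that export/import idea (the file path is an assumption, and a TotalCost column is assumed to exist in the global sheet):
' end of run: save the value read with GetROProperty
DataTable("TotalCost", dtGlobalSheet) = totalCost
DataTable.ExportSheet "C:\temp\totals.xls", "Global"

' start of the next run: load the stored value back
DataTable.ImportSheet "C:\temp\totals.xls", "Global", "Global"
storedCost = DataTable("TotalCost", dtGlobalSheet)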

Reading XML-files with StAX / Kettle (Pentaho)

I'm doing an ETL process with Pentaho (Spoon/Kettle) where I'd like to read an XML file and store element values in a database.
This works just fine with the "Get data from XML" component... but the XML file is quite big, several gigabytes, and therefore reading the file takes too long.
Pentaho Wiki says:
The existing Get Data from XML step is easier to use but uses DOM parsers that need in-memory processing, and even the purging of parts of the file is not sufficient when these parts are very big.
The XML Input Stream (StAX) step uses a completely different approach to solve use cases with very big and complex data structures and the need for very fast data loads...
Therefore I'm now trying to do the same with StAX, but it just doesn't seem to work out as planned. I'm testing this with an XML file that has only one element group. The file is read and then mapped/inserted into a table... but now I get multiple rows in the table where all the values are "undefined", and some rows where I have the right values. In total I have 92 rows in the table, even though it should only have one row.
Flow goes like:
1) read with StAX
2) Modified Java Script Value
3) Output to DB
At step 2) I'm doing the following:
var id;
if ( xml_data_type_description.equals("CHARACTERS") &&
     xml_path.equals("/labels/label/id") ) {
    id = xml_data_value;
}
...
I'm using positional-staz.zip from http://forums.pentaho.com/showthread.php?83480-XPath-in-Get-data-from-XML-tool&p=261230#post261230 as an example.
How do I use StAX to read an XML file and store the element values in a database?
I've been trying to look for examples but haven't found much. The above example uses a "Filter Rows" component before inserting the rows. I don't quite understand why it's being used - can't I just map the values I need? It might be that this problem occurs because I don't use, or don't know how to use, the Filter Rows component.
Cheers!
I posted a possible StAX-based solution on the forum listed above, but I'll post the gist of it here since it is awaiting moderator approval.
Using the StAX parser, you can select just those elements that you care about, namely those with a data type of CHARACTERS. For the forum example, you basically need to denormalize the rows in sets of 4 (EXPR, EXCH, DATE, ASK). To do this you add the row number to the stream (using an Add Sequence step), then use a Calculator step to determine a "bucket number" = INT((rownum-1)/4). This will give you a grouping field for a Row Denormaliser step.
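For illustration, the same grouping calculation as it might look in a Modified Java Script Value step instead of a Calculator (assuming the Add Sequence step named its field rownum):
// one output row is built from every 4 incoming rows;
// add 'bucket' to the step's output fields to use it downstream
var bucket = Math.floor((rownum - 1) / 4);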
When the post is approved, you'll see a link to a transformation that uses StAX and the method I describe above.
Is this what you're looking for? If not please let me know where I misunderstood and maybe I can help.

Is it possible to traverse rowtype fields in Oracle?

Say I have something like this:
somerecord SOMETABLE%ROWTYPE;
Is it possible to access the fields of somerecord without knowing the field names?
Something like somerecord[i], such that the order of the fields is the same as the column order in the table?
I have seen a few examples using dynamic SQL, but I was wondering if there is a cleaner way of doing this.
What I am trying to do is generate/get the DML (an insert query) for a specific row in my table, but I haven't been able to find anything on this.
If there is another way of doing this I'd be happy to use it, but I would also be very curious to know how to do the former part of this question - it's more versatile.
Thanks
This doesn't exactly answer the question you asked, but might get you the result you want...
You can query the USER_TAB_COLUMNS view (or the other similar *_TAB_COLUMNS views) to get information such as the column name (COLUMN_NAME), position (COLUMN_ID), and data type (DATA_TYPE) for the columns in a table (or a view), which you can use to generate DML.
You would still need to use dynamic SQL to execute the generated DML (or at least generate static SQL separately).
However, this approach won't work for identifying the columns in an arbitrary query (unless you create a view of it). If you need that, you might need to resort to DBMS_SQL (or other tools).
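For illustration, a minimal sketch that assembles the column list of an INSERT statement from the dictionary (the VALUES part is left as a placeholder, and DBMS_OUTPUT is just for the example):
DECLARE
  l_cols VARCHAR2(4000);
BEGIN
  FOR c IN (SELECT column_name
              FROM user_tab_columns
             WHERE table_name = 'SOMETABLE'
             ORDER BY column_id)
  LOOP
    -- append each column name in column order
    l_cols := l_cols || CASE WHEN l_cols IS NOT NULL THEN ', ' END || c.column_name;
  END LOOP;
  dbms_output.put_line('INSERT INTO sometable (' || l_cols || ') VALUES (...)');
END;
/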
Hope this helps.
As far as I know there is no clean way of referencing record fields by their index.
However, if you have a lot of different kinds of updates of the same table, each with its own column set to update, you might want to avoid dynamic SQL and look in the direction of statically populating your record with values and then issuing UPDATE someTable SET ROW = someTableRecord WHERE someTable.id = someTableRecord.id;
This approach has its own drawbacks, such as updating every column (even unchanged ones) and thus creating additional redo log data, but I believe it is worth considering.
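A minimal sketch of that pattern (the id and some_column columns and the literal values are assumptions):
DECLARE
  somerecord sometable%ROWTYPE;
BEGIN
  SELECT * INTO somerecord FROM sometable WHERE id = 1;
  somerecord.some_column := 'new value';   -- populate the record statically
  UPDATE sometable SET ROW = somerecord WHERE id = somerecord.id;
END;
/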

Performance of accessing dataset fields using field names instead of indexes

Is the performance difference negligible?
For example,
myQuery.FieldByName('MyField').AsString;
myQuery.Fields[0].AsString;
Cases:
Table with a decent number of fields, say > 50 fields
Accessing large resultsets, say > 100,000 rows
Is the readability benefit of field names worth the performance decrease?
Here is an interesting post by François Gaillard about FieldByName performance issues.
The performance hit may not be negligible, depending on how often you access the field by name. If you use FieldByName for every field and every row, you may notice a performance decrease (see for example http://www.delphifeeds.com/go/s/74559). To maintain readability yet improve performance you could:
Use the ['FieldName'] or FieldByName() syntax only once, and store a reference to the field in a variable (see the sketch after this list).
Use "static" persistent field declarations: right-click the dataset, select the Fields Editor, and add the needed fields. It will declare the proper TField descendant and let you assign a name.
Also, the AsXXXXX calls may be slower than using a TField descendant's native Value property.
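A minimal sketch of the first suggestion (myQuery is assumed, and ProcessValue is a hypothetical routine standing in for whatever you do with the value):
var
  MyField: TField;
begin
  // look the field up by name once, outside the loop
  MyField := myQuery.FieldByName('MyField');
  myQuery.First;
  while not myQuery.Eof do
  begin
    ProcessValue(MyField.AsString); // reuse the reference for every row
    myQuery.Next;
  end;
end;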
I have found FieldByName to be noticeably slower.
I normally access the database through an intermediate layer that accesses entire records from the same table a lot of times. On creation of that layer I assign the index of each field to a variable. I then use the variables for later access, to still have readable code:
ADODataSet.CommandText := 'select * from [TABLE] where 1 = 0'; // fetch the table layout only
ADODataSet.Open;
ADODataSet.GetFieldNames(List);
varMyField := List.IndexOf('MyField');
// later, for each row:
Value := ADODataSet.Fields[varMyField].AsString;

Iterating over LINQ entity columns

I need to insert a record with LINQ.
I have a NameValueCollection with the data from a form post,
so it started out in the name=value&name2=value2 type format.
The thing is, I need to insert all these values into the table, but of course the table fields are typed, and I need to convert the data before inserting it.
I could of course explicitly do
linqTableObj.ColumnProperty = Convert.ToWhatever(value);
but I have many columns in the table, and the data coming back from the form doesn't always contain all the fields in the table.
I thought I could iterate over the LINQ object's columns, getting their data types to use for converting the appropriate values from the form data.
Fine, all good, but then I'm still stuck with writing
linqTableObj.ColumnProperty = convertedValue
...one assignment for every column in the table. I'd like to do something like:
foreach (col in newLinqRowObj)
{
    newLinqObj[col] = Convert.ChangeType(nameValueCollection[col.Name], col.DataType);
}
Clearly I can't do that, but is anything like that possible? Or
is it possible to loop over the columns of the new 'record', setting the values as I go, and I guess grabbing the types at that point to do the conversion?
Stumped I am.
Thanks,
Nat
If you have some data type with a hundred different properties, and you want to copy those into a completely different data type with a hundred different properties, then somehow somewhere in your code you are going to have to define a hundred different "mapping" instructions. It doesn't matter what framework you are using, or whether the "mapping" instructions are lines of C# code, XML elements, lambda functions, proprietary "stuff", or whatever. There's no getting away from it.
Bearing that in mind, having one line of code per property looks to me like the fastest, simplest, most readable and maintainable solution.
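In other words, the explicit version would look something like this (the property and form-field names are assumptions):
// one mapping line per property, typed conversion included
entity.Name = form["Name"];
entity.Cost = Convert.ToDecimal(form["Cost"]);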
If I understood your problem correctly, you could use reflection (or dynamic code generation, if it is performance-sensitive) to circumvent your typing problems.
There is a pretty good description of how to do something like this at CodeProject.
Basically you get a PropertyInfo for the property you want to set (if it's not a property, I think you would need dynamic code generation) and use its SetValue method (after calling the appropriate Convert.ChangeType, of course). This will basically circumvent the whole static typing, so there you are.
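A minimal sketch of the reflection approach (the entity type, class name, and form-field names are assumptions):
using System;
using System.Collections.Specialized;
using System.Reflection;

static class FormMapper
{
    // copy matching form fields onto an entity's writable properties
    public static void CopyFormValues(object entity, NameValueCollection form)
    {
        foreach (PropertyInfo prop in entity.GetType().GetProperties())
        {
            string raw = form[prop.Name];   // null when the form omitted the field
            if (raw == null || !prop.CanWrite)
                continue;

            // unwrap Nullable<T> so Convert.ChangeType gets the underlying type
            Type target = Nullable.GetUnderlyingType(prop.PropertyType) ?? prop.PropertyType;
            prop.SetValue(entity, Convert.ChangeType(raw, target), null);
        }
    }
}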
