InstallDate00 NULL for a number of programs in v_Add_Remove_Programs - sccm

I notice that several important programs listed in v_Add_Remove_Programs have a NULL install date, even though the install date exists in Programs and Features on the machine.
How does SCCM collect this data and why would it be null for random programs?

How can we do data analysis for DB replication project

We are facing one issue in our project: data verification.
The project replicates data from Sybase to Oracle databases.
The structure of Table A is the same in Sybase and Oracle.
The column and primary key combination is the same across all the databases.
E.g. if Sybase has Table A with columns a, b and c,
a table with the same name and the same columns is available in the other databases.
We are done with the replication part, but we hit some silent failures such as data discrepancies. Is there an existing tool for detecting these?
Any information on this would be helpful. Thanks.
Sybase (now SAP) has a couple of products that can be used for data comparison and reconciliation:
rs_subcmp - an older, 32-bit tool that comes with the Sybase Replication Server product and can be used to compare data between source and target; SQL reconciliation scripts can be generated from the differences and then applied to the target to bring it in sync with the source; if your tables are more than 1GB in size you can still use rs_subcmp but you'll need to create multiple comparison jobs (via where clauses) to work on different subsets of your tables [I don't recall if rs_subcmp can be used for heterogeneous replication setups, e.g. ASE-Oracle.]
Data Assurance (DA) - the newer, 64-bit product, also from Sybase, which can also compare data and (re)sync the target(s) from the source (either via SQL reconciliation scripts or directly); DA is capable of handling comparisons between a handful of different RDBMS products (e.g. ASE-Oracle); I'm currently working on a project where one of the requirements is to validate (and reconcile where needed) 200+TB of data being migrated from Oracle to HANA and I'm using DA for the validation/reconciliation portion of the project
As @TenG has hinted at with his answer, there's a good bit of effort involved to compare data and generate code to reconcile the differences. Rolling your own code is doable but will entail a lot of work. If you've got the money you'll likely find 3rd party tools can get most/all of the work done for you.
If you used a 3rd party product to replicate your data from Sybase to Oracle, you may want to see if the same vendor has a comparison/validation/reconciliation tool you could use.
I've worked on a few migration projects and a key part has always been data reconciliation.
I can only talk about the approaches we took, based on constraints around tools available and minimising downtime, and constraints of available space.
In all cases I ended up writing scripts that worked on two levels - summary view and "deep dive". We couldn't find any tools readily available that did what we wanted in a timely enough manner. In fact even the migration tools we found (datapump, sqlloader, golden gate, etc) had limitations, and we hand-coded scripts to handle the bits that we found to be lacking or too slow in the standard tools.
The summary view varied from project to project. It was part functional based (do the accounting figures for transactions match) for the users to verify, and part technical. For smaller tables we could just write simple reports and the diff was straight forward.
For larger tables we wrote technical reports that looked at bands of data (e.g. grouping the PK into 1000s), collected all the column data and produced a checksum, generating a report for each table like:
PK ID Range Start   Checksum
-----------------   -----------
100000              22773377829
200000              38938938282
.
.
Corresponding table pairs from each database were then "diff"d against each other to highlight discrepancies. Any differences that were found could then be looked at in more detail.
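The banded checksum report can be sketched in a few lines of Python. This is only an illustration of the idea, not the scripts used on those projects: the function names, the use of MD5 and the in-memory row lists are all assumptions, and on real volumes the checksums would be computed inside each database.

```python
import hashlib
from collections import defaultdict


def band_checksums(rows, band_size=1000):
    """Return {band_start: checksum} for rows shaped like (pk, col1, col2, ...)."""
    bands = defaultdict(list)
    for row in rows:
        # Group rows by the start of the PK band they fall into.
        band_start = (row[0] // band_size) * band_size
        bands[band_start].append(row)
    result = {}
    for band_start, band_rows in bands.items():
        digest = hashlib.md5()
        # Sort by PK so both sides hash the rows in the same order.
        for row in sorted(band_rows, key=lambda r: r[0]):
            digest.update(repr(row).encode("utf-8"))
        result[band_start] = digest.hexdigest()
    return result


def differing_bands(source_rows, target_rows, band_size=1000):
    """List the band starts whose checksums differ or exist on one side only."""
    src = band_checksums(source_rows, band_size)
    tgt = band_checksums(target_rows, band_size)
    return sorted(b for b in src.keys() | tgt.keys() if src.get(b) != tgt.get(b))


source = [(1, "a"), (2, "b"), (1001, "c")]
target = [(1, "a"), (2, "B"), (1001, "c")]
print(differing_bands(source, target))
```

Only the bands reported here need a row-by-row "deep dive", which is what keeps the comparison cheap.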
The scripts were written in such a way as to allow them to run in parallel, each looking at discrete bands. The band ranges were tunable as well to get the best throughput. This obviously sped things up.
The scripts were shell scripts firing off sqlplus reports, and similar for the source database.
On one project there wasn't enough diskspace to do these reports, so I wrote a Java program that queried the two databases side by side, using blocking queues to fetch and compare rowsets. Being in memory meant this was super fast.
For the "deep dive" we looked at the details for key tables, or for tables that reported a checksum difference.
For the user reports, the users would specify what they wanted to see, and we wrote the reports accordingly.
On the last project, the only discrepancies found were caused by character set conversion issues (people names with accents weren't handled correctly).
On projects where the overall dataset was smaller we extracted the data to XML files and wrote a Java tool to process pairs and report differences.
The SAP/Sybase rs_subcmp tool is pretty powerful and also pretty hard to use. For details see:
https://help.sap.com/viewer/075940003f1549159206fcc89d020515/16.0.3.3/en-US/feb58db1bd1c1014b134ef4efef25563.html?q=rs_subcmp
You have to pass it key field information, but once you do that, it can retry/restart the compare streams after transient differences. Pretty fancy.
rs_subcmp expects to work on Sybase data sources. So to compare against Oracle, you'd probably have to set up one of those Sybase-to-Oracle gateway products ($$$$$).
Could you install the Oracle ODBC drivers and configure them to allow Sybase clients to access Oracle? I'm guessing not (but that's outside the range of my experience).
Note the "-h" option for rs_subcmp. The docs just say it runs a "fast comparison", but what it's actually doing is running queries using the hashbytes() function. Something like:
select keyfield1,keyfield2, hashbytes("Md5",datacol1,datacol2,datacol3)
from mytable
So this sort of query might be good for the "summary view" type of comparison discussed above, provided the output of the Oracle STANDARD_HASH() function matches up with that of the Sybase hashbytes() function (again, outside my experience).
Note, as of ASE 16, there was a bug with the hash() & hashbytes() functions running the Md5 hash option against large varbinary columns where they could use up all procedure cache, potentially crashing the server (CR 811073)

Design approach for feeds

We have feeds running between external systems and our system that bring in investment data. These feeds run every 15 minutes. Every time a feed runs, we update a LastRun timestamp column to indicate that the feed ran successfully. To force a feed to run, we set that feed's LastRun timestamp to NULL.
I am working on a new workflow that will let my users create investments in our own system. Once the investment is created in the original external system, the feed will bring it in, and I will link that investment to the one I created. While linking, I will force-run the feeds related to investments to get other investment-related data.
The issue I am having is: what if a feed is already running when I set the LastRun timestamp to NULL? It will not know that linking has happened, and it will simply update the LastRun timestamp and be on its way. Is there any solution to this?
One thing you can do is create a table with the columns id, status and dt_created, in which you record each new investment created in your system with the status flag set to "no". When the feed is due to run, check the status flag: if it is "no", run the feed, and after it has run update the flag to "yes".
Hope this solves your problem.
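One possible reading of this suggestion, sketched in Python with an in-memory SQLite database. The table name feed_request and the function names are hypothetical, and the detail that closes the race is an assumption added here: the feed marks as done only the requests it saw when it started, so a request that arrives mid-run stays pending and triggers the next run.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical request table: one row per "please re-run the feed" request.
    conn.execute(
        "CREATE TABLE feed_request ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " status TEXT NOT NULL DEFAULT 'no',"
        " dt_created TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )


def request_run(conn):
    """Called by the linking workflow: record that a feed run is needed."""
    cur = conn.execute("INSERT INTO feed_request DEFAULT VALUES")
    conn.commit()
    return cur.lastrowid


def run_feed_if_requested(conn, run_feed):
    """Run the feed if requests are pending; mark only those seen at start."""
    pending = [row[0] for row in
               conn.execute("SELECT id FROM feed_request WHERE status = 'no'")]
    if not pending:
        return False
    run_feed()
    # Requests inserted while run_feed() was executing are not in `pending`,
    # so they keep status 'no' and will force another run.
    conn.executemany("UPDATE feed_request SET status = 'yes' WHERE id = ?",
                     [(request_id,) for request_id in pending])
    conn.commit()
    return True
```

The same pattern should carry over to the existing LastRun column: the feed would need to remember what it saw at start-up rather than overwrite the flag unconditionally when it finishes.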

ETL for processing history records

I am in sort of a DWH project (not quite, but still), and there is an issue we constantly run into for which I was wondering if there is a better solution. It goes as follows.
We receive some big files with records containing all the states a user has been in, like:
UID | State | Date
1 | Active | 20120518
2 | Inactive | 20120517
1 | Inactive | 20120517
...
And we are usually only interested in the latest state of each user. So far so good: with just a little sorting we could get it the way we want. The only problem is that these files are usually big, like 20-60gb, and sorting them is sometimes a pain since the sorting logic usually isn't so straightforward.
What we generally do is load everything into our Oracle database and use intermediary tables and materialized views to get it done. Still, sometimes performance bites us.
20-60gb might be big, but not that big. I mean, there should be a somewhat more specialised way to deal with these records, shouldn't there?
I imagine two basic ways of tackling the issue:
1) Programming outside the DBMS, with scripts and compiled programs. But maybe this is not very flexible unless a larger amount of time is invested in developing something. Also, I might have to busy myself administering the box's resources, which I would rather not worry about.
2) Load everything into the DBMS (Oracle in my case) and use whatever tools it provides to sort and clip the data. This is what we do now, though I am not sure we are using all the tools, or doing it the right way for Oracle 10g.
The question is then:
You have a 60gb file with millions of historical records like the ones above, and your users want a table in the DB with the last state for each user.
How would you do it?
Thanks!
There are two things you can do to speed up the process.
The first thing is to throw compute power at it. If you have Enterprise Edition and lots of cores you will get significant reductions in load time with parallel query.
The other thing is to avoid loading the records you don't want. This is why you mention pre-processing the file. I'm not sure there's much you can do there, unless you have access to a Hadoop cluster to run some map-reduce jobs on your file (well, reduce mainly, the structure you post is about as mapped as can be already).
But there is an alternative: external tables. External tables are tables which have their data in OS files rather than tablespaces. And they can be parallel enabled (providing your file meets certain criteria).
So, you might have an external table like this
CREATE TABLE user_status_external (
    -- note: UID is an Oracle reserved word, so this column may need renaming (e.g. user_id)
    uid NUMBER(6),
    status VARCHAR2(10),
    sdate DATE)
ORGANIZATION EXTERNAL
(TYPE oracle_loader
 DEFAULT DIRECTORY data_dir
 ACCESS PARAMETERS
 (
  RECORDS DELIMITED BY newline
  BADFILE 'usrsts.bad'
  DISCARDFILE 'usrsts.dis'
  LOGFILE 'usrsts.log'
  FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
  (
   uid INTEGER EXTERNAL(6),
   status CHAR(10),
   sdate DATE 'yyyymmdd' )
 )
 LOCATION ('usrsts.dmp')
)
PARALLEL
REJECT LIMIT UNLIMITED;
Note that you need read and write permissions on the DATA_DIR directory object.
Having created the external table you can load only the desired data into your target table with an insert statement:
insert into user_status (uid, status, last_status_date)
select sq.uid
     , sq.status
     , sq.sdate
from (
    select /*+ parallel (et,4) */
           et.uid
         , et.status
         , et.sdate
         , row_number() over (partition by et.uid order by et.sdate desc) rn
    from user_status_external et
) sq
where sq.rn = 1;
Note that as with all performance advice, there are no guarantees. You need to benchmark things in your environment.
Another thing is the use of INSERT: I'm assuming these are all fresh USERIDs, as that is the scenario your post suggests. If you have a more complicated scenario then you probably want to look at MERGE or a different approach altogether.
One last thing: you seem to be assuming this is a common situation, which has some standard approaches. But most data warehouses load all the data they get. They may then filter it for various different uses, data marts, etc. But they almost always maintain a history in the actual warehouse of all the distinct records. So that's why you might not get an industry standard solution.
I'd go with something along the lines of what APC said as a first go. However, I think parallel tables can only load data in parallel if the data is in multiple files, so you might have to cut the files into several. How are the files generated? A 20 - 60GB file is a real pain to deal with - can you get the generation of the files changed so you get X 2GB files for example?
After getting all the records into the database, you might run into problems attempting to sort 60GB of data - it would be worth having a look at the sort stage of the query you are using to extract the latest status. In the past I helped large sorts by hash partitioning the data on one of the fields to be sorted, in this case user_id. Then Oracle only has to do X smaller sorts, each of which can proceed in parallel.
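To illustrate why hash partitioning helps, here is a small Python sketch of the idea outside Oracle. The function names, the use of CRC32 as the hash and the (user_id, status, yyyymmdd) row shape are assumptions for illustration only. Every row for a given user lands in the same bucket, so each bucket can be sorted and reduced independently of the others.

```python
import zlib
from itertools import groupby


def partition_by_user(rows, partitions=4):
    """Split rows into buckets by a stable hash of the user id."""
    buckets = [[] for _ in range(partitions)]
    for row in rows:
        index = zlib.crc32(str(row[0]).encode("utf-8")) % partitions
        buckets[index].append(row)
    return buckets


def latest_per_user(bucket):
    """Sort one bucket by (user id, date) and keep the last row per user."""
    ordered = sorted(bucket, key=lambda r: (r[0], r[2]))
    return [list(group)[-1] for _, group in groupby(ordered, key=lambda r: r[0])]


def latest_status(rows, partitions=4):
    """Run the smaller per-bucket sorts and merge the results."""
    result = []
    for bucket in partition_by_user(rows, partitions):
        result.extend(latest_per_user(bucket))
    return sorted(result)


rows = [(1, "Active", "20120518"), (2, "Inactive", "20120517"), (1, "Inactive", "20120517")]
print(latest_status(rows))
```

In this sketch the buckets are processed one after another. The point is only that no bucket needs data from any other, which is what lets the database run the smaller sorts in parallel.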
So, my thoughts would be:
1) Try to get many smaller files generated instead of 1 big one.
2) Using external tables, see if it is feasible to extract the data you want directly from the external tables.
3) If not, load the entire contents of the files into a hash partitioned table. At this stage make sure you do a direct-path insert (insert /*+ append */) into a table that is set to NOLOGGING, to minimise undo and redo generation. Note that NOLOGGING is a table attribute rather than a hint, and if your database is in FORCE LOGGING mode it will have no effect.
4) Run the select on the staged data to extract only the rows you care about and then trash the staged data.
The nologging option is probably critical to you getting good performance - to load 60GB of data, you are going to generate at least 60GB of redo logs, so if that can be avoided, all the better. You would probably need to have a chat with your DBA about that!
Assuming you have lots of CPU available, it may also make sense to compress the data as you bulk load it into the staging table. Compression may well halve the size of your data on disk if it has repeating fields - the disk IO saved when writing it usually more than makes up for any extra CPU consumed when loading it.
I may be oversimplifying the problem, but why not something like:
create materialized view my_view
  tablespace my_tablespace
  nologging
  build immediate
  refresh complete on demand
  with primary key
as
select uid, state, date from
(
  select /*+ parallel (t,4) */ uid, state, date, row_number() over (partition by uid order by date desc) rnum
  from my_table t
)
where rnum = 1;
Then refresh fully when you need to.
Edit: And don't forget to rebuild stats and probably throw a unique index on uid.
I would write a program to iterate over each record and retain only those which are more recent than record previously seen. At the end, insert the data into the database.
How practical that is would depend on how many users we're talking about - you could end up having to think carefully about your intermediate storage.
In general, this becomes (in pseudo-code):
foreach row in file
    if savedrow for this user is null
        save row
    else
        if row is more desirable than savedrow
            save row
        end
    end
end
send saved rows to database
The point is, you need to define how one row is considered to be more desirable than another. In the simple case, for a given user, a row is more desirable when its date is later than that of the last row we saved. At the end, you'd have a list of rows, one per user, each of which has the most recent date you saw.
You could generalise the script or program so that the framework is separate from the code that understands each data file.
It'll still take a while, mind :-)
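A minimal Python sketch of that loop, assuming the pipe-delimited layout from the question and the simple "later date wins" rule. The function name latest_states is hypothetical, and the dictionary of saved rows has to fit in memory, which depends on how many distinct users there are.

```python
import csv


def latest_states(lines, delimiter="|"):
    """Stream over the records once, keeping the most recent row per user."""
    saved = {}  # uid -> (state, date)
    for fields in csv.reader(lines, delimiter=delimiter):
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        uid, state, date = (field.strip() for field in fields)
        if not uid.isdigit():
            continue  # skip the header line
        current = saved.get(uid)
        # yyyymmdd strings sort chronologically, so plain comparison works.
        if current is None or date > current[1]:
            saved[uid] = (state, date)
    return saved


sample = [
    "UID | State | Date",
    "1 | Active | 20120518",
    "2 | Inactive | 20120517",
    "1 | Inactive | 20120517",
]
print(latest_states(sample))
```

In practice `lines` would be an open file object, so the 60gb file is never held in memory, only one saved row per user.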

How to achieve test isolation testing Oracle PL/SQL?

In Java projects, JUnit tests do a setup, test, teardown. Even when mocking out a real db using an in-memory db, you usually roll back the transaction or drop the db from memory and recreate it between each test. This gives you test isolation since one test does not leave artifacts in an environment that could affect the next test. Each test starts out in a known state and cannot bleed over into another one.
Now I've got an Oracle db build that creates 1100 tables and 400K of code - a lot of pl/sql packages. I'd like to not only test the db install (full - create from scratch, partial - upgrade from a previous db, etc) and make sure all the tables, and other objects are in the state I expect after the install, but ALSO run tests on the pl/sql (I'm not sure how I'd do the former exactly - suggestions?).
I'd like this all to run from Jenkins for CI so that development errors are caught via regression testing.
Firstly, I have to use an enterprise version instead of XE because XE doesn't support Java SPs, and because of a dependency on Oracle Web Flow. Even if I eliminate those dependencies, the build typically takes 1.5 hours just to load (full build).
So how do you achieve test isolation in this environment? Use transactions for each test and roll them back? OK, what about those pl/sql procedures that have commits in them?
I thought about backup and recovery to reset the db after each test, or recreating the entire db between each test (too drastic). Both are impractical since it takes over an hour to install it. Doing so for each test is overkill and insane.
Is there a way to draw a line in the sand in the db schema(s) and then roll it back to that point in time? Sorta like a big 'undo' feature. Something besides expdp/impdp or rman. Perhaps the whole approach is off. Suggestions? How have others done this?
For CI or a small production upgrade window, the whole test suite has to run within a reasonable time (30 mins would be ideal).
Are there products that might help achieve this 'undo' ability?
Kevin McCormack published an article on The Server Labs Blog about continuous integration testing for PL/SQL using Maven and Hudson. Check it out. The key ingredient for the testing component is Steven Feuerstein's utPlsql framework, which is an implementation of JUnit's concepts in PL/SQL.
The need to reset our test fixtures is one of the big issues with PL/SQL testing. One thing which helps is to observe good practice and avoid commits in stored procedures: transactional control should be restricted to only the outermost parts of the call stack. For those programs which simply must issue commits (perhaps implicitly because they execute DDL) there is always a test fixture which issues DELETE statements. Handling relational integrity makes those quite tricky to code.
An alternative approach is to use Data Pump. You appear to discard impdp but Oracle also provides a PL/SQL API for it, DBMS_DATAPUMP. I suggest it here because it provides the ability to trash any existing data prior to running an import. So we can have an exported data set as our test fixture; to execute a SetUp is a matter of running a Data Pump job. You don't need to do anything in the TearDown, because that tidying up happens at the start of the SetUp.
In Oracle you can use Flashback Technology to restore the server to a point back in time.
http://download.oracle.com/docs/cd/B28359_01/backup.111/b28270/rcmflash.htm
1.5 hours seems like a very long time for 1100 tables and 400K of code. I obviously don't know the details of your environment, but based on my experience I bet you can shrink that to 5 to 10 minutes. Here are the two main installation script problems I've seen with Oracle:
1. Operations are broken into tiny pieces
The more steps you have the more overhead there will be. For example, you want to consolidate code like this as much as possible:
Replace:
create table x(a number, b number, c number);
alter table x modify a not null;
alter table x modify b not null;
alter table x modify c not null;
With:
create table x(a number not null, b number not null, c number not null);
Replace:
insert into x values (1,2,3);
insert into x values (4,5,6);
insert into x values (7,8,9);
With:
insert into x
select 1,2,3 from dual union all
select 4,5,6 from dual union all
select 7,8,9 from dual;
This is especially true if you run your script and your database in different locations. That tiny network lag starts to matter when you multiply it by 10,000. Every Oracle SQL tool I know of will send one command at a time.
2. Developers have to share a database
This is more of a long-term process solution than a technical fix, but you have to start sometime. Most places that use Oracle only have it installed on a few servers. Then it becomes a scarce resource that must be carefully managed. People fight over it, roles are unclear, and things don't get fixed.
If that's your environment, stop the madness and install Oracle on every laptop right now. Spend a few hundred dollars and give everyone Personal Edition (which has the same features as Enterprise Edition). Give everyone the tools they need and continuous improvement will eventually fix your problems.
Also, for a schema "undo", you may want to look into transportable tablespaces. I've never used it, but supposedly it's a much faster way of installing a system - just copy and paste files instead of importing. Similarly, perhaps some type of virtualization can help - create a snapshot of the OS and database.
Although Oracle Flashback is an Enterprise Edition feature, the technology it is based on, Oracle LogMiner, is available in all editions:
http://docs.oracle.com/cd/B28359_01/server.111/b28319/logminer.htm#i1016535
I would be interested to know whether anybody has used this to provide test isolation for functional tests i.e. querying v$LOGMNR_CONTENTS to get a list of UNDO statements from a point of time corresponding to the beginning of the test.
The database needs to be in ARCHIVELOG mode, and in the JUnit test case a method annotated with
@Startup
would call DBMS_LOGMNR.START_LOGMNR. The test would run, and then a method annotated with
@Teardown
would query v$LOGMNR_CONTENTS to find the list of UNDO statements. These would then be executed via JDBC. In fact the querying and execution of the UNDO statements could be extracted into a PL/SQL stored procedure. The order in which the statements are executed would have to be considered.
I think this has the benefit of allowing the transaction to commit, which is where an awful lot of bugs can creep in, i.e. referential integrity, primary key violations etc.

Oracle Data Versioning/Partitioning Strategies/Best Practices

Not sure if the subject entirely conveys what I'm trying to achieve, but let me explain:
We are building an application that uses Oracle as storage backend. Each year, last year's dataset will be "Archived", and a new instance created and populated from scratch.
What are the options to do this within the same schema?
Keep version information on a record level (we presume this will be too slow for our use-case).
Keep version information on a table level, so for each new version, we will re-create all the tables but with a new version prefix. (We like this solution, since we can do it all in code).
?
Is there not something like partitions/personalities/namespaces available that will allow us to achieve this in Oracle?
My oracle experience is rather limited, any assistance will be greatly appreciated!
The RDBMS conceptual model is not very good at maintaining temporal versions of data. So it is not just Oracle which is lacking in this regard.
I am unclear why you think keeping version information at the record level will be too slow. Too slow in creating a new version? Or too slow where it comes to data retrieval during regular operations?
Here is how you could do it. Given a table CUSTOMERS with a business key of CUSTOMER_REF I might normally build it like this (I am using abbreviated syntax rather than best practice for reasons of space):
create table customers
( id number not null primary key
, customer_ref number not null unique
, name varchar2(30) not null )
/
The versioned equivalent would look like this:
create table customers
( id number not null primary key
, customer_ref number not null
, version_number number
, name varchar2(30) not null
, constraint whatever unique (customer_ref, version_number) )
/
This works by keeping VERSION_NUMBER null for the current version, and only populating it at archival time. Any lookup is going to have to include "and version_number is null". This will be a bit of a pain and you may need to include the column in any additional indexes you build.
Obviously maintaining all versions of the records in the same table will increase the size of your tables, which might have an effect on performance. Oracle's Partitioning option can definitely help here. It also would give you a neat way of creating next year's set of data. However, it is a chargeable extra on top of the Enterprise License, so it is an expensive option.
The most time consuming aspect of this will be managing foreign key relationships in the new version of the table. Presuming you choose to use synthetic primary keys, the archival process will have to generate new IDs and then painstakingly cascade them to their dependent records in the new versions of referencing foreign keys.
Thinking about this makes discrete tables for each version seem very attractive. For ease of use I would keep the current version un-prefixed, so that archiving becomes a process simply of
create table customers_n as select * from customers;
You might want to avoid downtime while creating the versioned tables. In that case you could use materialized views to capture the tables' state during the run-up to the archival switchover. When the clock strikes twelve you can switch off the refresh. (caveat: this is thinking on the fly, I have never done anything like this so try before you buy.)
One pertinent advantage of multiple tables (and Partitioning) is that you can move the archived records to a READ ONLY tablespace. This not only preserves them from unwanted change, it also means you can exclude them from subsequent backups.
edit
I notice you have commented that the archived data can occasionally be amended. In that case moving it to READ ONLY tablespaces is not a go-er.
The only thing I will add to what APC said is regarding your asking for "namespaces".
A namespace in Oracle is a schema, whereby you can have the same object name(s) in each schema.
Of course this all depends on how your app must access multiple versions, but I would lean towards a different schema for each year before I would use some sort of naming convention to maintain versions of tables in the same schema. The reason is that eventually you will have nightmares. At least with different schemas, all DDL can be the same, all references to objects will be the same, and tools like ER modellers and query tools will work within the context of that schema. Data models change, so at some point you may need to run some compare tools, and if all your tables are named funky with some sort of version postfix, that won't work well.
And a schema can be copied / moved with export or Data Pump quickly using the fromuser/touser or remap_schema options, so you won't need much code, except to do any cleanup of last year's data out of the new version.
I find schemas are very useful as "containers" and most apps I host only have schema level privileges, so I'm guaranteed the app can be easily and quickly moved from instance to instance, or multiple copies of the app can be hosted side-by-side on the same instance.
Might the schema change between years? For example, in 2010 you have fifteen columns but in 2011 you add a sixteenth.
If so, will the same application work on both 2010 and 2011 data?
If the schema is static, I'd go for table with a 'YEAR' column and use VPD/RLS/FGAC to apply a YEAR = '2010' predicate.
I'd only worry about partitioning if performance was a problem.
1) Interval partition it by year and some date field in the row.
2) Add it at the end of each table and populate it with a sequence and trigger.
3) Then partition by interval year on this col.
