Can we do table partition in SQL Server 2012 Standard Edition - view

I have one table with 7,515,966 rows, and this table depends on other tables. We created a view over it for generating SSRS reports.
Now the view has grown so large that the reports have performance issues.
We have started archiving data from the large table, but I can't work out which methodology to use. Please guide us.
Thank you...

Table partitioning in SQL Server 2012 is only available in Enterprise Edition. See Microsoft's edition feature comparison for details on what's available in each edition.
7 million rows is not a lot of rows for SQL Server; we routinely deal with billions of rows. However, as your row counts get into the tens of millions, you'll probably expose various performance gaps in your system. E.g. are your queries efficiently written so they only touch the rows they need, do you have the right indexes, are statistics up to date, is tempdb optimized, etc.?
One common weak link in 9 out of 10 databases (regardless of make) I've worked with is the storage subsystem. Is yours able to keep up with the large data set you need to work with? Storage for databases should be designed and configured based on throughput, concurrency and latency requirements first. Space is generally the last thing to worry about once the other requirements, including HA/DR, are met.
If you have deficiencies in your current system, you can pay for the expensive enterprise edition and implement table partitioning but you will likely still suffer performance problems soon after, if not immediately.


Would you recommend using Hadoop/HBASE?

We have a SQL Server 2008 database, and one of its tables, say table A, has the following characteristics:
Every day we get several heterogeneous feeds from other systems with numerical data.
Feeds are staged elsewhere, converted to a format compliant with A's schema.
Inserted into A.
Schema looks like:
<BusinessDate> <TypeId> <InsertDate> <AxisX> <AxisY> <Value>
The table has a variable number of rows. Essentially we have to purge it at the weekends otherwise the size affects performance. So size ranges from 3m-15m rows during the week. Due to some new requirements we expect this number to be increased by 10m by the end of 2012. So we would be talking about 10m-25m rows.
Now in addition
Data in A never change. The middle tier may use A's data but it will be a read only operation. But typically the middle tier doesn't even care about the contents. It typically (not always but 80% of cases) runs stored procs to generate reports and delivers the reports in other systems.
Clients of this table would typically want to do long sequential reads for one business date and type, i.e. "get me all type 1 values for today".
Clients will want to join this table with 3-5 more tables and then deliver reports to other systems.
The above assumptions are not necessarily valid for all tables with which A is joined. For example we usually join A with a table B and do a computation like B.value*A.value. B.value is a volatile column.
A's characteristics do sound very much like what HBase and other column oriented schemas can offer.
However some of the joins are with volatile data.
Would you recommend migrating A to an HBase schema?
Also, if we were to move A, I assume we would also have to migrate B and other dependent tables which (unlike A) are used in several other places from the middle tier. Wouldn't this complicate things a lot?
25 million rows doesn't sound big enough to justify using HBase, although the usage pattern fits. You need a name node, a job tracker, a master and then your region servers, so you'll be needing a minimum of maybe 5 nodes to run HBase in any reasonable way. Your rows are so small I'm guessing it's maybe 10 GB of data, so storing this across 5 servers seems like overkill.
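The size guess above can be sanity-checked with some back-of-the-envelope arithmetic. This is only a rough sketch: the column widths and the overhead factor are assumptions (the question does not give data types), so treat the result as an order-of-magnitude figure, not a measurement.

```python
# Assumed column widths in bytes (hypothetical; the question gives no data types).
COLUMN_BYTES = {
    "BusinessDate": 8,
    "TypeId": 4,
    "InsertDate": 8,
    "AxisX": 8,
    "AxisY": 8,
    "Value": 8,
}


def estimate_gb(rows, overhead_factor=2.0):
    """Rough table size in GB: raw row payload times an assumed overhead factor.

    overhead_factor is a guess that stands in for row headers, page/block
    free space and indexes.
    """
    row_bytes = sum(COLUMN_BYTES.values())
    return rows * row_bytes * overhead_factor / 1e9


if __name__ == "__main__":
    for rows in (3_000_000, 15_000_000, 25_000_000):
        print(f"{rows:>11,} rows ~ {estimate_gb(rows):.2f} GB")
```

Under these assumptions even the 25 million row case stays in the low single-digit gigabytes, which fits comfortably on one relational server.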
If you do go this route (perhaps you want to store more than a week's data at once) there are ways to integrate HBase with relational DBs. Hive, for example, provides ODBC/JDBC connectivity and can query HBase. Oracle and Teradata both provide integration between their relational DB software and non-relational storage. I know Microsoft has recently announced that they are dropping Dryad in favor of integrating with Hadoop, but I am not certain how far along that process is wrt SQL Server. And if all you need is "get a list of IDs to use in my SQL query" you can of course write something yourself easily enough.
I think HBase is very exciting, and there may be things you haven't mentioned which would drive you towards it (e.g. high availability). But my gut says you can probably scale out your relational db much more cheaply than switching to HBase.

Performance Implications of Using Oracle DBMS_WM.EnableVersioning

The command to enable versioning (part of what Oracle calls Workspace Management) in Oracle (DBMS_WM.EnableVersioning) creates non-materialized views, which cannot be indexed. Will this kill performance, or will the indexes for the _AUX, _LT, and _LCK tables be used when the views are queried?
Are there significant performance issues in addition to indexing when enabling versioning?
I am using Oracle 11g.
As with most things, it depends.
Do you have queries today that need to do table scans to fetch all their data? Or is everything going to go through an index?
What is the use case? Are you using Workspace Manager to support long-running transactions? Or to maintain history data in a single LIVE workspace?
How frequently do rows change? How many versions of a row are you planning to keep?
If you have existing queries that will do table scans, the table is rebuilt every night, and you plan on keeping history data forever, you're likely going to have major performance issues. If all your queries use indexes to access data, rows change infrequently, and you just intend to retain a few versions of history, the indexes on the underlying tables should be sufficient.
We've used Workspace Manager to maintain history on relatively slowly changing tables forever as well as relatively fast changing tables for a month. And we've used it to maintain discrete savepoints across tables in a few applications so that users can permanently save the state of application data at interesting points in time. In general, we've been satisfied with performance though complex queries will occasionally go off into the weeds when the optimizer gets confused.
Since you're on 11g, you may also consider Total Recall. It's an extra-cost option on top of the enterprise license, but it provides a much more efficient architecture for tracking changes to data over time, assuming that you intend to store all changes for a fixed period of time. On the other hand, you're more limited in the DDL you can issue without causing history to be discarded, which tends to be a rather serious constraint in the applications I've worked on.

Does Oracle 11g automatically index fields frequently used for full table scans?

I have an app using an Oracle 11g database. I have a fairly large table (~50k rows) which I query thus:
SELECT omg, ponies FROM table WHERE x = 4
Field x was not indexed, I discovered. This query happens a lot, but the thing is that the performance wasn't too bad. Adding an index on x did make the queries approximately twice as fast, which is far less than I expected. On, say, MySQL, it would've made the query ten times faster, at the very least. (Edit: I did test this on MySQL, and there saw a huge difference.)
I'm suspecting Oracle adds some kind of automatic index when it detects that I query a non-indexed field often. Am I correct? I can find nothing even implying this in the docs.
As has already been indicated, Oracle 11g does NOT dynamically build indexes based on prior experience. It is certainly possible, and indeed happens often, that adding an index under the right conditions will produce the order of magnitude improvement you note.
But as has also already been noted, 50K (seemingly short?) rows is nothing to Oracle. The Oracle database in fact has a great deal of intelligence that allows it to scan data without indexes most efficiently. Every new release of the Oracle RDBMS gets better at moving large amounts of data. I would suggest to you that the reason Oracle was so close to its "best" timing even without the index as compared to MySQL is that Oracle is just a more intelligent database under the covers.
However, the Oracle RDBMS does have many features that touch upon the subject area you have opened. For example:
10g introduced a feature called AUTOMATIC SQL TUNING which is exposed via a GUI known as the SQL TUNING ADVISOR. This feature is intended to analyze queries on its own, in depth, and includes the ability to do WHAT-IF analysis of alternative query plans. This includes simulation of indexes which do not actually exist. However, this would not explain any performance differences you have seen, because the feature needs to be turned on and it does not actually build any indexes; it only makes recommendations to the DBA, such as indexes to create.
11g includes AUTOMATIC STATISTICS GATHERING which when enabled will automatically collect statistics on database objects as it deems necessary based on activity on those objects.
Thus the Oracle RDBMS is doing what you have suggested, dynamically altering its environment on its own based on its experience with your workload over time in order to improve performance. Creating indexes on the fly is just not one of the things it does yet. As an aside, this has been hinted at by Oracle in private several times, so I figure it is in the works for some future release.
Regarding the MySQL issue, the storage engine you use can make a difference.
"MyISAM relies on the operating system for caching reads and writes to the data rows while InnoDB does this within the engine itself"
Oracle will cache the table/data rows, so it won't need to hit the disk. Depending on the OS and hardware, there's a chance that MySQL MyISAM had to physically read the data off the disk each time.
~50K rows, depending greatly on how big each row is, could conceivably be stored in under 1000 blocks, which could be quickly read into the buffer cache by a full table scan (FTS) in under 50 multi-block reads.
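The block arithmetic behind that estimate can be worked through as follows. This is a sketch under assumed figures: an 8 KB block, 10% free space reserved per block, a 100-byte average row and 16 blocks per multi-block read are illustrative values, not numbers from the question, and block header overhead is ignored.

```python
import math


def full_scan_reads(rows, avg_row_bytes=100, block_bytes=8192,
                    pct_free=10, blocks_per_read=16):
    """Estimate blocks occupied and multi-block reads needed for a full scan.

    All defaults are assumptions for illustration; block header overhead
    is ignored.
    """
    usable_bytes = block_bytes * (100 - pct_free) // 100
    rows_per_block = usable_bytes // avg_row_bytes
    blocks = math.ceil(rows / rows_per_block)
    reads = math.ceil(blocks / blocks_per_read)
    return blocks, reads


blocks, reads = full_scan_reads(50_000)
print(f"{blocks} blocks, {reads} multi-block reads")
```

With these inputs the table fits in well under 1000 blocks and is scanned in fewer than 50 reads. Wider rows or a smaller read size push both numbers up.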
Adding appropriate index(es) will allow queries on the table to scale smoothly as the data volume and/or access frequency goes up.
"Adding an index on x did make the
queries approximately twice as fast,
which is far less than I expected. On,
say, MySQL, it would've made the query
ten times faster, at the very least."
How many distinct values of X are there? Are they clustered in one part of the table or spread evenly throughout it?
Indexes are not some voodoo device: they must obey the laws of physics.
"Duplicates could appear, but as it
is, there are none."
If that column has neither a unique constraint nor a unique index, the optimizer will choose an execution path on the basis that there could be duplicate values in that column. This is the value of declaring the data model as accurately as possible: the provision of metadata to the optimizer. Keeping the statistics up to date is also very useful in this regard.
You should have a look at the estimated execution plan for your query, before and after the index has been created. (Also, make sure that the statistics are up-to-date on your table.) That will tell you what exactly is happening and why performance is what it is.
50k rows is not that big of a table, so I wouldn't be surprised if the performance was decent even without the index. Thus adding an index to the equation can't really bring much improvement in query execution speed.
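The before-and-after plan comparison suggested above can be tried out in miniature. The sketch below uses SQLite (via Python's built-in sqlite3 module) purely as a stand-in, because it needs no server; it is not Oracle, whose equivalent is EXPLAIN PLAN, and the plan text differs between engines. The table name t and the index name idx_t_x are made up for the example.

```python
import sqlite3


def plan_detail(conn, sql, params=()):
    """Return the human-readable steps of SQLite's query plan for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The plan description is the last column of each row.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (omg INTEGER, ponies TEXT, x INTEGER)")
conn.executemany(
    "INSERT INTO t (omg, ponies, x) VALUES (?, ?, ?)",
    ((i, f"pony{i}", i % 1000) for i in range(50_000)),
)

query = "SELECT omg, ponies FROM t WHERE x = ?"

plan_before = plan_detail(conn, query, (4,))  # no index on x yet
conn.execute("CREATE INDEX idx_t_x ON t (x)")
conn.execute("ANALYZE")                       # refresh optimizer statistics
plan_after = plan_detail(conn, query, (4,))

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan should report a scan of the whole table and the second a search using the new index. The exact wording depends on the SQLite version.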

Oracle: Difference in execution plans between databases

I am comparing queries on my development and production databases.
They are both Oracle 9i, but almost every single query has a completely different execution plan depending on the database.
All tables/indexes are the same, but the dev database has about 1/10th the rows for each table.
On production, the execution plan picked for most queries is different from development, and the cost is sometimes 1000x higher. Queries on production also seem not to be using the correct indexes in some cases (full table access).
I have recently run dbms_utility.analyze_schema on both databases as well, in the hope that the CBO would figure something out.
Is there some other underlying oracle configuration that could be causing this?
I am mostly a developer, so this kind of DBA analysis is fairly confusing at first.
1) The first thing I would check is whether the database parameters are equivalent across Prod and Dev. If one of the parameters that affects the decisions of the Cost Based Optimizer is different, then all bets are off. You can see the parameters in the v$parameter view.
2) Having up to date object statistics is great but keep in mind the large difference you pointed out - Dev has 10% of the rows of Prod. This rowcount is factored into how the CBO decides the best way to execute a query. Given the large difference in row counts I would not expect plans to be the same.
Depending on the circumstances, the optimizer may choose to Full Table Scan a table with 20,000 rows (Dev) where it may decide an index is lower cost on the table that has 200,000 rows (Prod). (Numbers are just for demonstration; the CBO uses costing algorithms for determining what to FTS and what to index scan, not absolute values.)
3) System statistics also factor into the explain plans. This is a set of statistics that represent CPU and disk i/o characteristics. If your hardware on both systems is different then I would expect your System Statistics to be different and this can affect the plans. Some good discussion from Jonathan Lewis here
You can view system stats via the sys.aux_stats$ view.
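The row-count effect from point 2 can be illustrated with a deliberately simplified cost comparison. This is a toy model, not Oracle's actual costing: the formulas, the 50 rows per block, the multi-block read size of 8 and the index height of 2 are all assumptions, and it supposes a query that returns the same 300 rows on both systems.

```python
import math


def full_scan_cost(table_rows, rows_per_block=50, multiblock_read_count=8):
    """Toy cost: number of multi-block reads needed to scan every block."""
    blocks = math.ceil(table_rows / rows_per_block)
    return math.ceil(blocks / multiblock_read_count)


def index_cost(matching_rows, index_height=2):
    """Toy cost: walk the index, then one single-block read per matching row."""
    return index_height + matching_rows


def cheaper_path(table_rows, matching_rows):
    """Pick the lower-cost access path under the toy model above."""
    fts = full_scan_cost(table_rows)
    idx = index_cost(matching_rows)
    path = "FULL TABLE SCAN" if fts <= idx else "INDEX RANGE SCAN"
    return path, fts, idx


for name, rows in (("Dev", 20_000), ("Prod", 200_000)):
    path, fts, idx = cheaper_path(rows, matching_rows=300)
    print(f"{name}: full scan cost={fts}, index cost={idx} -> {path}")
```

The same query flips from a full scan on the small table to an index scan on the large one, with no change to the schema or the SQL. That is why different plans on Dev and Prod are expected rather than alarming.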
Now I'm not sure why different plans are a bad thing for you. If stats are up to date and parameters are set correctly, you should be getting decent performance from either system no matter what the difference in size.
That said, it is possible to export statistics from your Prod system and load them into your Dev system. This makes your Prod statistics available to your Dev database.
Check the Oracle documentation for the DBMS_STATS package, specifically the EXPORT_SCHEMA_STATS, EXPORT_SYSTEM_STATS, IMPORT_SCHEMA_STATS, IMPORT_SYSTEM_STATS procedures. Keep in mind you may need to disable the 10pm nightly statistics jobs on 10g/11g... or you can investigate Locking statistics after import so they are not updated by nightly jobs.

What makes Oracle more scalable?

Oracle seems to have a reputation for being more scalable than other RDBMSes. After working with it a bit, I can say that it's more complex than other RDBMSes, but I haven't really seen anything that makes it more scalable than other RDBMSes. But then again, I haven't really worked on it in a whole lot of depth.
What features does Oracle have that are more scalable?
Oracle's RAC architecture is what makes it scalable where it can load balance across nodes and parallel queries can be split up and pushed to other nodes for processing.
Some of the tricks like loading blocks from another node's buffer cache instead of going to disc make performance a lot more scalable.
Also, the maintainability of RAC with rolling upgrades help make the operation of a large system more sane.
There is also a different aspect of scalability: storage scalability. ASM makes increasing the storage capacity very straightforward. A well-designed ASM-based solution should scale past hundreds of terabytes without needing to do anything very special.
Whether these make Oracle more scalable than other RDBMSs, I don't know. But I think I would feel less happy about trying to scale up a non-Oracle database.
Cursor sharing is (or was) a big advantage over the competition.
Basically, the same query plan is used for matching queries. An application will have a standard set of queries it issues (e.g. get the orders for this customer id). The simple way is to treat every query individually, so if you see 'SELECT * FROM ORDERS WHERE CUSTOMER_ID = :b1', you look at whether table ORDERS has an index on CUSTOMER_ID etc. As a result, you can spend as much work looking up metadata to get a query plan as actually retrieving the data. With simple keyed lookups, a query plan is easy. Complex queries with multiple tables joined on skewed columns are harder.
Oracle has a cache of query plans, and older/less used plans are aged out as new ones are required.
If you don't cache query plans, there's a limit to how smart you can make your optimizer as the more smarts you code into it, the bigger impact you have on each query processed. Caching queries means you only incur that overhead the first time you see the query.
The 'downside' is that for cursor sharing to be effective you need to use bind variables. Some programmers don't realise that and write code that doesn't get shared and then complain that Oracle isn't as fast as MySQL.
Another advantage of Oracle is the UNDO log. As a change is done, the 'old version' of the data is written to an undo log. Other databases keep old versions of the record in the same place as the record. This requires VACUUM style cleanup operations or you bump into space and organisation issues. This is most relevant in databases with high update or delete activity.
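To make the bind-variable point concrete, here is a toy model of a plan cache keyed by the exact SQL text. It is only a sketch of the idea: the PlanCache class, its capacity and the "hard parse" counter are invented for illustration and say nothing about how Oracle implements its shared cursor cache internally.

```python
from collections import OrderedDict


class PlanCache:
    """Toy LRU cache of query plans keyed by the exact SQL text."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self.plans = OrderedDict()
        self.hard_parses = 0

    def get_plan(self, sql_text):
        if sql_text in self.plans:
            self.plans.move_to_end(sql_text)   # mark as recently used
            return self.plans[sql_text]
        self.hard_parses += 1                  # expensive optimization step
        plan = f"plan for: {sql_text}"
        self.plans[sql_text] = plan
        if len(self.plans) > self.capacity:
            self.plans.popitem(last=False)     # age out the least recently used
        return plan


def run_with_literals(cache, customer_ids):
    """Each customer id produces a different SQL text, so nothing is shared."""
    for cid in customer_ids:
        cache.get_plan(f"SELECT * FROM ORDERS WHERE CUSTOMER_ID = {cid}")


def run_with_binds(cache, customer_ids):
    """The SQL text is identical every time; only the bound value would differ."""
    for _ in customer_ids:
        cache.get_plan("SELECT * FROM ORDERS WHERE CUSTOMER_ID = :b1")


literal_cache, bind_cache = PlanCache(), PlanCache()
ids = range(1000)
run_with_literals(literal_cache, ids)
run_with_binds(bind_cache, ids)
print(literal_cache.hard_parses, bind_cache.hard_parses)  # prints: 1000 1
```

With literals every statement is new to the cache and also evicts plans other sessions might have reused. With a bind variable the optimizer's work is paid for once.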
Also Oracle doesn't have a central lock registry. A lock bit is stored on each individual data record. SELECT doesn't take a lock. In databases where SELECT locks, you could have multiple users reading data and locking each other or preventing updates, introducing scalability limits. Other databases would lock a record when a SELECT was done to ensure that no-one else could change that data item (so it would be consistent if the same query or transaction looked at the table again). Oracle uses UNDO for its read consistency model (ie looking up the data as it appeared at a specific point in time).
Tom Kyte's "Expert Oracle Database Architecture" from Apress does a good job of describing Oracle's architecture, with some comparisons with other RDBMSs. Worth reading.
