Background: two application nodes, three Oracle nodes.
I have a sequence whose DDL looks like this:
CREATE SEQUENCE "XXX_SEQ" MINVALUE 10 MAXVALUE 99 INCREMENT BY 1 START WITH 10 NOCACHE ORDER CYCLE ;
I use this query to generate an ID that serves as my primary key:
SELECT 'FWK2-' || TO_CHAR(SYS_EXTRACT_UTC(CURRENT_TIMESTAMP), 'YYYYMMDDHH24MISSFF2') || '-' || XXX_SEQ.nextval AS MSG_REQ_ID FROM DUAL;
But when IDs are generated concurrently, duplicates appear: within the same hundredth of a second the same ID is produced twice.
Here is part of the log:
2019-09-03 04:40:17,501 FWK2-2019090304401699-43
2019-09-03 04:40:17,010 FWK2-2019090304401699-43
And I checked: within that hundredth-of-a-second window there were not enough messages to make the sequence cycle (it holds only the 90 values from 10 to 99) and hit a duplicate.
I suspect the DDL is the issue.
Does anyone know how to fix this?
Actually, in my PROD environment I have three applications that use this method to generate IDs, and each application has two nodes. Call the three applications A, B, and C.
A and C each have only one place that generates IDs.
B has more than three places that generate IDs.
One more strange thing: since my sequence is declared ORDER, why are the IDs in the database not in order?
FWK2-2019090304394159-34
FWK2-2019090304394280-37
FWK2-2019090304394298-35
FWK2-2019090304394311-36
FWK2-2019090304394354-40
FWK2-2019090304394359-38
Update
Even with the sequence set to NOCACHE and ORDER, testing under JMeter with 3 APIs and 300 threads per API,
the IDs are still not in order, like this:
FWK2-2019090403025046-67
FWK2-2019090403025046-68
FWK2-2019090403025050-69
FWK2-2019090403025053-10
FWK2-2019090403025053-11
FWK2-2019090403025053-31
FWK2-2019090403025053-39
FWK2-2019090403025053-40
FWK2-2019090403025053-41
FWK2-2019090403025053-42
FWK2-2019090403025053-46
FWK2-2019090403025053-47
FWK2-2019090403025053-70
FWK2-2019090403025053-71
FWK2-2019090403025053-72
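(An aside on a possible mitigation, sketched under assumptions rather than taken from the setup above: with only the 90 values 10 through 99, any burst of more than 90 IDs inside one FF2 tick must repeat a suffix. A wider, cached sequence such as the hypothetical DDL below cannot cycle within a single tick, at the cost of strict ordering:)
CREATE SEQUENCE "XXX_SEQ" MINVALUE 100000 MAXVALUE 999999 INCREMENT BY 1 START WITH 100000 CACHE 100 CYCLE;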
Related
I want to get the row ID or record ID of the last inserted record in a table in Trafodion.
Example:
1 | John
2 | Michael
When executing an INSERT statement, I want it to return the created ID, i.e. 3.
Could anyone tell me how to do that in Trafodion, or is it not possible?
Are you using a sequence generator to generate unique ids for this table? Something like this:
create table idcol (a largeint generated always as identity not null,
                    b int,
                    primary key(a desc));
Either way, with or without a sequence generator, you could get the highest key with this statement:
select max(a) from idcol;
The problem is that this statement could be very inefficient. Trafodion has a built-in optimization to read the min of a key column, but it doesn't use the same optimization for the max value, because HBase didn't have a reverse scan until recently. We should make use of the reverse scan; please feel free to file a JIRA. To make this more efficient with the current code, I added a DESC to the primary key declaration. With a descending key, getting the max key will be very fast:
explain select max(a) from idcol;
However, having the data grow from higher to lower values might cause issues in HBase; I'm not sure whether this is a problem in practice.
Here is yet another solution: Use the Trafodion feature that allows you to select the inserted data, showing you the inserted values right away:
select * from (insert into idcol(b) values (11),(12),(13)) t(a,b);
                   A           B
-------------------- -----------
                   1          11
                   2          12
                   3          13
--- 3 row(s) selected.
I have a table with a huge amount of data. It is partitioned by week. The table contains a column named group, and each group can have records for multiple weeks. For example:
gr  week  data
 1     1    10
 1     2    13
 1     3     5
 .     .     6
 2     2    14
 2     3    55
 .     .     .
I want to create a table based on one group. The creation currently takes ~23 minutes on Oracle 11g. That is a long time, since I have to repeat the process for each group and I have many groups. What is the fastest way to create these tables?
Create all the tables first, then use INSERT ALL WHEN.
http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_9014.htm#i2145081
The data will be read only once.
insert all
  when gr=1 then
    into tab1 values (gr, week, data)
  when gr=2 then
    into tab2 values (gr, week, data)
  when gr=3 then
    into tab3 values (gr, week, data)
select *
from big_table;
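The target tables must exist before the INSERT ALL can run. A minimal sketch for pre-creating them empty (same hypothetical names as above):
CREATE TABLE tab1 AS SELECT * FROM big_table WHERE 1 = 0;
CREATE TABLE tab2 AS SELECT * FROM big_table WHERE 1 = 0;
CREATE TABLE tab3 AS SELECT * FROM big_table WHERE 1 = 0;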
You would get the best speed-up by not copying the data out per group at all, but processing it week by week instead; however, you don't say what you ultimately want to achieve, so it is not possible to comment on that (the approach may of course be difficult or impracticable, but you should at least consider it).
So here are some hints on how to extract the group data:
remove all indexes, as they would only consume space; all you need is one large FULL TABLE SCAN
check the available space and size of each group; maybe you can process several groups in one pass
deploy parallel query
create table tmp as
select /*+ parallel(4) */ * from BIG_TABLE
where group_id in (..list of groupIds..);
Please note that parallel mode must be enabled in the database; ask your DBA if you are unsure. The point is that the large FULL TABLE SCAN is performed by several sub-processes (here 4), which may (depending on your system) cut the elapsed time.
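If you need to enable it at session level, statements along these lines exist in Oracle (a sketch; the degree of 4 is only an example):
ALTER SESSION ENABLE PARALLEL DDL;              -- CREATE TABLE ... AS SELECT is DDL
ALTER SESSION FORCE PARALLEL QUERY PARALLEL 4;  -- run the underlying scan in parallel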
I've looked at forums, worked through tutorials and looked in the Rails guides but can't seem to find what I'm looking for.
I'm trying to create a security application for a company I work for and I am creating the report form. I would like the reports to be numbered with an ascending number but am not sure what to search for or do to accomplish this.
I am using PostgreSQL as my database for both dev and prod.
You could use the id column if it is set up as the primary key (as it normally is). It would be unique to every report, never repeat, and increment by one with every new report.
Or you can make a column just for the report # and use:
CREATE TABLE tablename (
  colname SERIAL
);
This is equivalent to:
CREATE SEQUENCE tablename_colname_seq;
CREATE TABLE tablename (
  colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
);
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;
This will allow you to set the starting number via:
SELECT setval('tablename_colname_seq', 42, false);
The false means the first number handed out will be 42, not 43. If left out or set to true it will return 43 as the first value in the sequence.
An important aspect of sequences in Postgres is that a number is consumed every time nextval is called, even if the row fails to be inserted because of a transaction failure, so you could end up with missing numbers in the sequence.
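A minimal sketch of how a gap appears (using the tablename/colname table from above):
BEGIN;
INSERT INTO tablename DEFAULT VALUES;  -- consumes, say, number 42
ROLLBACK;                              -- the row is gone, but 42 is not handed back
INSERT INTO tablename DEFAULT VALUES;  -- gets 43, leaving a gap at 42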
Be sure to read this: http://www.postgresql.org/docs/9.3/static/functions-sequence.html
If this is a problem you might just want to do all of this in Rails by doing something like:
next_number = Report.select(:id).order('id DESC').limit(1).pluck(:id).first + 1
and have a unique constraint on the column to ensure no duplicate numbers (see the sketch below).
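A sketch of that constraint in PostgreSQL, assuming a reports table with a report_number column (both names hypothetical):
ALTER TABLE reports ADD CONSTRAINT reports_report_number_key UNIQUE (report_number);
If two requests compute the same next_number concurrently, one of the inserts will fail with a unique-violation error and can simply be retried.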
For other DB's see this: http://www.w3schools.com/sql/sql_autoincrement.asp
I have tried this UDF in Hive: UDFRowSequence.
But it is not generating unique values, i.e. it repeats the sequence depending on the number of mappers.
Suppose I have one file (with 4 records) available in HDFS. One mapper will be created for that job, and the result will be like:
1
2
3
4
But when there are multiple (large) files at the HDFS location, multiple mappers are created for the job, and each mapper generates its own repeating sequence, like below:
1
2
3
4
1
2
3
4
1
2
.
Is there any solution for this, so that a unique number is generated for each record?
I think you are looking for ROW_NUMBER(). You can read about it and other "windowing" functions here.
Example:
SELECT *, ROW_NUMBER() OVER ()
FROM some_database.some_table
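Note that OVER () with an empty window assigns numbers in an unspecified order; if you need a deterministic numbering, order the window (some_key here is a hypothetical column). Be aware that without a PARTITION BY, either form funnels all rows through a single reducer:
SELECT t.*, ROW_NUMBER() OVER (ORDER BY t.some_key) AS rn
FROM some_database.some_table t;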
@GoBrewers14: Yes, I did try that. We tried to use the ROW_NUMBER function, and when we run the query on small data, e.g. a file containing 500 rows, it works perfectly. But with large data the query runs for a couple of hours and finally fails to produce output.
I have since come across the following information about this:
Generating a sequential order in a distributed processing query is not possible with simple UDFs, because the approach requires some centralised entity to keep track of the counter. That would also cause severe inefficiency for distributed queries, so it is not recommended.
If you want to work with multiple mappers and a large dataset, try this UDF: https://github.com/manojkumarvohra/hive-hilo
It uses ZooKeeper as a central repository to maintain the state of the sequence.
A query to generate sequences. We can use this as a surrogate key in a dimension table as well:
WITH TEMP AS
  (SELECT if(max(seq) IS NULL, 0, max(seq)) AS max_seq
   FROM seq_test)
SELECT col_id,
       col_val,
       row_number() over() + max_seq AS seq
FROM source_table
INNER JOIN TEMP ON 1 = 1;
seq_test: your target table.
source_table: your source table.
seq: the surrogate key / sequence number / key column.
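A sketch of typical usage, appending the newly numbered rows to the target table (assuming seq_test has the columns col_id, col_val, seq) so that the next run continues from the new maximum:
INSERT INTO TABLE seq_test
SELECT col_id,
       col_val,
       row_number() over() + max_seq AS seq
FROM source_table
INNER JOIN (SELECT if(max(seq) IS NULL, 0, max(seq)) AS max_seq
            FROM seq_test) TEMP ON 1 = 1;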
I am programming a Windows application (in Qt 4.6) which, at some point, inserts anywhere between 1 and around 76,000 datasets into an Oracle (10.2) table. The application has to retrieve the primary keys, or at least the primary key range, from a sequence. It will then store the IDs in a list that is used for batch execution of a prepared query.
(Note: Triggers shall not be used, and the sequence is used by other tasks as well)
In order to avoid calling the sequence X times, I would like to increment the sequence by X instead.
What I have found out so far is that the following code would be possible in a procedure:
ALTER SEQUENCE my_sequence INCREMENT BY X;
SELECT my_sequence.CURRVAL + 1, my_sequence.NEXTVAL
  INTO v_first_number, v_last_number
  FROM dual;
ALTER SEQUENCE my_sequence INCREMENT BY 1;
I have two major concerns though:
I have read that ALTER SEQUENCE produces an implicit commit. Does this mean the transaction started by the Windows application will be committed? If so, can you somehow avoid that?
Is this concept multi-user proof? Or could the following thing happen:
Sequence is at 10,000
Session A sets increment to 2,000
Session A selects 10,001 as first and 12,000 as last
Session B sets increment to 5,000
Session A sets increment to 1
Session B selects 12,001 as first and 12,001 as last
Session B sets increment to 1
Even if the procedure were rather quick, it is not that unlikely in my application that two different users would cause the procedure to be called almost simultaneously.
1) ALTER SEQUENCE is DDL, so it implicitly commits before and after the statement. The database transaction started by the Windows application will be committed. If you are using a distributed transaction coordinator other than the Oracle database, hopefully the coordinator will commit the entire distributed transaction, but transaction coordinators will sometimes have problems with commits they are not aware of.
There is nothing that you can do to prevent DDL from committing.
2) The scenario you outline with multiple users is quite possible. So it doesn't sound like this approach would behave correctly in your environment.
You could potentially use the DBMS_LOCK package to ensure that only one session is calling your procedure at any point in time and then call the sequence N times from a single SQL statement. But if other processes are also using the sequence, there is no guarantee that you'll get a contiguous set of values.
CREATE PROCEDURE some_proc( p_num_rows  IN  NUMBER,
                            p_first_val OUT NUMBER,
                            p_last_val  OUT NUMBER )
AS
  l_lockhandle       VARCHAR2(128);
  l_lock_return_code INTEGER;
BEGIN
  dbms_lock.allocate_unique( 'SOME_PROC_LOCK',
                             l_lockhandle );
  l_lock_return_code := dbms_lock.request( lockhandle        => l_lockhandle,
                                           lockmode          => dbms_lock.x_mode,
                                           release_on_commit => true );
  if( l_lock_return_code IN (0, 4) ) -- Success or already owned
  then
    <<do something>>
  end if;
  -- dbms_lock.release is a function, so its return code must be captured
  l_lock_return_code := dbms_lock.release( l_lockhandle );
END;
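A sketch of how a caller might invoke this procedure (dbms_output is used purely for illustration):
DECLARE
  l_first NUMBER;
  l_last  NUMBER;
BEGIN
  some_proc( p_num_rows  => 76000,
             p_first_val => l_first,
             p_last_val  => l_last );
  dbms_output.put_line( 'reserved IDs ' || l_first || ' .. ' || l_last );
END;
/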
Altering the sequence in this scenario is a really bad idea, particularly in a multi-user environment. You'll get your transaction committed, and probably several race-condition data bugs or integrity errors as well.
It would be appropriate if you had legacy data already imported and wanted to insert new data with IDs from the sequence; then you might alter the sequence to move its current value past the max existing ...
It seems to me that you want to generate IDs from the sequence here. That need not be done with:
select seq.nextval into l_variable from dual;
insert into table (id, ...) values (l_variable, ....);
You can use the sequence directly in the insert:
insert into table (id, ...) values (seq.nextval, ....);
and optionally get the assigned value back by
insert into table (id, ...) values (seq.nextval, ....)
returning id into l_variable;
It certainly is possible even for bulk operations with execBatch, either just creating the IDs or even returning them. I am not sure about the right syntax in Java, but it will be something along the lines of:
insert into table (id, ...) values (seq.nextval, ....)
returning id bulk collect into l_cursor;
and you'll be given a ResultSet to browse the assigned numbers.
You can't prevent the implicit commit.
Your solution is not multi user proof. It is perfectly possible that another session will have 'restored' the increment to 1, just as you described.
I would suggest you keep fetching values one by one from the sequence, store these IDs one by one on your list and have the batch execution operate on that list.
What is the reason that you want to fetch a contiguous block of values from the sequence? I would not be too worried about performance, but maybe there are other requirements that I don't know of.
In Oracle, you can use the following query to get the next N values from a sequence that increments by one:
select level, PDQ_ACT_COMB_SEQ.nextval as seq from dual connect by level <= 5;
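A PL/SQL sketch of fetching a batch of values in one round trip and reading off the range; note the values are only contiguous if nothing else draws from the sequence at the same time:
DECLARE
  TYPE t_num_tab IS TABLE OF NUMBER;
  l_ids t_num_tab;
BEGIN
  SELECT PDQ_ACT_COMB_SEQ.nextval
    BULK COLLECT INTO l_ids
    FROM dual CONNECT BY LEVEL <= 5;
  -- first value: l_ids(1), last value: l_ids(l_ids.COUNT)
END;
/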