Facing Pipeline Busy issue while loading data into Greenplum using DataStage - greenplum

We are getting errors while loading large volumes of data into Greenplum through DataStage jobs.
There are multiple jobs running sequentially, and no particular job fails; the failures are random. If Job1 fails on the first day, Job2 may fail on the second day.
We have also observed that it only impacts the jobs that load a high volume of data.
Please find the errors we have gotten so far.
Day 1
Message:
STG_DEPS_G,0: The following SQL statement failed: INSERT INTO GPCC_ET_20211114015751397_14836_2 SELECT DEPT, DEPT_NAME, BUYER, MERCH, PROFIT_CALC_TYPE, PURCHASE_TYPE, GROUP_NO, BUD_INT, BUD_MKUP, TOTAL_MARKET_AMT, MARKUP_CALC_TYPE, OTB_CALC_TYPE, MAX_AVG_COUNTER, AVG_TOLERANCE_PCT, DEPT_VAT_INCL_IND, CREATE_ID, CREATE_DATETIME FROM staging.STG_DEPS. The statement reported the following reason: [SQLCODE=08S01][Native=373,254] [IBM (DataDirect OEM)][ODBC Greenplum Wire Protocol driver][Greenplum]ERROR: http response code 501 from gpfdist (gpfdist://DDCETLMIG:8000/DDCETLMIG_14836_gpw_11_3_20211114015751366): HTTP/1.0 501 pipe is busy, close the pipe and try again (seg0 192.168.199.10:6000 pid=25824)(File url_curl.c; Line 474; Routine check_response; ) (CC_GPCommon::checkThreadStatusThrow, file CC_GPCommon.cpp, line 808)
Day 2
STG_RPM_ZONE,0: The following SQL statement failed: INSERT INTO GPCC_ET_20211114093430218_8212_0 SELECT ZONE_ID, ZONE_DISPLAY_ID, ZONE_GROUP_ID, NAME, CURRENCY_CODE, BASE_IND, LOCK_VERSION FROM STAGING.STG_RPM_ZONE. The statement reported the following reason: [SQLCODE=08S01][Native=373,254] [IBM (DataDirect OEM)][ODBC Greenplum Wire Protocol driver][Greenplum]ERROR: http response code 501 from gpfdist (gpfdist://DDCETLMIG:8004/DDCETLMIG_8212_gpw_0_0_20211114093430186): HTTP/1.0 501 pipe is busy, close the pipe and try again (seg1 192.168.199.11:6000 pid=26726)(File url_curl.c; Line 474; Routine check_response; ) (CC_GPCommon::checkThreadStatusThrow, file CC_GPCommon.cpp, line 808)
Day 3
Event type:Fatal
Timestamp:11/15/2021 9:27:36 AM
Message:
SUB_CLASS,3: APT_PMMessagePort::dispatch:ERROR: header = 04F02E20SUBPROC_SUPPORT_EOW, savedDispatchPosition = 04F02E20, currentDispatchPosition_ = 04F02E1FS, currentInputPosition_ = 04F02E58, buffer_ = 04F02E20, this = 04EEA1E0
Day 4
Message:
STG_GROUPS_G,0: The following SQL statement failed: INSERT INTO GPCC_ET_20211115015013039_2400_0 SELECT GROUP_NO, GROUP_NAME, BUYER, MERCH, DIVISION, CREATE_ID, CREATE_DATETIME FROM staging.STG_GROUPS. The statement reported the following reason: [SQLCODE=08S01][Native=373,254] [IBM (DataDirect OEM)][ODBC Greenplum Wire Protocol driver][Greenplum]ERROR: http response code 501 from gpfdist (gpfdist://DDCETLMIG:8009/DDCETLMIG_2400_gpw_1_1_20211115015013023): HTTP/1.0 501 pipe is busy, close the pipe and try again (seg5 192.168.199.12:6001 pid=1167)(File url_curl.c; Line 474; Routine check_response; ) (CC_GPCommon::checkThreadStatusThrow, file CC_GPCommon.cpp, line 808)

A couple of questions:
You indicated there are multiple jobs running sequentially. Does that mean there are multiple concurrent jobs, each running a set of sequential jobs, or is there one job running multiple sequential steps? If it is the former, I would check to make sure no jobs are using the same gpfdist ports.
Is DataStage connecting through a proxy server? That could potentially be causing a problem (a common cause of 501 errors, according to a quick search).
Finally, what is the throughput speed of the NIC or NICs being used to connect DataStage/gpfdist to Greenplum? Also, what is the interconnect speed between each segment/segment host in Greenplum? If either of these is less than 10 Gb, your network may not be able to support really high throughput.
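To illustrate the port-contention point: the Greenplum connector stages each transfer through an external table bound to a single gpfdist host:port (the GPCC_ET_* tables in the errors above). Below is a minimal sketch of the kind of writable external table and INSERT involved; the table name, pipe name, format, and delimiter are hypothetical, but it shows why two concurrent jobs pointed at the same port would contend for the same gpfdist pipe:

-- hypothetical sketch: table name, pipe name, and delimiter are illustrative
CREATE WRITABLE EXTERNAL TABLE gpcc_et_example (LIKE staging.stg_deps)
LOCATION ('gpfdist://DDCETLMIG:8000/stg_deps_pipe')
FORMAT 'TEXT' (DELIMITER '|');

INSERT INTO gpcc_et_example SELECT * FROM staging.stg_deps;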

Related

Import file to Greenplum failed because of one line of data in Navicat

When importing a file into Greenplum, one line fails and the whole file is not imported successfully. Is there a way to skip the bad line and import the rest of the data into Greenplum successfully?
Here are my SQL statement and the error messages:
copy cjh_test from '/gp_wkspace/outputs/base_tables/error_data_test.csv' using delimiters ',';
ERROR: invalid input syntax for integer: "FE00F760B39BD3756BCFF30000000600"
CONTEXT: COPY cjh_test, line 81, column local_city: "FE00F760B39BD3756BCFF30000000600"
Greenplum has an extension to the COPY command that lets you log errors and set a number of errors that can occur without stopping the load. Here is an example from the documentation for the COPY command:
COPY sales FROM '/home/usr1/sql/sales_data' LOG ERRORS
SEGMENT REJECT LIMIT 10 ROWS;
That tells COPY that 10 bad rows can be ignored without stopping the load. The reject limit can be a number of rows or a percentage of the load file. You can check the full syntax in psql with: \h copy
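With LOG ERRORS, the rejected rows are captured in an internal error log that you can query afterwards. A small sketch for the example table above (assuming a Greenplum version that provides gp_read_error_log):

SELECT * FROM gp_read_error_log('sales');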
If you are loading a very large file into Greenplum, I would suggest looking at gpload or gpfdist (which also support the segment reject limit syntax). COPY is single-threaded through the master server, whereas gpload/gpfdist load the data in parallel to all segments. COPY will be faster for smaller load files, and the others will be faster for millions of rows in a load file (or files).
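As a rough illustration of the gpfdist route with the same reject-limit clause (the gpfdist host and port are assumptions, and the CSV is assumed to sit in the directory gpfdist serves):

-- illustrative host/port; error_data_test.csv assumed to be served by gpfdist
CREATE EXTERNAL TABLE ext_cjh_test (LIKE cjh_test)
LOCATION ('gpfdist://etl-host:8081/error_data_test.csv')
FORMAT 'CSV' (DELIMITER ',')
LOG ERRORS SEGMENT REJECT LIMIT 10 ROWS;

INSERT INTO cjh_test SELECT * FROM ext_cjh_test;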

Regarding: GoldenGate extract process not working

My extract process is not running. Below are the errors found; kindly suggest how to get all processes up and running.
GGSCI (pltv015) 3> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT ABENDED EXTEMP 00:00:04 05:46:53
EXTRACT RUNNING PUMPEMP 00:00:00 00:00:03
REPLICAT STOPPED REP507 00:00:00 00:18:08
REPLICAT ABENDED REPTEST 00:00:00 2527:29:44
For EXTEMP:
2020-07-31 06:59:39 ERROR OGG-06601 Mismatch between the length of seqno from checkpoint (9) and recovery (6) for extract trail /opt/app/t1c2d507/ggs/t1c2d507/trails/p1
For REP507:
2020-07-31 06:59:37 ERROR OGG-00664 OCI Error beginning session (status = 1017-ORA-01017: invalid username/password; logon denied).
2020-07-31 06:59:37 ERROR OGG-01668 PROCESS ABENDING.
2020-07-31 06:59:39 ERROR OGG-06601 Oracle GoldenGate Capture for Oracle, extemp.prm: Mismatch between the length of seqno from checkpoint (9) and recovery (6) for extract trail /opt/app/t1c2d507/ggs/t1c2d507/trails/p1.
Just in case it might help you: the following workaround applies only to Oracle GoldenGate version 12.2.0.1.0, on any platform.
Running GG version 12.2, PUMP fails with this error:
ERROR OGG-06601 Mismatch between the length of seqno from checkpoint (9) and recovery (6) for extract trail /path_to_the_trail/
This happens when trying to read a trail file that uses a 6-digit checkpoint with version 12.2, which uses a 9-digit checkpoint. The same error might also happen when the trail files actually have the same seqno length; in that case the error message is misleading and is related to bug 25439681.
If the error "Mismatch between the length of seqno from checkpoint (9) and recovery (6) for extract trail" is seen and the filename lengths are the same, then this bug may have been encountered. Note that this message masks the real error message, so the fix in Bug 25439681 does not resolve the underlying error but makes sure the correct error is reported.
Workaround
PART I
Stop PUMP
Stop Manager
Add the following to your GLOBALS file
TRAIL_SEQLEN_6D
REASON: Tell GG to use 6 digit checkpoint
Start Manager
Alter Pump with ETROLLOVER
Start Pump
Allow PUMP to read the local trail files and write them to a remote trail file
Allow replicat to process all transactions. Replicat should show 0 lag, to indicate that all transactions from the source have been processed on the target database.
REASON: Clean up existing trail files, created by a release prior to GG version 12.2, that still use a 6-digit checkpoint. (The corresponding GGSCI commands are sketched below.)
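In GGSCI, PART I would look roughly like this (PUMPEMP is taken from the info all output above purely as an illustration; substitute your own pump name):

GGSCI> STOP EXTRACT PUMPEMP
GGSCI> STOP MANAGER
GGSCI> EDIT PARAMS ./GLOBALS      (add the line TRAIL_SEQLEN_6D)
GGSCI> START MANAGER
GGSCI> ALTER EXTRACT PUMPEMP, ETROLLOVER
GGSCI> START EXTRACT PUMPEMP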
PART II
Assuming you had no problems with PART I, you then need to perform some tasks on both source and target.
On Source
Remove TRAIL_SEQLEN_6D from GLOBALS
alter ext E1, etrollover (where E1 is the name of your extract which creates the local trail file). REASON: ETROLLOVER is needed to convert the 6-digit checkpoint to the 9-digit checkpoint used by GG version 12.2.
Use the following to display the new sequence number of the local trail file:
info extract E1, detail
or
info extract E1, showch
Write Checkpoint #1
Current Checkpoint (current write position):
Sequence #: xx
where xx = new sequence number of local trail file
alter ext P1, extseqno xx, extrba 0 (where xx = new sequence number of the local trail file and P1 is the name of your PUMP) --> to handle the input trail. REASON: Tell PUMP to use the new local trail file created by the ETROLLOVER above.
alter ext P1, etrollover --> to handle the output trail. REASON: Tell PUMP to create and write to a new remote trail file.
Use the following to display the new sequence number of the remote trail file (this is the PUMP's write checkpoint):
info extract P1, detail
or
info extract P1, showch
Write Checkpoint #1
Current Checkpoint (current write position):
Sequence #: yy
where yy = new sequence number of the remote trail file
On Target
alter replicat R1, extseqno yy, extrba 0 (where yy = new sequence number + 1 of the remote trail file)
Go back to Source
Allow changes to be made to Source tables involved with GG
Perform an insert or update to verify that it gets replicated to the target.
UPDATE
To update the password of the GGADMIN user:
Step 1: check Golden Gate user
SQL> select username, account_status from dba_users where username like 'GG%';
USERNAME ACCOUNT_STATUS
------------------------------ --------------------------------
GGADMIN OPEN
Step 2: Change the password in the database first
SQL> alter user GGADMIN identified by newpassWORD;
Step 3: Encrypt the new password for the GoldenGate processes.
ENCRYPT PASSWORD passWORD ENCRYPTKEY DEFAULT
AACAAAAAAAAAAAIAWIVENGVBBFXEFEQH
Step 4: Copy the encrypted password and log in with it:
dblogin userid GGADMIN, password AACAAAAAAAAAAAIAWIVENGVBBFXEFEQH, encryptkey default
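Once dblogin works, the same encrypted string would typically go into the parameter files of the processes that log in as GGADMIN; a hedged sketch (the parameter file named here is illustrative):

-- in e.g. extemp.prm (illustrative)
USERID GGADMIN, PASSWORD AACAAAAAAAAAAAIAWIVENGVBBFXEFEQH, ENCRYPTKEY DEFAULT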

AWS DMS - Oracle to PG RDS full load operation error - failed to load data from csv file

I am trying to move data from an Oracle instance to Postgres RDS using DMS. I am only doing a full load operation, and I have disabled all the foreign keys on the target. I also made sure that the datatypes are not mismatched between columns for the same tables. I tried both 'Do Nothing' and 'Truncate' for the Target table preparation mode, and when I run the task, several tables fail with the error messages below:
[TARGET_LOAD ]E: Command failed to load data with exit error code 1, Command output: <truncated> [1020403] (csv_target.c:981)
[TARGET_LOAD ]E: Failed to wait for previous run [1020403] (csv_target.c:1578)
[TARGET_LOAD ]E: Failed to load data from csv file. [1020403] (odbc_endpoint_imp.c:5648)
[TARGET_LOAD ]E: Handling End of table 'public'.'SKEWED_VALUES' loading failed by subtask 6 thread 1 [1020403] (endpointshell.c:2416)
DMS doesn't give the correct error information, and I am not able to understand what the above error messages mean.
When I use 'Drop tables on target' for the Target table preparation mode, it works, but it creates the column datatypes in a different way which I don't want.
Any help would be appreciated.
To troubleshoot my case, I created a copy of the task that only loaded the one problem table, and upped all the logging severities to "Detailed debug". Then I was able to see this:
[TARGET_LOAD ]D: RetCode: SQL_SUCCESS_WITH_INFO SqlState: 42622 NativeError: -1 Message: NOTICE: identifier "diagnosticinterpretationrequestdata_diagnosticinterpretationcode" will be truncated to "diagnosticinterpretationrequestdata_diagnosticinterpretationcod" (ar_odbc_stmt.c:4720)
In the RDS logs for the target DB I found:
2021-10-11 14:30:36 UTC:...:[19259]:ERROR: invalid input syntax for integer: ""
2021-10-11 14:30:36 UTC:...:[19259]:CONTEXT: COPY diagnosticinterpretationrequest, line 1, column diagnosticinterpretationrequestdata_diagnosticinterpretationcod: ""
2021-10-11 14:30:36 UTC:...:[19259]:STATEMENT: COPY "myschema"."diagnosticinterpretationrequest" FROM STDIN WITH DELIMITER ',' CSV NULL 'attNULL' ESCAPE '\'
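The underlying limit is PostgreSQL's 63-character maximum identifier length; longer identifiers are silently truncated. You can see the truncated form with an illustrative query (not taken from the task logs):

SELECT 'diagnosticinterpretationrequestdata_diagnosticinterpretationcode'::name AS truncated_identifier;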
I found that if I added a table mapping rule to explicitly rename the column to truncate the name within Postgres's limit for identifier length, then things ran ok.
{
"rule-type": "transformation",
"rule-id": "1",
"rule-name": "1",
"rule-target": "column",
"object-locator": {
"schema-name": "%",
"table-name": "%",
"column-name": "diagnosticinterpretationrequestdata_diagnosticinterpretationcode"
},
"rule-action": "rename",
"value": "diagnosticinterpretationrequestdata_diagnosticinterpretationcod",
"old-value": null
},

utl_file.fopen failing occasionally

Everyone,
We have a set of 8 jobs which run in parallel on a Unix server. Those jobs call Oracle stored procedures. All those procedures do a set of DB operations (on different tables) and at the end create files on the Unix server. (Each job creates a file with a different name, but puts it in the same folder.)
Recently, we are seeing random failures with the error message "ORA-06512: at "SYS.UTL_FILE", line 536". Each day one or two jobs fail while creating the error report. When the job is rerun, there are no issues. We couldn't reproduce the issue in a lower environment.
The folder has all access granted. These jobs have been running for more than a year with no issues. Any ideas appreciated.
Based on my analysis:
The DB operations completed without any issues, and no file was created (not even an empty one), so the failure happened at the fopen call.
Sample code
DECLARE
  in_cont_type  varchar2(100) := 'HARDWARE_ATTRIBUTES';
  in_batch_name ccpm_epslz_control.push_batch_name%TYPE := 'HARDWARE_ATTRIBUTES_20181211062540';
  -- The variables below are populated elsewhere in the real procedure;
  -- placeholder declarations are added here so the sample is self-contained.
  l_file_name        varchar2(200)  := 'hardware_attributes_err.txt';
  l_col_name_print   varchar2(4000);
  l_select_stmt_bus  varchar2(4000);
  l_putline_stmt_bus varchar2(4000);
  l_file_type          utl_file.file_type;
  file_record_hold_cur sys_refcursor;
BEGIN
  /* DB operations */
  -- The ORA-06512 at SYS.UTL_FILE points at this fopen call
  l_file_type := utl_file.fopen('ERR_FOLDER', l_file_name, 'W');
  utl_file.put_line(l_file_type, 'count of input records filtered based on errors:');
  utl_file.put_line(l_file_type, '-----------------------------------------------');
  utl_file.put_line(l_file_type, l_col_name_print);
  OPEN file_record_hold_cur FOR l_select_stmt_bus;
  LOOP
    FETCH file_record_hold_cur INTO l_putline_stmt_bus;
    EXIT WHEN file_record_hold_cur%notfound;
    utl_file.put_line(l_file_type, l_putline_stmt_bus);
  END LOOP;
  CLOSE file_record_hold_cur;
  utl_file.fclose(l_file_type);
EXCEPTION
  WHEN OTHERS THEN
    dbms_output.put_line(dbms_utility.format_error_backtrace());
END;
ERROR MESSAGE:
ORA-20051: Internal Error in file generation
ORA-06512: at "MYPACKAGE", line 84
ORA-06512: at "SYS.UTL_FILE", line 536

MonetDB failed to bulk load: 42000 syntax error

I'm trying to bulk load some data into MonetDB.
I followed this example.
It works for my test data, but when I use the data from the production environment, I get exceptions.
Caused by: java.lang.Exception: 42000!syntax error, unexpected IDENT, expecting DELIMITERS in: "copy into dm.fact_sem_keyword_collection from stdin using delimters"
25005!current transaction is aborted (please ROLLBACK)
at com.lietou.bi.dw.task.common.dataexchange.MonetDBBulkLoadWriter.flush(MonetDBBulkLoadWriter.java:140)
... 12 more
and I found the following in the debug log.
20150804x_m_0001323100000001255480000000001310100152015-08-05 10:28:44
20150804x_z_0002000000000001000000000000000002015-08-05 10:28:44
20150804xn_cd_01000000000002000000000000000002015-08-05 10:28:44
20150804z_s_0002000000000001000000000000000002015-08-05 10:28:44
RD 1438779576672: read final block: 190 bytes
RX 1438779576672: !42000!syntax error, unexpected IDENT, expecting DELIMITERS in: "copy into dm.fact_sem_keyword_collection from stdin using delimters"
!25005!current transaction is aborted (please ROLLBACK)
RD 1438779576672: inserting prompt
I think there must be some bad data, but I don't know how to find more details to help me locate it.
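One detail visible in the quoted statement itself: the parser stops at the unexpected identifier "delimters" where it expects the keyword DELIMITERS. For comparison, the documented COPY INTO form looks roughly like this (the separators shown are illustrative):

COPY INTO dm.fact_sem_keyword_collection FROM STDIN USING DELIMITERS ',', '\n', '"';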
And there is one more thing. MonetDB seems to read data using a buffer size of 8190 bytes, but right before the exception there is a write final block of a different size. Logs are shown below. What does this mean?
...
TD 1438779576129: write block: 8190 bytes
TX 1438779576129:
...
TD 1438779576137: write block: 8190 bytes
TX 1438779576137:
...
TD 1438779576137: write final block: 5921 bytes
TX 1438779576137:
...
