Uncaught SoapFault exception: [Sender] Invalid XML with Magento API - magento

The error "Uncaught SoapFault exception: [Sender] Invalid XML" appears when I try to bulk-import products into Magento.
I have an Excel file with one product per line, about 178 products in total. Everything works until the import reaches line 22, where the fatal error is thrown.
Does anyone know what is going wrong? Thank you very much!

The product on line 22 may contain special characters such as "<" (left chevron), ">" (right chevron), or "'" (single quote).
Make sure those characters are replaced by their corresponding HTML entities, and do this directly in the Excel file.
If you don't convert or delete those characters in the Excel file, the API treats them as part of the XML markup of the request instead of as data. This is somewhat similar to how SQL injection occurs.
Hope it helps.
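As a rough illustration of the conversion described above, here is a minimal Python sketch that escapes the characters with special meaning in XML and reports which rows would change. The function names and the sample rows are hypothetical, and the sketch assumes the cell values end up in the request XML without any further escaping. If the SOAP client already escapes values itself, escaping them again would double-encode them.

```python
from xml.sax.saxutils import escape


def escape_cell(value: str) -> str:
    """Replace &, <, > and both quote characters with XML entities."""
    return escape(value, {'"': "&quot;", "'": "&apos;"})


def rows_needing_escape(rows):
    """Return (1-based row number, cell value) pairs that contain special characters."""
    return [
        (number, cell)
        for number, row in enumerate(rows, start=1)
        for cell in row
        if escape_cell(cell) != cell
    ]


# Hypothetical spreadsheet rows: the second one contains characters that break XML.
sample_rows = [["SKU-1", "Plain shirt"], ["SKU-2", "Men's <XL> & up"]]
flagged = rows_needing_escape(sample_rows)
```

Running `rows_needing_escape` over the exported spreadsheet rows is one way to find the offending product without reading all 178 lines by hand.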

Related

What does error code 8007203c mean in Windows Active Directory?

When I try to change group attributes from native C++ code, I get this error: 8007203c. What does this error code refer to? I can't find any details about it in the documentation.
The error code is defined as ERROR_DS_ENCODING_ERROR in winerror.h:
//
// MessageId: ERROR_DS_ENCODING_ERROR
//
// MessageText:
//
// An encoding error has occurred.
//
#define ERROR_DS_ENCODING_ERROR 8252L
An easier way to look up an error code is to search The Magic Number Database: https://www.magnumdb.com/search?q=8007203c
So this is probably an encoding issue (ANSI vs. Unicode), or special characters that should have been escaped; see "Active Directory: Characters to Escape".
As far as I know, this error occurs when there is an invalid code page, invalid characters, or an encoding error (I can't find the reference documentation for it yet).
Maybe your group names contain special characters that cause this behavior.
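The link between the hexadecimal value 8007203c and the decimal 8252 in winerror.h can be checked by splitting the HRESULT into its fields. The sketch below is only an illustration; the helper name `decode_hresult` is made up, and the masks follow the standard HRESULT layout (severity bit, facility, 16-bit code).

```python
def decode_hresult(hresult: int) -> dict:
    """Split a 32-bit HRESULT into severity, facility and error code."""
    return {
        "failure": bool((hresult >> 31) & 1),   # top bit set means failure
        "facility": (hresult >> 16) & 0x1FFF,   # 7 is FACILITY_WIN32
        "code": hresult & 0xFFFF,               # the underlying Win32 error code
    }


info = decode_hresult(0x8007203C)
# The low 16 bits, 0x203C, equal 8252, which is ERROR_DS_ENCODING_ERROR.
```

Facility 7 means the HRESULT wraps a plain Win32 error, so the low 16 bits are the number to look up in winerror.h.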

Oracle mysterious Unicode codepoint

While calling XMLTYPE() on a CLOB column that should contain a valid XML 1.0 document (the database encoding should be UTF-8), I get the following error message (I am in Italy, hence the Italian):
ORA-31011: Analisi XML non riuscita
ORA-19202: Errore durante l'elaborazione XML
LPX-00217: carattere non valido 15577023 (U+EDAFBF)
Error at line 240
ORA-06512: a "SYS.XMLTYPE", line 272
ORA-06512: a line 1
31011. 00000 - "XML parsing failed"
*Cause: XML parser returned an error while trying to parse the document.
*Action: Check if the document to be parsed is valid.
Now this invalid character is given as Unicode codepoint EDAFBF. The problem is that according to Unicode spec (wikipedia), there are no codepoints beyond 10FFFF. So what could this error mean?
Inspecting this CLOB with SQLDeveloper (and copying it to Notepad++ with encoding set to utf-8) does not reveal anything unusual beyond some strange characters which apparently came from the user browser when he copied text from a Microsoft Word document (but the CLOB, at least as copied from SQLDeveloper UI and exhibited by Notepad++ with UTF-8 encoding, seems to be a valid UTF-8 text).
Is there a way to reproduce this error populating Oracle directly (from SQLDeveloper or in some other way)? (contacting the end user to understand what he put exactly in the web form is problematic)
Not addressing the first part of the question, but you can reproduce it with a RAW value:
select xmltype('<dummy>'
|| utl_raw.cast_to_varchar2(cast('EDAFBF' as raw(6)))
|| '</dummy>')
from dual;
Error report -
SQL Error: ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00217: invalid character 15577023 (U+EDAFBF)
Error at line 1
ORA-06512: at "SYS.XMLTYPE", line 310
ORA-06512: at line 1
Just selecting the character:
select utl_raw.cast_to_varchar2(cast('EDAFBF' as raw(6)))
from dual;
... is displayed as a small square with an even smaller question mark inside it (I think) in SQL Developer for me (version 4.1), but that's just how it's choosing to render that; copying and pasting still gives the replacement character � since the codepoint is, as you say, invalid. XMLType is being stricter about the validity than CLOB. The unistr() function doesn't handle the value either, which isn't really a surprise.
(You don't need to cast the string to raw(6), just utl_raw.cast_to_varchar2('EDAFBF') has the same effect; but doing it explicitly makes it a bit clearer what's going on, I think).
I don't see how that could have got into your file without some kind of corruption, possibly through a botched character set conversion I suppose. You could maybe use dbms_lob.replace_fragment() or similar to replace or remove that character, but of course there may be others you haven't hit yet, and at best you'd only be treating the symptoms rather than the cause.
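On the first part of the question, one possible reading is that "U+EDAFBF" is not a code point at all but the three raw bytes ED AF BF printed as a single number. The Python sketch below checks that arithmetic and shows what those bytes are when treated as a UTF-8 sequence. The helper name is hypothetical, and the idea that Oracle reports the bytes this way is an assumption drawn from the numbers in the error message, not something the source confirms.

```python
def inspect_bytes(hex_string: str) -> dict:
    """Interpret a hex string as raw bytes and as a (possibly invalid) UTF-8 sequence."""
    raw = bytes.fromhex(hex_string)
    try:
        raw.decode("utf-8")
        strict_ok = True
    except UnicodeDecodeError:
        strict_ok = False
    # 'surrogatepass' lets Python decode UTF-8-style encodings of surrogates,
    # which strict UTF-8 forbids.
    lenient = raw.decode("utf-8", errors="surrogatepass")
    return {
        "as_integer": int.from_bytes(raw, "big"),
        "valid_utf8": strict_ok,
        "code_points": [hex(ord(ch)) for ch in lenient],
    }


result = inspect_bytes("EDAFBF")
```

The integer form matches the 15577023 in the LPX-00217 message, and the bytes decode to U+DBFF, a lone high surrogate, which strict UTF-8 and XML 1.0 both reject. A lone surrogate like this can appear when text containing characters outside the Basic Multilingual Plane is converted surrogate by surrogate, which would fit the botched-conversion theory above.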

W3C unable to validate

Sorry, I am unable to validate this document because on line 1200 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication.
The error was: utf8 "\xD8" does not map to Unicode
I would be thankful to know what exactly I should do. My website is: http://dailysahara.com/
The issue, as stated by the validator, is that you have some invalid UTF-8 in your document. It appears to be in the box on the left of the site with the four tabs "Tags", "Comments", "Recents", and "Popular". It shows up to me as a black square like this: �. If you remove that, you should be able to validate your site.
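To locate such bytes without hunting through the page visually, a small script can report every byte that does not decode as UTF-8, together with its line number. This is a minimal sketch; the function name and the sample data are made up for illustration, and in practice `data` would be the saved page read in binary mode.

```python
def find_invalid_utf8(data: bytes):
    """Return (line number, byte offset in line, offending byte) for each invalid byte."""
    problems = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        pos = 0
        while pos < len(line):
            try:
                line[pos:].decode("utf-8")
                break  # the rest of the line is valid
            except UnicodeDecodeError as err:
                bad = pos + err.start
                problems.append((line_no, bad, line[bad]))
                pos = bad + 1  # skip the bad byte and keep scanning
    return problems


# Hypothetical page content: line 2 contains a stray 0xD8 byte.
sample = b"<p>ok</p>\n<p>caf\xd8 x</p>\n" + "<p>fine é</p>\n".encode("utf-8")
found = find_invalid_utf8(sample)
```

Each reported line number can then be compared with the line the validator complains about (line 1200 in the message above).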

Not sure why the output of my PHP scripts contains random embedded spaces within character strings

I have written several PHP scripts to read the contents of a database and output those contents in an email message. Every once in a while, I will see a SPACE (0x20) character embedded in the output where there shouldn't be any. For example, in one script, I reference a PHP global variable containing exactly "n" non-space characters, and sometimes (not always), when that variable is dumped to an email message, the string will appear with an embedded blank (making the total length of the string "n+1"). Other times, an HTML tag (such as <BR>) will appear as < BR> (note the SPACE before the "B").
Because the behavior of the script is not consistent (some emails are affected, and others aren't), I can't seem to find the problem.
I am enclosing a link to the PHP script that is occasionally embedding a space into the BREAK tag. I have removed the lines that provide specific login information to the databases. Otherwise, everything else is intact. In the code file you can find at the link below, line 281 is the one that contained the BREAK command with the embedded SPACE (as described above). This has happened only once!
http://jem-software.com/temptest.txt
I guess the only other potentially relevant information is that this script file is taken from code entered into a JUMI code block contained within a Joomla! based website.
Edit 1:
Thank you, Riccardo, for your suggestions. Here is some more clarification:
I am not reading an email and parsing the results in order to insert into a database. Just the opposite, I am reading from a database and using the results to create an email. I will check the database to see what character set was used, and explicitly pass the character set to see if that makes a difference.
I don't use Joomla functions to access the database because the database I am referencing is external to the Joomla! environment. It is a pre-existing database created from PHP scripts written several years prior. When my old website was re-written using Joomla, I wanted to "port" the PHP database access code intact, so I installed the JUMI plugin to make this possible.
I will check out the character coding in the database and synchronize it to the character code of the email message.
I don't understand how an issue with character coding would result in the insertion of a SPACE into the hard-coded HTML tag - this tag did not come from any database, but was typed into the email as a literal string.
This is a strange issue, but here are my two cents:
The first is you're not using Joomla functions to access the db and the mail subsystem. While this could work, it's not really nice.
The second is, this smells like a character set / codepage issue.
Here are a few considerations on the character set issue:
I read your code quickly, and I didn't notice anything wrong. But Joomla uses UTF-8, and your queries don't specify it (mysql_set_charset() is missing!) which could be a first issue.
The second is that the emails you read will have different character sets, depending on the senders' settings. Make sure you handle the codepage issues properly: the following is a snippet of a function I use for parsing email:
// Inside the method that fetches a message part ($data holds the raw body of that part):
$mime = imap_fetchmime($this->connection, $this->messageNumber, $partNumber);
return $this->decodeMailBody($data, $mime); // QUOTED_PRINTABLE

// Decode a quoted-printable body and convert it to UTF-8.
function decodeMailBody($string, $mime) {
    $str = quoted_printable_decode($string);
    // Example $mime values:
    // Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8
    // Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=windows-1252
    $mimes = explode('charset=', $mime);
    // The charset name is whatever follows the last "charset=".
    $charset = strtolower(trim(end($mimes)));
    echo "<h3>mime: $mime; charset $charset</h3>"; // debug output
    if ($charset == 'utf-8') {
        return $str;
    }
    return iconv($charset, 'UTF-8', $str);
}
Last, make sure you use utf-8 when you insert the mail into the db after parsing it.

InstallShield 2011 error 7185 importing Japanese strings in the string table of basic MSI project

I am trying to import Japanese strings into my "Basic MSI" project. It used to work without any issues, but now when I try to import some Japanese strings from a text file it throws the following error (I have changed some of the personal data in the error message):
ISDEV : error -7185: The Japanese: 日本語 translation for string identifier IDS_XXXX_1111 includes characters that are not available on code page 932.
I think some of the characters in IDS_XXXX_1111 are not part of code page 932. How can I detect those characters with a tool?
The documentation also mentions changing some encoding settings to UTF-8 in InstallShield 2011; if you know about this, please guide me.
Thanks in advance
Rahul
My favorite way to detect such characters is with Python. For example, reading a file like the InstallShield string tables in Python 2.x:
import codecs
# InstallShield string tables are assumed to be UTF-16 text files
strings = codecs.open("strings.txt", "r", "UTF-16")
for line in strings.readlines():
    line = line.strip()
    try:
        line.encode("cp932")
    except UnicodeError:
        # "replace" substitutes "?" for each character cp932 cannot represent
        print "Can't encode: " + line.encode("cp932", "replace")
Your alternatives are to pinpoint the characters that cannot be represented on the relevant code page and replace them with ones that can, or to go to the Releases view and select yes for the Build UTF-8 Database setting.
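To pinpoint the individual characters, not just the lines that contain them, each character can be tested on its own. The following is a small Python 3 sketch under that assumption; the function name and the sample string are hypothetical.

```python
def unencodable_chars(text: str, codec: str = "cp932"):
    """Return (index, character) pairs for characters the codec cannot represent."""
    bad = []
    for index, char in enumerate(text):
        try:
            char.encode(codec)
        except UnicodeEncodeError:
            bad.append((index, char))
    return bad


# Hypothetical string value: the Hangul syllable at the end is not in code page 932.
offenders = unencodable_chars("日本語한")
```

Applying this to each string table entry gives the exact positions to replace before the import is retried.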
