httpclient garbling extended characters - utf-8

I'm using httpclient to retrieve remote urls and need to grab things such as titles.
In some cases, I get garbled extended characters as in the case of this url
http://olhardigital.uol.com.br/noticia/bilionaria-mais-jovem-da-historia-quer-revolucionar-exames-de-sangue/43586
I've tried playing around with all sorts of settings, to no avail. Any suggestions? My config is as follows:
private CloseableHttpClient httpclient = RemotePageUtils.getThreadSafeClient();
public String processMethod(String url, OutputStream out) throws IOException, IllegalArgumentException{
[...]
BufferedReader in = null;
HttpEntity entity = null;
HttpGet httpget = null;
CloseableHttpResponse resp = null;
try {
httpget = new HttpGet(url);
resp = httpclient.execute(httpget);
entity = resp.getEntity();
String inLine;
in = new BufferedReader(new InputStreamReader(entity.getContent(),"UTF-8"));
while ((inLine = in.readLine()) != null) {
out.write(inLine.getBytes("UTF-8"));
}
} finally {
[...]
}
return null;
}
private static CloseableHttpClient getThreadSafeClient() {
SocketConfig socketConfig = SocketConfig.custom()
.setTcpNoDelay(true)
.build();
RequestConfig config = RequestConfig.custom()
.setConnectTimeout(3000)
.setSocketTimeout(7000)
.setStaleConnectionCheckEnabled(false)
.build();
List<Header> headers = new ArrayList<Header>();
headers.add(new BasicHeader("Accept-Charset","ISO-8859-1,US-ASCII,UTF-8,UTF-16;q=0.7,*;q=0.7"));
//accept gzipped
headers.add(new BasicHeader("Accept-Encoding","gzip,x-gzip,deflate,sdch"));
CloseableHttpClient client = HttpClientBuilder.create()
.setDefaultHeaders(headers)
.setDefaultRequestConfig(config)
.setDefaultSocketConfig(socketConfig)
.build();
return client;
}

You are decoding every downloaded page as UTF-8, but the example link you gave is encoded in ISO-8859-1, not UTF-8.
In ISO-8859-1 an accented letter is a single byte >= 128, whereas in UTF-8 such bytes must follow specific multi-byte patterns; otherwise they are treated as malformed.
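To make the mismatch concrete, here is a minimal sketch using only the JDK. The class and method names are made up for illustration; the sample word is taken from the URL in the question. It encodes the word as ISO-8859-1 and then decodes those bytes as UTF-8, which is effectively what the question's code does.

```java
import java.nio.charset.StandardCharsets;

class CharsetMismatchDemo {
    // Decodes bytes as UTF-8; malformed sequences are replaced with U+FFFD.
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "bilion\u00e1ria"; // "bilionária"
        // In ISO-8859-1 the accented letter is the single byte 0xE1.
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);

        String wrong = decodeAsUtf8(latin1);
        String right = new String(latin1, StandardCharsets.ISO_8859_1);

        System.out.println("decoded as UTF-8 matches:      " + wrong.equals(original));
        System.out.println("decoded as ISO-8859-1 matches: " + right.equals(original));
    }
}
```

The lone byte 0xE1 is not a valid UTF-8 sequence on its own, so the decoder substitutes the replacement character and the original letter is lost.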
But why are you decoding bytes you have downloaded, just to write bytes to a file?
Instead of:
in = new BufferedReader(new InputStreamReader(entity.getContent(),"UTF-8"));
while ((inLine = in.readLine()) != null) {
out.write(inLine.getBytes("UTF-8"));
}
and converting bytes to strings and back, you should just copy the bytes.
You can do it with Apache Commons IO:
import org.apache.commons.io.IOUtils;
IOUtils.copy(entity.getContent(), out);
or manually, with pure Java:
byte[] buf = new byte[16 * 1024];
int len = 0;
InputStream in = entity.getContent();
while ((len = in.read(buf)) >= 0) {
out.write(buf, 0, len);
}
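If the text is needed as a String after all (for example, to pull out the title), one option is to decode with the charset the server declares instead of a hard-coded one. The following is only a sketch: `charsetOf` is a hypothetical helper, the Content-Type header value is assumed to be passed in as a plain string, and pages that declare their encoding only in an HTML meta tag are not handled.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class ContentTypeCharset {
    // Returns the charset named in a Content-Type value such as
    // "text/html; charset=ISO-8859-1", or the fallback if none is usable.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // illegal or unsupported charset name
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset cs = charsetOf("text/html; charset=ISO-8859-1", StandardCharsets.UTF_8);
        byte[] body = "bilion\u00e1ria".getBytes(StandardCharsets.ISO_8859_1);
        // Decoding with the declared charset recovers the accented letter.
        System.out.println(new String(body, cs).equals("bilion\u00e1ria"));
    }
}
```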

Related

How to convert collection to csv with jackson in java spring?

I am having trouble converting a java.util.Collection to a CSV file with Jackson.
The following code shows a method that converts the collection to a CSV string.
But I need a method that converts the collection with com.fasterxml.jackson.
The enum "DownloadType" provides the columns and header line for the CSV file.
Do you have any idea how to fix this?
@RequestMapping(value = "/csv",
produces = {"text/csv"},
consumes = {"application/json"},
method = RequestMethod.POST)
public ResponseEntity<Object> exportCsv()
{
ResponseEntity<Object> response = null;
try
{
HttpHeaders headers = new HttpHeaders();
headers.add(HttpHeaders.CONTENT_TYPE, "text/csv; charset=UTF-8");
headers.add(HttpHeaders.CACHE_CONTROL, "no-store, must-revalidate");
headers.add(HttpHeaders.CONTENT_DISPOSITION, "attachment;filename=\"export.csv\"");
headers.add(HttpHeaders.EXPIRES, "0");
byte[] csvBytes = null;
byte[] headerBytes = null;
byte[] lineBytes = null;
CsvMapper mapper = new CsvMapper();
Collection<User> users = getUsers();
headerBytes = DownloadType.USER.getHeaderLine().getBytes("UTF-8");
lineBytes = mapper.writer(DownloadType.USER.getCsvSchema()).writeValueAsBytes(users);
if (headerBytes != null && lineBytes != null)
{
csvBytes = new byte[headerBytes.length + lineBytes.length];
System.arraycopy(headerBytes, 0, csvBytes, 0, headerBytes.length);
System.arraycopy(lineBytes, 0, csvBytes, headerBytes.length, lineBytes.length);
}
response = new ResponseEntity<>(csvBytes, headers, HttpStatus.OK);
}
catch (Exception e)
{
LOG.error(e.getMessage(), e);
}
return response;
}
Maybe try something like this. By writing the data directly to the servlet response the string will get returned directly back to the client as is without formatting or post-processing.
@RequestMapping(value = "/csv",
produces = {"text/csv"},
consumes = {"application/json"},
method = RequestMethod.POST)
public void exportCsv(HttpServletResponse response)
{
...
String headerString = DownloadType.USER.getHeaderLine();
String data = mapper.writer(DownloadType.USER.getCsvSchema()).writeValueAsString(users);
response.setContentType("text/plain; charset=utf-8");
response.getWriter().print(headerString);
response.getWriter().print(data);
Adapted from:
How to Return CSV Data in Browser From Spring Controller

Using ByteArrayLengthHeaderSerializer with java.net.Socket client

I am trying to connect a simple java.net.Socket TCP client to a Spring Integration TCP server using a ByteArrayLengthHeaderSerializer.
Here is the relevant client code snippet:
Socket socket = null;
OutputStream out = null;
InputStream in = null;
try {
socket = new Socket(host, port);
out = new BufferedOutputStream(socket.getOutputStream());
in = new BufferedInputStream(socket.getInputStream());
byte[] lengthBuf = new byte[4];
int size = in.read(lengthBuf, 0, lengthBuf.length);
ByteArrayInputStream bis = new ByteArrayInputStream(lengthBuf);
ObjectInputStream ois = new ObjectInputStream(bis);
int dataSize = ois.readInt();
System.out.println("Got size: " + dataSize);
byte[] dataBuffer = new byte[dataSize];
int dataRead = in.read(dataBuffer, 0, dataBuffer.length);
bis = new ByteArrayInputStream(dataBuffer);
ois = new ObjectInputStream(bis);
String data = (String) ois.readObject();
System.out.println("read " + dataRead + " bytes, got data: " + data);
Here is the server side configuration:
@MessagingGateway(defaultRequestChannel="toTcp")
public interface Gateway {
void send(@Payload byte[] data, @Header(IpHeaders.CONNECTION_ID) String connectionId);
}
@Bean
public AbstractServerConnectionFactory serverFactory() {
AbstractServerConnectionFactory connectionFactory = new TcpNetServerConnectionFactory(port);
ByteArrayLengthHeaderSerializer serDeser = new ByteArrayLengthHeaderSerializer();
connectionFactory.setSerializer(serDeser);
connectionFactory.setDeserializer(serDeser);
return connectionFactory;
}
@Bean
public TcpReceivingChannelAdapter inboundAdapter(AbstractServerConnectionFactory connectionFactory) {
TcpReceivingChannelAdapter inbound = new TcpReceivingChannelAdapter();
inbound.setConnectionFactory(connectionFactory);
inbound.setOutputChannel(fromTcp());
return inbound;
}
and the calling code:
public void send(String data, String header) throws IOException {
ByteArrayOutputStream bos = new ByteArrayOutputStream();
try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
System.out.println("data.length is " + data.length());
oos.writeObject(data);
byte[] bytes = bos.toByteArray();
gateway.send(bytes, header);
}
}
When I send the data (e.g. 'abc') I get (on the client side) this error:
invalid stream header: 0000000A
On the server side I get:
SoftEndOfStreamException: Stream closed between payloads
What am I doing wrong?
No - an ObjectOutputStream serializes Java objects (and an ObjectInputStream deserializes them). You don't need a length header deserializer for that; just use a DefaultSerializer and DefaultDeserializer (from Spring Core) and don't add the header.
You are getting the error because Java serialization writes its own stream header in front of the object data.
Or, if you are only sending a String, you can use String.getBytes() and send the length (4 bytes) followed by the byte[].
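The two points above can be sketched with plain JDK streams. Class and method names here are hypothetical, and the framing assumes the serializer's default 4-byte big-endian length header. The first helper shows what Java serialization produces for "abc"; the other two write and read a length-prefixed UTF-8 string without any ObjectInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class LengthHeaderDemo {
    // What ObjectOutputStream emits for a String: stream header plus object data.
    static byte[] javaSerialize(String s) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(s);
        }
        return bos.toByteArray();
    }

    // Writes a 4-byte big-endian length followed by the UTF-8 bytes.
    static void writeFrame(OutputStream out, String text) throws IOException {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length);
        data.write(payload);
        data.flush();
    }

    // Reads one frame; readFully blocks until the whole payload has arrived,
    // unlike a single InputStream.read call, which may return fewer bytes.
    static String readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();
        byte[] payload = new byte[length];
        data.readFully(payload);
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] serialized = javaSerialize("abc");
        // 10 bytes, starting with the serialization magic AC ED 00 05.
        System.out.printf("%d bytes, starts with %02X %02X%n",
                serialized.length, serialized[0] & 0xFF, serialized[1] & 0xFF);

        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeFrame(wire, "abc");
        System.out.println(readFrame(new ByteArrayInputStream(wire.toByteArray())));
    }
}
```

The serialized form of "abc" is 10 bytes long, so the length header on the wire is 0000000A. In the client from the question, those four header bytes are handed to an ObjectInputStream, which expects the magic ACED0005 and therefore reports "invalid stream header: 0000000A". Reading the length with readInt avoids this.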

HttpsURLConnection for rest service call

I am using HttpsURLConnection to call a POST method, and I am unable to add a request payload to the call. The same request works when I execute it manually from the SoapUI tool. Does anyone know how to do this?
Here is the Java code I am using:
// Keystore
String jksFile = "/Users/......jks";
String password = "password";
System.setProperty("javax.net.ssl.keyStore", jksFile);
System.setProperty("javax.net.ssl.keyStorePassword", password);
String u = "https://example.com:443/private/loan-application";
URL url = new URL(u);
String version = "3";
HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
conn.setDoOutput(true);
conn.setRequestMethod("POST");
conn.setUseCaches(false);
conn.setConnectTimeout(10000);
conn.setReadTimeout(10000);
conn.setRequestProperty("Accept", "application/json;v=" + version);
conn.setRequestProperty("Api-Key", "XYZ");
String jsonParamString = "{\"didNumber\":\"DI457\", \" .... ";
//Create JSONObject here
JSONObject jsonParam = new JSONObject();
byte[] postData = jsonParamString.toString().getBytes();
Map<String, List<String>> requestProperties = conn.getRequestProperties();
displayHeaders(requestProperties);
OutputStream out = conn.getOutputStream();
out.write(postData);
out.close();
int respCode = conn.getResponseCode();
System.out.println("Response headers:");
displayHeaders(
conn.getHeaderFields());
InputStream is = null;
if (respCode == 200 || respCode == 203) {
is = conn.getInputStream();
}
if (is == null) {
System.out.println("Using error stream");
is = conn.getErrorStream();
}
is.close();
One possible explanation for why your HttpsURLConnection call is failing is that you have not configured your SSL settings properly. Specifically, the domain example.com is presenting a certificate which is not in your trust store, and therefore your program is failing during the SSL handshake. Try adding the following code, which will configure your program to trust all certificates:
SSLContext sslContext = SSLContext.getInstance("TLS");
TrustManager tm = new X509TrustManager() {
public void checkClientTrusted(X509Certificate[] chain, String authType) throws CertificateException { }
public void checkServerTrusted(X509Certificate[] chain, String authType) throws CertificateException { }
public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; } // the interface expects a non-null array
};
sslContext.init(null, new TrustManager[]{tm}, null);
conn.setSSLSocketFactory(sslContext.getSocketFactory());
Note that you should probably not use this code in production, because it accepts any site as secure. In production, you should configure your program to only accept trusted certificates.
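For the production case, a rough sketch of the alternative is to build the SSLContext from a trust store rather than from a trust-all manager. `contextTrusting` is a hypothetical helper name. The KeyStore would normally be loaded from your own .jks file with `KeyStore.load`; that step is left out here because the path and password are specific to your setup.

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

class TrustStoreContext {
    // Builds a TLS context whose trust managers accept only certificates
    // that chain to an entry in the given trust store.
    // Passing null selects the JDK's default trust store.
    static SSLContext contextTrusting(KeyStore trustStore) throws GeneralSecurityException {
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);

        SSLContext context = SSLContext.getInstance("TLS");
        // No client key managers and the default SecureRandom in this sketch.
        context.init(null, tmf.getTrustManagers(), null);
        return context;
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SSLContext context = contextTrusting(null);
        System.out.println(context.getProtocol());
    }
}
```

The resulting context would be used the same way as in the snippet above, via `conn.setSSLSocketFactory(context.getSocketFactory())`.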

jaxrs.ResteasyClient download file

I am requesting a web API that returns a ZIP file. Now my question:
How can I download that file with jaxrs.ResteasyClient? Here is what I have, but it does not work for me.
// In
Reader reader = client.target(url).request().get().readEntity(Reader.class);
BufferedReader bufferedReader = new BufferedReader(reader);
// Out
File out = new File("C:\\tmp\\test.zip");
FileWriter fileWriter = new FileWriter(out);
BufferedWriter bufferedWriter = new BufferedWriter(fileWriter);
String s = null;
while (true) {
s = bufferedReader.readLine();
if (s == null) {
break;
}
bufferedWriter.write(s);
}
bufferedWriter.flush();
bufferedWriter.close();
bufferedReader.close();
I have no idea if this makes sense, but I do not find any good documentation about reading a file instead of a bean object.
Resolved this issue by not using jaxrs.ResteasyClient. Apache HTTPClient was my friend!
private void getFileByURL(String url, String target) throws URISyntaxException, IOException {
// HttpClients.custom() returns the builder; build() returns the client
CloseableHttpClient client = HttpClients.custom().build();
HttpResponse response = client.execute(new HttpGet(new URI(url)));
BufferedInputStream bufferedInputStream = new BufferedInputStream(response.getEntity().getContent());
BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(new FileOutputStream(new File(target)));
int inByte;
while ((inByte = bufferedInputStream.read()) != -1) {
bufferedOutputStream.write(inByte);
}
bufferedInputStream.close();
bufferedOutputStream.close();
EntityUtils.consume(response.getEntity());
}
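Whichever client is used, the key point is that a ZIP file is binary, so it has to be copied as bytes rather than through a Reader and Writer, which decode and re-encode the data and drop line terminators. A minimal JDK-only sketch of the copy step follows. `saveTo` is a hypothetical helper, and the response body is assumed to be available as an InputStream (in the code above, `response.getEntity().getContent()`).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

class BinaryDownload {
    // Copies the stream to the target file byte for byte and closes the stream.
    // Returns the number of bytes written.
    static long saveTo(InputStream body, Path target) throws IOException {
        try (InputStream in = body) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a response body: ZIP magic bytes plus some non-text bytes.
        byte[] fake = {0x50, 0x4B, 0x03, 0x04, (byte) 0xFF, 0x00, (byte) 0x80};
        Path target = Files.createTempFile("download", ".zip");
        try {
            long written = saveTo(new ByteArrayInputStream(fake), target);
            boolean intact = Arrays.equals(fake, Files.readAllBytes(target));
            System.out.println(written + " bytes written, intact: " + intact);
        } finally {
            Files.deleteIfExists(target);
        }
    }
}
```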

Session cookies - CookieContainer on stack rather than heap causing issue

I've had a look here
C# WebRequest using Cookies
Multiple WebRequest in same session
Reuse Connection with HttpWebRequest in C#
C# keep session id over httpwebrequest
And that's what I'm doing except I wish to store my CookieContainer as a member (named session_cookie) in my class called connector. My problem is that if I use a temporary object in my code then the cookies work fine:
CookieContainer t = new CookieContainer();
HTTPReq = (HttpWebRequest)WebRequest.Create(scriptURL);
HTTPReq.CookieContainer = t;
But if I use
HTTPReq = (HttpWebRequest)WebRequest.Create(scriptURL);
HTTPReq.CookieContainer = session_cookie;
Then it doesn't work! I cannot figure out why
Here is the connector class code:
public class Connector
{
public CookieContainer session_cookie;
private string session_id;
private HttpWebRequest HTTPReq;
private HttpWebResponse Response;
//Session oriented connection
public string serverRequest(string scriptURL, string payLoad)
{
try
{
HTTPReq = (HttpWebRequest)WebRequest.Create(scriptURL);
HTTPReq.CookieContainer = session_cookie;
HTTPReq.Method = "POST";
//Data arguments
byte[] byteArray = Encoding.UTF8.GetBytes(payLoad);
HTTPReq.ContentType = "application/x-www-form-urlencoded";
HTTPReq.ContentLength = byteArray.Length;
//Get the stream to write into
Stream dataStream = HTTPReq.GetRequestStream();
dataStream.Write(byteArray, 0, byteArray.Length);
dataStream.Close();
Response = (HttpWebResponse)HTTPReq.GetResponse();
Encoding enc = Encoding.GetEncoding(1252); // Western latin alphabet (windows default)
//Get the response from the server
StreamReader ResponseStream = new StreamReader(Response.GetResponseStream(), enc);
string response = ResponseStream.ReadToEnd().Trim();
Response.Close();
ResponseStream.Close();
return response;
}
catch (WebException ex)
{
Console.WriteLine(ex.ToString());
return null;
}
}
}
Any ideas?
You are missing a constructor for your Connector class that initializes the session_cookie field. Without it the field is null, so the request has no CookieContainer and no cookies are stored. You should also use a CookieCollection for capturing the cookies from the response and sending them with the next request. Something like this should work:
public class Connector
{
public CookieContainer session_cookie;
private CookieCollection cookies;
private string session_id;
private HttpWebRequest HTTPReq;
private HttpWebResponse Response;
public Connector()
{
session_cookie = new CookieContainer();
cookies = new CookieCollection();
}
//Session oriented connection
public string serverRequest(string scriptURL, string payLoad)
{
try
{
HTTPReq = (HttpWebRequest)WebRequest.Create(scriptURL);
HTTPReq.CookieContainer = session_cookie;
HTTPReq.CookieContainer.Add(cookies);
HTTPReq.Method = "POST";
//Data arguments
byte[] byteArray = Encoding.UTF8.GetBytes(payLoad);
HTTPReq.ContentType = "application/x-www-form-urlencoded";
HTTPReq.ContentLength = byteArray.Length;
//Get the stream to write into
Stream dataStream = HTTPReq.GetRequestStream();
dataStream.Write(byteArray, 0, byteArray.Length);
dataStream.Close();
Response = (HttpWebResponse)HTTPReq.GetResponse();
cookies = Response.Cookies; // capture cookies from response for next request
Encoding enc = Encoding.GetEncoding(1252); // Western latin alphabet (windows default)
//Get the response from the server
StreamReader ResponseStream = new StreamReader(Response.GetResponseStream(), enc);
string response = ResponseStream.ReadToEnd().Trim();
Response.Close();
ResponseStream.Close();
return response;
}
catch (WebException ex)
{
Console.WriteLine(ex.ToString());
return null;
}
}
}
