I have a question about calculating file hashes (e.g. MD5, SHA-1, SHA-256) while updating the progress of a ProgressBar from an asynchronous process.
Thanks to Alejandro Duarte, who shows a very practical example:
https://github.com/newUserRepo/testbar/blob/issueBar/vaadin-upload/src/main/java/com/example/vaadinupload/ProcessingService.java#L44
The only way I can get it to work is to sleep for 4 milliseconds in the method at line 75 while processing the task; the bar is then updated, but it is too slow.
If I do not sleep the thread, the application does not push correctly, and the changes are not reflected on the client.
Another approach that actually worked was to implement the Runnable interface and execute the heavy task in the run() method:
@Override
public void run() {
    calcularHash();
}

public void calcularHash() {
    System.out.println("Path tmp archivo: " + tmpPath.toFile().getAbsolutePath());
    for (int f = 0; f < hashType.size(); f++) {
        try (InputStream bis = new BufferedInputStream(Files.newInputStream(tmpPath))) {
            t.initTime();
            byte[] buffer = new byte[1024];
            MessageDigest messageDigest = MessageDigest.getInstance(hashType.get(f));
            int dataRead = 0;
            long largo = tmpPath.toFile().length();
            Long acum = 0L;
            while ((dataRead = bis.read(buffer)) != -1) {
                messageDigest.update(buffer, 0, dataRead);
                acum += dataRead;
                Float per = ((float) acum / largo);
                bar.setValue(per);
                System.out.println(per * 100);
                //textFieldPercent.setValue(Types.formatPercentaje(per * 100));
            }
            final byte[] bytesDigest = messageDigest.digest();
            final StringBuilder sb = new StringBuilder();
            for (int c = 0; c < bytesDigest.length; c++) {
                sb.append(Integer.toString((bytesDigest[c] & 0xFF) + 0x100, 16).substring(1));
            }
            final String hashObtenido = sb.toString();
            t.finishTime();
            final String totalTime = t.getFinalTimeSec() + "seg " + t.getFinalTimeMs() + "ms";
            final String large = Types.getLargeFileFormat(largo);
            System.out.println(hashObtenido);
            ui.access(() -> {
                checksumTransactions.initData(messageDigest.getAlgorithm(), sb.toString(), large, totalTime);
            });
            //Files.delete(tmpPath); // FIXME: delete from the grid or UI
        } catch (IOException e) {
            e.printStackTrace();
        } catch (NoSuchAlgorithmException e) {
            e.printStackTrace();
        }
    }
}
In the code shown I use a plain public void run() and the ProgressBar is updated correctly,
but the application may have memory leaks this way and it is not good practice. Ideally I would like to know the proper way to run that background thread.
I still do not know the best way to achieve this :$
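One possible direction, as a minimal sketch and not a confirmed fix: do the hashing on an executor thread and report progress only when the whole-number percentage changes, so the UI receives at most about 100 updates per file instead of one per buffer read. The class name HashWithProgress, the methods hash and hashAsync, the 64 KiB buffer, and the DoubleConsumer callback are all illustrative assumptions and do not come from the linked repository.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;
import java.util.function.DoubleConsumer;

// Hypothetical helper, not part of Vaadin or of the linked project.
final class HashWithProgress {

    /** Hashes the file; calls onProgress (0.0 to 1.0) only when the integer percent changes. */
    static String hash(Path file, String algorithm, DoubleConsumer onProgress)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        long total = Files.size(file);
        long done = 0;
        int lastPercent = -1;
        byte[] buffer = new byte[64 * 1024];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
                done += read;
                int percent = total == 0 ? 100 : (int) (done * 100 / total);
                if (percent != lastPercent) { // throttle: skip updates that change nothing visible
                    lastPercent = percent;
                    onProgress.accept(percent / 100.0);
                }
            }
        }
        if (lastPercent < 100) {
            onProgress.accept(1.0); // e.g. an empty file never enters the loop
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    /** Runs hash(...) on the given executor so the caller's thread is not blocked. */
    static CompletableFuture<String> hashAsync(Path file, String algorithm,
                                               DoubleConsumer onProgress, Executor executor) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return hash(file, algorithm, onProgress);
            } catch (IOException | NoSuchAlgorithmException e) {
                throw new CompletionException(e);
            }
        }, executor);
    }
}
```

In the Vaadin code above, the callback would presumably wrap the bar update in ui.access, for example p -> ui.access(() -> bar.setValue((float) p)), and the executor would be one that the application shuts down itself instead of a raw thread per upload. Both points are assumptions about the surrounding application.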
Community, could you please help me understand why ~3% of my messages don't end up in HDFS? I wrote a simple producer in Java to generate 10 million messages.
public static final String TEST_SCHEMA = "{"
+ "\"type\":\"record\","
+ "\"name\":\"myrecord\","
+ "\"fields\":["
+ " { \"name\":\"str1\", \"type\":\"string\" },"
+ " { \"name\":\"str2\", \"type\":\"string\" },"
+ " { \"name\":\"int1\", \"type\":\"int\" }"
+ "]}";
public KafkaProducerWrapper(String topic) throws UnknownHostException {
// store topic name
this.topic = topic;
// initialize kafka producer
Properties config = new Properties();
config.put("client.id", InetAddress.getLocalHost().getHostName());
config.put("bootstrap.servers", "myserver-1:9092");
config.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
config.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
config.put("schema.registry.url", "http://myserver-1:8089");
config.put("acks", "all");
producer = new KafkaProducer(config);
// parse schema
Schema.Parser parser = new Schema.Parser();
schema = parser.parse(TEST_SCHEMA);
}
public void send() {
// generate key
int key = (int) (Math.random() * 20);
// generate record
GenericData.Record r = new GenericData.Record(schema);
r.put("str1", "text" + key);
r.put("str2", "text2" + key);
r.put("int1", key);
final ProducerRecord<String, GenericRecord> record = new ProducerRecord<>(topic, "K" + key, (GenericRecord) r);
producer.send(record, new Callback() {
public void onCompletion(RecordMetadata metadata, Exception e) {
if (e != null) {
logger.error("Send failed for record {}", record, e);
messageErrorCounter++;
return;
}
logger.debug("Send succeeded for record {}", record);
messageCounter++;
}
});
}
public String getStats() { return "Messages sent: " + messageCounter + "/" + messageErrorCounter; }
public long getMessageCounter() {
return messageCounter + messageErrorCounter;
}
public void close() {
producer.close();
}
public static void main(String[] args) throws InterruptedException, UnknownHostException {
// initialize kafka producer
KafkaProducerWrapper kafkaProducerWrapper = new KafkaProducerWrapper("my-test-topic");
long max = 10000000L;
for (long i = 0; i < max; i++) {
kafkaProducerWrapper.send();
}
logger.info("producer-demo sent all messages");
while (kafkaProducerWrapper.getMessageCounter() < max)
{
logger.info(kafkaProducerWrapper.getStats());
Thread.sleep(2000);
}
logger.info(kafkaProducerWrapper.getStats());
kafkaProducerWrapper.close();
}
And I use the Confluent HDFS Connector in standalone mode to write data to HDFS. The configuration is as follows:
name=hdfs-consumer-test
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=1
topics=my-test-topic
hdfs.url=hdfs://my-cluster/kafka-test
hadoop.conf.dir=/etc/hadoop/conf/
flush.size=100000
rotate.interval.ms=20000
# increase timeouts to avoid CommitFailedException
consumer.session.timeout.ms=300000
consumer.request.timeout.ms=310000
heartbeat.interval.ms= 60000
session.timeout.ms= 100000
The connector writes the data into HDFS, but after waiting for 20000 ms (due to rotate.interval.ms) not all messages are received.
scala> spark.read.avro("/kafka-test/topics/my-test-topic/partition=*/my-test-topic*")
.count()
res0: Long = 9749015
Any idea what the reason for this behavior is? Where is my mistake? I'm using Confluent 3.0.1/Kafka 0.10.0.1.
Are you seeing that only the last few messages are not moved to HDFS? If so, you are likely running into the issue described here: https://github.com/confluentinc/kafka-connect-hdfs/pull/100
Try sending one more message to the topic after rotate.interval.ms has expired to validate that this is what you are running into. If you need to rotate based on time, it's probably a good idea to upgrade to pick up the fix.
I'm wondering if there is a tool or library that can move messages between queues.
Currently, I'm doing something like this:
public static void ProcessQueueMessage([QueueTrigger("myqueue-poison")] string message, TextWriter log)
{
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(connString);
CloudQueueClient queueClient = storageAccount.CreateCloudQueueClient();
CloudQueue queue = queueClient.GetQueueReference("myqueue");
queue.CreateIfNotExists();
var messageData = JsonConvert.SerializeObject(data, new JsonSerializerSettings { ContractResolver = new CamelCasePropertyNamesContractResolver() });
queue.AddMessage(new CloudQueueMessage(messageData));
}
As of 2018-09-11, version 1.4.1 of the Microsoft Azure Storage Explorer doesn't have the ability to move messages from one Azure queue to another.
I blogged a simple solution to transfer poison messages back to the originating queue and thought it might save someone a few minutes. Obviously, you'll need to have fixed the error that caused the messages to end up in the poison message queue!
You'll need to add a NuGet package reference to Microsoft.NET.Sdk.Functions:
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;
void Main()
{
const string queuename = "MyQueueName";
string storageAccountString = "xxxxxx";
RetryPoisonMesssages(storageAccountString, queuename);
}
private static int RetryPoisonMesssages(string storageAccountString, string queuename)
{
CloudQueue targetqueue = GetCloudQueueRef(storageAccountString, queuename);
CloudQueue poisonqueue = GetCloudQueueRef(storageAccountString, queuename + "-poison");
int count = 0;
while (true)
{
var msg = poisonqueue.GetMessage();
if (msg == null)
break;
poisonqueue.DeleteMessage(msg);
targetqueue.AddMessage(msg);
count++;
}
return count;
}
private static CloudQueue GetCloudQueueRef(string storageAccountString, string queuename)
{
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(storageAccountString);
CloudQueueClient queueClient = storageAccount.CreateCloudQueueClient();
CloudQueue queue = queueClient.GetQueueReference(queuename);
return queue;
}
Azure Storage Explorer version 1.15.0 can now do this as of 2020. https://github.com/microsoft/AzureStorageExplorer/issues/1064
Essentially Azure Storage doesn't support moving messages from one queue to another. You would need to do this on your own.
One way to move messages from one queue to another is to dequeue them from the source queue (by calling GetMessages), read the contents of each message, and then create a new message in the target queue. You can do this with the Storage Client Library.
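The pattern described above can be sketched independently of any SDK. The following is a minimal illustration in Java with a hypothetical in-memory queue standing in for a storage queue. InMemoryQueue, Message, and moveAll are made-up names and are not Azure Storage APIs. The point it shows is the order of operations: create a new message in the target from the copied body, and delete from the source only after that has succeeded.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in types; none of this is the Azure SDK.
final class QueueMover {

    static final class InMemoryQueue {
        static final class Message {
            final String id;
            final String body;

            Message(String id, String body) {
                this.id = id;
                this.body = body;
            }
        }

        private final Deque<Message> messages = new ArrayDeque<>();
        private int nextId = 0;

        /** Enqueues a brand-new message with its own id. */
        void send(String body) {
            messages.addLast(new Message("m" + nextId++, body));
        }

        /** Returns the next message without removing it, or null if the queue is empty. */
        Message receive() {
            return messages.peekFirst();
        }

        /** Removes a previously received message. */
        void delete(Message message) {
            messages.remove(message);
        }

        int size() {
            return messages.size();
        }
    }

    /** Moves every message from source to target and returns how many were moved. */
    static int moveAll(InMemoryQueue source, InMemoryQueue target) {
        int moved = 0;
        InMemoryQueue.Message msg;
        while ((msg = source.receive()) != null) {
            target.send(msg.body); // new message built from the copied content
            source.delete(msg);    // delete only after the copy has succeeded
            moved++;
        }
        return moved;
    }
}
```

With this ordering, a failure while sending leaves the message in the source queue, so the worst case is a duplicate in the target and never a lost message.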
One tool that comes to mind for moving messages is Cerebrata Azure Management Studio (a paid product with a 15-day free trial). It has this functionality.
As of 2018-09-11, version 1.4.1 of the Microsoft Azure Storage Explorer doesn't support moving queue messages.
Here's an updated version of Mitch's answer, using the latest Microsoft.Azure.Storage.Queue package. Simply create a new .NET Console application, add the above-mentioned package to it, and replace the contents of Program.cs with the following:
using Microsoft.Azure.Storage;
using Microsoft.Azure.Storage.Queue;
using System.Threading.Tasks;
namespace PoisonMessageDequeuer
{
class Program
{
static async Task Main(string[] args)
{
const string queuename = "MyQueueName";
string storageAccountString = "xxx";
await RetryPoisonMesssages(storageAccountString, queuename);
}
private static async Task<int> RetryPoisonMesssages(string storageAccountString, string queuename)
{
var targetqueue = GetCloudQueueRef(storageAccountString, queuename);
var poisonqueue = GetCloudQueueRef(storageAccountString, queuename + "-poison");
var count = 0;
while (true)
{
var msg = await poisonqueue.GetMessageAsync();
if (msg == null)
break;
await poisonqueue.DeleteMessageAsync(msg);
await targetqueue.AddMessageAsync(msg);
count++;
}
return count;
}
private static CloudQueue GetCloudQueueRef(string storageAccountString, string queuename)
{
var storageAccount = CloudStorageAccount.Parse(storageAccountString);
var queueClient = storageAccount.CreateCloudQueueClient();
var queue = queueClient.GetQueueReference(queuename);
return queue;
}
}
}
It's still pretty slow if you're working with >1000 messages though, so I'd recommend looking into batch APIs for higher quantities.
Here's a Python script you may find useful. You'll need to install azure-storage-queue.
# QueueService is the client class in azure-storage-queue releases before v12
from azure.storage.queue import QueueService

queueService = QueueService(connection_string = "YOUR CONNECTION STRING")
for queue in queueService.list_queues():
    if "poison" in queue.name:
        print(queue.name)
        targetQueueName = queue.name.replace("-poison", "")
        while queueService.peek_messages(queue.name):
            for message in queueService.get_messages(queue.name, 32):
                print(".", end="", flush=True)
                queueService.put_message(targetQueueName, message.content)
                queueService.delete_message(queue.name, message.id, message.pop_receipt)
I just had to do this again and took the time to update my snippet to the new storage SDKs. See the post at https://www.bokio.se/engineering-blog/how-to-re-run-the-poison-queue-in-azure-webjobs/ for more info.
Here is the code I used:
using Azure.Storage.Queues;
using System;
using System.Threading;
using System.Threading.Tasks;
namespace AzureQueueTransfer
{
internal class Program
{
// Need Read, Update & Process (full url, can create in storage explorer)
private const string sourceQueueSAS = "";
// Need Add (full url, can create in storage explorer)
private const string targetQueueSAS = "";
private static async Task Main(string[] args)
{
var sourceQueue = new QueueClient(new Uri(sourceQueueSAS));
var targetQueue = new QueueClient(new Uri(targetQueueSAS));
var queuedAny = true;
while (queuedAny)
{
Thread.Sleep(30000); // Sleep to make sure we dont build too much backlog so we can process new messages on higher prio than old ones
queuedAny = false;
foreach (var message in sourceQueue.ReceiveMessages(maxMessages: 32).Value)
{
queuedAny = true;
var res = await targetQueue.SendMessageAsync(message.Body);
Console.WriteLine($"Transfered: {message.MessageId}");
await sourceQueue.DeleteMessageAsync(message.MessageId, message.PopReceipt);
}
Console.WriteLine($"Finished batch");
}
}
}
}
For anyone coming here looking for a Node equivalent of @MitchWheat's answer, using an Azure Function:
import AzureStorage from 'azure-storage'
import { Context, HttpRequest } from '@azure/functions'
import util from 'util'
const queueService = AzureStorage.createQueueService()
queueService.messageEncoder = new AzureStorage.QueueMessageEncoder.TextBase64QueueMessageEncoder()
const deleteMessage = util.promisify(queueService.deleteMessage).bind(queueService)
const createMessage = util.promisify(queueService.createMessage).bind(queueService)
const getMessage = util.promisify(queueService.getMessage).bind(queueService)
export async function run (context: Context, req: HttpRequest): Promise<void> {
try {
const poisonQueue = (req.query.queue || (req.body && req.body.queue));
const targetQueue = poisonQueue.split('-')[0]
let count = 0
while (true) {
const message = await getMessage(poisonQueue)
if (!message) { break; }
if (message.messageText && message.messageId && message.popReceipt) {
await createMessage(targetQueue, message.messageText)
await deleteMessage(poisonQueue, message.messageId, message.popReceipt)
}
count++
}
context.res = {
body: `Replayed ${count} messages from ${poisonQueue} on ${targetQueue}`
};
} catch (e) {
context.res = { status: 500 }
}
}
To use the function you need to provide connection information for the storage account that holds your storage queues. This is provided as environment variables: either AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_ACCESS_KEY, or AZURE_STORAGE_CONNECTION_STRING. More on this is available in the Azure Storage SDK docs.
Also wrote a few lines about it in this Medium article
Updated Python based on Jon Canning's answer:
from azure.storage.queue import QueueServiceClient

queueService = QueueServiceClient.from_connection_string(conn_str="DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>;EndpointSuffix=core.windows.net")
for queue in queueService.list_queues():
    if "poison" in queue.name:
        print(queue.name)
        targetQueueName = queue.name.replace("-poison", "")
        queue = queueService.get_queue_client(queue=queue.name)
        targetQueue = queueService.get_queue_client(queue=targetQueueName)
        while queue.peek_messages():
            messages = queue.receive_messages()
            for msg in messages:
                targetQueue.send_message(msg.content)
                queue.delete_message(msg)
As Mikael Eliasson noted, the code in IGx89's answer is broken because AddMessageAsync overwrites some info on the message, and DeleteMessageAsync then returns a 404. The better solution is to copy the values into a new message for AddMessageAsync.
Below is an enhanced version of RetryPoisonMesssages that can process only a specified list of messages (instead of everything in the queue) and can copy messages instead of moving them.
It also logs success or failure for each message.
/// <param name="storageAccountString"></param>
/// <param name="queuename"></param>
/// <param name="idsToMove">If not null, only messages with listed IDs will be moved/copied</param>
/// <param name="deleteFromPoisonQueue">if false, messages will be copied; if true, they will be moved
///Warning: if queue is big, keeping deleteFromPoisonQueue=false can cause the same row
///from poisonqueue to be copied more than once(the reason is not found yet)</param>
/// <returns></returns>
private static async Task<int> RetryPoisonMesssages(string storageAccountString, string queuename, string[] idsToMove=null, bool deleteFromPoisonQueue=false)
{
var targetqueue = GetCloudQueueRef(storageAccountString, queuename);
var poisonQueueName = queuename + "-poison";
var poisonqueue = GetCloudQueueRef(storageAccountString, poisonQueueName);
var count = 0;
while (true)
{
var msg = await poisonqueue.GetMessageAsync();
if (msg == null)
{
Console.WriteLine("No more messages in a queue " + poisonQueueName);
break;
}
string action = "";
try
{
if (idsToMove == null || idsToMove.Contains(msg.Id))
{
var msgToAdd = msg;
if (deleteFromPoisonQueue)
{
//The reason is that AddMessageAsync will overwrite some info on the message and then DeleteMessagAsync will give a 404.
//The better solution is to copy the values into a new message for AddMessageAsync
msgToAdd = new CloudQueueMessage(msg.AsBytes);
}
action = "adding";
await targetqueue.AddMessageAsync(msgToAdd);
Console.WriteLine(action + " message ID " + msg.Id);
if (deleteFromPoisonQueue)
{
action = "deleting";
await poisonqueue.DeleteMessageAsync(msg);
}
Console.WriteLine(action + " message ID " + msg.Id);
}
}
catch (Exception ex)
{
Console.WriteLine("Error encountered when "+ action + " " + ex.Message + " at message ID " + msg.Id);
}
count++;
}
return count;
}
I want to use the imapFolder.getSortedMessages(SortTerm[] term) method to fetch mail sorted by sent date.
public List<MailMessage> load(SortTerm[] term)
throws MessagingException, UnsupportedEncodingException {
List<MailMessage> mailMessages = new ArrayList<MailMessage>();
term[0] = SortTerm.DATE;
Store store = getStore();
Folder folder = store.getFolder("INBOX");
if (folder != null) {
IMAPFolder imapFolder = (IMAPFolder) folder;
imapFolder.open(Folder.READ_WRITE);
Message[] messages = imapFolder.getSortedMessages(term);
FetchProfile fp = new FetchProfile();
fp.add(UIDFolder.FetchProfileItem.UID);
fp.add(FetchProfile.Item.ENVELOPE);
folder.fetch(messages, fp);
UIDFolder uidFolder = (UIDFolder) folder;
for (Message message : messages) {
Long uid = uidFolder.getUID(message);
MimeMessage msg = (MimeMessage) message;
MailMessage mailMessage = new MailMessage(msg, uid);
mailMessages.add(mailMessage);
}
}
return mailMessages;
}
But it does not work for me: the messages are not sorted by sent date.
Any suggestions for my program?
I want to shorten my URL with bitly, but an exception occurs when I try to set the output string on my text box.
private void button1_Click(object sender, RoutedEventArgs e)
{
ShortenUrl(textBox1.Text);
}
enum Format
{
XML,
JSON,
TXT
}
enum Domain
{
BITLY,
JMP
}
void ShortenUrl(string longURL)
{
Format format = Format.XML;
Domain domain = Domain.BITLY;
string _domain;
//string output;
// Build the domain string depending on the selected domain type
if (domain == Domain.BITLY)
_domain = "bit.ly";
else
_domain = "j.mp";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(
string.Format(@"http://api.bit.ly/v3/shorten?login={0}&apiKey={1}&longUrl={2}&format={3}&domain={4}",
"username", "appkey", HttpUtility.UrlEncode(longURL), format.ToString().ToLower(), _domain));
request.BeginGetResponse(new AsyncCallback(GetResponse), request);
}
void GetResponse(IAsyncResult result)
{
XDocument doc;
HttpWebRequest request = (HttpWebRequest)result.AsyncState;
HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(result);
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
string responseString = reader.ReadToEnd();
doc = XDocument.Load(reader.BaseStream);
}
//// var x = from c in doc.Root.Element("data").Elements()
// where c.Name == "url"
// select c;
//XElement n = ((IEnumerable<XElement>)x).ElementAt(0);
// textBox2.Text = ((IEnumerable<String>)x).ElementAt(0);
lista = (from Born_rich in doc.Descendants("url")
select new a()
{
shrtenurl = Born_rich.Value
}).ToList();
output = lista.ElementAt(0).shrtenurl;
textBox2.Text = output;
//
//
// textBox2.Text = s;
}
List<a> lista = new List<a>();
String output;
}
public class a
{
public String shrtenurl { set; get; }
}
The callback from HttpWebRequest occurs on a non-UI thread. If you want to change something in the UI, you must do it on the UI thread. Fortunately there is an easy way to do this: use the Dispatcher to invoke the code in question on the UI thread.
Dispatcher.BeginInvoke(() => textBox2.Text = output);