How to sum certain values from Grafana-Loki logs?

I have these log lines:
Successfully encrypted 189322 bytes for upload req_id=MediaUpload
Successfully encrypted 189322 bytes for upload req_id=MediaUpload
Successfully encrypted 492346 bytes for upload req_id=MediaUpload
Is there a way to sum the bytes from the log lines matching a query? For example, from the logs above I would like to get a summed value of 870990 bytes, or 0.87099 MB.
Is that possible?

Sure you can. Check this out.
I've used the pattern parser to extract the bytes as a number out of your log lines.
Then you can run a range query on top of that:
E.g.:
sum by (app) (
  sum_over_time(
    {app="your-app"}
      | pattern `Successfully encrypted <byte_size> bytes for upload req_id=<_>`
      | unwrap byte_size
      | __error__=""
    [$__interval]
  )
)
You can change $__interval based on your needs.
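If you want the result in megabytes rather than bytes (the 0.87099 MB from your example), you could divide the summed series by a scalar. A minimal variation of the query above, assuming the same label and pattern:
sum by (app) (
  sum_over_time(
    {app="your-app"}
      | pattern `Successfully encrypted <byte_size> bytes for upload req_id=<_>`
      | unwrap byte_size
      | __error__=""
    [$__interval]
  )
) / 1000000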

Related

Nifi Group Content by Given Attributes

I am trying to run a script or a custom processor to group data by given attributes every hour. Queue size is up to 30-40k on a single run and it might go up to 200k depending on the case.
MergeContent does not fit since there is no limit on min-max counts.
RouteOnAttribute does not fit since there are too many combinations.
Solution 1: Consume all flow files, group them by attributes, create the new flow files, and push them. Not ideal, but I gave it a try.
While running this I had 33k flow files waiting in the queue.
session.getQueueSize().getObjectCount()
This call returns 10k every time, even though I increased the queue threshold numbers on the output flows.
Solution 2: A better approach is to consume one flow file and then filter the queued flow files matching the provided attributes:
final List<FlowFile> flowFiles = session.get(file -> {
    // Accept only flow files whose key attributes join to the given correlation id
    if (correlationId.equals(Arrays.stream(keys).map(file::getAttribute).collect(Collectors.joining(":"))))
        return FlowFileFilter.FlowFileFilterResult.ACCEPT_AND_CONTINUE;
    return FlowFileFilter.FlowFileFilterResult.REJECT_AND_CONTINUE;
});
Again, with 33k waiting in the queue I was expecting around 200 new grouped flow files, but 320 were created. It looks like the same issue as above: the filter does not scan all waiting flow files.
Problems/Questions:
Is there a parameter to change so this getObjectCount can go up to 300k?
Is there a way to filter all waiting flow files again, by changing a parameter or by changing the processor?
I tried setting the default queue threshold to 300k in nifi.properties, but it didn't help.
In nifi.properties there is a parameter that affects this batching behavior:
nifi.queue.swap.threshold=20000
Here is my test flow:
1. GenerateFlowFile with "batch size = 50K"
2. ExecuteGroovyScript with the script below
3. LogAttribute (disabled) - just to have a queue after the Groovy script
Groovy script:
def ffList = session.get(100000) // get batch with maximum 100K files from incoming queue
if(!ffList)return
def ff = session.create() // create new empty file
ff.batch_size = ffList.size() // set attribute to real batch size
session.remove(ffList) // drop all incoming batch files
REL_SUCCESS << ff // transfer new file to success
With the parameters above, 4 files are generated in the output:
1. batch_size = 20000
2. batch_size = 10000
3. batch_size = 10000
4. batch_size = 10000
According to the documentation:
There is also the notion of "swapping" FlowFiles. This occurs when the number of FlowFiles in a connection queue exceeds the value set in the nifi.queue.swap.threshold property. The FlowFiles with the lowest priority in the connection queue are serialized and written to disk in a "swap file" in batches of 10,000.
This explains that, out of 50K incoming files, 20K are kept in memory and the rest are swapped to disk in batches of 10K.
I don't know how increasing the nifi.queue.swap.threshold property will affect your system performance and memory consumption, but I set it to 100K on my local NiFi 1.16.3 and it looks good with many small files; with this, the first batch increased to 100K.
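For reference, the change is a single line in nifi.properties (100000 is just the value from my test; treat it as a starting point and tune it to the queue depth you expect):
# Illustrative value; raise or lower to match how many flow files you expect to queue
nifi.queue.swap.threshold=100000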

BEP20 token not showing the total supply I entered in the code on BscScan

I have deployed a BEP20 token. I followed the steps shown in this tutorial https://docs.binance.org/smart-chain/developer/issue-BEP20.html
and I entered total supply = 60000000000, but after verifying, the total supply shown is not what I entered. Can anyone help me fix the total supply? The contract address is 0xE2cFe49999e3a133EaFE13388Eb47BCd223f5c5E
Your token uses 18 decimal places, which means that the value 60000000000 hardcoded on line 359 of your contract represents 0.00000006 of the token. The BSCScan token tracker shows a total supply of 0 AAG simply because it rounds to a predefined number of decimals.
If you want a total supply of 60 billion, you need to add 18 zeros after this number to account for the decimals.
_totalSupply = 60000000000 * 1e18;
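As a quick sanity check (assuming decimals() returns 18, as in the tutorial template), a token tracker divides the raw integer by 10^decimals before displaying it:
displayed supply = _totalSupply / 10^18
                 = 60000000000 / 10^18            = 0.00000006     (current code)
                 = (60000000000 * 10^18) / 10^18  = 60000000000    (with the fix)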

How to write data in real time to HDFS using Flume?

I am using Flume to store sensor data in HDFS. The data is received through MQTT, and the subscriber posts it in JSON format to the Flume HTTP listener. It is currently working fine, but the problem is that Flume does not write to the HDFS file until I stop it (or until the size of the file reaches 128 MB). I am using Hive to apply a schema on read. Unfortunately, the resulting Hive table contains only 1 entry. This is expected, because Flume did not write the newly arriving data to the file (which is what Hive loads).
Is there any way to force Flume to write newly arriving data to HDFS in near real time, so that I don't need to restart it or use small files?
Here is my Flume configuration:
# Name the components on this agent
emsFlumeAgent.sources = http_emsFlumeAgent
emsFlumeAgent.sinks = hdfs_sink
emsFlumeAgent.channels = channel_hdfs
# Describe/configure the source
emsFlumeAgent.sources.http_emsFlumeAgent.type = http
emsFlumeAgent.sources.http_emsFlumeAgent.bind = localhost
emsFlumeAgent.sources.http_emsFlumeAgent.port = 41414
# Describe the sink
emsFlumeAgent.sinks.hdfs_sink.type = hdfs
emsFlumeAgent.sinks.hdfs_sink.hdfs.path = hdfs://localhost:9000/EMS/%{sensor}
emsFlumeAgent.sinks.hdfs_sink.hdfs.rollInterval = 0
emsFlumeAgent.sinks.hdfs_sink.hdfs.rollSize = 134217728
emsFlumeAgent.sinks.hdfs_sink.hdfs.rollCount=0
#emsFlumeAgent.sinks.hdfs_sink.hdfs.idleTimeout=20
# Use a channel which buffers events in memory
emsFlumeAgent.channels.channel_hdfs.type = memory
emsFlumeAgent.channels.channel_hdfs.capacity = 10000
emsFlumeAgent.channels.channel_hdfs.transactionCapacity = 100
# Bind the source and sinks to the channel
emsFlumeAgent.sources.http_emsFlumeAgent.channels = channel_hdfs
emsFlumeAgent.sinks.hdfs_sink.channel = channel_hdfs
I think the tricky bit here is that you would like to write data to HDFS in near real time but don't want small files either (for obvious reasons), and this can be a difficult thing to achieve.
You'll need to find the optimal balance between the following two parameters:
hdfs.rollSize (Default = 1024) - File size to trigger roll, in bytes (0: never roll based on file size)
and
hdfs.batchSize (Default = 100) - Number of events written to file before it is flushed to HDFS
If your data is not likely to reach 128 MB in the preferred time window, then you may need to reduce the rollSize, but only to an extent that you don't run into the small-files problem.
Since you have not set any batch size in your HDFS sink, you should see the results of an HDFS flush after every 100 events; once the flushed records jointly reach 128 MB, the contents are rolled into a 128 MB file. Is this not happening either? Could you please confirm?
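As a rough sketch (the values below are assumptions, not recommendations; tune them to your ingest rate), you could add a time-based roll and a smaller batch size on top of your existing sink configuration:
# Roll a new file every 10 minutes even if it has not reached 128 MB (illustrative value)
emsFlumeAgent.sinks.hdfs_sink.hdfs.rollInterval = 600
# Keep the 128 MB size-based roll as an upper bound
emsFlumeAgent.sinks.hdfs_sink.hdfs.rollSize = 134217728
# Flush to HDFS after every 10 events so new data becomes visible sooner (illustrative value)
emsFlumeAgent.sinks.hdfs_sink.hdfs.batchSize = 10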
Hope this helps!

Named pipe performance

I have an application, where I send approx. 125 data items via a named pipe.
Each data item consists of data block 1 with max. 300 characters and data block 2 with max. 600 characters.
This gives 125 data items * (300 + 600) characters * 2 bytes per character = 125 * 900 * 2 = 225000 bytes.
Each data item is surrounded by curly braces like {Message1}{Message2}.
I noticed that if I send the messages, there are sending/receiving problems. Instead of {Message1}{Message2} the receiving application gets {Messa{Message2}.
Then I changed the sending code so that the messages are sent at 500 ms intervals, and the problem disappeared.
If I do everything correctly (no bugs on my side, no misconfiguration of the named pipes), how much time is required to send 225000 bytes over a named pipe from a Delphi 2009 application to a .NET application on the same machine?
What is a reasonable time for sending data of that size?

mail body size limit in utl_smtp

I have a procedure to send mail using utl_smtp.
What is the maximum mail body size that I can send, and if my mail body exceeds this limit, how can I send it?
Just send it in chunks:
-- Stream the CLOB body in 1900-character chunks
l_offset := 1;
l_amount := 1900;
utl_smtp.open_data(l_connection);
while l_offset <= dbms_lob.getlength(l_body_html) loop
  utl_smtp.write_data(l_connection,
                      dbms_lob.substr(l_body_html, l_amount, l_offset));
  l_offset := l_offset + l_amount;
  -- shrink the final chunk so we never read past the end of the LOB
  l_amount := least(1900, dbms_lob.getlength(l_body_html) - l_offset + 1);
end loop;
utl_smtp.close_data(l_connection);
From Oracle Documentation:
Rules and Limits: No limitation or range-checking is imposed by the API. However, you should be aware of the following size limitations on various elements of SMTP. Sending data that exceeds these limits may result in errors returned by the server.
Table 178-5 SMTP Size Limitation
Element | Size Limitation
user | The maximum total length of a user name is 64 characters.
domain | The maximum total length of a domain name or number is 64 characters.
path | The maximum total length of a reverse-path or forward-path is 256 characters (including the punctuation and element separators).
command line | The maximum total length of a command line including the command word and the <CRLF> is 512 characters.
reply line | The maximum total length of a reply line including the reply code and the <CRLF> is 512 characters.
text line | The maximum total length of a text line including the <CRLF> is 1000 characters (but not counting the leading dot duplicated for transparency).
recipients buffer | The maximum total number of recipients that must be buffered is 100 recipients.
Anyway, I think if your email body is too big, the destination will reject it...
UPDATE
Anyway, if you are sending such big data over email, something is wrong. You should use another solution: a client that reads the data from the database and presents it to the user in a friendly format.
There is Oracle Discoverer, or you can develop an application with Java or PHP... there are many options.
