Power Automate: read large XML file

I am trying to read a 150 MB XML file stored in SharePoint. I get an error while reading the file because the message size limit is 100 MB, and the Get File Content action does not support chunking.
BadRequest. Http request failed as there is an error: 'Cannot write more bytes to the buffer than the configured maximum buffer size: 104857600.'.
Is there a way to read a large file, or to read it partially (by size, or only a few XML nodes) and process the content iteratively using Power Automate?
Thank you!

Related

Error while extracting CSV from Excel file (Apache NiFi)

I'm using Apache NiFi 1.16.3 and trying to extract a .csv from an Excel file (.xlsx) with ConvertExcelToCSVProcessor. The .xlsx is 17 MB, but I can't share it here.
I receive an error:
org.apache.nifi.processors.poi.ConvertExcelToCSVProcessor.error
Tried to allocate an array of length 101,695,141, but the maximum length for this record type is 100,000,000. If the file is not corrupt or large, please open an issue on bugzilla to request increasing the maximum allowable size for this record type. As a temporary workaround, consider setting a higher override value with IOUtils.setByteArrayMaxOverride()
What can I do with it? It says
consider setting a higher override value with IOUtils.setByteArrayMaxOverride()
as a temporary workaround, but where can I find this option? Do I need to write a custom processor that sets it, or is there another way?

Azure Databricks - Receiving error "Zip bomb detected! The file would exceed the max. ratio of compressed file size to the size of the expanded data"

I have been through many links trying to solve this problem, but none have helped, primarily because I am facing this error on Azure Databricks.
I am trying to read Excel files located in the ADLS curated zone. There are about 25 of them. My program loops through the Excel files and reads them into a PySpark DataFrame. However, after reading about 9 files, I receive the error below:
Py4JJavaError: An error occurred while calling o1481.load.
: java.io.IOException: Zip bomb detected! The file would exceed the max. ratio of compressed file size to the size of the expanded data.
This may indicate that the file is used to inflate memory usage and thus could pose a security risk.
You can adjust this limit via ZipSecureFile.setMinInflateRatio() if you need to work with files which exceed this limit.
Uncompressed size: 6111064, Raw/compressed size: 61100, ratio: 0.009998
I installed the Maven package org.apache.poi.openxml4j, but when I try to use it with the following simple import statement, I receive the error "No module named 'org'":
import org.apache.poi.openxml4j.util.ZipSecureFile
Any ideas anyone about how to set the ZipSecureFile.setMinInflateRatio() to 0 in Azure Databricks?
Best regards,
Sree
The "Zip bomb detected" exception will occur if the expanded file crosses the default MinInflateRatio set in the Apache jar. Apache includes a setting called MinInflateRatio which is configurable via ZipSecureFile.setMinInflateRatio() ; this will now be set to 0.0 by default to allow large files.
Checkout known issue in POI: https://bz.apache.org/bugzilla/show_bug.cgi?id=58499
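For reference, a common way to apply this from PySpark on Databricks is to call the static POI method through the driver JVM via py4j. This is a minimal sketch, assuming the Apache POI / spark-excel libraries are installed on the cluster, that the ratio check is hit in the driver JVM, and that the usual Databricks-provided spark session is available; the data source name and file path are placeholders.

```python
# Sketch: relax POI's zip-bomb ratio check before reading the Excel files.
# Assumes org.apache.poi is on the cluster classpath (e.g. via the POI /
# spark-excel Maven libraries) and that the check runs in the driver JVM.
zip_secure_file = spark.sparkContext._jvm.org.apache.poi.openxml4j.util.ZipSecureFile
zip_secure_file.setMinInflateRatio(0.0)  # 0.0 disables the compression-ratio check

# Then read the Excel file as before; "com.crealytics.spark.excel" is an
# assumed data source and the ADLS path below is a placeholder.
df = (spark.read.format("com.crealytics.spark.excel")
      .option("header", "true")
      .load("abfss://curated@<storage-account>.dfs.core.windows.net/path/to/file.xlsx"))
```

Note that if the Excel parsing happens on the executors rather than the driver, the setting may need to be applied there as well.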

Handling large files with Azure search blob extractor

I'm receiving errors from the blob extractor that files are too large for the current tier, which is Basic. I will be upgrading to a higher tier, but I notice that the max size there is currently 256 MB.
When I have PPTX files that are mostly video and audio, but have text I'm interested in, is there a way to index those? What does the blob extractor max file size actually mean?
Can I tell the extractor to only take the first X MB or chars and just stop?
There are two related limits in the blob indexer:
Max file size limit, which is the one you are hitting. If the file size exceeds that limit, the indexer doesn't attempt to download it and produces an error to make sure you are aware of the issue. The reason we don't just take the first N bytes is that, for many formats, the entire file is needed to parse it correctly. You can mark blobs as skippable or configure the indexer to ignore a number of errors (a configuration sketch follows below) if you want it to make forward progress when encountering blobs that are too large.
Max size of extracted text. If a file contains more text than that, the indexer takes N characters up to the limit and includes a warning so you can be aware of the issue. Content that doesn't get extracted (such as video, at least today) doesn't contribute to this limit, of course.
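To make the error-tolerance options concrete, here is a rough sketch of updating a blob indexer definition through the Azure Search REST API from Python. The service, indexer, data source, and index names and the admin key are placeholders, and the api-version and the specific parameters (maxFailedItems, indexStorageMetadataOnlyForOversizedDocuments, failOnUnprocessableDocument) should be verified against the current documentation.

```python
import requests

# Sketch: update a blob indexer so oversized or unprocessable blobs don't stop it.
# All names, the key, and the api-version below are placeholders/assumptions.
service = "my-search-service"
url = f"https://{service}.search.windows.net/indexers/blob-indexer?api-version=2020-06-30"

indexer_definition = {
    "name": "blob-indexer",
    "dataSourceName": "blob-datasource",
    "targetIndexName": "docs-index",
    "parameters": {
        "maxFailedItems": -1,           # keep going regardless of per-document errors
        "maxFailedItemsPerBatch": -1,
        "configuration": {
            # index only metadata for blobs above the size limit instead of failing
            "indexStorageMetadataOnlyForOversizedDocuments": True,
            "failOnUnprocessableDocument": False
        }
    }
}

resp = requests.put(url, json=indexer_definition,
                    headers={"api-key": "<admin-key>", "Content-Type": "application/json"})
resp.raise_for_status()
```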
How large are the PPTX you need indexed? I'll add my contact info in a comment.

File uploading using chunking: JMeter

Can anyone please let me know whether JMeter supports performance testing of file uploads larger than 10 GB? The files are uploaded through chunking in Java. I cannot upload more than 10 GB because an int allows a maximum size of 2^31. In the HTTP sampler I am declaring the whole file as a single chunk;
for example, if the file size is 444,641,856 bytes, I declare the whole thing as one chunk instead of dividing it into 5 MB chunks.
The developers are not willing to change the code, and if I report results based on a single chunk it is not a valid performance test.
Can anyone suggest whether JMeter allows a chunking mechanism, and is there a solution for uploading files larger than 10 GB?
Theoretically JMeter doesn't have a 2 GB limitation (especially the HTTPClient implementations), so provided you configure it properly you shouldn't face errors.
However, if you don't have as much RAM as 10 GB × the number of virtual users, you might want to try the HTTP Raw Request sampler, available via JMeter Plugins.
References:
https://groups.google.com/forum/#!topic/jmeter-plugins/VDqXDNDCr6w%5B1-25%5D
http://jmeter.512774.n5.nabble.com/fileupload-test-with-JMeter-td4267154.html

PHPExcel: Allowed memory size of 134217728 bytes exhausted after converting to xlsx format

I wrote a method to read Excel files using the PHPExcel library in CodeIgniter. It works fine for the xls format. So I converted the file to xlsx format and tested it. It gives the following error when I check the xlsx file:
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 93 bytes) in /home/dinuka/workspace/sec_new/application/third_party/PHPExcel/Worksheet.php on line 1142
My Excel file has 13 sheets. The issue is why it doesn't work after converting the same file. Why is the memory limit not exhausted when using the xls format?
The memory requirements for the different Readers and Writers in PHPExcel aren't the same, even though the data storage inside the PHPExcel object may be.
If you're working with larger files, then I'd recommend using cell caching to reduce the memory storage requirements for the PHPExcel object, allowing more of your PHP memory to be used by the Readers/Writers, and/or increasing your PHP memory limit.
