Google Translate: what's the document size limit?

I'm trying to send a PDF file of only 2.75 MB, but with 156 pages, to Google Translate.
It gives the error "The page you requested was too large to translate".
I have tried splitting the PDF into two parts, but Google Translate always gives the same error.
Could someone tell me the maximum size I can send (pages, MB, etc.)?

You can only translate 5,000 characters at a time using the Google Translate apps, per "Request too big" in Google Translate.
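If you want to work within that limit, one way is to extract the PDF's text and feed it to the translator in pieces. Below is a minimal Ruby sketch under that assumption; the 5,000-character figure comes from the limit above, and the file names and the naive splitting are only placeholders:

# Naive sketch: split extracted text into pieces under the 5,000-character
# limit so each piece can be pasted into Google Translate separately.
LIMIT = 5000

def split_for_translation(text, limit = LIMIT)
  # A real splitter would break on sentence boundaries instead of mid-word.
  text.scan(/.{1,#{limit}}/m)
end

pieces = split_for_translation(File.read('extracted_text.txt'))
pieces.each_with_index do |piece, i|
  File.write("piece_#{i + 1}.txt", piece)
end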

Related

UIPath truncating strings at 10k (10,000) characters

We are running into an issue with UiPath that recently started. It's truncating strings, in our case a base64-encoded image, at 10k characters. Does anyone know why this might be happening, and how we can address it?
The truncation appears to happen when the text variable base64Contents is loaded:
base64Contents = Convert.ToBase64String(byteArray);
Per the UiPath documentation, there is a limit of 10,000 characters. This is because 'the default communication channel between the Robot Executor and the Robot Service has changed from WCF to IPC'.
https://docs.uipath.com/activities/docs/log-message
Potential Solution
A way around this could be to write your string to a txt file rather than output it as a log. That way you are using a different activity, and the 10,000-character limit may not apply.

How to increase the size of request in ab benchmarking tool?

I am testing with ab, the Apache HTTP server benchmarking tool.
Can I increase the size of the request in ab? (Right now I see that one request has a size of 146 bytes.)
I tried to increase the size of the TCP send/receive buffer (the -b option), but it does not seem to work, because I still see "Total transferred" as 146 bytes.
Do you know any way to increase the size of the request (by changing the source code or something else)?
Or, if that is impossible, can you suggest tools similar to ab that can send larger requests?
Thank you so much!
Although the -b option does seem like it should have worked, I can't say for sure as I haven't used it.
Alternatively, have you tried sending a large dummy file in your POST request? That can be accomplished with the -p option followed by a plain-text file, for instance, which you can either create yourself or find by searching for something like "generate large file in bytes online", then pass into the command.
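For example, a small Ruby sketch of that idea (the payload size, file name, and target URL are placeholders, and the ab invocation in the comment is only one possible combination of options):

# Generate a ~1 MB dummy payload for ab's -p option.
File.open('payload.txt', 'w') do |f|
  f.write('x' * (1024 * 1024)) # 1 MB of filler
end

# Then benchmark with something like:
#   ab -n 100 -c 10 -p payload.txt -T 'text/plain' http://example.com/upload
# -p sends the file as the POST body; -T sets the Content-Type header for it.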
As far as alternatives go, I've heard the open-source project httperf from HP is a great option as well, though I doubt we can't figure this out with ApacheBench.

Handling large files with Azure search blob extractor

I'm receiving errors from the blob extractor that files are too large for the current tier, which is Basic. I will be upgrading to a higher tier, but I notice that the max size is currently 256 MB.
When I have PPTX files that are mostly video and audio, but have text I'm interested in, is there a way to index those? What does the blob extractor max file size actually mean?
Can I tell the extractor to only take the first X MB or chars and just stop?
There are two related limits in the blob indexer:
The max file size limit, which is what you are hitting. If the file size exceeds that limit, the indexer doesn't attempt to download it and produces an error to make sure you are aware of the issue. The reason we don't just take the first N bytes is that, for many formats, the entire file is needed to parse it correctly. You can mark blobs as skippable, or configure the indexer to ignore a number of errors if you want it to make forward progress when encountering blobs that are too large (see the sketch after these two points).
The max size of extracted text. If a file contains more text than that, the indexer takes characters up to the limit and includes a warning so you can be aware of the issue. Content that doesn't get extracted (such as video, at least today) doesn't contribute to this limit, of course.
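For the "ignore a number of errors" route, here is a rough sketch of raising the indexer's error tolerance through the Azure Search REST API. The service name, indexer name, key, and api-version are placeholders; maxFailedItems and maxFailedItemsPerBatch are the indexer parameters that control how many failures a run tolerates before stopping:

require 'net/http'
require 'json'
require 'uri'

service = 'my-search-service'            # placeholder
indexer = 'my-blob-indexer'              # placeholder
api_key = ENV['AZURE_SEARCH_ADMIN_KEY']  # admin key for the service
version = '2017-11-11'                   # use the api-version your service supports

uri = URI("https://#{service}.search.windows.net/indexers/#{indexer}?api-version=#{version}")
headers = { 'Content-Type' => 'application/json', 'api-key' => api_key }

http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true

# Fetch the current indexer definition, adjust only its parameters, and PUT it back.
definition = JSON.parse(http.get(uri.request_uri, headers).body)
definition['parameters'] ||= {}
definition['parameters']['maxFailedItems'] = 10         # tolerate up to 10 failed blobs per run
definition['parameters']['maxFailedItemsPerBatch'] = 10
response = http.put(uri.request_uri, definition.to_json, headers)
puts response.code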
How large are the PPTX you need indexed? I'll add my contact info in a comment.

Sending a file in chunks always crashes at the 10th chunk

I have a strange problem with my ultra-simple method. It sends a file in 4 MB chunks to a foreign API. The thing is, the foreign API always crashes at the 10th chunk.
It's impossible to debug the API error, but it says: The specified blob or block content is invalid (that API is the Azure Storage API, but it's not important right now; the problem clearly lies on my side).
Because it crashes at the 10th element (which is the 40th megabyte), it's a pain to test, and debugging it "by hand" takes a lot of time (partly because of my slow internet connection), so I decided to share my method:
def upload_chunk
  file_to_send = File.open('file.mp4', 'rb')
  until file_to_send.eof?
    content = file_to_send.read(4194304) # Get 4MB chunk
    upload_to_api(content) # Line that produces the error
  end
end
Can you see anything that could be wrong with this code? Please keep in mind that it ALWAYS crashes on the 10th chunk and works perfectly for files smaller than 40 MB.
I did a search for ruby "The specified blob or block content is invalid" and found this as the second link (first was this page):
http://cloud.dzone.com/articles/azure-blob-storage-specified
This contains:
If you’re uploading blobs by splitting blobs into blocks and you get the above mentioned error, ensure that your block ids of your blocks are of same length. If the block ids of your blocks are of different length, you’ll get this error.
So my first guess is that the call to upload_to_api is assigning ids from 1 to 9; then, when it goes to 10, the id length increases, causing the problem.
If you don't have control over how the ids are generated, then perhaps you can set the number of bytes read on each iteration to be no more than 1/9 of the total file size.
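To make that concrete, here is a rough sketch of the fixed-length-id approach; upload_block_to_api and commit_block_list are hypothetical stand-ins for however your code issues the Put Block and Put Block List calls:

require 'base64'

# Sketch: give every block an id of identical length by zero-padding a counter
# before base64-encoding it, so block 1 and block 10 get ids of the same length.
def upload_in_uniform_blocks(path)
  block_ids = []
  File.open(path, 'rb') do |file|
    index = 1
    until file.eof?
      chunk = file.read(4 * 1024 * 1024)                # 4 MB chunk
      block_id = Base64.strict_encode64(format('%09d', index))
      upload_block_to_api(chunk, block_id)              # hypothetical helper (Put Block)
      block_ids << block_id
      index += 1
    end
  end
  commit_block_list(block_ids)                          # hypothetical helper (Put Block List)
end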

Rendering large collections of articles to PDF fails in MediaWiki with mwlib

I have installed the Mediawiki Collection Extension and mwlib to render articles (or collections of articles) to PDF. This works very well for single articles and collections with up to 20 articles.
When I render larger collections, the percentage counter on the parsing page (which counts up to 100% when rendering succeeds) is stuck at 1%.
Looking at mwrender.log, I see an Error 32 (Broken pipe). Searching the internet reveals that Error 32 can be caused by the receiving process (the part after the pipe) crashing or not responding.
From here it is hard to proceed. Where should I look for more clues? Could it be the connection to the MySQL server that dies?
The whole appliance is running on a TurnKey Linux MediaWiki VM.
I'm using the PDF Export extension and it works with more than 20 articles. Maybe try that?
I figured out the problem myself.
mw-render spawns a parallel request for every article in a collection. This means that for a collection of 50 pages, 50 simultaneous requests are made. Apache could handle this, but the MediaWiki MySQL database could not.
You can limit the number of threads that mw-render spawns with the --num-threads=NUM option. I couldn't find where mw-serve calls mw-render, so I just limited the maximum number of workers Apache could spawn to 10.
mw-render automatically retries requests for articles if the first ones fail, so this approach worked.
I rendered a PDF with 185 articles within 4 minutes; the resulting PDF had 300+ pages.
