Alchemy API - truncated-oversized-text-content - alchemyapi

I'm getting the warning "truncated-oversized-text-content" back when using Alchemy's entities API. Does anyone know the character limit? I can't find it on their support pages.

The text limit is 50 KB; it's a bit hidden at the bottom of this page.
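To stay under such a limit, the text can be cut before it is sent. The following is only a minimal sketch: it assumes the 50 KB limit means 50 * 1024 bytes of UTF-8 text (the answer does not say how the limit is measured), and `truncate_utf8` is a hypothetical helper name, not part of the Alchemy API.

```python
def truncate_utf8(text: str, max_bytes: int = 50 * 1024) -> str:
    """Return text cut so that its UTF-8 encoding fits into max_bytes.

    Assumption: the limit is counted in UTF-8 bytes, 50 KB = 51200 bytes.
    """
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Cutting bytes may split a multi-byte character at the end;
    # errors="ignore" drops that incomplete trailing sequence.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


if __name__ == "__main__":
    sample = "é" * 30000  # 2 bytes per character, 60000 bytes in total
    shortened = truncate_utf8(sample)
    print(len(shortened.encode("utf-8")))
```

If losing the tail of the text is not acceptable, the same byte check could be used to split the text into several requests instead.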

Related

Text not splitting between two pages when generating a PDF

I'm building dynamic PDFs with the pdfswitch API (https://pdfswitch.io). Unfortunately, I run into problems if the content is longer than one A4 page. The API generates additional pages without a problem, but the text gets cut off at the end of a page. Is anybody else using this API service? How can I define a content block that is not split between two pages (e.g. ...)?
Thanks for your help.

Elasticsearch - TikaOnDotNet Text Extraction page by page

We are exploring Elasticsearch and currently extract the text from MS Office documents, PDF, .eml and other file formats using TikaOnDotNet.
We want to store the document content page by page in Elasticsearch, so that we can tell users that the keyword they were looking for is on page number x.
I am not sure whether this is possible. If you could share your thoughts or point us in some direction, it would be greatly appreciated.
Regards,
Hiten

How to load a specific number of records per page and add a "Load more" button

On my page I would like to output all records of a specific folder
but the number should initially be limited to a certain quantity (to reduce the loading times). With a "Load more" button further records should be loaded.
Does anyone have a hint on how I can achieve this?
I have already found several approaches on the web in connection with AJAX, but since I'm not familiar with this yet, more questions than answers have emerged ...
For info: I use my own template extension / distribution under TYPO3 9.5.8
Thank you in advance for any help!!
The state-of-the-art solution is AJAX: you load only the required records from the server and modify the page on the fly.
Another option would be a URL parameter which is evaluated by your extension.
With the parameter, the full list is shown;
without it, only the first N records are shown, plus a button linking to the same URL including the parameter for the full list.
Make sure the parameter is handled correctly and generates another cached version of the page (keyword: cHash).
As you now have two pages with partially identical content, don't forget to tell search engines that the short variant should not be indexed.
You could use the Paginate widget as documented here: https://docs.typo3.org/other/typo3/view-helper-reference/9.5/en-us/typo3/fluid/latest/Widget/Paginate.html
By overriding the paginate template file and only rendering the pagination.nextPage link, you could load the next page via AJAX.
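Both answers come down to the same server-side logic: return only a slice of the records and tell the client whether there is more to fetch. A language-neutral sketch of that logic, written in Python rather than TYPO3's PHP/Fluid, could look like this. The names `Page` and `load_page` are hypothetical and not part of TYPO3 or the Paginate widget.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Page:
    items: List[str]            # records to render for this request
    next_offset: Optional[int]  # offset for the "Load more" request, None if done


def load_page(records: Sequence[str], offset: int = 0, limit: int = 10) -> Page:
    """Return at most `limit` records starting at `offset`."""
    chunk = list(records[offset:offset + limit])
    end = offset + limit
    # Only offer a "Load more" button while records remain after this slice.
    next_offset = end if end < len(records) else None
    return Page(items=chunk, next_offset=next_offset)


if __name__ == "__main__":
    records = [f"record {i}" for i in range(25)]
    page = load_page(records)
    while page.next_offset is not None:
        print(len(page.items), "records, more available")
        page = load_page(records, offset=page.next_offset)
    print(len(page.items), "records, end of list")
```

The "Load more" button would request the next slice with `next_offset` and be hidden once the server returns no further offset.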

How to retrieve plain text from a formatted website to use in UIWebView

Not sure if what I want to do is possible, but what I am hoping to do is somehow gather certain pieces of text from a website, remove the header, footer, background, all formatting, and place it into my application in a scrollview or something similar...
I'll give you an example... Imagine I was making wikipedia's iPhone app, I want to download the information about the wiki on dogs, without the header, side bars etc, just the text. How would I go about doing this?
I understand that for this I have not provided any example code or what I've tried or started, but that's just because in this case I'm lost! That doesn't mean I want full chunks of code either. Any help will do. If this doesn't work, I will just have to make a 'mobile optimised' version of the webpages I want to include in my app.
Thanks
(Edit: the term I was trying to use was 'strip the web page of its HTML coding')
You may be going about this the wrong way, or perhaps even asking the wrong question.
Does the target website have an API or datafeed of some kind?
Can you get the information you need in JSON or XML format directly from the site?
I think you've misunderstood the technology. HTML is merely the framework on which the formatting and data are hung.
Parsing the HTML page seems like an awfully big headache. I doubt you'll ever get it to work, because almost all sites these days are partially or wholly generated on the server side; the page is only the result.
Some sites hide the information in memory and others get it dynamically through ajax for example, which means that simply trying to get the data by parsing the HTML will get zero data.
Another issue you should be aware of, though, is that simply copying the data from generated websites may open you up to copyright issues.
You have to parse the HTML code, search for the part that you want, and "throw" away the part that you do not need. This is more or less brute-forcing, and the code of the website must not change, otherwise you are screwed. With this method you have to write the parser by hand. But maybe there is an Atom or RSS feed that you can parse instead. This will be much easier, and you will not depend on the website layout because the RSS/Atom feed is just about the data. For parsing RSS you could try NSXMLParser.
And then you have to make a valid HTML page out of the data and present it in the UIWebView.
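The "parse and throw away" idea can be sketched as follows. This is only an illustration in Python using the standard library's `html.parser`; in the iOS app the same logic would have to be written with an Objective-C or Swift parser. The class name `TextExtractor`, the function `html_to_text`, and the list of skipped tags are assumptions chosen for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    # Elements whose content is dropped entirely (an assumed, adjustable list).
    SKIPPED = {"script", "style", "header", "footer", "nav"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self):
        return " ".join(self._parts)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = ("<html><header>Menu</header><body><h1>Dogs</h1>"
            "<p>Dogs are loyal animals.</p><footer>Imprint</footer></body></html>")
    print(html_to_text(page))
```

As the answers above point out, this only sees text that is present in the delivered HTML; content loaded later through AJAX will not show up.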

Adobe InDesign Server examples

I'm new to Adobe InDesign Server and I'm having a hard time finding a good kitchen sink app. All the examples I got from the SDK seem to work only partially. All I'm trying to do is use a master page from InDesign on the server side and edit certain text fields, for example placing first and last name in particular text fields. Does anyone know of a good place to get example code that shows all the features, or how I would approach this problem?
http://www.adobe.com/devnet/indesign/documentation.html#idserver has a lot of resources that are useful when starting out. In particular, http://wwwimages.adobe.com/www.adobe.com/content/dam/Adobe/en/devnet/indesign/cs55-docs/InDesignServer/ids-solutions.pdf includes a number of code examples for various common operations.
As to your specific example, the typical way to go about it is:
1. Get the page object from the master pages list.
2. Iterate over each text field on the page.
3. Somehow identify the fields, for example by setting the script label in the template document and checking the labels of each text field you iterate through.
4. Set the contents of the text field.
A lot of the official InDesign documentation is partial.
Jongware also hosts the complete reference documentation:
http://www.jongware.com/idjshelp.html
Probably the reason why the IDS documentation isn't that exhaustive is that dealing with the server version is an extension of classical InDesign use. So, with the exception of some peculiarities detailed in the IDS SDK docs, you will find most of the help in the InDesign scripting guides ;)
