How can I improve performance when adding InDesign XMLElements via AppleScript? - macos

I have an AppleScript program which creates XML tags and elements within an Adobe InDesign document. The data is in tables, and tagging each cell takes 0.5 seconds. The entire script takes several hours to complete.
I can post the inner loop code, but I'm not sure if SO is supposed to be generic or specific. I'll let the mob decide.
[edit]
The code builds a list (prior to this loop) which contains one item per row in the table. There is also a list containing one string for each column in the table. For each cell, the program creates an XML element and an XML tag by concatenating the items in the [row]/[column] positions of the two lists. It also associates the text in that cell to the newly-created element.
I'm completely new to AppleScript so some of this code is crudely modified from Adobe's samples. If the code is atrocious I won't be offended.
Here's the code:
repeat with columnNumber from COL_START to COL_END
    select text of cell ((columnNumber as string) & ":" & (rowNumber as string)) of ThisTable
    tell activeDocument
        set thisXmlTag to make XML tag with properties {name:item rowNumber of symbolList & "_" & item columnNumber of my histLabelList}
        tell rootXmlElement
            set thisXmlElement to make XML element with properties {markup tag:thisXmlTag}
        end tell
        set contents of thisXmlElement to (selection as string)
    end tell
end repeat
EDIT: I've rephrased the question to better reflect the correct answer.

The problem is almost certainly the select. Is there any way you could extract all the text at once, then iterate over internal variables?

I figured this one out.
The document contains a bunch of data tables. In all, there are about 7,000 data points that need to be exported. I was creating one root element with 7,000 children.
Don't do that. Adding each child to the root element got slower and slower until at about 5,000 children AppleScript timed out and the program aborted.
The solution was to make my code more brittle by creating ~480 children off the root, with each child having about 16 grandchildren. Same number of nodes, but the code now runs fast enough. (It still takes about 40 minutes to process the document, but that's infinitely less time than infinity.)
Incidentally, the original 7,000 children plan wasn't as stupid or as lazy as it appears. The new solution is forcing me to link the two tables together using data in the tables that I don't control. The program will now break if there's so much as a space where there shouldn't be one. (But it works.)
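The restructuring described above (the same leaf nodes, but hung off intermediate elements rather than directly off the root) can be sketched outside InDesign. This is only an illustration of the tree shape using Python's standard xml.etree.ElementTree, not InDesign scripting; the element names, the group size of 16, and the generated cell data are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET


def build_grouped_tree(cells, group_size=16):
    """Attach cells to intermediate group elements instead of directly to the root."""
    root = ET.Element("Root")
    group = None
    for index, (tag, text) in enumerate(cells):
        # Start a new intermediate element every group_size cells.
        if index % group_size == 0:
            group = ET.SubElement(root, "Group")
        child = ET.SubElement(group, tag)
        child.text = text
    return root


# Hypothetical data: 7,000 (tag, text) pairs standing in for the table cells.
cells = [("cell_%d" % i, str(i)) for i in range(7000)]
root = build_grouped_tree(cells)

# Number of children under the root, and grandchildren under the first child.
print(len(root), len(root[0]))
```

In the real document the grouping key came from the table data rather than a fixed count, which is where the brittleness mentioned above comes from.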

I can post the inner loop code, but I'm not sure if SO is supposed to be generic or specific. I'll let the mob decide.
The code you post as an example can be as specific as you (or your boss) are comfortable with. More often than not, it's easier to help you with more specific details.

If the inner loop code is a reasonable length, I don't see any reason you can't post it. I think Stack Overflow is intended to encompass both general and specific questions.

Are you using InDesign or InDesign Server? How many pages is your document (or what other information can you tell us about your document/ID setup)?
I do a lot of InDesign Server development. You could be seeing slow-downs for a couple of reasons that aren't necessarily code related.
Right now, I'm generating 100-300 page documents almost completely from script/xml in about 100 seconds (you may be doing something much larger).

Related

UiPath Get Text crashes robot

I'm trying to get a text from a textfield with Get Text, but in some cases this field is optional and the robot crashes because it doesn't have anything in the field.
You have multiple options. It's hard to say which one fits you best, so here is a pool of possible solutions:
When NOT using the Modern Design, you can simply use the Element exists activity, which is self-explanatory.
If you use the Modern Design and miss old activities like Element exists, go to the filter dropdown and select Show Classic; you are then also able to choose Element exists.
You could also wrap such failing activities in a Try Catch so your process won't fail, but a Try Catch should always be the last way out.
When using the Modern Design, you can try Find Element; if the returned object is empty, you know that it was not found. Make sure to set a proper Timeout here, otherwise you wait for 30 seconds.
In your case, though, it could be better to use Image exists or Find Image Matches: since you said you are looking for text in a text field, invert the check and look for an empty text field, and if you have no matches, all is fine.
But to be honest, I would go for the Element exists. Give this a try, but be aware that in the future this activity might be replaced by something else and your process will need a little bit of rework.

Apple Script for System Events (UI reading) very slow

I've written an Apple Script for checking a UI Element (table) of a specific application (Avid Pro Tools). The table consists of a given number of rows. Each row has an attribute for selected (boolean) and index (integer). The script is returning a list of the index number of every row which has the attribute "selected" to true. The script is working, however, it is incredibly slow. It will take a few seconds to return the values. Is there any way to speed this up?
tell application "System Events"
    return value of attribute "AXIndex" of (rows whose value of attribute "AXSelected" is true) of table "Track List" of (windows whose name contains "Mix: ") of application process "Pro Tools" of application "System Events"
end tell
This is more an extended comment than a bona fide answer, but here is what I would consider in a similar situation:
What are you doing with the AXIndex values later on in the script, i.e. do you need the index numbers to the rows, or can you just store the references to the row objects ? Retrieving a list of index values suggests you'll be iterating through those values at some point, so I wonder if you're slowing your script down by accessing an attribute you may not need, then iterating through a list you could incorporate into the filter.
If there's an attribute named "AXSelected", there's a good chance there's a property named selected, which usually contains the same value and would be faster to retrieve:
... of (rows whose selected = true) of ...
Is the script actually being slow, or is it just having to perform a demanding set of complex operations? There are two whose filters, and one of these is performing a contains comparison (you can think of this as asking it to iterate through characters in a string to find a matching subset, versus an equality test that would only require a glance to know when something doesn't match). See what happens when you don't filter the windows list. If the purpose of the filter is to isolate windows that have a table "Track List" element, you might not need to: if you're unlucky, removing the filter would create an error for the first window where System Events can't find that particular table (and hence the attributes you're retrieving); but quite often, it just inserts some missing value items in your final result, which would be a small trade-off for a big increase in speed.
Finally, your actual construction of the compound filters is syntactically flawed, and I'm surprised it actually runs and returns meaningful results. The sub-clause that reads:
(rows whose value of attribute "AXSelected" is true) of table "Track List"
doesn't actually make sense, because there's no indication at the point where you define the filter what rows is and what other element they belong to. It seems obvious that you've stated they belong to the table object, but the statement actually references some objects of class row that could exist anywhere, and the ones that are selected are the ones that belong to table "Track List". As an analogy, it's sort of like a split infinitive: the English language has somehow acclimatised to accepting these as syntax forms that make sense because we assume the underlying meaning without much difficulty, but they are logically corrupt, and in some other languages would result in a person not being able to understand what is being said, or assuming an incorrect meaning.
So I wonder if AppleScript might be doing that, and if so, is AppleScript making incorrect assumptions and returning inaccurate results; or is AppleScript making correct assumptions, but being slowed down in order to untangle the syntax ?
Here's the correct form of the expression, including removal of the superfluous double-referencing of application "System Events":
tell application "System Events" to return ¬
    the value of attribute "AXIndex" of ¬
    (rows of table "Track List" of ¬
        (windows of application process "Pro Tools" whose name contains "Mix:") ¬
        whose value of attribute "AXSelected" is true)
Hopefully, the way I've split the clauses over multiple lines helps to illustrate more clearly why this makes syntactical sense in a way the original does not. It's also when I noticed the same ambiguous referencing occur with the windows filter clause.
Conclusion
I can't promise any of these suggestions will result in quicker execution times. This is more of a walk-through of the thought processes that help me improve my scripts, by considering all the "what ifs?" and asking myself the seemingly pointless questions every time.
Feel free to provide a bit more context and insight into what the rest of your script is doing overall, and perhaps it'll reveal a different way to get the same result at the end but in less time.

How to use "move..." verb to move sheets in Numbers?

I'm trying to figure out how to re-position sheets in Numbers. There is no way to insert things at specific location so I am hoping that I can find another way. The move verb drew my attention (it is in the Numbers dictionary) however there is little or no information, examples, usage scenarios or even what object types it works with.
Any insight in the context of the title?
The move in the Numbers dictionary is part of the Standard Suite, which typically works with files. I have tried using it to move text items and tables from one sheet to another, but it always fails. It is probably something they hope to provide functionality for some day.

Randomization in Qualtrics using Photos or Graphics and Loop and Merge

I am creating a survey in Qualtrics with many photos, say 1000. I want to have each survey participant answer, say 6, questions per photo. Each participant will see 5 photos that are randomly assigned.
Before looking into things, I assumed that there would be a way to upload the 1000 photos, create one block in Qualtrics (with the 6 questions) and then simply randomize the photo that occurs and have this be repeated this 5 times.
But it seems like this is either not possible or not obvious. I called Qualtrics and they said that I would manually need to create 1000 blocks (each block would be exactly the same with the exception of the title and the photo). I would then need to go into the Survey Flow and use the Randomizer there and manually add all 1000 blocks and have it randomly present 5 of the elements.
I really hope that there is a better way. This will take a ton of time if I have to do it this way.
If not, is there any way to automate anything?
Creating new blocks and automatically populating the photos. I know python and could possibly write a script to generate blocks, BUT the photo names are changed from their original names into some complicated code that Qualtrics generates.
Loading the photos into Qualtrics all at once (it currently requires one to load photos one at a time).
It turns out that there is a much better, faster way to do this than the 1000 blocks fix.
There is a bunch of stuff going on to accomplish it, but it is possible.
First, one needs to put the photos into Qualtrics through the Graphics Library. The best way to do this is to simply drag and drop the photos into the desired location. Luckily one does not have to do this one-by-one. Make sure that they are in the order you want.
Second, create a block with a "question" where you want the random photo to appear. This block should also have all 6 questions.
Third, create a column in a spreadsheet (in, e.g., Excel) of the URLs corresponding to the photos. This should be in order. One way to do this is mentioned at the bottom.
Fourth, go to the Loop and Merge option for this block. Copy and paste the column of URLs to, say, Field 1. Luckily this option exists and one does not have to do this one-by-one either. A side note: if one changes the numbers in the gray boxes to the left of the rows, this changes what appears in the results, but there is no apparent way to change these other than one at a time.
Then you should be all set.
Finally, a little bit about how to get the URLs of the photos. Once again, make sure the photos in the library are in the order you want. Then you can use web scraping to scrape the image names, which can then be put into the proper URL. I used Python's Selenium and BeautifulSoup to accomplish this. Here is what I did, using a mac. The code at least gives you the idea:
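from bs4 import BeautifulSoup
import codecs
import os
import re
from selenium import webdriver

chromedriver = "File path to /chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver
# Older Selenium style; newer releases take the driver path via a Service object.
driver = webdriver.Chrome(chromedriver)

# In the Chrome browser that has appeared, manually navigate to the photos library page, then:
# (Selenium 4.x replaces find_elements_by_css_selector with
#  driver.find_elements(By.CSS_SELECTOR, ".thumbframe"), By from selenium.webdriver.common.by.)
abc = driver.find_elements_by_css_selector(".thumbframe")
file = codecs.open('outputURLs.txt', 'w', encoding='utf-8')
urls = {}
for i in range(0, len(abc)):
    h = abc[i].get_attribute("innerHTML")
    soup = BeautifulSoup(h, "html.parser")
    # Match img tags carrying a "p4" attribute; the value used after IM= is read from "p1".
    t = soup.find_all("img", attrs={"p4": re.compile('.*')})
    urls[i] = t[0]['p1']
    # The closing quote after "/>" was missing in the original line.
    file.write("<img src=*Qualtrics Path/Graphic.php?IM=" + urls[i] + "/>" + '\n')
file.close()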
One can find the proper first part to use in place of "Qualtrics Path" by, e.g., going to the Qualtrics Survey Editor, inserting a photo using Rich HTML Editing (or something similar), clicking on View Source, and then looking at the file path pattern to use. It may begin with something like https://qualtrics.com/...
Then copy the results into a spreadsheet program and you should be ready to copy and paste.
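Once the image identifiers have been scraped, turning them into the column for Loop and Merge is plain string assembly. A minimal sketch, assuming the Graphic.php?IM= pattern from the code above; the base URL and the image IDs below are placeholders, so substitute the path found via View Source and the IDs you actually scraped.

```python
def build_image_urls(base_url, image_ids):
    """Return one URL per image ID, preserving the order of image_ids."""
    base = base_url.rstrip("/")
    return ["%s/Graphic.php?IM=%s" % (base, image_id) for image_id in image_ids]


# Placeholder values: the real base path and IDs come from your own Qualtrics library.
base_url = "https://example.invalid/QualtricsPath"
image_ids = ["IM_aaa", "IM_bbb", "IM_ccc"]

urls = build_image_urls(base_url, image_ids)

# One URL per line, ready to paste into a single spreadsheet column.
column = "\n".join(urls)
print(column)
```

Keeping the list order identical to the library order matters, since the rows in Loop and Merge are matched to photos purely by position.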

Marking or Tagging Non-structured Data

I'm not entirely sure how to term this, but I've searched several phrases and haven't found what I need.
I have a whole lot of unstructured data that I need to get into a database. I used to do the heavy lifting with Needlebase and just clean up the data from there. But now that it's no more, I'm in want of a good way to quickly grab pieces of text beyond select, copy, paste, lather, rinse, repeat.
Ideally something where I could select some text and a popup asks what it is (from a user-defined list, title, start time, image path, etc.) and then marks it as such. Naturally I would need to be able to mark the beginning and end of a record (all row data is consecutive, just not in an easily parseable format).
I could probably write something in a few hours that would do this, but I don't want to reinvent the wheel if something exists. I'm on OS X, but I'd be interested in software for any platform.
Is your data in HTML format? If so, you can use Jsoup.
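For reference, the workflow the question describes (label a selected span of text, then mark where a record ends) maps onto a small data model. The following is a rough sketch of that model only, not an existing tool; the label names, the RECORD_END marker, and the sample text are all made up for illustration.

```python
def tag_records(text, marks):
    """Group labelled spans into records.

    marks is a list of (start, end, label) tuples; a mark whose label is
    "RECORD_END" closes the current record.
    """
    records = []
    current = {}
    for start, end, label in sorted(marks):
        if label == "RECORD_END":
            records.append(current)
            current = {}
        else:
            current[label] = text[start:end]
    # Keep a trailing record that was never explicitly closed.
    if current:
        records.append(current)
    return records


text = "Morning Show 09:00 img/a.png\nEvening News 18:00 img/b.png"


def span(snippet, label):
    """Locate snippet in text and return a (start, end, label) mark for it."""
    start = text.index(snippet)
    return (start, start + len(snippet), label)


newline_at = text.index("\n")
marks = [
    span("Morning Show", "title"),
    span("09:00", "start_time"),
    span("img/a.png", "image_path"),
    (newline_at, newline_at, "RECORD_END"),
    span("Evening News", "title"),
    span("18:00", "start_time"),
    span("img/b.png", "image_path"),
]

records = tag_records(text, marks)
print(records)
```

A selection popup would only need to produce the (start, end, label) tuples; everything after that is ordinary list processing.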
