I am trying to pull the table from https://rotogrinders.com/schedules/nfl into Google Sheets
I tried using ImportHTML("https://rotogrinders.com/schedules/nfl", "table", 1) but it just returns the header:
Time Team Opponent Line Moneyline Over/Under Projected Points Projected Points Change
Using ImportXML, I tried IMportXML("https://rotogrinders.com/schedules/nfl","//tr"), but it returns the same header and no data.
I dont think the tbody needs authentication to access. I logged out, cleared my cache and even tried on another computer and still no tbody.
I know its a table called "tschedules", but cant get the data
Is there another part of the XPATH I am missing?
This is the XPATH from google scraper: "//table[1]/tbody/tr[td]"
When you load this website the actual table doesn't load in for a few seconds. This is why you're not seeing any data come in. If you can set some wait time on that import call somehow then it would work.
Related
I'm having a strange issue with exporting/updating/importing data in our on-premises Dynamics 365 (8.2). I was doing a bulk update of over 3000 records by exporting the records to an Excel workbook, updating the data in a specific column, then importing the workbook back into CRM. It worked for all of the records except 14 of them, which according to the import log was for the reason that "You cannot import data to this record because the record was updated in Microsoft Dynamics 365 after it was exported." I looked at the Audit History of those 14 records, and find that they have not been modified in any way for a good two months. Strangely, the modified date of the most recent Audit History entry for ALL 14 records is the exact same date/time.
We have a custom workflow that runs once every 24 hours on a schedule that automatically updates the Age field of our Contact records based on the value in the respective Birthday field. For these 14 records, ALL of them have a birthday of November 3rd, but in different years. What that means though is that the last modification that was done to them was on 11/3/2019 via the workflow. However, I cannot understand why the system "thinks" that this should prevent a data update/import.
I am happy to provide any additional information that I may have forgotten to mention here. Can anyone help me, please?
While I was not able to discover why the records would not update, I was able to resolve the issue. Before I share what I did to update the records, I will try and list as many things as I can remember that I tried that did not work:
I reworked my Advanced Find query that I was using to export the records that needed updated to return ONLY those records that had actual updates. Previously, I used a more forgiving query that returned about 30 or so records, even though I knew that only 14 of them had new data to import. I did so because the query was easier to construct, and it was no big deal to remove the "extra" records from the workbook before uploading it for import. I would write a VLOOKUP for the 30-something records, and remove the columns for which the VLOOKUP didn't find a value in my dataset, leaving me with the 14 that did have new data. After getting the error a few times, I started to ensure that I only exported the 14 records that needed to be updated. However, I still got the error when trying to import.
I tried formatting the (Do Not Modify) Modified On column in the exported workbook to match the date format in the import window. On export of the records, Excel was formatting this column as m/d/yyyy h:mm while the import window with the details on each successful and failed import showed this column in mm/dd/yyyy hh:mm:ss format. I thought maybe if I matched the format in Excel to the import window format it might allow the records to import. It did not.
I tried using some Checksum verification tool to ensure that the value in the (Do Not Modify) Checksum column in the workbook wasn't being written incorrectly or in an invalid format. While the tool I used didn't actually give me much useful information, it did recognize that the values were checksum hashes, so I supposed that was helpful enough for my purposes.
I tried switching my browser from the new Edge browser (the one that uses Chromium) to just IE as suggested on the thread provided by Arun. However, it did not resolve the issue.
What ended up working in the end was Arun's suggestion to just do some arbitrary edit to all the records and exporting them afterward. This was okay to do for just 14 records, but I'm still slightly vexed as this wouldn't really be a feasible solution of it were, say, a thousand records that were not importing. There was no field that ALL 14 Contact records had in common that I could just bulk edit, and bulk edit back again. What I ended up doing was finding a text field on the Contact Form that did not have any value in it for any of the records, putting something in that field, then going to each record in turn and removing the value (since I don't know of a way to "blank out" or clear a text field while bulk editing. Again, this was okay for such a small number of records, but if it were to happen on a larger number, I would have to come up with an easier way to bulk edit and then bulk "restore" the records. Thanks to Arun for the helpful insights, and for taking the time to answer. It is highly appreciated!
When you first do an import of an entity (contacts for example) you see that your imported excel contains 3 hidden columns (Do Not Modify) Contact, (Do Not Modify) Row Checksum, (Do Not Modify) Modified On.
When you want to create new instances of the entity, just edit the records and clear the content of the 3 hidden colums.
This error will happen when there is a checksum difference or rowversion differs from the exported record vs the record in database.
Try to do some dummy edit for those affected records & try to export/reimport again.
I could think of two reasons - either the datetime format confusing the system :( or the the community thread explains a weird scenario.
Apparently when importing the file, amending and then saving as a different file type alters the spreadsheet's parameters.
I hence used Internet Explorer since when importing the file, the system asks the user to save as a different format. I added .xlsx at the end to save it as the required format. I amended the file and imported it back to CRM..It worked
For me it turned out to be a different CRM time zone setting for the exporter and importer. Unfortunately this setting doesn't seem to be able to be changed by an administrator via the user interface.
The setting is available for each user under File->Options->Time Zone.
I’m using Data Scraping to scrape a product Information (i.e Product Name, Url, Price, Model) from a shopping website.
When I search for a product, I want whatever item comes first it scrapes that item’s data and for that purpose I have set maximum number of results to 1. But the problem is sometimes it is returning empty Data table And I cannot figure out why.
What I think is, if the current search result matches those elements that I selected in data scraping wizard, it returns the data table and if it doesn’t match it returns empty Data table.
For Example, While selecting elements in Data scraping wizard the search results were Samsung monitors. And when I ran the project I searched for Dell monitors, it returned Data table but when I searched for Samsung series or Dell Series it returned empty Data table. What is wrong with this?
You need to tell what you actually need as output.
But if your output is empty, mostly the reason is one of the following:
make sure the timeout is high enough, set it to 30000 if you are unsure
set a proper selector that has not a bad impact even when the website is being changed for some reason
For me it working properly with a proper timeout and a flexible selector with a *.
I've been looking for answers and try so many ways. I need to automate one websites and this country field have 230 data. When I try to run the script, if the data less then 50 it can select easily. But if I want to select data from 50 onward it only scroll down until the last data. This is my code im using now. Working fine if the data that I want to select in the first 50 range
ObjPage. WebElement(xpath="some xpath data" ). Click
ObjPage. WebList("xpath=" some xpath data" ). Selecr strcountry
most probably the list items are loaded asynchronously(lazy loading) with AJAX, on-demand. Your best try is to look up in javascript / ask the developer about the events that make the list load the rest of the values. If this is clear, send those events to the list via UFT to trigger the loading
(Under the hood you could try calling the Javascript method directly with some runJS commands, however the event based solution is more realistic)
I spend hours trying to fix this but can't find where the issue is.
I try to import data in google spreadsheet using importxml.
Here is the url :
http://www.journaldesfemmes.com/maman/creches/3-pom/creche-3098
I'm interested in exctracting email and phone number for exemple. I used chrome inspector to copy the Xpath, and few chrome plugins. I guess the issu is the Xpath. Here is the formula I used in spreadsheet :
=importxml("http://www.journaldesfemmes.com/maman/creches/3-pom/creche-3098";"/html/body/div[4]/div/div[1]/div[2]/div[1]/div/div/div/div/div[10]/table/tbody/tr[2]/td[2]")
Hope someone can help
Since the data you want is in tables, it might be easier to use importhtml.
The table you want you can get with this:
=IMPORTHTML("http://www.journaldesfemmes.com/maman/creches/3-pom/creche-3098","table",2)
To get just the phone number add index (row and column of table)
=index(IMPORTHTML("http://www.journaldesfemmes.com/maman/creches/3-pom/creche-3098","table",2),3,2)
email is:
=index(IMPORTHTML("http://www.journaldesfemmes.com/maman/creches/3-pom/creche-3098","table",2),4,2)
I am building a google sheet to do calculations based on information I found on different websites and stumbled upon the IMPORTHTML function in Google Sheets.
Terrific, I want to import tables and then use some of the values out of those tables to build my sheet and make further calculations.
However, since the function retrieves both the headers and all the information in the table that makes it quite hard to work with. Instead I would like to pull only certain of the data, preferably specific cells in the table pulled.
Is this possible?
For example:
=ImportHtml("http://en.wikipedia.org/wiki/Demographics_of_India"; "table";3)
returns a huge list, what if I would like to pull only the values of B7 and D7? Is that possible? Even filtering out a single row would be useful, whatever that is more feasible. The most important part is that I can get a single row and dont have the full table.
Found the INDEX function, doing exactly what I need it to do!