Google NLP : row unreadable when the sentiment score is equal to 10 - google-cloud-automl

I have a problem adding items to a new dataset for sentiment analysis.
My sentiment scores range from 0 to 10. Everything works for scores from 0 to 9, but rows with a score of 10 can't be read, even though I set the maximum sentiment value to 10.
Is there a change I need to make to my CSV file so that Google recognises those rows?
This is the error I got:
Invalid input found at row 2 of ... "Row parsing resulting in unexpected label name."
Thanks for your help !

You must save and import your .csv file with only two columns: 'text' and 'labels'. Make sure there are no additional columns.

gs://tttttt-bucket/table_2jpg,table
gs://tttttt-bucket/table_l.png,table
You can test with a file like this: just give the URI and the label. I tested it and it works.
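A minimal sketch of trimming a CSV down to two columns before import, assuming the text (or URI) sits in the first column and the label or score in the second. The function name `keep_two_columns` and the sample rows are hypothetical and not part of any Google tooling.

```python
import csv
import io


def keep_two_columns(csv_text, text_col=0, label_col=1):
    """Return CSV text that keeps only the text and label columns of each row."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if not row:
            continue  # skip blank lines
        writer.writerow([row[text_col], row[label_col]])
    return out.getvalue()


# Hypothetical input with a stray third column.
sample = "Great service,10,extra\nTerrible,0,extra\n"
trimmed = keep_two_columns(sample)
```

Whether extra columns are the actual cause of the error for a score of 10 is not confirmed here; the sketch only shows how to rule that out.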

Related

How to use a filter formula in Google Sheets with data that contains #N/A

For my example, I have two columns, A and B, in Google Sheets.
Column A holds a list of stock symbols such as AAPL, IBM, etc.
Column B holds a simple formula: GOOGLEFINANCE(A2,"price")
Sometimes GOOGLEFINANCE returns an error and the cell displays #N/A, but that is not my issue.
I would like to apply a filter on column B that shows all symbols whose price is greater than 100 or is #N/A.
I would prefer not to use an extra column to achieve that.
I'm struggling with it and still haven't found a way to get this result.
Note that my issue isn't with GOOGLEFINANCE; it is just an example that produces #N/A values.
My thought was to use a filter with a formula like: =OR(ISNA(B:B), B:B>100)
But it seems to ignore the #N/A cells and doesn't show them.
In my question I tried the formula "=OR(ISNA(B:B), B:B>100)".
What I needed to know is that when Google Sheets evaluates a cell containing #N/A in this column, the result of OR is automatically #N/A, even with ISNA as the first condition.
To solve it, I used a formula like this:
=IF(ISERROR(B:B), TRUE, B:B>100)
I updated the sheet in case someone wants to check it.

Can't seem to import a Google Cloud Vertex AI Text Sentiment Analysis dataset

I am experimenting with Google Cloud Vertex AI Text Sentiment Analysis. I created a sentiment dataset based on the following reference:
https://cloud.google.com/vertex-ai/docs/datasets/prepare-text#sentiment-analysis
When I created the dataset, I set the maximum sentiment to 1 to get a range of 0 to 1. The documentation indicates that the CSV file should have the following format:
[ml_use],gcs_file_uri|"inline_text",sentiment,sentimentMax
So I created a csv file with something like this:
My computer is not working.,0,1
You are really stupid.,1,1
As indicated in the documentation, I need at least 10 entries per sentiment value. I created 11 entries each for the values 0 and 1, resulting in 22 entries in total. I then uploaded the file and got "Unable to import data due to error", but the error message is blank. There don't appear to be any errors logged in the log explorer.
I tried importing a text classification dataset and it imported properly. The imported lines look something like this:
The flowers are very pretty,happy
The grass are dead,sad
What am I doing wrong here for the sentiment data?
OK, the issue appears to be character-set related. I had generated the CSV file using LibreOffice Calc and exported it as CSV. Out of the box, it appears to default to a Western Europe character set, which looked fine in my text editor but apparently caused problems. I changed it to UTF-8 and now my dataset imports.
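A small sketch of the same fix done outside the spreadsheet, assuming the "Western Europe" export was Windows-1252 (`cp1252`); that encoding is a guess and should be replaced with whatever the file actually uses. The helper names and the sample row are hypothetical.

```python
def reencode_to_utf8(raw_bytes, source_encoding="cp1252"):
    """Decode bytes from the assumed source encoding and re-encode them as UTF-8."""
    return raw_bytes.decode(source_encoding).encode("utf-8")


def is_valid_utf8(raw_bytes):
    """Report whether the bytes can be decoded as UTF-8 without errors."""
    try:
        raw_bytes.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical row containing a non-ASCII character, as a Western Europe export would store it.
western = "Café is closed.,0,1\n".encode("cp1252")
converted = reencode_to_utf8(western)
```

Checking a file with `is_valid_utf8` before uploading is a cheap way to catch this kind of problem, since a pure-ASCII file passes either way and only rows with accented or special characters reveal the mismatch.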

Trying to scrape data off of dividendinvestor.com

I'm trying to import some stock data regarding dividend history using Google Sheets.
The data I'm trying to grab is from this page: https://www.dividendinvestor.com/dividend-quote/
(e.g. https://www.dividendinvestor.com/dividend-quote/ibm or https://www.dividendinvestor.com/dividend-quote/msft)
With other sites, I've been able to use a combination of INDEX and IMPORTHTML to get data from a table. For example, if I wanted to get the "Forward P/E" for IBM from finviz.com, I do this:
=index(IMPORTHTML("http://finviz.com/quote.ashx?t=IBM","table", 11),11,10)
That grabs table 11 and goes down 11 rows and over 10 columns to get the piece of data that I want.
However, I cannot seem to find any tables to import via IMPORTHTML from the www.dividendinvestor.com/dividend-quote/ibm site.
I'm trying to import the value to the right of the "Consecutive Dividend Increases" field.
In this case, the output I'm trying to achieve is "19 years".
I've also tried IMPORTXML, but everything I try with XPATH (using this path: "/html/body/div[3]/div/div/div[2]/div/div/div[2]/div[2]/div[2]/span[20]" ) fails too.
Any help out there? The desired end result will be that I will dynamically build the dividendinvestor.com URL by appending a different ticker symbol and have a result of how many years of consecutive increases in their dividend payout.
Nice solution proposed by @player0. If you don't want to use INDEX, you can go with:
=IMPORTXML("https://www.dividendinvestor.com/dividend-quote/"&B3,"//a[.='Consecutive Dividend Increases']/following::span[1]")
Update (May 2022) :
New working formula :
=REGEXEXTRACT(TEXTJOIN("|";TRUE;IMPORTXML("https://www.dividendinvestor.com/ajax/?action=quote_ajax&symbol="&B2;"//text()"));"\d+ Years")
Note: I'm based in Europe, so the semicolons may have to be replaced with commas.
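The join-then-extract idea in the updated formula can be sketched in Python as follows. The list of text fragments is a hypothetical stand-in for what the page returns, not real data from the site.

```python
import re

# Hypothetical text fragments standing in for what IMPORTXML(..., "//text()") returns.
fragments = ["Dividend Yield", "4.9%", "", "Consecutive Dividend Increases", "19 Years"]

# TEXTJOIN("|", TRUE, ...) joins the values and skips empty ones.
joined = "|".join(fragment for fragment in fragments if fragment)

# REGEXEXTRACT(..., "\d+ Years") returns the first substring matching the pattern.
match = re.search(r"\d+ Years", joined)
years = match.group(0) if match else None
```

Because the pattern takes the first "digits followed by Years" it finds, this approach relies on no earlier field on the page using the same wording.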
Try:
=INDEX(IMPORTXML("https://www.dividendinvestor.com/dividend-quote/ibm/",
"//span[@class = 'data']"), 9, 1)

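For readers who want to see what an attribute test such as `[@class='data']` selects, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The markup is a hypothetical, well-formed stand-in; the real page is different and would need an HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page markup; the real page differs.
SNIPPET = """
<div>
  <a>Dividend Yield</a><span class="data">4.9%</span>
  <a>Consecutive Dividend Increases</a><span class="data">19 Years</span>
</div>
""".strip()

root = ET.fromstring(SNIPPET)

# Select every span whose class attribute equals 'data', in document order.
values = [span.text for span in root.findall(".//span[@class='data']")]

# Picking one match by position mirrors INDEX(..., row, 1); here it is the second match.
consecutive = values[1]
```

Selecting by position is fragile: if the site adds or removes a field, the row number passed to INDEX has to change with it.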
Pulling data from Crystal Reports

My data is stored in Oracle, and the only way I can run a report on it is with Crystal Reports. I have a set of data that looks like this: ,,,,,,,,,,1, or ,1,,,,,,, or ,1,,,,,1,,,,1,. There are more variations.
Each 1 means a value is true for a record. There are about 54 'ticks/commas'. What I want is all records with a 1 at spot X. So for one report I may want all records that have a 1 in the 10th spot. At other times I may want the records where the 1 comes after spot 36. I accept that this will pull other records too, but the main ones I want are those with a 1 at spot X.
How do I get this? I tried a Like command, but that does not narrow the data down far enough. I am familiar with SQL but not with Crystal.
Any help would be great. TIA
In Crystal, you might try setting up a Parameter Field to hold a numeric value (1 to 54) then use that in a formula as the Record Selection. You'll be prompted to enter the parameter when you run the report.
In the record selection, I was initially going to suggest the following, which brings back all records with a 1 in spot 12. But this makes it hard to bring back a range.
split({yourfield},",")[12] = "1"
This will bring back the same records:
instr({@test},"1") = 12
Then, for the other case you mention, you could use the following to bring back any record that has a 1 in any spot after 36:
instr({@test},"1") >= 37
As long as there is only one 1 in the field, you can use this for other ranges as well.
I don't want to disturb the answers by Clayton Morris and CoSpringsGuy, hence I am posting my own answer.
You need to combine both solutions to get the desired result.
Create a number parameter and either provide the values 1 to 57 as a list or leave it as a plain field for entering the desired number.
Now, in the record selection, use the formula given by CoSpringsGuy with the parameter in place of the fixed number:
instr({databasefield.column},"1") = {?Inputnumber} // {?Inputnumber} is the parameter field
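To make the position logic easier to check outside Crystal, here is a hedged Python sketch of the same idea. It splits the field on commas and reports the 1-based spots that hold a 1, so it also covers fields with more than one 1, where a single character search only finds the first. The function names are hypothetical.

```python
def flag_positions(field):
    """Return the 1-based positions of every comma-separated slot equal to '1'."""
    slots = field.split(",")
    return [i for i, slot in enumerate(slots, start=1) if slot.strip() == "1"]


def has_flag_at(field, position):
    """True if the slot at the given 1-based position holds a 1."""
    return position in flag_positions(field)


def has_flag_after(field, position):
    """True if any slot after the given 1-based position holds a 1."""
    return any(p > position for p in flag_positions(field))


# Short illustrative value: slots 2 and 4 hold a 1.
example = ",1,,1,"
positions = flag_positions(example)
```

In this example the first 1 is also the second character, so a character search and the slot index agree; the second 1 is the fifth character but the fourth slot, which is why the character search only works when the field contains a single 1.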

Weka NumericToNominal attributeIndices

I am using the Weka GUI and imported a csv file.
I want to transform a numerical attribute to nominal with the "NumericToNominal"-filter.
There are values between "-1" and "770".
If I set the attributeIndices value to "first-30,31-100,101-150,151-last", I get the error message: "Problem filtering instances: Invalid range list at first-30".
Do you have any idea what is wrong?
Thanks in advance
I have just used the same NumericToNominal filter, because I read in a CSV file from the UI and it claimed every attribute was numeric.
You are using the -R switch, so it is looking for a range of column numbers. The values stored in those columns do not matter. Columns begin at 1, or first, as you have above. The "Invalid range list" error appears when you reference a column number that does not exist, so it seems that either you have fewer than 30 columns or one of the columns between 1 and 30 has somehow been removed. Did you perhaps mix up column numbers with the values contained in those columns? I don't believe a negative value would be a problem for this process.
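The distinction between column numbers and column values can be illustrated with a simplified range-list parser. This is not Weka's implementation, only a sketch of the idea that each entry must name existing 1-based columns; the function name and the error wording are assumptions made for the example.

```python
def parse_column_ranges(spec, num_columns):
    """Expand a 1-based range list such as 'first-3,5-last' into column numbers."""

    def to_index(token):
        token = token.strip()
        if token == "first":
            return 1
        if token == "last":
            return num_columns
        return int(token)

    columns = []
    for part in spec.split(","):
        low_text, _, high_text = part.partition("-")
        low = to_index(low_text)
        high = to_index(high_text) if high_text else low
        # Reject entries that point outside the available columns.
        if low < 1 or high > num_columns or low > high:
            raise ValueError(f"Invalid range list at {part}")
        columns.extend(range(low, high + 1))
    return columns


# A dataset with 6 columns: select columns 1 to 3 and 5 to 6.
selected = parse_column_ranges("first-3,5-last", 6)
```

With only a handful of columns, a specification like "first-30,31-100" fails in this sketch at its first entry, which matches the shape of the error reported in the question.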
