Error importing JSONL dataset into Vertex AI - google-cloud-vertex-ai

I tried importing a JSONL dataset into Google's Vertex AI and got a strange, seemingly unrelated error:
Error: Could not parse the line, json is invalid or the format does not match the input schema: Cannot find field: classificationAnnotation in message google.cloud.aiplatform.master.schema.ImageBoundingBoxIoFormat. for: gs://[bucketname]/set.jsonl line 10
It happens on every fourth line of the file. All of my lines are identical except for the image name.
Line 10:
{"imageGcsUri":"gs://[mybucket]/path/to/image.png","classificationAnnotation":{"displayName":"MyLabel","annotationResourceLabels":{"aiplatform.googleapis.com/annotation_set_name":"MyLabel"}},"dataItemResourceLabels":{"aiplatform.googleapis.com/ml_use":"training"}}
Why am I getting this error?

Judging from the line you shared, it seems the image you are trying to access does not exist in the bucket you are using. Check that the image is stored under the same name and in the same format that the line refers to.
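The error text names two possible causes: the JSON is invalid, or the line does not match the input schema. A minimal sketch for ruling out the first cause locally before re-importing is shown here. The function name `check_jsonl` and the sample lines are hypothetical, and the check only looks at the two top-level keys that appear in the question; it does not contact Cloud Storage or validate against the actual Vertex AI schema.

```python
import json


def check_jsonl(lines):
    """Return (line_number, problem) tuples for lines that look suspicious."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            problems.append((number, "empty line"))
            continue
        try:
            record = json.loads(text)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "line is not a JSON object"))
            continue
        # Keys taken from the line quoted in the question.
        for key in ("imageGcsUri", "classificationAnnotation"):
            if key not in record:
                problems.append((number, f"missing key: {key}"))
    return problems


# Hypothetical sample: line 2 lacks its closing brace, line 3 is empty.
sample = [
    '{"imageGcsUri":"gs://example-bucket/a.png","classificationAnnotation":{"displayName":"MyLabel"}}',
    '{"imageGcsUri":"gs://example-bucket/b.png","classificationAnnotation":{"displayName":"MyLabel"}',
    "",
]

for number, problem in check_jsonl(sample):
    print(f"line {number}: {problem}")
```

For a real file, the same function can be fed `open("set.jsonl", encoding="utf-8")`. If every line passes, the remaining cause named in the message is a mismatch between the lines and the schema selected for the import.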

BagOfFeatures for Image Category Classification in Matlab

I am working through the MATLAB example "Image Category Classification".
I get an error when trying to build the vocabulary of SURF features with this command:
bag = bagOfFeatures(trainingSet);
The error is the following
Error using bagOfFeatures/parseInputs (line 1023)
The value of 'imgSets' is invalid. Expected imgSets to be one of these types:
imageSet
Instead its type was matlab.io.datastore.ImageDatastore.
I am passing an ImageDatastore instead of an imageSet, but I am following a MathWorks example. Can anyone explain why this is happening and how I can convert trainingSet into an imageSet?
You have to convert the ImageDatastore object to an imageSet object. This can simply be done by using the following line instead:
bag = bagOfFeatures(imageSet(trainingSet.Files));

SSIS: Data conversion failed from a Flat File Source

Good day.
The following errors occurred while processing the flat file:
Error: 0xC02020A1 at Task, File [1]: Data conversion failed. The data conversion for column "Column 0" returned status value 4 and status text "Text was truncated or one or more characters had no match in the target code page.".
Error: 0xC020902A at Task, File [1]: The "output column "Column 0" (14)" failed because truncation occurred, and the truncation row disposition on "output column "Column 0" (14)" specifies failure on truncation. A truncation error occurred on the specified object of the specified component.
Error: 0xC0202092 at Task, File [1]: An error occurred while processing file "filepath" on data row 1.
Error: 0xC0047038 at Task, SSIS.Pipeline: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component "Retrieve Input Batch File" (1) returned error code 0xC0202092. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.
The source file is a flat file.
The data type properties for the External Column and the Output Column are identical:
Data Type: String [DT_STR]
Length is 1143
I've tried experimenting with the values in the properties, but had no luck. What could be the reason for the error?
In addition, I tested two files. The first one succeeded, while the second did not. The difference between them is that the first uses DOS/Windows line endings, while the other uses UNIX line endings. Does that affect how the flat file is read?
Thank you so much for your input :)
From your flat file connection manager, open the Flat File Source Editor, click Error Output, go to the column in question, and select Ignore Failure. This worked for me (provided the column size is right).
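The question's observation that the DOS/Windows file works while the UNIX file fails can be explored outside SSIS. The sketch below assumes, and this is only an assumption since the question does not state it, that the connection manager uses CR LF as its row delimiter. Under that assumption a UNIX file contains no row delimiter at all, so everything is read as one long row that can exceed the 1143-character column. The helper names and sample data are hypothetical.

```python
def rows_seen(data: bytes, row_delimiter: bytes):
    """Split raw file bytes the way a reader with a fixed row delimiter would."""
    rows = data.split(row_delimiter)
    if rows and rows[-1] == b"":
        rows.pop()  # drop the empty piece after a trailing delimiter
    return rows


def longest_row(data: bytes, row_delimiter: bytes) -> int:
    """Length in bytes of the longest row for the given delimiter."""
    return max((len(row) for row in rows_seen(data, row_delimiter)), default=0)


COLUMN_WIDTH = 1143  # length of the DT_STR column from the question

# Hypothetical two-row files that differ only in their line endings.
dos_file = b"a" * 1000 + b"\r\n" + b"b" * 1000 + b"\r\n"
unix_file = b"a" * 1000 + b"\n" + b"b" * 1000 + b"\n"

for name, data in (("DOS", dos_file), ("UNIX", unix_file)):
    length = longest_row(data, b"\r\n")  # assumed delimiter: CR LF
    status = "fits" if length <= COLUMN_WIDTH else "would be truncated"
    print(f"{name}: longest row {length} bytes, {status}")
```

If the real files behave like this, matching the row delimiter to the file's line endings is worth trying before ignoring truncation failures.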

Error on importing 40M+ data using neo4j-import tool

I use neo4j-import to import 40M nodes. Below is my shell command:
[luning#pinnacle bin]$ ./neo4j-import --into ../data/weibo.db --nodes:User "/data/weibo/user-header.csv,/data/weibo/users/000000_0.csv,/data/weibo/users/000001_0.csv,/data/weibo/users/000002_0.csv,/data/weibo/users/000003_0.csv,/data/weibo/users/000004_0.csv,/data/weibo/users/000005_0.csv,/data/weibo/users/000006_0.csv,/data/weibo/users/000007_0.csv,/data/weibo/users/000008_0.csv,/data/weibo/users/000009_0.csv,/data/weibo/users/000010_0.csv,/data/weibo/users/000011_0.csv,/data/weibo/users/000012_0.csv,/data/weibo/users/000013_0.csv,/data/weibo/users/000014_0.csv,/data/weibo/users/000015_0.csv,/data/weibo/users/000016_0.csv,/data/weibo/users/000017_0.csv,/data/weibo/users/000018_0.csv,/data/weibo/users/000019_0.csv,/data/weibo/users/000020_0.csv,/data/weibo/users/000021_0.csv,/data/weibo/users/000022_0.csv,/data/weibo/users/000023_1.csv,/data/weibo/users/000024_0.csv,/data/weibo/users/000025_0.csv" --delimiter "TAB"
Nodes
[*>:87.20 MB/s---------------------------|PROPERTIES(2)===============|NOD|v:227.03 MB/s(2)====] 48M
Import error: Panic called, so exiting
Neo4j Import Tool
neo4j-import is used to create a new Neo4j database from data in CSV files. See
the chapter "Import Tool" in the Neo4j Manual for details on the CSV file format
- a special kind of header is required.
Usage:
--into <store-dir>
Database directory to import into. Must not contain existing database.
--nodes [:Label1:Label2] "<file1>,<file2>,..."
Node CSV header and data. Multiple files will be logically seen as one big file
from the perspective of the importer. The first line must contain the header.
Multiple data sources like these can be specified in one import, where each data
source has its own header. Note that file groups must be enclosed in quotation
marks.
--relationships [:RELATIONSHIP_TYPE] "<file1>,<file2>,..."
Relationship CSV header and data. Multiple files will be logically seen as one
big file from the perspective of the importer. The first line must contain the
header. Multiple data sources like these can be specified in one import, where
each data source has its own header. Note that file groups must be enclosed in
quotation marks.
--delimiter <delimiter-character>
Delimiter character, or 'TAB', between values in CSV data. The default option is
,.
--array-delimiter <array-delimiter-character>
Delimiter character, or 'TAB', between array elements within a value in CSV
I have checked their schema. They are all consistent. It shows
Import error: Panic called, so exiting
Does anybody know how to solve it?
Below is my stacktrace:
java.lang.RuntimeException: Panic called, so exiting
at org.neo4j.unsafe.impl.batchimport.staging.StageExecution.stillExecuting(StageExecution.java:63)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.anyStillExecuting(ExecutionSupervisor.java:79)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.finishAwareSleep(ExecutionSupervisor.java:102)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.supervise(ExecutionSupervisor.java:64)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisors.superviseDynamicExecution(ExecutionSupervisors.java:65)
at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.executeStages(ParallelBatchImporter.java:226)
at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.doImport(ParallelBatchImporter.java:151)
at org.neo4j.tooling.ImportTool.main(ImportTool.java:263)
Caused by: java.lang.RuntimeException: Panic called, so exiting
at org.neo4j.unsafe.impl.batchimport.staging.AbstractStep.assertHealthy(AbstractStep.java:189)
at org.neo4j.unsafe.impl.batchimport.staging.AbstractStep.await(AbstractStep.java:180)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep.receive(ExecutorServiceStep.java:82)
at org.neo4j.unsafe.impl.batchimport.staging.AbstractStep.sendDownstream(AbstractStep.java:226)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:103)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:87)
at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:217)
Caused by: java.lang.RuntimeException: Panic called, so exiting
... 7 more
Caused by: java.lang.RuntimeException: Panic called, so exiting
... 7 more
Caused by: org.neo4j.unsafe.impl.batchimport.input.InputException: ERROR in input
data source: BufferedCharSeeker[buffer:org.neo4j.csv.reader.SectionedCharBuffer#4ac5af5c, seekPos:2764030, line:2882236]
in field: descriptions:string:4
for header: [id:string, screenname:string, locations:string, descriptions:string, :IGNORE, profileimageurl:string, gender:string, followerscount:string, friendscount:string, statusescount:string, favouritescount:string, verified:string, verifiedreason:string, :IGNORE, :IGNORE, :IGNORE, :IGNORE, :IGNORE, :IGNORE, :IGNORE, darenint:string, :IGNORE, :IGNORE, updateddate:string]
raw field value: 6:19:
original error: Tried to read in a value larger than effective buffer size 8388608
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:152)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:42)
at org.neo4j.helpers.collection.PrefetchingIterator.peek(PrefetchingIterator.java:60)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:46)
at org.neo4j.helpers.collection.NestingIterator.fetchNextOrNull(NestingIterator.java:61)
at org.neo4j.helpers.collection.PrefetchingIterator.peek(PrefetchingIterator.java:60)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:46)
at org.neo4j.unsafe.impl.batchimport.staging.IteratorBatcherStep.nextBatchOrNull(IteratorBatcherStep.java:54)
at org.neo4j.unsafe.impl.batchimport.staging.InputIteratorBatcherStep.nextBatchOrNull(InputIteratorBatcherStep.java:42)
at org.neo4j.unsafe.impl.batchimport.staging.ProducerStep.process(ProducerStep.java:73)
at org.neo4j.unsafe.impl.batchimport.staging.ProducerStep$1.run(ProducerStep.java:54)
Caused by: java.lang.IllegalStateException: Tried to read in a value larger than effective buffer size 8388608
at org.neo4j.csv.reader.BufferedCharSeeker.fillBufferIfWeHaveExhaustedIt(BufferedCharSeeker.java:258)
at org.neo4j.csv.reader.BufferedCharSeeker.nextChar(BufferedCharSeeker.java:231)
at org.neo4j.csv.reader.BufferedCharSeeker.seek(BufferedCharSeeker.java:109)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:81)
... 10 more
One of the fields probably has an opening quote that is never closed, so the CSV parser keeps reading until it finds the next quote. It's unlikely that you have a single field in there that is 8M big, so that's my guess.
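A rough way to look for such a line before running the import is to count quote characters per line. This is only a sketch: the function name and sample rows are hypothetical, and a line with an odd number of quotes is merely a candidate, since a legitimately quoted field that spans several lines would be flagged as well.

```python
def lines_with_unbalanced_quotes(lines, quote='"'):
    """Return 1-based numbers of lines holding an odd number of quote characters."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if line.count(quote) % 2 == 1
    ]


# Hypothetical tab-separated rows; the second one opens a quote and never closes it.
sample = [
    'u1\talice\tBeijing\t"likes graphs"',
    'u2\tbob\tShanghai\t"6:19: never closed',
    "u3\tcarol\tShenzhen\tplain text",
]

print(lines_with_unbalanced_quotes(sample))  # [2]
```

For the real data, the function can be given an open file handle for each CSV file. The line number reported in the stack trace (line:2882236) is a reasonable place to start looking.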
I had the same error. Removing special characters such as "*", "&", and "/" while keeping the single quotes was enough to get rid of it.
I also got "Import error: Executor has been shut down" and "Import error: Panic called, so exiting" errors when I tried to import data to my graph using this method.
My data was free of quote characters (" and ') when I was getting these errors.
What solved my problem was getting rid of all other special characters.
I might have missed something in the documentation because I thought all the text in my node attributes would be read in as strings. Turns out neo4j-import doesn't like characters like "&" and "/"!
When I edited my data (yay sed!) to contain only alphanumeric characters the import tool worked perfectly.

Conditional formatting with repository variable

I am trying to conditionally format a graph with a repository variable. My goal is to end up with a number from 1 to 12 that corresponds to the current month.
When I try,
biServer.variables['CURRENT_MONTH']
I get the following error:
Graphing engine is not responding.
"A fatal error occurred while processing the request. The server responded with: oracle.bi.nanserver.fwk.exception.BISvsException: java.lang.NumberFormatException: For input string: "2014 / 07"."
Trying the following,
RIGHT(biServer.variables['CURRENT_MONTH'],2)
I get an error:
"A type mismatch occurred while evaluating an expression."
Finally, the following also errors:
RIGHT('biServer.variables['CURRENT_MONTH']',2)
"The syntax of the expression to be evaluated is invalid."
Anyone have ideas? Thanks.
I ended up with a workaround that is serviceable but not ideal.
I added a new column with a custom formula that compares the month number, in this case "7", to the repository value CURRENT_MONTH. If CURRENT_MONTH is greater than 7, it returns "."; otherwise it returns "null". (The period was the least noticeable character I could think of.)
I then added the new column to the graph and set a conditional format on it: if the value is not null (a period in this instance), apply the desired format.
The following link was most helpful for me.
http://bidirect.blogspot.com/2013/10/conditional-formatting-is-it-possible.html
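The first error message shows that the variable holds the string "2014 / 07", which is why it cannot be read directly as a number. As an illustration only, written in Python rather than in the BI server's expression language, the sketch below shows the conversion the RIGHT(..., 2) attempts were aiming for. The function name `month_from_period` is hypothetical.

```python
def month_from_period(value: str) -> int:
    """Extract the month number from a 'YYYY / MM' string such as '2014 / 07'."""
    _year_part, month_part = value.split("/")
    month = int(month_part.strip())
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    return month


# Converting the whole string fails, much like the NumberFormatException above.
try:
    int("2014 / 07")
except ValueError as exc:
    print("direct conversion fails:", exc)

print(month_from_period("2014 / 07"))  # 7
```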

Multi Line Graph D3.js

I'm trying to plot more than one line on my graph as per this example:
http://bl.ocks.org/mbostock/3884955
The data is being pulled from a MySQL database using PHP, with output in the following format:
[{"dateTimeTaken":"2013-02-21 07:39:29","reading":"12.2","parameterType":"Flouride"},
{"dateTimeTaken":"2013-02-21 07:39:34","reading":"12.01","parameterType":"Temperature"},
{"dateTimeTaken":"2013-02-21 07:39:39","reading":"12.01","parameterType":"PH"},...etc.
I would like one line per parameterType, but I'm not having any luck getting it working. At the moment I get the error "Problem Parsing d" and no lines are displayed at all.
https://gist.github.com/Majella/ab32fe0151fd487da3f6
I'd appreciate it if anyone could help me understand where I'm going wrong?
The problem is in your data.map call -- it's supposed to return the modified object you want in the result array. To fix, simply modify d and return it.
Working example here.
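The same principle can be sketched outside D3. The Python below is an analogy, not the answerer's JavaScript fix: the mapping function has to return the converted row, and the rows are then grouped so that there is one series per parameterType. The function names are hypothetical, and the sample rows follow the format shown in the question.

```python
from collections import defaultdict
from datetime import datetime

rows = [
    {"dateTimeTaken": "2013-02-21 07:39:29", "reading": "12.2", "parameterType": "Flouride"},
    {"dateTimeTaken": "2013-02-21 07:39:34", "reading": "12.01", "parameterType": "Temperature"},
    {"dateTimeTaken": "2013-02-21 07:39:39", "reading": "12.01", "parameterType": "PH"},
]


def parse_row(d):
    """Convert the string fields of one row and return the converted copy."""
    parsed = dict(d)
    parsed["dateTimeTaken"] = datetime.strptime(d["dateTimeTaken"], "%Y-%m-%d %H:%M:%S")
    parsed["reading"] = float(d["reading"])
    # Without this return, map() would produce None for every row.
    return parsed


def group_by_parameter(raw_rows):
    """Build one list of (time, reading) points per parameterType."""
    series = defaultdict(list)
    for row in map(parse_row, raw_rows):
        series[row["parameterType"]].append((row["dateTimeTaken"], row["reading"]))
    return dict(series)


for name, points in group_by_parameter(rows).items():
    print(name, points)
```

Each key of the resulting dictionary corresponds to one line on the chart.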
