Unexpected keyword argument 'unk_token' - huggingface-transformers

When trying to load this tokenizer I am getting this error, but I don't know why it can't take the unk_token. Any ideas?
tokenizer = tokenizers.SentencePieceUnigramTokenizer(unk_token="", eos_token="", pad_token="")
----> 1 tokenizer = tokenizers.SentencePieceUnigramTokenizer(unk_token="", eos_token="", pad_token="")
TypeError: __init__() got an unexpected keyword argument 'unk_token'
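For what it's worth, in recent releases of the tokenizers library the SentencePieceUnigramTokenizer constructor does not take special-token keyword arguments; those tokens are usually supplied when training. A minimal sketch, assuming a recent tokenizers version and a hypothetical corpus.txt (verify the train() signature for your installed version):

import tokenizers

# The constructor takes no unk/eos/pad keyword arguments here;
# special tokens are passed at training time instead.
tokenizer = tokenizers.SentencePieceUnigramTokenizer()
tokenizer.train(
    files=["corpus.txt"],                       # hypothetical training file
    vocab_size=8000,
    special_tokens=["<unk>", "</s>", "<pad>"],  # unk / eos / pad
    unk_token="<unk>",
)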

Related

Can a logstash filter error be forwarded to elastic?

I'm having these json parsing errors from time to time:
[2022-01-07T12:15:19,872][WARN ][logstash.filters.json ] Error parsing json
{:source=>"message", :raw=>" { the invalid json }", :exception=>#<LogStash::Json::ParserError: Unrecognized character escape 'x' (code 120)
Is there a way to get the :exception field in the logstash config file?
I opened the exact same thread on the elastic forum and got a working solution there. Thanks to @Badger on the forum, I ended up using the following raw ruby filter:
ruby {
    code => '
        @source = "message"
        source = event.get(@source)
        return unless source
        begin
            parsed = LogStash::Json.load(source)
        rescue => e
            event.set("jsonException", e.to_s)
            return
        end
        @target = "jsonData"
        if @target
            event.set(@target, parsed)
        end
    '
}
which extracts the info I needed:
"jsonException" => "Unexpected character (',' (code 44)): was expecting a colon to separate field name and value\n at [Source: (byte[])\"{ \"baz\", \"oh!\" }\r\"; line: 1, column: 9]",
Or, as the author of the solution suggested, get rid of the @target part and use the normal json filter for the rest of the data.
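For reference, a minimal sketch of that alternative, reusing the field names from the ruby filter above; adjust source and target to match your pipeline:

filter {
  json {
    source => "message"
    target => "jsonData"
  }
}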

How to get file name that causes GraphQLError: Syntax Error: Unterminated string

In our team we sometimes get a GraphQL syntax error when modifying our schema.
However, we don't seem to get the name of the file causing the issue. The error looks like this:
GraphQLError: Syntax Error: Unterminated string.
at syntaxError (<full-path-to-project>>node_modules/graphql/error/syntaxError.js:15:10)
at readString (<full-path-to-project>>node_modules/graphql/language/lexer.js:513:38)
at readToken (<full-path-to-project>>node_modules/graphql/language/lexer.js:267:14)
at Object.lookahead (<full-path-to-project>>node_modules/graphql/language/lexer.js:54:43)
at Object.advanceLexer [as advance] (<full-path-to-project>>node_modules/graphql/language/lexer.js:44:33)
at Parser.parseStringLiteral (<full-path-to-project>>node_modules/graphql/language/parser.js:519:17)
at Parser.parseDescription (<full-path-to-project>>node_modules/graphql/language/parser.js:728:19)
at Parser.parseFieldDefinition (<full-path-to-project>>node_modules/graphql/language/parser.js:856:28)
at Parser.optionalMany (<full-path-to-project>>node_modules/graphql/language/parser.js:1497:28)
at Parser.parseFieldsDefinition (<full-path-to-project>>node_modules/graphql/language/parser.js:846:17)
at Parser.parseObjectTypeDefinition (<full-path-to-project>>node_modules/graphql/language/parser.js:798:23)
at Parser.parseTypeSystemDefinition (<full-path-to-project>>node_modules/graphql/language/parser.js:696:23)
at Parser.parseDefinition (<full-path-to-project>>node_modules/graphql/language/parser.js:146:23)
at Parser.many (<full-path-to-project>>node_modules/graphql/language/parser.js:1518:26)
at Parser.parseDocument (<full-path-to-project>>node_modules/graphql/language/parser.js:111:25)
at parse (<full-path-to-project>>node_modules/graphql/language/parser.js:36:17) {
message: 'Syntax Error: Unterminated string.',
locations: [ { line: 27, column: 51 } ]
}
Is this normal - and how do I get the file causing the problem?

Getting a WARNING and ERROR: unexpected keyword argument 'queryset'

Unexpected keyword argument 'queryset' in constructor call [E:unexpected-keyword-arg]
I tried using form_kwargs as shown on Stack Overflow here:
# How to use the new form_kwargs on an inline formset?
if request.method == "POST":
    ctx['formset'] = project_comparison_form_set(
        data=request.POST, files=request.FILES, queryset=ctx['projects'])
    ctx['data1'] = request.POST.copy()
    if ctx['formset'].is_valid():
        instances = ctx['formset'].save(commit=False)
        for project in instances:
            project.save()
I am getting both a warning and an error message in pylint: Unexpected keyword argument 'queryset' in constructor call [E:unexpected-keyword-arg].
You don't show where project_comparison_form_set is defined, but I assume that it is a modelformset_factory. The factory itself does not have a queryset argument. If you do want to pass a queryset, you can pass it to the formset and pass that formset to modelformset_factory.
Check the documentation: https://docs.djangoproject.com/en/4.1/topics/forms/modelforms/#changing-the-queryset.
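As a side note, pylint often cannot infer the class returned by a factory function, so unexpected-keyword-arg can be a false positive even when the code runs. A minimal sketch of the documented pattern, assuming project_comparison_form_set comes from modelformset_factory; Project and ProjectComparisonForm are placeholder names for your model and form:

from django.forms import modelformset_factory

# Build the formset class from the model and form (placeholder names).
project_comparison_form_set = modelformset_factory(
    Project,
    form=ProjectComparisonForm,
    extra=0,
)

# Instances of the generated formset class accept queryset=...
formset = project_comparison_form_set(
    data=request.POST,
    files=request.FILES,
    queryset=ctx['projects'],
)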

ServiceNow: transform map stopped due to error: java.lang.NumberFormatException

The transform is getting aborted, but only if I mark the "Copy empty fields" checkbox. The rest of the entries in the Import Set also get stuck at Pending. I verified the transform script as well, but no luck.
Below is the error:
Import set: ISETxxxxxxx transform stopped due to error: java.lang.NumberFormatException
java.lang.NumberFormatException
at java.math.BigDecimal.<init>(BigDecimal.java:596)
at java.math.BigDecimal.<init>(BigDecimal.java:383)
at java.math.BigDecimal.<init>(BigDecimal.java:806)
at com.glide.script.glide_elements.GlideNumber.getSafeBigDecimal(GlideNumber.java:42)
at com.glide.currency.GlideElementCurrency.coerceAmount(GlideElementCurrency.java:406)
at com.glide.currency.GlideElementCurrency.cleanAmount(GlideElementCurrency.java:389)
at com.glide.currency.GlideElementCurrency.setDisplayValue(GlideElementCurrency.java:136)
at com.glide.currency.GlideElementCurrency.setValue(GlideElementCurrency.java:89)
at com.glide.db.impex.transformer.TransformerField.copyEmptyFields(TransformerField.java:202)
at com.glide.db.impex.transformer.TransformerField.setValue(TransformerField.java:130)
at com.glide.db.impex.transformer.TransformerField.transformField(TransformerField.java:84)
at com.glide.db.impex.transformer.TransformRow.transformCurrent(TransformRow.java:100)
at com.glide.db.impex.transformer.TransformRow.transform(TransformRow.java:69)
at com.glide.db.impex.transformer.Transformer.transformBatch(Transformer.java:150)
at com.glide.db.impex.transformer.Transformer.transform(Transformer.java:76)
at com.glide.system_import_set.ImportSetTransformerImpl.transformEach(ImportSetTransformerImpl.java:239)
at com.glide.system_import_set.ImportSetTransformerImpl.transformAllMaps(ImportSetTransformerImpl.java:91)
at com.glide.system_import_set.ImportSetTransformer.transformAllMaps(ImportSetTransformer.java:64)
at com.glide.system_import_set.ImportSetTransformer.transformAllMaps(ImportSetTransformer.java:50)
at com.snc.automation.ScheduledImportSetJob.runImport(ScheduledImportSetJob.java:55)
at com.snc.automation.ScheduledImportJob.execute(ScheduledImportJob.java:45)
at com.glide.schedule.JobExecutor.execute(JobExecutor.java:83)
at com.glide.schedule.GlideScheduleWorker.executeJob(GlideScheduleWorker.java:207)
at com.glide.schedule.GlideScheduleWorker.process(GlideScheduleWorker.java:145)
at com.glide.schedule.GlideScheduleWorker.run(GlideScheduleWorker.java:62)
I'm guessing you have a required field that is a decimal or similar.
The error java.lang.NumberFormatException indicates it is failing to convert an empty string to a number such as 0.0.
Use a source script on that field to convert empty values, something along the lines of this:
answer = (function transformEntry(source) {
    // Default empty values to 0.0 so the decimal/currency field can parse them
    if (source.u_number_field.nil())
        return 0.0;
    // Otherwise pass the source value through unchanged
    return source.u_number_field;
})(source);

Wordcount NoneType error - pyspark

I am trying to do some text analysis:
import re
import string
from pyspark.sql.functions import udf, explode, split
from pyspark.sql.types import StringType

def cleaning_text(sentence):
    sentence = sentence.lower()
    sentence = re.sub('\'', '', sentence.strip())
    sentence = re.sub('^\d+\/\d+|\s\d+\/\d+|\d+\-\d+\-\d+|\d+\-\w+\-\d+\s\d+\:\d+|\d+\-\w+\-\d+|\d+\/\d+\/\d+\s\d+\:\d+', ' ', sentence.strip())  # dates removed
    sentence = re.sub(r'(.)(\/)(.)', r'\1\3', sentence.strip())
    sentence = re.sub("(.*?\//)|(.*?\\\\)|(.*?\\\)|(.*?\/)", ' ', sentence.strip())
    sentence = re.sub('^\d+', '', sentence.strip())
    sentence = re.sub('[%s]' % re.escape(string.punctuation), '', sentence.strip())
    cleaned = ' '.join([w for w in sentence.split() if not len(w) < 2 and w not in ('no', 'sc', 'ln')])
    cleaned = cleaned.strip()
    if len(cleaned) <= 1:
        return "NA"
    else:
        return cleaned

org_val = udf(cleaning_text, StringType())
df_new = df.withColumn("cleaned_short_desc", org_val(df["symptom_short_description_"]))
df_new = df_new.withColumn("cleaned_long_desc", org_val(df_new["long_description"]))
longWordsDF = df_new.select(explode(split('cleaned_long_desc', ' ')).alias('word'))
longWordsDF.count()
I get the following error.
File "<stdin>", line 2, in cleaning_text
AttributeError: 'NoneType' object has no attribute 'lower'
I want to perform word counts but any kind of aggregation function is giving me an error.
I tried the following things:
Added sentence=sentence.encode("ascii", "ignore") in the cleaning_text function.
Called df.dropna().
It's still giving the same issue, and I do not know how to resolve it.
It looks like you have null values in some columns. Add an if check at the beginning of the cleaning_text function and the error will disappear:
if sentence is None:
    return "NA"
