Talend Convert string to Float - converters

I am using Talend to build an ETL project.
To convert my string to Double, I use Float.parseFloat(row4.Exportation2.trim()).
The job fails with the error shown below.
The data in the Exportation2 column looks like this: "766,9997474", "1 345,43".
Does anyone have an idea why?
Starting job ConvertString at 14:03 27/01/2017.
Exception in component tMap_1
java.lang.NumberFormatException: For input string: "23,4897452"
at sun.misc.FloatingDecimal.readJavaFormatString(Unknown Source)
at sun.misc.FloatingDecimal.parseDouble(Unknown Source)
at java.lang.Double.parseDouble(Unknown Source)
at last.convertstring_0_1.ConvertString.tFileInputDelimited_1Process(ConvertString.java:1553)
at last.convertstring_0_1.ConvertString.runJobInTOS(ConvertString.java:2075)
at last.convertstring_0_1.ConvertString.main(ConvertString.java:1932)
[statistics] connecting to socket on port 3464
[statistics] connected
6|Royaume-Uni|BA|1971|23,4897452
[statistics] disconnected
Job ConvertString ended at 14:03 27/01/2017. [exit code=1]

There are multiple problems here:
If you want to convert from String to Double, you should use Double.parseDouble() rather than Float.parseFloat().
"," is not the expected decimal separator; it should be ".".
You will have to convert the "," character to ".": if your input comes from an Excel or delimited file, you can set this option in the advanced settings of the tFileInput component ("advanced separator"). Otherwise, you should use yourString.replaceAll(",", ".").
There is a non-standard space in the string, which you should remove with yourString.replaceAll(" ", "").
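The clean-up steps listed above can be sketched independently of Talend. The following is a minimal Python illustration, not Talend code; `parse_decimal` is a hypothetical helper name, and the sketch assumes the input always uses a decimal comma and whitespace (possibly non-breaking) as the thousands separator.

```python
def parse_decimal(text: str) -> float:
    """Parse a number written with a decimal comma and space-grouped thousands."""
    # Drop every whitespace character, including the non-breaking space
    # (U+00A0) and narrow non-breaking space (U+202F) that some exports use.
    compact = "".join(ch for ch in text if not ch.isspace())
    # Swap the decimal comma for the dot that float() expects.
    return float(compact.replace(",", "."))


print(parse_decimal("766,9997474"))
print(parse_decimal("1 345,43"))
```

The same two steps (strip the grouping spaces, then replace the comma) apply to the Java expression used in the tMap.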

To do so, you can use several functions in tMap, for example:
columnValue = columnValue.replaceAll("\\W","");
\w = anything that is a word character
\W = anything that isn't a word character (including punctuation etc.)
\s = anything that is a space character (including space, tab characters etc.)
\S = anything that isn't a space character (including both letters and numbers, as well as punctuation etc.)
Or, if you want to remove anything other than a letter or a digit, you can simply use:
.replaceAll("[^a-zA-Z0-9]", "")
Be aware that both patterns also strip "," and ".", so they would remove the decimal separator from a numeric string.

rename multiple mp3 filenames from greek to english with bash script

I have dozens of mp3 files where the filename contains Greek letters. I would like to rename them to "latin only characters" so that the title etc. is displayed correctly on all common playback devices.
It takes a long time to do this manually, so I need your help.
Is there a simple bash script that can do this job?
For example:
I want the script to rename the file from σαγαπώ.mp3 to sagapo.mp3
Edit:
I have now been able to rename the files with a Python script.
From:
Βασίλης Μπατής - Ζημιά _ Vasilis Mpatis - Zimia _ Official Video Clip HQ 2017.mp3
it produced:
Basilis Mpatis - Zimia _ Vasilis Mpatis - Zimia _ Official Video Clip HQ 2017.mp3
So far so good. Now the question is how to get rid of all the "unnecessary" information in the file name, so that in the end only the artist and title remain.
This is what the file name should look like at the end:
Basilis Mpatis - Zimia.mp3
Does anyone have an idea?
Here is my Python script:
import os

# Path to the folder containing the MP3 files
path = '/home/sakis/mp3'

# Greek-to-Latin table (same mapping as before, one entry per character)
GREEK_TO_LATIN = str.maketrans({
    'Ά': 'A', 'Έ': 'E', 'Ή': 'H', 'Ί': 'I', 'Ό': 'O', 'Ύ': 'Y', 'Ώ': 'W', 'ΐ': 'I',
    'Α': 'A', 'Β': 'B', 'Γ': 'G', 'Δ': 'D', 'Ε': 'E', 'Ζ': 'Z', 'Η': 'H', 'Θ': 'TH',
    'Ι': 'I', 'Κ': 'K', 'Λ': 'L', 'Μ': 'M', 'Ν': 'N', 'Ξ': 'X', 'Ο': 'O', 'Π': 'P',
    'Ρ': 'R', 'Σ': 'S', 'Τ': 'T', 'Υ': 'Y', 'Φ': 'F', 'Χ': 'X', 'Ψ': 'PS', 'Ω': 'O',
    'ά': 'a', 'έ': 'e', 'ή': 'i', 'ί': 'i', 'ό': 'o', 'ύ': 'y', 'ώ': 'w', 'ϊ': 'i', 'ϋ': 'u',
    'α': 'a', 'β': 'b', 'γ': 'g', 'δ': 'd', 'ε': 'e', 'ζ': 'z', 'η': 'i', 'θ': 'th',
    'ι': 'i', 'κ': 'k', 'λ': 'l', 'μ': 'm', 'ν': 'n', 'ξ': 'x', 'ο': 'o', 'π': 'p',
    'ρ': 'r', 'ς': 's', 'σ': 's', 'τ': 't', 'υ': 'y', 'φ': 'f', 'χ': 'x', 'ψ': 'ps', 'ω': 'o',
})

# Loop over all MP3 files in the folder
for old_name in os.listdir(path):
    if old_name.endswith('.mp3'):
        # Split the file name into base name and extension
        name, extension = old_name.rsplit('.', 1)
        # Replace the Greek letters in the base name
        name = name.translate(GREEK_TO_LATIN)
        # Remove every remaining character except letters, digits, ' ', '-' and '_'
        name = ''.join(c for c in name if c.isalnum() or c in ' -_')
        # Set the new file name
        new_name = name + '.' + extension
        os.rename(os.path.join(path, old_name), os.path.join(path, new_name))
print('Done!')
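For the follow-up question about dropping the extra information, one possible approach is sketched below. It assumes that everything after the first " _ " separator is unwanted, which matches the single example file name given above but may not hold for other files; `artist_and_title` is a hypothetical helper name.

```python
def artist_and_title(filename: str, separator: str = " _ ") -> str:
    """Keep only the text before the first separator, plus the extension."""
    stem, dot, extension = filename.rpartition(".")
    if not dot:
        # No extension at all: treat the whole name as the stem.
        stem, extension = filename, ""
    kept = stem.split(separator, 1)[0].strip()
    return f"{kept}.{extension}" if extension else kept


print(artist_and_title(
    "Basilis Mpatis - Zimia _ Vasilis Mpatis - Zimia _ Official Video Clip HQ 2017.mp3"
))
```

File names without the separator are returned unchanged, so the function can be applied to the whole folder.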
You need to define your own translation table, because nobody can guess how you want to translate the names. Assuming that the Greek name is stored in the variable greek_name, something like this could do:
english_name=$(tr 'αβΓγΔδεΖζ...' 'avGgDdeZz...' <<<"$greek_name")
Note that tr reads its input from standard input. GNU tr only fully supports single-byte characters, so on systems with GNU coreutils a multibyte-aware tool (for example sed with its y command) may be needed instead.
Of course you have to make compromises: since, for instance, the letter υ can be pronounced as "i", "f" or "w" depending on the context, you have to settle for one.
Another problem is that several Greek letters are pronounced the same, for instance Ο and Ω. If you don't manage to map them uniquely, two Greek file names might map to the same English file name. Therefore, when you do the renaming, make sure that you at least get an error message in this case:
if ! mv -n "$greek_name" "$english_name"
then
  echo "Cannot rename $greek_name, because $english_name already exists"
fi
UPDATE:
It's not clear how you would translate, e.g., ψ, as the most natural mapping would use two characters, "ps". You could either use an English letter which has no equivalent in Greek anyway ('c' comes to mind), or translate these special cases in a separate step, for instance:
# english_name could still contain a ψ because this
# was not handled by `tr`
english_name=${english_name//ψ/ps}
Of course, you have to make up your mind whether you want the upper-case Ψ to be translated into PS or Ps.
You have not specified how you want the English letter b to be used in the translation. In Greek, this sound is written as μπ, i.e. two Greek letters map to a single English one. If you want to implement this mapping, you have to do it before the one-to-one translation done by tr, for instance:
# Preprocess μπ before translating the other
# Greek letters:
greek_name=${greek_name//μπ/b}
greek_name=${greek_name//Μ[πΠ]/B}
This reflects the idea that a Greek word starting with Μπ is meant to be a word starting with an upper-case letter, and ΜΠ is meant to start an all-upper-case word, both corresponding to an upper-case B in English.
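The ordering described in this answer (digraphs first, then the one-to-one table, then a collision check before renaming) can be sketched in Python as follows. The tables are hypothetical and deliberately incomplete; `transliterate` and `plan_renames` are illustrative names, and the choice of "Ps" for Ψ is just one of the options discussed above.

```python
# Hypothetical, incomplete tables: extend them with your own choices.
DIGRAPHS = [("ΜΠ", "B"), ("Μπ", "B"), ("μπ", "b")]  # must be handled first
SINGLE = str.maketrans({
    "α": "a", "ά": "a", "γ": "g", "π": "p", "σ": "s", "ς": "s",
    "ο": "o", "ό": "o", "ω": "o", "ώ": "o", "ψ": "ps", "Ψ": "Ps",
})


def transliterate(name: str) -> str:
    """Replace digraphs before applying the per-character table."""
    for greek, latin in DIGRAPHS:
        name = name.replace(greek, latin)
    return name.translate(SINGLE)


def plan_renames(names):
    """Map each old name to its new name, refusing to create duplicates."""
    plan, taken = {}, set()
    for old in names:
        new = transliterate(old)
        if new in taken:
            raise ValueError(f"{old!r} collides with another file at {new!r}")
        taken.add(new)
        plan[old] = new
    return plan


print(plan_renames(["σαγαπώ.mp3"]))
```

Building the full plan before touching any file means a collision such as Ο versus Ω is reported before anything has been renamed.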

How to send color images with ESC/POS?

I have an Epson CW-C6000 that I'm trying to control with ESC commands. I've gotten text to print, so I know the IP address, port, etc. are correct, but I cannot for the life of me get an image to print.
Here is my code (running from a Ruby on Rails server, with most of the image truncated):
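require 'socket'
streamSock = TCPSocket.new("X.X.X.X", 9100)
# Store the image on the printer (~DY), then send a label (^XA ... ^XZ) that references it
str = "~DYR:PRODIMG,B,P,183208,0,89504E470D...4AE426082" + "^XA" + "^FO150,150^IMR:PRODIMG.PNG^FS" + "^XZ"
streamSock.send(str, 0)
streamSock.close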
The image is a .png I converted to hexadecimal with this site:
http://tomeko.net/online_tools/file_to_hex.php?lang=en
I'm mostly using page 10 of this PDF for reference:
https://files.support.epson.com/pdf/pos/bulk/esclabel_apg_en_forcw-c6000series_reve.pdf
Does anyone have a hint? Epson support staff was spectacularly unhelpful.
Also I'm sorry if my formatting is bad; I'm new here and will happily edit my post if something is wrong.
Alright I finally got it working. The command for printing a color .PNG is this:
~DYE:[Image Name].PNG,p,p,[Image Size],0,:B64:[Base64 String]:[CRC]
Things that tripped me up:
- You seem to need the .PNG extension on the file name, even though the Epson manual doesn't show that.
- [Image Size] is the number of characters in the Base64 string, even though the Epson manual says it should be the size of the original .PNG image file. If this is wrong, the printer will hang and no longer accept input of any kind until restarted.
- There may be other options, but I could only get it working with a hex CRC of the CRC-16/XMODEM type.
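As an illustration of the command layout above, here is a minimal Python sketch that assembles the string. `build_png_command` is a hypothetical helper name. The answer does not say which bytes the CRC covers, so this sketch assumes it is computed over the Base64 text; treat that as an assumption to verify against the printer.

```python
import base64
import binascii


def build_png_command(png_bytes: bytes, name: str = "PRODIMG.PNG") -> bytes:
    """Assemble ~DYE:<name>,p,p,<size>,0,:B64:<base64>:<crc> as bytes."""
    b64 = base64.b64encode(png_bytes)  # ASCII-only bytes
    # ASSUMPTION: the CRC covers the Base64 text (not stated in the answer).
    # crc_hqx uses polynomial 0x1021; with initial value 0 this is CRC-16/XMODEM.
    crc = binascii.crc_hqx(b64, 0)
    # The size field is the number of Base64 characters, as noted above.
    header = f"~DYE:{name},p,p,{len(b64)},0,:B64:".encode("ascii")
    return header + b64 + f":{crc:04X}".encode("ascii")


print(build_png_command(b"\x89PNG", "T.PNG"))
```

The resulting bytes can be written to the same raw TCP socket (port 9100) used for the text commands.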
Thanks to K J for his/her suggestions and coming along with me!
Perhaps this material can be used as an additional reference.
They seem to have a completely different command/data format than ESC/POS.
ESC/Label Command Reference Guide
Page 12
1.3.4 About Saving the Graphics and Label Formats in the Printer
With ESC/Label command, you can save graphics and label formats in the printer. The printer has a file system. Data saved in the printer is handled as files and is managed in the following way.
The file system does not have a hierarchy.
The printer has a non-volatile saving device, such as Flash ROM, and a volatile saving device, such as RAM, and different drive letters are allocated for each device.
Files are designated as
"<drive letter> colon <:> <file name> dot <.> <extension>".
Page 40-41
2.8 Printing Graphics
...Details have been omitted. Please refer to the actual document...
2.8.1 Registering a Graphic in a Printer and Printing It
...Pick up some from the content. Please refer to the actual document...
1. Delete the files that remain in the printer (^ID command).
2. Register the graphic in the printer (~DY command).
When registering a color graphic, you can use the PNG format. When registering a monochrome graphic, you can register the PNG format or the GRF format.
PNG format Monochrome and color graphics
GRF format Monochrome graphics
The reason for executing step 1:
To ensure that the storage memory has enough capacity for the print job the application will perform.
2.8.2 Embedding a Graphic in the Field and Printing It
...Details have been omitted. Please refer to the actual document...
In Addition:
Page 104-106
~DY
[Name]
Save File
[Format]
~DY d: o ,f ,x ,t ,w ,data
...A table detailing the parameters is due, but omitted...
[Function]
...Further detailed explanations and figures of functions and parameters are due, but omitted...
Graphic data is handled as follows.
If the data format is binary, you can use any binary data as Parameter data. At this time, the size of Parameter data must be matched to the size specified in Parameter t.
If the data format is a hexadecimal character string, one character from 1. to 3. below is used as Parameter data. At this time, the size of Parameter data written in binary must be matched to the size specified in Parameter t.
1. 0 to 9, A to F, and a to f in ASCII can be used as hexadecimal graphic data.
2. ASCII comma <,>, the parameter separator character, is used to separate lines. If a comma is input, processing is carried out as if ASCII 0 was input for the remainder of the line.
3. G to Y and g to z in ASCII can be used as repetition characters. For example, if I9 is input, processing is carried out as if 999 were input. The following table indicates the number of repetitions.
...Characters and repeat specified number of times table omitted...
Looking at the contents of this Technical Reference Guide, it seems that you can register images with tools instead of commands.
CW-C6000/C6500 Series Technical Reference Guide
Page 173-174
And page 288 outlines the Epson Inkjet Label Printer SDK and also describes the existence of sample programs.
@Farmbot26, I have been attempting the same thing using VB.NET and, as you noted, Epson support is not helpful. I'm not sure whether the actual image data, the CRC, or the ZPL code is wrong, as nothing helps. Here are two examples that have not worked.
' Attempt 1: Base64 payload, CRC computed over the raw file bytes
Dim binaryData As Byte() = System.IO.File.ReadAllBytes(txtPNGFile.Text)
zplImageData = Convert.ToBase64String(binaryData)
crc = calcrc(binaryData, binaryData.Length).ToString("X4")
Dim zplToSend As String = "~DYE:" & Path.GetFileName(txtPNGFile.Text).ToUpper & ",P,P," & zplImageData.Length & ",0,:B64:" & zplImageData & ":" & crc & "^XZ"

' Attempt 2: hexadecimal payload (format A), still tagged :B64:
Dim binaryData As Byte() = System.IO.File.ReadAllBytes(txtPNGFile.Text)
crc = calcrc(binaryData, binaryData.Length).ToString("X4") 'Calculate CRC
zplImageData = BitConverter.ToString(binaryData).Replace("-", "")
Dim zplToSend As String = "~DYE:" & Path.GetFileName(txtPNGFile.Text).ToUpper & ",A,P," & zplImageData.Length & ",0,:B64:" & zplImageData & ":" & crc & "^XZ"
This is the CRC example I have.
Function calcrc(ByVal data() As Byte, ByVal count As Integer) As Integer
    Dim crc As Integer = 0
    For Each b As Byte In data
        Dim d As Integer = CInt(b)
        crc = crc Xor (d << 8)
        For j = 0 To 7
            If ((crc And &H8000) <> 0) Then
                crc = (crc << 1) Xor &H1021
            Else
                crc = (crc << 1)
            End If
        Next
    Next
    Return crc And &HFFFF
End Function
I have figured out another solution: save the PNG image using the binary data. I found this when reading the backup file of image data saved by the Epson Settings Utility.
~DYE:FILENAME.PNG,B,P,BINARYFILESIZE,0,BINARYIMGDATA
' Requires: Imports System.IO and Imports System.Text
Try
    Dim binaryData As Byte() = System.IO.File.ReadAllBytes(txtPNGFile.Text)
    Dim client As System.Net.Sockets.TcpClient = New System.Net.Sockets.TcpClient()
    client.Connect(IP_TextBox1.Text.Replace(" ", ""), CInt(txtPort.Text))
    Dim writer As System.IO.StreamWriter = New System.IO.StreamWriter(client.GetStream(), Encoding.UTF8)
    Using mStream As New MemoryStream(binaryData)
        ' Send the text header first, then the raw PNG bytes
        Dim zplToSend As String = "~DYE:" & Path.GetFileName(txtPNGFile.Text).ToUpper & ",B,P," & mStream.Length & ",0,"
        writer.Write(zplToSend)
        writer.Flush()
        mStream.WriteTo(client.GetStream())
        writer.Flush()
    End Using
    writer.Close()
    client.Close()
    MsgBox("Send Complete", MsgBoxStyle.OkOnly, "Complete")
Catch ex As Exception
    MsgBox(ex.Message.ToString, MsgBoxStyle.OkOnly, "ERROR")
End Try
You can also open the image file in an IMAGE object and resize it as needed. I had to do this for the label size of the printer.
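For comparison, the binary variant can be sketched in Python. This is only an illustration of the layout described above (text header, then the raw file bytes); `build_binary_png_command` and `send_to_printer` are hypothetical names, and port 9100 is taken from the question.

```python
import socket


def build_binary_png_command(png_bytes: bytes, name: str) -> bytes:
    """Return ~DYE:<NAME>,B,P,<size>,0, followed by the raw PNG bytes."""
    # The size field is the length of the binary file in bytes.
    header = f"~DYE:{name.upper()},B,P,{len(png_bytes)},0,".encode("ascii")
    return header + png_bytes


def send_to_printer(payload: bytes, host: str, port: int = 9100) -> None:
    """Open a raw TCP connection and send the whole payload."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(payload)


print(build_binary_png_command(b"\x89PNG", "logo.png"))
```

Keeping the payload as bytes from start to finish avoids any text encoding being applied to the image data.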

How to print Unicode characters in Command Prompt with Ruby

I was wondering how to print unicode characters, such as Japanese or fun characters like 📦.
I can print hearts with:
hearts = "\u2665"
puts hearts.encode('utf-8')
How can I print more Unicode characters with Ruby in Command Prompt?
My method works with some characters but not all.
Code examples would be greatly appreciated.
You need to enclose the Unicode code point in { and } if it does not have exactly 4 hex digits (credit: /u/Stefan), e.g.:
heart = "\u2665"
package = "\u{1F4E6}"
fire_and_one_hundred = "\u{1F525 1F4AF}"
puts heart
puts package
puts fire_and_one_hundred
Alternatively, you could put the Unicode character directly in your source. On macOS this is easy with the Emoji & Symbols menu, opened with Ctrl + Command + Space by default; a similar menu can be opened on Windows 10 with Win + ;. Both work in most applications, most likely including your text editor or Ruby IDE:
heart = "♥"
package = "📦"
fire_and_one_hundred = "🔥💯"
puts heart
puts package
puts fire_and_one_hundred
Output:
♥
📦
🔥💯
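For readers comparing languages, the same distinction between four-digit and longer code points shows up in Python, sketched here purely as an illustration. Python has no brace form: `\u` takes exactly four hex digits and `\U` exactly eight, while `chr()` accepts the code point as an integer.

```python
heart = "\u2665"        # exactly four hex digits
package = "\U0001F4E6"  # eight hex digits for code points above U+FFFF
# chr() builds a character from its integer code point
fire_and_one_hundred = chr(0x1F525) + chr(0x1F4AF)

print(heart)
print(package)
print(fire_and_one_hundred)
```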

Setting a string of ASCII-art to a variable while escaping special characters

I'm working in Ruby and I'd like to assign a huge string of ASCII-art characters to a variable, but I am running into problems with special characters. Most ASCII art contains octothorpes, quotes and all sorts of other problematic special characters that require escaping. Is there an easy way to escape large numbers of special characters without having to go through them one by one?
Use a heredoc; that is what it is for. With a heredoc, you can set the terminating string to any string that does not appear in your ASCII art. Combine that with single quotes around the terminator, which disable interpolation and escape processing, and you do not need to escape anything.
ascii_art = <<'SOME_SEQUENCE_THAT_DOES_NOT_APPEAR_IN_THE_ASCII_ART'
.....'',;;::cccllllllllllllcccc:::;;,,,''...'',,'..
..';cldkO00KXNNNNXXXKK000OOkkkkkxxxxxddoooddddddxxxxkkkkOO0XXKx:.
.':ok0KXXXNXK0kxolc:;;,,,,,,,,,,,;;,,,''''''',,''.. .'lOXKd'
.,lx00Oxl:,'............''''''................... ...,;;'. .oKXd.
.ckKKkc'...'',:::;,'.........'',;;::::;,'..........'',;;;,'.. .';;'. 'kNKc.
.:kXXk:. .. .................. .............,:c:'...;:'. .dNNx.
:0NKd, .....''',,,,''.. ',...........',,,'',,::,...,,. .dNNx.
.xXd. .:;'.. ..,' .;,. ...,,'';;'. ... .oNNo
.0K. .;. ;' '; .'...'. .oXX:
.oNO. . ,. . ..',::ccc:;,.. .. lXX:
.dNX: ...... ;. 'cxOKK0OXWWWWWWWNX0kc. :KXd.
.l0N0; ;d0KKKKKXK0ko:... .l0X0xc,...lXWWWWWWWWKO0Kx' ,ONKo.
.lKNKl...'......'. .dXWN0kkk0NWWWWWN0o. :KN0;. .,cokXWWNNNNWNKkxONK: .,:c:. .';;;;:lk0XXx;
:KN0l';ll:'. .,:lodxxkO00KXNWWWX000k. oXNx;:okKX0kdl:::;'',;coxkkd, ...'. ...'''.......',:lxKO:.
oNNk,;c,'',. ...;xNNOc,. ,d0X0xc,. .dOd, ..;dOKXK00000Ox:. ..''dKO,
'KW0,:,.,:..,oxkkkdl;'. 'KK' .. .dXX0o:'....,:oOXNN0d;.'. ..,lOKd. .. ;KXl.
;XNd,; ;. l00kxoooxKXKx:..ld: ;KK' .:dkO000000Okxl;. c0; :KK; . ;XXc
'XXdc. :. .. '' 'kNNNKKKk, .,dKNO. .... .'c0NO' :X0. ,. xN0.
.kNOc' ,. .00. ..''... .l0X0d;. 'dOkxo;... .;okKXK0KNXx;. .0X: ,. lNX'
,KKdl .c, .dNK, .;xXWKc. .;:coOXO,,'....... .,lx0XXOo;...oNWNXKk:.'KX; ' dNX.
:XXkc'.... .dNWXl .';l0NXNKl. ,lxkkkxo' .cK0. ..;lx0XNX0xc. ,0Nx'.','.kXo ., ,KNx.
cXXd,,;:, .oXWNNKo' .'.. .'.'dKk; .cooollox;.xXXl ..,cdOKXXX00NXc. 'oKWK' ;k: .l. ,0Nk.
cXNx. . ,KWX0NNNXOl'. .o0Ooldk; .:c;.':lxOKKK0xo:,.. ;XX: .,lOXWWXd. . .':,.lKXd.
lXNo cXWWWXooNWNXKko;'.. .lk0x; ...,:ldk0KXNNOo:,.. ,OWNOxO0KXXNWNO, ....'l0Xk,
.dNK. oNWWNo.cXK;;oOXNNXK0kxdolllllooooddxk00KKKK0kdoc:c0No .'ckXWWWNXkc,;kNKl. .,kXXk,
'KXc .dNWWX;.xNk. .kNO::lodxkOXWN0OkxdlcxNKl,.. oN0'..,:ox0XNWWNNWXo. ,ONO' .o0Xk;
.ONo oNWWN0xXWK, .oNKc .ONx. ;X0. .:XNKKNNWWWWNKkl;kNk. .cKXo. .ON0;
.xNd cNWWWWWWWWKOkKNXxl:,'...;0Xo'.....'lXK;...',:lxk0KNWWWWNNKOd:.. lXKclON0: .xNk.
.dXd ;XWWWWWWWWWWWWWWWWWWNNNNNWWNNNNNNNNNWWNNNNNNWWWWWNXKNNk;.. .dNWWXd. cXO.
.xXo .ONWNWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWNNK0ko:'..OXo 'l0NXx, :KK,
.OXc :XNk0NWXKNWWWWWWWWWWWWWWWWWWWWWNNNX00NNx:'.. lXKc. 'lONN0l. .oXK:
.KX; .dNKoON0;lXNkcld0NXo::cd0NNO:;,,'.. .0Xc lXXo..'l0NNKd,. .c0Nk,
:XK. .xNX0NKc.cXXl ;KXl .dN0. .0No .xNXOKNXOo,. .l0Xk;.
.dXk. .lKWN0d::OWK; lXXc .OX: .ONx. . .,cdk0XNXOd;. .'''....;c:'..;xKXx,
.0No .:dOKNNNWNKOxkXWXo:,,;ONk;,,,,,;c0NXOxxkO0XXNXKOdc,. ..;::,...;lol;..:xKXOl.
,XX: ..';cldxkOO0KKKXXXXXXXXXXKKKKK00Okxdol:;'.. .';::,..':llc,..'lkKXkc.
:NX' . '' .................. .,;:;,',;ccc;'..'lkKX0d;.
lNK. .; ,lc,. ................ ..,,;;;;;;:::,....,lkKX0d:.
.oN0. .'. .;ccc;,'.... ....'',;;;;;;;;;;'.. .;oOXX0d:.
.dN0. .;;,.. .... ..''''''''.... .:dOKKko;.
lNK' ..,;::;;,'......................... .;d0X0kc'.
.xXO' .;oOK0x:.
.cKKo. .,:oxkkkxk0K0xc'.
.oKKkc,. .';cok0XNNNX0Oxoc,.
.;d0XX0kdlc:;,,,',,,;;:clodkO0KK0Okdl:,'..
.,coxO0KXXXXXXXKK0OOxdoc:,..
...
SOME_SEQUENCE_THAT_DOES_NOT_APPEAR_IN_THE_ASCII_ART

searching "-" in websolr

Websolr is returning the following error:
RSolr::Error::Http - 400 Bad Request
Error: Apache Tomcat/6.0.28 - Error report: HTTP Status 400 - org.apache.lucene.queryParser.ParseException: Cannot parse '----': Encountered " "-" "- "" at line 1, column 1.
Was expecting one of:
"(" ...
"*" ...
<QUOTED> ...
<TERM> ...
<PREFIXTERM> ...
<WILDTERM> ...
"[" ...
"{" ...
<NUMBER> ...
This happens whenever I try to search for the "-" character.
Other special characters such as ":" work fine. I have tried to use CGI.escape, but it does not escape these characters.
Have you tried escaping it with backslash?
Normally, when you index your documents, the tokenizer will remove dash characters on its own, so you may want to just strip the dash anyway, unless you mean for it to be a negative query.
The full Solr query syntax is here: http://wiki.apache.org/solr/SolrQuerySyntax
As Chris correctly notes, you need to escape the backslash.
Depending on which query parser you're using, there are some special characters that have meaning. As of this writing, the Lucene (and thus Solr) query parser assigns special meaning to these characters:
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
You should refer to the docs for Lucene query parser syntax for their full meaning. The default Solr query parser offers a superset of the Lucene query parser syntax, as described by the SolrQueryParser wiki page.
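As a rough illustration of backslash-escaping, here is a minimal Python sketch based on the character list above. `escape_lucene` is a hypothetical helper name, and the sketch makes one simplifying assumption: it escapes every & and | individually instead of only the && and || pairs.

```python
import re

# Characters from the list above; & and | are escaped one by one.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\])')


def escape_lucene(text: str) -> str:
    """Put a backslash in front of every query-parser special character."""
    return _SPECIAL.sub(r"\\\1", text)


print(escape_lucene("----"))
print(escape_lucene("title:foo-bar"))
```

Escaping should be applied only to the user-supplied terms, not to the operators of a query you build yourself, or the operators lose their meaning too.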
If you don't want to worry about escaping things, the DisMax Query Parser is designed to accept input that's closer to what a user might type into a search box. I haven't tested the various special characters against it recently, but as a rule it is probably more forgiving about the input it accepts.
