Ruby text file parsing methodology


I've got a text file where I have a lot of lines that look like the first example and only a couple that look like the second (NB the ** are just to show the fields I'm after, they don't look like this in the actual file);
22034 BUBBA C BC-022 **OWL SOFTIE** <N/A> <N/A> <N/A> 470 0.00 **6** 0.00 **1** **37.95**
22489 BUBBA C BC- **BUNNY BOO BOO** <N/A> <N/A> <N/A> 470 0.00 **2** 0.00 **1** **24.95**
My aim is to extract the fields surrounded by ** into a format (probably CSV) so I can add them as a sheet to an existing Excel spreadsheet.
My issue is I can't figure out how to extract just the data I need using gsub, split, tr, scan, match etc.
My initial thinking was as I parsed each line, I'd delete up to the 4th instance of a space (which I can't find code for), then delete/skip everything between the first < and the last >, then try and delete the next 2 fields, keep 1, delete the next and keep the remaining 2.
All of which seems a bloody hard way to get to the end result.
I don't want the exact code to solve this problem, I'm more after the methodology you would go through when you're looking at this type of problem and what tools you'd use. (strip, gsub etc)
Any help greatly appreciated.

Use #split/#join pair to proceed:
a='22034 BUBBA C BC-022 **OWL SOFTIE** <N/A> <N/A> <N/A> 470 0.00 **6** 0.00 **1** **37.95**'.split
[ a[4..-10].join( ' ' ), a[-4], a[-2], a[-1] ].join ' '
# => "**OWL SOFTIE** **6** **1** **37.95**"

Space-delimited file, huh? That's not the most... optimal... format.
Anyway, I would use regex to snag that **OWL SOFTIE** field
[7] pry(main)> m = s.match /BC-\d*\s(.*?)\s</
=> #<MatchData "BC-022 OWL SOFTIE <" 1:"OWL SOFTIE">
[8] pry(main)> m.captures[0]
=> "OWL SOFTIE"
and then split to grab everything else.
[11] pry(main)> arr = s.split[-4..-1]
=> ["6", "0.00", "1", "37.95"]
[12] pry(main)> arr.select.with_index {|x,i| i!=1 }
=> ["6", "1", "37.95"]
Altogether:
[13] pry(main)> [s.match(/BC-\d*\s(.*?)\s</).captures[0]] + s.split[-4..-1].select.with_index {|x,i| i!=1 }
=> ["OWL SOFTIE", "6", "1", "37.95"]
(if you have any control over that input file whatsoever, see if you can make it delimited by something other than spaces :))
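For reference, the pry session above can be folded into one small method. This is only a sketch: parse_line and SAMPLE are hypothetical names, and it assumes every line contains a BC- code, then the item name, then the first "<".

```ruby
# Hypothetical helper combining the regex capture with the split-based tail.
def parse_line(s)
  # Capture group 1: the text between the "BC-..." code and the first "<".
  name = s[/BC-\d*\s(.*?)\s</, 1]
  # The last four tokens are: quantity, an unwanted 0.00, a count, and the price.
  qty, _skipped, count, price = s.split.last(4)
  [name, qty, count, price]
end

SAMPLE = "22034 BUBBA C BC-022 OWL SOFTIE <N/A> <N/A> <N/A> 470 0.00 6 0.00 1 37.95"
p parse_line(SAMPLE)
```

The same method should also cope with the rarer "BC-" lines that have no digits after the dash, because \d* is allowed to match nothing.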

You only have one troublesome field with a variable number of words in it so start with split:
a = "22034 BUBBA C BC-022 OWL SOFTIE <N/A> <N/A> <N/A> 470 0.00 6 0.00 1 37.95".split
Then pick it apart:
[a[4..-10].join(' '), a[-4], a[-2], a[-1]]
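Since the stated goal is CSV, here is a minimal sketch that applies the same split-and-index idea to both sample lines and writes CSV text with Ruby's csv library. The names extract_fields and SAMPLE_LINES are hypothetical, and the sketch assumes every line has exactly four leading tokens before the name and nine trailing tokens after it.

```ruby
require "csv"

# Hypothetical helper: the name is everything between the 4 leading tokens
# and the 9 trailing tokens; the other fields are taken from the end.
def extract_fields(line)
  tokens = line.split
  name = tokens[4..-10].join(" ")
  [name, tokens[-4], tokens[-2], tokens[-1]]
end

SAMPLE_LINES = [
  "22034 BUBBA C BC-022 OWL SOFTIE <N/A> <N/A> <N/A> 470 0.00 6 0.00 1 37.95",
  "22489 BUBBA C BC- BUNNY BOO BOO <N/A> <N/A> <N/A> 470 0.00 2 0.00 1 24.95"
].freeze

# Build the CSV in memory; CSV.open could be used instead to write a file.
csv_text = CSV.generate do |csv|
  SAMPLE_LINES.each { |line| csv << extract_fields(line) }
end

puts csv_text
# OWL SOFTIE,6,1,37.95
# BUNNY BOO BOO,2,1,24.95
```

Counting from the end of the token array is what makes the variable-length name harmless: only the middle slice changes size.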

Related

Sorting data with gnuplot

Sometimes it might be required to sort data. Unfortunately, gnuplot (as far as I know) doesn't offer this possibility. Of course, you can use external tools like awk, Perl, Python, etc. However, for maximum platform independence and avoiding the installation of additional programs and related complications, and also for curiosity, I was interested whether gnuplot can sort somehow nevertheless.
I will be grateful for comments on improvements, limitations.
Does anybody have ideas how to sort alphanumerical data with gnuplot only?
### Sorting with gnuplot
reset session
# generate some random example data
N = 10
set samples N
RandomNo(n) = sprintf("%.02f",rand(0)*n)
set table $Data
plot '+' u (RandomNo(10)):(RandomNo(10)):(RandomNo(10)) w table
unset table
print $Data
# Settings for sorting
ColNo = 2 # ColumnNo for sorting
stats $Data nooutput # get the number of rows if data is from file
RowCount = STATS_records # with the example data above, of course RowCount=N
# create the sortkey and put it into an array
array SortKey[RowCount]
set table $Dummy
plot $Data u (SortKey[$0+1] = sprintf("%.06f%02d",column(ColNo),$0+1)) w table
unset table
# print $Dummy
# get lines as whole into array
set datafile separator "\n"
array DataSeq[RowCount]
set table $Dummy2
plot $Data u (SortKey[$0+1]):(DataSeq[$0+1] = stringcolumn(1)) with table
unset table
print $Dummy2
set datafile separator whitespace
# do the actual sorting with 'smooth unique'
set table $Dummy3
plot $Dummy2 u 1:0 smooth unique
unset table
# print $Dummy3
# extract the sorted sortkeys
set table $Dummy4
plot $Dummy3 u (SortKey[$0+1]=$2) with table
unset table
# print $Dummy4
# create the table with sorted lines
set table $DataSorted
plot $Data u (DataSeq[SortKey[$0+1]+1]) with table
unset table
print $DataSorted
### end of code
The output below consists of three datablocks in order: the unsorted data, the intermediate data with sort keys, and the data sorted by the second column.
Output:
5.24 6.68 3.09
1.64 1.27 9.82
6.44 9.23 7.03
8.14 8.87 3.82
4.27 5.98 0.93
7.96 3.64 6.15
6.21 6.28 6.17
1.52 3.17 3.58
4.24 2.16 8.99
8.73 6.54 1.13
6.68000001 5.24 6.68 3.09
1.27000002 1.64 1.27 9.82
9.23000003 6.44 9.23 7.03
8.87000004 8.14 8.87 3.82
5.98000005 4.27 5.98 0.93
3.64000006 7.96 3.64 6.15
6.28000007 6.21 6.28 6.17
3.17000008 1.52 3.17 3.58
2.16000009 4.24 2.16 8.99
6.54000010 8.73 6.54 1.13
1.64 1.27 9.82
4.24 2.16 8.99
1.52 3.17 3.58
7.96 3.64 6.15
4.27 5.98 0.93
6.21 6.28 6.17
8.73 6.54 1.13
5.24 6.68 3.09
8.14 8.87 3.82
6.44 9.23 7.03

For curiosity, I wanted to know whether an alphanumerical sort could be implemented with gnuplot code only.
This avoids the need for external tools and ensures maximum platform compatibility.
I haven't heard yet about an external tool which could assist gnuplot and which works under Windows and Linux and MacOS.
I am happy to take comments and suggestions about bugs, simplifications, improvements, performance comparisons, and limits.
For alphanumerical sort, the first stage is alphanumerical string comparison, which to my knowledge does not exist in gnuplot directly. So, the first part Compare.plt is about comparison of strings.
### compare function for strings
# Compare.plt
# function cmp(a,b,cs) returns a<b:-1, a==b:0, a>b:+1
# cs=0: case-insensitive, cs=1: case-sensitive
reset session
ASCII = ' !"' . "#$%&'()*+,-./0123456789:;<=>?@".\
"ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_\`".\
"abcdefghijklmnopqrstuvwxyz{|}~"
ord(c) = strstrt(ASCII,c)>0 ? strstrt(ASCII,c)+31 : 0
# comparing char: case-sensitive
cmpcharcs(c1,c2) = sgn(ord(c1)-ord(c2))
# comparing char: case-insensitive
cmpcharci(c1,c2) = sgn(( cmpcharci_o1=ord(c1), ((cmpcharci_o1>96) && (cmpcharci_o1<123)) ?\
cmpcharci_o1-32 : cmpcharci_o1) - \
( cmpcharci_o2=ord(c2), ((cmpcharci_o2>96) && (cmpcharci_o2<123)) ?\
cmpcharci_o2-32 : cmpcharci_o2) )
# function cmp returns a<b:-1, a==b:0, a>b:+1
# cs=0: case-insensitive, cs=1: case-sensitive
cmp(a,b,cs) = ((cmp_r=0, cmp_flag=0, cmp_maxlen=strlen(a)>strlen(b) ? strlen(a) : strlen(b)),\
(sum[cmp_i=1:cmp_maxlen] \
((cmp_flag==0 && (cmp_c1 = substr(a,cmp_i,cmp_i), cmp_c2 = substr(b,cmp_i,cmp_i), \
(cmp_r = (cs==0 ? cmpcharci(cmp_c1,cmp_c2) : cmpcharcs(cmp_c1,cmp_c2) ) )!=0 ? \
(cmp_flag=1, cmp_r) : 0)), 1 )), cmp_r)
cmpsymb(a,b,cs) = (cmpsymb_r = cmp(a,b,cs))<0 ? "<" : cmpsymb_r>0 ? ">" : "="
### end of code
Example:
### example compare strings
load "Compare.plt"
a="Alligator"
b="Tiger"
print sprintf("% 2d: % 9s% 2s% 6s", cmp(a,b,0), a, cmpsymb(a,b,0), b)
a="Tiger"
print sprintf("% 2d: % 9s% 2s% 6s", cmp(a,b,0), a, cmpsymb(a,b,0), b)
a="Zebra"
print sprintf("% 2d: % 9s% 2s% 6s", cmp(a,b,0), a, cmpsymb(a,b,0), b)
### end of code
Result:
-1: Alligator < Tiger
0: Tiger = Tiger
1: Zebra > Tiger
The second part makes use of the comparison for sorting.
### alpha-numerical sort with gnuplot
reset session
load "Compare.plt"
$Data <<EOD
1 0.123 Orange
2 0.456 Apple
3 0.789 Peach
4 0.987 Pineapple
5 0.654 Banana
6 0.321 Raspberry
7 0.111 Lemon
EOD
stats $Data u 0 nooutput
RowCount = STATS_records
ColSort = 3
array Key[RowCount]
array Index[RowCount]
set table $Dummy
plot $Data u (Key[$0+1]=stringcolumn(ColSort),Index[$0+1]=$0+1) w table
unset table
# Bubblesort
do for [n=RowCount:2:-1] {
do for [i=1:n-1] {
if ( cmp(Key[i],Key[i+1],0) > 0) {
tmp=Key[i]; Key[i]=Key[i+1]; Key[i+1]=tmp
tmp2=Index[i]; Index[i]=Index[i+1]; Index[i+1]=tmp2
}
}
}
set datafile separator "\n"
set table $Dummy # and reuse Key-array
plot $Data u (Key[$0+1]=stringcolumn(1)) with table
unset table
set datafile separator whitespace
set table $DataSorted
plot $Data u (Key[Index[$0+1]]) with table
unset table
print $DataSorted
set grid xtics,ytics
plot [-0.5:RowCount-0.5][0:1.1] $DataSorted u 0:2:xtic(3) w lp lt 7 lc rgb "red"
### end of code
Input:
1 0.123 Orange
2 0.456 Apple
3 0.789 Peach
4 0.987 Pineapple
5 0.654 Banana
6 0.321 Raspberry
7 0.111 Lemon
Output:
2 0.456 Apple
5 0.654 Banana
7 0.111 Lemon
1 0.123 Orange
3 0.789 Peach
4 0.987 Pineapple
6 0.321 Raspberry
and the output graph (image not included).

How can I extract parts of one column and append them to other columns?

I have a large .csv file that I need to extract information from and add this information to another column. My csv looks something like this:
file_name,#,Date,Time,Temp (°C) ,Intensity
trap12u_10733862_150809.txt,1,05/28/15,06:00:00.0,20.424,215.3,,
trap12u_10733862_150809.txt,2,05/28/15,07:00:00.0,21.091,1,130.2,,
trap12u_10733862_150809.txt,3,05/28/15,08:00:00.0,26.195,3,100.0,,
trap11u_10733862_150809.txt,4,05/28/15,09:00:00.0,25.222,3,444.5,,
trap11u_10733862_150809.txt,5,05/28/15,10:00:00.0,26.195,3,100.0,,
trap11u_10733862_150809.txt,6,05/28/15,11:00:00.0,25.902,2,927.8,,
trap11u_10733862_150809.txt,7,05/28/15,12:00:00.0,25.708,2,325.0,,
trap12c_10733862_150809.txt,8,05/28/15,13:00:00.0,26.292,3,100.0,,
trap12c_10733862_150809.txt,9,05/28/15,14:00:00.0,26.390,2,066.7,,
trap12c_10733862_150809.txt,10,05/28/15,15:00:00.0,26.097,1,463.9,,
I want to create two new columns that contain data from the "file_name" column. I want to extract the one or two digits after the text "trap", and the c or u that follows them, and put each into its own new column. The data should look something like this after processing:
file_name,#,Date,Time,Temp (°C) ,Intensity,can_und,trap_no
trap12u_10733862_150809.txt,1,05/28/15,06:00:00.0,20.424,215.3,,u,12
trap12u_10733862_150809.txt,2,05/28/15,07:00:00.0,21.091,1,130.2,,u,12
trap12u_10733862_150809.txt,3,05/28/15,08:00:00.0,26.195,3,100.0,,u,12
trap11u_10733862_150809.txt,4,05/28/15,09:00:00.0,25.222,3,444.5,,u,11
trap12c_10733862_150809.txt,8,05/28/15,13:00:00.0,26.292,3,100.0,,c,12
trap12c_10733862_150809.txt,9,05/28/15,14:00:00.0,26.390,2,066.7,,c,12
trap12c_10733862_150809.txt,10,05/28/15,15:00:00.0,26.097,1,463.9,,c,12
I suspect the way to do this is with awk and a regular expression, but I'm not sure how to implement the regular expression. How can I extract parts of one column and append them to other columns?

Using sed you can do this:
sed -E '1s/.*/&,can_und,trap_no/; 2,$s/trap([0-9]+)([a-z]).*/&\2,\1/' file.csv
file_name,#,Date,Time,Temp (°C) ,Intensity,can_und,trap_no
trap12u_10733862_150809.txt,1,05/28/15,06:00:00.0,20.424,215.3,,u,12
trap12u_10733862_150809.txt,2,05/28/15,07:00:00.0,21.091,1,130.2,,u,12
trap12u_10733862_150809.txt,3,05/28/15,08:00:00.0,26.195,3,100.0,,u,12
trap11u_10733862_150809.txt,4,05/28/15,09:00:00.0,25.222,3,444.5,,u,11
trap11u_10733862_150809.txt,5,05/28/15,10:00:00.0,26.195,3,100.0,,u,11
trap11u_10733862_150809.txt,6,05/28/15,11:00:00.0,25.902,2,927.8,,u,11
trap11u_10733862_150809.txt,7,05/28/15,12:00:00.0,25.708,2,325.0,,u,11
trap12c_10733862_150809.txt,8,05/28/15,13:00:00.0,26.292,3,100.0,,c,12
trap12c_10733862_150809.txt,9,05/28/15,14:00:00.0,26.390,2,066.7,,c,12
trap12c_10733862_150809.txt,10,05/28/15,15:00:00.0,26.097,1,463.9,,c,12

gawk approach:
awk -F, 'NR==1{ print $0,"can_und,trap_no" }
NR>1{ match($1,/^trap([0-9]+)([a-z])/,a); print $0 a[2],a[1] }' OFS="," file
The output:
file_name,#,Date,Time,Temp (°C) ,Intensity,can_und,trap_no
trap12u_10733862_150809.txt,1,05/28/15,06:00:00.0,20.424,215.3,,u,12
trap12u_10733862_150809.txt,2,05/28/15,07:00:00.0,21.091,1,130.2,,u,12
trap12u_10733862_150809.txt,3,05/28/15,08:00:00.0,26.195,3,100.0,,u,12
trap11u_10733862_150809.txt,4,05/28/15,09:00:00.0,25.222,3,444.5,,u,11
trap11u_10733862_150809.txt,5,05/28/15,10:00:00.0,26.195,3,100.0,,u,11
trap11u_10733862_150809.txt,6,05/28/15,11:00:00.0,25.902,2,927.8,,u,11
trap11u_10733862_150809.txt,7,05/28/15,12:00:00.0,25.708,2,325.0,,u,11
trap12c_10733862_150809.txt,8,05/28/15,13:00:00.0,26.292,3,100.0,,c,12
trap12c_10733862_150809.txt,9,05/28/15,14:00:00.0,26.390,2,066.7,,c,12
trap12c_10733862_150809.txt,10,05/28/15,15:00:00.0,26.097,1,463.9,,c,12
NR==1{ print $0,"can_und,trap_no" } - print the header line
match($1,/^trap([0-9]+)([a-z])/,a) - matches the number following trap word and the next following suffix letter
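The same capture-group idea can be sketched in Ruby for comparison. This is not part of the answer above: append_trap_columns and TRAP_ROW are hypothetical names, and like the awk command the sketch relies on each data row already ending in ",," so that the two new values can simply be appended.

```ruby
# Hypothetical helper: append the c/u letter and the trap number to one data row.
def append_trap_columns(row)
  # Group 1 is the digits after "trap", group 2 is the letter that follows them.
  m = row.match(/\Atrap(\d+)([a-z])/)
  return row unless m # leave rows that do not start with "trap<digits><letter>" untouched
  "#{row}#{m[2]},#{m[1]}"
end

TRAP_ROW = "trap12u_10733862_150809.txt,1,05/28/15,06:00:00.0,20.424,215.3,,"
puts append_trap_columns(TRAP_ROW)
# trap12u_10733862_150809.txt,1,05/28/15,06:00:00.0,20.424,215.3,,u,12
```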

With use of sed, this will be like:
sed 's/trap\([[:digit:]]\+\)\(.\)\(.*\)$/trap\1\2\3\2,\1/' file
Use sed -i ... to replace it in file.

Using python pandas reader because python is awesome for numerical analysis:
First: I had to modify the data header row so that the columns were consistent by appending 3 commas:
file_name,#,Date,Time,Temp (°C) ,Intensity,,,
There is probably a way to tell pandas to ignore the column differences - but I am yet a noob.
Python code to read your data into columns and create 2 new columns named 'cu_int' and 'cu_char' which contain the parsed elements of the filenames:
import pandas

def main():
    df = pandas.read_csv("file.csv")
    df['cu_int'] = 0  # Add the new columns to the data frame.
    df['cu_char'] = ' '
    for index, df_row in df.iterrows():
        file_name = df_row['file_name'].strip()
        trap_string = file_name.split("_")[0]  # Get the file_name string prior to the underscore
        numeric_offset_beg = len("trap")  # Parse the number following the 'trap' string.
        numeric_offset_end = len(trap_string) - 1  # Leave off the 'c' or 'u' char.
        numeric_value = trap_string[numeric_offset_beg:numeric_offset_end]
        cu_value = trap_string[-1]
        # Assign to the current row only; assigning to df['cu_int'] would overwrite the whole column.
        df.at[index, 'cu_int'] = int(numeric_value)
        df.at[index, 'cu_char'] = cu_value
    # The pandas dataframe is ready for number crunching.
    # For now just print it out:
    print(df)

if __name__ == "__main__":
    main()
The printed output (note there are inconsistencies in the data set posted - see row 1 as an example):
$ python read_csv.py
file_name # Date Time Temp (°C) Intensity Unnamed: 6 Unnamed: 7 Unnamed: 8 cu_int cu_char
0 trap12u_10733862_150809.txt 1 05/28/15 06:00:00.0 20.424 215.3 NaN NaN NaN 12 u
1 trap12u_10733862_150809.txt 2 05/28/15 07:00:00.0 21.091 1.0 130.2 NaN NaN 12 u
2 trap12u_10733862_150809.txt 3 05/28/15 08:00:00.0 26.195 3.0 100.0 NaN NaN 12 u
3 trap11u_10733862_150809.txt 4 05/28/15 09:00:00.0 25.222 3.0 444.5 NaN NaN 11 u
4 trap11u_10733862_150809.txt 5 05/28/15 10:00:00.0 26.195 3.0 100.0 NaN NaN 11 u
5 trap11u_10733862_150809.txt 6 05/28/15 11:00:00.0 25.902 2.0 927.8 NaN NaN 11 u
6 trap11u_10733862_150809.txt 7 05/28/15 12:00:00.0 25.708 2.0 325.0 NaN NaN 11 u
7 trap12c_10733862_150809.txt 8 05/28/15 13:00:00.0 26.292 3.0 100.0 NaN NaN 12 c
8 trap12c_10733862_150809.txt 9 05/28/15 14:00:00.0 26.390 2.0 66.7 NaN NaN 12 c
9 trap12c_10733862_150809.txt 10 05/28/15 15:00:00.0 26.097 1.0 463.9 NaN NaN 12 c

High & Low Numbers From A String (Ruby)

Good evening,
I'm trying to solve a problem on Codewars:
In this little assignment you are given a string of space separated numbers, and have to return the highest and lowest number.
Example:
high_and_low("1 2 3 4 5") # return "5 1"
high_and_low("1 2 -3 4 5") # return "5 -3"
high_and_low("1 9 3 4 -5") # return "9 -5"
Notes:
All numbers are valid Int32, no need to validate them.
There will always be at least one number in the input string.
Output string must be two numbers separated by a single space, and highest number is first.
I came up with the following solution however I cannot figure out why the method is only returning "542" and not "-214 542". I also tried using #at, #shift and #pop, with the same result.
Is there something I am missing? I hope someone can point me in the right direction. I would like to understand why this is happening.
def high_and_low(numbers)
numberArray = numbers.split(/\s/).map(&:to_i).sort
numberArray[-1]
numberArray[0]
end
high_and_low("4 5 29 54 4 0 -214 542 -64 1 -3 6 -6")
EDIT
I also tried this and receive a failed test "Nil":
def high_and_low(numbers)
numberArray = numbers.split(/\s/).map(&:to_i).sort
puts "#{numberArray[-1]}" + " " + "#{numberArray[0]}"
end

When the return statement is omitted, a method returns only the value of the last expression in its body. To return both values as an Array, write:
def high_and_low(numbers)
  numberArray = numbers.split(/\s/).map(&:to_i).sort
  return numberArray[0], numberArray[-1]
end
p high_and_low("4 5 29 54 4 0 -214 542 -64 1 -3 6 -6")
# => [-214, 542]

Using sort would be inefficient for big arrays. Instead, use Enumerable#minmax:
numbers.split.map(&:to_i).minmax
# => [-214, 542]
Or use Enumerable#minmax_by if you would like the results to remain strings:
numbers.split.minmax_by(&:to_i)
# => ["-214", "542"]
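The kata asks for a single string with the highest number first, so the array still has to be reordered and joined. A minimal sketch of one way to do that with minmax, assuming the input is always a non-empty string of whitespace-separated integers as the kata promises:

```ruby
def high_and_low(numbers)
  # minmax returns [smallest, largest] in a single pass over the array.
  low, high = numbers.split.map(&:to_i).minmax
  # String interpolation builds the "high low" result the kata expects.
  "#{high} #{low}"
end

puts high_and_low("4 5 29 54 4 0 -214 542 -64 1 -3 6 -6")
# 542 -214
```

Because the string is the last expression in the method, it is also the return value, which avoids the nil that puts returned in the edited attempt.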

Checking if pixel belongs to an image

I have written the following function to check whether a pixel belongs to an image in MATLAB.
As a first step, I wanted to test the idea on a vector of numbers, as follows:
function traverse_pixels(img)
for i:1:length(img)
c(i) = img(i)
end
But, when I run the following commands for example, I get the error shown at the end:
>> A = [ 34 565 456 535 34 54 5 5 4532 434 2345 234 32332434];
>> traverse_pixels(A);
??? Error: File: traverse_pixels.m Line: 2 Column: 6
Unexpected MATLAB operator.
Why is that? How can I fix the problem?
Thanks.

There is a syntax error in the head of your for loop, it's supposed to be:
for i = 1:length(img)
Also, to check if an array contains a specific value you could use:
A = [1 2 3]
if sum(A==2)>0
disp('there is at least one 2 in A')
end
This should be faster since no for loop is included.

A small syntax error: the loop header needs =, not :.
for i = 1:length(img)

MATLAB: how to display UTF-8-encoded text read from file?

The gist of my question is this:
How can I display Unicode characters in Matlab's GUI (OS X) so that they are properly rendered?
Details:
I have a table of strings stored in a file, and some of these strings contain UTF-8-encoded Unicode characters. I have tried many different ways (too many to list here) to display the contents of this file in the MATLAB GUI, without success. For example:
>> fid = fopen('/Users/kj/mytable.txt', 'r', 'n', 'UTF-8');
>> [x, x, x, enc] = fopen(fid); enc
enc =
UTF-8
>> tbl = textscan(fid, '%s', 35, 'delimiter', ',');
>> tbl{1}{1}
ans =
ÎÎÎÎÎÎ Î£Î¦Î©Î±Î²Î³Î´ÎµÎ¶Î·Î¸Î¹ÎºÎ»Î¼Î½Î¾ÏÏÏÏÏÏÏÏÏÏ
>>
As it happens, if I paste the string directly into the MATLAB GUI, the pasted string is displayed properly, which shows that the GUI is not fundamentally incapable of displaying these characters. Once MATLAB reads the string in from the file, however, it no longer displays it correctly. For example:
>> pasted = 'ΓΔΘΛΞΠΣΦΩαβγδεζηθικλμνξπρςστυφχψω'
pasted =
>>
Thanks!

I present below my findings after doing some digging... Consider these test files:
a.txt
ΓΔΘΛΞΠΣΦΩαβγδεζηθικλμνξπρςστυφχψω
b.txt
தமிழ்
First, we read files:
%# open file in binary mode, and read a list of bytes
fid = fopen('a.txt', 'rb');
b = fread(fid, '*uint8')'; %'# read bytes
fclose(fid);
%# decode as unicode string
str = native2unicode(b,'UTF-8');
If you try to print the string, you get a bunch of nonsense:
>> str
str =
Nonetheless, str does hold the correct string. We can check the Unicode code point of each character; as you can see, they are outside the ASCII range (the last two are the non-printable CR-LF line endings):
>> double(str)
ans =
Columns 1 through 13
915 916 920 923 926 928 931 934 937 945 946 947 948
Columns 14 through 26
949 950 951 952 953 954 955 956 957 958 960 961 962
Columns 27 through 35
963 964 965 966 967 968 969 13 10
Unfortunately, MATLAB seems unable to display this Unicode string in a GUI on its own. For example, all these fail:
figure
text(0.1, 0.5, str, 'FontName','Arial Unicode MS')
title(str)
xlabel(str)
One trick I found is to use the embedded Java capability:
%# Java Swing
label = javax.swing.JLabel();
label.setFont( java.awt.Font('Arial Unicode MS',java.awt.Font.PLAIN, 30) );
label.setText(str);
f = javax.swing.JFrame('frame');
f.getContentPane().add(label);
f.pack();
f.setVisible(true);
As I was preparing to write the above, I found an alternative solution. We can use the DefaultCharacterSet undocumented feature and set the charset to UTF-8 (on my machine, it is ISO-8859-1 by default):
feature('DefaultCharacterSet','UTF-8');
Now with a proper font (you can change the font used in the Command Window from Preferences > Font), we can print the string in the prompt (note that DISP is still incapable of printing Unicode):
>> str
str =
ΓΔΘΛΞΠΣΦΩαβγδεζηθικλμνξπρςστυφχψω
>> disp(str)
Î“Î”Î˜Î›ÎžÎ Î£Î¦Î©Î±Î²Î³Î´ÎµÎ¶Î·Î¸Î¹ÎºÎ»Î¼Î½Î¾Ï€ÏÏ‚ÏƒÏ„Ï…Ï†Ï‡ÏˆÏ‰
And to display it in a GUI, UICONTROL should work (under the hood, I think it is really a Java Swing component):
uicontrol('Style','text', 'String',str, ...
'Units','normalized', 'Position',[0 0 1 1], ...
'FontName','Arial Unicode MS', 'FontSize',30)
Unfortunately, TEXT, TITLE, XLABEL, etc. are still showing garbage.
As a side note: It is difficult to work with m-file sources containing Unicode characters in the MATLAB editor. I was using Notepad++, with files encoded as UTF-8 without BOM.
