GeneXus has the ExcelDocument data type, which lets you read data from an Excel file in a tabular way by specifying rows and columns. Is there a way to do the same with a CSV file? I can open it and read it like a normal text file, but a structured approach would be more effective.
As ealmeida explained you can use the Delimited ASCII files functions.
Below you can see an example on how to code both read and write operations.
ASCII File sample
1,"Jane Doe",1955-05-21
2,"John Smith",1991-10-15
3,"William Shakespeare",2005-11-30
Setup variables with parameters
&FullFileName = !'Datos.txt'
&RecordLength = 50
&FieldsDelimiter = !','
&StringDelimiter = !'"'
&DateFormat = !'ymd'
&DateSeparator = !'-'
To read Delimited ASCII
&ErrorNbr = DFROpen(&FullFileName, &RecordLength, &FieldsDelimiter, &StringDelimiter)
do while DFRNext() = 0
    &ErrorNbr = DFRGNum(&PersonNumber)
    &ErrorNbr = DFRGTxt(&PersonName)
    &ErrorNbr = DFRGDate(&PersonDOB, &DateFormat, &DateSeparator)
enddo
&ErrorNbr = DFRClose()
To write Delimited ASCII
&ErrorNbr = DFWOpen(&FullFileName, &FieldsDelimiter, &StringDelimiter)
for each Person
    &ErrorNbr = DFWPNum(PersonNumber, 0)
    &ErrorNbr = DFWPTxt(PersonName)
    &ErrorNbr = DFWPDate(PersonDOB, &DateFormat, &DateSeparator)
    &ErrorNbr = DFWNext()
endfor
&ErrorNbr = DFWClose()
Yes, it's possible, using the Delimited ASCII files functions.
I am scraping time values from a website. Each value is extracted as text (for example 9:15) and inserted into a cell. At the bottom of the column I would like to total the hours, so I need Python to convert each value to a numeric format that can be summed.
Any ideas?
from openpyxl import Workbook, load_workbook

def excel():
    # Write to an Excel file; userfinder, month, year, datumcleaned and
    # tagesinfo are defined elsewhere in the script
    filename = f"Monatsplan {userfinder} {month} {year}.xlsx"
    try:
        wb = load_workbook(filename)
        ws = wb.worksheets[0]  # select first worksheet
    except FileNotFoundError:
        # First run: create the workbook and write the header row
        headers_row = [
            "Datum",
            "Tour",
            "Funktion",
            "Von",
            "Bis",
            "Schichtdauer",
            "Bezahlte Zeit",
        ]
        wb = Workbook()
        ws = wb.active
        ws.append(headers_row)
        wb.save(filename)
    ws.append(
        [
            datumcleaned[:10],
            tagesinfo,
            "",
            "",
            "",
            "",
            "",
        ]
    )
    wb.save(filename)
    wb.close()

excel()
You should split the data you scraped.
time_scrapped = '9:15'
time_split = time_scrapped.split(":")
hours = int(time_split[0])
minutes = int(time_split[1])
Then you can place the hours and minutes in separate columns and create a formula at the bottom of each column.
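If you would rather total the times in Python before writing them, a minimal sketch along the following lines might help. It assumes every scraped value has the form H:MM; the helper names parse_hhmm and format_hhmm and the sample list are made up for illustration.

```python
from datetime import timedelta

def parse_hhmm(text):
    """Turn a string such as '9:15' into a timedelta."""
    hours, minutes = text.strip().split(":")
    return timedelta(hours=int(hours), minutes=int(minutes))

def format_hhmm(delta):
    """Format a timedelta as H:MM, letting the hours exceed 24."""
    total_minutes = int(delta.total_seconds()) // 60
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"

durations = ["9:15", "8:30", "7:45"]  # hypothetical scraped values
total = sum((parse_hhmm(d) for d in durations), timedelta())
print(format_hhmm(total))
```

total.total_seconds() / 3600 gives the same total as decimal hours, which may be easier to sum in a spreadsheet column.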
I have no knowledge of VBScript and need help.
Logically - I thought of splitting it with # in a for loop and then using : to split again.
Example:
Text file:
a : 21312 # asdfasd23sad : 43624 # asdsad*:21
Excel file:
Function arr()
    ' The sample text has to be a quoted string literal
    input = "a : 21312 # asdfasd23sad : 43624 # asdsad*:21"
    arr1 = Split(input, "#")
    For i = LBound(arr1) To UBound(arr1)
        arr2 = Split(arr1(i), ":")
        For j = LBound(arr2) To UBound(arr2)
            MsgBox Trim(arr2(j))
        Next
    Next
End Function
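For comparison, the same two-level split can be sketched in Python. This is only an illustration of the logic described above (split on #, then on :); the function name split_pairs is hypothetical, and it assumes each chunk contains a single colon.

```python
def split_pairs(text):
    """Split 'key : value # key : value' text into (key, value) tuples."""
    pairs = []
    for chunk in text.split("#"):
        # partition splits on the first ':' only
        key, _, value = chunk.partition(":")
        pairs.append((key.strip(), value.strip()))
    return pairs

line = "a : 21312 # asdfasd23sad : 43624 # asdsad*:21"
for key, value in split_pairs(line):
    print(key, value)
```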
I've got a text file of output that looks essentially like this:
SMITHERSON, SMITH|00012345|15-Jan-1999|000885340
619649339|29-Sep-2015 00:09:30|Black|JOHNERSON, JOHN
00067890|02-Dec-1996|000490365|620094551
29-Sep-2015 23:06:01|Green|DAVISON, DAVE|00086543|06-Jun-2001|000938585
226438332|28-Sep-2015 00:12:12|Yellow
There are seven pieces of data per record. They are always in the correct order, but unfortunately they run together and wrap onto different lines. Each line ends with a carriage return + line feed, and there is no pipe delimiter at the end of a line. An individual piece of data is never split over multiple lines. I'm having a hard time explaining, so here's another example:
DATA 1|DATA 2|DATA 3
DATA 4
DATA 5|DATA 6|DATA 7
DATA 1|DATA 2|DATA 3|DATA 4
DATA 5|DATA 6|DATA 7
etc...
The values may contain spaces, but each piece of data always stays on a single line.
And I'm trying to turn it into this:
SMITHERSON, SMITH|00012345|15-Jan-1999|000885340|619649339|29-Sep-2015 00:09:30|Black
JOHNERSON, JOHN|00067890|02-Dec-1996|000490365|620094551|29-Sep-2015 23:06:01|Green
DAVISON, DAVE|00086543|06-Jun-2001|000938585|226438332|28-Sep-2015 00:12:12|Yellow
DATA 1|DATA 2|DATA 3|DATA 4|DATA 5|DATA 6|DATA 7
DATA 1|DATA 2|DATA 3|DATA 4|DATA 5|DATA 6|DATA 7
etc.
Each record of seven pieces of data sits on its own line, still separated by '|' so that another piece of software can read it correctly.
I am spending about one hour every day correcting the text files by hand, so I've been trying to find an example I can work from to do this for a while but have not had any luck wrapping my head around this.
This code works on your sample text; I have not tested it on big files.
It replaces the line breaks with the delimiter, then splits the entire file into one big array and writes it back out seven fields at a time:
Set fso = CreateObject("Scripting.FileSystemObject")
Set input = fso.OpenTextFile("input.txt", 1)          ' 1 = ForReading
Set output = fso.OpenTextFile("output.txt", 2, True)  ' 2 = ForWriting, create if missing
Dim data: data = input.ReadAll
input.Close
data = Replace(data, vbCrLf, "|")
data = Split(data, "|")
' Stop at the last complete group of seven so a trailing line break cannot overrun the array
For i = 0 To UBound(data) - 6 Step 7
    output.WriteLine data(i) & "|" & data(i+1) & "|" & data(i+2) & "|" & data(i+3) & "|" & data(i+4) & "|" & data(i+5) & "|" & data(i+6)
Next
output.Close
Untested, but something like this might do it. (Essentially it copies input to output as a stream, but newlines in the input are converted to pipe characters and every seventh pipe in the output is converted to a newline)
Set fs = CreateObject("Scripting.FileSystemObject")
Set f = fs.OpenTextFile("D:\data\thefile.txt", 1)
Set o = fs.OpenTextFile("D:\data\combined.txt", 2, True)
pipecount = 0
Do While f.AtEndOfFile <> True
    If f.AtEndOfLine = True Then
        c = f.Read(2) ' Skip the CR+LF
        c = "|" ' and pretend we got a pipe character
    Else
        c = f.Read(1)
    End If
    If c = "|" Then
        pipecount = pipecount + 1
        If pipecount = 7 Then
            ' Every seventh delimiter ends a record
            pipecount = 0
            o.WriteLine
        Else
            o.Write "|"
        End If
    Else
        o.Write c
    End If
Loop
f.Close
o.Close
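The same regrouping idea can also be sketched in Python, in case that is easier to experiment with. This is an illustration under the assumptions stated in the question (seven fields per record, no field split across lines); the function name regroup is made up here.

```python
def regroup(text, fields_per_record=7):
    """Rejoin pipe-delimited fields that were wrapped across lines."""
    fields = []
    for line in text.splitlines():
        if line.strip():  # ignore blank lines
            fields.extend(part.strip() for part in line.split("|"))
    records = []
    for start in range(0, len(fields), fields_per_record):
        records.append("|".join(fields[start:start + fields_per_record]))
    return records

sample = """SMITHERSON, SMITH|00012345|15-Jan-1999|000885340
619649339|29-Sep-2015 00:09:30|Black|JOHNERSON, JOHN
00067890|02-Dec-1996|000490365|620094551
29-Sep-2015 23:06:01|Green|DAVISON, DAVE|00086543|06-Jun-2001|000938585
226438332|28-Sep-2015 00:12:12|Yellow
"""
for record in regroup(sample):
    print(record)
```

If the input ends in the middle of a record, this sketch still emits the short final record, so it may be worth checking that the field count is a multiple of seven.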
I want to load a CSV (plain comma-separated) file into my HBase table. Following some articles I found, I can now load each entire row (line) into HBase as a single value, i.e. all values of a row are stored in a single column. What I want instead is to split each row on the comma delimiter (,) and store the values in different columns of the HBase table's column family.
Please help me solve this issue. Any suggestions are appreciated.
Below are my current input file, agent configuration file, and HBase output.
1)input file
8600000US00601,00601,006015-DigitZCTA,0063-DigitZCTA,11102
8600000US00602,00602,006025-DigitZCTA,0063-DigitZCTA,12869
8600000US00603,00603,006035-DigitZCTA,0063-DigitZCTA,12423
8600000US00604,00604,006045-DigitZCTA,0063-DigitZCTA,33548
8600000US00606,00606,006065-DigitZCTA,0063-DigitZCTA,10603
2)agent configuration file
agent.sources = spool
agent.channels = fileChannel2
agent.sinks = sink2
agent.sources.spool.type = spooldir
agent.sources.spool.spoolDir = /home/cloudera/Desktop/flume
agent.sources.spool.fileSuffix = .completed
agent.sources.spool.channels = fileChannel2
#agent.sources.spool.deletePolicy = immediate
agent.sinks.sink2.type = org.apache.flume.sink.hbase.HBaseSink
agent.sinks.sink2.channel = fileChannel2
agent.sinks.sink2.table = sample
agent.sinks.sink2.columnFamily = s1
agent.sinks.sink2.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
agent.sinks.sink1.serializer.regex = "\"([^\"]+)\""
agent.sinks.sink2.serializer.regexIgnoreCase = true
agent.sinks.sink1.serializer.colNames =col1,col2,col3,col4,col5
agent.sinks.sink2.batchSize = 100
agent.channels.fileChannel2.type=memory
3)HBase output
hbase(main):009:0> scan 'sample'
ROW COLUMN+CELL
1431064328720-0LalKGmSf3-1 column=s1:payload, timestamp=1431064335428, value=8600000US00602,00602,006025-DigitZCTA,0063-DigitZCTA,12869
1431064328720-0LalKGmSf3-2 column=s1:payload, timestamp=1431064335428, value=8600000US00603,00603,006035-DigitZCTA,0063-DigitZCTA,12423
1431064328720-0LalKGmSf3-3 column=s1:payload, timestamp=1431064335428, value=8600000US00604,00604,006045-DigitZCTA,0063-DigitZCTA,33548
1431064328721-0LalKGmSf3-4 column=s1:payload, timestamp=1431064335428, value=8600000US00606,00606,006065-DigitZCTA,0063-DigitZCTA,10603
4 row(s) in 0.0570 seconds
hbase(main):010:0>
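To illustrate the kind of split being asked for, here is a minimal Python sketch that uses one regular-expression capture group per field and pairs the groups with column names. The pattern and the helper split_row are assumptions for illustration only; this is not a tested Flume serializer setting.

```python
import re

COLUMN_NAMES = ["col1", "col2", "col3", "col4", "col5"]
# One capture group per comma-separated field (assumes no quoted commas)
ROW_PATTERN = re.compile(r"^([^,]*),([^,]*),([^,]*),([^,]*),([^,]*)$")

def split_row(line):
    """Return a column-name -> value mapping, or None if the line does not match."""
    match = ROW_PATTERN.match(line.strip())
    if match is None:
        return None
    return dict(zip(COLUMN_NAMES, match.groups()))

row = split_row("8600000US00601,00601,006015-DigitZCTA,0063-DigitZCTA,11102")
print(row)
```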
I have saved data as BLOBs in Oracle using the PowerBuilder OLE control.
Now we want to convert these BLOBs to files.
The files are in different formats (PDF, JPG, Excel, text, DOC).
There are more than 1 million files, so opening and saving each one manually through the OLE control is not practical.
Can we save the BLOBs to files automatically with a script in PowerBuilder?
Yes, it is possible:
Write a cursor in PowerBuilder embedded SQL to get the key and the file extension (if you have those) for each record in your blob table. The syntax for that kind of thing looks like this:
Long ll_Key
String ls_Ext
DECLARE GetBlobCursor CURSOR FOR
SELECT blob_key,
blob_extension
FROM blob_table ;
/* need to loop here while SQLCA.SQLCode is good */
FETCH GetBlobCursor
INTO :ll_Key,
:ls_Ext ;
Use a SELECTBLOB embedded SQL statement to get the blob data into a PowerBuilder BLOB variable:
Blob lblob_File
SELECTBLOB fileblob
INTO :lblob_File
FROM blob_table
WHERE blob_key = :ll_Key ;
Use FileOpen and FileWrite to write the blob with a valid file name and extension:
Long ll_Loops, ll_Step, ll_FileLen
Int li_File
String ls_Path
ls_Path = "<where do you want me?>." + String(ll_Key) + "." + ls_Ext
li_File = FileOpen(ls_Path, StreamMode!, Write!, LockWrite!, Append!)
If li_File > 0 Then
    // Determine how many times to call FileWrite
    ll_FileLen = Len(lblob_File)
    If ll_FileLen > 32765 Then
        If Mod(ll_FileLen, 32765) = 0 Then
            ll_Loops = ll_FileLen/32765
        Else
            ll_Loops = (ll_FileLen/32765) + 1
        End If
    Else
        ll_Loops = 1
    End If
    // Write the blob in chunks of at most 32765 bytes
    For ll_Step = 1 To ll_Loops
        FileWrite(li_File, BlobMid(lblob_File, ((ll_Step - 1)*32765) + 1, 32765))
    Next
    // Close only a file that was opened successfully
    FileClose(li_File)
Else
    // log the error, or handle
End If
Hope that gets you started.
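The chunking arithmetic in the last step can be sketched in Python to show what the loop does. This only illustrates writing a byte string in pieces of at most 32765 bytes; the names iter_chunks and write_blob and the temporary file name are made up for the example, and it is not PowerBuilder code.

```python
import os
import tempfile

CHUNK_SIZE = 32765  # same chunk size as the FileWrite loop above

def iter_chunks(data, size=CHUNK_SIZE):
    """Yield consecutive slices of data, each at most size bytes long."""
    for start in range(0, len(data), size):
        yield data[start:start + size]

def write_blob(path, data):
    """Write data to path chunk by chunk and return the number of chunks."""
    chunks = 0
    with open(path, "wb") as handle:
        for chunk in iter_chunks(data):
            handle.write(chunk)
            chunks += 1
    return chunks

blob = bytes(70000)  # hypothetical blob content
path = os.path.join(tempfile.gettempdir(), "blob_demo.bin")
print(write_blob(path, blob))
```

Slicing by range steps avoids the separate remainder check, because the final slice is simply shorter than the others.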