I've written a simple tail command using VBScript. It works fine except for very large files, where it has to read through the entire file to get the last 10 lines. Is there a way to seek to the end of the file and then read backwards for ten lines?
I'm afraid seeking backwards is impossible with a VBScript TextStream. However, instead of reading through the entire file, you can skip to a position e.g. 1 KB before EOF and then read the rest of the file, displaying only its last 10 lines.
EDIT: I'm adding some sample code to illustrate the idea:
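Set fso = CreateObject("Scripting.FileSystemObject")
Set file = fso.GetFile(filePath)
' 1 = ForReading, -2 = TristateUseDefault (system default encoding)
Set stream = file.OpenAsTextStream(1, -2)
pos1KBeforeEnd = file.Size - 1024
If pos1KBeforeEnd < 0 Then pos1KBeforeEnd = 0
' Skip counts characters, so this assumes one byte per character
stream.Skip pos1KBeforeEnd
' Read the remainder (ReadAll fails on an empty stream, hence the check),
' then split it into lines and display the last 10
If Not stream.AtEndOfStream Then text = stream.ReadAll
stream.Close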
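For comparison, here is the same idea as a minimal sketch in Ruby rather than VBScript (Ruby can seek to a byte offset directly). The helper name tail_lines and its chunk parameter are made up for illustration, and the code assumes newline-terminated lines. It reads only the last chunk of the file and widens the window if that chunk holds fewer than n lines.

```ruby
# Hypothetical helper: return the last n lines of a file by reading only its tail.
def tail_lines(path, n = 10, chunk = 1024)
  File.open(path, "rb") do |f|
    size = f.size
    loop do
      # Start `chunk` bytes before EOF, or at the beginning for small files.
      start = [size - chunk, 0].max
      f.seek(start)
      lines = f.read.lines
      # When we started mid-file, the first line is probably partial: drop it.
      lines.shift if start > 0
      # Done once we have enough lines or have already read the whole file.
      return lines.last(n).map(&:chomp) if lines.length >= n || start.zero?
      chunk *= 2 # not enough lines yet: look further back
    end
  end
end
```

Usage would be something like puts tail_lines("big.log"), where big.log is a placeholder file name.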
Related
I have a template file in xlsx format and I want to write a dynamic value into one particular cell, i.e. the value in that cell changes depending on the flow of the program, which in turn changes conditions in the xlsx file for a different process.
I have tried code like this:
awk -v value=$value -v row=$row -v col=$col 'BEGIN{FS=OFS="#"} NR==row {$col=value}1' file.csv
but the issue is that I can't use this code for the xlsx file format. Is there any way to do this for xlsx? Since it's a template file, I need to retain the xlsx format.
When I have to extract values from an Excel workbook on my Windows PC I install cygwin and then write a small shell script that does:
cygstart "/path/to/xls2csv.vbs" 'C:/cygwin64/path/to/bookName.xlsx'
awk 'whatever' '/path/to/bookName/sheetName.csv'
and the work of extracting every sheet from the workbook as a separate CSV (named after the sheet, with a ".csv" suffix, under a common directory named after the workbook) is done by this VBScript:
$ cat xls2csv.vbs
csv_format = 6
Dim strFilename
Dim objFSO
Set objFSO = CreateObject("scripting.filesystemobject")
strFilename = objFSO.GetAbsolutePathName(WScript.Arguments(0))
If objFSO.fileexists(strFilename) Then
    Call Writefile(strFilename)
Else
    wscript.echo "no such file!"
    wscript.echo strFilename
End If
Set objFSO = Nothing
Sub Writefile(ByVal strFilename)
    Dim objExcel
    Dim objWB
    Dim objws
    Set objExcel = CreateObject("Excel.Application")
    Set objWB = objExcel.Workbooks.Open(strFilename)
    For Each objws In objWB.Sheets
        objws.Copy
        objExcel.ActiveWorkbook.SaveAs objWB.Path & "\" & objws.Name & ".csv", csv_format
        objExcel.ActiveWorkbook.Close False
    Next
    objWB.Close False
    objExcel.Quit
    Set objExcel = Nothing
End Sub
That command would fail given blanks in file or directory names, so we need to replace those with, say, underscores. In reality I usually copy the Excel file to a temp directory and give it a temp name before running the above on it, so that the original file is not affected and I don't have to care about its path. The script requires an absolute path to the input Excel workbook.
You might need to throw a wait and/or sleep in before the awk command to ensure the VB script is done before awk runs. My shell code (not shown) is kinda convoluted: it tests for the VB script creating and then removing tmp files to ensure the script is done, and it loops, retrying and then killing Excel if it doesn't start or hangs, before calling awk. I wrote it a long time ago, it's a mess, and I doubt it's really necessary or a good approach, which is why I'm not including it here.
To get those values back INTO a multi-sheet workbook you'd have to open any updated/generated CSV with Excel (or copy/paste). There's probably some other VB script that could be written to import the CSVs for you, just like I export them above, but I've never needed that functionality so I don't know what that'd look like.
I don't know if you need that though - if your awk script writes CSV then you can just double click on the output .csv and Excel will happily open and display it just like it would any .xls or .xlsx Excel file.
So, to do what you want, assuming your original content is in "Sheet1" of single-sheet Excel workbook "MyStuff.xlsx" you'd do this from cygwin:
cygstart "/path/to/xls2csv.vbs" 'C:/cygwin64/path/to/MyStuff.xlsx'
wait; sleep 10 # or similar
awk -v value="$value" -v row="$row" -v col="$col" 'BEGIN{FS=OFS=","} NR==row {$col=value}1' '/path/to/MyStuff/Sheet1.csv' > "/tmp/tmp$$" &&
mv "/tmp/tmp$$" '/path/to/MyStuff/Sheet1.csv'
and then in Windows just double-click on /path/to/MyStuff/Sheet1.csv to open it in Excel (you may need to associate the .csv file suffix with Excel the first time you do that).
Note that the above will only handle simple CSVs (no quoted fields containing commas or newlines); see "What's the most robust way to efficiently parse CSV using awk?" for how to robustly handle CSVs with awk in general.
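If Ruby happens to be available (an assumption; the answer above only uses awk), its csv library copes with quoted fields that contain commas, which a plain FS="," split does not. A minimal sketch, where set_cell is a hypothetical helper that takes a 1-based row and col like the awk variables above. Depending on your Ruby version you may need to install the csv gem first.

```ruby
require "csv"

# Hypothetical helper: set one cell (1-based row and column) in CSV text
# and return the updated CSV text.
def set_cell(csv_text, row, col, value)
  rows = CSV.parse(csv_text)      # handles quoted fields and embedded commas
  rows[row - 1][col - 1] = value  # raises if the row does not exist
  rows.map(&:to_csv).join         # re-quote fields where necessary
end
```

To update a file in place you could then do something like File.write(path, set_cell(File.read(path), row, col, value)).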
I have an issue which I thought was to do with network drives, but I have now tested and that's not the cause.
There are 2000 matching files (totaling 328 MB) in the test, and it takes about 1.4 seconds to complete any time I run it, EXCEPT the first time each day, when it takes anywhere from 30 to 60 seconds.
I thought Dir was causing the issue, but it's definitely the code inside the loop that is slow.
Could file caching be causing this?
Is there a better way to quickly load the first line of a large number of files?
'Get All Filenames
sAllFiles = Dir("C:\Folder\" & sFile & "??.???")
'Loop through each File
Do While Len(sAllFiles) > 0
    sCurrentFileName = sAllFiles
    sCurrentFilePath = "C:\Folder\" & sCurrentFileName
    'Read 1st line from each file
    Open sCurrentFilePath For Input As #1
    Line Input #1, sFirstLine
    Close #1
    vRowData = Split(sFirstLine, "~")
    '(Write data to array code here)
    sAllFiles = Dir
Loop
Hi, I'm trying to read a PDF in Ruby, and first of all I want to convert it into a .txt file. path is the path to the PDF. The problem is that I get an empty .txt file; someone told me it is a pdftotext problem, but I don't know how to fix it.
spec = path.sub(/\.pdf$/, '')
`pdftotext #{spec}.pdf`
file = File.new("#{spec}.txt", "w+")
text = []
file.readlines.each do |l|
  if l.length > 0
    text << l
    Rails.logger.info l
  end
end
file.close
What's wrong with my code? Thanks!
It's not possible to extract text from every PDF. Some PDF files use a font encoding that makes it impossible to extract text with simple tools such as pdftotext (and some PDF files are even completely immune to direct text extraction with any tool known to me -- in these cases you'll have to apply OCR first to have a chance to extract text...).
So if you test your code with the same "weird" PDF file all the time, it may well happen that you're getting frustrated over your code while in reality the fault lies with the PDF.
First make sure that pdftotext works well on the command line with a given PDF, then test (and develop further) your code with that PDF.
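That check can also be made from Ruby before any parsing. A minimal sketch, assuming pdftotext is on the PATH; extract_text and its command parameter are hypothetical names, and the trailing "-" asks pdftotext to write the text to stdout instead of a .txt file.

```ruby
require "open3"

# Hypothetical helper: run pdftotext on one PDF and return the extracted text.
# `command` is only a parameter so a differently named binary can be swapped in.
def extract_text(pdf_path, command = ["pdftotext"])
  # Passing arguments separately avoids shell quoting problems with odd paths.
  out, err, status = Open3.capture3(*command, pdf_path, "-")
  raise "#{command.first} failed: #{err.strip}" unless status.success?
  # An empty result usually means the PDF itself has no extractable text.
  warn "no text extracted from #{pdf_path}" if out.strip.empty?
  out
end
```

If this raises or warns for a given PDF, the problem lies with the tool or the file rather than with the surrounding Ruby code.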
The problem is that you are opening the file in "w+" mode, which truncates the file to zero length before you read it. You can see a table of file modes and what they mean at http://ruby-doc.org/core-1.9.3/IO.html.
Try something like this: it uses a pdftotext option to send the text to stdout, which avoids creating a temporary file, and uses blocks for more idiomatic Ruby.
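require "shellwords"
# "-" makes pdftotext write to stdout; escape the path in case it contains spaces
text = `pdftotext #{Shellwords.escape(path)} -`
# Split into lines (a bare split would break the text up on every whitespace run)
text.lines.map(&:chomp).select { |line|
  line.length > 0
}.each { |line|
  Rails.logger.info(line)
}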
You would need to open the txt file with write permission.
file = File.new("#{spec}.txt", "w")
You could consult How to create a file in Ruby
Update: your code is not complete and looks buggy.
Can't tell what path is.
It looks like you are trying to read (file.readlines.each) the text file that you intend to write to.
Spell-check length: you have it as l.lenght.
You may want to paste the actual code.
Check this gist https://gist.github.com/4160587
As mentioned, your code is not working because you are reading and writing to the same file.
Example
Ruby code file_write.rb to do the file write operation
pdf_file = File.open("in.txt")
output_file = File.open("out.txt", "w") # file to which you want to write
# Iterate over the input file and write its content to the output file
pdf_file.readlines.each do |l|
  output_file.puts(l)
end
output_file.close
pdf_file.close
Sample txt file in.txt
Some text in file
Another line of text
1. Line 1
2. Not really line 2
Once you run file_write.rb you should see a new file called out.txt with the same content as in.txt. You can change the content of the input file if you want. In your case you would use a PDF reader to get the content and write it to the text file; basically only the first line of the code will change.
I am stumbling over one thing:
I am sorting a bunch of files in awk and saving the sorted particles as .txt. But now I need to save them as .doc, specifically in landscape format. I googled a lot and found that the only way to do this is to save the file as .doc and, when creating the file, write this RTF code into it first and then write the real content.
RTF start-tag code:
{\rtf1\ansi\deff0 {\fonttbl {\f0 Courier;}}
{\colortbl;\red0\green0\blue0;\red255\green0\blue0;}
\landscape
\paperw15840\paperh12240\margl720\margr720\margt720\margb720
and the RTF close tag:
}
The close tag has to be written as the last line of the newly created file, after the last line of content.
My problem is: how can I detect the last line of the file inside awk before reaching END?
This is my code: http://pastebin.com/mfjH4NYY
It is too much code to follow easily, but the point is: fnnID is not available in the END block, so a new file gets created when I try to append the } character to close the RTF. Can someone help me figure this out?
Thanks a lot.
Let's say you'll have a function write_header(filepath) that will write the RTF header into a file. Make this function record in some global variable all the filepaths it was passed. Then, in your END, loop over these filepaths and write the RTF footer into them.
As for your new "ls -l" question: I don't see why you need to use it.
Here's what I suggested:
function write_header(filepath) {
    print "{\\rtf1\\ans .... " >> filepath
    tracked[max_header++] = filepath
}
BEGIN {
    # You don't have to write the headers in BEGIN. Just make sure it's the
    # first thing you write to the files.
    write_header("file1.doc")
    write_header("file2.doc")
    write_header("another_file.doc")
}
END {
    # Write the footers.
    for (i in tracked) {
        print "}" >> tracked[i]
    }
}
I have an enormous text file that I'd like to parse into other files - I think the only thing I have on this box (company computer) to use is VBS, so here's my question:
I have a text file with a bunch of network configurations that looks like this:
"Active","HostName","IPAddress"
!
.......
end
This repeats throughout the file, but obviously different data occurs in place of the "......." for each configuration. The first field is always "Active" for every configuration as well.
I want to create save files of the type HostName.cfg for each configuration and save all of the text between and including the ! and "end" . The line with the three quoted attributes doesn't need to be copied.
I'm still learning VBS so I'd appreciate any help in the matter. Thanks!
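Here are some useful file-reading functions and statements from VB6/VBA (plain VBScript does not support them; there you use Scripting.FileSystemObject instead):
FreeFile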
Returns an integer which you should assign to a variable and then use in an Open statement.
Open <string> For Input As <integer>
Opens the specified file using the specified file handle. Don't use the same file handle twice without closing it in between.
Line Input #<integer>, <variable>
Reads one line of the file into a string.
Input #<integer>, <variable>, <variable>, <variable>
Reads one line of the file delimited by commas into several variables.
I once had to read a text file while omitting starting and ending lines. Though I didn't need to write it anywhere, FSO is simple enough that you will be able to figure it out. Here's some code for reading a file and link for writing to file via FSO: link
'Create a FSO object and open the file by providing the file path
Dim fso, f, filePath, line
filePath = "test.txt"
Set fso = CreateObject("Scripting.FileSystemObject")
Set f = fso.OpenTextFile(filePath)
'Skip through the first two lines
f.readline
f.readline
'Read till the end of file, if the line is not "end", split it based on ","
while not f.AtEndOfStream
    line = f.readline()
    if line <> "end" then
        arr = split(line, ",")
        'Write the arr to file
    end if
wend
f.Close
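For the splitting task itself, the logic is short enough to sketch. This version is in Ruby rather than VBScript, purely as an illustration. split_configs is a hypothetical helper, and it assumes that every block starts with a header line of exactly three quoted fields beginning with "Active" and finishes with a line that is just end. The same state machine (remember the current host name, copy lines until end) should carry over to an FSO ReadLine loop.

```ruby
# Hypothetical helper: split a combined dump into { "HostName.cfg" => text }.
def split_configs(text)
  configs = {}
  host = nil
  text.each_line do |line|
    stripped = line.chomp
    if (m = stripped.match(/\A"Active","([^"]*)","[^"]*"\z/))
      # Header line: start a new config, but do not copy the header itself.
      host = m[1]
      configs["#{host}.cfg"] = +""
    elsif host
      # Copy everything from "!" through "end" into the current config.
      configs["#{host}.cfg"] << stripped << "\n"
      host = nil if stripped == "end"
    end
  end
  configs
end
```

Writing the files is then one line, e.g. split_configs(File.read("all_configs.txt")).each { |name, body| File.write(name, body) }, where all_configs.txt is a placeholder name.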