Is there an efficient way to read the last line of a text file? Right now I'm simply reading each line with code like the one below, so that S holds the last line read. Is there a good way to grab that last line without looping through the entire text file?
TStreamReader* Reader;
Reader = new TStreamReader(myfile);
while (!Reader->EndOfStream)
{
String S = Reader->ReadLine();
}
Exactly as Remy Lebeau commented:
Use the file access functions FileOpen, FileSeek and FileRead.
See here for an example of usage:
Convert the Linux open, read, write, close functions to work on Windows
Load the file into memory in chunks, starting from the end.
Make a static buffer and load the file into it chunk by chunk, from the end towards the start.
Stop at an EOL (end of line), usually CR, LF.
Scan from the end of the chunk for the ASCII codes 13 and 10 or their combinations. Beware that in some files the last line is also terminated, so you should skip that trailing EOL on the first pass.
Known EOLs are:
13
10
13,10
10,13
Construct the line.
If no EOL is found, prepend the whole chunk to the string; if one is found, prepend only the part after it.
Here is a small example:
int hnd,siz,i,n;
const int bufsz=256; // buffer size
char buf[bufsz+1];
AnsiString lin; // last line output
buf[bufsz]=0; // string terminator
hnd=FileOpen("in.txt",fmOpenRead); // open file
siz=FileSeek(hnd,0,2); // obtain size and point to its end
for (i=-1,lin="";siz;)
{
n=bufsz; // n = chunk size to load
if (n>siz) n=siz; siz-=n;
FileSeek(hnd,siz,0); // point to its location (from start)
FileRead(hnd,buf,n); // load it to buf[]
if (i<0) // first time pass (skip last eol)
{
i=n-1; if (i>0) if ((buf[i]==10)||(buf[i]==13)) n--;
i--; if (i>0) if ((buf[i]==10)||(buf[i]==13)) if (buf[i]!=buf[i+1]) n--;
}
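buf[n]=0; // terminate the chunk (n can be smaller than bufsz, or reduced by the eol skip above)
for (i=n-1;i>=0;i--) // scan for eol (CR,LF) from the end of the chunk
if ((buf[i]==10)||(buf[i]==13))
{ siz=0; break; } i++; // i points to the start of the line; siz is zero so no more chunks are read
lin=AnsiString(buf+i)+lin; // prepend the new chunk to the line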
}
FileClose(hnd); // close file
// here lin is your last line
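For comparison, the same backward chunked scan can be sketched in Go rather than C++Builder. This is only a sketch under assumptions: lastLine, trimOneEOL and lastLineOfString are hypothetical names, the chunk size is arbitrary, and only the LF, CR and CR LF terminators are handled. An *os.File satisfies io.ReaderAt, so a real file could be passed in place of the in-memory reader used here.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// trimOneEOL removes a single trailing line terminator (LF, CR, or CR LF).
func trimOneEOL(b []byte) []byte {
	if n := len(b); n > 0 && b[n-1] == '\n' {
		b = b[:n-1]
	}
	if n := len(b); n > 0 && b[n-1] == '\r' {
		b = b[:n-1]
	}
	return b
}

// lastLine reads r backwards in chunks of chunkSize bytes and returns the
// last line, ignoring one trailing terminator. size is the total length of r.
func lastLine(r io.ReaderAt, size, chunkSize int64) (string, error) {
	if chunkSize <= 0 {
		return "", fmt.Errorf("chunkSize must be positive, got %d", chunkSize)
	}
	var tail []byte // bytes read so far, kept in file order
	buf := make([]byte, chunkSize)
	for size > 0 {
		n := chunkSize
		if n > size {
			n = size
		}
		size -= n
		if _, err := r.ReadAt(buf[:n], size); err != nil && err != io.EOF {
			return "", err
		}
		// Prepend the new chunk, then look for a terminator before the line.
		tail = append(append([]byte{}, buf[:n]...), tail...)
		body := trimOneEOL(tail)
		if i := bytes.LastIndexAny(body, "\r\n"); i >= 0 {
			return string(body[i+1:]), nil
		}
	}
	// No terminator found: the whole input is a single line.
	return string(trimOneEOL(tail)), nil
}

// lastLineOfString is an in-memory wrapper used only for demonstration.
func lastLineOfString(s string, chunkSize int64) (string, error) {
	return lastLine(strings.NewReader(s), int64(len(s)), chunkSize)
}

func main() {
	line, err := lastLineOfString("first\r\nsecond\r\nlast line\r\n", 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", line) // prints "last line"
}
```

Re-trimming the accumulated tail on every pass keeps the sketch short and also covers a CR LF pair that straddles a chunk boundary, at the cost of rescanning bytes when the last line is very long.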
So, I'm teaching myself some Golang by making a simple resource management game with ncurses. I'm using this library to connect Golang to ncurses.
I've made a simple text input panel that takes in one character at a time, displays it, and then adds it to a string composing the user's response. Here's what it looks like:
// Accept characters, printing them until end
ch := window.GetChar()
kstr := gc.KeyString(ch)
response := ""
cur := 0
for kstr != "enter" {
// Diagnostic print to get key code of current character
window.Move(0,0)
window.ClearToEOL()
window.MovePrint(0, 0, ch)
// If it's a backspace or delete, remove a character
// Otherwise, as long as it's a regular character, add it
if ((ch == 127 || ch == 8) && cur != 0){
cur--
response = response[:len(response)-1]
window.MovePrint(y, (x + cur), " ")
} else if (ch >= 33 && ch <= 122 && cur <= 52) {
window.MovePrint(y, (x + cur), kstr)
response = response + kstr
cur++
}
// Get next character
ch = window.GetChar()
kstr = gc.KeyString(ch)
}
However, the arrow and function keys seem to be coming up as keycodes already associated with the normal a-zA-Z characters. For example, right-arrow comes up as 67 and F1 as 80. Any ideas what I'm doing wrong here, or if there's a better approach to taking in alphanumerics through ncurses? I'd like to avoid ncurses fields and classes as much as possible, because the point here is to learn Golang, not ncurses. Thanks!
If you do not enable the keypad mode, (n)curses will return the individual bytes which make up a special key.
To fix, add this to your program's initialization:
stdscr.Keypad(true) // allow keypad input
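With keypad mode enabled, special keys such as right-arrow are returned as single values above 255. goncurses has symbols defined for those, e.g., KEY_RIGHT. Keypad mode is set per window, so enable it on the window you actually read from (window in the code above).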
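To see where the 67 and 80 in the question come from, here is a small sketch that does not use ncurses at all. It assumes xterm-style escape sequences (ESC [ C for right-arrow, ESC O P for F1); other terminals may send different bytes. The names escapeKeys and describeKey are made up for this illustration.

```go
package main

import "fmt"

// escapeKeys lists a few xterm-style escape sequences. It is an illustrative
// subset; the actual bytes depend on the terminal.
var escapeKeys = map[string]string{
	"\x1b[A": "up",
	"\x1b[B": "down",
	"\x1b[C": "right",
	"\x1b[D": "left",
	"\x1bOP": "F1",
}

// describeKey returns the key name for a raw byte sequence together with its
// final byte, which is the value a byte-at-a-time read loop ends up showing.
func describeKey(seq string) (string, byte) {
	if seq == "" {
		return "unknown", 0
	}
	name, ok := escapeKeys[seq]
	if !ok {
		name = "unknown"
	}
	return name, seq[len(seq)-1]
}

func main() {
	for _, seq := range []string{"\x1b[C", "\x1bOP"} {
		name, last := describeKey(seq)
		// The last byte is 67 ('C') for right-arrow and 80 ('P') for F1.
		fmt.Printf("%s: bytes=%v last=%d\n", name, []byte(seq), last)
	}
}
```

Without keypad mode the input loop receives these bytes one at a time, so the diagnostic print ends on the final byte, which falls inside the 33 to 122 range that the loop treats as a regular character.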
I have a NSTextView in which I have lots of text.
How can I get the line and the column number of the character at some index?
Let's say I have this text in the NSTextView:
"This is just a\ndummy text\nto show you\nwhat I mean."
And I need the line and column number for the 16th character. In this case:
Line: 2
Column: 2
How can I calculate/get this using Swift?
Another example:
"This is just a\ndummy text\nto show you\nwhat I mean."
And I want the line and column number for the 15th (or 16th, if \n characters are counted too) character, like this:
Line: 2
Column: 1
You just need to break your text into lines using the String method components(separatedBy:), and then keep count of the lines, columns and character position as follows:
extension String {
func characterRowAndLineAt(position: Int) -> (character: String, line: Int, column:Int)? {
var lineNumber = 0
var characterPosition = 0
for line in components(separatedBy: .newlines) {
lineNumber += 1
var columnNumber = 0
for column in line {
characterPosition += 1
columnNumber += 1
if characterPosition == position {
return (String(column), lineNumber, columnNumber )
}
}
characterPosition += 1
if characterPosition == position {
return ("\n", lineNumber, columnNumber+1 )
}
}
return nil
}
}
let myText = "This is just a\ndummy text\nto show you\nwhat I mean."
let result = myText.characterRowAndLineAt(position: 16) // "(.0 "d", .1 2, .2 1)"
let character = result?.character // "d"
let line = result?.line // 2
let column = result?.column // 1
I am trying to write a custom record reader that reads a record spanning two lines with a defined number of fields.
For example:
1,2,3,4 (the trailing "," may or may not be present)
,5,6,7,8
My requirement is to read the record and push it into the mapper as a single record like {1,2,3,4,5,6,7,8}. Any input would be appreciated.
UPDATE:
public boolean nextKeyValue() throws IOException, InterruptedException {
if(key == null) {
key = new LongWritable();
}
//Current offset is the key
key.set(pos);
if(value == null) {
value = new Text();
}
int newSize = 0;
int numFields = 0;
Text temp = new Text();
boolean firstRead = true;
while(numFields < reqFields) {
while(pos < end) {
//Read up to the '\n' character and store it in 'temp'
newSize = in.readLine( temp,
maxLineLength,
Math.max((int) Math.min(Integer.MAX_VALUE, end - pos),
maxLineLength));
//If 0 bytes were read, then we are at the end of the split
if(newSize == 0) {
break;
}
//Otherwise update 'pos' with the number of bytes read
pos += newSize;
//If the line is not too long, check number of fields
if(newSize < maxLineLength) {
break;
}
//Line too long, try again
LOG.info("Skipped line of size " + newSize + " at pos " +
(pos - newSize));
}
//Exit, since we're at the end of split
if(newSize == 0) {
break;
}
else {
String record = temp.toString();
StringTokenizer fields = new StringTokenizer(record,"|");
numFields += fields.countTokens();
//Reset 'value' if this is the first append
if(firstRead) {
value = new Text();
firstRead = false;
}
if(numFields != reqFields) {
value.append(temp.getBytes(), 0, temp.getLength());
}
else {
value.append(temp.getBytes(), 0, temp.getLength());
}
}
}
if(newSize == 0) {
key = null;
value = null;
return false;
}
else {
return true;
}
}
}
This is the nextKeyValue method I am working on. However, the mappers are still not getting the proper values.
reqFields is 4.
Look at how TextInputFormat is implemented, and at its superclass, FileInputFormat, as well. You must subclass either TextInputFormat or FileInputFormat and implement your own record handling.
One thing to be aware of when implementing any kind of file input format is this:
The framework will split the file and give you the start offset and byte length of the piece of the file you have to read. It may very well happen that it splits the file right across some record. That is why your reader must skip the bytes of the record at the beginning of the split if that record is not fully contained in the split, and must read past the last byte of the split to read the whole last record if that one is not fully contained in the split.
For example, TextInputFormat treats \n characters as record delimiters, so when it gets a split it skips the bytes until the first \n character and reads past the end of the split until the next \n character.
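The skip-and-read-past rule can be illustrated outside Hadoop. The following is a rough sketch in Go, not Hadoop's API: readSplit is a hypothetical helper that works on an in-memory string, and it assumes the convention that every split except the first discards its first (possibly partial) line, while each split keeps reading as long as a line starts at or before the split's end.

```go
package main

import (
	"fmt"
	"strings"
)

// readSplit returns the newline-delimited records that belong to the split
// [start, start+length) of data. It is a simulation, not a Hadoop class.
func readSplit(data string, start, length int) []string {
	if start < 0 || start >= len(data) {
		return nil
	}
	end := start + length
	pos := start
	if start != 0 {
		// Discard everything up to and including the first '\n': that line
		// is read by the previous split, which reads past its own end.
		i := strings.IndexByte(data[pos:], '\n')
		if i < 0 {
			return nil
		}
		pos += i + 1
	}
	var records []string
	// Read every line that starts at or before the end of the split, even
	// if the line itself extends beyond the end.
	for pos <= end && pos < len(data) {
		i := strings.IndexByte(data[pos:], '\n')
		if i < 0 {
			records = append(records, data[pos:])
			break
		}
		records = append(records, data[pos:pos+i])
		pos += i + 1
	}
	return records
}

func main() {
	data := "aaa\nbbb\nccc\nddd\n"
	// The boundary at offset 6 falls in the middle of "bbb".
	fmt.Println(readSplit(data, 0, 6))  // [aaa bbb]
	fmt.Println(readSplit(data, 6, 10)) // [ccc ddd]
}
```

Each record is produced by exactly one split even though the boundary cuts through "bbb". A record format that spans two lines needs an equivalent rule for recognising where a record starts.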
As for the code example:
You need to ask yourself the following question: Say you open the file, seek to a random position and start reading forward. How do you detect the start of a record? I don't see anything in your code that deals with that, and without it you cannot write a good input format, because you don't know where the record boundaries are.
It is still possible to make the input format read the whole file end to end by making the isSplitable(JobContext, Path) method return false. That makes the whole file be read by a single map task, which reduces parallelism.
Your inner while loop seems problematic, since it checks for lines that are too long and skips them. Given that your records are written across multiple lines, it can happen that you merge one part of one record with a part of another record when you read it.
The string had to be tokenized using StringTokenizer and not split. The code has been updated with the new implementation.
How can I print a number or make a string with zero padding to make it fixed width?
For instance, if I have the number 12 and I want to make it 000012.
The fmt package can do this for you:
fmt.Printf("|%06d|%6d|\n", 12, 345)
Output:
|000012|   345|
Notice the 0 in %06d: the 6 sets the width to 6 and the 0 pads with zeros. The second verb, %6d, pads with spaces.
Try it for yourself here: http://play.golang.org/p/cinDspMccp
Use the Printf function from the fmt package with a width of 6 and the padding character 0:
import "fmt"
fmt.Printf("%06d", 12) // Prints to stdout '000012'
Setting the width works by putting an integer directly preceding the format specifier ('verb'):
fmt.Printf("%d", 12) // Uses default width, prints '12'
fmt.Printf("%6d", 12) // Uses a width of 6 and left pads with spaces, prints '    12'
The only padding characters supported by Golang (and most other languages) are spaces and 0:
fmt.Printf("%6d", 12) // Default padding is spaces, prints '    12'
fmt.Printf("%06d", 12) // Change to 0 padding, prints '000012'
It is possible to left-justify the output (padding on the right) by prepending a minus -:
fmt.Printf("%-6d", 12) // Left-justified, padded on the right, prints '12    '
Beware that for floating point numbers the width covers the whole formatted number, including the decimal point and the fractional digits:
fmt.Printf("%06.1f", 12.0) // Prints '0012.0' (width is 6, precision is 1 digit)
It is useful to note that the width can also be set programmatically by using * instead of a number and passing the width as an int parameter:
myWidth := 6
fmt.Printf("%0*d", myWidth, 12) // Prints '000012' as before
This might be useful for instance if the largest value you want to print is only known at runtime (called maxVal in the following example):
myWidth := 1 + int(math.Log10(float64(maxVal)))
fmt.Printf("%*d", myWidth, nextVal)
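Note that the math.Log10 computation above assumes maxVal is positive (math.Log10(0) is -Inf). A possible alternative, sketched here under the assumption that the width should simply match the decimal text of maxVal, is to measure the length of its string form. The helper names digitWidth and padTo are made up for this example.

```go
package main

import (
	"fmt"
	"strconv"
)

// digitWidth returns the number of characters strconv.Itoa produces for v,
// including the leading '-' of a negative value.
func digitWidth(v int) int {
	return len(strconv.Itoa(v))
}

// padTo formats v right-aligned in a field just wide enough for maxVal.
func padTo(v, maxVal int) string {
	return fmt.Sprintf("%*d", digitWidth(maxVal), v)
}

func main() {
	maxVal := 1000
	for _, v := range []int{7, 42, 1000} {
		fmt.Printf("[%s]\n", padTo(v, maxVal))
	}
	// Prints:
	// [   7]
	// [  42]
	// [1000]
}
```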
Finally, if you don't want to print to stdout but want a string returned instead, use Sprintf, also from the fmt package, with the same parameters:
s := fmt.Sprintf("%06d", 12) // returns '000012' as a string
A simple way to achieve this is:
func padNumberWithZero(value uint32) string {
return fmt.Sprintf("%02d", value)
}
fmt.Sprintf formats and returns a string without printing it anywhere.
Here %02d pads with zeros on the left when the value has fewer than 2 digits. If the value has 2 or more digits, nothing is padded. For example:
If input is 1, output will be 01.
If input is 12, output will be 12.
If input is 1992, output will be 1992.
You can use %03d or a larger width for more zero padding.
If you want to add a prefix or suffix to form another word by concatenation, you can use the code below.
package main
import "fmt"
func main() {
concatenatedWord := "COUNTER_" + fmt.Sprintf("%02d", 1)
// use concatenatedWord
fmt.Println("concatenatedWord is", concatenatedWord)
}
Output: concatenatedWord is COUNTER_01
link : https://play.golang.org/p/25g3L8TXiPP
The question "List of printing format in Go lang" reminds us that there is also the - flag:
'-' pad with spaces on the right rather than the left (left-justify the field)
You can see more padding examples in DaddyOh/golang-samples/pad.go, if you want to pad with other string sequences (more complex than '0' or ' '):
leftPad(s string, padStr string, pLen int)
rightPad(s string, padStr string, pLen int)
leftPad2Len(s string, padStr string, overallLen int)
rightPad2Len(s string, padStr string, overallLen int)
See play.golang.org:
1234567890
leftPad(str, "*", 3) ***1234567890
leftPad2Len(str, "*-", 13) -*-1234567890
leftPad2Len(str, "*-", 14) *-*-1234567890
leftPad2Len(str, "*", 14) ****1234567890
leftPad2Len(str, "*-x", 14) x*-x1234567890
leftPad2Len(str, "ABCDE", 14) BCDE1234567890
leftPad2Len(str, "ABCDE", 4) 7890
rightPad(str, "*", 3) 1234567890***
rightPad(str, "*!", 3) 1234567890*!*!*!
rightPad2Len(str, "*-", 13) 1234567890*-*
rightPad2Len(str, "*-", 14) 1234567890*-*-
rightPad2Len(str, "*", 14) 1234567890****
rightPad2Len(str, "*-x", 14) 1234567890*-x*
rightPad2Len(str, "ABCDE", 14) 1234567890ABCD
rightPad2Len(str, "ABCDE", 4) 1234
func lpad(s string, pad string, plength int) string {
for i := len(s); i < plength; i++ {
s=pad+s
}
return s
}
lpad("3","0",2) result: "03"
lpad("12","0",6) result: "000012"
Here's my solution:
func leftZeroPad(number, padWidth int64) string {
return fmt.Sprintf(fmt.Sprintf("%%0%dd", padWidth), number)
}
Example usage:
fmt.Printf("%v", leftZeroPad(12, 10))
prints:
0000000012
The advantage of this is that you can specify the pad length at run time if needed.
For those that want to right pad, you can do this:
str2pad := "12"
padWith := "0"
amt2pad := 6
// This makes sure there are always 6 characters in total, padded on the right side.
// Note: strings.Repeat panics if the count is negative, so make sure len(str2pad) <= amt2pad.
paddedStr := str2pad + strings.Repeat(padWith, amt2pad - len(str2pad))
//Outputs 120000
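Since strings.Repeat panics on a negative count, the snippet above can be wrapped in a small guard. This is a minimal sketch: rightPad is a hypothetical helper, it measures the length in bytes with len, and it assumes pad is a single one-byte character.

```go
package main

import (
	"fmt"
	"strings"
)

// rightPad appends pad to s until it is total bytes long. Strings that are
// already long enough are returned unchanged, so strings.Repeat never
// receives a negative count.
func rightPad(s, pad string, total int) string {
	missing := total - len(s)
	if missing <= 0 || pad == "" {
		return s
	}
	return s + strings.Repeat(pad, missing)
}

func main() {
	fmt.Println(rightPad("12", "0", 6))      // 120000
	fmt.Println(rightPad("1234567", "0", 6)) // 1234567
}
```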
Another option is the golang.org/x/text/number package:
package main
import (
"golang.org/x/text/language"
"golang.org/x/text/message"
"golang.org/x/text/number"
)
var fmt = message.NewPrinter(language.English)
func main() {
n := number.Decimal(
12, number.Pad('0'), number.FormatWidth(6),
)
fmt.Println(n) // 000012
}
https://pkg.go.dev/golang.org/x/text/number
fmt.Printf("%012s", "345")
Result: 000000000345