Reading .txt files into Apple's Numbers - applescript

I have a list of .txt files, all in the same directory, with names "w_i.txt" where i runs from 1 to n. Each of these files contains a single number (non-integer). I want to read the value from each of these files into a column in Apple's Numbers, with w_1.txt's value in row 1 and w_n.txt's value in row n of that column. Should I use AppleScript for this, and if so, what code would be required?

I think I would tackle this as a shell script rather than AppleScript.
The following script will iterate over your set of files in numerical order and produce plain text output. You can redirect this to a text file. I don't have access to Apple's Numbers, but I'd be very surprised if you can't import data from a plain text file.
I have hard-coded the max file index as 5. You'll want to change that.
If there are any files missing, a 0 will be output on that line instead. You could change that to a blank line as required.
Also, I don't know if your files end in newlines or not, so the cat/read/echo line is one way to grab each file's value without worrying about a missing trailing newline.
#!/bin/bash
for i in {1..5} ; do
    if [ -e "w_$i.txt" ] ; then
        # read strips the (possibly missing) trailing newline; echo adds one back
        cat "w_$i.txt" | { IFS="" read -r n ; echo "$n" ; }
    else
        echo 0    # placeholder for a missing file
    fi
done
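To capture the output for import, run the script redirected into a file (the script name here is just an example):
./collect_w.sh > column.txt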

If all files end with newlines, you could just use cat:
cd ~/Documents/some\ folder; cat w_*.txt | pbcopy
This works even if the files don't end with newlines, and it sorts w_2.txt before w_11.txt by sorting numerically on the part after the underscore:
sed -n p $(ls w_*.txt | sort -t_ -k2 -n)
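To get the result into Numbers, one option (a sketch, assuming Numbers will accept a plain-text file, as the first answer expects) is to write the sorted values out and open them from the command line:
sed -n p $(ls w_*.txt | sort -t_ -k2 -n) > column.txt
open -a Numbers column.txt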

Related

echo last character of text file in Unix/Bash

I need to see the last characters of a bunch of text files (or, alternatively, test whether they are "}" and list the files that test negative). Is there an easy way to do this from the command line?
(Ideally, the solution works without reading the whole file from the start, because in addition to there being many files, they can also be quite large.)
P.S.: Any answer would be great, but I would really appreciate it if the function and syntax of everything in the answer could be fully explained.
It can be done fairly easily with tail and then string indexing in bash. For example, you obtain the last line in a file with tail -n1 file. You will need to store the line in a variable using command substitution, e.g.
lastln=$(tail -n1 file)
Then it is simply a matter of indexing the last characters, e.g.
echo ${lastln:(-1)}
(Note: when indexing from the end of the string, you must either put the offset, e.g. -1, in parentheses, as in (-1), or leave a space before the -1; e.g. echo ${lastln: -1} is also valid.)
You can try this:
for file in file1 file2; do tail -n 1 "$file" | grep -q '}$' || echo "$file"; done
where you should replace file1 file2 with the list of files you want to analyze, e.g. * or the like. Now what happens here? The outer part
for file in file1 file2; do ...; done
is a simple loop over the files, where inside the loop, you can refer to the current file as $file. Then,
tail -n 1 "$file"
prints the last line of the given file and
| grep -q '}$'
pipes the output to grep (silenced with -q), which looks for '}' immediately followed by the end of the line ($). The exit status of this command can be used to chain another action: when grep returns non-zero (indicating failure, i.e., the pattern was not matched), the last part
|| echo "$file"
is executed, resulting in the list of files you need.
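Since the question mentions avoiding reading whole files: tail seeks from the end rather than reading from the start, and you can go further and fetch only the final bytes with tail -c. A sketch that lists files whose last character isn't "}":
for f in *.txt; do
    last=$(tail -c 2 "$f")    # command substitution strips a trailing newline
    [ "${last: -1}" = "}" ] || echo "$f"
done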

Executing a bash loop script from a file

I am trying to execute this in Unix. Say, for example, I have five files named after dates, and each of those files contains thousands of numerical values (six- to ten-digit numbers). Now, let's say I also have a bunch of numerical values and I want to know which value belongs to which file. I am doing it the hard way, as below, but how do I put all my values in a file and just loop from there?
FILES:
20170101
20170102
20170103
20170104
20170105
Code:
for i in 5555555 67554363 564324323 23454657 666577878 345576867; do
    echo $i; grep -l $i 201701*;
done
Or, why loop at all? If you have a file containing all your numbers (say numbers.txt), you can find which date file each one is contained in, and on what line, with a simple
grep -nH -w -f numbers.txt 201701*
Where the -f option simply tells grep to use the values contained in the file numbers.txt as the patterns to search for in each of the files matching 201701*. The -n and -H options list the line number and filename for each match, respectively. And, as Ed points out below, the -w option ensures grep only selects lines containing the whole word sought.
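For example, if numbers.txt contains 5555555 and that value sits on line 212 of 20170103 (the line number here is hypothetical), the corresponding output line from grep would be:
20170103:212:5555555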
You can also do it with a while loop and read from the file if you create it as @Barmar suggested:
while read -r i; do
    ...
done < numbers.txt
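Filling in the loop body with the grep from the question, a minimal sketch (adding the -w discussed above):
while read -r i; do
    echo "$i"
    grep -l -w "$i" 201701*
done < numbers.txt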
Put the values in a file numbers.txt and do:
for i in $(cat numbers.txt); do
    ...
done

Bash - read specific line from a file with all sorts of data and store as a variable

I have looked for an answer to what seems like a simple question, but I feel as though all these questions (below) only briefly touch on the matter and/or over-complicate the solution.
Read a file and split each line into two variables with bash program
Bash read from file and store to variables
Need to assign the contents of a text file to a variable in a bash script
What I want to do is read specific lines from a file (titled 'input'), store them in variables and then use them.
For example, in this code, every 9th line after a certain point contains a filename that I want to store as a variable for later use. How can I do that?
steps=49
for ((i=1; i<=steps; i++)); do
    ...
    g=$((9 * i + 28))    # In.omega filename
For the bigger picture: I basically need to print a specific line (line 9) from the file whose name is specified in the gth line of the file named "input":
sed '1,39p;d' data > temp
sed "9,9p;d" [filename specified in line g of input] >> temp
sed '41,$p;d' data >> temp
mv temp data
Say you want to assign the 49th line of the file $FILE to the variable $ARG; you can do:
ARG=$(head -n 49 "$FILE" | tail -n 1)
To get line 9 of the file named in the gth line of the file named input:
sed -n 9p "$(sed -n ${g}p input)"
arg=$(sed -n '2p' sample.txt)
where arg is the variable, sample.txt is the file, and 2 is the line number.
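Tying the pieces back to the question's loop, a sketch of the whole lookup (the structure is taken from the question; the inner sed calls follow the answer above):
steps=49
for ((i=1; i<=steps; i++)); do
    g=$((9 * i + 28))                # line of "input" that holds the filename
    fname=$(sed -n "${g}p" input)    # read that filename
    sed -n '9p' "$fname"             # print line 9 of that file
done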

Is it possible to split a huge text file (based on number of lines) unpacking a .tar.gz archive if I cannot extract that file as whole?

I have a .tar.gz file. It contains one 20GB-sized text file with 20.5 million lines. I cannot extract this file as a whole and save to disk. I must do either one of the following options:
Specify a number of lines in each file - say, 1 million, - and get 21 files. This would be a preferred option.
Extract a part of that file based on line numbers, that is, say, from 1000001 to 2000001, to get a file with 1M lines. I will have to repeat this step 21 times with different parameters, which is very bad.
Is it possible at all?
This answer - bash: extract only part of tar.gz archive - describes a different problem.
To extract a file from f.tar.gz and split it into files, each with no more than 1 million lines, use:
tar Oxzf f.tar.gz | split -l1000000
The above will name the output files by the default method. If you prefer the output files to be named prefix.nn where nn is a sequence number, then use:
tar Oxzf f.tar.gz | split -dl1000000 - prefix.
Under this approach:
The original file is never written to disk. tar reads from the .tar.gz file and pipes its contents to split which divides it up into pieces before writing the pieces to disk.
The .tar.gz file is read only once.
split, through its many options, has a great deal of flexibility.
Explanation
For the tar command:
O tells tar to send the output to stdout. This way we can pipe it to split without ever having to save the original file on disk.
x tells tar to extract the file (as opposed to, say, creating an archive).
z tells tar that the archive is in gzip format. On modern tars, this is optional.
f tells tar to use, as input, the file name specified.
For the split command:
-l tells split to split files limited by number of lines (as opposed to, say, bytes).
-d tells split to use numeric suffixes for the output files.
- tells split to get its input from stdin.
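As a concrete illustration: with a 20.5-million-line file and the options above, split writes 21 pieces named
prefix.00  prefix.01  ...  prefix.20
with the last one holding the final half million lines; without -d and a prefix, the default names would be xaa, xab, and so on.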
You can use the --to-stdout (or -O) option in tar to send the output to stdout.
Then use sed to specify which set of lines you want.
#!/bin/bash
l=1
inc=1000000
p=1
while test $l -lt 21000000; do
    e=$(($l+$inc-1))    # last line of this chunk, so chunks don't overlap
    tar --to-stdout -xzf myfile.tar.gz file-to-extract.txt |
        sed -n -e "$l,$e p" > part$p.txt
    l=$(($l+$inc))
    p=$(($p+1))
done
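One caveat: each pass through the loop decompresses the entire archive again. You can at least make sed stop reading as soon as it passes the end of the current chunk by appending a quit command, which also terminates tar early via SIGPIPE:
sed -n -e "$l,$e p" -e "${e}q" > part$p.txt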
Here's a pure Bash solution for option #1, automatically splitting lines into multiple output files.
#!/usr/bin/env bash
set -eu
mkdir -p out                 # the output directory must exist
filenum=1
chunksize=1000000
ii=0
: > "out/file.$filenum"      # start the first chunk empty
while IFS= read -r line; do
    if [ "$ii" -ge "$chunksize" ]; then
        ii=0
        filenum=$((filenum + 1))
        : > "out/file.$filenum"    # start the next chunk empty
    fi
    printf '%s\n' "$line" >> "out/file.$filenum"
    ii=$((ii + 1))
done
This will take any lines from stdin and create files like out/file.1 with the first million lines, out/file.2 with the second million lines, etc. Then all you need is to feed the input to the above script, like this:
tar xfzO big.tar.gz | ./split.sh
This will never save any intermediate file on disk, or even in memory. It is entirely a streaming solution. It's somewhat wasteful of time, but very efficient in terms of space. It's also very portable, and should work in shells other than Bash, and on ancient systems with little change.
You can use
sed -n 1,20p /Your/file/Path
where 1 is the first line number and 20 is the last. For example, appending those lines to another file could look like
sed -n 1,20p /Your/file/Path >> file1
and you can put the start and end line numbers in variables and use them accordingly.
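A minimal sketch of that last suggestion (the variable names are mine):
start=5
end=20
sed -n "${start},${end}p" /Your/file/Path >> file1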

Using bash and sed

Okay, so I'm not too great at this, but I have a bash script to pick a random number, then use sed to read lines off of files.
It's not working and I must have done something wrong. Could anyone correct my code?
I want the code to pull the line (random number) from each of those files, then output it as a single string (with spaces between).
NUMBER=$[ ( $RANDOM % 100 ) + 1 ]
sed -n NUMBER'p' /Users/user/Desktop/Street.txt
sed -n NUMBER'p' /Users/user/Desktop/City.txt
sed -n NUMBER'p' /Users/user/Desktop/State.txt
sed -n NUMBER'p' /Users/user/Desktop/Zip.txt
You probably need to use $NUMBER in your sed commands, rather than just NUMBER (or ${NUMBER} if other text is directly next to it). Example:
sed -n "${NUMBER}p" /Users/user/Desktop/Street.txt
The following script will use the same randomly chosen number to grab that line from each of the 4 input files you specified and concatenate those lines into a single variable called $outstring.
#!/bin/bash
NUMBER=$(((RANDOM % 100)+1))
for file in Street City State Zip; do
    outstring+="$(sed -n "${NUMBER}p" "./${file}.txt") "
done
echo "$outstring"
Note: If you want (potentially) different line numbers from each of the 4 input files, then simply put the NUMBER= statement inside the for-loop.
This has the advantage of choosing from the whole of each file rather than only the first 100 lines. It will choose a different line from each file.
for f in Street City State Zip
do
    printf '%s ' "$(shuf -n 1 "/Users/user/Desktop/$f.txt")"
done
printf '\n'
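Note that shuf is a GNU coreutils tool and is not part of stock macOS (which the /Users/... paths suggest). If it's unavailable, the loop body above could be replaced with a plain-bash approximation (a sketch; $RANDOM tops out at 32767, which is fine for files of this size):
n=$(wc -l < "/Users/user/Desktop/$f.txt")    # count the lines
printf '%s ' "$(sed -n "$(( (RANDOM % n) + 1 ))p" "/Users/user/Desktop/$f.txt")"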
