Sort scientific and float - shell

I have been trying desperately to use the sort command to sort a mixture of scientific-notation and floating-point values, both positive and negative, e.g.:
-2.0e+00
2.0e+01
2.0e+02
-3.0e-02
3.0e-03
3.0e-02
Without the floating point or without the scientific exponent, it works just fine with
sort -k1 -g file.dat. Using both at once as stated before, it results in:
-3.0e-02
-2.0e+00
2.0e+01
2.0e+02
3.0e-02
3.0e-03
This is obviously wrong since it should be:
-2.0e+00
-3.0e-02
3.0e-03
3.0e-02
...
Any idea how I can solve this issue? And once it is solved, is there any way to sort by absolute value (i.e. ignore the minus signs)? I know I could square each value, sort, and take the square root, but that would lose precision, and it would be neat to have a nice, fast and straightforward way.
My linux system: 8.12, Copyright © 2011
Thank you very much!
UPDATE: if I run it in debug mode, sort -k1 -g filename.dat --debug, I get the following result (translated into English; the original output was German):
sort: using ‘de_DE.UTF-8’ sorting rules
sort: key 1 is numeric and spans multiple fields
-3.0e-02
__
________
-2.0e+00
__
________
2.0e+01
_
_______
2.0e+02
_
_______
3.0e-02
_
_______
3.0e-03
_
_______

Based on the comments under the question, this is a locale issue: sort is using a locale that expects , as the decimal separator, while your data uses . instead. The ideal solution would be to make sort use a different locale, and hopefully someone will write a correct answer covering that.
But if you can't, or don't want to, change how sort works, you can change the input it gets. The easiest way is to feed sort through a pipe and modify the data on the way. Here it is enough to change every . to ,, so the tool of choice is tr:
cat file.dat | tr . , | sort -k1 -g
This solution has one big drawback: if the command is executed under a locale where sort uses . as the decimal separator, it will break the sorting instead of fixing it. So if you are writing a shell script that may be used elsewhere, don't do this.
Important note: the command above has an unnecessary use of cat. Anyone who wants to be taken seriously as a professional shell script programmer should not do that!
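For comparison, here is a minimal Python sketch (not part of the original answer) of the ordering the question expects, plus the sort by absolute value it also asks about. It parses the values itself, so the locale's decimal separator does not come into play. The list lines simply repeats the sample data from the question.

```python
# Sample values copied from the question.
lines = ["-2.0e+00", "2.0e+01", "2.0e+02", "-3.0e-02", "3.0e-03", "3.0e-02"]

# float() always expects '.' as the decimal separator, whatever the locale is.
by_value = sorted(lines, key=float)

# Sort by magnitude: compare abs(value) but keep the original strings,
# so nothing is squared and no precision is lost.
by_magnitude = sorted(lines, key=lambda s: abs(float(s)))

print("\n".join(by_value))
print()
print("\n".join(by_magnitude))
```

Because only the sort key is converted, the output lines stay byte-for-byte identical to the input lines.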

Related

Using sed (or other GNU/Linux tool) to find GPS within a bounding box?

I'm looking to filter from a very large csv file down to a smaller one using a broad stroke command line tool.
The example data is here:
2021-03-19 09:37:00,LISBON,39.1660,-9.5114,18.5600,60.3886
2021-03-19 09:38:00,LISBON,38.8799,-9.3713,19.1051,27.9254
2021-03-19 09:39:00,LISBON,38.5964,-8.8315,19.1044,29.2456
2021-03-19 09:40:00,LISBON,38.4241,-8.9433,18.1184,35.7412
2021-03-19 09:41:00,LISBON,38.8015,-8.6765,17.7960,41.2380
2021-03-19 09:42:00,LISBON,38.4844,-9.0106,19.4660,27.1470
2021-03-19 09:43:00,LISBON,38.3213,-8.9620,19.7043,45.5808
2021-03-19 09:44:00,LISBON,38.9479,-9.1680,19.0704,26.8376
2021-03-19 09:45:00,LISBON,37.9198,-9.2775,17.8219,88.4726
The third and fourth fields here are GPS coordinates.
I'd like to be able to filter them down to within ~25 km of a central point 38.7077507, -9.1365919 and sed is very effective for this.
For example - sed -n '/38.7[2-4]..,-9.1[3-7]../p' gets pretty close.
HOWEVER, I'd like to make the 'bounding box' bigger, and this is where things get a bit confusing. For example, let's say I wanted to extend the longitude range all the way to -8.9. How do you write a regex for this?
I tried something like sed -n '/38.7[2-4]..,-[8-9]...../p', but the problem is that this returns '-8.1' which is too far, when I want to stop it at '-8.9'.
I know that if I got it into a richer language (e.g. Python) this is pretty straightforward, but I'd like to do as much as possible on the front end (before I ingest it into the data pipeline), and sed is extremely performant for this.
Thanks!
Wouldn't want to abuse sed for this, so here's an awk solution.
awk -F, '{x=38.7077507-$3; y=-9.1365919-$4; if(x^2+y^2<0.3^2) print}' input.txt
# x: latitude offset from the centre, y: longitude offset from the centre, r = 0.3: radius in degrees
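The awk one-liner measures distance in degrees, which is only a rough stand-in for kilometres. If the pipeline can afford a Python step, a hedged sketch using the haversine great-circle formula and the 25 km radius from the question could look like this. This is a swapped-in technique, not what the answer does; the names haversine_km and within_radius and the mean Earth radius of 6371 km are assumptions of this sketch.

```python
import csv
import io
import math

CENTER_LAT, CENTER_LON = 38.7077507, -9.1365919  # central point from the question
RADIUS_KM = 25.0
EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption of this sketch)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def within_radius(rows, radius_km=RADIUS_KM):
    """Yield CSV rows whose 3rd/4th fields lie within radius_km of the centre."""
    for row in rows:
        lat, lon = float(row[2]), float(row[3])
        if haversine_km(CENTER_LAT, CENTER_LON, lat, lon) <= radius_km:
            yield row

# Two rows copied from the question's sample data.
sample = (
    "2021-03-19 09:38:00,LISBON,38.8799,-9.3713,19.1051,27.9254\n"
    "2021-03-19 09:44:00,LISBON,38.9479,-9.1680,19.0704,26.8376\n"
)
for row in csv.reader(io.StringIO(sample)):
    dist = haversine_km(CENTER_LAT, CENTER_LON, float(row[2]), float(row[3]))
    print(row[0], round(dist, 1), "km")
```

To filter a real file, pass csv.reader(open(path, newline="")) to within_radius and write the yielded rows out with csv.writer.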

Explain how this command finds the 5 most CPU intensive processes

If I have the command
ps | sort -k 3,3 | tail -n 5
How does this find the 5 most CPU intensive processes?
I get that it is taking all the processes, sorting them based on a column through the -k option, but what does 3,3 mean?
You can find what you are looking for in the official manual of sort (info sort on Linux); in particular, you are interested in the following extracts:
‘-k POS1[,POS2]’
‘--key=POS1[,POS2]’
Specify a sort field that consists of the part of the line between
POS1 and POS2 (or the end of the line, if POS2 is omitted),
_inclusive_.
and, skipping a few paragraphs,
Example: To sort on the second field, use ‘--key=2,2’ (‘-k 2,2’).
See below for more notes on keys and more examples. See also the
‘--debug’ option to help determine the part of the line being used
in the sort.
So, basically, 3,3 emphasises that only the third column shall be considered for sorting, and the others will be ignored.
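The difference between -k 3,3 and a plain -k 3 can be illustrated with a small Python sketch. It is a simplification: fields are split on whitespace, keys are compared as plain text, and ties simply keep their input order. The rows are hypothetical and exist only to make the difference visible.

```python
# Hypothetical rows; the third whitespace-separated field is the sort column.
rows = ["a b 2 z", "a b 10 y", "a b 2 x"]

def key_field_3_only(line):
    # Like -k 3,3: the key starts and ends at field 3.
    return line.split()[2]

def key_field_3_to_end(line):
    # Like -k 3: the key runs from field 3 to the end of the line.
    return " ".join(line.split()[2:])

sorted_3_3 = sorted(rows, key=key_field_3_only)
sorted_3 = sorted(rows, key=key_field_3_to_end)

print(sorted_3_3)
print(sorted_3)
```

With the key limited to field 3, the two rows containing 2 compare as equal on the key; with the open-ended key, the trailing fields z and x decide their order.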

Use cut and grep to separate data while printing multiple fields

I want to separate data by week. The week is stated in a specific field on every line, and I would like to know how to use grep, cut, or anything else relevant on JUST that field while still keeping the rest of each line. The data has to be piped in via |, because that is how the rest of my program needs it.
As the output gets processed, it should look something like this:
asset.14548.extension 0
asset.40795.extension 0
asset.98745.extension 1
I want to sort those names by their week number while keeping the asset name in my output, because the number of times each asset shows up is counted. My problem is that I can't make my program match just the "1" in the week number while ignoring the "1" inside the asset name.
UPDATE
The closest answer I found was
grep "^.........................$week" ;
That's good, but it relies on every string being the same length. Is there a way I can have it start from the right instead of the left? Because if so then that'd answer my question.
^ tells grep to start checking from the left and . tells grep to ignore whatever's in that space
I found what I was looking for in some documentation. Anchor matches!
grep "$week$" file
would output this if $week was 0
asset.14548.extension 0
asset.40795.extension 0
I couldn't find my exact question or a closely similar question with a simple answer, so hopefully it helps the next person scratching their head on this.
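One caveat with a pattern anchored only at the end of the line: it also matches a longer number that merely ends in the same digits (week 10 when looking for week 0). A minimal Python sketch that compares the whole last field instead is shown below; the function name lines_for_week and the week-10 row are hypothetical.

```python
def lines_for_week(lines, week):
    """Yield lines whose last whitespace-separated field equals the week number."""
    wanted = str(week)
    for line in lines:
        fields = line.split()
        # Compare the whole last field, so "10" is not mistaken for "0".
        if fields and fields[-1] == wanted:
            yield line

data = [
    "asset.14548.extension 0",
    "asset.40795.extension 0",
    "asset.98745.extension 1",
    "asset.11111.extension 10",  # hypothetical row, not from the question
]
for line in lines_for_week(data, 0):
    print(line)
```

In a pipeline the same function can be fed from sys.stdin, since it accepts any iterable of lines.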

less-like pager for (swi) prolog

The typical workflow in unix is to use a pipeline of filters ending up with a pager such as less. E.g. (omitting arguments)
grep | sed | awk | less
Now, one of the typical workflows in the swi-prolog's command line is asking it to give the set of solutions for a given conjunction like
foo(X),bar(X, Y),qux(buz, Y).
It readily gives me the set of solutions, which can be much longer than the terminal window. Or a single query
give_me_long_list(X).
can give a very long list again not fitting on the screen. So I constantly find myself in situations where I want to slap |less at the end of the line.
What I am looking for is a facility to open in a pager a set of solutions or just a single large term. Something similar to:
give_me_long_list(X), pager(X).
or
pager([X,Y], (foo(X),bar(X, Y),qux(buz, Y))).
This is not a complete solution, but wouldn't it be rather easy to write your own pager predicate? Steps:
Create temp file
dump X into temp file with the help of these or those predicates
(I haven't done any I/O with Prolog yet, but it doesn't seem too messy)
make a system call to less <tempfile>
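The three steps above can be sketched as follows. This is written in Python rather than Prolog, because the answer leaves the Prolog I/O predicates unspecified; it only shows the shape of the idea. The function name page and its pager parameter are assumptions of this sketch.

```python
import os
import subprocess
import tempfile

def page(text, pager=("less",)):
    """Write text to a temporary file, open it in a pager, then clean up."""
    # Step 1 and 2: create the temp file and dump the text into it.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(text)
        path = tmp.name
    try:
        # Step 3: run the pager on the file; it owns the terminal until it exits.
        return subprocess.call(list(pager) + [path])
    finally:
        os.remove(path)
```

Calling page("\n".join(solutions)) from an interactive session would open the joined solutions in less, assuming less is installed and on the PATH.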

How do you generate passwords? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
How do you generate passwords?
Random Characters?
Passphrases?
High Ascii?
Something like this?
cat /dev/urandom | strings
Mac OS X's "Keychain Access" application gives you access to the nice OS X password generator. Hit command-N and click the key icon. You get to choose password style (memorable, numeric, alphanumeric, random, FIPS-181) and choose the length. It also warns you about weak passwords.
Use this & thumbs up :)
cat /dev/urandom | tr -dc 'a-zA-Z0-9-!##$%^&*()_+~' | fold -w 10 | head -n 1
Change the head count to generate that number of passwords.
A short python script to generate passwords, originally from the python cookbook.
#!/usr/bin/env python3
import getopt
import string
import sys
# secrets.choice draws from the OS random source; random.choice is not meant for passwords
from secrets import choice

def GenPasswd():
    # Build an 8-character password one character at a time
    chars = string.ascii_letters + string.digits
    newpasswd = ""
    for i in range(8):
        newpasswd = newpasswd + choice(chars)
    return newpasswd

def GenPasswd2(length=8, chars=string.ascii_letters + string.digits):
    # Same idea as GenPasswd, with configurable length and character set
    return ''.join(choice(chars) for i in range(length))

class Options(object):
    pass

def main(argv):
    # -r/--repeat: how many passwords to print; -l/--length: characters per password
    (optionList, args) = getopt.getopt(argv[1:], "r:l:", ["repeat=", "length="])
    options = Options()
    options.repeat = 1
    options.length = 8
    for (key, value) in optionList:
        if key == "-r" or key == "--repeat":
            options.repeat = int(value)
        elif key == "-l" or key == "--length":
            options.length = int(value)
    for i in range(options.repeat):
        print(GenPasswd2(options.length))

if __name__ == "__main__":
    sys.exit(main(sys.argv))
The open source Keepass tool has some excellent capabilities for password generation, including enhanced randomization.
I use password safe to generate and store all my passwords, that way you don't have to remember super strong passwords (well except the one that unlocks your safe).
A slight variation on your suggestion:
head -c 32 /dev/random | base64
Optionally, you can trim the trailing = and use echo to get a newline:
echo $(head -c 32 /dev/random | base64 | head -c 32)
which gives you a more predictable output length password whilst still ensuring only printable characters.
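The same idea (random bytes, base64-encoded, cut to a fixed length) can be sketched in Python. This is a hedged equivalent rather than a translation: it uses the secrets module as the random source instead of reading /dev/random, and the function name random_base64_password is an assumption.

```python
import base64
import secrets

def random_base64_password(length=32):
    """Return the first `length` characters of 32 random bytes in base64."""
    # 32 random bytes encode to 44 base64 characters, the last one being '='.
    encoded = base64.b64encode(secrets.token_bytes(32)).decode("ascii")
    # Cutting at 43 characters or fewer also drops the trailing '=' padding.
    return encoded[:length]

print(random_base64_password())
```

As with the shell version, the result contains only letters, digits, + and /.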
The standard Unix utility called pwgen.
Available in practically any Unix-like distribution.
The algorithm in apg is pretty cool. But I mostly use random characters from a list which I've defined myself. It is mostly numbers, upper- and lowercase letters and some punctuation marks. I've eliminated chars which are prone to getting mistaken for another character like '1', 'l', 'I', 'O', '0' etc.
I don't like random character passwords. They are difficult to remember.
Generally my passwords fall into tiers based on how important that information is to me.
My most secure passwords tend to use a combination of old BBS random generated passwords that I was too young and dumb to know how to change and memorized. Appending a few of those together with liberal use of the shift key works well. If I don't use those I find pass phrases better. Perhaps a phrase from some book that I enjoy, once again with some mixed case and special symbols put in. Often I'll use more than 1 phrase, or several words from one phrase, concatenated with several from another.
On low priority sites my passwords are pretty short, generally a combination of a few familiar tokens.
The place I have the biggest problem is work, where we need to change our password every 30 days and can't repeat passwords. I just do like everyone else, come up with a password and append an ever increasing index to the end. Password rules like that are absurd.
For web sites I use SuperGenPass, which derives a site-specific password from a master password and the domain name, using a hash function (based on MD5). No need to store that password anywhere (SuperGenPass itself is a bookmarklet, totally client-side), just remember your master password.
I think it largely depends on what you want to use the password for, and how sensitive the data is. If we need to generate a somewhat secure password for a client, we typically use an easy to remember sentence, and use the first letters of each word and add a number. Something like 'top secret password for use on stackoverflow' => 'tspfuos8'.
Most of the time however, I use the 'pwgen' utility on Linux to create a password, you can specify the complexity and length, so it's quite flexible.
I use KeePass to generate complex passwords.
I use https://www.grc.com/passwords.htm to generate long password strings for things like WPA keys. You could also use this (via screenscraping) to create salts for authentication password hashing if you have to implement some sort of registration site.
In some circumstances, I use Perl's Crypt::PassGen module, which uses Markov chain analysis on a corpus of words (e.g. /usr/share/dict/words on any reasonable Unix system). This allows it to generate passwords that turn out to be reasonably pronounceable and thus easy to remember.
That said, at $work we are moving to hardware challenge/response token mechanisms.
Pick a strong master password however you like, then generate a password for each site with cryptohash(masterpassword+sitename). You will not lose your password for site A if your password for site B gets into the wrong hands (due to an evil admin, wlan sniffing or site compromise for example), yet you will only have to remember a single password.
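A minimal sketch of this derivation in Python is shown below. The answer only says cryptohash(...); this sketch swaps in PBKDF2-HMAC-SHA256 with the site name as the salt, which is a deliberately slow relative of a plain hash. The function name site_password, the iteration count and the base64 output format are all assumptions of the sketch.

```python
import base64
import hashlib

def site_password(master_password, site_name, length=16):
    """Derive a site-specific password from a master password and a site name."""
    digest = hashlib.pbkdf2_hmac(
        "sha256",
        master_password.encode("utf-8"),
        site_name.encode("utf-8"),  # the site name acts as the salt
        100_000,                    # iteration count (assumption of this sketch)
    )
    # Encode the 32-byte digest as printable text and cut it to the wanted length.
    return base64.b64encode(digest).decode("ascii")[:length]

print(site_password("hypothetical master password", "example.com"))
```

The derivation is deterministic, so nothing has to be stored: the same master password and site name always give the same site password, while a different site name gives an unrelated one.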
Having read and tried out some of the great answers here, I was still in search of a generation technique that would be easy to tweak and used very common Linux utils and resources.
I really liked the gpg --gen-random answer but it felt a bit clunky?
I found this gem after some further searching
echo $(</dev/urandom tr -dc A-Za-z0-9 | head -c8)
I used an unusual method of generating passwords recently. They didn't need to be super strong, and random passwords are just too hard to remember. My application had a huge table of cities in North America. To generate a password, I generated a random number, grabbed a random city, and added another random number.
boston9934
The lengths of the numbers were random, (as was if they were appended, prepended, or both), so it wasn't too easy to brute force.
Well, my technique is to use first letters of the words of my favorite songs. Need an example:
Every night in my dreams, I see you, I feel you...
Gives me:
enimdisyify
... and a little inserting of numbers, e.g. i=1, o=0 etc...
en1md1sy1fy
... capitalization? Always give importance to yourself :)
And the final password is...
en1Md1sy1fy
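The steps above (first letters, digit substitutions, then capitalising a chosen word) can be sketched in a few lines of Python. The function name song_password and its parameters are hypothetical; the lyric and the i=1, o=0 substitutions come from the answer.

```python
import re

def song_password(lyric, substitutions=None, capitalize_words=()):
    """Build a password from the first letter of each word in a lyric."""
    substitutions = substitutions or {}
    out = []
    for word in re.findall(r"[A-Za-z']+", lyric):
        initial = word[0].lower()
        if word.lower() in capitalize_words:
            # Chosen words contribute an upper-case initial.
            out.append(initial.upper())
        else:
            # Other initials are swapped for a digit when a substitution exists.
            out.append(substitutions.get(initial, initial))
    return "".join(out)

lyric = "Every night in my dreams, I see you, I feel you..."
print(song_password(lyric))
print(song_password(lyric, {"i": "1", "o": "0"}))
print(song_password(lyric, {"i": "1", "o": "0"}, capitalize_words={"my"}))
```

The three calls reproduce the three stages shown in the answer.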
Joel Spolsky wrote a short article: Password management finally possible
…there's finally a good way to manage all your passwords. This system works no matter how many computers you use regularly; it works with Mac, Windows, and Linux; it's secure; it doesn't expose your passwords to any internet site (whether or not you trust it); it generates highly secure, random passwords for each and every site, it's fairly easy to use once you have it all set up, it maintains an automatic backup of your password file online, and it's free.
He recommends using DropBox and PasswordSafe or Password Gorilla.
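import secrets  # draws from the OS random source; the random module is not meant for passwords
length = 12
charset = "abcdefghijklmnopqrstuvwxyz0123456789"
password = ""
for i in range(length):
    # append one randomly chosen character per iteration
    password += secrets.choice(charset)
print(password)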
passwords:
$ gpg --gen-random 1 20 | gpg --enarmor | sed -n 5p
passphrases:
http://en.wikipedia.org/wiki/Diceware
Mostly, I type dd if=/dev/urandom bs=6 count=1 | mimencode and save the result in a password safe.
On a Mac I use RPG.
In PHP, by generating a random string of characters from the ASCII table. See Generating (pseudo)random alpha-numeric strings
I start with the initials of a sentence in a foreign language, with some convention for capitalizing some of them. Then, I insert in a particular part of the sentence a combination of numbers and symbols derived from the name of the application or website.
This scheme generates a unique password for each application that I can re-derive each time in my head with no trouble (so no memorization), and there is zero chance of any part of it showing up in a dictionary.
You will have to code extra rules to check that your password is acceptable for the system you are writing it for. Some systems have policies like "two digits and two uppercase letters minimum" and so on. As you generate your password character by character, keep a count of the digits/alpha/uppercase as required, and wrap the password generation in a do..while that will repeat the password generation until (digitCount>1 && alphaCount>4 && upperCount>1), or whatever.
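A minimal Python sketch of this generate-and-retry loop follows. The policy used here (at least two digits and two uppercase letters) is the example policy quoted above; the function names, the default length and the alphabet are assumptions, and a real system's rules would replace meets_policy.

```python
import secrets
import string

def meets_policy(password, min_digits=2, min_upper=2):
    """Count character classes and check them against the minimums."""
    digits = sum(c.isdigit() for c in password)
    uppers = sum(c.isupper() for c in password)
    return digits >= min_digits and uppers >= min_upper

def generate_password(length=12, min_digits=2, min_upper=2):
    """Keep generating random candidates until one satisfies the policy."""
    alphabet = string.ascii_letters + string.digits
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if meets_policy(candidate, min_digits, min_upper):
            return candidate

print(generate_password())
```

Rejecting whole candidates, rather than patching in the missing characters, keeps every accepted password uniformly distributed over the set of strings that satisfy the policy.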
http://www.wohmart.com/ircd/pub/irc_tools/mkpasswd/mkpasswd+vms.c
http://www.obviex.com/Samples/Password.aspx
https://www.uwo.ca/its/network/security/passwd-suite/sample.c
Even in Excel!
https://web.archive.org/web/1/http://articles.techrepublic%2ecom%2ecom/5100-10878_11-1032050.html
http://webnet77.com/cgi-bin/helpers/crypthelp.pl
Password Monkey, iGoogle widget!
The Firefox-addon Password Hasher is pretty awesome for generating passwords: Password Hasher
The website also features an online substitute for the addon:
Online Password Hasher
I generate random printable ASCII characters with a Perl program and then tweak the script if there are extra rules, to help me generate a more "secure" password. I can keep the password on a post-it note and then destroy it after one or two days; my fingers will have memorized it, and my password will be completely unguessable.
This is for my primary login password, something I use every day, and in fact many times a day as I sit down and unlock my screen. This makes it easy to memorize fast. Obviously passwords for other situations have to use a different mechanism.
