Using GPG to Encrypt Your Data (nasa.gov)
236 points by maxt on Jan 12, 2017 | 99 comments



For GPG symmetric encryption, the kind the article describes, here are the best options I've found for my typical case:

   gpg --symmetric \
   --cipher-algo aes256 \
   --digest-algo sha256 \
   --cert-digest-algo sha256 \
   --compress-algo none -z 0 \
   --quiet --no-greeting \
   --no-use-agent "$@"
I keep this command here:

    https://github.com/SixArm/gpg-encrypt
The options are chosen to balance tradeoffs of convenience, strength, and portability.


You can also explore the --s2k-* options to add a level of difficulty to hashing the password to protect against brute-forcing. E.g.

  --s2k-mode 3 
  --s2k-digest-algo sha512 
  --s2k-count 1000000
Default is to use salt + one round of SHA1, which is relatively weak.
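
As a rough sketch, the stretching options above can be combined with the cipher choice from the parent comment in one small wrapper. The function name `encrypt_strong` and the `GPG` override variable are assumptions made up for illustration; only the gpg flags come from this thread.

```shell
#!/bin/sh
# Hypothetical wrapper: symmetric encryption with salted, iterated key stretching.
# GPG can be overridden (e.g. GPG=echo) to preview the command without running gpg.
encrypt_strong() {
  "${GPG:-gpg}" --symmetric \
    --cipher-algo aes256 \
    --s2k-mode 3 \
    --s2k-digest-algo sha512 \
    --s2k-count 1000000 \
    "$@"
}

# Usage (prompts for a passphrase):
#   encrypt_strong secret.txt
```

Running it with `GPG=echo` prints the assembled options instead of encrypting, which is a cheap way to check what would be passed.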


Thank you. I added your options to the repo and thanked you there too.


What is this getting you that a simple 'gpg -c' isn't?

(I'm asking seriously; I don't have a strong opinion about GPG command line arguments)


The defaults he has chosen hint at a distro that ships GPG v1 / classic, which has stranger defaults than GPG v2. IIRC the default cipher is CAST5, (very slow) compression is on by default, the hashes are RIPEMD I believe, and so on.


Exactly. The defaults of GPG 1, the one which can be used as a single not-too-big binary, seem to be poor.

GPG 2 has better defaults, but it grew to include the kitchen sink and has a lot of moving parts, and I'd still prefer a good smaller program that does only a few things but does them well. My ideal would be a small executable with as few dependencies as possible.



The GPG1 defaults are definitely not great, but I'm not sure they have much practical impact for this use case.


What is "this use case" for you? The topic of the article is symmetric encryption, and whoever uses the defaults is much more likely to produce files that could be compromised. That is a huge practical impact for me. Even when not using the defaults, the passphrases have to be really, really long to keep the content safe.

http://security.stackexchange.com/questions/15632/what-is-pu...

"In GnuPG 1.4.12 defaults are (found experimentally):

--s2k-mode = 3

--s2k-digest-algo = SHA1 (supports MD5, RIPEMD-160, SHA2s too)

--s2k-count = 65536 (supports from 1024 to 65011712)

--s2k-cipher-algo = CAST5 (supports 3DES, CAST5, Blowfish, AES, Twofish, Camellia too)"


That's not at all true. Simple symmetric offline encryption of files is one of the few crypto operations that is easy to get right. The GPG1 defaults aren't great, but they aren't going to get your files compromised. And with the command line options presented upthread, you have the passphrase problem either way.


The symmetric algorithm aside, if we just look at the key derivation, the --s2k* parameters go up to 65011712 rounds of SHA512. If you maxed out the --s2k* settings, the difference from the 1.4.12 default of 65536 rounds of SHA1 is not staggering, but not trivial either: 10 extra bits from the additional rounds and an additional 3-4 bits from going from straight SHA1 to straight SHA512, on modern GPUs (https://gist.github.com/epixoip/a83d38f412b4737e99bbef804a27...).

An additional 13 bits of safety margin basically gives you an extra Diceware word (log2(7776)), which, I agree, isn't a magical solution at all, but would to me cross the threshold of "it has some actual impact".
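
The arithmetic behind those figures can be rechecked with a quick sketch like the one below. The helper name `log2` is an assumption; the counts are the ones quoted in this thread, and the 3-4 bits attributed to SHA1 versus SHA512 come from the linked benchmark, not from this calculation.

```shell
#!/bin/sh
# log2 of a positive number to two decimals (awk's log() is the natural logarithm).
log2() {
  LC_ALL=C awk -v x="$1" 'BEGIN { printf "%.2f\n", log(x) / log(2) }'
}

max_count=65011712      # highest --s2k-count quoted above
default_count=65536     # GnuPG 1.4.12 default quoted above

ratio=$(( max_count / default_count ))
echo "extra rounds factor: $ratio"
echo "bits gained from extra rounds: $(log2 "$ratio")"
echo "bits per Diceware word: $(log2 7776)"
```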

Of course, having much better usability for the average user, or just breaking OpenPGP compatibility so that clean, modern, robust constructions like NaCl/libsodium run underneath, would be far better ways to get good security margins, but here we are.


Thanks a lot for the link, it's a wonderful overview of the password cracking capabilities of the modern GPUs.


> they aren't going to get your files compromised.

The default encryption is CAST5 which is a 64-bit block size cipher (even if it is confusingly called "CAST-128").

The default password derivation is using SHA1.

That's the reason people change the defaults. If you like them, you're of course free to use them or recommend them to your clients. Good luck. Of course I'd also like to read your explanation of how you can consider 64 bits "secure enough" today (or what you consider them secure enough for). Also your estimate of how expensive it would be to brute force shorter passwords given the traditionally small number of default rounds of SHA1. Thanks.


Neither of those two things matters very much for file encryption. The short block size, for instance, is a very big deal with online encryption, but not a dealbreaker for offline encryption.


> The short block size ... not a dealbreaker for offline encryption.

Which scenarios do you assume to be valid for offline encryption which don't make short block sizes problematic?

Why is poor password handling not a problem under these scenarios?


Neither gpg2 nor gpg1's defaults make short passwords safe; really, though, with a single targeted password, your passphrase needs to be extreme no matter what settings you use.

I'm not sure why an 8 byte block would materially impact file encryption. The kinds of attacks where short blocks come in handy are all online, CCA-style attacks. You might worry about things like CTR counter block sizes, but, again, not an issue for GPG1's defaults.

I'm not saying they're good settings. And: in particular, if you used them to encrypt something like session cookies, you could have serious vulnerabilities. But like I said: it's easy to encrypt files, and some things that are survivable for files aren't for other applications.


My main gripe with them, apart from being a bit obscure, is that they're bog-slow. They give me something like two dozen MB/s encryption/decryption speed, on a machine that can do AEAD at 2.5-4 GB/s (AES-GCM or Chapoly). A large part of that is the compression (zlib-ish I think), though.


One thing that that gives you is the MDC packet, which isn't enabled by default in gpg classic. (Hence the parade of "message was not integrity protected" warnings.)

It can be requested directly via '--force-mdc', but, as long as you're tweaking the configuration, you might as well boost everything up to the full "Grovergeddon" settings.


I did something similar to gpg-encrypt: https://github.com/larose/eef/

It's a wrapper for gpg to edit encrypted files.


Nice script, though I must note that it will only work on Linux. Tried it on macOS, but there is no `/dev/shm`.


Here's an alternative to wrapping GPG, using .gnupg/gpg.conf:

  personal-cipher-preferences AES256 AES
  personal-digest-preferences SHA256 SHA512
  personal-compress-preferences Uncompressed
  default-preference-list SHA256 SHA512 AES256 AES Uncompressed
  
  cert-digest-algo SHA256
  
  s2k-cipher-algo AES256
  s2k-digest-algo SHA256
  s2k-mode 3
  s2k-count 65011712
  
  disable-cipher-algo 3DES
  weak-digest SHA1
  force-mdc
Note that these options impact compatibility with other GPG/PGP clients.


Thanks, I've added your info to the README and a credit to you.


This illustrates what's wrong with GPG: it's too hard to use. Why so many arguments for a common task? Why aren't the defaults acceptable?


The defaults are acceptable, and will produce a symmetrically encrypted file that can be quickly decrypted on even low-powered ARM cores in a reasonable amount of time.

These suggestions strike a different balance between protection and speed.


You can safely just use "gpg -c" to encrypt files.


Does `--no-use-agent` work? I see this in man:

       --no-use-agent
              This is dummy option. gpg always requires the agent.


I'll remove it now. It was useful to have (for me) for GPG version 1 when I connected to machines via SSH and didn't want a GPG agent pop up UI, and didn't have an easy way to change the GPG agent settings.


Contemporary versions of GnuPG (>= 2.1 IIRC) always use gpg-agent, and this option does nothing except produce a warning:

  gpg: WARNING: "--no-use-agent" is an obsolete option - it has no effect


Regarding the code...

    set -euf
    onecmd --args "$@"
The set -u is unneeded, as there are no code variables involved.

The set -e is not needed, as there is only one command, and the script will return the exit status of that command. Always. And will exit after that command. Always.

The set -f will disable globbing, which I'm not sure is what you want when using a simple wrapper that passes "$@" as filenames to gpg...


I disagree. set -eu should be at the top of every bash script.

This is the classic braceless if-guard mistake; leave it out today because you don't need it, forget, add something tomorrow and it breaks.


You can over-rely on "set -e" however:

  #!/usr/bin/env bash
  set -e
  fail() { false ; echo hello; }
  if ! fail; then :; fi
That outputs "hello" and exits with 0.
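
A small sketch of one way around this, assuming you want the function to fail fast even when it is called from a condition: check the failing step explicitly instead of relying on `set -e`. The name `fail_strict` is made up here.

```shell
#!/bin/sh
set -e

# Same function as above: `false` fails, but errexit is suspended while the
# function runs as part of an `if` condition, so `echo` still executes.
fail() { false; echo hello; }

# Hypothetical variant: propagate the failure explicitly.
fail_strict() { false || return 1; echo hello; }

if ! fail; then echo "fail reported an error"; fi
if ! fail_strict; then echo "fail_strict reported an error"; fi
```

The first `if` only prints "hello" because `fail` ends up returning the status of `echo`; the second takes the error branch because `fail_strict` returns 1 before reaching `echo`.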


Not the author of the code, but personally, I don't see any downside to putting "set -eu" at the beginning of every script I write. These should be defaults.


You're correct. I use -euf as a default for new scripts, until someone asks for relaxing these. I'll remove the -f now because you're right, globbing can be useful.


Do you know if gpg embeds a header in the cyphertext? It always bothered me that openssl (for symmetric aes) puts "Salted__" as the first 8 bytes in every encrypted file, because it seems to nullify the "plausible deniability" defense and the "cyphertext should be indistinguishable from random data" tenet. Sure, having "Salted" doesn't prove that AES was run on the following bytes, but there's no plausible explanation as to what other program would do such a thing.


> do you know if gpg embeds a header in the cyphertext ?

    $ file /tmp/something.gpg 
    /tmp/something.gpg: GPG symmetrically encrypted data (AES cipher)
It has to, otherwise you'd have to know and use exactly the same options when decrypting. You could always strip it manually if you don't want this...


If we're talking about GPG, please pay attention to https://www.passwordstore.org/ which is a really cool, open source password manager built on GPG.


I switched from OSX to Linux a few months ago and had to find an alternative for 1Password. Pass has been great, I love its simplicity.

Not having a browser add-on to retrieve passwords was feeling like a step back in terms of convenience though. That's why I built & open sourced browserpass [1], a browser extension for Pass.

It uses Native Host Messaging to securely retrieve passwords, so no crazy port listening as some other open-sourced add-ons do (which is a terribly bad idea).

[1] https://github.com/dannyvankooten/browserpass


Is there anything like this that doesn't leak the folder structure in plaintext? Manually obfuscating site names would be very tedious.


It's open source, you could always submit a patch.


Masterkey [1] might be interesting for you. It's using NACL and Go and stores everything in a single encrypted file.

1: https://github.com/johnathanhowell/masterkey


You may like https://github.com/bwesterb/pol which "is a modern command line password manager with deniable encryption".


How about KeePass?


I personally think it's a pretty good password manager (like all others, database sync is a problem for users to solve). I had been using its Linux port, keepassx (also available for macOS; yes, NOT Mac OS X any more...)

Features: http://keepass.info/features.html

I switched to pass / qtpass (cross-platform Qt front-end for pass) after seeing it on Hacker News (or somewhere else like Twitter) because it uses GnuPG + git (simple and I am capable of both in CLI). Last but not least, pass provides migration scripts from keepass/keepassx (and a lot more... - feel like I had to migrate ;-)


And QtPass with it. https://qtpass.org/


In addition, there's an Android client for it as well, so you can take your passwords on the go:

https://play.google.com/store/apps/details?id=com.zeapo.pwds...


Unless compatibility with gpg is a requirement, I think scrypt[0] is a much simpler tool for file encryption. The utility is meant to showcase the KDF of the same name. It's very simple and has virtually no parameters. So:

  $ xz -k elrond_minutes.txt
  $ scrypt enc elrond_minutes.txt.xz elrond_minutes.txt.xz.enc
  $ signify -S \
      -s vilya.key \
      -m elrond_minutes.txt.xz.enc \
      -x elrond_minutes.txt.xz.enc.sig
  $ rm elrond_minutes.txt{,.xz}
Signing the final output is probably extraneous; I think scrypt uses a HMAC. This involves invoking multiple tools, but since each tool only does one thing it's much easier to reason about, and I prefer this over using an omnibus tool like gpg.

[0] https://github.com/Tarsnap/scrypt


My paranoid self wanted to replace rm with shred.


You can just pipe xz instead, although you may want to shred the original file:

  xz < file | scrypt enc - > file.xz.enc
And I agree: scrypt (the program) is much better for password encrypting documents. It is only a few thousand lines of readable code; it uses modern algorithm choices (scrypt, AES256-CTR, HMAC-SHA256), with no alternatives; there isn't any configuration involved; and it's written by a respected author.


shred is ineffective if you're using a CoW FS, and probably less effective on a journaling FS, and those probably cover 99% of all the FS people use today. Just use FDE.


>We suggest that you include five words of 5-10 letters in size, chosen at random, with spaces, special characters, and/or numbers embedded into words.

>You need to be able to recall the passphrase that was used to encrypt the file.

Why bother writing security guidelines which are impossible for a human to follow?

edit: Try recalling any passphrases generated by the command below, and that's before the random sprinkling of punctuation.

    grep -E "^[a-z]{5,10}$" /usr/share/dict/words | shuf -n5 | tr '\n' ' '
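
For a rough sense of how much entropy such a passphrase carries, assuming every word is drawn uniformly at random and the attacker knows the list, the strength is the word count times log2 of the list size. The helper name `passphrase_bits` is an assumption for illustration, and the dictionary path is simply the one used in the command above.

```shell
#!/bin/sh
# Entropy in bits of a passphrase of $1 words drawn uniformly from a list of $2 words.
passphrase_bits() {
  LC_ALL=C awk -v k="$1" -v n="$2" 'BEGIN { printf "%.2f\n", k * log(n) / log(2) }'
}

# Count the candidate words matched by the grep above, if the dictionary exists.
dict=/usr/share/dict/words
if [ -r "$dict" ]; then
  n=$(grep -cE '^[a-z]{5,10}$' "$dict") || n=0
  if [ "$n" -gt 0 ]; then
    echo "$n candidate words: $(passphrase_bits 5 "$n") bits for 5 words"
  fi
fi
```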


I've been thinking about this for a while, and the early conclusion I've come to is that 64 bits of provable random entropy in a password that's also memorable is a very high bar to clear.

Imagine this, you take four word types/groups, say, substantive, verb, adverb, preposition/place.

You list 128 of each, with every word uniquely identified by its first two letters. You let a machine pick a word from each column at random. The phrase is your mnemonic key, the password (to type in) is the first two letters of each word, concatenated.

If you want to appease password strength checks, capitalise the first letter, and end the input with a period.

So: "girl runs happily up", becomes "giruhaup" (or, with equivalent entropy, but satisfying "at least three symbol groups": "Giruhaup.").
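
The shortening step is easy to sketch in shell. The function name `shorten` is hypothetical, and the sketch only covers the truncation, not the random selection from the word lists.

```shell
#!/bin/sh
# Turn a mnemonic phrase into the typed password: first two letters of each word.
shorten() {
  out=""
  for word in $1; do            # unquoted on purpose: split the phrase on whitespace
    out="$out$(printf '%s' "$word" | cut -c1-2)"
  done
  printf '%s\n' "$out"
}

shorten "girl runs happily up"   # giruhaup
```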

Now, that's then 4 picks out of 128 words, or an encoding of 4 times 7 bits (2^7=128) - 28 bits. You'd need three such passwords concatenated to break past 64 bits of entropy. And you'd have to type in 24 letters. That's pretty hard to type in blind without a typo.

You might be able to use lists of 256 words - but it'd make it a bit more difficult to make the wordlists (because words should be identified by the first two characters) - and you'd still need two "phrases" and type in 16 characters.
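
A quick sketch of that counting, assuming power-of-two list sizes, four words per phrase and two typed characters per word. The function names are made up for illustration.

```shell
#!/bin/sh
# Whole bits encoded by one pick from a list of $1 words (floor of log2).
bits_per_word() {
  n=$1; bits=0
  while [ "$n" -gt 1 ]; do n=$(( n / 2 )); bits=$(( bits + 1 )); done
  echo "$bits"
}

# Phrases of $2 words (from lists of $1 words) needed to reach at least $3 bits.
phrases_needed() {
  per_phrase=$(( $2 * $(bits_per_word "$1") ))
  echo $(( ($3 + per_phrase - 1) / per_phrase ))   # ceiling division
}

for size in 128 256; do
  p=$(phrases_needed "$size" 4 64)
  echo "list size $size: $p phrases, $(( p * 4 * 2 )) characters typed"
done
```

With these assumptions the loop reports 3 phrases (24 characters) for 128-word lists and 2 phrases (16 characters) for 256-word lists.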

Adding random numbers, symbols or capitalization is probably not worth the challenge they add in remembering where they go, for the single/few bits of entropy they add.

And I'm still not convinced 16 characters is short enough to be usable for "most people".


Rather than rolling your own password system, I would recommend diceware.com for strong passwords (including master passwords) that you can memorize (I am bad at memorization, and have memorized 129 bit passwords this way, and 64 bit passwords are kind of a breeze to memorize).

For the long tail of passwords that you shouldn't be memorizing in the first place, a password manager with a good configurable password generator is invaluable. I use Lastpass (I like the breadth of its platform support: all major consumer OSes, all major mobile OSes, extensions for all major browsers). Alternatively, a lot of people recommend 1password.

Diceware has better guarantees, but the password managers are usually much more convenient[1]. I weigh these costs and benefits when choosing which way to go for a particular use case.

[1] With the significant exception of passwords that will regularly have to be typed out on mobile, since diceware passwords are much more virtual keyboard friendly than random character generated passwords. This is partly because you can typically keep the entire thing in your head, not having to reference your password manager multiple times, and partly because they don't rely on special characters for their entropy, so can be typed out on the primary keyboard without switching to numeral or special character keyboards.


The reason I've been thinking about this is that I'm not happy with diceware. Five words (64 bits of "guaranteed" entropy) is around 20 characters - and I'm not sure if diceware loses some entropy if you omit spaces (eg: "at hat" and "a that" both become "athat").

My main takeaway looking at the problem is that 64 bits is a lot to encode in ~26 letters and maybe 10 digits in a way that is easy to remember, easy to type, and easy to read or hear (eg: given a printed initial password, shared over the phone, or doubling as a way to read out a hash/shared key).

My main issue with diceware is the large number of words; almost touching on typical active vocabulary of even native speakers - never mind if your users speak little or no English. One benefit of the system above is that as long as you can come up with four/five sets of 128 words that don't collide among themselves in the groups of 128 - you can adapt the system to any alphabet and preserve any guarantees of entropy. Making a diceware wordlists is a huge undertaking by comparison. (But the benefit is that people have already done this for many languages).


Why truncate the words down to the first two letters? Are you reinventing xkcd 936?


For ease of typing. Typically passwords have to be entered verbatim, without error, and blindly. 16 characters is easier to get right than 60.

And while it might feel good to pretend full words add entropy, if you assume the attacker knows your system - it really doesn't (hence "guaranteed" entropy).

As for diceware, I don't find those passwords easy to remember - especially past 60 bits of entropy. But use what works for you.


> And while it might feel good to pretend full words add entropy, if you assume the attacker knows your system - it really doesn't (hence "guaranteed" entropy).

It does: Munroe's proposed scheme operates on the assumption the attacker knows it. The 11 bits of entropy refer to a dictionary of 2K words to choose from. The reason to type full ones is you're not hamstrung by the "no common prefix" limitation, which allows larger (and easier to remember) dictionaries.

Also, we're talking theory. Typing them blindly is an artificial implementation limitation imposed on us by bad software. Just like "you need at least one digit", "maximum length 16", &c. If you're going to consider those, that's fine, but then you're not talking about actual password theory anymore--you're just discussing how to cope with bad platforms.

Case in point: many good PW forms (OS logins, &c) have no such limitations, and offer a "view password while typing" option.


But there's a reason for hiding password input: [ed: making shoulder surfing a little harder]. Or unlocking a computer that's projecting to an audience. [ed: see also citizenfour where Snowden uses a blanket when typing in a pass phrase].

This is indeed not about password "theory", because experience shows that actual system (in)security happens where computer systems and users interact.

Using a common subset of keyboard layouts for different languages (limiting the character set), being workable on touch screens, are important for security. And using passwords at all is working around "bad platforms".

> The 11 bits of entropy refer to a dictionary of 2K words to choose from. The reason to type full ones is you're not hamstrung by the "no common prefix" limitation, which allows larger (and easier to remember) dictionaries.

From playing with this, I'm not convinced that the tradeoff of using a big dictionary whose words cannot be identified by a short unique prefix (to reduce length) really adds that much, just as increasing the character set beyond 26/36 doesn't help all that much, because you only gain a bit for every doubling in size.

My idea is for the mnemonic to form an actual "story" (in a secure way) - in the hope that it's easier to remember : "boy flies angrily away" than "correct horse battery staple".

A) that may be wrong

B) You still need too many words in order to encode a "high enough" entropy


> The 11 bits of entropy refer to a dictionary of 2K words to choose from. The reason to type full ones is you're not hamstrung by the "no common prefix" limitation, which allows larger (and easier to remember) dictionaries.

Another note on this - assume an average word length of 5 - that's 11/5 or 2.2 bits per character typed (again, assuming the wordlist doesn't lose some bits for "double coding" like "at hat/a that").

At 7 bits per word - of which two characters are enough, we type 7/2 or 3.5 bits per character.

Conversely, we only memorize 7 bits per word vs 11 bits.


Is it really impossible for a human to follow?

"Shiny C0rrect H0rse Battery Staple!"


That's a good long term solution but when policies force you to change your password every 45 days, it falls apart.

In my experience, overly restrictive password policies force users to choose passwords that are less secure and easier to remember.


The good news is that the practice is going away: NIST revised its guidance/recommendation for password cycling.


You can tell a company has this policy when every monitor has a sticky note on it with the numbers 1 to N on it, where 1 to N-1 are crossed out.


Yes indeed. For example they add the current year and month and keep the same "base password" which is unsafe.


"Password2017" is a typical "secure" password. Capital and small letters, and number - longer than 8 characters. Passes most "checks" for passwords...


"Password2017!" is even better. It's got a special character!


My favorite "pattern for stupid passphrase requirements" is "1qaz@WSX" - then just move a row to the right with every password change. :)


Funny how most people go for ! as the default special character :)


It adds to the excitement of logging into an application. Instead of "login", you get to "login!".


I think it's a natural outgrowth of how so many people chose "1" when they were forced to add a number to their passwords.


Embedding special characters only makes it harder to remember correctly yet has little benefit. Your example is the same used in the xkcd where they explain this (except you've added an additional word at the beginning) so you've probably seen it already but I'll link it anyways. https://xkcd.com/936/


At work I constantly deal with people who can't remember passwords as short as 8 characters, you have to remember we're not representative of the average person.


you should ask them what they prefer - remembering 8 random characters or 4 random words


For one specific site, no. But if you have 100-200 different passwords to remember, it's impossible for most people.


I do this... I have 3-4 randomly generated passwords memorized. One for each "important" account (e.g. email, banking).


s/human/me


Key stretching is critical for password-based encryption, and gpg's s2k options are vulnerable to GPU acceleration. Command-line tools to encrypt with bcrypt/scrypt are common and may be a better option.


Is there a benefit in using symmetric encryption vs specifying yourself as a recipient?


The HECC site here is one of the best support sites I've ever seen. Very logically laid out KB, news, ask a question, etc.

thanks for the link


There's the [2015] which should be included.


No it shouldn't. It imposes a (false) perception that anything not from today is old/not fresh/known/bad knowledge. It is not true.

This hunt for dates in titles on HN is bad, and it's a witch hunt these days.

Disclaimer: I'm not an author nor submitter.


Adding the year does nothing other than let people know when it was published and gives it some context. You're reading too much into this. The practice goes back at least 2½ years.


This is “Hacker News”.


The "News" is a bit of a misnomer. Any submission that "gratifies intellectual curiosity" is on-topic, regardless of age.


Why would they not use asymmetric encryption?


Because the paragraph on key generation and management would be 3 times as long as the entire article in its current form ?

Asymmetric encryption solves the problem of transmitting the password safely ("solve" is a rather optimistic word, maybe "delegates" is more appropriate); if you can safely transfer passwords from point to point, then using symmetric encryption is far easier.


Asymmetric cryptography transforms key distribution problems into key management problems.

Which is just a different problem, not necessarily an easier one, like you say.


> Asymmetric cryptography transforms key distribution problems into key management problems.

That's a very nice way to put it, I'm going to reuse that !


More importantly, they're more CPU intensive and slow to deal with large files.


Asymmetric encryption adds constant overhead, independent of message size.

Unless you're doing it wrong.


While not an expert, I disagree: encrypting a 1GB file must differ from encrypting a 1MB file, whether it is symmetric or asymmetric encryption. Normally asymmetric encryption is for keys, while symmetric encryption is for the real content.


...and why encrypt stuff transferred with scp?


Because encryption in transit != encryption at rest. Maybe you don't trust the server you are scp'ing the data to; with encryption at rest you don't need to.


That's not what the documentation is about, though:

====

Use GPG with the cipher AES256, without the --armour option, and with compression to encrypt your files during inter-host transfers. GPG

Encryption helps protect your files during inter-host file transfers (for example, when using the scp, bbftp, or ftp commands). We recommend GPG (Gnu Privacy Guard), an Open Source OpenPGP-compatible encryption system.

===

scp shouldn't be in that list.


If your goal is to transfer securely from person to person, 'scp' generally means there's a common server you're accessing - not that you're 'scp'ing directly to the other user's machine. Keeping it secure when "at rest" on the remote server would ensure it's securely transferred between the two end points.


NASA has historically done at least some open transfers, such as HTTP, FTP, etc. Using GPG for these is good. And it keeps the file encrypted at rest too.


Symmetric is easier to teach to people, especially large groups of people doing tech training.


GPG, of course, allows you to use asymmetric crypto and currently supports RSA, RSA-E, RSA-S, ELG-E, and DSA algorithms for that purpose.

But for bulk data encrypting good symmetric (AES, CAST5, etc.) is both more secure and significantly faster.


You should be aware that even for asymmetric encryption, only a one-time document-specific key will be encrypted with the asymmetric algorithm. The document itself will always be encrypted using a symmetric cipher.

This is how the same document can be encrypted for multiple recipients efficiently, without duplicating all the data. First the document is encrypted with a symmetric key, which is then encrypted with the public key of each recipient. This information will be prepended to the actual encrypted document.

For details see: https://tools.ietf.org/html/rfc4880#section-2.1
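
As a minimal sketch of the multi-recipient case, assuming the recipients' public keys are already in your keyring: gpg accepts `--recipient` more than once, so one invocation covers everyone. The function name `encrypt_for`, the `GPG` override variable and the addresses in the usage comment are placeholders, not anything from the article.

```shell
#!/bin/sh
# Hypothetical helper: encrypt one file for several recipients in a single pass.
# The data is encrypted once with a session key; each --recipient adds one
# public-key-encrypted copy of that session key to the output.
encrypt_for() {
  file=$1; shift
  # Rewrite the argument list from "a b" to "--recipient a --recipient b".
  for r in "$@"; do
    shift
    set -- "$@" --recipient "$r"
  done
  "${GPG:-gpg}" --encrypt "$@" "$file"
}

# Usage (recipient IDs are placeholders):
#   encrypt_for report.pdf alice@example.org bob@example.org
#   gpg --list-packets report.pdf.gpg   # should list one session-key packet per recipient
```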


GPG 2.1 (via libgcrypt) also supports various elliptic curve algorithms for asymmetric crypto, depending on what version of libgcrypt you have.



