So, I've never really implemented any cryptographic algorithms apart from a simple XOR, one-time-pad-like application that just encrypts some test files.
I've briefly touched on encryption before, but a burning question of mine is this: when you encrypt a document with an algorithm such as AES, you obviously need to create a password, and I'm assuming this password will be used as the key. Say I have a document and encrypt it with an application that uses AES, and the program makes me specify a key (which serves as my password). Where will that key be stored? Will it be stored in the file itself? If so, wouldn't this be a weakness? If the password is indeed embedded somewhere in the file (probably appended or prepended to it), couldn't someone use techniques such as deduction or even a brute-force attack to figure it out?
7-Zip also supports encryption with AES-256 algorithm. This algorithm uses cipher key with length of 256 bits. To create that key 7-Zip uses derivation function based on SHA-256 hash algorithm. A key derivation function produces a derived key from text password defined by user. For increasing the cost of exhaustive search for passwords 7-Zip uses big number of iterations to produce cipher key from text password.
The password is not stored inside the file itself. Essentially, you have some password like "Password123". AES-256 requires the cipher key to have 256 bits, so you perform some sort of hash on the password (in this case, SHA-256, with multiple iterations), and that hash becomes the cipher key. (The hash is not stored either, it's just re-computed when you re-type the password when decrypting)
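To make the "re-computed every time" part concrete, here is a minimal sketch. It uses PBKDF2-HMAC-SHA256 from Python's standard library as a stand-in; 7-Zip has its own SHA-256-based derivation, so this is not its actual code. The salt value and the iteration count are arbitrary assumptions for the example. Unlike the password, a salt is not secret and is normally stored alongside the encrypted data.

```python
import hashlib

def derive_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte (256-bit) key from a text password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )

# Assumed example salt; a real tool would generate it randomly per file.
salt = b"example-salt-16b"

key_at_encrypt = derive_key("Password123", salt)  # computed when encrypting
key_at_decrypt = derive_key("Password123", salt)  # re-computed when decrypting
wrong_key = derive_key("Password124", salt)       # a mistyped password

print(len(key_at_encrypt) * 8, "bit key")
print("same password gives same key:", key_at_encrypt == key_at_decrypt)
print("wrong password gives same key:", wrong_key == key_at_encrypt)
```

Neither the password nor the derived key needs to be written to the file; typing the same password again reproduces the same key.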
Ah OK, got you. So the password "Password123" is hashed into a cipher key of a fixed length such as 256 bits, and this cipher key is used to decrypt the encrypted document: if the file was encrypted with that cipher key, then the same cipher key will decrypt it. That being said, the hashing algorithm that generates the cipher key must obviously be deterministic.
Let's say I encrypt a file with AES. Would taking the hash of a password, let's say again "Password123", with a hashing algorithm and then giving it a salt make my password any harder to crack?
I know that there is a thing called a dictionary attack, where permutations or even wordlists are precomputed into hashes and then tried against the file; that's essentially how WPA encryption is cracked.
Would taking the hash of a password, let's say again "Password123", with a hashing algorithm and then giving it a salt make my password any harder to crack?
A salt is used to produce a new secret hash function out of a publicly known hash function. For example,
f(message) = sha256(sha256(message) XOR salt)
In this example, as long as salt is kept secret and even if an attacker knows how to produce SHA-256 collisions, they won't be able to produce a collision for f().
If the salt cannot be kept secret then there's little point in using one.
The only way to make a KDF difficult to crack when the function is known is to make it computationally impractical to brute-force. If the KDF requires so many successive executions of a hash function that even a high-end computer takes several seconds to derive the key from the passphrase, no one is going to bother trying.
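A rough back-of-the-envelope sketch of that argument. All the numbers here are made-up assumptions: a 10-million-entry wordlist, one microsecond per guess for a single plain hash, and two seconds per guess for a deliberately slow KDF, all on a single machine.

```python
def seconds_to_try_all(candidates: int, seconds_per_guess: float) -> float:
    """Total time to test every candidate password, one after another."""
    return candidates * seconds_per_guess

candidates = 10_000_000  # assumed wordlist size

fast = seconds_to_try_all(candidates, 1e-6)  # single cheap hash per guess
slow = seconds_to_try_all(candidates, 2.0)   # expensive KDF per guess
slow_days = slow / 86_400                    # 86,400 seconds in a day

print(f"cheap hash:    about {fast:.0f} seconds")
print(f"expensive KDF: about {slow_days:.0f} days")
```

The legitimate user pays the two seconds once per decryption; the attacker pays it for every single guess.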
When you say "make it computationally impractical to brute-force", are you implying that a longer password would make the difference between a feasible time (to successfully brute-force/crack the password) and a time that isn't feasible (exponentially large)?
So would a password such as "PassPassPassPassWord1234567890@!" be better than "Password1234"?
And would it matter if the password was a long series of arbitrary uppercase and lowercase letters, combined with numbers and some punctuation thrown in for good measure, such as "WxwaYp56Y7ipA900hj716aBbaM@!ki9"?
I probably should rephrase that last question. Both "PassPassPassPassWord1234567890@!" and "WxwaYp56Y7ipA900hj716aBbaM@!ki9" are 32 characters long (I think, but just assume so if not). Would it matter that the latter password is a bunch of arbitrary characters in an arbitrary sequence, whereas the former has more English-like properties? "Pass" and "PassWord" are English words.
Essentially what would make a password impractical to crack?
Also related, here is a very rudimentary encryption program I wrote earlier. How easy would it be to crack this simple encryption? All I did was a simple XOR with a three-letter key. Let's assume the key was longer, such as "THISISAKEY12356789!@": would it make much difference to the security of such a rudimentary algorithm? And lastly, what libraries do most C/C++ developers use for encryption?
helios,
First, thanks for using the term "KDF"; I didn't know that it had a specific name (key derivation function).
In this example, as long as salt is kept secret and even if an attacker knows how to produce SHA-256 collisions, they won't be able to produce a collision for f().
If the salt cannot be kept secret then there's little point in using one.
My understanding, from back when I learned about this, is that the purpose of a salt is to prevent precomputed hash attacks, forcing the attacker to recompute each hash (instead of having, say, precomputed dictionary hashes). It's not meant as a secret appendage; it could be right next to the password hash itself in a theoretical database.
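A small sketch of that point, using a plain salted SHA-256 purely for illustration (the three-word wordlist and the `hash_password` helper are made up for the example; a real system would use a proper KDF rather than a single hash):

```python
import hashlib
import os

def hash_password(password: str, salt: bytes) -> bytes:
    """Hypothetical helper: SHA-256 over salt followed by password."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()

# Attacker's table, precomputed once for UNSALTED SHA-256 hashes.
wordlist = ["letmein", "Password123", "qwerty"]
precomputed = {hashlib.sha256(w.encode("utf-8")).digest(): w for w in wordlist}

unsalted = hashlib.sha256(b"Password123").digest()
salt = os.urandom(16)  # random, stored in the clear next to the hash
salted = hash_password("Password123", salt)

found_unsalted = precomputed.get(unsalted)  # table lookup succeeds
found_salted = precomputed.get(salted)      # table is useless against this salt

# Knowing the salt, the attacker can still guess, but must redo the
# hashing work for this salt instead of reusing the table.
recovered = next((w for w in wordlist if hash_password(w, salt) == salted), None)

print(found_unsalted, found_salted, recovered)
```

So a public salt does not stop guessing; it only removes the shortcut of reusing one precomputed table against every file or account.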
adam2016,
From the perspective of the KDF, it hardly matters whether the initial password has 3 letters or 1000 letters; it will become something hashed with some constant number of bits, and this hash will go through repeated cycles (for example, PBKDF2).
I probably should rephrase that last question. Both "PassPassPassPassWord1234567890@!" and "WxwaYp56Y7ipA900hj716aBbaM@!ki9" are 32 characters long (I think, but just assume so if not). Would it matter that the latter password is a bunch of arbitrary characters in an arbitrary sequence, whereas the former has more English-like properties? "Pass" and "PassWord" are English words.
If there is a pattern used to generate your password, then there is less entropy than the optimal "every bit is random", meaning it is easier to predict your password, generally speaking. Of course, both of the passwords in your example are very long and pretty much impossible to guess, but the second one (at a glance) looks like it was generated with a "more random" method.
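For a sense of scale, here is a sketch of the usual upper-bound estimate. It assumes every character is picked uniformly at random from the 94 printable ASCII characters (excluding space), which is the best case; any pattern in how the password was generated pushes the real figure below this.

```python
import math

def random_password_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a password whose characters are chosen
    independently and uniformly from an alphabet of the given size."""
    return length * math.log2(alphabet_size)

bits_12 = random_password_bits(94, 12)  # truly random 12-character password
bits_32 = random_password_bits(94, 32)  # truly random 32-character password

print(f"12 random characters: about {bits_12:.1f} bits")
print(f"32 random characters: about {bits_32:.1f} bits")
```

A password built from repeated words and a counting sequence has the same length but far fewer bits than this bound, because an attacker can model the pattern instead of trying every character combination.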
are you implying that a longer password would make the difference between a feasible time (to successfully brute-force/crack the password) and a time that isn't feasible (exponentially large)?
No, I'm saying a sufficiently expensive KDF can make even relatively weak passwords impractical to crack.
how easy would it be to crack this simple encryption
I'm not a cryptanalyst, but basically the problem with those kinds of encryption algorithms is that they leak statistical information from the plaintext into the ciphertext. An attacker could search for any repeated strings in the ciphertext and then try XORing things together to try and get at the passphrase. The longer the plaintext the more likely it is to find such repetitions. Additionally, any long strings of repeated characters in the plaintext (e.g. spaces for indentation in a source file) pretty much reveal the passphrase.
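Here is a tiny sketch of the last point, using a made-up three-letter key and a made-up snippet of indented source as the plaintext (the original program isn't reproduced here, so `xor_repeating` is only an assumption about what a simple repeating-key XOR looks like):

```python
from itertools import cycle

def xor_repeating(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

key = b"KEY"  # assumed example key
plaintext = b"def f():\n" + b" " * 12 + b"return 1\n"  # 12 spaces of indentation
ciphertext = xor_repeating(plaintext, key)

# The attacker does not know the key, but guesses that the file contains
# runs of spaces (0x20) and XORs the whole ciphertext with a space.
guess = bytes(c ^ 0x20 for c in ciphertext)

print(guess)                                   # the key shows up where the spaces were
print(xor_repeating(ciphertext, key) == plaintext)  # XOR with the key decrypts
```

Wherever the plaintext was a space, the guess shows the key bytes repeating in the clear. A longer key only means the attacker needs a longer run (or more ciphertext) before the repetition becomes visible.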
For something like an older zip file, a longer password is more secure than "password123", because one brute-force attack is to generate candidate passwords over and over, starting with very short ones (even 1 letter) and slowly getting longer. It's been a long, long while since anything important was breakable this way, but that is the scenario where the bigger password is more secure. *WORSE*, I think it was Excel/MS Office? Some old software had bugs so that multiple passwords gave the SAME hash, so you could break it with the wrong password, making cracking those much faster than it should have been. This persisted for a long time and was a well-known issue.
A lot of the password 'pro tips' nonsense is garbage. A serious server locks you out if you fail the password 3-4 times in a row, often for hours or even days, or, for high-end stuff, until you validate with the IT guy. A 5-byte password has 5*8 = 40 bits, so 2^40 technically possible values, or still at least 2^32 if you discard the unprintable/aggravating characters that require a long code to input. That is ~4 billion possible passwords. If it locks you out for an hour after 3 failures, that would take a long, long time to crack. For reference, even if you only had 1 million combinations, it would take over 15 years on average (say you lucked into the right password halfway through the combinations).
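The arithmetic behind that estimate, as a quick sketch. It assumes exactly 3 guesses per one-hour lockout, around the clock, and that on average the attacker hits the right password halfway through the list.

```python
def average_years_to_crack(combinations: int,
                           attempts_per_lockout: int,
                           lockout_hours: float) -> float:
    """Average time to guess a password when every batch of failed
    attempts costs one lockout period."""
    average_attempts = combinations / 2
    hours = average_attempts / attempts_per_lockout * lockout_hours
    return hours / (24 * 365)

years_1m = average_years_to_crack(1_000_000, 3, 1)  # 1 million combinations
years_2_32 = average_years_to_crack(2**32, 3, 1)    # ~4 billion combinations

print(f"1 million combinations: about {years_1m:.0f} years")
print(f"2^32 combinations:      about {years_2_32:,.0f} years")
```

With these assumptions, 1 million combinations comes out to roughly 19 years, and 2^32 to tens of thousands of years.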
For local files, like an encrypted, compressed zip file, the lockout idea of course doesn't apply.