Friday, October 31, 2008

An inconsistency between explicit and implicit hashing when signing in Java security?

For the connection to a certain other system within our network, the program I'm working on needs to prove that it indeed is what it claims to be: an authorized client. A common way to accomplish this is through PKI: the client signs the message it sends using a private key, and the other system can verify this signature using the corresponding public key. See e.g. this article for an explanation of how this works.
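
As a rough illustration of that sign-and-verify round trip, here is a minimal sketch. It is not our production code: the class and method names are made up, the key pair is generated on the spot instead of being loaded from a keystore, and the "SHA1withRSA" algorithm name is simply one choice for the sake of the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Illustrative only: class and method names are hypothetical.
final class SignVerifySketch {

    // Client side: sign the message bytes with the private key.
    static byte[] sign(byte[] message, PrivateKey privateKey) throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA1withRSA");
        signer.initSign(privateKey);
        signer.update(message);
        return signer.sign();
    }

    // Receiving side: check the signature with the matching public key.
    static boolean verify(byte[] message, byte[] signature, PublicKey publicKey)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA1withRSA");
        verifier.initVerify(publicKey);
        verifier.update(message);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        // Throwaway key pair (assumption: a real client would load its own key).
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair keyPair = generator.generateKeyPair();

        byte[] message = "hello, other system".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(message, keyPair.getPrivate());

        byte[] tampered = "hello, other system!".getBytes(StandardCharsets.UTF_8);
        System.out.println("untouched message verifies: "
                + verify(message, signature, keyPair.getPublic()));
        System.out.println("tampered message verifies:  "
                + verify(tampered, signature, keyPair.getPublic()));
    }
}
```

Only the holder of the private key can produce a signature that the public key accepts, which is what lets the other system trust the sender.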

In our case, there are three steps in signing a message:
  • calculation of the message digest through a hashing algorithm,
  • calculation of the digital signature using the private key, and
  • coding the result to base64.
The last step is not part of the normal signing process, but we need to send the result as a string inside an XML message. Using the 'raw' signature would result in weird characters in the XML, very likely choking up the parser.
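
To make that last step concrete, here is a small sketch of turning raw signature bytes into text that is safe to embed in XML, and back again. It assumes Java 8 or later and uses the JDK's own java.util.Base64 rather than the Commons Codec library; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Base64;

// Illustrative only: class and method names are hypothetical.
final class Base64Sketch {

    // Encode arbitrary bytes using only A-Z, a-z, 0-9, '+', '/' and '='.
    static String toXmlSafeText(byte[] rawSignature) {
        return Base64.getEncoder().encodeToString(rawSignature);
    }

    // The receiving side decodes the text back into the original bytes.
    static byte[] fromXmlSafeText(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        // Stand-in for a signature: bytes that are not valid XML text as-is.
        byte[] raw = {0x00, 0x1F, (byte) 0xFF, 0x3C};

        String encoded = toXmlSafeText(raw);
        System.out.println("encoded: " + encoded);
        System.out.println("round trip intact: "
                + Arrays.equals(raw, fromXmlSafeText(encoded)));
    }
}
```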

As I was coding away, I was lulled into performing each of these steps separately, so I started off with implementing the hashing using the java.security.MessageDigest class. I instantiated it with the "SHA-1" algorithm and simply called the digest method with the message to obtain its hash. Pretty straightforward stuff.

Then I turned to the java.security.Signature class to supply me with the subsequent signing functionality. I noticed that some of its algorithm names include the name of a hashing algorithm, so I quickly found out that it is possible to let the Signature class take care of both the first and second step of my signing process. While that struck me as quite convenient, I decided to stick to the original plan and not use the hashing possibility here. I chose the "NONEwithRSA" algorithm, and after I fed it the hashed message and the private key, the sign method provided me with an answer.

Then I encoded it in base64 (using the Apache Commons Codec library, which I also could have used for the hashing functionality) and presto! So I thought, at least...

But then...

The first test we performed immediately indicated something was wrong. And after checking everything else (like making sure code page encodings were correct and whatnot) we came to the conclusion that the signature itself had to be the culprit.

So I decided to put 'my' way of creating a signature side by side with the signing method that uses the implicit hashing possibility, to see whether there might be a difference in the outcome:
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.Signature;
import org.apache.commons.codec.binary.Base64;

public void test(byte[] data, PrivateKey privateKey) throws Exception {
   // Explicit hash and separate signing:
   byte[] hashedData = MessageDigest.getInstance("SHA-1").digest(data);
   byte[] signedData = signData(hashedData, privateKey, "NONEwithRSA");

   // Signing with implicit hashing:
   byte[] signedHashedData = signData(data, privateKey, "SHA1withRSA");

   // Print both signatures as base64 so they can be compared by eye:
   System.out.println("Encoded data (explicit hashing) = "
           + new String(Base64.encodeBase64(signedData)));
   System.out.println("Encoded data (implicit hashing) = "
           + new String(Base64.encodeBase64(signedHashedData)));
}

// Signs the given bytes with the given key, using the named signature algorithm.
private static byte[] signData(byte[] data, PrivateKey privateKey, String algorithm) throws Exception {
   Signature signature = Signature.getInstance(algorithm);
   signature.initSign(privateKey);
   signature.update(data);
   return signature.sign();
}
And even though:
  • from the code it seems that both paths should lead to the same result: no configuration other than the algorithm name is given, so all else should be default, and
  • according to the Java security documentation on the signing options, "NONEwithRSA" does 'not use a digesting algorithm', so it should act as "SHA1withRSA" without the SHA-1 hashing,
there definitely is a difference between the outcomes!

In our situation, the implicit hashing turned out to deliver the correct result (at least, with regard to what the other system expected), so a minimal code change (getting rid of the explicit hashing step) did the trick. We use an external configuration file to set the signing algorithm through a system property, so changing that was easy.
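
For illustration, picking up the algorithm name from a system property could look roughly like the sketch below. The property name "signing.algorithm", the fallback value and the class name are assumptions made up for this example, not the names we actually use.

```java
import java.security.NoSuchAlgorithmException;
import java.security.Signature;

// Illustrative only: property name, default and class name are hypothetical.
final class SigningConfigSketch {

    static final String ALGORITHM_PROPERTY = "signing.algorithm";

    // Read the algorithm name, falling back to implicit SHA-1 hashing.
    static String signingAlgorithm() {
        return System.getProperty(ALGORITHM_PROPERTY, "SHA1withRSA");
    }

    // Create a Signature object for whatever algorithm is configured.
    static Signature newSignature() throws NoSuchAlgorithmException {
        return Signature.getInstance(signingAlgorithm());
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Run with e.g. -Dsigning.algorithm=NONEwithRSA to switch algorithms.
        System.out.println("Signing with: " + newSignature().getAlgorithm());
    }
}
```

With a setup like this, switching between the two approaches is a configuration change, although the explicit hashing step still has to be removed from the code when moving to an algorithm that hashes implicitly.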

What's causing this?

Now why is there a difference between the two approaches? I've tried to find an answer using Google, but that quest didn't turn up any answers.
So I did what any self-respecting developer would do: step through the implementation in a debugger. Unfortunately my toolkit didn't allow me to see everything I wanted to. I could see, however, that the input of the signing step was identical in both cases and that the same implementation for signing is used under the hood (sun.security.rsa.RSACore). What I could not see is what happens with respect to intermediate padding of the byte arrays, so I'm guessing that the 'default' settings of the two approaches - driven by different SignatureSpi implementations - differ in this respect.

If anyone could point out the actual difference to me, that would be greatly appreciated. For now I'll have to be content with knowing that the two approaches do return different results and that picking one at random may lead to problems...

1 comment:

  1. To anyone who gets stuck on this question: the problem is that "SHA1withRSA" does not sign the raw hash that MessageDigest returns. It first wraps the hash in a DER-encoded object (DigestInfo), which contains another field identifying the hash algorithm, and signs that. A solution is available here http://stackoverflow.com/questions/33305800/difference-between-sha256withrsa-and-sha256-then-rsa/33311324#33311324.
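
A minimal sketch to check this explanation, assuming the PKCS #1 v1.5 scheme: prepend the DER-encoded DigestInfo header for SHA-1 to the raw hash and sign that with "NONEwithRSA". The result should then match what "SHA1withRSA" produces, while signing the bare hash should not. The class and method names are hypothetical, and the key pair is generated on the spot.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.Signature;
import java.util.Arrays;

// Illustrative only: class and method names are hypothetical.
final class DigestInfoSketch {

    // DER-encoded DigestInfo header for SHA-1 as listed in PKCS #1;
    // the 20-byte hash follows directly after these 15 bytes.
    private static final byte[] SHA1_DIGEST_INFO_PREFIX = {
        0x30, 0x21, 0x30, 0x09, 0x06, 0x05, 0x2b, 0x0e,
        0x03, 0x02, 0x1a, 0x05, 0x00, 0x04, 0x14
    };

    // Prefix the raw SHA-1 hash with the DigestInfo header.
    static byte[] wrapInDigestInfo(byte[] sha1Hash) {
        byte[] wrapped = Arrays.copyOf(SHA1_DIGEST_INFO_PREFIX,
                SHA1_DIGEST_INFO_PREFIX.length + sha1Hash.length);
        System.arraycopy(sha1Hash, 0, wrapped, SHA1_DIGEST_INFO_PREFIX.length, sha1Hash.length);
        return wrapped;
    }

    static byte[] sign(byte[] data, PrivateKey key, String algorithm) throws GeneralSecurityException {
        Signature signature = Signature.getInstance(algorithm);
        signature.initSign(key);
        signature.update(data);
        return signature.sign();
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair keyPair = generator.generateKeyPair();
        PrivateKey key = keyPair.getPrivate();

        byte[] data = "some message".getBytes(StandardCharsets.UTF_8);
        byte[] hash = MessageDigest.getInstance("SHA-1").digest(data);

        byte[] implicit = sign(data, key, "SHA1withRSA");
        byte[] bareHash = sign(hash, key, "NONEwithRSA");
        byte[] wrappedHash = sign(wrapInDigestInfo(hash), key, "NONEwithRSA");

        System.out.println("bare hash matches SHA1withRSA:    " + Arrays.equals(implicit, bareHash));
        System.out.println("wrapped hash matches SHA1withRSA: " + Arrays.equals(implicit, wrappedHash));
    }
}
```

The comparison with Arrays.equals is meaningful here because this padding scheme is deterministic: the same key and the same input always give the same signature.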
