Network Working Group D. Eastlake, 3rd
Request for Comments: 3174 Motorola
Category: Informational P. Jones
Cisco Systems
September 2001
US Secure Hash Algorithm 1 (SHA1)
Status of this Memo
This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2001). All Rights Reserved.
Abstract
The purpose of this document is to make the SHA-1 (Secure Hash
Algorithm 1) hash algorithm conveniently available to the Internet
community. The United States of America has adopted the SHA-1 hash
algorithm described herein as a Federal Information Processing
Standard. Most of the text herein was taken by the authors from FIPS
180-1. Only the C code implementation is "original".
Acknowledgements
Most of the text herein was taken from [FIPS 180-1]. Only the C code
implementation is "original" but its style is similar to the
previously published MD4 and MD5 RFCs [RFCs 1320, 1321].
The SHA-1 is based on principles similar to those used by Professor
Ronald L. Rivest of MIT when designing the MD4 message digest
algorithm [MD4] and is modeled after that algorithm [RFC 1320].
Useful comments from the following, which have been incorporated
herein, are gratefully acknowledged:
Tony Hansen
Garrett Wollman
Eastlake & Jones Informational [Page 1]
RFC 3174 US Secure Hash Algorithm 1 (SHA1) September 2001
Table of Contents
1. Overview of Contents........................................... 2
2. Definitions of Bit Strings and Integers........................ 3
3. Operations on Words............................................ 3
4. Message Padding................................................ 4
5. Functions and Constants Used................................... 6
6. Computing the Message Digest................................... 6
6.1 Method 1...................................................... 6
6.2 Method 2...................................................... 7
7. C Code......................................................... 8
7.1 .h file....................................................... 8
7.2 .c file....................................................... 10
7.3 Test Driver................................................... 18
8. Security Considerations........................................ 20
References........................................................ 21
Authors' Addresses................................................ 21
Full Copyright Statement.......................................... 22
1. Overview of Contents
NOTE: The text below is mostly taken from [FIPS 180-1] and assertions
therein of the security of SHA-1 are made by the US Government, the
author of [FIPS 180-1], and not by the authors of this document.
This document specifies a Secure Hash Algorithm, SHA-1, for computing
a condensed representation of a message or a data file. When a
message of any length < 2^64 bits is input, the SHA-1 produces a
160-bit output called a message digest. The message digest can then,
for example, be input to a signature algorithm which generates or
verifies the signature for the message. Signing the message digest
rather than the message often improves the efficiency of the process
because the message digest is usually much smaller in size than the
message. The same hash algorithm must be used by the verifier of a
digital signature as was used by the creator of the digital
signature. Any change to the message in transit will, with very high
probability, result in a different message digest, and the signature
will fail to verify.
The SHA-1 is called secure because it is computationally infeasible
to find a message which corresponds to a given message digest, or to
find two different messages which produce the same message digest.
Any change to a message in transit will, with very high probability,
result in a different message digest, and the signature will fail to
verify.
Eastlake & Jones Informational [Page 2]
RFC 3174 US Secure Hash Algorithm 1 (SHA1) September 2001
Section 2 below defines the terminology and functions used as
building blocks to form SHA-1.
2. Definitions of Bit Strings and Integers
The following terminology related to bit strings and integers will be
used:
a. A hex digit is an element of the set {0, 1, ... , 9, A, ... , F}.
A hex digit is the representation of a 4-bit string. Examples: 7
= 0111, A = 1010.
b. A word equals a 32-bit string which may be represented as a
sequence of 8 hex digits. To convert a word to 8 hex digits each
4-bit string is converted to its hex equivalent as described in
(a) above. Example:
1010 0001 0000 0011 1111 1110 0010 0011 = A103FE23.
c. An integer between 0 and 2^32 - 1 inclusive may be represented as
a word. The least significant four bits of the integer are
represented by the right-most hex digit of the word
representation. Example: the integer 291 = 2^8+2^5+2^1+2^0 =
256+32+2+1 is represented by the hex word, 00000123.
If z is an integer, 0 <= z < 2^64, then z = (2^32)x + y where 0 <=
x < 2^32 and 0 <= y < 2^32. Since x and y can be represented as
words X and Y, respectively, z can be represented as the pair of
words (X,Y).
d. block = 512-bit string. A block (e.g., B) may be represented as a
sequence of 16 words.
3. Operations on Words
The following logical operators will be applied to words:
a. Bitwise logical word operations
X AND Y = bitwise logical "and" of X and Y.
X OR Y = bitwise logical "inclusive-or" of X and Y.
X XOR Y = bitwise logical "exclusive-or" of X and Y.
NOT X = bitwise logical "complement" of X.
Eastlake & Jones Informational [Page 3]
RFC 3174 US Secure Hash Algorithm 1 (SHA1) September 2001
Example:
01101100101110011101001001111011
XOR 01100101110000010110100110110111
--------------------------------
= 00001001011110001011101111001100
b. The operation X + Y is defined as follows: words X and Y
represent integers x and y, where 0 <= x < 2^32 and 0 <= y < 2^32.
For positive integers n and m, let n mod m be the remainder upon
dividing n by m. Compute
z = (x + y) mod 2^32.
Then 0 <= z < 2^32. Convert z to a word, Z, and define Z = X +
Y.
c. The circular left shift operation S^n(X), where X is a word and n
is an integer with 0 <= n < 32, is defined by
S^n(X) = (X << n) OR (X >> 32-n).
In the above, X << n is obtained as follows: discard the left-most
n bits of X and then pad the result with n zeroes on the right
(the result will still be 32 bits). X >> n is obtained by
discarding the right-most n bits of X and then padding the result
with n zeroes on the left. Thus S^n(X) is equivalent to a
circular shift of X by n positions to the left.
4. Message Padding
SHA-1 is used to compute a message digest for a message or data file
that is provided as input. The message or data file should be
considered to be a bit string. The length of the message is the
number of bits in the message (the empty message has length 0). If
the number of bits in a message is a multiple of 8, for compactness
we can represent the message in hex. The purpose of message padding
is to make the total length of a padded message a multiple of 512.
SHA-1 sequentially processes blocks of 512 bits when computing the
message digest. The following specifies how this padding shall be
performed. As a summary, a "1" followed by m "0"s followed by a 64-
bit integer are appended to the end of the message to produce a
padded message of length 512 * n. The 64-bit integer is the length
of the original message. The padded message is then processed by the
SHA-1 as n 512-bit blocks.
Eastlake & Jones Informational [Page 4]
RFC 3174 US Secure Hash Algorithm 1 (SHA1) September 2001
Suppose a message has length l < 2^64. Before it is input to the
SHA-1, the message is padded on the right as follows:
a. "1" is appended. Example: if the original message is "01010000",
this is padded to "010100001".
b. "0"s are appended. The number of "0"s will depend on the original
length of the message. The last 64 bits of the last 512-bit block
are reserved
for the length l of the original message.
Example: Suppose the original message is the bit string
01100001 01100010 01100011 01100100 01100101.
After step (a) this gives
01100001 01100010 01100011 01100100 01100101 1.
Since l = 40, the number of bits in the above is 41 and 407 "0"s
are appended, making the total now 448. This gives (in hex)
61626364 65800000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000.
c. Obtain the 2-word representation of l, the number of bits in the
original message. If l < 2^32 then the first word is all zeroes.
Append these two words to the padded message.
Example: Suppose the original message is as in (b). Then l = 40
(note that l is computed before any padding). The two-word
representation of 40 is hex 00000000 00000028. Hence the final
padded message is hex
61626364 65800000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000028.
The padded message will contain 16 * n words for some n > 0.
The padded message is regarded as a sequence of n blocks M(1) ,
M(2), first characters (or bits) of the message.
Eastlake & Jones Informational [Page 5]
RFC 3174 US Secure Hash Algorithm 1 (SHA1) September 2001
5. Functions and Constants Used
A sequence of logical functions f(0), f(1),..., f(79) is used in
SHA-1. Each f(t), 0 <= t <= 79, operates on three 32-bit words B, C,
D and produces a 32-bit word as output. f(t;B,C,D) is defined as
follows: for words B, C, D,
f(t;B,C,D) = (B AND C) OR ((NOT B) AND D) ( 0 <= t <= 19)
f(t;B,C,D) = B XOR C XOR D (20 <= t <= 39)
f(t;B,C,D) = (B AND C) OR (B AND D) OR (C AND D) (40 <= t <= 59)
f(t;B,C,D) = B XOR C XOR D (60 <= t <= 79).
A sequence of constant words K(0), K(1), ... , K(79) is used in the
SHA-1. In hex these are given by
K(t) = 5A827999 ( 0 <= t <= 19)
K(t) = 6ED9EBA1 (20 <= t <= 39)
K(t) = 8F1BBCDC (40 <= t <= 59)
K(t) = CA62C1D6 (60 <= t <= 79).
6. Computing the Message Digest
The methods given in 6.1 and 6.2 below yield the same message digest.
Although using method 2 saves sixty-four 32-bit words of storage, it
is likely to lengthen execution time due to the increased complexity
of the address computations for the { W[t] } in step (c). There are
other computation methods which give identical results.
6.1 Method 1
The message digest is computed using the message padded as described
in section 4. The computation is described using two buffers, each
consisting of five 32-bit words, and a sequence of eighty 32-bit
words. The words of the first 5-word buffer are labeled A,B,C,D,E.
The words of the second 5-word buffer are labeled H0, H1, H2, H3, H4.
The words of the 80-word sequence are labeled W(0), W(1),..., W(79).
A single word buffer TEMP is also employed.
To generate the message digest, the 16-word blocks M(1), M(2),...,
M(n) defined in section 4 are processed in order. The processing of
each M(i) involves 80 steps.
Eastlake & Jones Informational [Page 6]
RFC 3174 US Secure Hash Algorithm 1 (SHA1) September 2001
Before processing any blocks, the H's are initialized as follows: in
hex,
H0 = 67452301
H1 = EFCDAB89
H2 = 98BADCFE
H3 = 10325476
H4 = C3D2E1F0.
Now M(1), M(2), ... , M(n) are processed. To process M(i), we
proceed as follows:
a. Divide M(i) into 16 words W(0), W(1), ... , W(15), where W(0)
is the left-most word.
b. For t = 16 to 79 let
W(t) = S^1(W(t-3) XOR W(t-8) XOR W(t-14) XOR W(t-16)).
c. Let A = H0, B = H1, C = H2, D = H3, E = H4.
d. For t = 0 to 79 do
TEMP = S^5(A) + f(t;B,C,D) + E + W(t) + K(t);
E = D; D = C; C = S^30(B); B = A; A = TEMP;
e. Let H0 = H0 + A, H1 = H1 + B, H2 = H2 + C, H3 = H3 + D, H4 = H4
+ E.
After processing M(n), the message digest is the 160-bit string
represented by the 5 words
H0 H1 H2 H3 H4.
6.2 Method 2
The method above assumes that the sequence W(0), ... , W(79) is
implemented as an array of eighty 32-bit words. This is efficient
from the standpoint of minimization of execution time, since the
addresses of W(t-3), ... ,W(t-16) in step (b) are easily computed.
If space is at a premium, an alternative is to regard { W(t) } as a
Eastlake & Jones Informational [Page 7]
RFC 3174 US Secure Hash Algorithm 1 (SHA1) September 2001
circular queue, which may be implemented using an array of sixteen
32-bit words W[0], ... W[15]. In this case, in hex let
MASK = 0000000F. Then processing of M(i) is as follows:
a. Divide M(i) into 16 words W[0], ... , W[15], where W[0] is the
left-most word.
b. Let A = H0, B = H1, C = H2, D = H3, E = H4.
c. For t = 0 to 79 do
s = t AND MASK;
if (t >= 16) W[s] = S^1(W[(s + 13) AND MASK] XOR W[(s + 8) AND
MASK] XOR W[(s + 2) AND MASK] XOR W[s]);
TEMP = S^5(A) + f(t;B,C,D) + E + W[s] + K(t);
E = D; D = C; C = S^30(B); B = A; A = TEMP;
d. Let H0 = H0 + A, H1 = H1 + B, H2 = H2 + C, H3 = H3 + D, H4 = H4
+ E.
7. C Code
Below is a demonstration implementation of SHA-1 in C. Section 7.1
contains the header file, 7.2 the C code, and 7.3 a test driver.
7.1 .h file
/*
* sha1.h
*
* Description:
* This is the header file for code which implements the Secure
* Hashing Algorithm 1 as defined in FIPS PUB 180-1 published
* April 17, 1995.
*
* Many of the variable names in this code, especially the
* single character names, were used because those were the names
* used in the publication.
*
* Please read the file sha1.c for more information.
*
*/
Eastlake & Jones Informational [Page 8]
RFC 3174 US Secure Hash Algorithm 1 (SHA1) September 2001
#ifndef _SHA1_H_
#define _SHA1_H_
#include