From: Tran Ngoc Duong on
THE BLOCK CIPHER NSABC
(Public domain)

Alice Nguyenova-Stepanikova

Tran Ngoc Duong <tranngocduong at gmail dot com> (corresponding
author)

27th July, 2010


Abstract. We introduce NSABC/w (Near-Skipjack Algebraic Block
Cipher
operating on w-bit word arithmetics), a 4w-bit analogous of
Skipjack
[NSA98] with 5w-bit key. The Skipjack's inner 4-round Feistel
structure is
replaced with a w-bit, 2-round cascade of the binary operation
(x,z) |-> (2*x*z + (1-2*e)*(x-z+e)) <<< (w/2), partially
encrypting a data
word x under control of a key word z and a round-dependent
constant word e,
allowing efficient decryption. Furthermore we introduce an
optional secret
4w-bit "tweak", i.e. a randomization parameter, to facilitate
applications
such as disk encryption and hashing, that is in turn derrived from
the
block index and a 4w-bit [additional] key. With w=64, on a modern
64-bit
processor, encryption takes about 5 clock cycles per byte and,
when tweaking
is unused, only 4.4 clock cycles per byte.




1. Introduction

In the today's world full of cryptographic algorithms, one may cast
question:
"What should makes a block cipher attractive to cryptographers?"

It is the authors' view that the answer is one word: elegance. If
something
looks elegant, there is a good chance that it is also good.

An elegant specification makes it easier to memorize. Memorizability
makes it
easier to analyze, thus allows for fruitful cryptanalytic results.
Results make
greater confidence in the algorithm.

The elegance comprises the following features:

-- Few algebraic operations. Using of many operations such as tables
of
constants, S-boxes, data-dependent transpositions, data-dependent
operator
selections etc results in "wild", hardly-tractable and possibly
undesirable
interactions between operations.

-- Simple and regular key schedule. A complex key schedule, such as
bit
transpositions, linear pseudo-random number generators, non-linear
operations
etc, which effectively adds another, complicated and un-related,
function to the
cipher, results in hardly-tractable and possibly undesirable
interactions
between the functions.

IDEA, a secure block cipher designed by Xuejia Lai and James L. Massey
[LM91],
besides being elegant, is outstanding for other features:

-- The use of "incompatible" operations. The incompatibility helps to
exclude
any [undesired] algebraic property and makes it impossible to solve
the cipher
algebraically.

-- The use of multiplication. Multiplication is more powerful than
addition and
thus greatly contributes to the security of the algorithm.

However, IDEA uses multiplication modulo p, where p is a prime of the
form 2**w
+ 1 which does not exist for w=32 or w=64, making it not extendable to
larger
machine word lengths. Furthermore, the key schedule uses a rotation on
the
[primary] key, making it irregular as some key bits are re-used much
more slowly
than others.

Skipjack, a secure block cipher designed by cryptographers of the U.S.
National
Security Agency [NSA98], is another example of elegant design.
Furthermore
Skipjack is outstanding for the following feature:

-- The use of two ciphers, an *outer* (wrapper) cipher, and an *inner*
(core)
cipher. As explained by the authors of MARS [IBM98], a block cipher
with similar
design principle, this two-layer structure breaks any iterative
property, makes
any iterative characteristic impossible, and complements eventual
weakness(s) in
either layer by the other one, thus making attacks more difficult. The
wrapper
is designed for fast diffusion and the core for strong confusion.
Here
*diffusion* refers to the process of making one data or key bit to
affect many
data bits in very different ways, *confusion* refers to the process of
making
one data bit to be affected by many data or key bits in very different
ways. The
terms are first used by Claude E. Shannon in his pioneer work
[Shannon49].

However, Skipjack uses S-boxes, making it slow, hard to implement in a
secure
manner, and not extendable to larger machine word lengths, as such.

This article describes an atempt to combine the ideas learned from
IDEA and
Skipjack, where Skipjack is used as a structural framework, into a
scalable and
tweakable block cipher called NSABC (Near-Skipjack Algebraic Block
Cipher).

In cipher design there is always a trade-off between security and
efficiency,
and designers always have to ask "What the users want, a very strong
and
relatively fast cipher, or relatively strong but very fast?"

If Skipjack is regarded as very secure but not very fast, then NSABC
may be
regarded as an atempt to re-use Skipjack with stress on the second
aspect: make
it very fast, but probably not very secure. For 64-bit word length,
NSABC is
designed to use a 320-bit key and an additional 256-bit "tweak" key,
but the
true level of security is yet to be determined. On the other hand, it
takes only
5 clock cycles per byte (and only 4.4 clock cycles per byte with
disabled
tweaking) on a modern 64-bit processor.

NSABC is not patented and is available for public use. As it re-uses
the
[overall] structure of Skipjack, excepting some [cosmetical] word re-
numbering,
eventual users should be aware of Skipjack patent(s) which may be held
by the
U.S. Government. We are not aware of eventual patent(s) related to
other parts
of the design.

The rest of the article is organized as follows. Section 2 defines
operations
over machine words, mainly a multiplication-based group operation over
w-bit
numbers. Section 3 specifies the algorithm. Section 4 suggests an
efficient
implementation. Section 5 concludes the article.




2. Operation over w-bit words

Let +, -, *, ~, ^ denote two-complement arithmetic/logic operations
with the
conventional (C programming languague) meanings.

Let's define binary operations "." and "o" as follows.

x . y = 2*x*y + x + y

x o y = 2*x*y + x - y

Note that the bivariate polynomials x.y and x o y are permutation
polynomials on
either variable for every fixed value of the other variable
[Rivest99]. In other
words, "." and "o" are quasi-group operations.

Furthermore, it is easy to see that "." is an abelian group operation
over the
set of w-bit numbers. (Here and later, uncited facts are from the
authors'
unpublished work, but they are trivial so the proofs are omitted.) The
group
is isomorphic to the multiplicative group of odd integers modulo 2**(w
+1). The
isomorphism is

x |-> 2*x + 1 (mod 2**(w+1))

The unit of the group is 0.

The inverse element of x, denoted x', is

x' = - x/(2*x + 1)

where "/" denotes the inverse operation of "*", i.e. modular
multiplication
by the inverse of the denominator, which indeed exists for every x.

The following relations hold.

(~x) . y = ~(x . y)

(1-x) o y = 1 - (x o y)

x . y = -((-x) o y)

x o y = -((-x) . y)

x . a = b if and only if x = b . a'

x o a = b if and only if x = b o a'

Let e is an arbitrary fixed w-bit integer. Let's define trivariate
polynomials
p and q as follows.

p(x,y,e) = (x-e).(y-e) + e

= 2*x*y + (1 - 2*e)*(x + y - e)

q(x,y,e) = -p(-x,y,e)

= 2*x*y + (1 - 2*e)*(x - y + e)

Then p and q are permutation polynomials on either variable while
keeping the
others fixed [Rivest99]. Thus the binary operations "(.)" and "(o)"
defined by

x (.) y = p(x,y,e)

x (o) y = q(x,y,e)

are quasigroup operations. Note that the previously defined operations
"." and
"o" are instances of "(.)" and "(o)", respectively, with e = 0:

x . y = p(x,y,0)

x o y = q(x,y,0)

Furthermore, "(.)" is an abelian group operation over the set of w-bit
numbers
that is isomorphic to the group "." defined above. The isomorphism is

x |-> x - e

The unit of this group is e. The inverse element of x, denoted x('),
is

x(') = (x - e)' + e

= ((2*e - 1)*x - 2*e*(e-1)) / (2*(x-e) + 1)

The following relations hold.

x (.) y = -((-x) (o) y)

x (o) y = -((-x) (.) y)

x (.) y = (x-e) . (y-e) + e

x (o) y = (x+e) o (y-e) - e

x (.) a = b if and only if x = b (.) a(')

x (o) a = b if and only if x = b (o) a(')

The operation "(o)" will be used for both encryption and decryption
thanks to
the following relation.

(x (o) z) (o) z(') = x




3. Specification

This section provides details of NSABC/w, where w is even.

Throughout this section, X denotes a 4w-bit plaintext block, Y a 4w-
bit
ciphertext block, Z a 5w-bit key, T a 4w-bit secret *tweak*, i.e., a
value that
is used to encrypt only one block under the key, and E a 64w-bit
public *bias*,
i.e. a constant value that will be added to / subtracted from the
intermediate
data words and key words before and after every multiplication.

NSABC is specified in terms of two functions:

ENCRYPT(X,Z,T,E), which encrypts X under control of Z, T and E,

DECRYPT(Y,Z,T,E), which decrypts Y under control of Z, T and E.

Relation:
DECRYPT( ENCRYPT(X,Z,T,E),Z,T,E ) = X

Tweaking is optional and can be disabled by letting T constantly zero.
With
disabled tweaking, NSABC becomes a conventional, non-tweakable, block
cipher.

E is declared as a parameter, abeit a constant with well-defined value
in
this specification, to indicate the possibility of e.g.
personalization, i.e.
making other, "non-standard", variants of the algorithm.




3.1. Word and byte order

A multi-component vector is written as a string in the usual comma-
separated
notation, i.e. the leftmost component being the first one. For
example,
(x0,x1,x2).

A multi-digit number is written as a string in the usual notation,
i.e. the
rightmost digit being the least significant one. For example, x2 x1
x0.

The same data entity is taken sometimes as a number and sometimes as a
vector.
To switch between these meanings, memory addressing is a common
terminology
framework. Vectors are stored in low-first order, i.e. the first
component is
stored at lowest address. Numbers are stored in memory in low-first
order, i.e.
the least significant digit is stored at lowest address. Thus for
example,
if V = (x0,x1,x2) and N = x2 x1 x0 then V = N.

The terms "component" and "digit" above both refer to [machine] words.
For
multi-byte words we assume little-endian byte order. Thus, a
"standard"
representative of NSABC is fully determined by a single parameter: the
word
length w.




3.2. Operations on word strings

Let R denote the transposition that reverses the word order of a word
string.
For example, for 8-bit words, 0x0123AB R = 0xAB2301.

Let S denote the transposition that swaps the high and low halves of
every word
of a word string. For example, for 8-bit words, 0x0123AB S =
0x1032BA.

Let ' denote the word-wise application of the inverse operator "'" on
a word
string. For example,

(a,b,c)' = (a',b',c').

The following relations hold.

xRS = xSR

xR' = x'R

Unless otherwise specified, operators "+" and "-" applied on multi-
word strings
denote word-wise modular addition and subtraction, respectively, i.e.
operands
are regarded as vectors of words. For example,

(a0,a1,a2) + (b0,b1,b2) = (a0+b0, a1+b1, a2+b2)

The following relations hold.

(x + y)R = xR + yR

(x - y)R = xR - yR




3.3. Functions ENCRYPT

This function is specified in terms of three functions: DE, a data
encryption
function; KE, a key expansion function; and TE, a tweak expansion
function.

Input:
X...4w-bit plaintext block
Z...5w-bit key
T...4w-bit tweak
E...64w-bit bias

Output:
Y...4w-bit ciphertext block

Relation:
ENCRYPT(X,Z,T,E) = DE(X, KE(Z), TE(T), E)




3.4. Functions DE, KE and TE

These functions operate on a 4-word data register (x0,x1,x2,x3) and
use a
5-word key register (z0,z1,z2,z3,z4) and a 4-word tweak register
(t0,t1,t2,t3).
The key register is initially loaded with the key Z, in low-first
order. The
tweak register is initially loaded with the tweak T, in low-first
order. The
data register is initially loaded with the plaintext block X in low-
first order,
and after encryption it contains the ciphertext block Y in that
order.

These functions consist of 32 rounds of operations. In each round, a
key-
dependent permutation G applies on the word x0, an exclusive-or (XOR)
operation
"mixes" x0 and t0 into another word which is either x1 or x3 (see
further).
There are two types of rounds: A and B. In an A-typed round, the
mixing takes
place after G and targets x1; in a B-typed round, mixing takes place
before G
and targets x3. In each round, the pair of key words (z0,z3) enter G-
box in that
order.

At the end of every round there is a rotation on the data register
that
circularly exchanges value of the four data words, and other rotations
that do
similarly for the key register and the tweak register.

Encryption runs in four passes: 8 rounds of type A, then 8 rounds of
type B,
then 8 rounds of type A again, then 8 rounds of type B again.



KE (key expansion)

Input:
Z...5w-bit key

Output:
K...64w-bit expanded key

Pseudo-code:
(z0,z1,z2,z3,z4) := Z
for k := 0, 1, 2,..., 31 loop
K(2*k) := z0
K(2*k+1) := z3
(z0,z1,z2,z3,z4) := (z1,z2,z3,z4,z0)
end loop


TE (tweak expansion)

Input:
T...4w-bit tweak

Output:
L...32w-bit expanded tweak

Pseudo-code:
(t0,t1,t2,t3) := T
for k := 0, 1, 2,..., 31 loop
L(k) := t0
(t0,t1,t2,t3) := (t1,t2,t3,t0)
end loop


DE (data encryption)

Input:
X...4w-bit plaintext block
K...64w-bit expanded key
L...32w-bit expanded tweak

Output:
Y...4w-bit ciphertext block

Pseudo-code:
(x0,x1,x2,x3) := X
for k := 0, 1, 2,..., 31 loop
if 0 <= k <= 7 or 16 <= k <= 23 then
x0 := G( x0, (K(2*k), K(2*k+1)), (E(2*k), E(2*k+1)) )
x1 := x1 ^ x0 ^ L(k)
elsif 8 <= k <= 15 or 24 <= k <= 31 then
x3 := x3 ^ x0 ^ L(k)
x0 := G( x0, (K(2*k), K(2*k+1)), (E(2*k), E(2*k+1)) )
end if
(x0,x1,x2,x3) := (x1,x2,x3,x0)
end loop
Y := (x0,x1,x2,x3)

Relations:
(x0(0),x1(0),x2(0),x3(0)) = X
(z0(0),z1(0),z2(0),z3(0),z4(0)) = Z
(t0(0),t1(0),t2(0),t3(0)) = T
(E(0),E(1),...,E(63)) = E
Y = (x0(32),x1(32),x2(32),x3(32))
for 0 <= k <= 7 or 0 <= k <= 23:
x0(k+1) = x1(k) ^ g(k) ^ L(k)
x1(k+1) = x2(k)
X2(k+1) = x3(k)
x3(k+1) = g(k)
for 8 <= k <= 15 or 24 <= k <= 31:
x0(k+1) = x1(k)
x1(k+1) = x2(k)
x2(k+1) = x3(k) ^ x0(k) ^ L(k)
x3(k+1) = g(k)
for 0 <= k <= 31:
g(k) = G(x0(k), (K(2*k), K(2*k+1)), (E(2*k), E(2*k+1)))
K(2*k) = z0(k)
K(2*k+1) = z3(k)
z0(k+1) = z1(k)
z1(k+1) = z2(k)
z2(k+1) = z3(k)
z3(k+1) = z4(k)
z4(k+1) = z0(k)
L(k) = t0(k)
t0(k+1) = t1(k)
t1(k+1) = t2(k)
t2(k+1) = t3(k)
t3(k+1) = t0(k)




3.5. Function G

Function G, which takes as arguments a w-bit intermediate plaintext
word x and a
pair of w-bit subkey words (k0,k1) to return a w-bit intermediate
ciphertext
word, is a 2-round cascade of the quasi-group operation "(o)", each
followed
by a swap of half-words. The first "(o)" uses unit e0, the second
"(o)" uses
unit e1.

In the relations below, the quasi-group operation is written in terms
of the
trivariate polynomial q, rather than "(o)", to specify the unit
element.

Relations:
G(x,(k0,k1),(e0,e1)) = q( q(x,k0,e0)S, k1, e1 )S




3.6. Function DECRYPT

DECRYPT can be easily derrived from the (implicit) relation with
ENCRYPT. Here
it is given explitly in terms of functions DE, KE and TE to emphasize
similarity
between encryption and decryption. Unlike Skipjack, DECRYPT cannot be
expressed
explicitly in terms of ENCRYPT. That's why the cipher is termed "near
Skipjack".

Relations:
DE( DE(X, KE(Z), TE(T), E)RS, (KE(ZR) - ER)' + ER, TE(TRS), ER )RS
= X
DECRYPT(Y,Z,T,E) = DE( YRS, (KE(ZR) - ER)' + ER, TE(TRS), ER )RS




3.7. The bias E

The 64w-bit constant E, which consist of units of the group under the
operation
"(.)" in function G, and which is used to bias the data words and key
words
from zero, may be used to personalize the algorithm. Every choice of E
makes
a unique tweakable block cipher. As a "standard" choice, we propose
the first
64 numbers (modulo 2**w) of the Fibonacci sequence.

Relations:
E(0) = 1
E(1) = 2
E(k) = E(k-1) + E(k-2) for k = 2,3,4,...,63.




3.8. Tweak counter mode

The 4w-bit secret tweak T is used to encrypt only one block [under a
fixed key
Z]. To encrypt multiple blocks the tweak is derrived from a 4w-bit
[additional]
key, called *tweak key*, and the block index i, as follows.

Let T(i) denote the tweak used to encrypt i-th block. For zero-th
block, the
tweak key is used as the tweak directly:

T(0) = tweak key

Successive tweak is computed from the last-used tweak by the recurrent
relation:

T(i) = T(i-1) + 2*T(0) + 1

or, explicitly,

T(i) = T(0) . i

where all operators, including ".", are defined on 4w-bit modular
arithmetics
and all operands, including 1 and i, are regarded as 4w-bit numbers.




4. Notes on implementation

(1) The polynomial q in function G can be evaluated using only one
multiplication and one addition, with a modified key schedule. Indeed,

q(x,z,e) = M*x + N

where x is a data word, z key word, e the unit of the [quasi-]group in
use, and

M = 2*(z - e) + 1

N = (2*e - 1)*(z - e)

is a pair of subkeys precomputed in the [modified] key schedule, which
consists
of 64 such pairs.

(2) The cipher is partially parallelizable. The following procedure
executes
all 32 rounds in 20 steps, of which half performing two or three
parallel
evaluations of the function G. Recall that g(k) is the result of G in
round k.

1. Compute g(0)
2. Compute g(1)
3. Compute g(2)
4. Compute g(3)
5. Compute g(4)
6. Compute g(5), g(11) in parallel
7. Compute g(6), g(9) in parallel
8. Compute g(7), g(10), g(13) in parallel
9. Compute g(8), g(14) in parallel
10. Compute g(12), g(15) in parallel
11. Compute g(16)
12. Compute g(17)
13. Compute g(18)
14. Compute g(19)
15. Compute g(20)
16. Compute g(21), g(27) in parallel
17. Compute g(22), g(25) in parallel
18. Compute g(23), g(26), g(29) in parallel
19. Compute g(24), g(30) in parallel
20. Compute g(28), g(31) in parallel

On a modern 64-bit processor, each step should take about 8 clock
cycles on
average. So, for w=64 i.e. 32-byte blocks it achieves 20*8/32 = 5
clock cycles
per byte encrypted. If tweak counter mode is not used we may disable
tweaking
completely to save about 1 clock cycle per byte per step, resulting in
20*7/32
= 4.4 clock cycles per byte.




5. Conclusion

We introduced NSABC, a Skipjack-like block cipher that is defined for
every
even word length, allowing scalability in block length and key
length.

NSABC is meant to be simple. It uses no hard-to-memorize S-boxes. It
makes use
only two word-oriented algebraic operations that are mutually
incompatible: "^"
and "(o)". It re-uses the overall structure of Skipjack that has been
made public
for over a decade -- sufficient to become truly understood. The
simplicity makes
it easier to analyze, which in turn makes greater confidence in the
algorithm.

NSABC uses multiplication which is much more powerful than addition
and should
greatly contribute to the security of the algorithm.

NSABC bases on some valuable design of a well-reputed agency in the
branch. We
therefore believe that it is worth analysis and it can withstand
rigorous
analysis. If this happens to be true, then we may have a strong cipher
with
256-bit blocks, allowing to encrypt enormous amount of data under the
same key,
and with 320-bit keys, allowing to keep data secret over any
imaginable time.

NSABC is fast enough to be comparable with any modern block cipher.




6. References

[IBM98] Burwick C., Coppersmith D., D'Avignon E., Gennaro G.,
Halevi S.,
Jutla C., Matyas S. M. Jr., O'Connor L., Peyravian M.,
Safford D.,
Zunic N.: MARS - a candidate cipher for AES. 1998.

[LM91] Lai X., Massey J. L., Murphy S.: Markov Ciphers and
Differential
Cryptanalysis. 1991.

[NSA98] National Security Agency: Skipjack and KEA
specification. 1998.

[Rivest99] Rivest, R. L.: Permutation polynomials modulo 2**w.
1999.

[Shannon49] Shannon C. E.: Communication Theory of Secrecy Systems.
1949.
From: Tom St Denis on
On Jul 25, 2:56 pm, Tran Ngoc Duong <tranngocdu...(a)gmail.com> wrote:
> THE BLOCK CIPHER NSABC
> (Public domain)

I haven't read your spec because it was line-wrapped as you posted it
to usenet [hint: get a website to post this, or turn it into a PDF and
post it ...]

But

- Lack of distinct pseudo code
- Lack of vectors

Makes me very dis-interested in pursuing it further.

Tom
From: Mok-Kong Shen on
Tran Ngoc Duong wrote:

> An elegant specification makes it easier to memorize.

This is true. However, easiness is a relative concept. Are
you very sure that the majority of people would agree that
e.g. what you showed as pseudo-code are easy to memorize for
them and hence could start rightaway from the brain to code
your encryption algorithm?

M. K. Shen
From: Tran Ngoc Duong on
On Jul 26, 7:54 pm, Tom St Denis <t...(a)iahu.ca> wrote:
> On Jul 25, 2:56 pm, Tran Ngoc Duong <tranngocdu...(a)gmail.com> wrote:
>
> > THE BLOCK CIPHER NSABC
> > (Public domain)
>
> I haven't read your spec because it was line-wrapped as you posted it
> to usenet [hint: get a website to post this, or turn it into a PDF and
> post it ...]
>

Thank you for the hint. I'll add schemata, vectors, reformat and post
it again to Usenet.

> But
>
> - Lack of distinct pseudo code
> - Lack of vectors
>
> Makes me very dis-interested in pursuing it further.
>

I'm sorry for dumb question but as English is not my nature language,
what is "distinct pseudo code"?
From: Tran Ngoc Duong on
On Jul 26, 8:31 pm, Mok-Kong Shen <mok-kong.s...(a)t-online.de> wrote:
> Tran Ngoc Duong wrote:
> > An elegant specification makes it easier to memorize.
>
> This is true. However, easiness is a relative concept. Are
> you very sure that the majority of people would agree that
> e.g. what you showed as pseudo-code are easy to memorize for
> them and hence could start rightaway from the brain to code
> your encryption algorithm?
>
I'm sure most people can start rightaway from the brain to think of
(understand, analyze, attack,...) it. But I think most people must
read some documents (pseudo-code, formulars, schemata, implementation
notes,...) in order to implement anything.

Anyway you're right that pseudo-code may not be the best way to ease
memory. Schemata are.