Galois fields

A pre-requisite to the consideration of multiplication is addition.  Then, multiplication is just successive addition. Hence, you easily can work it out.

To construct a multiplication table for, we need to know the specific ring and base that one desires,  For instance, the last time that I was requested to construct a multiplication table, was for a Galois field of order 19, to the base 8 (octal).

I have misplaced that table. I will have to work it out from scratch:

  1. First, in octal, the order becomes 23.
  2. Second, in each row (or column) since no number repeats, each of the 22 numbers must appear exactly once.
  3. Third, the table is symmetric about the principal diagonal.
  4. Fourth, the next column beyond the right-hand edge is all zeros.
  5. Fifth, do not be ashamed to employ your fingers!
  6. There are several other patterns.  Each of these patterns helps check upon your work, as you go.

xx 01 02 03 04 05 06 07 10 11 12 13 14 15 16 17 20 21 22
91 01 02 03 04 05 06 07 10 11 12 13 14 15 16 17 20 21 22
02 02 04 06 10 12 14 16 20 22 01 03 05 07 11 13 15 17 21
03 03 06 11 14 17 22 02 05 10 13 16 21 01 04 07 12 15 20
04 04 10 14 20 01 05 11 15 21 02 06 12 16 22 03 07 13 17
05 05 12 17 01 06 13 20 02 07 14 21 03 10 15 22 04 11 16
06 06 14 22 05 13 21 04 12 20 03 11 17 02 10 16 01 07 15
07 07 16 02 11 20 04 13 22 06 15 01 10 17 03 12 21 05 14
10 10 20 05 15 02 12 22 07 17 04 14 01 11 21 06 16 03 13
11 11 22 10 21 07 20 06 17 05 16 04 15 03 14 02 13 01 12
12 12 01 13 02 14 03 15 04 16 05 17 06 20 07 21 10 22 11
13 13 03 16 06 21 11 01 14 04 17 07 22 12 02 15 05 20 10
14 14 05 21 12 03 17 10 01 15 06 22 13 04 20 11 02 16 07
15 15 07 01 16 10 02 17 11 03 20 12 04 21 13 05 22 14 06
16 16 11 04 22 15 10 03 21 14 07 02 20 13 06 01 17 12 05
17 17 13 07 03 22 16 12 06 02 21 15 11 05 01 20 14 10 04 
20 20 15 12 07 04 01 21 16 13 10 05 02 22 17 14 11 06 03
21 21 17 15 13 11 07 05 03 01 22 20 16 14 12 10 06 04 02
22 22 21 20 17 16 15 14 13 12 11 10 07 06 05 04 03 02 01

This is the multiplication table for GF-19, in octal notation.

Table of reciprocals (that is, multiplicative inverses):
01 02 03 04 05 06 07 10 11 12 13 14 15 16 17 20 21 22
01 12 15 05 04 20 13 14 21 02 07 10 03 17 16 06 11 22.

Of course, one needs the addition table, as well.

++ 00 01 02 03 04 05 06 07 10 11 12 13 14 15 16 17 20 21 22
00 00 01 02 03 04 05 06 07 10 11 12 13 14 15 16 17 20 21 22
01 01 02 03 04 05 06 07 10 11 12 13 14 15 16 17 20 21 22 00
02 02 03 04 05 06 07 10 11 12 13 14 15 16 17 20 21 22 00 01
03 03 04 05 06 07 10 11 12 13 14 15 16 17 20 21 22 00 01 02
04 04 05 06 07 10 11 12 13 14 15 16 17 20 21 22 00 01 02 03
05 05 06 07 10 11 12 13 14 15 16 17 20 21 22 00 01 02 03 04
06 06 07 10 11 12 13 14 15 16 17 20 21 22 00 01 02 03 04 05
07 07 10 11 12 13 14 15 16 17 20 21 22 00 01 02 03 04 05 06
10 10 11 12 13 14 15 16 17 20 21 22 00 01 02 03 04 05 06 07
11 11 12 13 14 15 16 17 20 21 22 00 01 02 03 04 05 06 07 10
12 12 13 14 15 16 17 20 21 22 00 01 02 03 04 05 06 07 10 11
13 13 14 15 16 17 20 21 22 00 01 02 03 04 05 06 07 10 11 12
14 14 15 16 17 20 21 22 00 01 02 03 04 05 06 07 10 11 12 13
15 15 16 17 20 21 22 00 01 02 03 04 05 06 07 10 11 12 13 14
16 16 17 20 21 22 00 01 02 03 04 05 06 07 10 11 12 13 14 15
17 17 20 21 22 00 01 02 03 04 05 06 07 10 11 12 13 14 15 16
20 20 21 22 00 01 02 03 04 05 06 07 10 11 12 13 14 15 16 17
21 21 22 00 01 02 03 04 05 06 07 10 11 12 13 14 15 16 17 20
22 22 00 01 02 03 04 05 06 07 10 11 12 13 14 15 16 17 20 21

Table of additive inverses:
00 01 02 03 04 05 06 07 10 11 12 13 14 15 16 17 20 21 22
00 22 21 20 17 16 15 14 13 12 11 10 07 06 05 04 03 02 01.

000000000000000000000000000000000000000000000000000000

Let us be a little more systematic.

Without proof, I assert the well-known theorem:  For any prime number n, and any natural number p, there exists a unique field GF-n^p -- called Galios field -- of order n^p.  Furthermore, these are the only finite fields.

I will illustrate these fields for the first several primes, with p = 1.  The Galois fields with p > 1 are an entirely different breed of animal. We will ignore them here.

Zero multiplied by anything is zero. In particular, zero times zero is zero.  Hence, since multiplication by zero is not interesting, by convention, we omit zero from the multiplication tables.

GF-1. The only element is zero. Hence, there is nothing to show for the multiplication table.

GF-2. There are only the elements 0 and 1. The multiplication table is
x 1
1 1.
Table of reciprocals:
1
1.
The addition table is
+ 0 1
0 0 1
1 1 0.
Table of additive inverses:
0 1
0 1.

GF-3. The elements are 0, 1, and 2. The multiplication table is
x 1 2
1 1 2
2 2 1.
Table of reciprocals:
1 2
1 2.
The addition table is
+ 0 1 2
0 0 1 2
1 1 2 0
2 2 0 1.
Table of additive inverses:
0 1 2
0 2 1.

Four is a composite number. Hence, there is no GF-4.  However, there is a GF-2^2, which we ignore.

GF-5. The elements are 0, 1, 2, 3, and 4. The multiplication table is
x 1 2 3 4
1 1 2 3 4
2 2 4 1 3
3 3 1 4 2
4 4 3 2 1.
Table of reciprocals:
1 2 3 4
1 3 2 4.
The remaining addition tables are left as an exercise for the reader.

Six is a composite number. Hence, there is no GF-6..

GF-7. The elements are 0, 1, 2, 3, 4, 5, and 6. The multiplication table is
x 1 2 3 4 5 6
1 1 2 3 4 5 6
2 2 4 6 1 3 5
3 3 6 2 5 1 4
4 4 1 5 2 6 3
5 5 3 1 6 4 2
6 6 5 4 3 2 1.
Table of reciprocals:
1 2 3 4 5 6
1 4 5 2 3 6.

Eight, nine, and ten are composite numbers. Hence, there is neither a GF-8, GF-9, nor a GF-10.  However, there is a GF-2^3 and a GF-3^2, which we ignore.

GF-11. The elements are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and a. The multiplication table is
x 1 2 3 4 5 6 7 8 9 a
1 1 2 3 4 5 6 7 8 9 a
2 2 4 6 8 a 1 3 5 7 9
3 3 6 9 1 4 7 a 2 5 8
4 4 8 1 5 9 2 6 a 3 7
5 5 a 4 9 3 8 2 7 1 6
6 6 1 7 2 8 3 9 4 a 5
7 7 3 a 6 2 9 5 1 8 4
8 8 5 2 a 7 4 1 9 6 3
9 9 7 5 3 1 a 8 6 4 2
a a 9 8 7 6 5 4 3 2 1.
Table of reciprocals:
1 2 3 4 5 6 7 8 9 a
1 6 4 3 9 2 8 7 5 a.
Please observe that we had run out of the integer glyphs. Hence, we began with the alphabetic glyphs.

Twelve is a composite number. Hence, there is no GF-12.

GF-13. The elements are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, and c. The multiplication table is
x 1 2 3 4 5 6 7 8 9 a b c
1 1 2 3 4 5 6 7 8 9 a b c
2 2 4 6 8 a c 1 3 5 7 9 b
3 3 6 9 c 2 5 8 b 1 4 7 a
4 4 8 c 3 7 b 2 6 a 1 5 9
5 5 a 2 7 c 4 9 1 6 b 3 8
6 6 c 5 b 4 a 3 9 2 8 1 7
7 7 1 8 2 9 3 a 4 b 5 c 6
8 8 3 b 6 1 9 4 c 7 2 a 5
9 9 5 1 a 6 2 b 7 3 c 8 4
a a 7 4 1 b 8 5 2 c 9 6 3
b b 9 7 5 3 1 c a 8 6 4 2
c c b a 9 8 7 7 5 4 3 2 1.
Table of reciprocals:
1 2 3 4 5 6 7 8 9 a b c
1 7 9 a 8 b 2 5 3 4 6 c.

Fourteen, fifteen, and sixteen are composite numbers. Hence, there is neither a GF-14, GF-15, nor GF-16.  However, there is a GF-2^4, which we ignore.

GF-17. The elements are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f, and g. The multiplication table is
x 1 2 3 4 5 6 7 8 9 a b c d e f g
1 1 2 3 4 5 6 7 8 9 a b c d e f g
2 2 4 6 8 a c e g 1 3 5 7 9 b d f
3 3 6 9 b e 1 4 7 a d g 2 5 8 b e
4 4 8 c g 3 7 b f 2 6 a e 1 5 9 d
5 5 a f 3 8 d 1 6 b g 4 9 e 2 7 c
6 6 c 1 7 d 2 8 e 3 9 f 4 a g 5 b
7 7 e 4 b 1 8 f 5 c 2 9 g 6 d 3 a
8 8 g 7 f 6 e 5 d 4 c 3 b 2 a 1 9
9 9 1 a 2 b 3 c 4 d 5 e 6 f 7 g 8
a a 3 d 6 g 9 2 c 5 f 8 1 b 4 e 7
b b 5 g a 4 f 9 3 e 8 2 d 7 1 c 6
c c 7 2 e 9 4 g b 6 1 d 8 3 f a 5
d d 9 5 1 e a 6 2 f b 7 3 g c 8 4
e e b 8 5 2 g d a 7 4 1 f c 9 6 3
f f d b 9 7 5 3 1 g e c a 8 6 4 2
g g f e d c b a 9 8 7 6 5 4 3 2 1.
Table of reciprocals:
1 2 3 4 5 6 7 8 9 a b c d e f g
1 9 6 d 7 3 5 f 2 c e a 4 b 8 g.

Eighteen is a composite number. Hence, there is no GF-18.

GF-19. This is where we began. But, let us do it again.
The elements are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f, g, h, and i. The multiplication table is
x 1 2 3 4 5 6 7 8 9 a b c d e f g h i
1 1 2 3 4 5 6 7 8 9 a b c d e f g h i
2 2 4 6 8 a c e g i 1 3 5 7 9 b d f h
3 3 6 9 c f i 2 5 8 b e h 1 4 7 a d g
4 4 8 c g 1 5 9 d h 2 6 a e i 3 7 b f
5 5 a f 1 6 b g 2 7 c h 3 8 d i 4 9 e
6 6 c i 5 b h 4 a g 3 9 f 2 8 e 1 7 d
7 7 e 2 9 g 4 b i 6 d 1 8 f 3 a h 5 c
8 8 g 5 d 2 a i 7 f 4 c 1 9 h 6 e 3 b
9 9 i 8 h 7 g 6 f 5 e 4 d 3 c 2 b 1 a 
a a 1 b 2 c 3 d 4 e 5 f 6 g 7 h 8 i 9
b b 3 e 6 h 9 1 c 4 f 7 i a 2 d 5 g 8
c c 5 h a 3 f 8 1 d 6 i b 4 g 9 2 e 7
d d 7 1 e 8 2 f 9 3 g a 4 h b 5 i c 6
e e 9 4 i d 8 3 h c 7 2 g b 6 1 f a 5
f f b 7 3 i e a 6 2 h d 8 5 1 g c 8 4
g g d a 7 4 1 h e b 8 5 2 i f c 9 6 3
h h f d b 9 7 5 3 1 i g e c a 8 6 4 2
i i h g f e d c b a 9 8 7 6 5 4 3 2 1.
Table of reciprocals:
1 2 3 4 5 6 7 8 9 a b c d e f g h i
1 a d 5 4 g b c h 2 7 8 3 f e 6 9 i.

This is where we began; hence, let us quit here.

Interlacing of sectors, within a track, on a platter, of a disk-drive

For a while, in the dim distant past, the disk-drives spun around faster than the computer could transfer data -- either read or write.  Had the sectors been numbered sequentially, they would have to be read (or written) on successive revolutions of the drive spindle.  Instead, the sector-numbers were staggered. The distance from sector zero to sector one is called the interleaving-ratio.   For example, an interleaving-ratio of 5:1, on a track with 19 (decimal) = 23 (octal) has the sectors numbered as follows:

00 04 10 14 20 01 05 11 15 21 02 06 12 16 22 03 07 13 17 (octal).

This is the 04 = 01 / 05 row of the multiplication table of the GF-23 (octal).  It takes the interleaving-ratio -- in this case 5 -- revolutions of the drive-spindle to read (or write) each track.

When a faster disk-controller (on a faster computer) became available, I was requested to re-format some disk drives to a 3:1 interleaving-ratio.  We wrote a very large file to a disk-drive, which had the old 5:1 interleaving-ratio.  We timed the reading of that file as m seconds.

We had an old disk-controller (on an old computer), for which we had a low-level formatting-driver.  We wrote a command for the 3:1 interleaving-ration formatting.

00 15 07 01 16 10 02 17 11 03 20 12 04 21 13 05 22 14 06.

This is the 15 = 01 / 03 row of the multiplication table of the GF-23 (octal).

On the new computer, with its new disk-controller, we wrote the same very large file. We timed the reading of that file.  It took (3 m) seconds, instead of (3 m / 5) seconds. What went wrong?  It should be noted that it is impossible to ascertain the interleaving-ratio on a disk-drive.  All that one can do, is format it to a specified interleaving-ratio. Then, time how long a read of a known file takes.  From this time, one can infer the actual interleaving-ratio.

Our first presumption was that we had made a mistake in entering the sector numbering table. So, we repeated the formatting.  We wrote new data. It still took (3 m) seconds to read the file.

Well, that did not work. Next, we systematically wrote each of the 18 (decimal) = 22 (octal) interleaving-ratios.  We saved each command -- including the sector numbering table -- to a floppy diskette, for future use.  Each time writing the file and timing its reading. It took about two days to do this experiment.

We sorted the data in increasing order of the read time.  We drew a graph, with the read-time as the ordinate. We noted the interleave-ratio that we had entered into the sector-table as the abscissa.  With the exception of one point, the data was a straight-line through the origin. The numbers along the abscissa were

xx yy zz 17 01 06 13 20 02 07 14 21 03 10 15 22 04 11 16 05+23=30 (12),

where the xx represents the origin -- no point here. There were no points as yy and zz.  This is the 05 row of the GF-19 (decimal) = GF-23 (octal) multiplication table.  The conclusion is that this disk-controller has a built-in (hardware) interleaving-ratio of 5:1.  Then the resulting interleaving-ratio on each track becomes the product of the built-in (5:1) ratio and that entered into the sector numbering table.  By the way, the 2:1 interleave-ratio about half the time would slip to the next rotation of the spindle-axis.  Thus, to achieve an interleaving-ratio of 3:1, one has to enter a table for (3 / 5):1 = (3 * 4) : 1 = 14 (Octal) : 1.  Of course that table is the row for 1/14 (octal) = 10 (octal), which is

00 10 20 05 15 02 12 22 07 17 04 14 01 11 21 06 16 03 13.

Now, it does read at (3 m / 5) seconds. Mystery solved!  Indeed, it is obvious that the controller has to have a timer which causes the controller to write the format in its native interleaving-ratio.  Only the formatting is under the control of the timer. The read or write operations are under the control of the sector-number written in the header of each sector, on the disk, and of the availability of a sector-buffer. The later depends upon the I/O speed of the disk-controller and of the host computer, to fill or empty the buffer, respectively.  As a check, we read these on the slower (5:1 interleave-ratio) computer/controller. Results as expected.  Eventually, even faster computer/controllers became available: 2:1 then 1:1. Again, results as expected.  For about a year, we made a little money recycling the disk-drives by re-formatting them into the newer interleaving-ratios.

Morale of this story: You need to know your Galois fields when you issue interleaving commands. :-)

Galois fields underlie the usual ECC (=Error Correction Code) employed on magnetic disk-drives.  However, we have to describe a disk-drive in greater detail, to provide the context.

Disk Drive

Here is all that you ever wanted to know about disk-drives, but were afraid to ask!

Format

There are two distinct concepts confounded under the single name "format":  1) The low-level -- aka (=also known as) hardware -- format and 2) the high-level -- aka software -- format.

low-level format

The low-level format writes the specified sector information on a given track.  Each of these sectors consists of a large gap, the header information, a small gap, and the body, which is filled with the specified data-block.  Each of the header and the body has a run-in and a run-out for synchronization.  The header includes the sector number and cylinder number.  The body is where all subsequent high-level read and write will be performed.  How many sectors will fit within a track depends on the specified size of each sector.  Ordinarily, there will be an extra-large gap following the last sector on each track.  You may specify any desired interleaving of the sectors.  The optimum interleaving ratio depends upon the relative speed of the computer with respect to that of the disk-drive.  With the present fast computers, a 1:1 interleaving ratio has become standard.  Any further discussion of interleaving will lead us to Galois field theory -- see the main body of this page. 

The act of performing a low-level format overwrites everything on a given track, except that whatever had been located where the current gaps occure will survive.

high-level format

The high-level format constructs the file system.  For instance, under Microsoft Windows, we have the choice of FAT16, FAT32, NTFS for NT, or NTFS for Windows 2000.

The act of performing a high-level format only destroys any pre-existing high-level format; but, does not over-write any appreciable portion of any pre-existing data.  Any pre-existing data becomes inaccessible to the Operating System.  However, for instance, Norton has an "un-format" utility, which will recover accessibility to most of the data that had been on the disk before the high-level format was performed, provided that no other commands had been executed in the interim.

Available commands

The Windows systems include the f-disk, format, and fast-format commands.  What do these accomplish?  It depends upon their target.

If the target is a floppy diskette:  The format command performs the low-level format first, then the high-level.  The fast-format command performs only the high-level format.  As its name implies, it is much faster; because, it writes much less.  Of course, it is applicable only if a low-lever format had been performed upon this diskette.  The f-disk command in inapplicable to a floppy diskette.

If the target is a hard-drive:  The f-disk command performs a high-level partition of the disk.  The format command performs a high-level format -- in essence, it corresponds to the fast-format of a diskette.  Obviously, there is no fast-format command for a hard-drive; because, this is what the format command performs.  Currently, Microsoft does not provide any command that would perform a low-level format upon a hard-drive.

SCSI hard-drives are designed to be low-level formatted by the user.  Usually, a SCSI hard-drive is shipped low-level formatted for a 512-byte sector.  However, a SCSI hard-drive may be formatted for any even sector-size in the closed interval of [40 bytes, 64 Kbytes].  Since the IBM AS/400 operating-system requires a sector of 520 bytes, one has to perform a low-level format.  If your operating-system can support such a sector-size; you will gain a space and time efficiency by changing the sector-size so that the cluster would be comprised of a single sector.  For large files (e.g., multimedia), a cluster-size such that a track would be comprised of a single cluster provides the greatest efficiency.  The utilities for performing a low-level format are marketed only to device-driver developers.  Of course, you may write your own.

IDE (and related) hard-drives are not designed to be low-level formatted by the user.  While utilities are available to low-level format an IDE hard-drive, use such utilities with great caution.  Hard-drives often have part of their device-driver stored in a hidden -- from the high-level format -- place on the hard-drive.  A low-level format may over-write this device-driver.  In such a case, the hard-drive becomes inoperable.  You will have to send it back to the manufacturer, to have the device-driver written back to the hard-drive.  (On a SCSI hard-drive, the low-level format command is safe to employ.)

Security

A high-level format does not provide any appreciable security.  A low-level format makes recovery of the data impossible for the ordinary user.  However, only either performing an adequate bulk-erase, heating above the Currie temperature, or physically destroying of the platters will obliterate the data.  It is obvious that the third option destroys the hard-drive.  Most hard-drives have servo information written to one of the surfaces.  Since either the first or the second option will clear this servo information, the hard-drive will become inoperable -- effectively destroyed.  Alternatively, never write any sensitive data, without first encrypting it; provided, of course, that you trust the encryption to be sufficiently secure.

Economics

An old hard-drive is old and thus has a reduced life-expectancy.  It is slower, smaller, and consumes more power than a new hard-drive would.  With the current low price of a new hard-drive, just destroy the old one; if you are concerned about the data on it.

RLL

 A magnetic-coil disk writing head writes a magnetic field proportional to the current through the coil.  However, it reads an electric potential proportional to the time rate of change of the magnetic field, as the disk surface passes underneath the head.  Thus, the read produces an electric pulse at each transition of the original signal.

The easiest is to write a pair of pulses -- called bipolar -- for each data bit.

data      write            read

0          10            0

1          01            1

Two bits are written for each one data bit.  The efficiency is 1/2 = 12/24.  Since this is the most reliable scheme, it is the one that is employed for writing the header block.

The read process drifts both in time and in level.  There is a severe limitation as to how many successive zero or one can be read reliably.  Here, a worst case is that of a data 01.

data                  write                 read

01                    1001                01

10                    0110                10

Here there is a RLL (=Rull Length Limited) of two.

The efficiency may be improved by coding pairs of data bits into triples, as follows

data                  write

00                    010

01                    011

10                    100

11                    101

The a worst cases is

data                  write

0110                011100

The RLL=3 and efficiency=2/3=16/24.

As the size of the table is increased, the efficiency increases at a slight cost of a corresponding increases in the RLL.  Other desiderata are that each written tuple have neither a large excess of zeros over ones nor ones over zeros.  The disk drive manufacturers do not disclose the actual RLL and patterns that are employed for writing the data block.  For practical reasons, the RLL table cannot be larger than 256 bytes.  With this limitation, the highest RLL efficiency that might be possible is 7/8 = 21/24.  The RLL encoding is very vulnerable to a known-input cryptanalysis, though.  Ordinarily, the hardware of the disk drive performs the coding and the decoding.  However, the disk drive may be commanded to not decode during the read.

The new technology employs a magneto-resistive read head.  It reads the actual magnetic field.  Now, the data is written directly; then, read back directly, as well.  No more coding.  The efficiency becomes 1/1=24/24.  This transition from employing a magneto-induction to a magneto-resistive read-head has occurred about when the hard-disks reached a size of 5 GBytes.

ECC

The data block is written with an ECC (=Error Correcting Code), which also has the CRC (=Cyclic Redundancy Code) capability.  The quest to write at ever greater data density upon the disk surface results in an inevitable occurrence of errors during the reading of the data.  To improve the quality of the data that is presented to the user, the written data is augmented with an ECC.  Then, during the read operation, if the data fails the CRC check; an attempt is made to correct the error by means of the ECC.

Desiderata for the ECC are:  fast and easy to write, fast and easy to check during reading, moderately fast correction and verification in the event of a failure of the aforementioned check.  Also, the ECC should add a relatively small amount of extra data that needs to be written.  The extent of the errors that can be detected or corrected should be as large as possible.  There are various ECC or CRC schemes possible.

A scheme based upon Galois polynomials was first implemented, as a cryptographic method, employing a cyclic shift register.  However, it soon became evident that anything based upon Algebra is too systematic to provide sufficient obfuscation of the data to be cryptographically effective.  This scheme has become popular, though, for ECC in data recording.  Since the selection of a good polynomial is something of a black-art, there are only a few popular polynomials in use.

Each sector is self-contained.  The header block is written with only a CRC and the more reliable bipolar.  The data block is written with an ECC.  Ordinarily, an ECC can check more error bits than it can correct.  During the read operation, the hardware checks, and if necessary, corrects any errors.  If unsuccessful, the hardware makes perhaps a dozen attempts at re-reading, before it reports a failure to the operating system.  The operating system then attempts perhaps a dozen reads (12*12=144 total attempts), before it acknowledges a failure to the user.  Optionally, the hardware may be instructed to perform a raw-read, so that the error checking or recovering may be accomplished in the software.

Now, the user may employ some data-recovery software, which will issue tens, hundreds (100*144=14400 total) mayhap, such read requests to the operating system.  The chance of a lucky good read is slim, after the failure of the original 144 attempts.

In the good old days, word processors actually wrote in ASCII characters. 

A faster, and certainly more reliable, strategy at error-recovery is to run the decrypting algorithm, upon the raw-read, in a backward direction in addition to the traditional forward direction.  Then, to splice these two results.  There would be obvious clear-text at both ends, with just a few characters garbled in the middle.  From a knowledge of the English (or other) language, make a good guess.  Now, perform a check employing the CRC capability.  Viola!  With this strategy, one increases the error correction capability to the error detection capability, which usually is strictly larger than the rated error correction capability.

the components

Writing, reading, and error-checking requires these components:

  • an n-stage shift-register, with appropriate tap-points, which are a shared-secret.  Each cell of the shift-register is a base-p, greater than one,  integer.  Thus, a shift in the forward direction is a multiplication of each cell by p; rearward ... a division by p.  The tap-points may be implemented in either of two equivalent ways.  In each case, the multiple is the coefficient of the polynomial, while the distance between the tap-point and whichever end of the shift-register to which it is connected is the degree of the term of the polynomial.  Since the addition is performed modulo p, there never is a carry.  Of course, when we are working backwards, the operation becomes a subtraction.  If p=2; either addition or subtraction becomes the bit-wise xor operation.  Considered as a whole, the shift-register performs a multiplication by the polynomial in the forward direction or division by the polynomial in the rearward.  Since the output of the shift-register is connected to its input, these multiplications or divisions are performed modulo the polynomial.  Thus, we have implemented a Galois field p^m.  There is no direct access to the contents of the shift-register -- one may neither read nor write.  The only direct operation is that of a reset to all zero.  As already mentioned, these tap-points may be implemented in either of these ways:

    • each tap takes the input of the shift-register and adds a multiple at the tap-point

    • a multiple of the output from each tap-point is added to the output of the shift-register.

  • a shared-secret n-digit string, for initialization of the shift-register.  There is a cult as to the relative merits of various alternative such strings.  Actually, any string will do; because of its vulnerability to cryptanalysis -- it may be ascertained by performing a subtraction, modulo p (if p=2, this becomes a bit-wise xor-operation.), of the first n digits of any known input (plain-text) message from its encrypted output..  Just be sure to employ the same initialization-string for reading as had been employed for writing!

  • an xor (= exclusive-or) gate.  In general, we are performing addition (or subtraction) modulo p.

  • the plain-text string (typically 512 bytes = 4096 bits), which is to be encrypted.

  • some storage or transmission medium where the encrypted string (4096+n bits) is to be stored.

  • a plain-text sink (4096+n bits) for the decrypted output, of which the last n bits constitute the syndrome.

 The two pairs of algorithms are:

  • encrypted -- employed by the drone-aircraft (early 1960s) and the disk i/o of the Apple II (early 1980s)

    • write

      • reset the shift-register to all-zero.

      • connect the output of the shift-register to one of the inputs of the xor-gate.

      • connect the output of the xor-gate to the input of the shift-register.

      • feed the n-bit initialization string into the second input of the xor-gate.

      • connect the output of the xor-gate to the storage/transmission medium.

      • feed the plain-text string into the second input of the xor-gate.

      • feed a string of n zero-bits into the second input of the xor-gate.

    • read

      • reset the shift-register to all-zero.

      • connect the output of the shift-register to one of the inputs of the xor-gate.

      • connect the output of the storage/transmission medium to the input of the shirt-register.

      • feed the n-bit initialization string into the second input of the xor-gate.

      • connect the output of the xor-gate to the storage/transmission medium.

      • connect the output of the xor-gate to the plain-text sink.

      • feed the output of the storage/transmission medium to the second input of the xor-gate.  the last n-bits comprise the syndrome, which is not part of the plain-text output.

  • plain-text -- employed by the hard-disk drives on the Data General main-frame (early 1990s)

    • write

      • reset the shift-register to all-zero.

      • connect the output of the shift-register to one of the inputs of the xor-gate.

      • connect the output of the xor-gate to the input of the shift-register.

      • feed the n-bit initialization string into the second input of the xor-gate.

      • connect the plain-text to the storage/transmission medium.

      • feed the plain-text string into the second input of the xor-gate.

      • switch the input to the storage/transmission medium to be the output of the xor-gate, for these n bits.

    • read

      • reset the shift-register to all-zero.

      • connect the output of the shift-register to one of the inputs of the xor-gate.

      • connect the output of the xor-gate to the input of the shirt-register.

      • feed the n-bit initialization string into the second input of the xor-gate.

      • connect the output of the xor-gate to the storage/transmission medium.  [Alternatively, connect the plain-text to the storage/transmission medium.  You will be sending plain-text.]

      • connect the output of the xor-gate to the plain-text sink.

      • feed the output of the storage/transmission medium to the second input of the xor-gate.  the last n-bits comprise the syndrome, which is not part of the plain-text output.

the CRC  this syndrome is the CRC error:  if it is not all-zero; it means that there is a detected error.  Then, we perform the ECC error-correction by working the syndrome back to the correct position and subtracting it from the output (clear-text) string.

cryptanalysis

Since for each flight of the drone aircraft, the tap-points of the shift-register were set by the user, the cryptographic system was reasonably secure -- we considered that after a week, we did not care.

Galois fields exist for each p^n, where p is a prime and n is a natural number.  For m>1, we have
p^m = {4, 8, 9, 16, 25, 27, 32, 49, 64, 81, 121, 125, 128, 169, 243, 256, 289, 343, 361, 512, 529, 625, 729, 841, 961, 1024, ....}.

With today's computers and the advances in cryptanalysis techniques, this system is not considered secure.  The original design called for a 64-bit shift-register for the ECC.  That would have been a disaster for the spread-spectrum, which was based upon a hierarchical binary multiplexing.  We went down to 63-bits -- better spread-spectrum but no ECC.  49 bits would have been too short.  We should have gone up to 81 bits = 3^4.  However, anything other than a power of two would have been beyond the computational capability of the computer on the drone.  :-)

However, if both the n-bit initialization string the the set of tap-points on the shift-register are invariant; this cryptographic system probably is vulnerable to a known-text attack.  I have been successful with an attack in which I could formulate the plain-text input and observe the encrypted-output.

  1. an all-zero input, whose length is at least that of the shift-register.  record the encrypted output.

  2. an input consisting of the previous output, a one-bit, and followed by all-zero.  the encrypted-output is the structure of the shift-register.

  3. construct a software simulation of the shift-register:  one for forward coding, another for backward coding.

  4. run the output of step one backwards through the software shift-register of step three, to obtain the initialization-string.

  5. now that you have everything in your software, compare to verify that you have succeeded in your cryptanalysis.

  6. perform a cryptanalysis upon the RLL encoding.

I was successful with the Apple II in the cryptanalysis of both the ECC and the RLL.

Since the Data General mainframe disk-drives do not display the encrypted string, cryptanalysis is not applicable.  The disk manufacturer provided us their initialization string.  Further, they said that they were employing various polynomials of low weight; hence, that we would have to ascertain ourselves the actual polynomial for the particular disk..  From its length of 32 bits, follows that one of the corresponding irreducible polynomials had to have been employed by the disk-drive.  We wrote a sector with all zero bytes.  Using a disk-editor, we changed the last bit to a one, without correcting the syndrome.  We performed a raw-read.  The resulting syndrome is the polynomial.  We coded that polynomial and, as a check, compared the syndrome in several known situations.  Now, we were in a position to attempt the data-recovery.  Thus, we read the sector(s), that failed the ECC, in both directions.  As described previously, from a knowledge of English, we reconstructed each of the corrupted sectors.  We did not investigate the RLL encoding.

Observe that one does not need a knowledge of Galois polynomials to perform this cryptanalysis.  Neither for the encryption, decryption, the syndrome, nor the CRC.  Only the following require a knowledge of Galois polynomials:

  • the choice of the p and m in the Galois field p^m.

  • the choice of a suitable polynomial.

  • the computation of the maximum error-detection CRC capability and the maximum error-correction ECC capability.

  • the position at which to subtract the syndrome from the output (clear-text), when performing the ECC correction of a detected burst error.

Actually, even the ECC correction can be achieved without a knowledge of Galois fields.  Define the weight of an integer as the sum of its digits.  The maximum-likelihood algorithm for ECC correction may be performed as follows:

  1. once an error has been detected by the presence of a non-zero syndrome compute its weight at this position.

  2. by feeding it a string of zeros, back up this syndrome one position at a time and compute its weight at that position.

  3. observe the positions at which the weight is least.

  4. if more than one position has the least value for the weight; declare a decoding failure and stop.

  5. else, at the unique position with the least value for the weight, subtract the syndrome from the input (crypt-text) text.

  6. decode the resulting crypt-text and stop successfully.

The accuracy of the corrected output is dubious if the weight of the correction exceeds the maximum of which the polynomial is capable.  Furthermore, there are faster methods of performing the ECC correction; but, they require a study of Galois fields.  As is true of any purported algorithm, the foregoing ones may contain minor (hopefully, not major) omissions or errors.  It would be interesting to write a computer program which implements the foregoing.  Mayhap, someday?  :-)  Only a successful computer program validates an algorithm.

[This line intentionally left blank.]

There is so much more that one could say regarding disk-drives.  Where to stop?  Enough, already! 

Now, for some serious study of Galois fields.  :-)

We begin by providing a link to my very brief exposition of Algebra.

 

 

Copyright (c) 2001, 3 by R.I. 'Scibor-Marchocki.  Modified on Friday 25-th July 2003.  Webmaster@rism.com