Using PE32 to Create an Index for Microsoft Word (and Other) Documents

This web site is being maintained by John R. Barnes, who was the President and Chief Engineer of dBi Corporation from 2002 to September 30, 2013, when we closed because ObamaCrap made it too expensive for us to remain in business.

John R. Barnes KS4GL, PE, NCE, NCT, ESDC Eng, ESDC Tech, PSE, Master EMC Design Engineer, SM IEEE
December 24, 2004
jrbarnes@iglou.com

In my opinion, many technical books, manuals, standards, and other long documents could be greatly improved as reference works by including a high-quality index. How many times have you found yourself reading through a document page-by-page, looking for some vital information? When none of the obvious search terms appear in the index, we may be able to narrow down our search based on the chapter titles. Or we may just have to start reading from the beginning, looking for the nugget of information that we need to solve a problem. In either case I get irked, because the author(s) traded off a one-time investment in time and effort--to create a decent index--against a lot of time and trouble for all of us readers.

Having written three technical books (Electronic System Design: Interference and Noise Control Techniques, and Robust Electronic Design Reference Book, Volumes 1 and 2), I know some of the obstacles faced by authors:

But preparing a high-quality index offers a number of benefits to us authors:

When creating an index for a book or other document, our main goals are to:
  1. Extract the index terms (key words, key phrases, key ideas, names) from the document.
  2. Organize them so that a reader can find any desired index term quickly.
  3. List all the pages where each index term is discussed in the document.
The procedure described below is based on the method that I evolved for indexing Robust Electronic Design Reference Book, Volumes 1 and 2. In bits and pieces, I spent about one month preparing the 106-page index covering both volumes in Volume 1, and the 16-page index covering just the Appendices in Volume 2. I wound up with 9,200+ index terms and 25,600+ page references covering:

In the process I found and fixed some 300 errors in grammar and punctuation, before submitting the final manuscript to Kluwer (now part of Springer). Kluwer's copyeditor(s) found another 10 or so mistakes, which we fixed before the books were printed. And since March 24, 2004 only 14 minor errors have been found and reported in the two volumes.

Stages A and P in the procedure below use Microsoft Word, because that is what I used to write the Robust Electronic Design Reference Book. This indexing procedure should work with other word-processing software too, as long as you have a way to:

For indexing my three books, I used the PE2 editor. I have been using PE/PE2 since about 1984, shortly after IBM came out with the original IBM PC (personal computer). PE2 is one of the most versatile programs--and certainly the most bug-free program--that I have ever used. Since PE2 is no longer commercially available, I modified the procedure and created a new profile file so that the PE32 Text Editor could be used instead. PE32 is a worthy successor to PE2, and may be purchased for $39 from http://www.pe32.com/.

As a test of this modified procedure, I re-indexed chapter 17 of Robust Electronic Design Reference Book, Volume 1. This chapter is 16 pages long, and is pretty representative of the book as a whole. The entire process took me 2 hours and 15 minutes to generate INDEX.DOC with 370+ page references under 220+ index terms--about 7 pages per hour, including the time to tweak the instructions.

PREPARATIONS

We will need:

If you happen to have a copy of PE2, you can use PE2 instead of PE32, using PE2.PRO (save this file in your root directory) instead of PE32.PRO. Depending on your operating system, you may also need to copy PE2.PRO to the subdirectory where you will create the index, in addition to the root directory. Hitting <F1> when in PE2 will bring up the PE32JRB.HLP file. I have redefined many of the keys so that the key strokes are compatible between Microsoft Word, PE2, and PE32, as well as taking care of many of the indexing tasks.

Copy FILLER.TXT from the root directory to the subdirectory where you will create the index. Use PE32 to look through FILLER.TXT. Delete any lines that contain words that might be key words or parts of key phrases in your document. Modify FILLER.TXT as desired to match your subjects and style of writing. The format of the lines in FILLER.TXT is
"c /<char_before><filler_word><char_after>/<char_before><char_after>/*p"
This tells PE32 to delete <filler_word> anywhere it appears between <char_before> and <char_after>, without deleting <filler_word> when it accidentally appears inside another word.
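To make the effect of one FILLER.TXT line concrete, here is a minimal Python sketch. Python is not part of this procedure, and `parse_filler_line` and `apply_filler` are hypothetical helper names. The sketch assumes the line has exactly the layout shown above, with "/" as the delimiter, and it only imitates the described result (the filler word disappears, the characters on either side stay); it does not claim to reproduce how PE32 executes the command.

```python
def parse_filler_line(line):
    """Split a FILLER.TXT-style line into (search, replacement).

    Assumes the layout  c /search/replacement/*p  and that neither
    string contains a '/' character.
    """
    parts = line.split("/")
    if len(parts) != 4 or parts[0].strip() != "c":
        raise ValueError("not a filler line: " + line)
    return parts[1], parts[2]


def apply_filler(text, search, replacement):
    """Replace every non-overlapping occurrence in a single pass."""
    return text.replace(search, replacement)


# Remove "the" only when it stands between two blanks.
search, replacement = parse_filler_line("c / the /  /*p")
sample = " Connect the crowbar to the other thermistor "
cleaned = apply_filler(sample, search, replacement)
print(repr(cleaned))
```

In this made-up sample line, "other" and "thermistor" survive because the letters t-h-e inside them are not surrounded by blanks.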

PROCEDURE

The overall process has 16 stages:
  A. Starting with <FILENAME>.DOC, create <FILENAME>.TXT with all the text, tables, and captions as pure ASCII text with linefeeds.
  B. Put blanks at the beginning and end of each line in <FILENAME>.TXT.
  C. Add a line with the page number before each page of text in <FILENAME>.TXT.
  D. Tack FILLER.TXT onto the end of <FILENAME>.TXT.
  E. Delete common "filler" text (words and phrases that aren't key ideas) from <FILENAME>.TXT.
  F. Check <FILENAME>.TXT for blunders--punctuation errors, incorrect words, etc.
  G. Pick out the key words and key phrases.
  H. Add the page number to each index term (don't worry about how many times an index term appears on a page at this point).
  I. Sort the index terms alphabetically and by page number.
  J. Sanity check the index terms.
  K. Concatenate all of the index files we have prepared for the .DOC files making up our document.
  L. Sort all of our index terms alphabetically and by page number.
  M. Sanity check the complete collection of index terms.
  N. Combine multiple lines with the same index term into one line.
  O. Clean up the index.
  P. Create INDEX.DOC.
Stage A: Starting with <FILENAME>.DOC, create <FILENAME>.TXT with all the text, tables, and captions as pure ASCII text with linefeeds:
  1. Go to Microsoft Word.
  2. Open <FILENAME>.DOC.
  3. Click on File, Save As.
  4. Set "Save as type" to "MS-DOS Text with Line Breaks (*.txt)".
  5. Click on Save.
  6. Click on X to exit the .DOC file without changing it.
Stage B: Put blanks at the beginning and end of each line in <FILENAME>.TXT:
  1. Open an MS-DOS window.
  2. Type PE32 <FILENAME>.TXT.
  3. Hit <Ctrl-Home> to move to the top of the file.
  4. Hold down <Ctrl-F8> until we reach the end of the file to add leading and trailing blanks to each line.
  5. Hit <F2><Enter>Y to save the file.
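The result of Stage B can be pictured with a short Python sketch. It assumes that the <Ctrl-F8> key adds exactly one blank at each end of a line, and `pad_lines` is a hypothetical name. The likely reason for the padding, going by the FILLER.TXT format described earlier, is that every word then has a character before and after it, even at the start or end of a line.

```python
def pad_lines(lines):
    """Return each line with one blank added in front and one at the end."""
    return [" " + line.rstrip("\r\n") + " " for line in lines]


raw = ["The crowbar fires when", "the output exceeds the"]
padded = pad_lines(raw)
for line in padded:
    print(repr(line))

# With the padding, a blank-delimited search such as " the " can match a
# word that starts or ends a line.
print(" the " in raw[1], " the " in padded[1])
```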
Stage C: Add a line with the page number before each page of text in <FILENAME>.TXT:
  1. Using our printed copy as a reference, move the cursor to the first line of the first page of the document.
  2. Hit <Home><Alt-S> to insert a blank line.
  3. Type in ", <page_number>", using leading zeros so that the pages will be in order when we sort the index terms.
  4. Hit <F2><Enter>Y to save the file.
  5. Move the cursor to the first line of the next page.
  6. Repeat steps 2 through 5 until we reach the end of the file, or the end of the text that we want to index.
  7. Delete any lines that we don't want to index by:
    1. Move the cursor to the top line.
    2. Hit <Alt-L>.
    3. Move the cursor to the bottom line.
    4. Hit <Alt-L><Alt-D>.
    5. After verifying that we haven't deleted too much, hit <F2><Enter>Y.
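The leading zeros in step 3 matter because the later sort compares text character by character, not numerically. A minimal Python sketch follows; the "chapter-page" form is taken from the "17-04" example later in this procedure, while `page_marker` and its `width` parameter are hypothetical.

```python
def page_marker(chapter, page, width=2):
    """Build a marker line such as ', 17-04'."""
    return f", {chapter}-{page:0{width}d}"


# Without leading zeros, page 10 sorts ahead of page 4.
unpadded = sorted(["17-4", "17-10", "17-9"])
# With leading zeros, text order and page order agree.
padded = sorted(page_marker(17, p)[2:] for p in (4, 10, 9))
print(unpadded)
print(padded)
```

Choose the width to fit the longest page number you expect; a chapter with 100 or more pages would need three digits.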
Stage D: Tack FILLER.TXT onto the end of <FILENAME>.TXT:
  1. Hit <Ctrl-End> to go to the end of the file.
  2. Hit <Esc>, type "E FILLER.TXT", hit <Enter> to open the "filler" file.
  3. Hit <Ctrl-A><F8><Alt-Z> to copy FILLER.TXT to the end of <FILENAME>.TXT.
  4. Hit <F8><Alt-U><F4> to exit from FILLER.TXT without changing it.
  5. Hit <F2><Enter>Y to save the file.
Stage E: Delete common "filler" text (words and phrases that aren't key ideas) from <FILENAME>.TXT:
  1. Move the cursor to the first line from FILLER.TXT.
  2. Hit <Ctrl-F9> to delete the word/phrase from the entire file--this also mangles the current line from FILLER.TXT.
  3. Hold <Ctrl-F9> down, or keep hitting it, until we reach the end of <FILENAME>.TXT.
  4. Hit <Alt-L>, move the cursor up to the top of the mangled "c" commands (check that all the commands have been executed), then hit <Alt-L><Alt-D>.
  5. Hit <F2><Enter>Y to save the file.
Stage F: Check <FILENAME>.TXT for blunders--punctuation errors, incorrect words, etc. In the previous stage we deleted many "filler" words without affecting punctuation. Punctuation and some grammatical errors will now jump out at us, making them easy to fix. Read through the file from top to bottom, looking for:

If any blunders are serious, and likely to move text between pages, we may want to go back to Microsoft Word to fix the document, reprint it, and go through this whole process again before proceeding to the next stage, which is the most tedious part of indexing.

Stage G: Pick out the key words and key phrases:

  1. Type "E FILLER.TXT"<Enter> on the command line to open FILLER.TXT again.
  2. Hit <F8> to toggle between <FILENAME>.TXT and FILLER.TXT.
  3. Leaving the lines with the page numbers alone for the time being, delete everything except for the key words and key phrases that we want in the index (putting one term per line, starting in column 1, editing them for clarity if necessary), using:
  4. If we find additional common "filler" words, add them to FILLER.TXT so that we can delete them automatically in later documents, by:
    1. Marking the word by moving the cursor to the character before it, hitting <Alt-C>, moving to the character after it, and hitting <Alt-C> again; or sweeping over the word and these two characters with the left mouse button held down.
    2. Hitting <F8> to toggle to FILLER.TXT.
    3. Moving to the line before the point where we'd like to add the "filler" word/phrase, and hitting <End><Enter>; OR moving to the line where we would like to add the "filler" word/phrase, and hitting <Home><Alt-S>.
    4. Typing "c /<Alt-Z>/<char_before><char_after>/*p"
    5. Hitting <F2><Enter>Y to save the modified FILLER.TXT.
    6. Hitting <F8> to return to <FILENAME>.TXT.
  5. Occasionally hit <F2><Enter>Y<Esc> to save <FILENAME>.TXT.
  6. Hit <F3><Enter>Y<F3><Enter>Y to save both files and exit PE32 temporarily--time for a break!
Stage H: Add the page number to each index term (don't worry about how many times an index term appears on a page at this point):
  1. Type "PE32 <FILENAME>.TXT"<Enter> to open the file again.
  2. Hit <Ctrl-Home> to move to the top of the file.
  3. Hit <Ctrl-F10> to mark (grab) a page number.
  4. Hit <Ctrl-F12> repeatedly to copy the page number onto all the index terms found on that page.
  5. Hit <F2><Enter>Y<Esc> to save the modified file occasionally.
  6. Hit <F3><Enter>Y to save the file and exit PE32, when done.
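What Stage H produces can be sketched in Python as follows. This is an illustration of the end result only, not of the PE32 key macros; `attach_page_numbers` is a hypothetical name, the sketch assumes every page-marker line starts with ", " as typed in Stage C, and "Fuses" is a made-up sample term.

```python
def attach_page_numbers(lines):
    """Append the most recent page marker to every index term below it."""
    current = None
    out = []
    for line in lines:
        text = line.rstrip()
        if not text:
            continue                     # skip blank lines
        if text.startswith(", "):
            current = text               # a page-marker line such as ", 17-04"
            out.append(text)             # keep it; Stage J checks these later
        elif current is None:
            raise ValueError("index term before any page marker: " + text)
        else:
            out.append(text + current)   # "Crowbars" + ", 17-04"
    return out


terms = attach_page_numbers([", 17-04", "Crowbars", "Fuses", ", 17-05", "Crowbars"])
for line in terms:
    print(line)
```

The marker lines are kept in the output because Stage J uses them to verify that every expected page was indexed before deleting them.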
Stage I: Sort the index terms alphabetically and by page number. At the MS-DOS prompt, type
"SORT <<FILENAME>.TXT ><FILENAME>.ORD"<Enter>
For chapter 17, this was "SORT <REDRB17.TXT >REDRB17.ORD".

Stage J: Sanity check the index terms:

  1. Type "PE32 <FILENAME>.ORD"<Enter> to open the sorted file.
  2. Verify that we have lines with the page numbers for all the pages that we expected to index in this document--in case of an error, edit <FILENAME>.TXT to fix the error(s), then start over again at stage I.
  3. Move to the first line of page numbers, hit <Alt-L>, move to the last line of page numbers, and hit <Alt-L><Alt-D> to delete the lines with just page numbers.
  4. Look for similar index terms that we spelled differently (singular on some pages and plural elsewhere; spelled-out on some pages and abbreviated elsewhere) to decide how we want to index them. For example I decided to index mixed terms as the plural, and to use common abbreviations in the index, adding a line
    "<spelled-out-term>, see <abbreviation>" to <FILENAME>.ORD.
  5. Look for similar index terms that we wrote as separate words in some places, hyphenated in other places, and may have merged into single words in yet other places, or may have capitalized different ways (E mail, E-mail, e-mail, Email, and email for example). Decide on our preferred form, and index them that way. Write these down on a list, and fix them next time we edit <FILENAME>.DOC. (When I'm reading something, and I run across a term that is *almost* the same as one that the author used elsewhere, I stop to check to see if the terms really are the same, or whether they express different ideas.)
  6. Delete duplicate lines from the file (the index term appeared several times on one page)--move to the extra line, and hit <Ctrl-Backspace>.
  7. Hit <F3><Enter>Y to save the file and exit PE32, when done.
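The mechanical parts of Stages I and J (sorting, setting aside the page-number lines, and dropping exact duplicates) can be sketched in Python. The sketch uses Python's plain character-code ordering, which is an assumption and may not match the collating order of the MS-DOS SORT command on your system; `sort_and_clean` is a hypothetical name and "Fuses" is a made-up sample term. The judgment calls in steps 4 and 5 still have to be made by eye.

```python
def sort_and_clean(lines):
    """Sort the lines, then separate page markers and drop duplicate terms."""
    ordered = sorted(lines)
    pages = [line for line in ordered if line.startswith(", ")]
    terms = []
    for line in ordered:
        if line.startswith(", "):
            continue                     # page-number line, reported separately
        if terms and terms[-1] == line:
            continue                     # same term on the same page again
        terms.append(line)
    return pages, terms


pages, terms = sort_and_clean([
    "Fuses, 17-04", ", 17-04", "Crowbars, 17-05",
    "Crowbars, 17-04", ", 17-05", "Crowbars, 17-04",
])
print(pages)   # compare against the pages we expected to index
print(terms)
```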
Stage K: Concatenate all of the .ORD files we have prepared for the .DOC files making up our document. At the MS-DOS prompt, type
"COPY <INDEX1>.ORD+<INDEX2>.ORD+...+<INDEXN>.ORD ALL."<Enter>
For example "COPY CH1.ORD+CH2.ORD+CH3.ORD+CH4.ORD ALL."

Stage L: Sort all of our index terms alphabetically and by page number. At the MS-DOS prompt, type
"SORT <ALL. >ALL.ORD"<Enter>.

Stage M: Sanity check the complete collection of index terms. This is similar to what we did in Stage J, except we open ALL.ORD instead of <FILENAME>.ORD. Hit <F2><Enter>Y to save the changes as we go along. Hit <F3><Enter>Y when done.

Stage N: Combine multiple lines with the same index term into one line:

  1. If we aren't already editing the sanity-checked file, at the MS-DOS prompt type "PE32 ALL.ORD"<Enter> to open the file, or "PE32 ALL.OR"<Enter> to resume work.
  2. Type "NAME ALL.OR"<Enter> to avoid messing up the sanity-checked file.
  3. Move the cursor to the top line of an index term with two-or-more lines.
  4. Hit <Ctrl-Q>, then hit <Ctrl-F5> as many times as needed to get the next page number onto the current line--repeat as needed to get all the page numbers for the index term onto the one line.
  5. Delete the leading zeros on the page numbers.
  6. If an index topic has three or more consecutive pages, replace all but the first and last in the group with a hyphen or the word "to"; thus "Crowbars, 17-04, 17-05, 17-06" would become "Crowbars, 17-04 to 17-06" to compress the page numbers.
  7. Hit <F2><Enter>Y occasionally to save the changes we have made.
  8. Hit <F3><Enter>Y to save this completed pass of the file.
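Steps 3 through 6 of Stage N can be sketched in Python as below. The sketch assumes every page reference has the "chapter-page" form used in the Crowbars example, treats any other line (such as a "see" cross-reference) as something to pass through unchanged, and uses the word "to" for runs of three or more consecutive pages. The names `compress` and `combine` are hypothetical, and "Fuses" is a made-up sample term. Because the sketch deletes the leading zeros (step 5) before compressing, its output reads "17-4 to 17-6".

```python
import re
from itertools import groupby

PAGE = re.compile(r"^(\d+)-(\d+)$")      # assumed form, e.g. 17-04


def compress(pages):
    """Turn sorted (chapter, page) pairs into strings, writing runs of
    three or more consecutive pages as 'first to last'."""
    out = []
    i = 0
    while i < len(pages):
        j = i
        while (j + 1 < len(pages)
               and pages[j + 1][0] == pages[j][0]
               and pages[j + 1][1] == pages[j][1] + 1):
            j += 1
        first = "{}-{}".format(*pages[i])
        if j - i >= 2:
            out.append(first + " to " + "{}-{}".format(*pages[j]))
            i = j + 1
        else:
            out.append(first)
            i += 1
    return out


def combine(lines):
    """Merge adjacent 'term, chapter-page' lines into one line per term."""
    parsed = []
    for line in lines:
        term, _, ref = line.rpartition(", ")
        match = PAGE.match(ref)
        if term and match:
            # int() also removes the leading zeros (step 5)
            parsed.append((term, (int(match.group(1)), int(match.group(2)))))
        else:
            parsed.append((line, None))  # e.g. a "..., see ..." line
    merged = []
    for term, group in groupby(parsed, key=lambda item: item[0]):
        pages = sorted({page for _, page in group if page is not None})
        if pages:
            merged.append(term + ", " + ", ".join(compress(pages)))
        else:
            merged.append(term)
    return merged


index = combine([
    "Crowbars, 17-04", "Crowbars, 17-05", "Crowbars, 17-06", "Crowbars, 17-09",
    "electromagnetic interference, see EMI",
    "Fuses, 17-04", "Fuses, 17-05",
])
for line in index:
    print(line)
```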
Stage O: Clean up the index:
  1. At the MS-DOS prompt, type "PE32 ALL.OR"<Enter> to open the file, or "PE32 ALL."<Enter> to resume work.
  2. Type "NAME ALL."<Enter> to avoid messing up our last pass over the file.
  3. Check index terms that start with the same word to make sure that they follow standard rules for alphabetical order--comma comes after space in ASCII, so a short index term may appear after a long one in the file.
  4. To move a line, move the cursor to it and hit <Alt-L>; then move to the line just ahead of where we want it, and hit <Alt-M>.
  5. Hit <F2><Enter>Y occasionally to save the changes we have made.
  6. Look through the index terms and fix any that are incorrectly capitalized--I used all lower case except for trade names and things named after people; I put acronyms in uppercase unless the standard spelling is different.
  7. Hit <F3><Enter>Y to save this completed pass of the file--we are now ready to create the index .DOC file!
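The ordering problem in step 3 can be demonstrated with a few lines of Python. The sample terms are made up, `term_of` is a hypothetical helper, and the sketch assumes an index term never contains ", " itself. Sorting on the term alone, instead of on the whole line, puts the short term ahead of the longer one that starts with the same word.

```python
def term_of(line):
    """Return the index term: everything before the first ', '."""
    return line.split(", ", 1)[0]


lines = ["crowbars SCR, 17-5", "crowbars, 17-4 to 17-6"]

# Whole-line comparison: the blank after "crowbars" sorts before the comma.
ascii_order = sorted(lines)
# Term-only comparison: "crowbars" sorts before "crowbars SCR".
term_order = sorted(lines, key=lambda line: term_of(line).lower())

print(ascii_order)
print(term_order)
```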
Stage P: Create INDEX.DOC:
  1. Start Microsoft Word, and create INDEX.DOC in the usual manner.
  2. Go to an MS-DOS window, and at the MS-DOS prompt type "PE32 ALL."<Enter> to open the file. (If you have been using PE2, use WordPad to open the ALL. file.)
  3. Hit <Ctrl-A><Ctrl-C> to copy the entire file to the clipboard.
  4. Move to the Microsoft Word window, move to the spot where you want to start the index, and hit <Ctrl-V> to copy the data into INDEX.DOC.
  5. Go back to PE32, and hit <F4> to exit. (Or click on X to exit WordPad.)
  6. Manipulate the index as desired (font, font size, double columns, etc.) in Microsoft Word.

DISCLAIMER

This procedure and the associated files are freely offered to anyone who wishes to use them. No warranty for their use is expressed or implied. If you run into problems with PE32, Paolo Chiartano (pe32@pe32.com) has been very responsive about fixing them in my limited experience with PE32. If you run into problems with this procedure, or have comments or suggestions to offer, please E-mail me at jrbarnes@iglou.com. I will help you if I can.


dBi Corporation was a one-man test house (testing laboratory) based in Lexington, Kentucky, testing a wide variety of commercial electronic products for electromagnetic compatibility (EMC), electromagnetic interference (EMI), and electrostatic discharge (ESD) under its ISO 17025 accreditation. dBi was founded in Winchester, Kentucky in 1995 by Donald R. Bush, shortly after he retired from 30 years service with IBM Lexington's/ Lexmark's EMC Lab. John R. Barnes, who'd worked with Don at IBM Lexington and Lexmark, bought dBi in 2002 after Don's death, and moved the company to Lexington, Kentucky. John closed dBi at 11:59pm EDT on September 30, 2013, because ObamaCrap had increased operating expenses to the point that we could no longer afford to remain in business.

We'd like to thank all of the clients who chose dBi to test their products from 1995 to 2013. Below is a brief summary of our accomplishments during the 18 years we were in business.

From 1995 to 2001, under Don Bush's ownership and operation, dBi:

From 2002 to 2013, under John Barnes' ownership and operation, dBi:

Last revised December 24, 2004.