Using Software for Quality Control in Legal Indexes


The production of indexes for legal publications shares many of the editorial and quality assurance steps involved in producing most other indexes. Apart from the indexes published in law school textbooks, which cite page numbers, legal indexes often use one or more types of numeric citation: statutes, regulations, court decisions (either court-assigned numbers such as docket numbers, or court reporter citations assigned by publishers or online vendors), administrative opinions, directives, rulings, orders, and so on. Legal publishers favor numbering the material in their secondary source products: some employ paragraph numbers, while others number sections of text. Because the law is always evolving, indexes to most types of legal material must be recompiled or updated more frequently (quarterly or annually, if not continuously) than indexes in other types of publishing. Legal indexers therefore face some distinctive quality control problems: dealing with the numbering schemes, maintaining consistency when updating, and, depending on the size of the project, merging the work of several indexers. While the analysis of legal information remains beyond the capability of computer programs, software tools do exist that can relieve the indexer of much of the detail checking and maintenance of legal indexing, leaving more time for the analysis.

Today's electronic world of the Internet and CD-ROMs has, for many indexers, shortened the editorial cycles between publications. While certain paper-related problems have disappeared (What is a bad widow in an electronic index when the screen view changes continually?), the need for accuracy has if anything increased, because electronic links must work properly.

This article will discuss some of the main areas in which computer programs assist legal indexers:

Index preparation software

This category of software has already been discussed and reviewed extensively in the literature, so it is not covered in depth here. (See, for example, Linda Fetters’ Guide to Indexing Software.) Style attributes, such as bolding headings or placing commas between the entry and the reference, are applied automatically at publication time rather than at data entry time. This ability to customize the file through formatting commands and menus eliminates most of the manual checking otherwise required to identify punctuation and formatting problems. This type of software is as helpful to legal indexers as it is to all indexers, and it can also handle some of the peculiarities of legal indexes. For example, it can accommodate a wide variety of reference types (e.g., statute numbers, docket numbers, and others listed at the beginning of this article). Spell-checking that uses an internal legal dictionary also increases accuracy for legal indexers.

The rest of the article describes functions that index preparation software does not address, in whole or in part.

Heading control

One feature, which applies equally to other indexing disciplines such as database indexing, is validation of main headings. Once a prescribed list has been developed using standard index preparation software, a program can check all new entries against it. Some software even allows copying one or more heading entries for use in the index being developed. This is less helpful, though, than supplying a selection list from an authority file or from the headings in the current index.

Since legal indexes frequently have multiple levels (some use as many as five sublevels) under a main heading, control is more complex than in smaller back-of-the-book indexes that often employ only two levels. Cross reference generation can be automated. For instance, entering the heading "Motor Vehicles" may require the addition of the cross reference entries "Automobiles, see Motor Vehicles, at Automobiles" and "Cars, see Motor Vehicles, at Automobiles". [This is just one style of legal cross reference as is discussed below.]
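Cross reference generation of this kind can be sketched in a few lines. The synonym table and the function name below are hypothetical, and the output follows only the one cross reference style quoted above; a production tool would need the publisher's own synonym list and style.

```python
# Hypothetical synonym table: main heading -> list of (synonym, subheading).
# The contents are illustrative only, taken from the example in the text.
SYNONYMS = {
    "Motor Vehicles": [("Automobiles", "Automobiles"), ("Cars", "Automobiles")],
}


def cross_references_for(heading, synonyms=SYNONYMS):
    """Return the cross reference entries to add when `heading` is entered."""
    return [
        f"{synonym}, see {heading}, at {subheading}"
        for synonym, subheading in synonyms.get(heading, [])
    ]


refs = cross_references_for("Motor Vehicles")
for ref in refs:
    print(ref)
```

A heading with no synonyms in the table simply produces no cross references.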

One approach when starting a new index or a substantial revision of an existing index is to copy the main heading list to the new file and begin indexing "underneath" these headings. Then as a later step the indexer can identify and delete the headings that were unused along with any cross references to them.

Another approach is to run a computer program after data entry has been completed to verify that the headings in the entries match the headings in the control list. This can also ensure that headings are correctly capitalized and punctuated.
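A minimal sketch of such a post-entry check follows, assuming entries are held as (main heading, rest of entry) pairs and that headings differing only in capitalization or punctuation should be corrected to the control list form. The function names, sample headings, and section numbers are all hypothetical.

```python
def normalize(heading):
    """Matching key that ignores case, spacing, and punctuation."""
    return "".join(ch for ch in heading.casefold() if ch.isalnum())


def validate_headings(entries, control_list):
    """Return (corrected_entries, unmatched_headings).

    entries: list of (main_heading, rest_of_entry) tuples.
    control_list: authorized main headings in their preferred form.
    """
    authorized = {normalize(h): h for h in control_list}
    corrected, unmatched = [], []
    for heading, rest in entries:
        preferred = authorized.get(normalize(heading))
        if preferred is None:
            # Not on the control list: report it and leave it unchanged.
            unmatched.append(heading)
            corrected.append((heading, rest))
        else:
            # On the list: restore the preferred capitalization/punctuation.
            corrected.append((preferred, rest))
    return corrected, unmatched


control = ["Motor Vehicles", "Landlord and Tenant"]
entries = [
    ("motor vehicles", "Licenses, 4501.01"),
    ("Landlord & Tenant", "Evictions, 1923.02"),
    ("Landlord and tenant.", "Leases, 5321.01"),
]
fixed, problems = validate_headings(entries, control)
print(problems)
```

Here the first and third entries are silently corrected to the control list form, while "Landlord & Tenant" is reported for the indexer to resolve, since the program cannot know whether it is a variant or a new heading.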

Citation control

Indexing preparation software provides the ability to sort entries in locator (citation) order. This is useful in performing number checks. For legal indexes that cite section numbers, an added benefit is that editing and updating can occur in the natural order of the source document, or amendments to it.

Automating a citation check to validate locators against a control list provides an additional way to identify citations that appear valid (by syntax) but either do not exist or have been repealed or revoked. Of course, this assumes that such a list can be obtained from the publisher. Electronic indexes may also contain invalid numbers that do not link to any material; these must be corrected. While electronic links can also be sorted in locator order, they usually have no implicit document ordering, and thus it is harder to perform a number check.
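The citation check described above might look like the following sketch. It assumes section numbers of the form 1303.45 and assumes the publisher can supply lists of valid and repealed sections; the pattern, function name, and sample numbers are illustrative only.

```python
import re

# Assumed citation syntax: digits, a period, digits (e.g. 1303.45).
SECTION_PATTERN = re.compile(r"\d+\.\d+")


def check_citations(locators, valid_sections, repealed_sections=frozenset()):
    """Sort locators into problem categories; valid ones are not reported."""
    report = {"bad_syntax": [], "repealed": [], "unknown": []}
    for locator in locators:
        if not SECTION_PATTERN.fullmatch(locator):
            report["bad_syntax"].append(locator)
        elif locator in repealed_sections:
            report["repealed"].append(locator)
        elif locator not in valid_sections:
            # Looks like a section number but is not on the control list.
            report["unknown"].append(locator)
    return report


report = check_citations(
    ["1303.45", "1303.4S", "1305.27", "1399.99"],
    valid_sections={"1303.45"},
    repealed_sections={"1305.27"},
)
print(report)
```

The "unknown" category is the one a purely syntactic check would miss: 1399.99 is well formed but does not appear on the control list.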

In any event, statutory and administrative code indexers find it helpful to obtain a list of changed material indicating any legal actions, such as repeals, transfers, or recodifications. The last of these (renumbering/recodification) is another area that lends itself to automation. A correspondence list between old and new numbers can be used to update the index citations programmatically, avoiding a good deal of manual effort. Recodification edits can be done by global substitution, but they must be performed in an order that avoids erroneous changes. For example, 1303.45 might need to change to 1305.27, but there may already be a 1305.27 that is changing to some other section number. If the changes are not done in the "proper" order, all 1305.27s (both "old" and "new" ones) may be changed to another number, so that no entries remain with 1305.27. A program could avoid such pitfalls by noting which records have just become a certain number and freezing those while changing the remaining untouched ones.
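The pitfall, and one way around it, can be sketched as follows. Rather than literally freezing records, this version looks up each citation once in the old-to-new correspondence list, which has the same effect because no citation is ever changed twice. The section number 1307.10 and the entry texts are made up for illustration.

```python
def naive_global_substitution(entries, old_to_new):
    """Apply each change in turn: a citation can be changed more than once."""
    for old, new in old_to_new.items():
        entries = [(text, new if cite == old else cite) for text, cite in entries]
    return entries


def renumber(entries, old_to_new):
    """Look up every citation once against the ORIGINAL numbers only."""
    return [(text, old_to_new.get(cite, cite)) for text, cite in entries]


# Correspondence list: 1303.45 becomes 1305.27, while the existing
# 1305.27 moves to (hypothetical) 1307.10.
changes = {"1303.45": "1305.27", "1305.27": "1307.10"}
entries = [("Brakes", "1303.45"), ("Headlights", "1305.27"), ("Tires", "1301.02")]

wrong = naive_global_substitution(entries, changes)
right = renumber(entries, changes)
print(wrong)  # Brakes ends up at 1307.10 and nothing remains at 1305.27
print(right)  # Brakes at 1305.27, Headlights at 1307.10, Tires unchanged
```

The single-pass version does not depend on the order in which the changes are listed, which removes the need to work out a "proper" order by hand.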

Cross references

Verification of cross references is a function provided by index preparation software, but this functionality is not robust enough to handle the variety of cross reference styles found in legal indexes. This is often because "explanatory" text is used to identify the subheading part of the destination, or to imply broader headings. Here are some examples:

Dogs. See Canines, this heading.


Dogs. See within this heading, Canines.

See also ANIMALS, at Canines.


See also ANIMALS, subheading: Canines.


See also Canines under ANIMALS.

DOMESTIC ANIMALS. Pets, this index.

The "extra" text in these examples is treated, by index preparation software, as part of the heading. This generates false errors in the cross reference validation reporting. Software specially created to be "smart" enough to handle these styles of cross references might save many hours of manual checking.
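As a rough sketch of what such "smart" software might do, the patterns below strip the explanatory text from the six styles shown above and return the real destination as a (main heading, subheading) pair. The pattern list, function names, and the tiny sample index are assumptions for illustration; a real index would need patterns for whatever styles its publisher uses.

```python
import re

# One pattern per cross reference style shown in the examples above.
PATTERNS = [
    re.compile(r"^(?P<sub>.+), this heading$"),
    re.compile(r"^within this heading, (?P<sub>.+)$"),
    re.compile(r"^(?P<main>.+), (?:at|subheading:) (?P<sub>.+)$"),
    re.compile(r"^(?P<sub>.+) under (?P<main>.+)$"),
    re.compile(r"^(?P<main>.+), this index$"),
]


def parse_target(target, current_heading):
    """Return (main_heading, subheading or None) for a cross reference target.

    `target` is the text after the lead-in (See, See also, ...).
    `current_heading` is the main heading the reference appears under.
    """
    target = target.strip().rstrip(".")
    for pattern in PATTERNS:
        match = pattern.match(target)
        if match:
            groups = match.groupdict()
            return groups.get("main", current_heading), groups.get("sub")
    return target, None  # plain reference to a main heading


def target_exists(target, current_heading, index):
    """index maps casefolded main headings to sets of casefolded subheadings."""
    main, sub = parse_target(target, current_heading)
    subheadings = index.get(main.casefold())
    if subheadings is None:
        return False
    return sub is None or sub.casefold() in subheadings


index = {"animals": {"canines", "felines"}, "pets": set()}
print(parse_target("Canines, this heading.", "ANIMALS"))
print(target_exists("ANIMALS, subheading: Canines.", "DOGS", index))
```

Because the explanatory words are removed before the lookup, a reference such as "See also Canines under ANIMALS" is checked against the Canines subheading of ANIMALS rather than being reported as a missing heading.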

Legal cross references may also have other lead-in words, which some indexing preparation software can handle. These include see now, see under, and see generally.

Another type of cross reference that produces specious validation errors is the "generic" cross reference, that is, one that does not provide a distinct destination. These include:

Departments, see specific name of department.

Agency, see individual agencies.

Some index preparation software may allow you to indicate that these forms need not be validated, thus reducing the number of errors in the validation reporting.
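Where the software offers no such option, a small filter could set generic references aside before validation. The sketch below assumes generic targets can be recognized by the opening words used in the two examples above; the marker list and function name are hypothetical and would need extending for other wordings.

```python
# Opening words that mark a generic target, taken from the examples above.
GENERIC_MARKERS = ("specific ", "individual ")


def is_generic(target):
    """True when the target names a class of headings, not one heading."""
    return target.strip().casefold().startswith(GENERIC_MARKERS)


references = [
    ("Departments", "specific name of department."),
    ("Agency", "individual agencies."),
    ("Cars", "Motor Vehicles."),
]
to_validate = [ref for ref in references if not is_generic(ref[1])]
print(to_validate)
```

Only the "Cars" reference is passed on for validation; the two generic forms are excluded from the error report.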

Changed text reports

If available, another helpful tool is a redlined version of the source material showing where language was changed, added, or deleted. In the case of statutes, legislative action often includes such notations. But such a mark-up would also be useful to indexers of secondary source material that is published in loose-leaf or electronic form. Indexers should ask the publishers if they have any sort of "differences report" that would readily allow them to identify changed language and judge the significance of the change against the existing indexing.
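When the publisher cannot supply a differences report but the indexer has both the old and new text in electronic form, a basic one can be produced with a general-purpose comparison tool. The sketch below uses Python's standard difflib module on two made-up versions of a provision; it is only an illustration, not a substitute for a publisher's redlined copy.

```python
import difflib


def changed_lines(old_text, new_text):
    """Return only the deleted ('- ') and added ('+ ') lines."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line for line in diff if line.startswith(("+ ", "- "))]


# Made-up source text, one provision per line.
old = "Sec. 1. A dog must be licensed.\nSec. 2. Fees are set by the county."
new = "Sec. 1. A dog or cat must be licensed.\nSec. 2. Fees are set by the county."

differences = changed_lines(old, new)
for line in differences:
    print(line)
```

Unchanged provisions are omitted, so the indexer sees only the language that needs to be judged against the existing indexing.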

This is very helpful in meeting the shorter deadlines for electronic publications.

Derivative and merged indexes

A derivative index is one that can be created from a larger index. For example, a master index to statutes contains all the entries relating to motor vehicles. To produce an index to a publication dealing only with motor vehicle law, the appropriate entries can be extracted from the master index. This function can be accomplished with index preparation software. Several issues need to be kept in mind when taking this approach.

Cross references that refer to these entries must be extracted as well. Some index preparation software allows for matching headings to accomplish this. However, as noted above, the styles of cross references used in legal indexes are not handled by this software so manual checking is required.

Many, if not all, of the entries may fall under the main heading Motor Vehicles. This is not desirable for a smaller index, and especially one where the sole subject is motor vehicles. In this situation, the indexer will want to delete the existing heading and "promote" the subheadings. Indexing software makes this operation very easy.

Next, entries previously double posted under separate main headings could now be duplicated because of the promotion; these should be reviewed and eliminated or changed as necessary. Again, software eliminates much of this tedious task. If the entries become exact duplicates, the program may eliminate the multiple versions via a command or it may force one to overwrite the other so the index ends up with only one, not multiples. Once the indexer has done this file manipulation, it will still be necessary to proof and edit the file looking for slight variations that now fall together in a repetitive array.

Derivative indexes that are needed routinely may be candidates for an automated extraction process. Many of the steps can be combined into one program run.
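A single program run combining extraction, promotion of subheadings, and removal of exact duplicates might be sketched as follows. Entries are assumed to be (heading levels, locator) pairs, and scope is decided here by a made-up rule that motor vehicle sections begin with "45"; the function names and sample data are hypothetical. Cross references and near duplicates would still need the manual review described above.

```python
def derive_index(master, in_scope, heading_to_drop):
    """Extract in-scope entries, promote subheadings, drop exact duplicates."""
    seen, result = set(), []
    for levels, locator in master:
        if not in_scope(locator):
            continue
        # Promote subheadings when the entry sits under the dropped heading.
        if levels[0] == heading_to_drop and len(levels) > 1:
            levels = levels[1:]
        entry = (levels, locator)
        if entry not in seen:  # exact duplicates created by the promotion
            seen.add(entry)
            result.append(entry)
    return sorted(result)


def is_motor_vehicle_section(locator):
    """Made-up scope rule for the example."""
    return locator.startswith("45")


master = [
    (("Motor Vehicles", "Licenses", "Fees"), "4503.10"),
    (("Motor Vehicles", "Registration"), "4503.02"),
    (("Licenses", "Fees"), "4503.10"),  # double posted in the master index
    (("Taxation", "Income tax"), "5747.01"),
]
derived = derive_index(master, is_motor_vehicle_section, "Motor Vehicles")
for levels, locator in derived:
    print(", ".join(levels), locator)
```

The double-posted Licenses entry collapses into one after promotion, and the Taxation entry is left behind in the master index.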

Merging indexes is the reverse process. For instance, separate indexes for each volume in a statute or commentary set may exist, and the publisher may want a master or general index as well. If the volume indexes were not prepared using a heading control list, there may be problems with entries meeting up under variant versions of headings and subheadings. These entries need to be identified and consolidated. Cross references that were not necessary in the separate indexes may be required to link topics occurring in different volumes.

Indexing preparation software is quite good at merging indexes and performing these types of edits. If the merge of multiple indexes will be required over and over, the indexer can create special cross reference files for the program to pick up and add to the merged compilation.
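For illustration, the consolidation step of a merge can be sketched like this. It assumes each volume index is a list of (heading, locator) pairs and that a table of variant headings has been prepared by the indexer; the names, the volume-prefixed locator format, and the data are hypothetical.

```python
from collections import defaultdict


def merge_indexes(volume_indexes, variants):
    """Merge per-volume indexes into one, consolidating variant headings.

    volume_indexes: {volume label: [(heading, locator), ...]}
    variants: {variant heading: preferred heading}
    """
    merged = defaultdict(list)
    for volume, entries in volume_indexes.items():
        for heading, locator in entries:
            preferred = variants.get(heading, heading)
            # Prefix the locator so the reader knows which volume to open.
            merged[preferred].append(f"{volume}:{locator}")
    return {heading: sorted(locs) for heading, locs in sorted(merged.items())}


volumes = {
    "Vol. 1": [("Autos", "12"), ("Leases", "40")],
    "Vol. 2": [("Motor Vehicles", "7")],
}
variant_table = {"Autos": "Motor Vehicles"}
master_index = merge_indexes(volumes, variant_table)
print(master_index)
```

Entries posted under "Autos" in one volume and "Motor Vehicles" in another meet under a single heading in the merged result.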

Tables of cases, statutes, etc.

Indexing preparation software can also be used to produce simple tables—usually those that involve 2-3 columns of information for each entry, for example, old and new citation lists or statutes affected by legislative acts.

Case name tables present specific problems though. Some may include reversed case name entries so that cases can be found by defendant as well as plaintiff.

Smith v. Jones


Jones; Smith v.

There may be several types of case names that must be sorted in a prescribed order: simple plaintiff/defendant (as shown above), In re, Ex parte, Ex rel, etc. Additionally, some of the text in the case names, such as v., may need to be ignored for sorting. For example, in a letter-by-letter sort, the order would be:

Clark, In re
Clark v. Turtle
Clark Drug Co. v. George
Clarkson v. Farrah
Clark Theatre v. Jones
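One reading of the order shown above is that the first party is compared letter by letter, "v." acts only as a separator, and an In re entry precedes plaintiff/defendant entries for the same name. The sort key below is a sketch under that assumption; it does not attempt the prescribed order for Ex parte or Ex rel names, nor reversed entries, and the function names are hypothetical.

```python
def letters(text):
    """Letter-by-letter key: ignore case, spaces, and punctuation."""
    return "".join(ch for ch in text.casefold() if ch.isalnum())


def case_sort_key(name):
    """Sort by first party, then case type, then the remaining text."""
    if " v. " in name:
        first, rest = name.split(" v. ", 1)
        rank = 1  # plaintiff/defendant entries
    elif ", " in name:
        first, rest = name.split(", ", 1)  # e.g. "Clark, In re"
        rank = 0  # assumed to precede "v." entries for the same name
    else:
        first, rest, rank = name, "", 1
    return (letters(first), rank, letters(rest))


cases = [
    "Clarkson v. Farrah",
    "Clark Theatre v. Jones",
    "Clark v. Turtle",
    "Clark Drug Co. v. George",
    "Clark, In re",
]
ordered = sorted(cases, key=case_sort_key)
for case in ordered:
    print(case)
```

Because "v." is never part of the key, Clark v. Turtle files with the other Clark entries instead of after Clark Theatre.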

Index preparation software provides sorting features to deal with these situations but it takes more time to insert the appropriate coding and proofread the final order in the index. If the tables are large and include reversals, it may be easier to type only the "forward" case names and have software generate the case name reversals. Note that this can be complex because some types of case names might require the generation of more than one reversal (e.g., State ex rel. Jones v. Topper needs two more "reversed" entries: one at Jones and one at Topper).
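Generating reversals from the "forward" names might be sketched as below. The simple reversal follows the "Jones; Smith v." form shown earlier; the form used for the second ex rel. reversal is an assumption made for this example, since publishers' styles differ, and the function name is hypothetical.

```python
def reversals(case_name):
    """Return the reversed entries to generate for one forward case name."""
    if " v. " not in case_name:
        return []  # In re, Ex parte, etc. are not reversed in this sketch
    plaintiff, defendant = case_name.split(" v. ", 1)
    entries = [f"{defendant}; {plaintiff} v."]
    if " ex rel. " in plaintiff:
        # Second reversal so the case can also be found under the relator.
        # The wording of this entry is an assumed style.
        nominal, relator = plaintiff.split(" ex rel. ", 1)
        entries.append(f"{relator}; {nominal} ex rel. v. {defendant}")
    return entries


print(reversals("Smith v. Jones"))
print(reversals("State ex rel. Jones v. Topper"))
```

The ex rel. case yields two reversed entries, one filing at Topper and one at Jones, in addition to the forward name typed by the indexer.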

In some tables, depending on the subject matter, cases may only be included by their reversal. For instance, in the field of criminal law, the majority of cases will begin United States v., Commonwealth v., State v., or People v., so typically publishers will want to list these cases only once by the distinctive name of the defendant.
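A table builder could apply this rule automatically. In the sketch below, the set of generic plaintiffs is taken from the examples in the preceding paragraph, the reversed form follows the "Jones; Smith v." style shown earlier, and the function name and sample cases are hypothetical.

```python
# Plaintiffs too common to be useful as filing names (from the text above).
GENERIC_PLAINTIFFS = {"United States", "Commonwealth", "State", "People"}


def table_entries(case_name):
    """Return the entries to list in the table for one case."""
    plaintiff, separator, defendant = case_name.partition(" v. ")
    if not separator:
        return [case_name]  # not a plaintiff/defendant name
    reversed_name = f"{defendant}; {plaintiff} v."
    if plaintiff in GENERIC_PLAINTIFFS:
        return [reversed_name]  # list once, under the defendant only
    return [case_name, reversed_name]


print(table_entries("State v. Miller"))
print(table_entries("Smith v. Jones"))
```

A criminal case is listed once under the defendant, while an ordinary civil case keeps both its forward and reversed entries.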


This has been only a brief overview of quality control issues stemming from the very special needs of legal indexes. Since indexes to laws, statutes, and regulations can often contain from 30,000 to over 1,000,000 entries, automating the mechanical tasks of entry, management, and validation is highly desirable.


David K. Ream is president of Leverage Technologies, Inc., which provides computer consulting and programming for indexers and the publishing industry. LevTech is also the corporate & government account representative for CINDEX, a product of Indexing Research.