Photo by Alina Grubnyak

Analytics refers to the systematic analysis of data for the purpose of discovery, interpretation, and communication of meaningful patterns. Analytics is critical to informed and effective decision-making in science, business, sports, economics, defense, international relations, and countless other fields. Trial analytics refers to the use of analytics in trials. This article discusses the application of analytics to jury selection, and more specifically, the use of cluster analysis for evaluation of juror questionnaires. Cluster analysis can reduce the workload associated with questionnaire analysis and at the same time provide more comprehensive and accurate results.

To understand cluster analysis, it is useful to first summarize the traditional approach to questionnaire evaluation. Typically, an attorney or consultant will examine completed juror questionnaires, going through the stack one by one, and rate the likely favorability of each juror. Ratings are commonly assigned on a scale of 1 to 10, with higher numbers indicating more favorable potential jurors. The attorney or consultant may also make summary notes about the juror if they provide unique or outlying responses. The primary outcome of the traditional method can be thought of as a partitioning of the venire into ten separate groups according to juror rating. The characteristics of each group depend on what the litigator feels is important for the current trial. For example, one group may contain professionals with a college degree in the age range 30-50. Another may contain employed jurors with a high school education, another unemployed without a high school degree, etc. In the end, each group will tend to contain jurors with similar characteristics.

Cluster analysis reverses the traditional method by first partitioning the venire into groups of similar jurors and only afterwards assigning a rating to each group. Assuming the questionnaire responses are in electronic format (something many courts have gravitated toward post-pandemic), the grouping can be done by computer algorithm in a matter of seconds. Group ratings are then assigned by the evaluator according to the average or typical characteristics of the group’s members. Reversing the traditional procedure in this way reduces the litigator's workload from reading and rating individual questionnaires one by one to rating just a small number of groups of similar jurors. The automation permitted by cluster analysis is especially helpful when a sizeable venire is called or when the questionnaires are lengthy.
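To make the grouping step concrete, here is a minimal sketch in Python using k-means, one common clustering technique. The article does not say which algorithm is used in practice, so k-means is an assumption, as are the juror IDs, the three coded questions, and their numeric scales. Age is expressed in decades so that no single question dominates the distance calculation.

```python
import math

# Hypothetical coded responses: one row per juror, one column per question.
# Columns (illustrative only): age in decades, education level 0-3, employed 0/1.
jurors = {
    "J01": [3.5, 3, 1],
    "J02": [4.0, 3, 1],
    "J03": [4.5, 3, 1],
    "J04": [2.5, 1, 1],
    "J05": [3.0, 1, 1],
    "J06": [6.0, 0, 0],
    "J07": [5.5, 0, 0],
}

def distance(a, b):
    """Euclidean distance between two coded response rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(rows):
    """Column-wise average: the 'typical' responses of a group."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def k_means(data, k, iterations=20):
    """Partition jurors into k groups of similar response patterns."""
    ids = sorted(data)
    # Deterministic start: first juror, then repeatedly the juror farthest
    # from all centroids chosen so far.
    centroids = [data[ids[0]]]
    while len(centroids) < k:
        far = max(ids, key=lambda j: min(distance(data[j], c) for c in centroids))
        centroids.append(data[far])
    for _ in range(iterations):
        # Assign each juror to the nearest group centre.
        groups = [[] for _ in range(k)]
        for j in ids:
            nearest = min(range(k), key=lambda i: distance(data[j], centroids[i]))
            groups[nearest].append(j)
        # Move each centre to the average of its members.
        centroids = [
            mean([data[j] for j in g]) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return groups, centroids

groups, centroids = k_means(jurors, k=3)
for members, profile in zip(groups, centroids):
    print(members, "typical responses:", profile)
```

The evaluator would then rate three group profiles rather than seven individual questionnaires. The number of groups `k` is a choice left to the analyst.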

Cluster analysis does require questionnaire data to be in electronic form. While several progressive courts in venues like California have been receptive to administering and collecting juror questionnaire data using the internet, many others have stuck with the traditional paper and pen method. If the court provides only paper copies, then the relevant responses will have to be electronically coded into a data-friendly format, e.g., an Excel spreadsheet. However, this coding requires no special legal expertise and can be done by staff or a third-party service.
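As an illustration of what coding into a data-friendly format can look like, the sketch below converts a few transcribed answers into numbers using Python's standard `csv` module. The column names, answer choices, and numeric codes are hypothetical. A real code book would be drawn up for the specific questionnaire.

```python
import csv
import io

# Hypothetical transcription of paper questionnaires, as staff might type it
# into a spreadsheet and export as CSV.
raw = """juror_id,education,employed,prior_jury_service
J01,College degree,Yes,No
J02,High school,Yes,Yes
J03,No diploma,No,No
"""

# Illustrative code books mapping each answer choice to a number.
EDUCATION = {"No diploma": 0, "High school": 1, "College degree": 2}
YES_NO = {"No": 0, "Yes": 1}

def code_row(row):
    """Translate one juror's written answers into numeric codes."""
    return {
        "juror_id": row["juror_id"],
        "education": EDUCATION[row["education"]],
        "employed": YES_NO[row["employed"]],
        "prior_jury_service": YES_NO[row["prior_jury_service"]],
    }

coded = [code_row(r) for r in csv.DictReader(io.StringIO(raw))]
for row in coded:
    print(row)
```

An answer that is missing from the code book raises a `KeyError`, which is a simple way to catch transcription typos before the data reaches the analyst.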

Moreover, while cluster analysis may not entirely replace the litigator’s one-by-one review and ratings, it can substantially improve efficiency for a consultant charged with making recommendations regarding cause challenges and strikes. The following paragraphs discuss some of the advantages of using cluster analysis:

Cluster analysis saves time and effort. The traditional, one-by-one evaluation method is a tedious process that may not be the best use of limited resources. If the venire is large, or if the questionnaires contain many questions, the sheer amount of information contained in a stack of questionnaires can be overwhelming. At the same time, the court may allow only hours for the litigants to analyze the jurors’ questionnaire responses. Overwhelmed evaluators may not have sufficient time to examine each questionnaire thoroughly. In contrast, questionnaire coding can be done quickly by staff and the cluster analysis completed by computer in seconds. Lawyers and consultants need only evaluate a small number of resulting groups of essentially similar jurors, saving time and allowing them to focus on other important areas of litigation.

Cluster analysis reduces bias. There are two types of bias that can result from the traditional evaluation approach. First, poor interrater reliability occurs when the questionnaire stack is divided among several evaluators in order to save time. In this case, different evaluators may evaluate similar jurors differently, leading to inconsistent rating assignments. For example, one attorney may consistently rate college graduates more harshly than another attorney. Second, rating drift occurs when a single evaluator perceives similar jurors differently at different times. For example, a juror near the top of the stack may be rated differently from a similar juror near the bottom of the stack. Cluster analysis eliminates these problems by first grouping jurors by similarity and then rating the groups as a whole. Cluster analysis ensures that similar jurors are rated similarly.

Cluster analysis allows flexibility of evaluation. A litigator may want to change how questionnaires are evaluated partway through the review process. For example, a litigator may want to change the relative importance of particular responses. Using cluster analysis on electronically coded data, jurors can be easily regrouped based on new evaluation criteria. The traditional approach would require starting from scratch and re-evaluating each questionnaire one by one, effectively doubling the effort.
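One way to change the relative importance of responses is to attach a weight to each question in the similarity calculation and then re-run the grouping. The sketch below is only an illustration of that idea. The weighted Euclidean distance, the three questions, and the weight values are all assumptions, not a method prescribed by the article. It shows how the same juror's closest match changes when the weights change.

```python
import math

# Hypothetical coded responses.
# Columns: education 0-2, employed 0/1, attitude toward corporations 1-5.
jurors = {
    "J01": [2, 1, 5],
    "J02": [2, 1, 1],
    "J03": [0, 0, 5],
}

def weighted_distance(a, b, weights):
    """Euclidean distance in which each question counts according to its weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def most_similar(target, data, weights):
    """Return the juror whose responses are closest to the target juror's."""
    others = [j for j in data if j != target]
    return min(others, key=lambda j: weighted_distance(data[target], data[j], weights))

# Two illustrative weightings of the same three questions.
demographics_first = [4.0, 4.0, 0.25]
attitude_first = [0.25, 0.25, 4.0]

print(most_similar("J01", jurors, demographics_first))  # J02
print(most_similar("J01", jurors, attitude_first))      # J03
```

Because the coded data stay the same, switching criteria means changing one list of weights and repeating the computation, rather than rereading the stack.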

Cluster analysis automatically flags outliers. While cluster analysis groups jurors according to similarity, it also automatically flags unique or outlying juror responses. Jurors with outlying responses can be further investigated and flagged for follow-up during voir dire. Such outliers may be missed using the traditional one-by-one approach.
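A simple way to picture outlier flagging is to compare each juror's answers with the typical answers of their group. The sketch below uses the group median for each question and a fixed threshold. Both are illustrative assumptions, as are the group labels and data, and real tools may use different criteria.

```python
from statistics import median

# Hypothetical output of a prior clustering step:
# group label -> {juror id -> coded responses}.
groups = {
    "A": {"J01": [2, 1, 4], "J02": [2, 1, 5], "J03": [2, 1, 4], "J04": [2, 1, 1]},
    "B": {"J05": [0, 0, 2], "J06": [1, 0, 2], "J07": [0, 0, 3]},
}

def flag_outliers(groups, threshold=2):
    """Map each outlying juror to the question indexes where they deviate.

    A response counts as outlying when it differs from the group's median
    answer to that question by at least `threshold` points.
    """
    flagged = {}
    for members in groups.values():
        typical = [median(col) for col in zip(*members.values())]
        for juror, answers in members.items():
            odd = [
                q for q, (a, t) in enumerate(zip(answers, typical))
                if abs(a - t) >= threshold
            ]
            if odd:
                flagged[juror] = odd
    return flagged

print(flag_outliers(groups))  # {'J04': [2]}
```

Here juror J04 matches group A on the first two questions but departs sharply on the third, which is the kind of response worth a follow-up question during voir dire.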

Cluster Analysis in Practice

Performing a cluster analysis of the venire is a straightforward, four-step process:

  1. Data preparation. Obtain or code jury questionnaire responses in electronic format. Some courts will provide electronic versions of questionnaire responses. If a court provides only paper copies, it will be necessary to code the relevant responses in electronic format, using, for example, an Excel spreadsheet. Multiple choice questions can be coded in a straightforward manner. Open-ended questions with written responses are read and assigned a numerical value based on favorability of the response. This process can be done relatively quickly, especially if there are only a few relevant questions, and it requires no special legal training.
  2. Perform a cluster analysis. This can be done by any knowledgeable data analyst using readily available tools. For example, several Excel packages are available that are capable of performing the necessary clustering calculations. Alternatively, the electronic spreadsheet can be provided to a jury consultant trained in cluster analysis, or to a specialized data service. The analyst, consultant, or data service will provide a report detailing the juror groupings and the typical characteristics of each group. Any outlying data will also be flagged for follow-up investigation.
  3. Assign ratings. After receiving the cluster analysis report, study the typical characteristics of each group. Rate the favorability of each group on a scale of 1 to 10. Each juror in a given group is assigned the rating of that group.
  4. Investigate outliers. Finally, investigate any flagged outliers and determine whether their ratings should be adjusted up or down relative to other members of the group. A juror may be an outlier if one or two of their responses differ significantly from other members of their group. Outlying jurors may also be flagged for further questioning during voir dire.
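The rating and outlier steps described above can be sketched in a few lines of Python. The group memberships, ratings, and adjustment values below are hypothetical placeholders. In practice the memberships come from the cluster analysis report and the ratings from the evaluator's judgment.

```python
# Hypothetical inputs: cluster membership from the analysis report,
# and group ratings (1 to 10) chosen by the evaluator.
membership = {"J01": "A", "J02": "A", "J03": "B", "J04": "B", "J05": "C"}
group_ratings = {"A": 8, "B": 5, "C": 2}

# Adjustments decided after reviewing flagged outliers (illustrative).
outlier_adjustments = {"J04": -2}

def juror_ratings(membership, group_ratings, adjustments):
    """Give each juror their group's rating, then apply outlier adjustments."""
    ratings = {}
    for juror, group in membership.items():
        rating = group_ratings[group] + adjustments.get(juror, 0)
        ratings[juror] = max(1, min(10, rating))  # keep within the 1-10 scale
    return ratings

ratings = juror_ratings(membership, group_ratings, outlier_adjustments)

# Most favorable jurors first; ties broken by juror id.
ranked = sorted(ratings, key=lambda j: (-ratings[j], j))
for juror in ranked:
    print(juror, ratings[juror])
```

Keeping the adjustments in a separate table preserves the group rating as the default, so similar jurors stay rated alike unless there is a documented reason to differ.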


Many courts are in the process of modernizing their records management systems, including systems for jury management. Modern jury management systems include the ability to administer juror questionnaires online and store juror responses electronically. It is likely, going forward, that most, if not all, courts will adopt systems that provide litigators with access to electronic juror questionnaire response data, removing barriers to the adoption of analytical, data-driven methods. At the same time, supplemental juror questionnaires have become more prevalent, increasing the workload associated with questionnaire evaluation, and increasing the need for more automated, data-driven systems.

Cluster analysis provides clear advantages compared with traditional one-by-one questionnaire evaluation. It does, however, require some knowledge of the methods of data analytics. Most data analysts could easily perform a cluster analysis given a set of questionnaire response data. However, it could also be accomplished by jury consultants with a basic grasp of mathematics and a small amount of training. Consultants familiar with the tools of data analytics will be well-equipped for modern data-intensive courtroom processes.

Author’s biography
David Caditz, founder of JuryTek Consulting, is a Stanford Ph.D. physicist with extensive experience in mathematics and computer science. Dr. Caditz pioneered the use of game theory in jury selection and has published numerous academic articles on courtroom jury selection. Dr. Caditz has provided consulting services and has participated in courtroom jury selection with successful high-dollar jury awards. Dr. Caditz was an invited speaker at the annual meeting of the American Society of Trial Consultants and has contributed to numerous online educational seminars for jury selection.