Election Audit Sampling Plan Design: It's Not Just About Sampling Without Replacement
By
Jerry Lobdill
October 9, 2006
Copyright 2006. All rights reserved
Introduction
The design of a sampling plan for an election audit depends on the precinct vote count
distribution and on the assumptions made about the attacker's motivations, risk aversion,
desire to succeed, and ability to attack (wholesale vs. retail). It is also influenced by the
stated purpose of the audit. This paper discusses these factors and defines a sampling
plan. The purpose of the sampling plan defined here is to detect tampering
with a probability of 0.99 and with the greatest possible efficiency.
In this paper it will be assumed that the population of auditable entities consists of the precincts
involved in the race in question and that the race is a two-candidate race.
The Attack
Some researchers have apparently felt that any assumption made about the attack would
incur risk of failure of the audit. These researchers do not discuss either the attacker or
the attack, but proceed immediately to calculate the sample size to be selected at random
from the total population of precincts involved in the election and, perhaps, a particular
number of corrupt precincts assumed to be scattered among the entire population at
random [1]. They prescribe that once a random sample of precincts has been selected the
auditing will proceed until either the last selected precinct has been audited or a corrupted
precinct has been found. The sample is presumably audited in the random order in which
the list is prepared since the subject is never discussed.
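The sample-size calculation these researchers rely on can be sketched as follows. This is a minimal illustration, assuming the standard hypergeometric model for sampling without replacement; the function and parameter names are hypothetical and do not come from the cited papers.

```python
from math import comb


def detection_probability(total_precincts: int, corrupt_precincts: int,
                          sample_size: int) -> float:
    """Probability that a random sample drawn without replacement
    contains at least one corrupt precinct."""
    clean = total_precincts - corrupt_precincts
    if sample_size > clean:
        # More precincts sampled than clean ones exist: a hit is certain.
        return 1.0
    # P(miss) = C(clean, n) / C(total, n): every sampled precinct is clean.
    miss = comb(clean, sample_size) / comb(total_precincts, sample_size)
    return 1.0 - miss


def min_sample_size(total_precincts: int, corrupt_precincts: int,
                    target: float = 0.99) -> int:
    """Smallest sample size whose detection probability reaches the target."""
    if corrupt_precincts < 1:
        raise ValueError("at least one corrupt precinct is required")
    for n in range(total_precincts + 1):
        if detection_probability(total_precincts, corrupt_precincts, n) >= target:
            return n
    return total_precincts
```

For example, `min_sample_size(1000, 50)` gives the number of precincts to draw from 1000 when 50 are assumed corrupt and a 0.99 detection probability is wanted. Note that the result depends entirely on the assumed number of corrupt precincts.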
Though proponents of this approach recoil at the thought of making any assumptions
about the potential attack, this approach does not completely avoid matters of judgment
since the number of corrupt precincts is an input whose value is based on the assumed
maximum vote switch percentage per precinct.
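How the assumed maximum switch percentage determines the number of corrupt precincts can be sketched as below. This is an illustration under stated assumptions, not the method of either cited paper: the attacker is assumed to attack the largest precincts first, every attacked precinct is assumed to hold enough votes for the desired loser to switch, and each switched vote is taken to move the margin by two votes. All names are hypothetical.

```python
from typing import Optional, Sequence


def min_corrupt_precincts(precinct_totals: Sequence[int],
                          margin_votes: float,
                          max_switch_fraction: float) -> Optional[int]:
    """Smallest number of precincts that must be corrupted to overturn
    a margin of `margin_votes`, or None if the margin cannot be overturned.

    Assumption: the attacker switches `max_switch_fraction` of the total
    vote in each attacked precinct, starting with the largest precincts.
    """
    if not 0 < max_switch_fraction <= 1:
        raise ValueError("max_switch_fraction must be in (0, 1]")
    shift = 0.0
    ordered = sorted(precinct_totals, reverse=True)
    for count, total in enumerate(ordered, start=1):
        # One switched vote removes a vote from the loser and adds one
        # to the winner, so the margin moves by two votes.
        shift += 2 * max_switch_fraction * total
        if shift > margin_votes:
            return count
    return None
```

A smaller assumed switch fraction yields a larger number of corrupt precincts and therefore a smaller required sample, which is why this input is a matter of judgment.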
In this paper we will define a sensible attack based on the attacker's driving motivations
and fears. The audit plan that results from this analysis is robust and effective against
conceivable wholesale attacks that have the potential to reverse an election.
[1] The smallest number of precincts that can produce a fraudulent victory can be calculated and used for this
number. See "Designing Mandatory Election Audits", by Jerry Lobdill, 8/15/06, or "Random Auditing of E-Voting
Systems: How Much Is Enough?", by Howard Stanislevic, 8/16/06, p. 6. Of course, this assumes an upper
limit on the fraction of votes the attacker is willing to risk switching from one candidate to another in any
precinct.
Audit Purpose
The overriding purpose of an audit is to detect and discourage the large-scale wholesale
attacks that have been made possible by electronic voting machines, especially the
machines currently in use (2006). It is crucial to detect and thwart wholesale attacks that
can be implemented by a very small number of people and that can affect the statewide
outcome of a federal election.
Some researchers have expressed a desire to uncover all election irregularities, whether
caused by deliberate attacks or by software errors or other anomalies. Some of these
irregularities will tend to produce results so bizarre that inspection alone will reveal their
presence, as in the Tarrant County, Texas March 2006 primary, where the tallying
software announced a total vote count of about three times the number of actual voters.
Others will produce micro effects such as the corruption of a single DRE or precinct,
producing an overall effect that would not change the winner of a race. Detecting small
anomalies that cannot change the outcome of a race is not considered to be a purpose of
the audit, although if it turns out that this is a frequent result of audits, it will enhance
public perception that the audit process provides excellent protection.
The Attacker's Goal, Foreknowledge, and Limitations
We postulate a serious attacker who desperately (but not too desperately) wants her
candidate to win. She is not playing hacker games. She will not use her access to attack
one or a few precincts or a number of precincts chosen at random. She will not try to
reverse an election in a jurisdiction that has historically voted heavily against her wishes,
because if successful, she fears that her attack would ring alarm bells and motivate an
audit.
She fears that her prediction of the margin against her will be too small, and if so, she
will fail in her attempt. Therefore she will switch as many votes as she thinks she can get
away with, but she will not risk switching all votes in a precinct to her desired winner,
nor will she risk switching more than some estimated maximum percentage of the votes
in any precinct, county, or district.
There is clearly a tension between what her desires tell her to do and what her fears tell her to do.
What does she know in advance of the election? She has historic data on voting patterns
down to the precinct level. She has a political strategist's estimate of the expected
turnout, the direction of the political winds, and an insider's view of how the voting
equipment is prepared and the details of the security safeguards in place. She has access
to election equipment at the level required to implant a software Trojan Horse in every
voting machine and ballot scanner in a county.
Attacker's Trojan Horse
The attacker's Trojan Horse is a security-conscious autonomously operating software
program that cannot be detected through testing. It attacks vote counts, not individual
ballots. If there is a voter-verified paper trail or paper ballots, an audit will reveal the
fraud created by the Trojan Horse. Therefore, the attacker attempts to set the Trojan
Horse parameters so that her candidate will win, but no recount will be ordered, and the
mandatory election audit has a minimal probability of discovering the fraud. The Trojan
Horse operates on the precinct vote count for a particular race.
Pseudocode for the Trojan Horse vote switching algorithm
Calibration Inputs:
Maximum total precinct vote count to attack, L2.
Minimum total precinct vote count to attack, L1.
Minimum precinct vote count for the desired loser required to attack, VLmin.
Fraction of total precinct vote count to switch, VS.
For each precinct:
    At the close of polls, read the reported precinct vote count tallies for the desired
    loser and desired winner, VL and VW respectively.
    Compute the total vote count, VT = VL + VW.
    If VL < VLmin END
    If VT > L2 END
    If VT < L1 END
    If VL - VS x VT < 0,
        VL = VL
        VW = VW
        END
    else
        VL = VL - VS x VT
        VW = VW + VS x VT
        END
If L2 is greater than or equal to the largest precinct vote count in the county, the attacker is
attacking all the largest precincts. If not, the attacker is trying to fool an audit plan that
presumes she will attack all the large precincts.
L1 is used to avoid corrupting lower vote count precincts and to minimize the number of
corrupted precincts in the hope that the audit will miss the precincts that were corrupted.
VLmin is used to avoid showing a zero vote count for the desired loser unless that situation
actually occurred.
VS is the assumed fraction of the total precinct vote count the attacker believes can be
switched without raising suspicion sufficiently to cause a recount. This value depends on
the attacker's desires and fears. Various researchers have assumed values between 0.05
and 0.2.
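The vote-switching rule and its four calibration inputs can be sketched in Python as follows. This is an illustrative sketch, not the author's code: the guard on VLmin is inferred from the parameter definitions, rounding the switched count to whole votes is an added assumption, and the function and argument names are hypothetical.

```python
from typing import Tuple


def switch_votes(vl: int, vw: int, l1: int, l2: int,
                 vl_min: int, vs: float) -> Tuple[int, int]:
    """Return the (loser, winner) tallies reported for one precinct.

    vl, vw  -- true tallies for the desired loser and desired winner
    l1, l2  -- minimum and maximum total vote count to attack
    vl_min  -- minimum loser tally required before attacking
    vs      -- fraction of the total precinct vote count to switch
    """
    vt = vl + vw
    # Skip precincts outside the attack window or with too few loser votes.
    if vl < vl_min or vt > l2 or vt < l1:
        return vl, vw
    # Assumption: switched votes are rounded to a whole number.
    switched = round(vs * vt)
    # Never report a negative tally for the desired loser.
    if vl - switched < 0:
        return vl, vw
    return vl - switched, vw + switched
```

For a precinct with 600 votes for the desired loser and 400 for the desired winner, `switch_votes(600, 400, l1=200, l2=2000, vl_min=50, vs=0.1)` moves 100 votes and reports a 500 to 500 tie, while the same precinct is left untouched if `l2` is set below 1000.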
Excel equations:
New VL = IF(VT >Upper_limit, VL, IF(VT >Threshold, ROUND(IF(VL - VT
*Switch_Fraction