A note to draft readers: thank you - your time and attention are a real motivation. I'm also trying to balance readability, length, and the need to provide concrete information to skeptical readers. As you read this, if you notice yourself wanting to close the tab or otherwise check out, please let me know where and why. I have many other outlines for how I could present this information, and I'm still not at all sure this is the right one.
The following is a sketch for granting America the benefits of ranked-choice voting, without requiring electoral reform, by collecting and publishing common knowledge that alters the game theory of party politics.
The story goes like this:
- Org builds a powerful opinion surveying stack
- Org predicts the performance of hypothetical candidates and platforms in the general election before the primary
- Profit (via prediction markets)
- Org predictions are believed to be ~accurate
- Party primaries begin to nominate generally popular candidates, or else lose to parties that do
- America gets the benefits of RCV
On a personal note, I’m working on this because I see political polarization as one of the gravest threats to civilization. It exacerbates most other problems: the risks of economic disparity, nuclear war, unaligned artificial intelligence, designer pathogens, and climate change can all be mitigated with savvy policy. We need cool heads, with a durable mandate, charting a course over decades. Put another way, the more capable our technology, the more wisely a government must act. If technology requires wisdom from our leaders, it’s important to investigate if the same technology can be used to improve their quality. The idea for this org came from pondering that question. It’s little more than an idea, which I hope the codename-y mantle conveys, but I do plan to push it into the world as a serious effort with a serious name.
Polarization and Its Discontents
Every governmental approval rating is nose-diving. This is caused by and contributes to the centrifuging of our political landscape. If we can neutralize key drivers of polarization, we might put enough slack into the knot to start to untie it. Two key drivers are internet-mediated polarization and politically-mediated polarization, and the thesis of Spectra is that both can be addressed by surfacing the truth of who wants what across the country.
Internet-mediated polarization hardly needs an introduction. We live in bubbles of manufactured outrage and algorithmically rewarded scissor statements. Our podcasts, feeds, and op-eds are full of claims about the hypocrisy of our out groups. They can be amusing or enraging, and either way, the algorithm decides we need more. One goal of Spectra is to contextualize and neutralize the worst of these.
Politically-mediated polarization is a process in which unpopular candidates narrowly win elections, anger most voters, and in so doing create conditions for the process to repeat. Another goal of Spectra is to tilt the information landscape in favor of more widely supported candidates running and winning.
The key to addressing both is making who wants what common knowledge. This post will get into the specifics, but first, consider why generally popular candidates do not run, where "generally popular" means candidates who could win blowout victories in a general election. In a country of over 330 million, such candidates exist, so where are they?
Why generally popular candidates don’t run
Concisely, the electoral mechanics of the US don’t encourage them to run. Consider a toy example where you have an even split of red and blue voters arranged in a line according to where their beliefs fall on a spectrum. We would expect the most viable candidates (represented below as circles) to be positioned towards the middle of that distribution as they attempt to win over enough of the center to secure a majority.
In countries with mandatory and ranked-choice voting, you do get this (see: Australia). In America, however, we get this:
And when one candidate wins, they are so ideologically far from the voters in the losing party that it pisses them off and contributes to negative polarization.
Mainly, our closed primary system is to blame. Candidates are forced to enter their primary with a position closer to the fringes of their party, both because of who votes in the primary and because of how important the support of donors and interest groups is to the campaign effort. And while candidates do move to the center after the primary, they can only credibly go so far.
Another reason is that candidates need to worry about third parties running in the general election. In this example, the blue candidate moves to the center to capture more voters, but they leave their flank open to a well-run green campaign and lose far blue voters to them, ultimately causing blue to lose in the general election to red.
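To make the toy model concrete, here is a minimal sketch in Python, assuming voters spread uniformly along a single left-right axis, plurality voting, and made-up candidate positions. It reproduces both dynamics above: a candidate who moves toward the center wins a two-way race, and a third candidate on their flank flips the outcome.

```python
import numpy as np

rng = np.random.default_rng(0)
voters = rng.uniform(-1, 1, size=100_000)  # each voter's position on a left-right line

def plurality_winner(candidates: dict[str, float]) -> str:
    """Each voter backs the nearest candidate; the most votes wins."""
    names = list(candidates)
    positions = np.array([candidates[n] for n in names])
    nearest = np.abs(voters[:, None] - positions[None, :]).argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(names))
    return names[counts.argmax()]

# Two-way race: blue shifts toward the center and wins.
print(plurality_winner({"red": 0.4, "blue": -0.1}))  # blue

# A green candidate on blue's left flank splits that vote, and red wins with a
# plurality even though most voters prefer blue over red.
print(plurality_winner({"red": 0.4, "blue": -0.1, "green": -0.7}))  # red
```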
So what can we do about it?
What about election reform?
Ranked open primaries and ranked-choice voting schemes straightforwardly solve these issues. There’s a catch-22 here because only widely popular candidates support this kind of reform, but they often struggle to get into office in the first place. So we’ll have to bootstrap past this.
Sure, but how?
Imagine you have a perfect oracle who can, before the primaries, know the result of every possible general election matchup. This oracle could then issue statements about each candidate’s chances of winning the general election, even before the primary is held.
For example, consider the oracle says:
🧙 Alice (D) has a 72% chance of winning the general election. Her worst matchup is Bob (R).
🧙 Bob (R) has a 68% chance of winning the general election. His worst matchup is Alice (D).
The outcomes of the two primaries can be seen in this 2x2:
The oracle's predictions mean that at the primary stage, the D primary voters know that they will likely lose the general if they don't nominate Alice, even if she isn't their favorite. The same is true for the R voters and Bob. If Alice and Bob are both nominated, instead of a blowout, it will be a close election. The fact that Alice and Bob both have blowout potential means they must appeal to a wide swath of the electorate, so either one would be a popular moderate, and either outcome would achieve our goal of unwinding political polarization a bit.
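To show where numbers like the oracle's could come from, here is a minimal sketch assuming a hypothetical matchup matrix and toss-up primaries. Dana and Rex are invented stand-ins for more divisive alternative nominees, and every probability is made up purely to reproduce the 72% and 68% figures above.

```python
# Hypothetical matchup matrix: probability the Democratic candidate wins the
# general, for each (D nominee, R nominee) pair. All numbers are made up.
p_d_wins = {
    ("Alice", "Bob"): 0.55,   # the two broadly popular picks: a close race
    ("Alice", "Rex"): 0.90,   # Alice blows out a divisive R nominee
    ("Dana",  "Bob"): 0.10,   # a divisive D nominee gets blown out by Bob
    ("Dana",  "Rex"): 0.50,   # two divisive nominees: a coin flip
}

# Suppose each primary is currently a toss-up.
p_nominee = {"Alice": 0.5, "Dana": 0.5, "Bob": 0.5, "Rex": 0.5}

def general_win_prob(candidate: str, party: str) -> float:
    """Chance of winning the general, averaged over who the other party nominates."""
    if party == "D":
        return sum(p_nominee[r] * p_d_wins[(candidate, r)] for r in ("Bob", "Rex"))
    return sum(p_nominee[d] * (1 - p_d_wins[(d, candidate)]) for d in ("Alice", "Dana"))

print(general_win_prob("Alice", "D"))  # 0.725 -> "a 72% chance of winning"
print(general_win_prob("Bob", "R"))    # 0.675 -> "a 68% chance of winning"
```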
Sure, but how do we really do it?
The question is whether we can get close enough to the magic oracle with advanced polling and modeling. We can get most of the effect we’re looking for as long as the system can identify large discrepancies between possible candidates. The most ambitious goal of Spectra is to see how close we can get to this by surveying and modeling voter preferences within a robust election simulation.
If AOC secures the Democratic primary in 2028, what are her odds against each possible Republican nominee? Is there an optimal platform that would alter those odds? Is there a candidate who hasn’t announced their candidacy yet but who would perform extremely well if they did? Most importantly, is there a candidate and platform that would threaten a blowout victory against the likely nominee of the other major party?
Having answers to these questions, even with a wide margin of error, alters the common knowledge everyone operates on. That can have drastic implications for the kinds of people who make it through each primary, and thus for who ultimately secures the White House. The same applies to every other elected position.
Put another way, when everyone knows what everyone wants, our closed-primary, first-past-the-post system moves closer to one that effectively behaves like ranked-choice voting - one with less negative polarization and less gridlock.
Milestone 1: On-Demand Survey Analysis
Loosely, these are the steps. Information flows top to bottom.
First up are the parts that enable the on-demand survey analysis.
Demographics Layer
This layer holds data made available by the US Census Bureau: primarily age, sex, race/ethnicity, household composition, and income. The American Community Survey (ACS) conducted by the bureau also provides social, educational, economic, and housing data. We will maintain it in the most granular form available, typically at the census block group (roughly 600-3,000 people) or tract (roughly 4,000 people) level. We will store this data in databases optimized for large-scale statistical analysis (DuckDB or ClickHouse), ensuring that for each region, we have a statistical distribution of each of these traits.
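As a sketch of what "a statistical distribution of each trait per region" might look like in practice, here is a minimal DuckDB example; the table layout, column names, and tract id are illustrative, not a final schema.

```python
import duckdb

con = duckdb.connect("spectra.duckdb")

# Hypothetical table layout: one row per census tract per age bucket, holding
# ACS population counts. Table and column names are illustrative only.
con.execute("""
    CREATE TABLE IF NOT EXISTS acs_tract_age (
        tract_geoid VARCHAR,   -- state + county + tract FIPS code
        age_bucket  VARCHAR,   -- e.g. '18-24', '25-34', ...
        population  BIGINT
    )
""")

# The distribution of one trait for one region: each bucket's share of the tract.
age_distribution = con.execute("""
    SELECT age_bucket,
           population * 1.0 / SUM(population) OVER () AS share
    FROM acs_tract_age
    WHERE tract_geoid = ?
    ORDER BY age_bucket
""", ["06057001100"]).df()   # returns a pandas DataFrame; the id is a placeholder
```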
Survey Import Layer
This layer holds high-quality surveys of Americans along with their demographic cross-tabs. These cross-tabs allow us to contextualize survey results and understand how they do or do not apply to the larger population. At minimum, this layer needs to make the surveys available in a concise and common format that the next layer can read. We’re not attempting to normalize and combine them all, but we could consider doing so in the future.
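As one guess at what that concise and common format could contain, here is a minimal sketch; the field names are assumptions to iterate on, not a settled spec.

```python
from dataclasses import dataclass, field

@dataclass
class CrossTab:
    """One answer option's result within one demographic slice."""
    question_id: str
    answer: str               # e.g. "Approve"
    demographics: dict        # e.g. {"age": "45-64", "region": "Midwest"}
    respondents: int
    weighted_share: float     # share within the slice after the pollster's weighting

@dataclass
class Survey:
    """The minimal common form every imported survey gets converted into."""
    source: str               # publisher or pollster
    field_dates: tuple        # (start, end) ISO date strings
    mode: str                 # "phone", "online panel", ...
    sample_size: int
    questions: dict           # question_id -> exact question wording
    crosstabs: list = field(default_factory=list)   # list of CrossTab records
```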
Survey Analysis Layer
This layer contains the code & scaffolding to employ an LLM agent as a data scientist. Prompted with a question, it performs a reasoning step and then executes a query against our survey database (via MCP) to identify surveys that might answer the question. It then engages in a reasoning loop, referencing both the surveys and our demographics layer (via a separate MCP) to answer the question.
For example, if asked for the current presidential approval rating in the town of Grass Valley, it would first find the most recent sizable opinion poll with demographics cross-tabs, poll the demographics of Grass Valley, and use post-stratification techniques to answer.
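Here is a deliberately simplified sketch of that post-stratification step, with made-up numbers and only one demographic dimension (age); the real pipeline would cross several dimensions and carry uncertainty through.

```python
import pandas as pd

# Approval by age bucket from a national poll's cross-tabs (illustrative numbers).
poll_crosstab = pd.DataFrame({
    "age_bucket": ["18-34", "35-54", "55+"],
    "approve_share": [0.38, 0.44, 0.51],
})

# Share of Grass Valley adults in each bucket, from the demographics layer
# (also illustrative; the real shares would come from the ACS tables).
local_weights = pd.DataFrame({
    "age_bucket": ["18-34", "35-54", "55+"],
    "population_share": [0.22, 0.33, 0.45],
})

# Post-stratification: reweight the poll's subgroup estimates by the local
# population mix instead of the national one.
merged = poll_crosstab.merge(local_weights, on="age_bucket")
local_estimate = (merged["approve_share"] * merged["population_share"]).sum()
print(f"Estimated local approval: {local_estimate:.1%}")  # ~46% with these numbers
```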
First public release
The initial release will look like an LLM chat interface. People can submit a political claim and request an analysis. Ideally, some people share the results on social media to offer context and defuse tension in the discourse.
One audience likely to find the most value at first is journalists, short on time but eager to call out when a given political action is popular or unpopular. Other projects are working on making the machinations of government legible (the contents of bills, actions of committees, votes, etc.). It could be useful for their systems to consult Spectra to understand if the actions of a given politician are in or out of alignment with their constituency.
This will be particularly useful to local and state politicians. At the national level, there are already significant efforts to accomplish this (and deep pockets working to convince candidates to overlook the findings).
If we decide strategically that we need more user growth, we could work to make this available as a tool call to popular LLM information systems. However, I suspect what we'll do is give a TED talk, put a feather in our cap, and move on to the next phase.
Milestone 2: Election Forecasting
Things get more ambitious here. To shift primary outcomes, we must generate common knowledge about which possible candidates would win against which other possible candidates. Forecasts only yield that kind of common knowledge once the forecaster has a very strong track record, so we need to build a highly capable system and use it to predict every election we can, for years, to establish credibility.
To that end, we'll start betting in real-money markets. If we're even close to achieving what this milestone requires, we'll earn enough money to fund the org and generate buzzy magazine articles. At that point, we will be generating common knowledge.
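One standard way to size such bets is the Kelly criterion. Here is a minimal sketch, assuming binary prediction-market contracts that pay out 1 if they resolve yes and ignoring fees and model uncertainty; it is an illustration of how an edge over the market price turns into a stake, not a commitment to a particular bankroll strategy.

```python
def kelly_fraction(p_model: float, market_price: float) -> float:
    """Fraction of bankroll to stake on a binary contract priced at market_price
    (between 0 and 1) when our model says the true probability is p_model."""
    edge = p_model - market_price
    if edge <= 0:
        return 0.0                      # no positive expected value, no bet
    return edge / (1 - market_price)    # Kelly for a yes-contract: (p - price) / (1 - price)

# Model says 72%, market says 60%: full Kelly stakes 30% of bankroll.
# In practice a fraction of Kelly is safer, since the model itself is uncertain.
print(kelly_fraction(0.72, 0.60))  # 0.3
```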
As for how, let’s look at our roadmap again:
This milestone requires the parts below.
Psychographics Layer
Surveys are expensive! One reason is that their answers do not generalize across populations or topics. If you ask one group about property taxes, you can't infer much about other groups' opinions, or about how the same group feels about adjacent questions on sales taxes. To make matters worse, when asking people about questions they have not considered, you need to give them time to learn and deliberate before their answer is predictive of a vote. For this project to succeed, we need to overcome all of these limitations to get meaningful signal on new political questions rapidly.
We need to discover stable and generative traits of voters that are predictive of their answers across different questions. The exact attributes we track will be the largest research project of our organization. To build out this layer, we must evaluate the psychographics of a few thousand people, ask them political questions, and work out which psychographics are the most predictive of answers. The first schema to try is Haidt's Moral Foundations Theory, which identifies five axes: harm/care, fairness/reciprocity, ingroup/loyalty, authority/respect, and purity/sanctity. Another is Schwartz's theory of basic values, which enumerates ten: self-direction, stimulation, hedonism, achievement, power, security, conformity, tradition, benevolence, and universalism.
Once we have a good theory of which psychographic properties predict political stances (perhaps in conjunction with demographics), we then need to determine them at the census block/tract level for everyone in the country. If we’re lucky, these attributes are highly correlated with demographics, and we only need to issue ten thousand surveys to build our model of the country. If we’re unlucky, and demographics and psychographics are rather uncorrelated, we could need as many as 10 million surveys to feel we have a good sketch of the nation. Ultimately, this data is stored in our database alongside the demographic information.
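A minimal sketch of that research loop, assuming scikit-learn and random placeholder data in place of real pilot responses: compare how well a stance is predicted with and without the psychographic features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# One row per pilot respondent: five Moral Foundations scores plus two demographic
# features, predicting a yes/no stance on a single question. Real rows would come
# from the Survey Ops layer; this is random placeholder data.
rng = np.random.default_rng(0)
n = 2_000
X = np.column_stack([
    rng.uniform(0, 5, size=(n, 5)),    # harm, fairness, ingroup, authority, purity
    rng.integers(18, 90, size=n),      # age
    rng.integers(0, 2, size=n),        # urban (1) vs rural (0)
])
y = rng.integers(0, 2, size=n)         # stated stance on the question

# How much do the psychographic features improve prediction over demographics alone?
demo_only = cross_val_score(LogisticRegression(max_iter=1000), X[:, 5:], y, cv=5)
full = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(f"demographics only: {demo_only.mean():.2f}, with psychographics: {full.mean():.2f}")
```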
Survey Ops Layer
We’ll have to run our own surveys, both because we need the psychometrics of respondents and because we’ll need rapid turnaround for new political questions.
The most efficient way to do this is to build a survey population that covers the range of attributes and will consent to ongoing resurveys. We’ll have to pay them both in cash and in the pride that their answers improve the state of politics. And we’ll make it fun. If we can afford to do it in person, we’ll try, but we may need to lean on the use of LLM voice agents. Given these incentives, we will need to put in effort to ensure the answers are real (not LLM-generated, not one person answering multiple times).
Running these ourselves lets us get creative. Can we ask people to place bets on how their county will vote on a given ballot item, and reward those who guess correctly with kudos or cash? Can we incentivize answers with sweepstakes or donations to charity? Can we offer people infographics of their psychometrics and stances as a reward? Can we publish a fun, polished personality test that people want to take? The team running this layer should feel that the job is dynamic and vital.
Issue Stances Layer
Should the US supply Israel with money and weapons? Should we have single-payer healthcare? This layer holds the data representing how people feel about a policy action the government can take. It is the database, schema, and import process.
It will synthesize data from different sources. The primary source would be our Survey Ops layer. A second source could be surveys conducted by other organizations. A third could be media analysis, where robots read, listen, and tag the opinions of pundits, writers, and podcast hosts, providing an early signal for new questions. With enough historical data, we could build a model of how opinions percolate from the media up to the broader electorate.
A significant challenge of this layer is determining what to track and how to define the contents of a stance. How can we quantify someone’s stance on US-Israel relations? Is a vector of responses to questions with 1-to-5 scale answers good enough? Probably not! And it’s not enough to know someone’s opinion; we need to understand the importance of a stance relative to others.
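As a starting point for that discussion, here is one possible shape for a stance record that goes beyond a vector of 1-to-5 answers; every field here is an assumption to be revised.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class StanceEstimate:
    """One group's estimated stance on one issue at one point in time."""
    issue_id: str        # e.g. "us-israel-military-aid"
    group_id: str        # a census tract, a psychographic cluster, etc.
    as_of: date
    direction: float     # -1.0 (strongly oppose) to +1.0 (strongly support)
    salience: float      # 0.0 to 1.0: how much this issue drives vote choice
    certainty: float     # 0.0 to 1.0: how settled the opinion appears to be
    sources: list        # survey ids / media-analysis runs behind the estimate
```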
There is a lot to work out here, still, but building this is a considerable public good in itself.
Election Sim Layer
This layer brings it all together to predict hypothetical election outcomes before the primaries. We synthesize information from two approaches.
The first approach is to forget specific candidates. We come up with a range of possible issue slates a candidate could run on, build an understanding of how each voter would feel towards each slate (negative to positive) by referencing the issue stances layer, and finally run a simulation using a model of the country's electoral dynamics to determine the odds each slate would have in each possible general election. The idea is to understand how slates would perform without understanding the charisma of the candidates running on them.
The second approach is to predict candidate effects. Here, we use our in-house surveying tools to understand people's attitudes towards candidates, contextualized by their demographics and psychometrics. This will capture factors such as incumbency effects, the state of the economy, and other cultural trends not accounted for by the slate-based approach. Again, we run a simulation using a model of the country's electoral dynamics, which includes each cluster of voters' propensity to actually turn out.
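Both approaches end in the same kind of simulation. Here is a deliberately tiny Monte Carlo sketch, with placeholder clusters, turnout propensities, affinities, and noise levels standing in for the outputs of the earlier layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# One entry per voter cluster (say, tract x psychographic type). Every number is a
# placeholder standing in for outputs of the demographics, psychographics, and
# issue stances layers.
clusters = {
    "size":      np.array([120_000, 80_000, 60_000, 140_000]),  # eligible voters
    "turnout":   np.array([0.65, 0.50, 0.40, 0.70]),            # propensity to vote
    "lean_blue": np.array([0.80, 0.55, 0.35, 0.20]),            # affinity for slate B over slate R
}

def blue_win_probability(n_sims: int = 10_000) -> float:
    """Share of simulations the blue slate wins, with noise on turnout and lean."""
    wins = 0
    for _ in range(n_sims):
        turnout = np.clip(clusters["turnout"] + rng.normal(0, 0.05, 4), 0, 1)
        lean = np.clip(clusters["lean_blue"] + rng.normal(0, 0.05, 4), 0, 1)
        votes = clusters["size"] * turnout
        blue_votes = (votes * lean).sum()
        wins += blue_votes > votes.sum() / 2
    return wins / n_sims

print(f"Blue slate win probability: {blue_win_probability():.0%}")
```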
Our best prediction of who wins the hypothetical matchups will combine both approaches, and those combined predictions are what we bet on in prediction markets. Each approach has value on its own, though. The slate-based prediction would likely highlight moderate positions outside the Overton window of the primary races, and if we have a reputation for being right (due to our rigorous approach and prediction-market successes), it would inspire candidates who can credibly endorse such slates to run for office when they usually wouldn't bother. Or at least, it would nudge candidates who have already decided to run into adopting key planks from the slate that they might not have otherwise. The candidate-based prediction would boost candidates likely to be popular with all voters (not just primary voters) and thus prevent divisive candidates from advancing to the general election.
How you can help
This document is as much a signal flare as anything else. If you have thoughts, send them my way. If you know someone who might find it interesting, please pass it on. If you know anyone working on a similar project, please let me know.