
Here's this re-explained with a simpler example.

Imagine you have 1,000 votes. You want to show that your political party got 60% of the vote, so, you claim:

My party: 600 votes
Opposition: 300 votes
Other: 100 votes

Presto, we got a good breakdown. The people will buy it....

It makes sense that 600 is exactly 60% of 1,000, because this was an artificial example.

But in the real world, we don't get 1,000 votes.

We get 10,058,774 votes. What are the odds that the % of votes you get is a round number like 60%, or 51.2%? They're infinitesimally small. You're much more likely to get ugly numbers, like 59.941323854% of the vote, unless you choose some artificial percentage and work backward.



This is not correct, since you could make the same claim about any vote count actually obtained: if you asked someone to pick a number from 1 to 10 million, whatever number the person picks (assuming iid picks) has, by definition, a probability of 10^-7.

The problem is more subtle.

There are around 10,000 integers n such that n/10058774 when rounded to 3 decimal places gives 0.512. Of those 10,000 this particular one has the smallest rounding error. That's what gives one the sense that probably they started with the clean fraction 0.512 and then worked their way to the tally.
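
To make that count concrete, here is a minimal sketch using only the total from the thread. It assumes "rounds to 0.512" means the share falls in the half-open interval [0.5115, 0.5125); the function name is mine, not from the article.

```python
TOTAL = 10_058_774  # total vote count quoted in the thread


def tallies_rounding_to(share_thousandths, total):
    """All integer tallies n whose share n/total lies in
    [share - 0.0005, share + 0.0005), using exact integer arithmetic."""
    lo_num = (2 * share_thousandths - 1) * total  # lower edge times 2000
    hi_num = (2 * share_thousandths + 1) * total  # upper edge times 2000
    lo = -(-lo_num // 2000)  # ceiling division: first tally inside the interval
    hi = -(-hi_num // 2000)  # first tally at or past the upper edge (excluded)
    return range(lo, hi)


tallies = tallies_rounding_to(512, TOTAL)

# The tally with the smallest rounding error is the one closest to 0.512 * TOTAL.
best = min(tallies, key=lambda n: abs(1000 * n - 512 * TOTAL))

print(len(tallies))  # how many tallies display as 51.2%
print(best)          # the single tally closest to exactly 51.2%
```

So roughly ten thousand tallies would all be reported as 51.2%, and only one of them is the closest possible to the clean fraction.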


My favourite part of my math education, solutions were always nice.

Compute the eigenvalues of a random-looking (but still integers) 4x4 matrix? Oh, it's sqrt(2), I probably didn't make an error in the calculation.

Then came the advanced physics / mechanics exam. It threw a wrench into our beautiful system. The results were just about anything, incredibly ugly, like the real world :yuck: :vomit:


> Oh, it's sqrt(2), I probably didn't make an error in the calculation

You remind me of my university maths exam. In all the past papers, the eigenvalues came out to be round numbers. But in the real paper I sat, no matter how many times I tried to find my mistake, they didn't. I wasted hours of the exam on that.

It was the professor's final year before his retirement.


The year that I took AP Physics, every single piece of study material and practice test exercised only really simple math - small numbers, everything cleanly worked out into integers, etc etc. I did almost everything in my head or with quick notes on paper. This pattern was so consistent I almost didn't bring my calculator into the actual exam because I hadn't needed it all year, and grabbed it only at the last second "just in case".

Turns out that was not a design goal of the real exam and basically nothing worked out to neat, small integer solutions - I probably would have hard failed without the calculator. I'm still sort of confused why prep materials and the real exam diverged so much.


I had a university exam where my calculator literally didn't work. I put a note on the paper to that effect and worked out as far as I could by hand without actually giving any of the final answers. Given the test was about knowledge and not the precise answers, I don't think it harmed me any (my grade was over 80%).


I passed my physics classes refusing to evaluate the final expressions, after all that's what calculators and computers are for. I don't feel that had a huge impact on my grades either and my sanity/stubbornness went unharmed.


Except in the real world we are allowed to offload the computation to a computer and have more time to double check things. Nice solutions are necessary due to time and resource constraints that exist within an educational setting.


Tests should have checksums built into the correct answers.


You are right that it is unlikely for one candidate to get a vote count that exactly matches a percentage with one decimal (1 in 10,000, as per the source article).

But it's even more unlikely and astonishing that the second candidate also gets a number of votes corresponding to a percentage with one decimal!

This is highly suspicious if the vote counts are presented as the official result.

But as mentioned in the comments, we cannot rule out that someone was given only the total vote count and the percentages rounded to one decimal, and thought it would be helpful to recalculate how many votes each candidate must have gotten.


The results we're discussing were read live by the president of the Venezuelan electoral authority. It is possible that they... simply read the wrong results? Like an internal estimate rather than the real numbers? But that is a wild mistake for the electoral authority to make.

https://news.ycombinator.com/item?id=41125031


If I'm in charge of forging a presidential election, how difficult is it for me to use realistic, "ugly" numbers to sell it more effectively?


In light of recent analyses of suspicious elections (Iran, Russia, Venezuela), it seems harder than it sounds to avoid discernible patterns.

On the other hand, the goal is to get away with fraud, not to convince an international community who will likely look for any confirmation of their suspicion. It would be interesting to look for patterns in a (presumably) fair election like the recent British one for comparison.

Disclaimer: I have never tried to rig elections myself so I don't really know how hard it is.


> On the other hand, the goal is to get away with fraud, not to convince an international community who will likely look for any confirmation of their suspicion.

Despite my joke in a sibling comment, this is key. When you're a politician everything is power relations. Sometimes it's necessary to show that you have the power to semi-obviously rig an election. Your bargaining position is different if it requires military force to remove you vs just an unhappy electorate. You can achieve different things.


Yup. It’s a special kind of power that can flat out rig an election and have opponents ‘fall out of windows’ with no repercussions.

The type no one wants to even be seen trying to challenge.


>It would be interesting to look for patterns in a (presumably) fair election like the recent British one for comparison.

Someone should run this same analysis on all the election data they can get their hands on. Who knows what might be found.


There were some pretty shit analyses of the 2020 US elections that Matt Parker covered with videos like "Why do Biden's votes not follow Benford's law?"[1] and "Why was Biden's win calculated to be 1 in 1,000,000,000,000,000?"[2].

I don't know about other countries, but the amount of data that every county in every US state produces makes systematic fraud pretty much impossible. If there's literally only 3 numbers produced by the Venezuelan government, you need to be seriously incompetent to have detectable fraud because techniques like Benford's or Zipf's law need lots of individual numbers.

[1]: https://youtu.be/etx0k1nLn78

[2]: https://youtu.be/ua5aOFi-DKs


Those are really well done, not to mention hilarious. Thanks for sharing.


If you were forced against your will to aid in this type of fraud, might you not intentionally include a subtle error in your work that reveals its illegitimacy to a careful observer?


Gun to my head? No.


I'd have thought it more likely that the stress would cause an accidental subtle error.


E. Goldstein wins with 51.2HELPIMTRAPPEDINANELECTIONRIGGINGBUNKER% of the vote!

One would have to take care with the analysis because humans are actually trapped in vote counting bunkers (or local sports halls more likely) in legitimate elections. Any analysis that simply concludes votes were subject to the foibles of hand counting wouldn’t be very useful.


This kind of mistake is easy to avoid. The problem is that there are a lot of potential mistakes that could be made, this is just one of them.


And the people doing the fraud aren't going to be computer scientists or statisticians. They were chosen for their loyalty to the dear leader.


The dear leader wasn't chosen for his ability, either, but for his loyalty to the previous dear leader.


Not difficult at all. Just pick the approximate numbers you want and then introduce a random error of a few percent. (Normal, uniform, doesn't really matter). This is also not hard for statistics experts to detect, but it's much harder to prove (aka you've got plausible deniability).

One wonders why they didn't even bother to do fraud slightly better.


If they can get away with being blatant, that is even more of a show of power.

Think of it this way - who has more power in a relationship? The one who is really good at cheating and hiding it? Or the one who doesn’t even try to hide it, but suffers no consequences?

Just look at how many comments are trying to figure out how the numbers could be legitimate, and how unlikely it is that Maduro is going to actually be removed from power.


Unless done carefully this will almost certainly fail Benford’s Law.

Manipulating statistics is harder than you think.


> Unless done carefully this will almost certainly fail Benford’s Law.

IIRC Benford's law relies upon things that have power-law underpinnings, such as iterated growth% at different rates. In contrast, relative vote amounts at a given point in time don't have many ways to exhibit that, particularly when the total number of voters is fixed rather than having voters divide like bacteria during polling day.

However it might work if you were checking the growth in total eligible voters in different locations over time.

I like to imagine Benford's Law a bit like throwing randomly distributed darts through the air at a paper target, except the target is graph paper with log-10 subdivisions. The "leading 1" zones are simply bigger targets. [0]

[0] https://commons.wikimedia.org/wiki/File:Logarithmic_scale.sv...
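
The dart picture can be sketched in a few lines. This assumes the darts land uniformly on a log-10 axis spanning a whole number of decades; the function names and the seed are illustrative only.

```python
import math
import random


def benford_expected(d):
    """Share of the log axis covered by leading digit d (Benford's law)."""
    return math.log10(1 + 1 / d)


def dart_leading_digits(n_darts, decades=3, seed=0):
    """Throw darts uniformly on a log-10 axis and tally the leading digits."""
    rng = random.Random(seed)
    counts = {d: 0 for d in range(1, 10)}
    for _ in range(n_darts):
        u = rng.uniform(0, decades)        # position on the log axis
        mantissa = 10 ** (u % 1.0)         # value in [1, 10) within its decade
        digit = min(9, int(mantissa))      # clamp guards against float round-up to 10.0
        counts[digit] += 1
    return {d: c / n_darts for d, c in counts.items()}


observed = dart_leading_digits(100_000)
for d in range(1, 10):
    print(d, round(benford_expected(d), 3), round(observed[d], 3))
```

The "leading 1" zone covers log10(2), about 30% of each decade, which is why it collects the most darts.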


It's my understanding that legitimate vote totals aren't likely to conform to Benford's law in the first place.

Even if that's the case, though, there might very well be other applicable tests this would run afoul of.


I’m not a statistician so I may be confusing it with Zipf’s law. But IIRC tallies from individual precincts should roughly conform to Benford’s law.


Precincts have roughly even populations and therefore typically don't conform to Benford's law. https://www.youtube.com/watch?v=etx0k1nLn78&t=76s&pp=ygUQYmV...


I think the concern is that precinct sizes tend to cluster, so that results - for a large portion of the data - do not span a full order of magnitude.


To elaborate, if we imagine a polity with precincts that turn out 10,000 people each election, with two major party candidates that each get between 20% and 80% of the vote, we'd see precisely 0% precincts with a leading digit of 1, much less the ~30% predicted by Benford's law. Of course that doesn't exactly describe any real polity, but it doesn't seem surprising that real elections would be enough like that to screw with the pattern.
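
A quick simulation of that imagined polity, as a sketch. It assumes every precinct has exactly 10,000 voters, two candidates only, and a uniformly drawn split between 20% and 80%, none of which describes a real election.

```python
import random


def leading_digit_shares(n_precincts, turnout=10_000, lo_pct=20, hi_pct=80, seed=0):
    """Leading-digit frequencies of both candidates' precinct tallies."""
    rng = random.Random(seed)
    counts = {d: 0 for d in range(1, 10)}
    lo_votes = turnout * lo_pct // 100
    hi_votes = turnout * hi_pct // 100
    for _ in range(n_precincts):
        votes_a = rng.randint(lo_votes, hi_votes)  # inclusive on both ends
        votes_b = turnout - votes_a                # also between 20% and 80%
        for votes in (votes_a, votes_b):
            counts[int(str(votes)[0])] += 1
    n_tallies = 2 * n_precincts
    return {d: c / n_tallies for d, c in counts.items()}


shares = leading_digit_shares(5_000)
print(shares)
```

Every tally lands between 2,000 and 8,000, so leading digits 1 and 9 never occur, even though nothing about the simulated election is fraudulent.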


The Biden election in 2020 also failed Benford's law - unless you're suggesting that one was fake, it seems that failing Benford's law is okay.


There was a good paper in an American Statistical Association journal about this.

https://chance.amstat.org/2022/04/benfords-law-votes/

https://www.tandfonline.com/doi/abs/10.1080/09332480.2022.20...


Nobody is saying that failing a statistical test is by itself indicative of anything.


Statistically, yes they are. Immibis just did.


But votes aren’t tallied in one location, districts individually tally.

So now you’ve got to force each of those districts to change the numbers.


Generate 10,058,774 random floating-point numbers from [0,1), and count how many fall into each of the intervals: [0, 0.512), [0.512, 0.512+0.442), or [0.512+0.442, 1). You can do it with fewer calculations if you know how to generate random numbers from a binomial distribution. Then you should calculate percentages from these three counts just to be sure they are right.
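
As a statistical illustration of the interval-counting idea, here is a minimal sketch. It uses the 51.2% and 44.2% shares quoted elsewhere in the thread, treats the remainder (4.6%) as "other", and runs on a smaller electorate so the pure-Python loop finishes quickly. The function name and seed are my own.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def simulate_tally(n_voters, shares, seed=None):
    """Draw one uniform number per voter and count how many fall in each interval."""
    rng = random.Random(seed)
    # Interior cut points of [0, 1), e.g. [0.512, 0.954] for three candidates.
    edges = list(accumulate(shares))[:-1]
    counts = [0] * len(shares)
    for _ in range(n_voters):
        # bisect_right maps x to the index of the half-open interval containing it.
        counts[bisect_right(edges, rng.random())] += 1
    return counts


N_DEMO = 100_000  # scaled-down electorate; the thread's total is about 100x larger
demo_counts = simulate_tally(N_DEMO, [0.512, 0.442, 0.046], seed=1)
demo_percentages = [round(100 * c / N_DEMO, 3) for c in demo_counts]
print(demo_counts, demo_percentages)
```

The resulting percentages land near the targets but almost never exactly on them, which is the point the comment is making about what sampled counts look like.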

It is not difficult at all, but it needs some basic programming skills and some basic knowledge of statistics.


Hard. Naively, run the election, use real vote counts, claim that you're the one that got more.

Ok but now we need to fake this on a local level. But everybody knows people in this district are more Party A and in that are more Party B. Well.... let's correct for that...

If you want to give top line numbers, fine. Credible local numbers would get really hairy


> Naively, run the election, use real vote counts, claim that you're the one that got more.

I think this would be even more obvious, even to the general public, especially when there is a landslide victory. Oh, your result is 70%? That's exactly what the exit polls said about our candidate.


It's not hard for you or me.

But for the brutal thugs running Venezuela, it is a very advanced concept.


it's like Russia killing someone with polonium. Hitting 60% exactly sends a message to anyone with the bright idea of running against the brutal thug's preferred candidate that it might not be a good idea, because there are brutal thugs involved.


I actually think this is a fascinating question, although perhaps for different reasons than whatever might have led you to ask it.

I think you raise a legitimate point that it's not that hard to create ugly numbers.

I also think that authoritarian social dynamics come to these questions with a kind of brutal simplicity, lack of intellectual curiosity or creativity, and a lot of the traits that would entail a value for democracy are mutually exclusive with the brute simplicity of authoritarian mindset. And so the story they choose to tell of how they won is going to have similar hallmarks of brute simplicity and absence of nuance.


Votes are not random.

So it would be non-trivial to make results look real.

Additionally, if even a few polling places release real data - that can really complicate things.

It will look very, very suspicious when some polling places display wildly different behaviors (especially if they match expectations) than the rest of the polling places.


Depends on if you're in on it with the forgers or if you're against it but being forced to do it. It's the perfect clue to leave in, as its intention is plausibly deniable and you can tip statisticians to uncover the fraud.


I don't know, but the numbers they provided here are impossible.


You just need to run some Monte Carlo sims where the priors are your desired rates, then use these results to alter the real numbers. It's as random as it gets.


Use the real numbers but change the owners of each count.

Except, of course, if the winning party has a huge advantage which you don't think people would buy.


No you see, they had the perfect plan.

But they got foiled by that one thing they always forget at every single election (that never gets brought up when your government agrees with the result, because in that case it's just a "statistical anomaly" or "shit happens sometimes")


Thanks. Note to self: Next time I want to rig election results, generate a random integer between 54.2% and 54.3% of the vote and count it as the winner's vote, subtract from total pool, wash rinse repeat.


It's remarkable that they don't do this.

The sheer incompetence of the Maduro government and other governments that rig election results is surprising.


Well, yes, but I think you are vastly overestimating the percentage of the population that would even think of this. Countries like Venezuela have a major problem with brain drain as it is. There's very little chance they would think to get a competent statistician involved in rigging their election. They're just simple numbers, right? You don't know what you don't know.


You’re much more intelligent than any of the people that control your life, especially so in a dictatorship. That’s a pill hard to swallow, but ideas itt aren’t even remotely a concern for them. They are idiots with a microphone who excel at being at power, that’s it. Everything else gets done by lower and lower ranks with higher and higher competence. Since election rigging isn’t an industry, you can’t expect it to be any smart. It’s not even that “only 1% who understands will be unconvinced, so why care”. They simply aren’t aware of this because it works without it.


Have you ever heard a politician speak? Not surprising at all.


So there are 2 things that may make this less surprising.

1) The Maduro government is more like a large gang that is holding a population hostage, than a government.

All major businesses/imports/exports are owned by people connected to the Maduro regime. They are extorting remittances out of the population because they're the only ones who can import products. The government is so incompetent, it no longer has sufficient machinery or brains to operate their petroleum extractors, so instead they've pursued the more lucrative method of drug smuggling.

The upper echelons of military are in on it and are all very individually wealthy, the lower echelons are brainwashed, but still well compensated for a "government" employee in Venezuela. Think $100 month vs $3 a month.

This "government" will never give this up. They make too much money, and they have bought out the military. They can't just peacefully go away, or they will be tried for their crimes in any major nation. Almost every country has placed sanctions on various high level individuals from the government and frozen all of their assets.

2) There are no intellects in this government. The socialists that fought violently in the 90s that had little/no education rose up the ranks and are now extravagantly rich and powerful.

Imagine if you took a bus driver and made him the dictator of a country. That is exactly what happened, Maduro was literally a bus driver.

This is not to disparage bus drivers, they're fine people, but countries should be run by experts. Economists, politicians, lawyers, people with some form of education.

They don't understand economics. They don't understand engineering. Almost all of the intellectual work of the country is outsourced to Chinese or Russians.

The entire country is being held hostage by people who have about a 3rd grade education, and that's being generous. But it's because they have guns, and money. But mostly the guns.


Why didn't they outsource the election count rigging to the Russians or Chinese?

Because they didn't know they would need help?


Because the Russians (don't know about China) also don't try to hide it.

Not hiding your election fraud isn't always a sign of incompetence; it's also a show of power over the people these dictatorships are oppressing. In Russia, the election is blatantly forged[0]. The goal here isn't just to validate the dictatorship; it's to dare you to speak up against it so the nice men with guns can knock on your door and tell you to knock it off, whether that's nicely or less nicely.

[0]: https://www.economist.com/graphic-detail/2021/10/11/russian-...


Knock it off why? The electorate doesn’t care, it will never get to tv or average feed or even have effect on an average person who barely understands the graph. Everyone knows it’s rigged. This “power show” argument copes with dumb reality more than anything else, it is a HNer idea of how to be a dictator. Power doesn’t need to show itself in such an intricate way, as if it was hiding. It’s right in your face all the time.


That would significantly weaken whatever negotiating position they have with either country.


When I try to explain the issue, it boils down to this: the results look like they have been cooked. And the probability of that happening by chance is 1 in 100 million.

Mathematically, if votes are random, with 10,058,774 voters you have 10,058,774^2 ~= 1e14 possibilities of different results for 3 candidates (Maduro, Gonzalez and "Other"). On the other hand, the number of possible results that land exactly on the closest integer to be a round 0.1 percentage point is 1000^2 = 1e6. So the probability of the actual votes landing on a round 0.1 percentage point purely by chance is 1000^2/10,058,774^2 ~= 1 in 100 million.
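
The arithmetic above is easy to check. This sketch just reproduces the comment's own back-of-the-envelope counting (every pair of tallies treated as equally likely), not a proper statistical model.

```python
TOTAL = 10_058_774  # total votes quoted in the thread

# Rough count of possible (candidate 1, candidate 2) tallies; the third is the remainder.
possible_results = TOTAL ** 2

# Rough count of results that sit on a round 0.1 percentage point for both candidates.
round_results = 1000 ** 2

probability = round_results / possible_results
one_in = possible_results / round_results

print(f"probability ~ {probability:.2e}")
print(f"about 1 in {one_in:,.0f}")
```

Under those assumptions the chance comes out at roughly 1e-8, which matches the "1 in 100 million" figure.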

Of course the votes are not entirely random, but they have a random element, so it gives a rough idea of the reality.


The odds are, that if you go looking for any one of multiple low-probability events, one of them will be found to have happened.


You can also encode little ASCII messages into the fraction. 60,7097107101 % is the world's smallest whistle.
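
For anyone who wants to read the whistle, here is a small decoder sketch. It assumes the digits after the decimal comma are decimal ASCII codes for printable characters (32 to 126), so a code starting with "1" is three digits long and any other code is two digits long. That parsing rule is my assumption, not something the comment states.

```python
def decode_digits(digits):
    """Decode a run of concatenated decimal ASCII codes (printable range only)."""
    chars = []
    i = 0
    while i < len(digits):
        # Printable codes are 32..99 (two digits) or 100..126 (three digits, start with 1).
        width = 3 if digits[i] == "1" else 2
        chars.append(chr(int(digits[i:i + width])))
        i += width
    return "".join(chars)


fraction_digits = "7097107101"  # the part after "60," in the comment
message = decode_digits(fraction_digits)
print(message)
```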


It's called "digit tests" and it was further theorized that the last digit had a particularly even distribution in natural, honest elections.

Further research showed that last digit test wasn't very good - there are multiple obvious counters to the test.


This isn't a digit test, though—the giveaway here isn't a problem with the last digit (or any single digit), the giveaway is that the vote tallies reported exactly match what you would arrive at if you attempted to derive them from nice round 3-digit percentages.


[flagged]


Many instances of numerical manipulations end up being discovered because the cheater didn't understand math well enough to hide their tracks correctly.

See this link for a recent example that's been on my mind: https://en.m.wikipedia.org/wiki/Mnet_vote_manipulation_inves... Basically, they just grabbed a random-looking numerical constant and used its multiples as the difference between vote numbers.


I have always called the argument that someone couldn't possibly have done it because it would have been stupid to do so the "Connie Defense", after a woman who tried to assert the same before I picked her up and put her out of the house like Fred Flintstone's cat.

This was after she was caught driving without a license while speeding and smoking weed in a state where it was still illegal.


I don't understand your logic.

"The easiest thing to do" is to do less thinking and fewer mathematical operations.


What's wrong with announcing results with rounded percentages?!


Nothing. The problem is when you obviously picked the rounded percentages that sounded good first and then calculated the number of votes from that.


Not necessarily. If the person announcing was given the number of votes and rounded percentages, then this could explain it. For example, in my country, they always report only turnout as a percentage with a single decimal and the share of each candidate/party with up to 2 decimals, never the number of votes - who cares about the absolute numbers anyway?


The thing is that the absolute vote counts reproduce the announced percentages to within a few millionths of a percentage point, just as if someone had typed "51.2%" into a calculator and worked backwards. The percentages weren't actually rounded: they come out to roughly 51.19999% for the President and 44.19999% for the opposition. The only credible explanation is that they picked the percentages first and then cooked the absolute numbers to line up, so the counts look "ugly" but the percentages are neat.
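
Working backwards on a calculator looks like this, as a sketch. The total and the 51.2% and 44.2% shares come from the thread; the 4.6% for "Other" is my assumption (it is simply what is left over), and the counts printed are derived here, not copied from an official source.

```python
TOTAL = 10_058_774

# Shares in tenths of a percent; "Other" is assumed to be the remainder.
SHARES_TENTHS = {"President": 512, "Opposition": 442, "Other": 46}


def implied_count(tenths, total):
    """Nearest integer to (tenths / 1000) * total, rounding halves up."""
    return (tenths * total + 500) // 1000


implied = {name: implied_count(t, TOTAL) for name, t in SHARES_TENTHS.items()}
recomputed = {name: 100 * count / TOTAL for name, count in implied.items()}

for name in SHARES_TENTHS:
    print(f"{name}: {implied[name]:,} votes -> {recomputed[name]:.7f}%")
```

The recomputed percentages differ from the round targets only in the sixth decimal place, which is the signature the comment is describing.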


As they say in the article, one explanation is that the guy publishing the numbers was not given the actual counts, only the percentages, and imputed the counts on his own.


That's fishy in its own right. The absolute vote tallies are the key thing in a democratic election. The percentages are a derived value to quickly make sense of the vote tallies, but the vote tallies are the actual results. Why would you need to derive vote tallies from percentages when you derived the percentages from the tallies?

It'd be like feeding your English marketing copy into Google translate to Spanish and back and using that instead of the original copy.


Because voting results are universally reported as percentages, that's what everyone uses and understands.


Reporting just the percentages makes sense. Reporting rounded versions of those percentages not only makes sense, but is the universal idiom for reporting percentages. But reporting synthesized vote counts from the percentages --- even from non-rounded percentages --- is not normal.

People on this thread are hung up on the reported percentages, but those don't matter in this analysis at all. They're not the problem. The problem is the counts themselves. Discard the reported percentages entirely; exact same critique, one statistics students would spot instantly.


Maybe I don't understand what you have identified as the problem. My understanding of the article is that the raw tallies should not correspond to "precise" rounded percentages. The article in an addendum points out one way that could legitimately occur (some underling has the totals and rounded percentages but needs the raw tallies and naively multiplies to get them).


I'm summarizing that PPS in my comment. The exculpatory scenario is: (1) start with real numbers, (2) compute percentages, (3) round percentages, (4) discard original numbers, (5) compute new numbers from the round percentages.

Steps (4) and (5) don't have any valid explanation, and few (though maybe some) plausible human error explanations.
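
Steps (1) through (5) can be sketched as a round trip. The "true" counts below are entirely hypothetical numbers I picked so that they sum to the announced total and round to 51.2 / 44.2 / 4.6; they are not real results.

```python
TOTAL = 10_058_774

# (1) Hypothetical real counts, invented for illustration only.
hypothetical_true = [5_151_000, 4_444_500, 463_274]


def to_tenths(count, total):
    """(2)+(3): percentage rounded to one decimal, returned in tenths of a percent."""
    return (2000 * count + total) // (2 * total)


def from_tenths(tenths, total):
    """(5): count recomputed from the rounded percentage, rounding halves up."""
    return (2 * tenths * total + 1000) // 2000


rounded = [to_tenths(c, TOTAL) for c in hypothetical_true]
# (4) The original counts are discarded; only `rounded` and TOTAL survive.
imputed = [from_tenths(t, TOTAL) for t in rounded]

print(rounded)   # tenths of a percent
print(imputed)   # counts synthesized from the rounded percentages
print([i - t for i, t in zip(imputed, hypothetical_true)])  # error per candidate
```

The round trip does not recover the starting counts: each synthesized tally is off by hundreds or more, and any set of real counts in the same rounding bucket would collapse to the same three numbers.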

As long as we're on the same page that nobody ever had any business reporting the numbers in step (5) --- they're completely fictitious! --- I don't have much to argue about here. The politics aren't interesting to me.


> Steps (4) and (5) don't have any valid explanation, and few (though maybe some) plausible human error explanations.

It does... Person A didn't send the original numbers to Person B. And then Person B wanted to publish a document that showed the original numbers anyway (maybe they were asked to by a media person or something). And they did the glaringly obvious calculation of g% x total_votes and called it a day instead of being delayed for hours or days waiting for a request for the original numbers. This is really a very common scenario that happens everywhere in multiple fields.


Person B made up vote counts for the candidates in your scenario. That is not a very common scenario in official elections results reporting, which is what this was.


No, they are universally reported in raw numbers accompanied by percentages, as indeed they were here. The percentages are universally understood to be derived from the raw numbers and not vice versa. The votes are the ground truth.

That's how elections always work. The votes are what counts, the percentages are an abstraction to make the votes easier to parse. Any government agency that doesn't operate that way doesn't understand democracy, even if they weren't committing outright fraud.


First, these were intermediate results. Second, virtually no one reads or understands raw tallies; I don't know anyone who would or could quote them in any election. The final result, the one published as a headline in the newspaper, is the rounded percentages.

No one is saying that the percentages are not derived from the raw tallies. They are saying that, somewhere in the game of telephone on the way to the person who goes on TV and reports, only the percentages may have been communicated. That person realized they should include the tallies too, and so imputed them from the numbers they had, namely the total votes cast and the percentages (and naively it seems obviously okay to do that).


> Virtually no one reads or understands raw tallies...

I believe virtually anyone could look at the raw tallies and see which is the largest, and that a majority could calculate the percentages by themselves, if they were so inclined. This was direct election plurality voting, not some sort of proportional voting scheme, and even if it were, having the raw tallies in the public domain is essential to transparency, verification and legitimacy.


> it might be that somewhere in the game of telephone to the person that goes on TV and reports only the percentages were communicated and they realized they should put the tallies in too so they imputed them from the numbers they had

And I'm telling you that anyone who handles votes this way doesn't understand democracy. The best case scenario here is that the Venezuelan government doesn't really care about the vote tally (which is, again, bad, because the votes are the thing). The worst case is that they fabricated it entirely. Neither one speaks well for the state of democracy in Venezuela.


Finally some good explanation


That's a really bad thing and a reason not to trust the entire system.

They should report the absolute number of votes at each counting station.


I am talking about publicizing on media - the raw data of Bulgarian elections is available for download in real time during the counting and afterward [0], including the scanned protocols of each polling station and the video recording, which is now required. Even if the voting is electronic in particular (well, most) stations, there's still a paper protocol signed by the members of the section's committee.

A tweet, an article, or a chart on TV doesn't prove anything, as they are not official documents.

[0]: https://results.cik.bg/


Well, nobody else is talking about the headline numbers. That was the miscommunication.


A possibility, but not a good one. Depending on your goal, you either care a _lot_ about the raw number (in which case doing that calculation is _insane_), or you don't care really at all (so...why would you calculate it?).


But they did report the absolute number of votes.


Where? In a tweet?


It's...rarely to never done? The exact counts are nearly always provided by voting officials.

The press might summarize an election in whole numbers and maybe round up, but...that's very different from voting officials doing it.


They didn’t announce the percentages, they announced the vote counts.


Why is that article not pointing to the source? I've looked for it, and I couldn't find it.



I don't see the total number of votes/ballots. Is the vote in Venezuela 100% electronic? If not, there might be invalid paper ballots, too.


They are read aloud by the presenter in Spanish (both total votes and percentages). You can also see them in the tweet if you don't speak Spanish. The announcer appears to represent the national electoral council (Consejo Nacional Electoral), so it's unlikely that he didn't have access to the exact counts (and had to compute them from percentages).


[flagged]


Or maybe western democracies do demand higher standards of transparency. Notice which countries called to congratulate Maduro immediately without waiting a day to find out if any of the announced results were valid: Russia, Iran and Cuba. Paragons of liberty.


s/(non-)?western//g

> See when democracies do it, it's ¨just to make it simpler"

> When dictatorships do it, it's "fraud".

Seems plausible.

Credentials and reputation matter.


In western democracies, among others, we use "" to denote that we are quoting somebody.


In western countries power is handed over routinely amongst political enemies. So what, the incumbent is cooking the books to give power to their rival? If power is being handed over, where is the book cooking?

Here they are staying in power.

You think the Liberals in Australia wanted to give power over to Labor? You think Obama liked having Trump follow him? Macron cooked the votes so his own party lost the majority?



