Reviewing Ethereum Smart Contracts

krupan · on Sept 30, 2017

I saw this on Twitter today:

Cyber security is a nightmare.

Crypto asset security is a 10X nightmare.

Smart contract security is a 100X nightmare.

https://twitter.com/lopp/status/911020364829884416

616c · on Sept 30, 2017

But if pioneers do not grind their teeth on real world implementation, application, and hardening how do you expect those scales to change? Was it better when even US mil branches roamed the public internet and people dialed in with naive attacks?

I applaud these developers like those who died exploring countries unknown, minus the raping and pillaging and colonizing (not ironic; I just know people on HN will call me out if not clear preemptively).

kevingadd · on Sept 30, 2017

It remains a complete embarrassment that Ethereum is based on a home-grown trainwreck of a programming language instead of one of the existing well-supported, battle-tested verifiable programming languages out there. Honestly, they could have used C and been better off because of the amount of C verification tools out there - Solidity hardly seems safer than C and its compiler has a history of potentially catastrophic newbie mistakes [1]. Note that some of these bugs are in an optimizer, something a security-focused package handling money shouldn't even need, especially if it risks compromising correctness of the contracts.

People talk about Solidity being JS-inspired, but simply using JS would have also been better, because there is verification tooling out there for it, along with compression tooling to make it smaller. And people can use higher-level to-JS compilers like Typescript to get stronger safety assurances if they wish.

Any claims that performance justifies the use of a home-grown language are also absurd: High-performance trading firms like Jane Street make use of verifiable functional programming languages like OCaml despite the importance of speed & latency in order to turn a profit. [2]

No amount of effort from security researchers is going to completely compensate for the fact that the people running Ethereum have incredibly bad taste and make poor decisions. I remain concerned for people who put their wealth at risk investing in smart contracts because the average contract probably has unnoticed issues.

[1] http://solidity.readthedocs.io/en/develop/bugs.html

[2] https://www.janestreet.com/technology/

zeroxfe · on Sept 30, 2017

Whoa there! Sure, Ethereum has its problems, but going as far as "the people running Ethereum have incredibly bad taste and make poor decisions" displays a huge amount of ignorance towards some incredible engineers who have built a fantastic (and very successful) platform.

Ethereum is far more than just the contract language -- it's the distributed consensus system, the merkle-tree backed blockchain, the p2p network, the cryptographic transaction ledger, the virtual machine, and much more. It is way more complex and sophisticated than you're giving it credit for.

There have been multiple languages targeting the EVM, and they've been getting incrementally better, and will continue to evolve rapidly as more PL experts get involved.

So for all the things they've got wrong, they've got lots more right -- and it's only been a bit over two years since the initial release. Give them a break!

UncleMeat · on Sept 30, 2017

The EVM itself is fucked, not just Solidity. It has poorly defined semantics and has a number of strange behaviors that you don't want in high integrity systems.

DennisP · on Sept 30, 2017

For example?

nokcha · on Sept 30, 2017

If a function was not designed to be reentrant, then merely sending money to a malicious third-party contract can break the function's intended preconditions and result in a vulnerability. This is exactly what happened with the DAO exploit.

http://hackingdistributed.com/2016/06/16/scanning-live-ether...
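To make the reentrancy point concrete, here is a minimal Python sketch. It is not Solidity and not the DAO code; `Vault`, `withdraw_unsafe`, `withdraw_safe` and `drain` are hypothetical names invented for illustration. The only assumption is that paying someone hands control to code the recipient wrote, which is what sending money to a contract does.

```python
class Vault:
    """Toy ledger; withdraw_unsafe() pays out before updating its books."""

    def __init__(self):
        self.balances = {}   # account name -> credited amount
        self.reserves = 0    # total funds actually held

    def deposit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0) + amount
        self.reserves += amount

    def withdraw_unsafe(self, who, on_receive):
        amount = self.balances.get(who, 0)
        if amount == 0 or self.reserves < amount:
            return
        self.reserves -= amount
        on_receive(amount)          # external call: recipient code runs here
        self.balances[who] = 0      # too late, the recipient may have re-entered

    def withdraw_safe(self, who, on_receive):
        amount = self.balances.get(who, 0)
        if amount == 0 or self.reserves < amount:
            return
        self.balances[who] = 0      # update state first
        self.reserves -= amount
        on_receive(amount)          # only then hand control to the recipient


def drain(vault, withdraw_name, max_depth=10):
    """Attacker deposits 1, then re-enters withdraw from its receive hook."""
    stolen = []
    vault.deposit("attacker", 1)

    def on_receive(amount):
        stolen.append(amount)
        if len(stolen) < max_depth:
            getattr(vault, withdraw_name)("attacker", on_receive)

    getattr(vault, withdraw_name)("attacker", on_receive)
    return sum(stolen)
```

With a victim's deposit in the vault, `drain(vault, "withdraw_unsafe")` keeps collecting until the reserves run out or `max_depth` is hit, while the `withdraw_safe` ordering limits the attacker to the 1 unit they put in.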

kevingadd · on Sept 30, 2017

I would argue that attacks like this https://blog.golemproject.net/how-to-find-10m-by-just-readin... are only possible due to the lack of attention paid to safety in the design of EVM. You can compensate for it by putting more smarts into Solidity and other compilers but the basic fact is that the VM doesn't do much to prevent it. Other modern (and even non-modern) VMs take steps to prevent this without reducing expressiveness or speed at all.

At a fundamental level, getting 0s when reading missing or out-of-bounds values is an incredible red flag for security-focused design. Incidentally I can't even find a concrete source that says how calldataload behaves when an index is out of bounds. It's not mentioned anywhere in the documentation I've found (on the ethereum website or elsewhere).
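A rough Python sketch of why zero-filled out-of-bounds reads are dangerous, assuming the behaviour the linked attack appears to depend on: a 32-byte read past the end of the call data is padded with zeros rather than rejected. The functions `calldataload`, `encode_transfer` and `decode_transfer` are hypothetical stand-ins, and the encoder is deliberately naive (it never checks the address length).

```python
SELECTOR = bytes.fromhex("a9059cbb")  # 4-byte id for transfer(address,uint256)


def calldataload(calldata: bytes, offset: int) -> int:
    """Read a 32-byte word; bytes past the end are assumed to read as zero."""
    word = calldata[offset:offset + 32].ljust(32, b"\x00")
    return int.from_bytes(word, "big")


def encode_transfer(address: bytes, amount: int) -> bytes:
    """Naive client-side encoder: pads as if the address were 20 bytes long."""
    return SELECTOR + b"\x00" * 12 + address + amount.to_bytes(32, "big")


def decode_transfer(calldata: bytes):
    """What the callee sees: two 32-byte words after the selector."""
    to = calldataload(calldata, 4) & ((1 << 160) - 1)
    amount = calldataload(calldata, 36)
    return to.to_bytes(20, "big"), amount


# An address ending in 0x00, submitted with the last byte missing.
full_address = bytes.fromhex("11" * 19 + "00")
short_address = full_address[:-1]

to, amount = decode_transfer(encode_transfer(short_address, 100))
print(to == full_address, amount)
```

The missing address byte is borrowed from the leading zero of the amount, and the amount word is topped up with a zero on the right, so the callee sees the intended recipient but 256 times the intended amount. A VM that trapped on the short read would have refused the call instead.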

UncleMeat · on Oct 1, 2017

Exception semantics are unclear, requiring explicit checks only in some cases depending on how a function is called.
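One way to picture the problem, as a Python sketch rather than real EVM semantics: the same failing recipient either aborts the caller or merely returns `False`, depending on the calling style. All names here (`Revert`, `Recipient`, `direct_call`, `low_level_send`, `payout_unchecked`, `payout_checked`) are hypothetical.

```python
class Revert(Exception):
    """Stands in for a failed call inside the callee."""


class Recipient:
    def __init__(self, accepts: bool):
        self.accepts = accepts
        self.received = 0

    def receive(self, amount):
        if not self.accepts:
            raise Revert("recipient rejected payment")
        self.received += amount


def direct_call(recipient, amount):
    """High-level style: a failure propagates and aborts the caller too."""
    recipient.receive(amount)


def low_level_send(recipient, amount) -> bool:
    """Low-level style: a failure is swallowed and reported as False."""
    try:
        recipient.receive(amount)
        return True
    except Revert:
        return False


def payout_unchecked(ledger, name, recipient):
    amount = ledger[name]
    low_level_send(recipient, amount)   # return value ignored
    ledger[name] = 0                    # debt cleared even if nothing was paid


def payout_checked(ledger, name, recipient):
    amount = ledger[name]
    if low_level_send(recipient, amount):
        ledger[name] = 0                # only clear the debt on success
```

The unchecked version silently records a payment that never happened, which is the kind of mistake that is easy to make when only some call styles need an explicit check.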

kevingadd · on Sept 30, 2017

I'm not willing to "give them a break" if they ignored decades of existing academic research and production software to roll their own low-quality packages that actively cost users money through bugs and footguns. Ethereum is not breaking new ground all over the place to the extent that would justify overlooking quality issues, even if it's the first to do particular things.

I'm sure the Ethereum ecosystem as a whole contains some clever designs or even good engineering by the people who back it, but that doesn't excuse the fact that they overestimated their competence and their customers are paying for it. It's not "more complex and sophisticated than I'm giving it credit for": I'm giving it "credit" for being too complex and too sophisticated, creating a huge attack surface for security vulnerabilities and lots of space for bugs to dwell because the developers insisted on NIHing things they had no reason to.

Complex and Sophisticated is not a good thing in mission-critical software. You want code that is as simple as possible, does the job correctly, and is easy to understand and audit. Complex and Sophisticated is what you write when tackling an incredibly difficult problem or when you're trying to get a promotion.

You seem to be confusing the (no personal opinion on this statement) "fantastic and successful" nature of the Ethereum platform with its quality. There's no intrinsic correspondence between success and quality. JavaScript is wildly successful and the experience of using it in 2017 is (thanks to modern tooling and browsers) pretty good, some people might even call it fantastic. But that doesn't erase the long history of defects in the language, defects that people continue to pay for. People invested sweat and blood to turn it into a successful thing.

Likewise, sweat and blood is being invested to compensate for Ethereum's numerous deficiencies. Some of those investments were made by the Ethereum developers - props to them - but a lot of that cost is being paid by end users, security researchers, and investors.

Tony Hoare calls nulls his 'billion dollar mistake'. How much do you think the cost of Ethereum's numerous design and engineering errors will add up to when all is said and done?

zeroxfe · on Sept 30, 2017

Find me a system that does not "actively cost users money through bugs and footguns", and I'll find you something that isn't used.

> Complex and Sophisticated is not a good thing in mission-critical software.

The complexity of Ethereum comes from the domain. You can hold on to the platitudes.

> Complex and Sophisticated is what you write when tackling an incredibly difficult problem or when you're trying to get a promotion.

If you don't believe that cryptocurrencies are an "incredibly difficult problem", then, again, you're showing your ignorance. As someone who worked on distributed consensus systems (Paxos) for nearly a decade, I would recommend you pick up a textbook before passing judgement on the nature of this problem.

> You seem to be confusing the (no personal opinion on this statement) "fantastic and successful" nature of the Ethereum platform with its quality.

Thanks, however, I'm quite familiar with the orthogonal nature of success and quality -- it also turns out that just because it's successful doesn't automatically make it low quality.

And really... chill out.

wruza · on Oct 1, 2017

So we should believe that a new, dubious language with its own runtime and one-off custom scripts magically becomes high-quality because the environment is complex and someone spent time reading a book in that area? Despite the fact that users have suffered security bugs in carefully managed, world-class products for decades, and still do? Okay.

>chill out

and give me your money, right. Yet another soapy bubble.

zodiac · on Sept 30, 2017

> Note that some of these bugs are in an optimizer, something a security-focused package handling money shouldn't even need

Contract callers pay gas fees to execute the contract so inefficient contracts literally waste transaction fees - if anything optimizers are even more important here.

> Any claims that performance justifies the use of a home-grown language are also absurd: High-performance trading firms like Jane Street make use of verifiable functional programming languages like OCaml despite the importance of speed & latency in order to turn a profit

"Performance" in the traditional PL sense (eg "OCaml has high performance") also does not carry over directly. In traditional PL perf you'd care about cache, memory, IO, pipeline-friendliness, parallelism, all of which don't carry over to the EVM, where it's all about reducing bytecode instructions count (and hence transaction costs as well as blockchain size).

I agree that Solidity as a frontend language is pretty weird and not very security friendly, and that using a non-Solidity frontend is of course feasible, but simply using the whole toolchain of C or OCaml - not so much. These two languages in particular will even need some modification to match Solidity's Event language construct.

kevingadd · on Sept 30, 2017

You're accepting the assumption/abstraction made by Ethereum here, where "gas" is equivalent to performance, even though gas is an arbitrary measurement. You could easily build your own gas-focused optimizer for a toolchain like llvm or for a compiler like ocaml's.

Ethereum could bill contracts based on the actual cycle count they take to execute on the CPU, or some other metric. Then you wouldn't need to use a specialized bytecode designed for the gas system with its own optimizer. Of course, then people would potentially be tuning for the most common CPUs (or ASICs, or whatever) used by miners. But again, it's not necessary to roll your own solution.

Consider also whether gas makes any sense in the long run: Compute power continues to increase, and the cost of compute power continues to drop. If smart contracts are really going to be the foundation of a new economy and people are going to build applications on top of them, does gas make that much sense vs billing for actual compute and network resources like AWS or EC2? Is gas even a meaningful abstraction? I'd argue it isn't. If the 'value' of executing contracts in $/gas exceeds the cost of renting AWS/EC2 nodes you've basically created a market for mining arbitrage, and perverse incentives where those miners want to push the prices in one direction.

Event-oriented programming is old hat in every modern programming language. C/C++ have plenty of frameworks out there you can use to get at it, so do other languages. In some environments it's the de-facto way to write code.

Alternate frontends for the EVM potentially address some issues - which is great - but it doesn't get around the fact that EVM itself is poorly designed, and this is reflected in the system all the way from top to bottom. Some of the errata in the post I linked earlier were ways to exploit design flaws in both the VM and compiler to manipulate contracts, and instead of fixing that in the VM they make you put checking prologues in your functions :/

DennisP · on Sept 30, 2017

> Ethereum could bill contracts based on the actual cycle count they take to execute on the CPU

No, you can't, because (1) you need consensus on the results, so you can't just have everybody measure CPU cycles on their own machines, and (2) you also can't trust miners to correctly report cycles on their own machines.

> cost of compute power continues to drop

Which is fine because gas is an abstraction. The price per unit of gas is market based.

kevingadd · on Sept 30, 2017

You can absolutely use a consensus model with cycle counting. The cycle count for x86 instructions (ignoring unpredictable factors like cache misses) is well-defined on all the modern architectures; you can find tables of the latencies and everything. There's no reason you couldn't do this instead of gas (though I don't know if I would argue that you should).

DennisP · on Sept 30, 2017

If you are ignoring cache misses, and using a particular CPU architecture instead of whatever is actually running, then you're back to using gas. You still have a fixed list of costs for different operations, it's just a different list.

But CPU cycles aren't actually a sufficient cost model, since you also have to pay for storage.
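A small Python sketch of the "fixed list of costs" idea, under stated assumptions: the opcode names and the numbers in `GAS_COST` are chosen for illustration, and `run`, `fee` and `OutOfGas` are hypothetical. The point is only that metering from a shared table gives every node the same answer regardless of its hardware, that storage can be priced far above arithmetic, and that the price per unit of gas is a separate, market-set number.

```python
# Illustrative cost table; storage is priced far above arithmetic.
GAS_COST = {"PUSH": 3, "ADD": 3, "MUL": 5, "SSTORE": 20000}


class OutOfGas(Exception):
    pass


def run(program, gas_limit):
    """Execute a tiny stack program, charging gas per operation."""
    stack, storage, gas_used = [], {}, 0
    for op, *args in program:
        gas_used += GAS_COST[op]
        if gas_used > gas_limit:
            raise OutOfGas(f"needed more than {gas_limit} gas")
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "SSTORE":
            storage[args[0]] = stack.pop()   # write top of stack to a named slot
    return storage, gas_used


def fee(gas_used, gas_price):
    """Gas is a unit count; the price per unit is set by the market."""
    return gas_used * gas_price


PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",),
           ("SSTORE", "result")]

storage, gas_used = run(PROGRAM, gas_limit=30000)
print(storage, gas_used, fee(gas_used, gas_price=2))
```

Swapping the table for x86 latency figures would change the numbers but not the structure, which is the sense in which cycle counting "is back to using gas".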

kevingadd · on Sept 30, 2017

The original argument (not yours, to be fair) was that EVM/solidity + an optimizer are necessary because contracts are billed for gas. I was attempting to illustrate that you can build a platform around a more representative billing system (you can pick one!) such that existing, world-class languages can be used instead. This pays security dividends and reduces the cost of your system (for everyone, not just you).

Whether or not gas is a useful solution as-is isn't a particularly important question to me, and I don't see any reason why you'd want to replace it now. It's just a poor excuse for Solidity/EVM.

DennisP · on Sept 30, 2017

Sure, but I'm not aware of any language with an optimizer that minimizes permanent storage (for example).

The gas model has been forced to be a close representation of actual costs, because last year it wasn't close enough, and someone took advantage of that to launch denial of service attacks.

Incidentally, last year someone wrote an Idris backend compiling to EVM for their doctoral thesis. They concluded that it had some benefits but not as much as they expected. [1]

There's an effort in progress to migrate from the existing EVM to a modified webassembly [2], which would allow usage of existing languages, and potentially improve performance significantly. It's still experimental and may or may not work out, but seems to be making good progress. [3]

With the existing EVM there are new languages in development that may improve matters, including Viper [4] (by Vitalik) and Bamboo [5] (by someone the Ethereum Foundation employs to work on formal proofs).

[1] https://publications.lib.chalmers.se/records/fulltext/234939...

[2] https://github.com/ewasm/evm2wasm

[3] https://blog.ethereum.org/2017/08/23/roundup-5/

[4] https://github.com/ethereum/viper

[5] https://github.com/pirapira/bamboo

zodiac · on Sept 30, 2017

> If the 'value' of executing contracts in $/gas exceeds the cost of renting AWS/EC2 nodes you're basically created a market for mining arbitrage, and perverse incentives where those miners want to push the prices in one direction.

I think you misunderstand what gas is for. The cost (in terms of transaction fees) to execute something on the EVM is already many many orders of magnitude higher than what it would cost for a single person to run the computation by renting some CPU time from AWS.

Think about it in the case of Bitcoin. Simplifying a bit, most Bitcoin transactions just consist of subtracting a number from one account balance and adding it to another account balance. That's an incredibly cheap operation by any metric, yet Bitcoin burns through a million dollars of electricity every day.

The reason it costs so much to run code on the EVM is that most of the "cost" goes toward ensuring consensus (or, from the contracting parties' point of view, immutability). Transaction costs and block rewards currently go to miners (in the current PoW world anyway; in PoS this explanation will be slightly different), and miners don't spend most of their CPU cycles running the computation. They spend most of their CPU cycles trying out a hash function (Keccak-256) as directed by Ethereum's PoW algorithm (Ethash), and if they're lucky with the values they hash, they mine the block. This CPU time only serves to reach consensus, and does not do any useful work for "just running" the EVM code itself.
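A toy proof-of-work loop in Python shows where the cycles go. This is a sketch under loud assumptions: it uses plain SHA-256 from the standard library as a stand-in, not Ethash or Keccak-256, and `mine`, `verify` and `difficulty_bits` are hypothetical names.

```python
import hashlib


def pow_hash(block_data: bytes, nonce: int) -> int:
    """Hash the block contents together with a candidate nonce."""
    digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def mine(block_data: bytes, difficulty_bits: int, max_tries: int = 1_000_000):
    """Try nonces until the hash falls below the target, or give up."""
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_tries):
        if pow_hash(block_data, nonce) < target:
            return nonce
    return None


def verify(block_data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a solution takes one hash, however long the search took."""
    return pow_hash(block_data, nonce) < (1 << (256 - difficulty_bits))


nonce = mine(b"block with some transactions", difficulty_bits=12)
print(nonce, verify(b"block with some transactions", nonce, 12))
```

Each extra bit of difficulty doubles the expected number of hashes, and none of that work executes a single contract instruction.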

Another thing that proves that gas does not really correspond to CPU cost directly is that they recently extended the EVM with new primitives (specifically, "elliptic curve multiplication, addition and pairing") as part of a research effort collaborating with zcash (which uses those primitives in their zero-knowledge-proof protocol for allowing private transactions). Of course the EVM could do it before this change (since it's Turing-complete), so if gas = cpu time, why bother adding this primitive?

As to what purpose gas serves, the best source is probably http://vitalik.ca/general/2017/09/14/prehistory.html, but as I understand it, it's to prevent DoS attacks and to limit the size of the blockchain (since every contract call ever made, its source code, and hence every EVM instruction executed, is part of the immutable ledger).

> Event-oriented programming is old hat in every modern programming language. C/C++ have plenty of frameworks out there you can use to get at it, so do other languages. In some environments it's the de-facto way to write code.

Sure, but it would have to be a core part of the language, it can't be a framework or library. For instance in this contract https://theethereum.wiki/w/index.php/ERC20_Token_Standard#Sa... there's a transfer "function" that addresses call to transfer tokens around. It doesn't run in the context of a "main" function (in the C sense), it just runs and updates some state. So there's some different semantics here (not big differences, just enough that I would call it necessarily a different language). Similarly an event-oriented programming framework for C would probably include a runtime scheduler (probably written in C or assembly), but the EVM and the block-mining framework is the scheduler in EVM's case, it makes no sense to have a runtime scheduler run as "bytecode" or "CPU instructions".
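Roughly what that looks like, as a Python sketch rather than the linked Solidity: there is no `main`, only entry points that the platform invokes and a log of events the contract emits. `Token`, `transfer` and `apply_transactions` are hypothetical names; `apply_transactions` stands in for the chain itself, which decides what gets called and in what order.

```python
class Token:
    """Toy token ledger: state plus entry points, no main()."""

    def __init__(self, supply, owner):
        self.balances = {owner: supply}
        self.events = []   # append-only log, loosely like emitted events

    def transfer(self, sender, to, value):
        """Move tokens if the sender can afford it; record a Transfer event."""
        if value <= 0 or self.balances.get(sender, 0) < value:
            return False
        self.balances[sender] -= value
        self.balances[to] = self.balances.get(to, 0) + value
        self.events.append(("Transfer", sender, to, value))
        return True


def apply_transactions(token, txs):
    """The platform, not the contract, drives each call in order."""
    return [token.transfer(sender, to, value) for sender, to, value in txs]


token = Token(100, "alice")
results = apply_transactions(token, [("alice", "bob", 30), ("bob", "carol", 50)])
print(results, token.balances, token.events)
```

In a general-purpose language the scheduling loop would be part of your program; here it lives outside the contract entirely.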

fsiefken · on Sept 30, 2017

As an alternative you could write your smart contracts with NEO in C#, F#, Java or Kotlin http://docs.neo.org/en-us/sc/introduction.html

zmonx · on Sept 30, 2017

int_19h has previously posted a nice overview of some of Solidity's flaws:

https://news.ycombinator.com/item?id=14810008

Such issues make the language extremely error-prone.

A possible remedy was also suggested:

https://news.ycombinator.com/item?id=14809743

Prolog sounds like a great tool for encoding rules, and there has already been some research for encoding legal texts in Prolog.

wruza · on Sept 30, 2017

Imagine if Windows 95 stored your payment data and was bridge-connected to the internet. This will give you an idea of how I look at these smart contracts.

Is there any reason to think of it another way?

DennisP · on Sept 30, 2017

Yes: smart contracts are extremely short programs, usually doing very simple things. A thousand lines is a fairly large contract. It's feasible to spend a lot of time refactoring to make them as simple and clear as possible, to unit test extensively, and then pay several expert parties to review them in detail.

We're not quite at the point of being able to do formal proofs of their properties, but that's on the way, and also relatively practical given how short the contracts are.

We can see whether contract authors have done these things because unlike Win95, smart contract source code is always published, since everyone knows a closed-source contract could trivially steal their money.

dogma1138 · on Sept 30, 2017

Smart contracts are not that short, and if you look at entries for the Underhanded Solidity Coding Contest you'll discover just how easy it is to introduce a vulnerability due to the nuances of Solidity and the runtime.

The virtual machine is also not even remotely proven yet.

There is a lot of risk.

DennisP · on Sept 30, 2017

I write and audit Ethereum smart contracts for a living. Most of the projects I work on aren't much bigger than a thousand lines, and many are less. Some especially large projects are several thousand lines, still many orders of magnitude smaller than Win95.

magnus1 · on Sept 30, 2017

[flagged]

richardknop · on Sept 30, 2017

Is this a joke? An ICO which pretends to do security auditing for other ICOs (why do you need your own cryptocurrency for that other than as a quick cash grab)?

jon_richards · on Sept 30, 2017

Without reading, maybe they put up some sort of stake that is forfeit if the other ICO gets hacked?