My experience with copilot has been very different. It easily pays for itself, a...

lolinder · on Nov 20, 2022

> Just the time saved writing tests alone pays for it.

This, so much. My code since using Copilot is easily ten times better tested than it was before, and I wasn't especially lazy when it comes to testing.

Given 1-2 hand-written unit tests, Copilot can start filling in test bodies that correctly test what's described in the function name. When I can't think of any more edge cases, I'll go prompt it with one more @Test annotation (or equivalent in another language) and it will frequently come up with edge cases that I didn't even think of and write a test that tests that edge case.
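To make that workflow concrete, here is a minimal sketch in Python's `unittest` rather than JUnit. Everything in it is an assumption for illustration: `parse_price` is a hypothetical function, and the last two tests are only the kind of thing a completion tool might propose once the first two exist, not actual Copilot output.

```python
import unittest


def parse_price(text: str) -> int:
    """Hypothetical function under test: parse a price like "$1,234.50" into cents."""
    cleaned = text.strip().lstrip("$").replace(",", "")
    if not cleaned:
        raise ValueError("empty price")
    dollars, _, cents = cleaned.partition(".")
    # Pad or truncate the fractional part to exactly two digits.
    return int(dollars or "0") * 100 + int(cents.ljust(2, "0")[:2])


class ParsePriceTest(unittest.TestCase):
    # Written by hand: these two establish the naming and assertion pattern.
    def test_parses_dollars_and_cents(self):
        self.assertEqual(parse_price("$1,234.50"), 123450)

    def test_parses_whole_dollars(self):
        self.assertEqual(parse_price("5"), 500)

    # Edge cases of the sort a tool might suggest from the method name alone
    # (illustrative, not real tool output).
    def test_rejects_empty_string(self):
        with self.assertRaises(ValueError):
            parse_price("")

    def test_pads_single_digit_cents(self):
        self.assertEqual(parse_price("$0.5"), 50)
```

Saved as a module, this runs with `python -m unittest`. The descriptive method names are what give the tool something to work from, so they are worth writing carefully.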

(One great part about this use case for those who are a little antsy about the copyright question is that you can be pretty darn confident that you're not running a risk of accidental copyright violation. I write the actual business logic by hand, which means copilot is generating tests that only interact with an API that I wrote.)

madsbuch · on Nov 21, 2022

APIs are not copyrightable (Oracle vs Google). However, the code that interacts with an API might be.

Regardless, it is interesting to think about what domains are easier to generate effective models for. I would expect it to be easier to generate a supervised model <test description> => <test code>. My intuition is also that it is easier to generate React component code and harder to generate feature code.

qw3rty01 · on Nov 22, 2022

Unfortunately that ruling was overturned and APIs were found to be copyrightable: https://en.wikipedia.org/wiki/Google_LLC_v._Oracle_America,_...

UncleEntity · on Nov 22, 2022

Which was later overturned by the Supreme Court if you read farther down the article you posted the link for.

And was all over the tech news at the time.

I kind of suspect you are intentionally trolling…

qw3rty01 · on Nov 23, 2022

Incorrect, the case was appealed to the supreme court and the appeal was denied, so the lower court ruling held.

What was ruled by the supreme court was that Google's usage of the API (which had already been determined to be copyrighted) fell under fair use in copyright law.

dragonwriter · on Nov 23, 2022

> Incorrect, the case was appealed to the supreme court and the appeal was denied, so the lower court ruling held.

Kind of; the appeal denied was an interlocutory appeal (an appeal before final judgement), so the lower court ruling was left in place until final resolution of the case potentially to be settled in any final appeal.

However, while copyrightability of APIs was raised on the final appeal, the Supreme Court sidestepped it, ruling that because Google’s use was fair use even if the API was copyrightable, it was unnecessary to decide the copyrightability question at all. So the Federal Circuit decision remains in place on copyrightability.

On the gripping hand, though, that decision really doesn't matter much because Federal Circuit decisions on issues outside of those it has unique appellate responsibility aren’t binding precedent on trial courts (it is supposed to apply the case law of the geographic circuit that would otherwise apply, but its rulings don’t have the precedential effect that rulings of that circuit would have.)

So, basically, as far as appellate case law, Oracle v. Google provides no binding precedent on copyrightability of APIs, but precedent that at least one pattern of API copying is fair use if APIs are copyrightable.

Which isn’t encouraging for anyone looking to protect an API with copyrights.

matkoniecz · on Nov 21, 2022

> which means copilot is generating tests that only interact with an API that I wrote

It bases these generated test cases on other similar test cases in other software, including GPL-licensed software.

UncleEntity · on Nov 22, 2022

Which doesn’t matter unless you are distributing said test cases.

omnicognate · on Nov 21, 2022

> If my company didn't pay for it, I would

Testimonials of this form are near worthless to a company. Maybe it's true for you. Statistically, it's highly likely to be misleading.

People overestimate their willingness to pay for something for a number of reasons, but one of the biggest is that they incorrectly visualise what the choice to pay or not looks like. They often imagine a moment of abstract choice after which everything remains exactly the same but some small amount of money magically vanishes from their bank account. In reality, paying for something is a tedious inconvenience, and not paying for it more often takes the form of never getting round to putting your card details in than consciously deciding "this isn't worth it".

It can be taken to questionable extremes, but there's truth in the idea that the only real evidence as to what customers will do is what they actually do, not what they say they will do. I don't know if their interpretation is correct, but it sounds like Kite at least has evidence of the former sort.

slashdev · on Nov 22, 2022

It's my personal anecdote, I make no claims that other people feel the same way. If they were logical, they should though. Paying $10 to get several hundred dollars in productivity gains is a no-brainer. It's the same reason I pay for a second monitor, a properly comfortable chair and desk, a top end computer, and JetBrains. At software engineer salaries, even small productivity improvements pay off handsomely.

mrtranscendence · on Nov 22, 2022

I don't know about software engineering more generally, but I found it worse than useless for my work in data science (machine learning and ETL pipelines), spitting out code that was so wrong that it couldn't be salvaged. I suspect there's a wide variance in the degree to which Copilot will indeed pay for itself.

jascination · on Nov 20, 2022

Out of interest, how are you using it to write tests? Do you just write "make a test for functionX" or something?

(Don't have much experience with it)

premun · on Nov 20, 2022

It is amazing for typing out mock data. Say you're testing parsing of XML - it can easily suggest the assertions over the data parsed from the XML. Example test that was 95% coming out of Copilot: https://github.com/dotnet/arcade-services/blob/61babf31dc63c...

It also predicts comments and logging messages amazingly well (you type "logger." and 7/10 times you get what you want, sometimes even better), incorporating variables from the surrounding context. This speeds up the tedious parts of programming when you are finalizing the code (adding docs + tracing).

Honestly, Copilot saves me so much time every week while turning chores into a really fun time.

trip-zip · on Nov 20, 2022

I honestly thought I'd never use copilot, but when I need to write something to interface with XML via a SOAP API, boy copilot is my best friend...

JustLurking2022 · on Nov 20, 2022

That code is wretched... Why have serializer logic embedded in a data object, especially when .NET provides generic discrete serializers?

jackcviers3 · on Nov 21, 2022

Yeah, tabnine kills it at the data entry parts of test dev as well for sure.

simonw · on Nov 20, 2022

I wrote up some notes on a recent experience I had writing tests with Copilot here: https://til.simonwillison.net/gpt3/writing-test-with-copilot

Once you get the hang of how to prompt it (mainly through clever use of comments) it can be a HUGE time saver.

satvikpendem · on Nov 20, 2022

Yes, if you show an example, or even have the test file open, it will make the other tests for you.

dboreham · on Nov 20, 2022

I wonder if this says something about the nature of test code?

jacurtis · on Nov 21, 2022

Tests are always mostly boilerplate and rarely include anything crafty.

95% of tests are: instantiating a class, running a method, and then asserting on the result. Tests are not, or should not be, crafty creative code snippets. They are boring functional code blocks by design and most are very similar, only changing out inputs and assertions between tests.
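A small sketch of that instantiate / run / assert shape, using Python's `unittest`. `ShoppingCart` is a hypothetical class invented for the example; the point is only that the two tests differ in inputs and expected values and nothing else.

```python
import unittest


class ShoppingCart:
    """Hypothetical class under test."""

    def __init__(self):
        self._items = []

    def add(self, name: str, price_cents: int, quantity: int = 1) -> None:
        self._items.append((name, price_cents, quantity))

    def total(self) -> int:
        return sum(price * qty for _, price, qty in self._items)


class ShoppingCartTest(unittest.TestCase):
    def test_total_of_empty_cart_is_zero(self):
        cart = ShoppingCart()        # instantiate
        result = cart.total()        # run a method
        self.assertEqual(result, 0)  # assert on the result

    def test_total_multiplies_price_by_quantity(self):
        cart = ShoppingCart()
        cart.add("pen", 150, quantity=3)
        result = cart.total()
        self.assertEqual(result, 450)
```

Each test repeats its own setup instead of sharing a fixture, so it can be read top to bottom with no other context.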

paledot · on Nov 21, 2022

I'd go so far as to say if your test is doing something crafty, you're doing tests wrong. Maybe in a mock or fixture, but that's a write-once sort of affair.

I also don't apply DRY (don't repeat yourself) to tests. Tests should be independently readable beginning to end, no context needed. After all, the true value of a unit test is to take a block of code too complicated to easily fit in your mind, and break it down into a series of examples simple enough to fit.

djbusby · on Nov 20, 2022

Sure, it's been loads of boilerplate since forever.

throwawaysleep · on Nov 20, 2022

Tests often have tons and tons of boilerplate

mattwad · on Nov 20, 2022

The best part about it for me is just the Intellisense (in Typescript). I'm using it on probably 3/5 lines that I write as a smarter version, but I rarely use it to do more than finish the current line I am writing.

jrsj · on Nov 20, 2022

Kite has been around for a lot longer; if anything, Copilot was GitHub copying them

esperent · on Nov 20, 2022

I don't think it's reasonable to say either was copying. AI assisted tooling is obvious and people have been waiting decades for the tech to reach a point where they can build these tools. Kite tried to get in early - too early probably - but even if they were the very first they didn't invent the idea.

forgotpwd16 · on Nov 21, 2022

Interestingly, checking previous submissions going back to 2016, the project had the subtitle "programming copilot."

bastardoperator · on Nov 21, 2022

^ This, I may not use copilot as much on production code, but the testing code it produces makes it easily worth it from a time saved and coverage perspective.

vjust · on Nov 21, 2022

I like Copilot and paid for it out of pocket. I think it's worth it. It's sometimes like having a smart programmer pairing with you.

morelisp · on Nov 20, 2022

How are you validating the quality of its tests? Are you trying any mutations, checking branch coverage, etc.?
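For anyone unfamiliar with the mutation idea, here is a hand-rolled sketch in Python of what it checks. All names are hypothetical, and dedicated mutation-testing tools generate the mutants automatically; this just shows a single mutant written by hand.

```python
def is_adult(age: int) -> bool:
    """Hypothetical function under test."""
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    """Hand-made mutant: `>=` changed to `>`."""
    return age > 18


def weak_suite(fn) -> bool:
    """Checks only inputs far from the boundary."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn) -> bool:
    """Adds the boundary cases, which is where the mutant differs."""
    return weak_suite(fn) and fn(18) is True and fn(17) is False


def kills_mutant(suite) -> bool:
    """A suite kills the mutant if it passes on the original and fails on the mutant."""
    return suite(is_adult) and not suite(is_adult_mutant)
```

Here `kills_mutant(weak_suite)` is `False` and `kills_mutant(strong_suite)` is `True`. Both suites execute every line of `is_adult`, so line coverage alone would not tell them apart.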

simsla · on Nov 21, 2022

I'd assume you still read the generated code, as if you're reviewing a PR

bachmitre · on Nov 21, 2022

I second that