Very interesting! I can see how useful it would be, though I’m hoping the adversarial case, where an AI learns how a specific individual types and just dumps its output in there, doesn’t become a thing.
It's a tool to help teachers detect student assignments that have been written by AI. Unlike other solutions out there, it's an entire web-based text editor that analyses not just the final assignment, but all the keystrokes used during the writing process.
My theory is that analysing only the final text is a futile struggle - billions are being pumped into making LLM text look more human, so trying to make an assessment off final text alone is guesswork at best.
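To make "analysing the keystrokes" a bit more concrete, here is a rough sketch of the kind of summary an editor could compute from its event log. This is purely illustrative and not how the tool actually works: the `(timestamp_ms, kind, n_chars)` event shape, the `keystroke_summary` name, and the three features are all assumptions made up for the example.

```python
from statistics import median


def keystroke_summary(events):
    """Summarise a writing session.

    `events` is a hypothetical log: a list of (timestamp_ms, kind, n_chars)
    tuples, where kind is "key", "delete", or "paste".
    """
    typed = sum(n for _, kind, n in events if kind == "key")
    pasted = sum(n for _, kind, n in events if kind == "paste")
    deletes = sum(1 for _, kind, _ in events if kind == "delete")

    # Gaps between consecutive typed keys, in milliseconds.
    key_times = [t for t, kind, _ in events if kind == "key"]
    gaps = [b - a for a, b in zip(key_times, key_times[1:])]

    total = typed + pasted
    return {
        "paste_fraction": pasted / total if total else 0.0,
        "median_gap_ms": median(gaps) if gaps else None,
        "delete_events": deletes,
    }
```

A high `paste_fraction` on its own only says the text arrived via the clipboard, not where it came from, so a summary like this would be one input to a judgement rather than a verdict.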
I'm curious what folks think! Especially teachers, devs, and anyone navigating this space...
I can't help but immediately think about a counteracting piece of software, which asks an LLM for variations of a paragraph, or a phrase, or a few synonyms, and types it the way a human would, with pauses, typos, navigation, rearranging pieces via copy-paste, etc.
Not that your software is going to be useless. But as long as there is an incentive to cheat, new and better tools that facilitate cheating will crop up. Something else should change.
Yeah it's a good call out. I think it's a (more) winnable battle though.
For both a keystroke-based AI detector and software designed to mimic human keystroke patterns, performance will be determined by the size of the dataset they have of genuine human keystroke patterns. The detector has an inherent leg-up here, because it's constantly collecting more data through use of the tool, whereas the mimic software doesn't have any built-in loop to collect those inputs.
Interesting idea! Could someone use the software to train an LLM prompt that will get around it? By learning what passes and what doesn’t and then having the LLM train on that
Yeah this is something I'm a little worried about - right now it's not extremely difficult to just take an AI generated essay and then just tweak the essay until it passes.
My first-pass approximation is to make the assessment of whether the essay is AI generated or not accessible only to teachers. I may also need to rate-limit the checks, so people can't brute-force it to gather data on what passes.
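For what it's worth, the rate-limiting part could be as simple as a sliding-window counter per account. A minimal sketch, assuming an in-memory store and made-up names (`CheckRateLimiter`, `max_checks`, `window_seconds`); a real deployment would presumably keep this state server-side in something shared:

```python
import time
from collections import defaultdict, deque


class CheckRateLimiter:
    """Allow at most `max_checks` detector runs per user per time window."""

    def __init__(self, max_checks=5, window_seconds=3600.0, clock=time.monotonic):
        self.max_checks = max_checks
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._history = defaultdict(deque)  # user_id -> timestamps of recent checks

    def allow(self, user_id):
        now = self.clock()
        recent = self._history[user_id]
        # Forget checks that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_checks:
            return False
        recent.append(now)
        return True
```

Calling `limiter.allow(teacher_id)` before each check and refusing when it returns `False` caps how quickly anyone can probe for what passes.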
I got burned by software like this when I pasted in an essay I had dictated while driving and transcribed through Whisper, and the software thought I had pasted AI content lol
Please don’t delete this thread. Yes it’s getting pretty heated, but it’s by far the most rational discussion of this topic I’ve seen in a while.
Plus I’ve learnt a few things, which tends to be a positive signal for quality
I suspect a lot of professionals that need decent compute find desktop PCs to sit in something of a no-man’s-land these days.
Personally I find my MacBook Pro plenty powerful for most day to day activities, including testing out ideas. The kicker is that when I need real compute, a desktop that’s maybe 3-5x faster pales in comparison to jumping onto a hosted GPU cluster. It also ends up cheaper because I’m not paying for a chunky GPU that’s sitting idle on my desk half the time.
From limited conversation with people in other industries, their experience is similar.
If you need to run mostly Windows-only software, a MacBook probably isn’t a good choice.
You have to look at Windows laptops and it’s pretty grim. Plasticky machines with some combination of garbage keyboards, shitty displays, terrible trackpads, short battery life, and loud fans (my last 3 machines have been ThinkPads) that are worth almost nothing when you want to upgrade after a couple years.
Haha yeah I was about to comment that I recall a period just after Word2Vec came out where embeddings were most definitely not underrated but rather the most hyped ML thing out there!
I like it! Frankly, it should be something that hiring companies do anyways.
Perhaps extend it to something like “everyone who applies, or responds to your comment”
Practically speaking, enforcement will be difficult/impossible for actions off HN - if someone claims a company isn’t responding, or using a templated email, how would you verify that?
By enforcing the same rule for a job thread, there’s a very clear location where the behavior of the company can be observed.
I think you’re getting downvoted because “yeah yeah yeah” is normally a sign that someone is sarcastically dismissing an idea, but the rest of your comment suggests you’re not at all - linerp is a great idea!
Are you offering shares in your venture? I’d be mildly interested in investing on the expectation that I’d see returns that are slightly worse than inflation.
Haven’t really told anyone about it so no one is using it, but I think it’s a good idea!