Despite his claim in the article, Bryan Caplan is absolutely not known for giving tough exams that test for deep understanding. He's known for the opposite.
The fact that GPT-4 can pass his tests says much more about the tests than about the capabilities of GPT-4. Indeed, GPT-4 failed Steven E. Landsburg's exam with a miserable score of 4/90, and the remarkable performance on Caplan's exam fails to replicate on random economics exams from random universities.
(NB this post does not mean that ChatGPT cannot improve, or that it's not impressive as is. It serves to put Caplan's self-aggrandizement in context.)
Educators are not known for having their tests tested. If anything, tests tend to have logical errors, misspellings, and "guess what the teacher was thinking" kinds of questions.
But like anything else, this current state is not set in stone. Educators just have not had the feedback or incentives. For subjects and situations where the quality of learning is actually measured, it is the teaching that matters, not the tests. You can have good education and terrible tests, and students will still come out with a good education. When quality is not measured, the tests are equally meaningless. (Quality in this context means what the student can do after graduating without failing or creating accidents).
I expect that in the future there won't be a reason for teachers not to give out the answers at the same time as the homework. That in turn will force a small change in how lessons and studying work. Those educators who are already doing this won't need to change much in response to ChatGPT.
> Quality in this context means what the student can do after graduating without failing or creating accidents
That's a problematic quality criterion, since many academic disciplines (including most of the humanities) aren't skill training programs, so there is essentially nothing a graduate "can do" that sets them apart from the rest of the population.
What a student should be able to do after finishing an academic course is to participate and continue with other academic courses that depend on that course. If there is nothing that requires that course, and nothing that uses anything learned from it, then the grades for that course have no possible meaning. It is noise in the forest that no one hears.
What this also means is that the quality of teaching gets measured much later in the process. For example, let's say a student studies basic math and then applies for an accounting class. We will only know the quality of the basic math education once the student has shown they do not fail the accounting class because of a poor understanding of basic math. The accounting educator might over time find out that particular schools or educators allow students to pass without actually being able to do basic math. At that point we know that the quality is low.
We can see a similar failure in the humanities. A person studying a language in order to work as a translator will fail at their job if they don't understand the language they are translating. It is poor quality when an educator passes students as translators when they can't translate. It can also be very unsafe (translators in hospitals or legal settings), so high quality in the humanities can be just as important as in other disciplines.
We can imagine more difficult topics like philosophy, but from my own experience, those topics are more of an experience-based education than test-based learning. ChatGPT can't do in-person group discussions.
> The fact that GPT-4 can pass his tests says much more about the tests than about the capabilities of GPT-4.
"Deep Blue beating Kasparov shows that ultimately, chess is just brute force calculation..."
"Stable Diffusion 'art' being indistinguishable from human art to most people shows how little most people know about art..."
"ChatGPT fooling people into thinking they're talking to another human shows that those people weren't paying attention, so the Turing test is actually useless for determining..."
"I mean, I know that everyone used to say that doing X means being intelligent, but now that software has done X, it becomes clear that..."
What is the real criterion for a machine to be considered intelligent? What does it have to do for people to not retroactively downplay it by invoking mechanistic pseudo-explanations?
Err, did you miss the part where I wrote that the post is not meant to imply that ChatGPT is not impressive?!
Deep Blue beating Kasparov in chess showcased Deep Blue's capabilities; Deep Blue beating Bryan Caplan would not have showcased these capabilities, despite the capabilities being there.
The analogous hypothetical situation would be a Not-so-deep Blue that fails to beat not just Kasparov, but any random chess master with a nonzero Elo score, but manages to beat Bryan Caplan. After his defeat, Caplan would write an article claiming that he's "known to be very good at chess". This would definitely say more about Caplan's chess skill than the chess engine's capabilities. And none of this would imply that chess engines are in any way impossible or unlikely, or that the capabilities are not impressive. Indeed, a chess engine that beat Caplan would have been impressive in 1975, even though we were still very far from Deep Blue or Stockfish at that point.
I'm not sure if that analogy holds any more, given that neural nets scale with time and data. If that were happening with a neural-net chess engine, the difference between beating Caplan and beating Kasparov would be measured in maybe a couple of days. Google's Go AI only ever lost one game to Lee Sedol, then immediately moved a long way into superhuman territory with a bit more training.
The gap between novice human and expert human looks quite small to a computer. If it can outperform a novice, it isn't far off beating an expert.
You did write that, but you first wrote that GPT-4's performance says more about the task than about its capabilities.
Which is the exact same pattern that has been used to "deal with" advances in AI since forever.
And Caplan is a university professor, not a random person. He's not an amateur at giving tests. It's literally (part of) his job. So what happened here is not at all analogous to your chess examples.
I would say that the OP is entitled to share their opinion, which I tend to agree with. Also, I would question whether the job of a university professor is really to give tests, or rather to teach and conduct research.
Tests are just an annoyance (that is deemed necessary). Just because one passes university tests does not mean they necessarily understand the topic very deeply; I think that's pretty obvious.
That's actually an interesting point: university students can totally pass exams by "mimicking": not understanding the topic, but solving the exam problems in a way similar to the exercises they learned beforehand. I argue that the test is not perfect at discriminating those who understand from those who mimic impressively well.
> the test is not perfect at discriminating those who understand from those who mimic impressively well.
They discriminate all the time, just in the wrong direction. There are a lot of exams written in a way that counts on this mimicking, but which ends up making the test far more difficult even if you understand the topic. It’s the same problem with remote learning: it’s way easier to cheat. In the past 3 years (except the first semester when COVID broke out), remote exams were made way more difficult than before to mitigate this. I had tests which were clearly made with this cheating in mind. They were borderline impossible without cheating; the examiners counted on it. At that point, I have no idea why they still tried to enforce any such rules.
And still, I know many teachers who clearly don’t understand their topic completely, they just “mimic” it.
>And Caplan is a university professor, not a random person.
He wasn't compared to random people, he was compared to other professors also setting tests, and how GPT-4 fared on their tests. You're blatantly misrepresenting the point being made.
> What is the real criterion for a machine to be considered intelligent?
This is actually the core of the debate. The answer is that there is no real criterion because we do not understand intelligence. Or, put differently, the one-dimensional understanding of intelligence we typically entertain is a simplification.
This is very inconvenient so we just pretend intelligence was a well-defined concept. That in turn leads to all kinds of confusion and pointless arguments as you have pointed out.
While there is no definitive criterion for intelligence, there is a lot of intuition. For example, I can find a book to be very intelligently written, but at the same time the book itself is not intelligent. No matter how clever its answers to complicated questions, the book is still just an inanimate object created by a human. The definition of intelligence has to be stretched quite far in order to bypass our intuition on this.
We see similar situations when people anthropomorphize animals, or ask questions like whether plants have feelings. Biology research can look at neurons and DNA sequences and find all kinds of similarities to humans, but it still requires a bit of a jump when we start applying human philosophical concepts to non-humans.
I find it easier to look at it as metaphors. ChatGPT has intelligence similar to how the planet has an intelligence, or how the internet is like a brain with every connected device as its neurons.
IMO we first need to clean up our language and precisely define the word intelligence so we all know what we are even talking about.
I would imagine we actually do understand intelligence, but in ordinary language it is a placeholder for a number of different meanings and concepts.
Same problem with "consciousness", same problem with "sentience".
I think when we say we don't understand these words, what we really mean is that they can't be understood as they are used in ordinary language, because they can't objectively mean all the things people assign to them.
In the face of AI, why would we not have to further define and expand our language for communication purposes? That actually seems obvious if you think about it for 2 seconds.
"sentience" has an additional problem that science fiction authors conflate it with "sapience".
I suspect that some influential science fiction editor made a mistake early on and it stuck. Outwith science fiction, I've less commonly come across the conflation, and people generally use the dictionary meanings, more in accord with the Latin roots, of "sentient" meaning being capable of sensation/feeling and "sapient" meaning being capable of intelligent thought.
GPT-4 falls apart on many tasks that humans find trivial, such as planning.
Intelligence is multi-faceted, and quite frankly the average technologist's understanding of human cognition is quite poor. Intelligence is not just information retrieval.
I'm sure, because I specifically chose GPT-4 because of its intelligence. Otherwise it wouldn't be of use to me-- I already have a slew of tools and sources available, but it takes more than that to pick out just the right solution and present it.
It's also quite good at planning its replies, from what I see.
> What is the real criterion for a machine to be considered intelligent?
There is no need to come up with a perfect list of criteria for human-like intelligence. And maybe it is even impossible, as we can only use the rational part of the brain to understand the whole.
But when a human-like intelligent machine is here, we will all know it without a doubt. It will have the impact of the wheel, the control of fire, things like that. Nobody stood around to argue about how hot things need to get to catch fire, or when exactly a wheel becomes a wheel as opposed to a rolling tree trunk, when their neighbors were throwing torches on their straw roofs from war chariots.
Arguing about definitions, criteria, moving the goalposts, etc., is just overthinking. You will drop to your knees with eyes full of tears and mouth wide open when you experience true AI. That's how you will know.
"What is the real criterion for a machine to be considered intelligent?"
For me, AGI.
Intelligence is a very complex problem; we tended to measure it by extreme outliers that evolution did not optimize for, like abstract math. Machines are clearly better at everything involving calculation, a criterion we had reserved for "intelligence".
But "simple" things like understanding what a rock is, are still too hard for AI. They can reproduce and recombine some text snippets about rocks, but they do not understand, what a rock is. Or anything as far as I am aware. So they are clearly useful and do things, we described as intelligenz when humans are doing it. But human intelligence is still superior as it is a general intelligence.
Good luck defining AGI to any measurable degree, especially with the spectrum that it almost certainly is.
> But "simple" things like understanding what a rock is
Well, how does one verify that an entity understands what a rock is? The education system clearly has no idea how to test for understanding of literally anything. It all fundamentally boils down to testing for consciousness, which is impossible. A sufficiently advanced automaton is indistinguishable from a conscious being to any outside queries.
So this specific flaw might have been fixed by now, but the general shortcoming is that it has no understanding of the concept of an apple.
Or a rock. Or a human. It just "knows" that certain pixels arranged in a certain way get the label "apple".
"The education system clearly has no idea how to test for understanding of literally anything."
And that is very strong hyperbole. There are many, many flaws in the education system, but they do know how to test knowledge (in person). They just do not know how to scale it cheaply, hence the stupid multiple-choice tests and co.
What do you classify it as, a rabbit or a duck? Is the dress blue and black or white and gold? :) Also you do know some people literally fall to the floor and start convulsing if you show them some flashing lights right? Every system has its "glitch tokens".
I mean, is it really surprising that when you greatly simplify the principle that organic neurons work on, quantize it to the point where it's fast enough to use, and have it learn no model of the world besides looking at photos and being told what it's looking at, that it'd have weird corner cases? I really don't think so. If you want to compare fairly, you'd have to do so with a human that's never spoken to anyone, never seen anything but 2D pictures on a screen in front of them and hasn't been told or shown anything about the world they live in since they were born. And that would be with the advantage of having six million years of base model training beforehand.
> And that is very strong hyperbole.
Well sure, I guess it's mainly out of a lack of care and practicality. But even more elaborate tests, like ones for, say, math or physics that have you write out the entire procedure, don't really test whether you understand the underlying principles, just that you've seen that type of problem before and can remember the equations correctly. The LLM bread-and-butter functionality, basically.
I can't find the example right now, but you can give AIs just some pixels that they would classify as a dog or whatever. A human would only classify something that has the attributes of a dog as a dog. That's different from corner cases and unclear dress examples.
"If you want to compare fairly, you'd have to do so with a human that's never spoken to anyone, never seen anything but 2D pictures on a screen in front of them and hasn't been told or shown anything about the world they live in since they were born"
And this is not what is happening with AIs. They get fed data that sums up to more than a single human lifespan.
", but still even more elaborate tests like ones for say, math or physics that have you write the entire procedure don't really test if you understand the underlying principles"
And this is not true, except for high school tests. You certainly can test for understanding, by not giving generic problems but ones where you have to apply your knowledge. And sure, those were the most hated tests, as they required actual understanding and you could not weasel through. But they also created more work for teachers, so most tests were just the generic have-you-learned-the-algorithm tests. And if testing in person, you can go deep.
AGI is just the old "strong AI" that is just the old "AI", all that we never bothered to precisely define.
I just can't understand how we can talk about language models all day without understanding the most basic idea of Wittgenstein: that sloppy use of language leads to philosophical problems and misunderstandings. ChatGPT can easily teach people this too:
"Philosophical problems often arise when words are taken from one language game and used in another without an understanding of how the change in context alters their meaning. For example, words like "mind," "knowledge," "belief," and "truth" are used in everyday language games with certain meanings, but when they are transplanted into philosophical discussions without adjusting for the change in context, misunderstandings and confusions occur.
"
I think the best definition for "strong AI" is an AI that is capable of strangling you when you ask it to do some stupid thing for your amusement for the 74th time.
"
I mean, I know some humans who would miscatacorize a mineral as a dock sometimes, are they not intelligent anymore"
Some humans also ain't intelligent, yes.
(But of course even the dumbest humans automatically make very complicated calculations in their head, just to maintain body functions. Language is tricky.)
Taking these shortcomings a step further, it may not be long before we can't tell reality from simulation, since we're all so plugged in.
A friend who lived in the Amazon with me for some time did a diet with datura, a potent psychoactive. He described his experience as one moment sitting in a cafe completely lucid and chatting with friends, then a moment later being somewhere else altogether; this continued for some time, and of course none of it was real despite it being indiscernible from reality to him within that experience. People often lose their sanity with this as a result of an inability to find a way back to the old known "reality". With AI and Neuralink-like interfaces... we have interesting times ahead.
> What is the real criterion for a machine to be considered intelligent? What does it have to do for people to not retroactively downplay it by invoking mechanistic pseudo-explanations?
When its capabilities are no longer a proper subset of an average competent human's?
>When its capabilities are no longer a proper subset of an average competent human's?
Interesting, I'd say that it's definitely already the case with the current iterations. The average human can't write as well as ChatGPT, or speak as many languages, or translate between them.
The average human can't code at all in most programming languages, even if ChatGPT's code is often subtly wrong.
It's definitely not a proper subset of average human intelligence anymore.
Good point. My criterion is bad. Technically speaking, Microsoft Excel's capabilities aren't a proper subset of an average competent human's either.
Perhaps,
When an average competent human's capabilities within their field are a subset of the AI's (for all fields of interest).
is a better criterion.
Basically, if an AI is going to be a doctor, I want it to at least match the average doctor in capability. What I'm concerned about are blind spots that the AI might have that the average human (within their field) doesn't.
A jetliner-pilot AI is worthless even if it can do fighter-jet maneuvers with a jetliner but goes schizo if a bird flies in front of it during landing.
> What is the real criterion for a machine to be considered intelligent?
I reckon not many have ever said that machines can't be intelligent, in any way.
I think the most compelling question is: what makes them "human like" intelligent.
Passing a multiple-choice test is not an actual measure of human intelligence; it's IMO a test of "semantic searching" abilities, which machines can be trained to emulate.
I recently learned that this moving of the goal post with regards to AI is a known effect, and it's been a thing for at least a decade: https://en.wikipedia.org/wiki/AI_effect
I sense a new replication crisis. The pool of academics that get excited about AI is very limited. Aaronson worked for OpenAI at least for a while. He cites Caplan as "his friend":
"As I’ve mentioned before, economist, blogger, and friend Bryan Caplan was unimpressed when ChatGPT got merely a D on his Labor Economics midterm. So on Bryan’s blog, appropriately named “Bet On It,” he made a public bet that no AI would score on A on his exam before January 30, 2029. GPT-4 then scored an A a mere three months later (!!!)"
Miraculous indeed.
I'll start listening if these experiments are validated by critics like Chomsky.
That was my take from the beginning. If AI can pass your exam it means you have a crap exam and you need to make a better one because it doesn't test knowledge application, only recall.
> The fact that GPT-4 can pass his tests says much more about the tests than about the capabilities of GPT-4.
I would be careful with writing such an unfavorable opinion. Sure, GPT-4 has its strong and weak sides, so tests can be made GPT-4-proof... at least for now.
> By and large, the reason our customers are on campus is to credibly show, or “signal,” their intelligence, work ethic, and sheer conformity.
Ugh. This sentence stopped me dead in my tracks. The corporatization of education is killing any unique purpose it used to have. In particular, thinking of students as customers gives them a sense of entitlement that they should be able to "talk to the manager" if they don't get the outcome they want. It doesn't help that tuition has reached such heights that it's hard to argue that they're wrong.
The mentality of ‘the customer is always right’ also leads to homogenous thinking, with universities of all places now leading efforts to discourage freedom of speech/thought and punish ‘wrongthinkers’. It’s what the ‘customers’ are demanding.
> The mentality of ‘the customer is always right’ also leads to homogenous thinking
How? The customers (students) are different so what they see as right is different.
> universities of all places now leading efforts to discourage freedom of speech
I don't see universities doing to themselves anything different now than they've done for decades. There has been a resurgence of state governments meddling.
I also came here to quote that part. The author is completely biased, and his views reflect not what I call "higher education", nor my experience.
A friend of mine was offered a well-paid teaching assistant position at a private school. Some students completely failed the tests he was grading, but when he gave out failing grades, he got a meeting with the principal, who told him about the importance of not disappointing/losing customers, and that these people were paying good money for their degrees. He didn't renew his contract there.
Luckily, this still seems to be the exception around here, with almost-free public education.
What do you expect from a self-proclaimed anarcho-capitalist economist who is reactionary enough to find a way, in an article about AI, to complain about how the left cares about common left talking points the right hates?
An actually educated public is a major inconvenience, even a danger, for capitalism. Capitalism would rather reduce education to a workforce-sorting algorithm.
> The only thing that will really matter will be exams. And unless the exams are in-person, they’ll be a farce, too.
How is this not already the case? Even before ChatGPT it was easy to hire people to write essays in your stead. (Some of these people are now complaining AI is taking their jobs.)
Because oral exams don’t scale to the number of students in undergraduate programs, given the number of faculty and the resources faculty are given for teaching. This is why oral exams are mostly done as part of graduate education (or undergraduate honors theses), and then only as part of qualifying exams and thesis/dissertation defenses. By and large, regular classes taught in graduate school do not use oral exams.
I think the underlying issue that no one wants to admit is that probably “too many” people go to university and/or there are “too few” professors to teach them. Modern higher ed has probably contributed to this problem on both the supply side (too few professors, especially tenured professors) and the demand side (advertising prestige, infecting politicians with the notion that a 4-year degree should be a primary goal for all).
Ironically, due to how graduate education has been distorted by underfunding of researchers (not providing stable and adequately paid research assistant/scientist positions), we have over-produced PhDs in many fields, given the shrinking number of well-paying faculty jobs universities are willing to fund.
For me it’s hard to conclude other than this is due to neoliberal/capitalist/corporatist ideology that has poisoned Western societies over the last 50 years.
All of my exams were in-person but they weren't oral. Written exams scale. They aren't used as much as they could be because too many students would fail.
> I think the underlying issue that no one wants to admit is that probably “too many” people go to university
That’s not something the author, Caplan, is afraid to admit. He wrote a whole book arguing that [1]. He also happens to be a libertarian and staunch capitalist economist.
His argument (in his book) is that most university degrees (apart from engineering and other hard-core technical programs) are worthless. That everyone knows it but that it’s taboo to admit it. That what people are actually getting a degree for is to signal a level of basic intelligence and diligence/conscientiousness to prospective employers. He calls this the signalling hypothesis. He contrasts this with the human capital hypothesis: a belief that people actually go to university to learn and improve themselves. It’s a fascinating argument he presents!
Remote proctoring is a thing. I imagine proctoring can be automated to large extent as well, if it isn’t already. Today for technical certifications when you log into the test environment there is a person in a far flung country watching you.
Unfortunately, the factual evidence on deep learning, and on testing for deep learning, points in just the opposite direction from the author's own practice.
To test deep learning we have to get at the meta-objects that humans form of a subject, and the only way to do that currently is essay testing. If you graduated in the 1980s or 1990s it was called blue-book testing, after the blank blue books sold with nothing but blank pages for exactly that purpose.
Multiple-choice testing was chosen for IQ tests and the SAT not for testing performance but to make grading faster; when the SAT was originally applied during WWI, it was used to separate officers from soldier grunts.
Or, in short: teacher interaction with the student's meta-objects can at times indicate the degree of deep learning.
Now hold on, let me give it a twist. How would one test deep learning in a grad course in civil engineering? Since the meta-objects are going to be about building specific things, we could have a contest to build a bridge that survives having a given amount of weight applied. In fact, there are MIT classes that do deep testing in exactly this way, and you can find the videos on YouTube (they are fun to watch).
Approaches like that are creative and are going to turn higher ed upside down, moving toward interactive testing of human meta-objects rather than multiple-choice testing, among other strategies for testing deep learning.
> I’ve spent much of my career arguing that the main function of education is not to teach useful skills (or even useless skills), but to certify students’ employability. By and large, the reason our customers are on campus is to credibly show, or “signal,” their intelligence, work ethic, and sheer conformity. While we like to picture education as job training, it is largely a passport to the real training that happens on the job.
How about just keep your hot takes out of the article if you're not going to substantiate them?
Also the "real training that happens on the job", lol. I've learned so many more things in university. I'm struggling to remember the last time I did something challenging at any company.
Is that even a hot take at this point? I’m pretty sure I had lecturers that said much the same.
Also, if you’re seeking out challenges but there are none to be found, you must be at the most wonderful, profitable, and stable shop of all time. If not, you perhaps could look deeper for challenges.
No, that's preposterous; I've lived both, being self-taught and going through college. As a self-taught learner I couldn't know my unknown-unknowns and wouldn't have had the drive to study some topics college forced me through, and those who went through college are kidding themselves if they think they'd have learned what they know on the job.
Pretty much the truth, isn't it? The other strand of this thinking is from Taleb: wealth leads to degrees, not the other way round. That could perhaps qualify as a hot take.
What did you learn at university that you actually use on the job, in the form that is usable on the job?
I did a tiny bit of programming, but I couldn't code properly until I was working. I could do calculus, but you don't actually use that as a derivatives trader, you just internalize a few results and that's it. I coded up a model that needed a Bessel function once, but that was a small project. Generally you don't solve new equations every day at work (you apply the same ones over and over), but somehow that's what you do at uni.
I don't think I learned anything at uni that I needed at work, in academic terms. I'm pretty sure I could have just jumped straight to working after high school. My first boss didn't do uni and he did just fine.
Uni was maturation, and the last chance to see a few of the wonders of human knowledge. Materials, electromagnetism, computation, fluids, structures, and so on. But really just a tour of those things. None of these things is what you do at work, where the real skill is grinding, networking, and understanding how some business works.
I did not get classes on electromagnetism, fluid dynamics or engineering topics like you seem to have had.
I've excluded things I don't use as much such as
Prolog, algorithm provers (Coq/Promela), advanced calculus, lambda calculus, and lots of common-sense classes related to soft skills like communication and project management.
Also I didn't enroll in the graphics programming classes and I regret it a lot
Perhaps it was worth adding the caveat that people who did a CS course actually do seem to use it directly. Practical things like Git and Linux commands are the real thing.
CS is somewhat unique in that you actually do what you do at a job, with the same exact tools. You don't use a toy version of a c++ compiler or a version control system, and your computer is as powerful as the one you get at work.
Everyone else at uni does things that are not the same thing at all, with the exception of medics who are sent to the hospitals. I built a 1m model of a bridge, and a radio that was fixed to one AM station. I used Cadence for a week to build a CPU with a few thousand transistors.
> Also the "real training that happens on the job", lol. I've learned so many more things in university. I'm struggling to remember the last time I did something challenging at any company.
Maybe you're taking the wrong jobs? Lack of growth is a strong signal to move on.
The theory I was exposed to in undergrad was great, but writing active/active services that handle billions of dollars of financial transactions a day is orders of magnitude better training than university. And now that I'm running my own company, I'm learning everything at volume. The experience is a fire hose of learning.
College is fine, but there's nothing like a real job with complex domain challenges.
There are so many cool jobs out there, too - autonomous cars, AI/ML, delivery drones, computer vision, etc.
They are definitely taking the wrong jobs. I was at one of the best universities in the country for CS but I didn't really learn how to be a professional C++ programmer until I got an actual job. Universities can give you some solid foundations, nothing more.
We've had to write the entire raytracing pipeline for a proprietary game engine and make it perform fast enough on the PS5 - it's the kind of stuff that you can't go on stack overflow and ask, you have to actually study and come up with a good technical design by yourself.
I coded the back-end for the borders of Europe and it's basically CRUD. Okay, CRUD but distributed. Before that, I did commercial compilers, but the difficult parts were already built, so it was just a matter of extending the grammar using what was already working pretty well. Before that, I did web development. Nowhere did I do any concurrency, data structures, or distributed software; nowhere did I need to know about computer architecture or any of the very difficult things I did in college. That prompted me to move away from web to seek bigger challenges, but in the end it's all CRUD, UI work, derivative extensions of already-built codebases, fixing bugs and drinking coffee.
I mean all I can say is come work in games - the technical challenges you will face will absolutely make you learn new skills. We wouldn't even hire someone who doesn't have good understanding of challenges of concurrency and performance optimization already. I've learnt this stuff at university but having to use it to make a game that runs at 60fps in a very constrained environment - that's the real challenge. Uni CS was super basic compared to that.
> I've learned so many more things in university. I'm struggling to remember the last time I did something challenging at any company.
That's his point. You ostensibly learn things useless for the job so you can prove to your employer your capacity to learn and be somewhere on time and meet the deadlines reasonably well.
In a real job you learn a lot of things that are incredibly easy compared to the useless stuff you learned in university. So easy, in fact, that you might not even notice learning them.
Throughout my whole career, maybe two courses from university were directly useful. And yet when I was getting hired at a corporation, I was asked for proof that I had completed higher education, despite the fact that nothing on the job required even a shred of knowledge from there; it required a lot of knowledge I acquired later, which they had already tested me on before they asked for my diploma.
The strap-line doesn't seem to be answered properly by the essay.
The postulation that education isn't about teaching skills but about teaching employability, which seems to be rooted in the idea that certification is the only real product of education, is wrong.
You hire someone with excellent grades in maths for a maths heavy job.
You hire someone with previous machinist's experience, because you want to hire someone who can operate a lathe/CNC/mill/other.
You don't hire them because they have a certificate.
The author seems to think that skills are similar to certificates of authenticity. What they seem to forget is that certificates of education indicate ability, whereas guarantees of authenticity indicate value.
Most employers don't look at grades. Assessing skills is time-consuming and often difficult; a degree, especially from a selective school, is a signal that the student satisfied the school's assessments for entry and continuous assessments to remain enrolled to completion. Previous employment at a known, desirable employer (e.g. FAANGs) serves the same kind of purpose.
It is presumed that having the degree correlates to having certain skills. Often the skills they have and also the skills desired never appeared on any syllabus; they're meta-skills useful to thriving in the university, academically and non-academically, that will also be useful in the workplace.
It's super impressive, but it seems like this generative AI step, with its acceptance of mistakes, was a switch in mindset that suddenly made AI a lot more useful.
But it seems like the last 10% that actually makes it take over human tasks almost fully might take another 5-30 years.
So what if it takes another 30 years? Barely 71 years ago MIT demonstrated one of the first numerically controlled mills for manufacturing [1], which gave us the world as we have it today: from our chairs to our CPUs, nothing could be obtained so fast/cheap/good without automated milling, molding, and so on. It just seems completely unimaginable that by 2094 we won't have synthetic agents being not only great tutors, but also replacing every job that we have today or will be able to imagine in the interim.
The problem is then with our mental models, our economics, our metaphysics, not with the technology.
Well the world may have burned in 30 years (that doesn't seem unlikely at all). So I can perfectly imagine that by 2094 we don't have those synthetic agents. Maybe we'll just be focused on not starving.
I am starving today, being starved by a system which puts me in artificial scarcity loops, me and who knows how many others. Starvation does not hinder the desire for resolution, if anything, it fosters it.
But yes, the world might disappear tomorrow; it's just not a workable hypothesis. It's much more workable to think that the complexity of our tools will continue to increase, elevating themselves towards higher and higher abstraction layers.
Well in 2023, I think it is reasonable to wonder what will happen to our tools and technology when oil becomes scarce (peak oil was 2008, oil is not unlimited).
I was not saying that the Sun may explode. I was saying that we have good reasons to think that energy will become more and more expensive in the next few decades.
I love your comment. 71 years is less than the age of my mother. I for one expect the world to change a lot more in the coming 30 years than in the last 30 years. In fact, I expect that the speed of change will be so high that it will lead to widespread social unrest, even (or particularly) in the West. But there are incredibly good things coming, if we are prepared and push for them a little bit.
Ah, and our metaphysics, social workings and culture are completely screwed for what is coming, and they are going to screw us good in turn.
We don’t know what the limits are, what the difficulties will be. You can’t just extrapolate from the past. We’ll have to wait and see where things go.
As far as cognitive capacity and consciousness as self-monitoring agency we have N=1, ourselves. Are there limits above us? Probably, but we have no reason to believe we are anywhere near the pinnacle of cognitive or consciousness load.
Sure we can extrapolate from the past, we do it daily when we build something, if we didn't we would get nothing done.
Here's my semi-educated guess based on observing the rate of progress in this field over the last years:
I think there is still improvement to be made in base models. The amount of progress between GPT-2 and GPT-4 in just 4 years is astonishing so I would be surprised if GPT-4 is the limit of what we can do with this technology. I expect at least one more major version that will significantly outperform GPT-4. After that I think the major focus will be making these models more performant, expanding the context window, making them less resource-hungry, adding multimodality and composing them into larger architectures using techniques similar to Chain of Thought - building fully fledged general-purpose agents. These agents will be able to outperform humans at most of the tasks we throw at them.
If I were to guess I'd say this will happen by the end of this decade.
It seems to me we'll see large glue models, which will be able to read from one model's last layer and write directly into another model's first layer, and which are trained at composing them. Some of this is already happening, and it cannot yet be generalized, but the idea of gluing already-trained medium-sized models together and then retraining the whole thing to improve performance seems to be establishing itself in products, and will allow much more efficient expenditure of training resources.
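To make the "glue" idea concrete, here is a minimal PyTorch sketch (my own illustration, not any shipping product): two frozen stand-in models are joined by a small trainable adapter that maps one model's final hidden states into the other's input space, and only the adapter gets gradient updates. The `ToyEncoder`/`ToyHead` modules, dimensions, and the fake batch are all made up for demonstration; a real system would plug in actual pretrained checkpoints.

```python
import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    """Stand-in for a pretrained "model A" that outputs hidden states."""
    def __init__(self, vocab=1000, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)

    def forward(self, tokens):
        return self.layer(self.embed(tokens))          # (batch, seq, dim)

class ToyHead(nn.Module):
    """Stand-in for a pretrained "model B" that consumes hidden states."""
    def __init__(self, dim=512, classes=10):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.out = nn.Linear(dim, classes)

    def forward(self, hidden):
        return self.out(self.layer(hidden).mean(dim=1))  # (batch, classes)

model_a, model_b = ToyEncoder(), ToyHead()
model_a.requires_grad_(False)   # keep the "pretrained" parts frozen
model_b.requires_grad_(False)

glue = nn.Linear(256, 512)      # the only trainable component: A's space -> B's space
optimizer = torch.optim.AdamW(glue.parameters(), lr=1e-4)

tokens = torch.randint(0, 1000, (8, 16))   # fake batch of token ids
labels = torch.randint(0, 10, (8,))        # fake labels

logits = model_b(glue(model_a(tokens)))    # A -> glue -> B
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()                            # gradients flow only into the glue layer
optimizer.step()
```

The design choice being illustrated is that training cost scales with the glue layer, not with the frozen models; a later full retrain, as the comment above suggests, could unfreeze everything once the composition works.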
Every time a machine arrives that does what we used to do, human tasks are pushed up the chain of abstraction.
This will only change if an AI arrives that is better than us at the only IQ test that counts: navigating the complexities of real life; and maybe not even then.
"""
With regards to taking over jobs, this time it is different. Humans may "specialize in whatever AI does worst", but this is just treading water. Every year, the Venn diagram of uniquely human skills gets smaller as AI/robotics' footprint grows. Using the examples of transatlantic phone calls or e-commerce is a false equivalence, because those technologies were not simultaneously available, inexpensive, and with low barriers to use. Today, a smart person can watch a 20 min video on Langchain and create an AI app to automate a task in a day. In a year, I'm guessing we'll be able to just ask an AI app to create a task-automating app. The human may have to critique it for a few rounds to get it right, but don't our bosses already do this? Sure, humans are flexible, but they can't compete with a world of motivated people cranking out apps and posting them to Github for the world to improve upon. We currently have a safe harbor in physical tasks, but that will last 5-10 years at the most. Don't get me wrong - the end of scarcity is a good thing, assuming we can successfully make this transition and we aren't wedded to the idea that sharing the benefits is (horror of horrors) socialism.
"""
God damn. Not only does this website immediately ask me to subscribe for their fucking nonsense then they ask me to pledge money so they can keep writing more fucking nonsense. I have no idea what this is about and I don't care anymore.
I have uBlock Origin. That doesn't block the Substack nags. Wanna try again?
Also, the website owner should make their website usable. I shouldn't have to be running various ad and script blockers just to get a normal - humane - user experience.