When I first saw this post it was at -1 karma, which didn't make much sense to me, so I upvoted it back to zero. Can anyone who downvoted it share their reasoning?
if there is any way of fixing this mess, it's going to involve clarifying conflicts rather than obfuscating them
This immediately brought to mind John Nerst's erisology. I've been paying attention to it for a while, but I don't see it much here (speaking as a decade-long lurker); I wonder why.
The Metaculus question "Human/Machine Intelligence Parity by 2040?" has a pretty high bar for human-level intelligence:
Assume that prior to 2040, a generalized intelligence test will be administered as follows. A team of three expert interviewers will interact with a candidate machine system (MS) and three humans (3H). The humans will be graduate students in each of physics, mathematics and computer science from one of the top 25 research universities (per some recognized list), chosen independently of the interviewers. The interviewers will electronically communicate (via text, image, spoken word, or other means) an identical series of exam questions of their choosing over a period of two hours to the MS and 3H, designed to advantage the 3H. Both MS and 3H have full access to the internet, but no party is allowed to consult additional humans, and we assume the MS is not an internet-accessible resource. The exam will be scored blindly by a disinterested third party.
Question resolves positively if the machine system outscores at least two of the three humans on such a test prior to 2040.
(I graduated in physics from a top-25 research university, and I'm not at all confident I'd pass this test myself.)
In any case, I wonder if it's better not to focus overly on the question of "the right operational definition of human-level intelligence" and instead adopt Holden's approach of talking about PASTA, in particular the last two sentences below:
By "transformative AI," I mean "AI powerful enough to bring us into a new, qualitatively different future." The Industrial Revolution is the most recent example of a transformative event; others would include the Agricultural Revolution and the emergence of humans.2
This piece is going to focus on exploring a particular kind of AI I believe could be transformative: AI systems that can essentially automate all of the human activities needed to speed up scientific and technological advancement. I will call this sort of technology Process for Automating Scientific and Technological Advancement, or PASTA.3 (I mean PASTA to refer to either a single system or a collection of systems that can collectively do this sort of automation.) ... [some paragraphs on what PASTA can do]
By talking about PASTA, I'm partly trying to get rid of some unnecessary baggage in the debate over "artificial general intelligence." I don't think we need artificial general intelligence in order for this century to be the most important in history. Something narrower - as PASTA might be - would be plenty for that.
I feel like he was falling into a kind of fallacy: he observed that a concept isn't entirely coherent, and then rejected the concept outright.
My go-to writeup on this is the "Imprecise definitions can still be useful" section of Luke Muehlhauser's 2013 MIRI essay What is Intelligence?, which discusses the question of operationalizing the concept of "self-driving car":
...consider the concept of a “self-driving car,” which has been given a variety of vague definitions since the 1930s. Would a car guided by a buried cable qualify? What about a modified 1955 Studebaker that could use sound waves to detect obstacles and automatically engage the brakes if necessary, but could only steer “on its own” if each turn was preprogrammed? Does that count as a “self-driving car”?
What about the “VaMoRs” of the 1980s that could avoid obstacles and steer around turns using computer vision, but weren’t advanced enough to be ready for public roads? How about the 1995 Navlab car that drove across the USA and was fully autonomous for 98.2% of the trip, or the robotic cars which finished the 132-mile off-road course of the 2005 DARPA Grand Challenge, supplied only with the GPS coordinates of the route? What about the winning cars of the 2007 DARPA Grand Challenge, which finished an urban race while obeying all traffic laws and avoiding collisions with other cars? Does Google’s driverless car qualify, given that it has logged more than 500,000 autonomous miles without a single accident under computer control, but still struggles with difficult merges and snow-covered roads?
Our lack of a precise definition for “self-driving car” doesn’t seem to have hindered progress on self-driving cars very much. And I’m glad we didn’t wait to seriously discuss self-driving cars until we had a precise definition for the term.
Bertrand Russell put it more pithily:
[You cannot] start with anything precise. You have to achieve such precision… as you go along.
I agree with this comment, and I'm confused why it's been so heavily disagree-voted (-6 agreement karma vs. +11 overall). Can anyone who disagreed explain their reasoning?
Apparently Jeff Bezos used to do something like this with his regular "question mark emails", which struck me as interesting in the context of an organization as large and complex as Amazon. Here's what it's like from the perspective of one recipient (partial quote, more at the link):
About a month after I started at Amazon I got an email from my boss that was a forward of an email Jeff sent him. The email that Jeff had sent read as follows:
“?”
That was it.
Attached below the “?” was an email from a customer to Jeff telling him he (the customer) takes a long time to find a certain type of screws on Amazon despite Amazon carrying the product.
A “question mark email” from Jeff is a known phenomenon inside Amazon & there’s even an internal wiki on how to handle it but that’s a story for another time. In a nutshell, Jeff's email is public and customers send him emails with suggestions, complaints, and praise all the time. While all emails Jeff receives get a response, he does not personally forward all of them to execs with a “?”. It means he thinks this is very important.
It was astonishing to me that Jeff picked that one seemingly trivial issue and a very small category of products (screws) to personally zoom in on. ...
Where are you going with this line of questioning?
If it's high-quality distillation you're interested in, you don't necessarily need a PhD. I'm thinking of e.g. David Roodman, now a senior advisor at Open Philanthropy. He majored in math, then did a year-long independent study in economics and public policy, and has basically been self-taught ever since. Holden Karnofsky considers what he does extremely valuable:
David Roodman, who is basically the person that I consider the gold standard of a critical evidence reviewer, someone who can really dig on a complicated literature and come up with the answers, he did what, I think, was a really wonderful and really fascinating paper, which is up on our website, where he looked for all the studies on the relationship between incarceration and crime, and what happens if you cut incarceration, do you expect crime to rise, to fall, to stay the same? He picked them apart. What happened is he found a lot of the best, most prestigious studies and about half of them, he found fatal flaws in when he just tried to replicate them or redo their conclusions.
When he put it all together, he ended up with a different conclusion from what you would get if you just read the abstracts. It was a completely novel piece of work that reviewed this whole evidence base at a level of thoroughness that had never been done before, came out with a conclusion that was different from what you naively would have thought, which concluded his best estimate is that, at current margins, we could cut incarceration and there would be no expected impact on crime. He did all that. Then, he started submitting it to journals. It’s gotten rejected from a large number of journals by now. I mean starting with the most prestigious ones and then going to the less.
Robert Wiblin: Why is that?
Holden Karnofsky: Because his paper, it’s really, I think, it’s incredibly well done. It’s incredibly important, but there’s nothing in some sense, in some kind of academic taste sense, there’s nothing new in there. He took a bunch of studies. He redid them. He found that they broke. He found new issues with them, and he found new conclusions. From a policy maker or philanthropist perspective, all very interesting stuff, but did we really find a new method for asserting causality? Did we really find a new insight about how the mind of a …
Robert Wiblin: Criminal.
Holden Karnofsky: A perpetrator works. No. We didn’t advance the frontiers of knowledge. We pulled together a bunch of knowledge that we already had, and we synthesized it. I think that’s a common theme is that, I think, our academic institutions were set up a while ago. They were set up at a time when it seemed like the most valuable thing to do was just to search for the next big insight.
These days, they’ve been around for a while. We’ve got a lot of insights. We’ve got a lot of insights sitting around. We’ve got a lot of studies. I think a lot of the times what we need to do is take the information that’s already available, take the studies that already exist, and synthesize them critically and say, “What does this mean for what we should do? Where we should give money, what policy should be.”
I don’t think there’s any home in academia to do that. I think that creates a lot of the gaps. This also applies to AI timelines where it’s like there’s nothing particularly innovative, groundbreaking, knowledge frontier advancing, creative, clever about just … It’s a question that matters. When can we expect transformative AI and with what probability? It matters, but it’s not a work of frontier advancing intellectual creativity to try to answer it.
Curious, what do you think now that GPT-4 is out?