Ziz is...exceptionally (and probably often uncomfortably) aware of the way people's minds work in a psychoanalytic sense.
What do you mean by this? Like, she's better than average at predicting people's behavior in various circumstances?
Back in January, I participated in a workshop in which the attendees mapped out how they expect AGI development and deployment to go. The idea was to start by writing out what seemed most likely to happen this year, and then condition on that, to forecast what seems most likely to happen in the next year, and so on, until you reach either human disempowerment or an end of the acute risk period.
This post was my attempt at the time.
I spent maybe 5 hours on this, and there's lots of room for additional improvement. This is not a confident statement of how I think things are most likely to play out. There are already some ways in which I think this projection is wrong. (I think it's too fast, for instance.) But nevertheless I'm posting it now, with only a few edits and elaborations, since I'm probably not going to do a full rewrite soon.
2024
2025
2026
2027 and 2028
2028
2029
2030
the fundamental laws governing how AI training processes work are not "thinking back"
As a commentary from an observer: this is distinct from the proposition "the minds created with those laws are not thinking back."
Near-human AGI need not transition to ASI until the relevant notKillEveryone problems have been solved.
How much is this central to your story of how things go well?
I agree that humanity could do this (or at least it could if it had its shit together), and I think it's a good target to aim for that buys us sizable success probability. But I don't think it's what's going to happen by default.
This seems clearly false in the case of deep learning, where progress on instilling any particular behavioral tendencies in models roughly follows the amount of available data that demonstrate said behavioral tendency. It's thus vastly easier to align models to goals where we have many examples of people executing said goals. As it so happens, we have roughly zero examples of people performing the "duplicate this strawberry" task, but many more examples of e.g., humans acting in accordance with human values, ML / alignment research papers, chatbots acting as helpful, honest and harmless assistants, people providing oversight to AI models, etc. See also: this discussion. [emphasis mine]
The thing that makes powerful AI powerful is that it can figure out how to do things that we don't know how to do yet, and therefore don't have examples of. The key question for aligning superintelligences is "how do they generalize in new domains that are beyond what humans were able to do / reason about / imagine?"
I haven't seen careful analysis of LLMs (probably because they're newer, so harder to fit a trend), but eyeballing it... Chinchilla by itself must have been a factor-of-4 compute-equivalent improvement at least.
Ok, but discovering the Chinchilla scaling laws is a one-time boost to training efficiency. You shouldn't expect to repeatedly get 4x improvements just because you observed that one.
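(For intuition on where that one-time boost comes from: the Chinchilla result is usually summarized as a rule of thumb of ~20 training tokens per parameter, with total training compute C ≈ 6·N·D flops. Here's a back-of-envelope sketch of that heuristic; the function and its constants are the folk summary of the result, not the paper's fitted scaling law.)

```python
# Back-of-envelope Chinchilla heuristic: compute-optimal training uses roughly
# 20 tokens per parameter, with total training compute C ≈ 6·N·D flops.
# These constants are the folk summary of the result, not a fitted law.

def chinchilla_optimal(compute_flops, tokens_per_param=20, flops_per_param_token=6):
    """Given a compute budget C, return the roughly compute-optimal parameter
    count N and training-token count D, solving C = 6·N·D with D = 20·N."""
    n_params = (compute_flops / (flops_per_param_token * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Example: a Gopher/Chinchilla-scale budget of ~5.8e23 flops
# (roughly "a mole of flops", as it happens).
n, d = chinchilla_optimal(5.8e23)
print(f"~{n:.1e} params, ~{d:.1e} tokens")  # ~7.0e10 params, ~1.4e12 tokens
```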
At the time, the largest training run was AlphaGo Zero, at about a mole of flops in 2017. Six years later, Metaculus estimates that GPT-4 took ~10-20 moles of flops.
By "mole" do you mean the unit from chemistry?
In chemistry, a mole is a unit of measurement used to express amounts of a chemical substance. It is one of the base units in the International System of Units (SI) and is defined as the amount of substance that contains as many elementary entities (such as atoms, molecules, ions, or electrons) as there are atoms in 12 grams of carbon-12 (¹²C), the isotope of carbon with relative atomic mass 12 by definition. This number is known as Avogadro's number, which is approximately 6.022×10²³ entities per mole.
Am I missing something? Why use that unit?
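(Taking "mole" at face value as the chemistry unit, the conversion is at least straightforward; this little sketch just applies Avogadro's number to the figures quoted above.)

```python
# Taking "mole" literally as the chemistry unit: 1 mole of flops is just
# Avogadro's number of floating-point operations.

AVOGADRO = 6.022e23  # flops per mole of flops

def moles_to_flops(moles):
    return moles * AVOGADRO

# AlphaGo Zero, ~1 mole (per the comment above):
print(f"{moles_to_flops(1):.1e} flops")  # 6.0e23 flops

# GPT-4, ~10-20 moles (the cited Metaculus range):
print(f"{moles_to_flops(10):.1e} to {moles_to_flops(20):.1e} flops")  # 6.0e24 to 1.2e25
```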
Back in 2020, a group at OpenAI ran a conceptually simple test to quantify how much AI progress was attributable to algorithmic improvements. They took ImageNet models which were state-of-the-art at various times between 2012 and 2020, and checked how much compute was needed to train each to the level of AlexNet (the state-of-the-art from 2012). Main finding: over ~7 years, the compute required fell by ~44x. In other words, algorithmic progress yielded a compute-equivalent doubling time of ~16 months (though error bars are large in both directions).
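(The ~16-month figure follows directly from the headline numbers, modulo the error bars; a quick check:)

```python
import math

# Back out the compute-equivalent doubling time from the headline numbers:
# ~44x less compute needed after ~7 years of algorithmic progress.
efficiency_gain = 44
years = 7

doublings = math.log2(efficiency_gain)              # ≈ 5.5 doublings
doubling_time = years * 12 / doublings              # in months
print(f"~{doubling_time:.1f} months per doubling")  # ~15.4 months
```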
Personally, I would be more interested in the reverse of this test: take all the prior state-of-the-art models, and ask how long you need to train each of them in order to match the benchmark performance of the current state-of-the-art models.
Would that even work at all? Is there some (non-astronomically large) level of training which makes AlexNet as capable as current state of the art image recognition models?
The experiment that they did is a little like asking "at what age is an IQ 150 person able to do what an adult IQ 70 person is able to do?". But a more interesting question is "How long does it take to make up for being IQ 70 instead of IQ 150?"
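(If someone actually wanted to run the reverse test, its shape might look something like the sketch below. The data loaders, learning-rate choice, and target accuracy are all placeholders I've made up, and it's entirely possible the loop never reaches the target at any feasible budget, which is rather the point of the question.)

```python
# Rough sketch only: keep training the 2012 architecture (AlexNet) and record
# whether it ever matches a modern top-1 target, or plateaus first. The data
# loaders, optimizer settings, and target accuracy are placeholders, not a
# real experimental protocol.
import torch
import torchvision

def reverse_efficiency_test(train_loader, val_loader, target_top1=0.85,
                            max_epochs=1000, device="cuda"):
    model = torchvision.models.alexnet(num_classes=1000).to(device)
    opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()

    for epoch in range(max_epochs):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

        # Measure top-1 accuracy after each epoch; stop if we hit the target.
        model.eval()
        correct, total = 0, 0
        with torch.no_grad():
            for x, y in val_loader:
                preds = model(x.to(device)).argmax(dim=1)
                correct += (preds == y.to(device)).sum().item()
                total += y.numel()
        acc = correct / total
        print(f"epoch {epoch}: top-1 = {acc:.3f}")
        if acc >= target_top1:
            return epoch, acc  # matched the modern target with extra training

    return None, acc  # budget exhausted (or plateaued) below the target
```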
You should have her decide (and write down) what to encode in advance, so that you can check later not just whether you remembered something, but whether you successfully encoded it in a way that communicates what you intended to communicate to yourself.
(Since Drake managed to send a memory, but was only guessing about what it was intended to mean.)