aysja

Comments

aysja

I agree this is usually the case, but I think it’s not always true, and I don’t think it’s necessarily true here. E.g., people as early as Da Vinci guessed that we’d be able to fly long before we had planes (or even any flying apparatus which worked). The reasoning (at least for Da Vinci and the Wright brothers) was that birds can fly, so we should be able to as well. That end point was not dependent on details (early flying designs had wings like a bird, a design which we did not keep :p), but was closer to a laws-of-physics claim (if birds can do it, there isn’t anything fundamentally holding us back from doing it either).

Superintelligence holds a similar place in my mind: intelligence is physically possible, because we exhibit it, and it seems quite arbitrary to assume that we’ve maxed it out. But also, intelligence is obviously powerful, and reality is obviously more manipulable than we currently have the means to manipulate it. E.g., we know that we should be capable of developing advanced nanotech, since cells can, and that space travel/terraforming/etc. is possible. 

These two things together—“we can likely create something much smarter than ourselves” and “reality can be radically transformed”—are enough to make me feel nervous. At some point I expect most of the universe to be transformed by agents; whether this is us, or aligned AIs, or misaligned AIs, or what, I don’t know. But looking ahead and noticing that I don’t know how to select the “aligned AI” option from the set “things which will likely be able to radically transform matter” seems cause enough, in my mind, for exercising caution.

aysja

Bloomberg confirms that OpenAI has promised not to cancel vested equity under any circumstances, and to release all employees from one-directional non-disparagement agreements.

They don't actually say "all," and I haven't seen anyone confirm that all employees received this email. It seems possible (and perhaps likely) to me that many high-profile safety people did not receive it, especially since it would presumably be in Sam's interest to withhold it from them, and since I haven't seen them claiming otherwise. And we wouldn't know: those who are still under the contract can't say anything. If OpenAI only sent the email to some former employees, then they can come away with headlines like "OpenAI releases former staffers from agreement," which is true, without giving away their whole hand. Perhaps I'm being too pessimistic, but I am under the impression that we're dealing with a quite adversarial player, and until I see hard evidence otherwise this is what I'm assuming.

aysja

Why do you think this? The power that I'm primarily concerned about is the power to pause, and I'm quite skeptical that companies like Amazon and Google would be willing to invest billions of dollars in a company which may decide to do something that renders their investment worthless. I.e., I think a serious pause, one on the order of months or years, is essentially equivalent to opting out of the race to AGI. On this question, my strong prior is that investors like Google and Amazon have more power than employees or the trust, else they wouldn't invest.

aysja

"So God can’t make the atoms be arranged one way and the humans be arranged another contradictory way."

But couldn't he have made a different sort of thing than humans, one less prone to evil? Like, it seems to me that he didn't need to make us evolve through the process of natural selection, such that species were always in competition, status was a big deal, fighting over mates commonplace, etc. I do expect that there's quite a bit of convergence in the space of possible minds—even if one is selecting them from the set of "all possible atomic configurations of minds"—but I would still guess that not all of those are as prone to "evil" as us. I.e., if the laws of physics were held constant, I would think you could get less evil things than us out of them, and probably worlds which were overall more favorable to life (fewer natural disasters, etc.). But perhaps this is even more evidence that God only cares about the laws of physics? Since we seem much more like an afterthought than a priority?

aysja

Secondly, following Dennett, the point of modeling cognitive systems according to the intentional stance is that we evaluate them on a behavioral basis and that is all there is to evaluate.

I am confused on this point. Several people have stated that Dennett believes something like this; e.g., Quintin and Nora argue that Dennett is a goal "reductionist," by which I think they mean something like "'goal' is the word we use to refer to certain patterns of behavior, but it's not more fundamental than that."

But I don't think Dennett believes this. He's pretty critical of behaviorism, for instance, and his essay "Skinner Skinned" does a good job, imo, of showing why this orientation is misguided. Dennett believes, I think, that things like "goals," "beliefs," "desires," etc. do exist, just that we haven't found the mechanistic or scientific explanation of them yet. But he doesn't think that explanations of intention will necessarily bottom out in just their outward behavior; he expects such explanations to make reference to internal states as well. Dennett is a materialist, so of course at the end of the day all explanations will be in terms of behavior (inward or outward), on some level, much like any physical explanation is. But that's a pretty different claim from "mental states do not exist."

I'm also not sure if you're making that claim here or not, but curious if you disagree with the above? 

aysja

I don't know what Katja thinks, but for me at least: I think AI might pose much more risk of lock-in than other technologies. I.e., I expect that we'll have much less of a chance (and perhaps much less time) to redirect course, adapt, learn from trial and error, etc. than we typically do with a new technology. Given this, I think going slower and aiming to get it right on the first try is much more important than it normally is.

aysja

I agree there are other problems the EA biosecurity community focuses on, but surely lab escapes are one of those problems, and part of the reason we need biosecurity measures? In any case, this disagreement seems beside the main point that I took Adam to be making, namely that the track record for defining appropriate units of risk for poorly understood, high-attack-surface domains is quite bad (as with BSL). This still seems true to me.

aysja

Dennett meant a lot to me, in part because he’s shaped my thinking so much, and in part because I think we share a kindred spirit—this ardent curiosity about minds and how they might come to exist in a world like ours. I also think he is an unusually skilled thinker and writer in many respects, as well as being an exceptionally delightful human. I miss him. 

In particular, I found his deep and persistent curiosity beautiful and inspiring, especially since it’s aimed at all the (imo) important questions. He has a clarity of thought which manages to be both soft and precise, and a robust ability to detect and avoid bullshit. His book Intuition Pumps and Other Tools for Thinking is full of helpful cognitive strategies, many of which I’ve benefited from, and many of which have parallels in the Sequences. You can just tell that he’s someone in love with minds and the art of thinking, and that he’s actually trying at it.  

But perhaps the thing I find most inspiring about him, the bit which I most want to emulate, is that he doesn’t shy away from the difficult questions—consciousness, intentionality, what real patterns are, how we can tell if a system understands something, etc.—but he does so without any lapse in intellectual rigor. He’s always aiming at operationalization and gears-level understanding, but he’s careful to check whether mechanistic models in fact correspond to the higher level he’s attempting to address. He doesn’t let things be explained away, but he also doesn’t let things remain mysterious. He’s deeply committed to a materialistic understanding of the world which permits of minds.

In short, he holds the same mysteries that I do, I think, of how thinking things could come to exist in a world made out of atoms, and he’s committed, as I am, to naturalizing such mysteries in a satisfying way. 

He’s also very clear about the role of philosophy in science: it’s the process of figuring out what the right questions even are, such that one can apply the tools of science to answer them. I think he’s right, both that this is the role of good philosophy and that we’re all pretty confused about what the right questions of mind are. I think he did an excellent job of narrowing the confusion, which is a really fucking cool and admirable thing to spend a life on. But the work isn’t done. In many ways, I view my research as picking up where he left off—the quest for a satisfying account of minds in a materialistic, deterministic world. Now that he’s passed, I realize that I really wanted him to see that. I wanted to show him my work. I feel like part of the way I was connected to the world has been severed, and I am feeling grief about that. 

I’ve learned so much from Dennett. How to think better, how to hold my curiosity better, how to love the mind, and how to wonder productively about it. I feel like the world glows dimmer now than it did before, and I feel that grief—the blinking out of this beautiful light. But it is also a good time to reflect on all that he’s done for the world, and all that he’s done for me. He is really a part of me, and I feel the love and the gratitude for what he’s brought into my life. 

aysja

Aw man, this is so exciting! There’s something really important to me about rationalist virtues having a home in the world. I’m not sure if what I’m imagining is what you’re proposing, exactly, but I think most anything in this vicinity would feel like a huge world upgrade to me.

Apparently I have a lot of thoughts about this. Here are some of them, not sure how applicable they are to this project in particular. I think you can consider this to be my hopes for what such a thing might be like, which I suspect shares some overlap.


It has felt to me for a few years now like something important is dying. I think it stems from the seeming inevitability of what’s before us—the speed of AI progress, our own death, the death of perhaps everything—that looms, shadow-like. And it’s scary to me, and sad, because “inevitability” is a close cousin of “defeat,” and I fear the two inch closer all the time.   

It’s a fatalism that creeps in slow, but settles thick. And it lurks, I think, in the emotional tenor of doom that resides beneath nominally probabilistic estimates of our survival. Lurks as well, although much more plainly, within AI labs: AGI is coming whether we want it to or not, pausing is impossible, the invisible hand holds the reins, or as Claude recently explained to me, “the cat is already out of the bag.” And I think this is sometimes intentional—we are supposed to think about labs in terms of the overwhelming incentives, more than we are supposed to think about them as composed of agents with real choice, because that dispossesses them of responsibility, and dispossesses us of the ability to change them.

There is a similar kind of fatalism that often attaches to the idea of the efficient marketplace—that what is desired has already been done, that if one sits back and lets the machine unfold it will arrive at all the correct conclusions itself. There is no room, in that story, for genuinely novel ideas or progress; all forward movement is the result of incremental accretions on existing structures. This sentiment looms in academia as well—that there is nothing fundamental or new left to uncover, that all the low-hanging fruit has been plucked. Academic aims rarely push for all that could be—progress is instead judged relatively, the slow inching away from what already is.

And I worry this mentality is increasingly entrenching itself within AI safety, too. That we are moving away from the sort of ambitious science that I think we need to achieve the world that glows—the sort that aims at absolute progress—and instead moving closer to an incremental machine. After all, MIRI tried and failed to develop agent foundations, so maybe we can say “case closed”? Maybe “solving alignment” was never the right frame in the first place. Maybe it always was that we needed to do the slow inching away from the known, the work that just so happens not to challenge existing social structures. There seems to me, in other words, to be a consensus closing in: new theoretical insights are unlikely to emerge, let alone to have any real impact on engineering. And unlikelier, still, to happen in time.

I find all of this fatalism terribly confused. Not only because it has, I think, caused people to increasingly depart from the theoretical work which I believe is necessary to reach the world that glows, but because it robs us of our agency. The closer one inches towards inevitability, the further one inches away from the human spirit having any causal effect in the world. What we believe is irrelevant, what is good and right is irrelevant; the grooves have been worn, the structures erected—all that’s left is for the world to follow course. We cannot simply ask people to do what’s right, because they apparently can’t. We cannot succeed at stopping what is wrong, because the incentives are too strong to be opposed. All we can do, it seems, is to meld with the structure itself, making minor adjustments on the margin.  

And there’s a feeling I get, sometimes, when I look at all of this, as if a tidal wave were about to engulf me. The machine has a life of its own; the world is moved by forces outside of my control. And it scares me, and I feel small. But then I remember that it’s wrong. 


There was a real death, I think, that happened when MIRI leadership gave up on solving alignment, but we haven’t yet held the funeral. I think people carry that—the shadow of the fear, unnamed but tangible: that we might be racing towards our inevitable death, that there might not be much hope, that the grooves have been worn, the structures erected, and all that’s left is to give ourselves away as we watch it all unravel. It’s not a particularly inspiring vision, and in my opinion, not a particularly correct one. The future is built out of our choices; they matter, they are real. Not because it would be nice to believe it, but because it is macroscopically true. If one glances at history, it’s obvious that ideas are powerful, that people are powerful. The incentives do not dictate everything, the status quo is never the status quo for very long. The future is still ours to decide. And it’s our responsibility to do so with integrity. 

I have a sense that this spirit has been slipping, with MIRI leadership largely admitting defeat, with CFAR mostly leaving the scene, with AI labs looming increasingly large within the culture and the discourse. I don’t want it to. I want someone to hold the torch of rationality and all its virtues, to stay anchored on what is true and good amidst a landscape of rapidly changing power dynamics, to fight for what’s right with integrity, to hold a positive vision for humanity. I want a space for deep inquiry and intellectual rigor, for aiming at absolute progress, for trying to solve the god damn problem. I think Lightcone has a good shot at doing a fantastic job of bringing something like this to life, and I’m very excited to see what comes of this!

aysja

Huh, I feel confused. I suppose we just have different impressions. Like, I would say that Oliver is exceedingly good at cutting through the bullshit. E.g., I consider his reasoning around shutting down the Lightcone offices to be of this type, in that it felt like a very straightforward document of important considerations, some of which I imagine were socially and/or politically costly to make. One way to say that is that I think Oliver is very high integrity, and I think this helps with bullshit detection: it's hard to see when things don't cut to the core unless you deeply care about the core yourself. In any case, I think this skill carries over to object-level research, e.g., he often seems, to me, to ask cutting-to-the-core type questions there, too. I also think he's great at argument: legible reasoning, identifying the important cruxes in conversations, etc., all of which makes it easier to tell the bullshit from the not.

I do not think of Oliver as being afraid to be disagreeable, and ime he gets to the heart of things quite quickly, so much so that I found him quite startling to interact with when we first met. And although I have some disagreements over Oliver's past walled-garden taste, from my perspective it's getting better, and I am increasingly excited about him being at the helm of a project such as this. Not sure what to say about his beacon-ness, but I do think that many people respect Oliver, Lightcone, and rationality culture more generally; I wouldn't be that surprised if there were an initial group of independent researcher types who were down and excited for this project as is. 
