JenniferRM - LessWrong

St. Louis – ACX Meetups Everywhere Spring 2024

I think I'll be there and will bring a guest or three and will bring some basic potluck/picnic food :-)

Some Experiments I'd Like Someone To Try With An Amnestic

There was an era in a scientific community where they were interested in the "kinds of learning and memory that could happen in de-corticated animals" and they sort of homed in on the basal ganglia (which, to a first approximation "implements habits" (including bad ones like tooth grinding)) as the locus of this "ability to learn despite the absence of stuff you'd think was necessary for your naive theory of first-order subjectively-vivid learning".

(The cerebellum also probably has some "learning contribution" specifically for fine motor skills, but it is somewhat selectively disrupted just by alcohol: hence the stumbling and slurring. I don't know if anyone yet has a clean theory for how the cerebellum's full update loop works. I learned about alcohol/cerebellum interactions because I once taught a friend to juggle at a party, and she learned it, but apparently only because she was drunk. She lost the skill when sober.)

William_S's Shortform

JenniferRM5d70

Wait, you know smart people who have NOT, at some point in their life: (1) taken a psychedelic NOR (2) meditated, NOR (3) thought about any of buddhism, jainism, hinduism, taoism, confucianism, etc???

To be clear to naive readers: psychedelics are, in fact, non-trivially dangerous.

I personally worry I already have "an arguably-unfair and a probably-too-high share" of "shaman genes" and I don't feel I need exogenous sources of weirdness at this point.

But in the SF bay area (and places on the internet memetically downstream from IRL communities there) a lot of that is going around, memetically (in stories about) and perhaps mimetically (via monkey see, monkey do).

The first time you use a serious one you're likely getting a permanent modification to your personality (+0.5 stddev to your Openness?) and arguably/sorta each time you do a new one, or do a higher dose, or whatever, you've committed "1% of a personality suicide" by disrupting some of your most neurologically complex commitments.

To a first approximation my advice is simply "don't do it".

HOWEVER: this latter consideration actually suggests: anyone seriously and truly considering suicide should perhaps take a low dose psychedelic FIRST (with at least two loving tripsitters and due care) since it is also maybe/sorta "suicide" but it leaves a body behind that most people will think is still the same person and so they won't cry very much and so on?

To calibrate this perspective a bit, I also expect that even if cryonics works, it will also cause an unusually large amount of personality shift. A tolerable amount. An amount that leaves behind a personality that is similar-enough-to-the-current-one-to-not-have-triggered-a-ship-of-theseus-violation-in-one-modification-cycle. Much more than a stressful day and then bad nightmares and a feeling of regret the next day, but weirder. With cryonics, you might wake up to some effects that are roughly equivalent to "having taken a potion of youthful rejuvenation, and not having the same birthmarks, and also learning that you're separated-by-disjoint-subjective-deaths from LOTS of people you loved when you experienced your first natural death" for example. This is a MUCH BIGGER CHANGE than just having a nightmare and waking up with a change of heart (and most people don't have nightmares and changes of heart every night (at least: I don't and neither do most people I've asked)).

Remember, every improvement is a change, though not every change is an improvement. A good "epistemological practice" is sort of an idealized formal praxis for making yourself robust to "learning any true fact" and changing only in GOOD ways from such facts.

A good "axiological practice" (which I don't know of anyone working on except me (and I'm only doing it a tiny bit, not with my full mental budget)) is sort of an idealized formal praxis for making yourself robust to "humanely heartful emotional changes"(?) and changing only in <PROPERTY-NAME-TBD> ways from such events.

(Edited to add: Current best candidate name for this property is: "WISE" but maybe "healthy" works? (It depends on whether the Stoics or Nietzsche were "more objectively correct" maybe? The Stoics, after all, were erased and replaced by Platonism-For-The-Masses (AKA "Christianity") so if you think that "staying implemented in physics forever" is critically important then maybe "GRACEFUL" is the right word? (If someone says "vibe-alicious" or "flowful" or "active" or "strong" or "proud" (focusing on low latency unity achieved via subordination to simply and only power) then they are probably downstream of Heidegger and you should always be ready for them to change sides and submit to metaphorical Nazis, just as Heidegger subordinated himself to actual Nazis without really violating his philosophy at all.)))

I don't think that psychedelics fit neatly into EITHER category. Drugs in general are akin to wireheading, except wireheading is when something reaches into your brain to overload one or more of your positive-value-tracking-modules, (as a trivially semantically invalid shortcut to achieving positive value "out there" in the state-of-affairs that your tracking modules are trying to track) but actual humans have LOTS of <thing>-tracking-modules and culture and science barely have any RIGOROUS vocabulary for any of them.

Note that many of these neurological <thing>-tracking-modules were evolved.

Also, many of them will probably be "like hands" in terms of AI's ability to model them.

This is part of why AIs should be existentially terrifying to anyone who is spiritually adept.

AI that sees the full set of causal paths to modifying human minds will be "like psychedelic drugs with coherent persistent agendas". Humans have basically zero cognitive security systems. Almost all security systems are culturally mediated, and then (absent complex interventions) lots of the brain stuff freezes in place around the age of puberty, and then other stuff freezes around 25, and so on. This is why we protect children from even TALKING to untrusted adults: they are too plastic and not savvy enough. (A good heuristic for the lowest level of "infohazard" is "anything you wouldn't talk about in front of a six year old".)

Humans are sorta like a bunch of unpatchable computers, exposing "ports" to the "internet", where each of our port numbers is simply a lightly salted semantic hash of an address into some random memory location that stores everything, including our operating system.

Your word for "drugs" and my word for "drugs" don't point to the same memory addresses in the computers implementing our souls. Also our souls themselves don't even have the same nearby set of "documents" (because we just have different memories n'stuff)... but the word "drugs" is not just one of the ports... it is a port that deserves a LOT of security hardening.

The bible said ~"thou shalt not suffer a 'pharmakeia' to live" for REASONS.

William_S's Shortform

JenniferRM6d41

These are valid concerns! I presume that if "in the real timeline" there was a consortium of AGI CEOs who agreed to share costs on one run, and fiddled with their self-inserts, then they... would have coordinated more? (Or maybe they're trying to settle a bet on how the Singularity might counterfactually have happened in the event of this or that person experiencing this or that coincidence? But in that case I don't think the self inserts would be allowed to say they're self inserts.)

Like why not re-roll the PRNG, to censor out the counterfactually simulable timelines that included me hearing from any of the REAL "self inserts of the consortium of AGI CEOs" (and so I only hear from "metaphysically spurious" CEOs)??

Or maybe the game engine itself would have contacted me somehow to ask me to "stop sticking causal quines in their simulation" and somehow I would have been induced by such contact to not publish this?

Mostly I presume AGAINST "coordinated AGI CEO stuff in the real timeline" along any of these lines because, as a type, they often "don't play well with others". Fucking oligarchs... maaaaaan.

It seems like a pretty normal thing, to me, for a person to naturally keep track of simulation concerns as a philosophic possibility (it's kinda basic "high school theology" right?)... which might become one's "one track reality narrative" as a sort of "stress induced psychotic break away from a properly metaphysically agnostic mental posture"?

That's my current working psychological hypothesis, basically.

But to the degree that it happens more and more, I can't entirely shake the feeling that my probability distribution over "the time T of a pivotal act occurring" (distinct from when I anticipate I'll learn that it happened, which of course must be LATER than both T and now) shouldn't just include times in the past, but should actually be a distribution over complex numbers or something...

...but I don't even know how to do that math? At best I can sorta see how to fit it into exotic grammars where it "can have happened counterfactually" or so that it "will have counterfactually happened in a way that caused this factually possible recurrence" or whatever. Fucking "plausible SUBJECTIVE time travel", fucking shit up. It is so annoying.

Like... maybe every damn crazy AGI CEO's claims are all true except the ones that are mathematically false?

How the hell should I know? I haven't seen any not-plausibly-deniable miracles yet. (And all of the miracle reports I've heard were things I was pretty sure the Amazing Randi could have duplicated.)

All of this is to say, Hume hasn't fully betrayed me yet!

Mostly I'll hold off on performing normal updates until I see for myself, and hold off on performing logical updates until (again!) I see a valid proof for myself <3

William_S's Shortform

JenniferRM6d84

For most of my comments, I'd almost be offended if I didn't say something surprising enough to get a "high interestingness, low agreement" voting response. Excluding speech acts, why even say things if your interlocutor or full audience can predict what you'll say?

And I usually don't offer full clean proofs in direct words. Anyone still pondering the text at the end, properly, shouldn't "vote to agree", right? So from my perspective... it's fine and sorta even working as intended <3

However, also, this is currently the top-voted response to me, and if William_S himself reads it I hope he answers here, if not with text then (hopefully? even better?) with a link to a response elsewhere?

((EDIT: Re-reading everything above this point, I notice that I totally left out the "basic take" that might go roughly like "Kurzweil, Altman, and Zuckerberg are right about compute hardware (not software or philosophy) being central, and there's a compute bottleneck rather than a compute overhang, so the speed of history will KEEP being about datacenter budgets and chip designs, and those happen on 6-to-18-month OODA loops that could actually fluctuate based on economic decisions, and therefore it's maybe 2026, or 2028, or 2030, or even 2032 before things pop, depending on how and when billionaires and governments decide to spend money".))

Pulling honest posteriors from people who've "seen things we wouldn't believe" gives excellent material for trying to perform aumancy... work backwards from their posteriors to possible observations, and then forwards again, toward what might actually be true :-)

Thoughts on seed oil

JenniferRM6d20

I look forward to your reply!

(And regarding "food cost psychology" this is an area where I think Neo Stoic objectivity is helpful. Rich people can pick up a lot of hedons just from noticing how good their food is, and formerly poor people have a valuable opportunity to re-calibrate. There are large differences in diet between socio-economic classes still, and until all such differences are expressions of voluntary preference, and "dietary price sensitivity has basically evaporated", I won't consider the world to be post-scarcity. Each time I eat steak, I can't help but remember being asked about this at Summer Camp as a little kid, after someone asked "if my family was rich" and I didn't know... like the very first "objective calibrating response" accessible to us as children was the rate of my family's steak consumption. Having grown up in some amount of poverty, I often see "newly rich people" eating as if their health is not the price of slightly more expensive food, or their health is "not worth avoiding the terrible terrible sin of throwing food in the garbage (which my aunt who lived through the Great Depression in Germany yelled at me, once, with great feeling, for doing, when I was a child and had eaten less than ALL the birthday cake that had been put on my plate)". Cultural norms around food are fascinating and, in my opinion, are often rewarding to think about.)

William_S's Shortform

JenniferRM7d163

What are your timelines like? How long do YOU think we have left?

I know several CEOs of small AGI startups who seem to have gone crazy and told me that they are self inserts into this world, which is a simulation of their original self's creation. However, none of them talk about each other, and presumably at most one of them can be meaningfully right?

One AGI CEO hasn't gone THAT crazy (yet), but is quite sure that the November 2024 election will be meaningless because pivotal acts will have already occurred that make nation state elections visibly pointless.

Also I know many normies who can't really think probabilistically and mostly aren't worried at all about any of this... but one normy who can calculate is pretty sure that we have AT LEAST 12 years (possibly because his retirement plans won't be finalized until then). He also thinks that even systems as "mere" as TikTok will be banned before the November 2024 election because "elites aren't stupid".

I think I'm more likely to be better calibrated than any of these opinions, because most of them don't seem to focus very much on "hedging" or "thoughtful doubting", whereas my event space assigns non-zero probability to ensembles that contain such features of possible futures (including these specific scenarios).

Were there any ancient rationalists?

Answer by JenniferRMMay 03, 2024273

It was a time before LSTMs or Transformers, a time before Pearlian Causal Graphs, a time before computers.

Indeed, it was even a time before Frege or Bayes. It was a time and place where even Arabic numerals had not yet memetically infected the minds of people to grant them the powers of swift and easy mental arithmetic, and where non-syllabic alphabets (with distinct consonants and vowels) were still kinda new...

...in that time, someone managed to get credit for inventing the formalization of the syllogism! And he had a whole school for people to get naked and talk philosophy with each other. And he took the raw material of a simple human boy, and programmed that child into a world conquering machine whose great act of horror was to sack Thebes. (It is remarkable how many philosophers are "causally upstream, though a step or two removed" from giant piles of skulls. Hopefully, the "violent tragedy part" can be avoided this time around.)

Libertinism, logic, politics, and hypergraphia were his tools. His name was Aristotle. (Weirdly, way more people name their own children after the person-shaped-machine who was programmed to conquer the world, rather than the person-shaped programmer. All those Alexes and Alexandras, and only a very few Aristotles.)

Ironing Out the Squiggles

JenniferRM8d90

I appreciate this response because it stirred up a lot of possible responses, in me, in lots of different directions, that all somehow seem germane to the core goal of securing a Win Condition for the sapient metacivilization of earth! <3

(A) Physical reality is probably hyper-computational, but also probably amenable to pulling a nearly infinite stack of "big salient features" from a reductively analyzable real world situation.

My intuition says that this STOPS being "relevant to human interests" (except for modern material engineering and material prosperity and so on) roughly below the level of "the cell".

Other physics with other biochemistry could exist, and I don't think any human would "really care"?

Suppose a Benevolent SAI had already replaced all of our cells with nanobots without our permission AND without us noticing because it wanted to have "backups" or something like that...

(The AI in TMOPI does this much less elegantly, because everything in that story is full of hacks and stupidity. The overall fact that "everything is full of hacks and stupidity" is basically one of the themes of that novel.)

Contingent on a Benevolent SAI having thought it had good reason to do such a thing, I don't think that once we fully understand the argument in favor of doing it that we would really have much basis for objecting?

But I don't know for sure, one way or the other...

((To be clear, in this hypothetical, I think I'd volunteer to accept the extra risk to be one of the last who was "Saved" this way, and I'd volunteer to keep the secret, and help in a QA loop of grounded human perceptual feedback, to see if some subtle spark of magical-somethingness had been lost in everyone transformed this way? Like... like hypothetically "quantum consciousness" might be a real thing, and maybe people switched over to running atop "greygoo" instead of our default "pinkgoo" changes how "quantum consciousness" works and so the changeover would non-obviously involve a huge cognitive holocaust of sorts? But maybe not! Experiments might be called for... and they might need informed consent? ...and I think I'd probably consent to be in "the control group that is unblinded as part of the later stages of the testing process" but I would have a LOT of questions before I gave consent to something Big And Smart that respected "my puny human capacity to even be informed, and 'consent' in some limited and animal-like way".))

What I'm saying is: I think maybe NORMAL human values (amongst people with default mental patterns rather than weirdo autists who try to actually be philosophically coherent and ended up with utility functions that have coherently and intentionally unbounded upsides) might well be finite, and a rule for granting normal humans a perceptually indistinguishable version of "heaven" might be quite OK to approximate with "a mere few billion well chosen if/then statements".

To be clear, the above is a response to this bit:

As such, I think the linear separability comes from the power of the "lol stack more layers" approach, not from some intrinsic simple structure of the underlying data. As such, I don't expect very much success for approaches that look like "let's try to come up with a small set of if/else statements that cleave the categories at the joints instead of inelegantly piling learned heuristics on top of each other".

And:

I don't think that such a model would succeed because it "cleaves reality at the joints" though, I expect it would succeed because you've managed to find a way that "better than chance" is good enough and you don't need to make arbitrarily good predictions.

Basically, I think "good enough" might be "good enough" for persons with finite utility functions?

(B) A completely OTHER response here is that you should probably take care to NOT aim for something that is literally mathematically impossible...

Unless this is part of some clever long term cognitive strategy, where you try to prove one crazy extreme, and then its negation, back and forth, as a sort of "personally implemented GAN research process" (and even then?!)...

...you should probably not spend much time trying to "prove that 1+1=5" nor try to "prove that the Halting Problem actually has a solution". Personally, any time I reduce a given plan to "oh, this is just the Halting Problem again" I tend to abandon that line of work.

Perfectly fine if you're a venture capitalist, not so great if you're seeking adversarial robustness.

Past a certain point, one can simply never be adversarially robust in a programmatic and symbolically expressible way.

Humans would have to have non-Turing-Complete souls, and so would any hypothetical Corrigible Robot Saint/Slaves, in order to literally 100% prove that literally infinite computational power won't find a way to make things horrible.

There is no such thing as a finitely expressible "Halt if Evil" algorithm...

...unless (I think?) all "agents" involved are definitely not Turing Complete and have no emotional attachments to any questions whose answers partake of the challenges of working with Turing Complete systems? And maybe someone other than me is somehow smart enough to write a model of "all the physics we care about" and "human souls" and "the AI" all in some dependently typed language that will only compile if the compiler can generate and verify a "proof that each program, and ALL programs interacting with each other, halt on all possible inputs"?

My hunch is that that effort will fail, over and over, forever, but I don't have a good clean proof that it will fail.

Note that I'm pretty sure A and B are incompatible takes.

In "take A" I'm working from human subjectivity "down towards physics (through a vast stack of sociology and biology and so on)" and it just kinda seems like physics is safe to throw away because human souls and our humanistically normal concerns are probably mostly pretty "computationally paltry" and merely about securing food, and safety, and having OK romantic lives?

In "take B" I'm starting with the material that mathematicians care about, and noticing that it means the project is doomed if the requirement is to have a mathematical proof about all mathematically expressible cares or concerns.

It would be... kinda funny, maybe, to end up believing "we can secure a Win Condition for the Normies (because take A is basically true), but True Mathematicians are doomed-and-blessed-at-the-same-time to eternal recursive yearning and Real Risk (because take B is also basically true)" <3

(C) Chaos is a thing! Even (and especially) in big equations, including the equations of mind that big stacks of adversarially optimized matrices represent!

This isn't a "logically deep" point. I'm just vibing with your picture where you imagine that the "turbulent looking" thing is a metaphor for reality.

In observable practice, the boundary conditions of the equations of AI also look like fractally beautiful turbulence!

I predict that you will be surprised by this empirical result. Here is the "high church papering" of the result:

TITLE: The boundary of neural network trainability is fractal

Abstract: Some fractals -- for instance those associated with the Mandelbrot and quadratic Julia sets -- are computed by iterating a function, and identifying the boundary between hyperparameters for which the resulting series diverges or remains bounded. Neural network training similarly involves iterating an update function (e.g. repeated steps of gradient descent), can result in convergent or divergent behavior, and can be extremely sensitive to small changes in hyperparameters. Motivated by these similarities, we experimentally examine the boundary between neural network hyperparameters that lead to stable and divergent training. We find that this boundary is fractal over more than ten decades of scale in all tested configurations.
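To make the abstract's "iterate an update function and ask whether it diverges or stays bounded" idea concrete, here is a minimal toy sketch. It is NOT the paper's experimental setup: the two-weight model, the single training example, the fixed initialization, the divergence bound, and the names `stays_bounded` and `scan` are all assumptions made up for illustration, and nothing here claims that this toy's boundary is fractal.

```python
import math

def stays_bounded(lr1, lr2, steps=200, bound=1e6):
    """Run gradient descent on a toy model f(x) = w2 * w1 * x with one
    training example (x=1, y=1) and a separate learning rate per weight.
    Returns True if the loss stayed finite and under `bound` at every step."""
    w1, w2 = 0.5, 0.5  # hypothetical fixed initialization
    for _ in range(steps):
        err = w1 * w2 - 1.0
        loss = 0.5 * err * err
        if not math.isfinite(loss) or loss > bound:
            return False  # treat as divergent training
        g1, g2 = err * w2, err * w1  # gradients of the loss w.r.t. w1, w2
        w1, w2 = w1 - lr1 * g1, w2 - lr2 * g2
    return True

def scan(lo=0.0, hi=4.0, n=40):
    """ASCII map over an n-by-n grid of (lr1, lr2): '#' bounded, '.' divergent."""
    rows = []
    for i in range(n):
        lr2 = lo + (hi - lo) * i / (n - 1)
        row = "".join(
            "#" if stays_bounded(lo + (hi - lo) * j / (n - 1), lr2) else "."
            for j in range(n)
        )
        rows.append(row)
    return rows

if __name__ == "__main__":
    print("\n".join(scan()))
```

Zooming `scan` into a narrow window around the edge between `#` and `.` regions is the kind of experiment the abstract describes, just at toy scale.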

Also, if you want to deep dive on some "half-assed peer review of this work", Hacker News chatted with itself about this paper at length.

EDITED TO ADD: You respond "Lots of food for thought here, I've got some responses brewing but it might be a little bit" and I am happy to wait. Quality over speed is probably maybe still sorta correct. Timelines are compressing, but not so much that minutes matter... yet?

Ironing Out the Squiggles

JenniferRM10d20

I actually kind of expect this.

Basically, I think that we should expect a lot of SGD results to result in weights that do serial processing on inputs, refining and reshaping the content into twisted and rotated and stretched high dimensional spaces SUCH THAT those spaces enable simple cutoff-based reasoning to "kinda really just work".

Like the prototypical business plan needs to explain "enough" how something is made (cheaply) and then explain "enough" how it will be sold (for more money) over time with improvements in the process (according to some growth rate?) with leftover money going back to investors (with corporate governance hewing to known-robust patterns for enabling the excess to be redirected to early investors rather than to managers who did a corruption coup, or a union of workers that isn't interested in sharing with investors and would plausibly decide to play the dictator game at the end in an unfair way, or whatever). So if the "governance", "growth rate", "cost", and "sales" dimensions go into certain regions of the parameter space, each one could strongly contribute to a "don't invest" signal, but if they are all in the green zone then you invest... and that's that?
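As a rough sketch of what that final "cutoff based" step might look like once the hard work of stretching the space is done: the scores, the cutoff values, and the names `GREEN_ZONE`, `should_invest`, and `vetoes` below are all hypothetical, and the sketch simply assumes each dimension has already been mapped to a number between 0 and 1 by some upstream process.

```python
# Hypothetical cutoffs: each dimension is assumed to be pre-scored in [0, 1].
GREEN_ZONE = {"governance": 0.6, "growth_rate": 0.5, "cost": 0.5, "sales": 0.5}

def should_invest(scores, cutoffs=GREEN_ZONE):
    """Invest only if every dimension clears its cutoff; any miss is a veto."""
    return all(scores.get(name, 0.0) >= cutoff for name, cutoff in cutoffs.items())

def vetoes(scores, cutoffs=GREEN_ZONE):
    """List the dimensions contributing a "don't invest" signal."""
    return sorted(
        name for name, cutoff in cutoffs.items() if scores.get(name, 0.0) < cutoff
    )

plan = {"governance": 0.2, "growth_rate": 0.9, "cost": 0.8, "sales": 0.7}
print(should_invest(plan), vetoes(plan))
```

The gate itself is trivial on purpose; the claim in the text is about whether learned representations can make a gate this simple good enough.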

If, after reading this, you still disagree, I wonder if it is more because (1) you don't think that SGD can find space stretching algorithms with that much semantic flexibility or because (2) you don't think any list of less than 20 concepts like this could be found whose thresholds could properly act as gates on an algorithm for making prudent startup investment decisions... or is it something totally else you don't buy (and if so, what)?
