All of Kenoubi's Comments + Replies

I think this comment demonstrates that the list of reacts should wrap, not extend arbitrarily far to the right.

The obvious way to quickly and intuitively illustrate whether reactions are positive or negative would seem to be color; another option would be grouping them horizontally or vertically with some kind of separator. The obvious way to quickly and intuitively make it visible which reactions were had by more readers would seem to be showing a copy of the same icon for each person who reacted a certain way, not a number next to the icon.

I make no claim that either of these changes would be improvements overall. Clearly the second would require a way to handl... (read more)

In the current UI, the list of reactions from which to choose is scrollable, but that's basically impossible to actually see. While reading the comments I was wondering what the heck people were talking about with "Strawman" and so forth. (Like... did that already get removed?) Then I discovered the scrolling by accident after seeing a "Shrug" reaction to one of the comments.

I've had similar thoughts. Two counterpoints:

  • This is basically misuse risk, which is not a weird problem that people need to be convinced even needs solving. To the extent AI appears likely to be powerful, society at large is already working on this. Of course, its efforts may be ineffective or even counterproductive.

  • They say power corrupts, but I'd say power opens up space to do what you were already inclined to do without constraints. Some billionaires, e.g. Bill Gates, seem to be sincerely trying to use their resources to help people. It isn't har

... (read more)

On SBF, I think a large part of the issue is that he was working in an industry, cryptocurrency, that basically has fraud as its bedrock. There was nothing real about crypto, so the collapse of FTX was basically inevitable.

I don't deny that the cryptocurrency "industry" has been a huge magnet for fraud, nor that there are structural reasons for that, but "there was nothing real about crypto" is plainly false. The desire to have currencies that can't easily be controlled, manipulated, or implicitly taxed (seigniorage, inflation) by gove... (read more)

More specifically, the issue with crypto is that the benefits are much less than promised, and there are a whole lot of bullshit claims about crypto, like it being secure or not manipulable. As one example of why cryptocurrencies fail as a currency: a fixed supply and no central entity mean the value of the currency swings wildly, which is a dealbreaker for any currency. Note that this is just one of the many, fractal problems with crypto. Crypto isn't all fraud. There's reality, but it's built on unsound foundations and trying to sell a fake castle to others.

Thank you for writing these! They've been practically my only source of "news" for most of the time you've been writing them, and before that I mostly just ignored "news" entirely because I found it too toxic and it was too difficult+distasteful to attempt to decode it into something useful. COVID the disease hasn't directly had a huge effect on my life, and COVID the social phenomenon has been on a significant decline for some time now, but your writing about it (and the inclusion of especially notable non-COVID topics) has easily kept me interested enou... (read more)

2 Adam Zerner 3mo
I disagree with this part. It might be somewhat valuable, but I think Zvi's talents would be significantly better applied elsewhere.

I found it to be a pretty obvious reference to the title. SPAM is a meatcube. A meatcube is something that has been processed into uniformity. Any detectable character it had, whether faults, individuality, or flashes of brilliance, has been ground, blended, and seasoned away.

I don't know how far a model trained explicitly on only terminal output could go, but it makes sense that it might be a lot farther than a model trained on all the text on the internet (some small fraction of which happens to be terminal output). Although I also would have thought GPT's architecture, with a fixed context window and a fixed number of layers and tokenization that isn't at all optimized for the task, would pay large efficiency penalties at terminal emulation and would be far less impressive at it than it is at other tasks.

Assuming it does work, could we get a self-operating terminal by training another GPT to roleplay the entering commands part? Probably. I'm not sure we should though...

Sure, I understood that's what was being claimed. Roleplaying a Linux VM without error seemed extremely demanding relative to other things I knew LLMs could do, such that it was hard for me not to question whether the whole thing was just made up.

Thanks! This is much more what I expected. Things that look generally like outputs that commands might produce, and with some mind-blowing correct outputs (e.g. the effect of tr on the source code) but also some wrong outputs (e.g. the section after echo A >a; echo X >b; echo T >c; echo H >d; the output being consistent between cat a a c b d d and cat a a c b d d | sort (but inconsistent with the "actual contents" of the files) is especially the kind of error I'd expect an LLM to make).

Done! Thanks for updating me toward this. :P

Got it. This post also doesn't appear to actually be part of that sequence though? I would have noticed if it was and looked at the sequence page.

EDIT: Oh, I guess it's not your sequence.

EDIT2: If you just included "Alignment Stream of Thought" as part of the link text in your intro where you do already link to the sequence, that would work.

Yeah, I thought of holding off actually creating a sequence until I had two posts like this. This updates me toward creating one now being beneficial, so I'm going to do that.


What do you mean by this acronym?  I'm not aware of its being in use on LW, you don't define it, and to me it very definitely (capitalization and all) means Armin van Buuren's weekly radio show A State of Trance.

Alignment Stream of Thought. Sorry, should've made that clearer - I couldn't think of a natural place to define it.

Counterpoint #2a: A misaligned AGI whose capabilities are high enough to use our safety plans against us will succeed with an equal probability (e.g., close to 100%), if necessary by accessing these plans whether or not they were posted to the Internet.

If only relative frequency of genes matters, then the overall size of the gene pool doesn't matter. If the overall size of the gene pool doesn't matter, then it doesn't matter if that size is zero. If the size of the gene pool is zero, then whatever was included in that gene pool is extinct.

Yes, it's true people make all kinds of incorrect inferences because they think genes that increase the size of the gene pool will be selected for or those that decrease it will be selected against. But it's still also true that a gene that reduces the size of the po... (read more)

True, but it's very nearly entirely the process that only cares about relative frequencies that constructs complex mechanisms such as brains.

I mean, just lag, yes, but there's also plain old incorrect readings. But yes, it would be cool to have a system that incorporated glucagon. Though, diabetics' bodies still produce glucagon AFAIK, so it'd really be better to just have something that senses glucose and releases insulin the same way a working pancreas would.

Context: I am a type 1 diabetic. I have a CGM, but for various reasons use multiple daily injections rather than an insulin pump; however, I'm familiar with how insulin pumps work.

A major problem with a closed-loop CGM-pump system is data quality from the CGM. My CGM (Dexcom G6) has ~15 minutes of lag (because it reads interstitial fluid, not blood). This is the first generation of Dexcom that doesn't require calibrations from fingersticks, but I've occasionally had CGM readings that felt way off and needed to calibrate anyway. Accuracy and noisiness v... (read more)

It's been years since I've talked to anyone working on this technology, but IIRC one of the issues was that in principle you could prevent the lag from leading to bad data that kills you if the pump could also provide glucagon, but there was no way to make glucagon shelf-stable enough to have in a pump. Apparently that changed as of 2019/2020, or is in the process of changing, so maybe someone will make a pump with both.

We'll build the most powerful AI we think we can control. Nothing prevents us from ever getting that wrong. If building one car with brakes that don't work made everyone in the world die in a traffic accident, everyone in the world would be dead.

So how did we get from narrow AI to super powerful AI? Foom? But we can build narrow AIs that don't foom, because we have. We should be able to build narrow AIs that don't foom by not including anything that would allow them to recursively self-improve [*]. EY's answer to the question "why isn't narrow AI safe" wasn't "narrow AI will foom", it was "we won't be motivated to keep AIs narrow". [*] not that we could tell them how to self-improve, since we don't really understand it ourselves.

There's also the problem of an AGI consistently exhibiting aligned behavior due to low risk tolerance, until it stops doing that (for all sorts of unanticipated reasons).

This is especially compounded by the current paradigm of brute forcing randomly generated neural networks, since the resulting systems are fundamentally unpredictable and unexplainable.

How much did that setup cost? I'm curious about similar use cases.

2 Radford Neal 1y
I bought my system in February 2021 for $3400 Canadian dollars (plus tax).  It had the 12-core Threadripper Pro 3945WX (the low-end option), and 32 GBytes of ECC RAM (two DIMMs), plus a NVIDIA P620 GPU (which I replaced with other GPUs), and a 1TB HDD.  I added six more DIMMs (bought second-hand on ebay, for about $100 per DIMM, be careful to get the right kind!) to get 128 GBytes in eight channels, as well as additional SSDs and HDDs.  The prices of everything may be different now.  An A4000 GPU can now be obtained for about $1400 Canadian dollars, but mine were more expensive when I bought them before the crypto crash.  An A4500 GPU has better cooling (and is a bit more powerful), but takes two slots and costs more.

The best way to actually schedule or predict a project is to break it down into as many small component tasks as possible, identify dependencies between those tasks, produce most likely, optimistic, and pessimistic estimates for each task, and then run a simulation over the chain of dependencies to see what the expected project completion looks like. Use a Gantt chart. This is a boring answer because it's the "learn project management" answer, and people will hate on it because (gestures vaguely at all of the projects that overrun their schedule). There are m

... (read more)
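The procedure described above (three-point estimates per task, then a simulation over the dependency chain) can be sketched roughly as follows. This is only an illustration under stated assumptions: the task names, the day counts, and the choice of a triangular distribution are all made up here and do not come from the comment.

```python
import random

# Hypothetical tasks: (optimistic, most_likely, pessimistic) durations in days,
# plus the tasks each one depends on. All names and numbers are illustrative.
TASKS = {
    "design":  {"estimate": (2, 4, 10), "deps": []},
    "build":   {"estimate": (5, 8, 20), "deps": ["design"]},
    "docs":    {"estimate": (1, 2, 5),  "deps": ["design"]},
    "release": {"estimate": (1, 1, 3),  "deps": ["build", "docs"]},
}


def simulate_once(tasks, rng):
    """Sample one duration per task and return the project finish time."""
    finish = {}

    def finish_time(name):
        if name not in finish:
            low, mode, high = tasks[name]["estimate"]
            # Assumption: a triangular distribution over the three estimates.
            duration = rng.triangular(low, high, mode)
            # A task starts once every dependency has finished.
            start = max((finish_time(d) for d in tasks[name]["deps"]), default=0.0)
            finish[name] = start + duration
        return finish[name]

    return max(finish_time(name) for name in tasks)


def simulate(tasks, runs=10_000, seed=0):
    """Return the sorted finish times of many simulated projects."""
    rng = random.Random(seed)
    return sorted(simulate_once(tasks, rng) for _ in range(runs))


def percentile(sorted_values, p):
    """Value at fraction p (0..1) of an already sorted list."""
    index = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[index]


results = simulate(TASKS)
print("median finish:", round(percentile(results, 0.5), 1), "days")
print("90th percentile:", round(percentile(results, 0.9), 1), "days")
```

In this toy project the most-likely durations along the longest path (design, build, release) add up to 13 days, but because each estimate is skewed toward its pessimistic end, the simulated median lands later than that. That gap is the kind of thing the simulation is meant to surface.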

In other words, asking people for a best guess or an optimistic prediction results in a biased prediction that is almost always earlier than a real delivery date. On the other hand, while the pessimistic question is not more accurate (it has the same absolute error margins), it is unbiased. The reality is that the study says that people asked for a pessimistic question were equally likely to over-estimate their deadline as they were to under-estimate it. If you don't think a question that gives you a distribution centered on the right answer is useful, I'

... (read more)

I have a sense that this is a disagreement about how to decide what words "really" mean, and I have a sense that I disagree with you about how to do that.

I had already (weeks ago) approvingly cited and requested for my wife and my best friend to read that particular post, which I think puts it at 99.5th percentile or higher of LW posts in terms of my wanting its message to be understood and taken to heart, so I think I disagree with this comment about as strongly as is possib... (read more)

It didn't work for the students in the study in the OP. That's literally why the OP mentioned it!

It depends on what you mean by "didn't work". The study described is published in a paper only 16 pages long. We can just read it:,_1994.pdf [,_1994.pdf] First, consider the question of, "are these predictions totally useless?" This is an important question because I stand by my claim that the answer of "never" is actually totally useless due to how trivial it is. Yep. Matches my experience. We know that only 11% of students met their optimistic targets, and only 30% of students met their "best guess" targets. What about the pessimistic target? It turns out, 50% of the students did finish by that target. That's not just a quirk, because it's actually related to the distribution itself. In other words, asking people for a best guess or an optimistic prediction results in a biased prediction that is almost always earlier than a real delivery date. On the other hand, while the pessimistic question is not more accurate (it has the same absolute error margins), it is unbiased. The reality is that the study says that people asked for a pessimistic question were equally likely to over-estimate their deadline as they were to under-estimate it. If you don't think a question that gives you a distribution centered on the right answer is useful, I'm not sure what to tell you. The paper actually did a number of experiments. That was just the first. In the third experiment, the study tried to understand what people are thinking about when estimating. This seems relevant considering that the idea of premortems or "worst case" questioning is to elicit impediments, and the project managers / engineering leads doing that questioning are intending to hear about impediments and will continue their questioning until they've been satisfied that the group is actually discussing that.  In the fourt

You're right - "you failed, what happened" does create a mental frame that "what could go wrong" does not. I don't think "how long could it take if everything goes as poorly as possible" creates any more useful of a frame than "you failed, what happened". But it does, formally, request a number. I don't think that number, itself, is good for anything. I'm not even convinced asking for that number is very effective for eliciting the "you failed, what happened" mindset. I definitely don't think it's more effective for that than just asking directly "you failed, what happened".

Given the context, I imagine what they were doing is making up a number that was bigger than another number they'd just made up. Humans are cognitive misers. A student would correctly guess that it doesn't really matter if they get this question right and not try very hard. That's actually what I would do in a context where it was clear that a numeric answer was required, I was expected to spend little time answering, and I was motivated not to leave that particular question blank.

My answer of "never" also took little thought (for me). I thought a bit ... (read more)

Yes, given that question, IMO they should have answered "never". 55.5 days isn't the true answer, because in reality everything didn't go as poorly as possible. You're right, it's a bad question that a brick wall would do a better job of answering correctly than a human who's trying to be helpful.

The answer to your question is useful, but not because of the number. "What could go wrong to make this take longer than expected?" would elicit the same useful information without spuriously forcing a meaningless number to be produced.

5 [DEACTIVATED] Duncan Sabien 1y
I have a sense that this is a disagreement about how to decide what words "really" mean, and I have a sense that I disagree with you about how to do that. It is false that that question would elicit the same useful information. Quoting from something I previously wrote elsewhere:

"Assume it gets done but the process is super shitty. How long will it take?"

I would, in fact, consider that interpretation to be unambiguously "not understanding what they were being asked", given the question in the post. Not understanding what is being asked is something that happens a fair bit.

I'll give you that if they had asked "assuming it gets done but everything goes as poorly as possible, how long did it take?", it takes a bit of a strange mind to look for some weird scenario that strings things along for years, or centuries, or eons. But "... (read more)

If we look at the student answers, they were off by ~7 days, or about a 14% error from the actual completion time. The only way I can interpret your post is that you're suggesting all of these students should have answered "never". How far off is "never" from the true answer of 55.5 days? It's about infinitely far off. It is an infinitely wrong answer. Even if a project ran 1000% over every worst-case pessimistic schedule, any finite prediction was still infinitely closer than "never". That's because "infinitely long" is a trivial answer for any task that isn't literally impossible.[1] It provides 0 information and takes 0 computational effort. It might as well be the answer from a non-entity, like asking a brick wall how long the thesis could take to complete. Question: How long can it take to do X? Brick wall: Forever. Just go do not-X instead. It is much more difficult to give an answer for how long a task can take assuming it gets done while anticipating and predicting failure modes that would cause the schedule to explode, and that same answer is actually useful since you can now take preemptive actions to avoid those failure modes -- which is the whole point of estimating and scheduling as a logical exercise.  The actual conversation that happens during planning is A: "What's the worst case for this task?" B: "6 months." A: "Why?" B: "We don't have enough supplies to get past 3 trial runs, so if any one of them is a failure, the lead time on new materials with our current vendor is 5 months." A: "Can we source a new vendor?"  B: "No, but... <some other idea>" 1. ^ In cases when something is literally impossible, instead of saying "infinitely long", or "never", it's more useful to say "that task is not possible" and then explain why. Communication isn't about finding the "haha, gotcha" answer to a question when asked.
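As a quick check on the size of that error, the two averages reported for the study elsewhere in this thread (48.6 days predicted, 55.5 days actual) can be plugged in directly. This is just a sketch of the arithmetic; the variable names are mine. The figure comes out near 12% when measured against the actual completion time and near 14% when measured against the prediction.

```python
predicted_days = 48.6  # average "everything goes as poorly as possible" prediction
actual_days = 55.5     # average actual completion time

shortfall = actual_days - predicted_days
error_vs_actual = shortfall / actual_days        # error as a share of the true value
error_vs_predicted = shortfall / predicted_days  # error as a share of the estimate

print(f"shortfall: {shortfall:.1f} days")
print(f"relative to actual:    {error_vs_actual:.1%}")
print(f"relative to predicted: {error_vs_predicted:.1%}")
```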

I would, in fact, consider that interpretation to be unambiguously "not understanding what they were being asked", given the question in the post.

Two points:

  1. "Not understanding what they were being asked" isn't an explanation. This is a common (really, close-to-universal) gap in people's attempts to understand one another. If these students didn't "understand" (noting the ambiguity about what exactly that means), what were they doing instead? "Being stupid"? "Not thinking"? "Falling prey to biases"? None of this tells you what they were doing.
  2. I think what
... (read more)

in a classic experiment, 37 psychology students were asked to estimate how long it would take them to finish their senior theses “if everything went as poorly as it possibly could,” and they still underestimated the time it would take, as a group (the average prediction was 48.6 days, and the average actual completion time was 55.5 days).

That's nuts. Does anyone really think that "if everything went as poorly as it possibly could" that the thesis would ever get done at all? It's so bizarre it makes me question whether the students actually understood what they were being asked.

IME the sense that this is nuts seems to be a quirk of STEM thinking. In practice, most non-rationalists seem to interpret "How long will this take if everything goes as poorly as possible?" as something like "Assume it gets done but the process is super shitty. How long will it take?"

It's a quirk of rationalist culture (and a few others — I've seen this from physicists too) to take the words literally and propose that "infinitely long" is a plausible answer, and be baffled as to how anyone could think otherwise.

Many smart political science and English majors don't seem to go down that line of reasoning, for instance.

Thermodynamic? Thermodynamics seems to be about using a small number of summary statistics (temperature, pressure, density, etc.) because the microstructure of the system isn't necessary to compute what will happen at the macro level.

"Building an actual aligned AI, of course, would be a pivotal act." What would an aligned AI do that would prevent anybody from ever building an unaligned AI?

My guess is that it would implement universal surveillance and intervene, when necessary, to directly stop people from doing just that. Sorry, I should've been clearer that I was talking about an aligned superintelligent AI. Since unaligned AI killing everyone seems pretty obviously extremely bad according to the vast majority of humans' preferences, preventing that would be a very high priority for any sufficiently powerful aligned AI.

Thanks, that really clarifies things. Frankly I’m not on board with any plan to “save the world” that calls for developing AGI in order to implement universal surveillance or otherwise take over the world. Global totalitarianism dictated by a small group of all-powerful individuals is just so terrible in expectation that I’d want to take my chances on other paths to AI safety. I’m surprised that these kinds of pivotal acts are not more openly debated as a source of s-risk and x-risk. Publish your plans, open yourselves to critique, and perhaps you’ll revise your goals. If not, you’ll still be in a position to follow your original plan. Better yet, you might convince the eventual decision makers of it.

I can't see how "publishing papers with alignment techniques" or "encouraging safe development with industry groups and policy standards" could be pivotal acts. To prevent anyone from building unaligned AI, building an unaligned AI in your garage needs to be prevented. That requires preventing people who don't read the alignment papers or policy standards and aren't members of the industry groups from building unaligned AI.

That, in turn, appears to me to require at least one of 1) limiting access to computation resources from your garage, 2) limiting kno... (read more)

"Building an actual aligned AI, of course, would be a pivotal act." What would an aligned AI do that would prevent anybody from ever building an unaligned AI? I mostly agree with what you wrote. Preventing all unaligned AIs forever seems very difficult and cannot be guaranteed by soft influence and governance methods. These would only achieve a lower degree of reliability, perhaps constraining governments and corporations via access to compute and critical algorithms but remaining susceptible to bad actors who find loopholes in the system. I guess what I'm poking at is, does everyone here believe that the only way to prevent AI catastrophe is through power-grab pivotal acts that are way outside the Overton Window, e.g. burning all GPUs? 

I’m going to predict a somewhat faster rise this week because I doubt the Midwest drop will get sustained.

The deaths number going up this much shows that my prediction the previous week was indeed far too high, despite this coming in substantially higher than my median guess, confirming that last week was a cross between slower real growth than expected and the Easter holiday. This week had a huge jump in the South region.

These pieces of text don't seem to line up with their associated charts and graphs? I notice I am confused.

Wow, the air conditioner systematically sucking the cold air it's generated back into the intake sort of seems like another problem with this design. (Possibly the same problem in another guise, thermodynamically, but in any case, different in terms of actual produced experience.)

I am also not quite clear why North Korea destroying the world would be so much worse than DeepMind doing it.

I think the argument about this part would be that DeepMind is much more likely (which is not to say "likely" on an absolute scale) to at least take alignment seriously enough to build (or even use!) interpretability tools and maybe revise their plans if the tools show the AGI plotting to kill everyone. So by the time DeepMind is actually deploying an AGI (even including accidental "deployments" due to foom during testing), it's less likely to b... (read more)

2 Donald Hobson 1y
Imagine you are in charge of choosing how fast DeepMind develops tech. Go too fast and you have a smaller chance of alignment. Go too slow and North Korea may beat you. There isn't much reason to go significantly faster than North Korea in this scenario. If you can go a bit faster and still make something probably aligned, do that. In a worse situation, taking your time and hoping for a drone strike on North Korea is probably the best bet. Coordinating on a fuzzy boundary no one can define or measure is really hard. If coordination happens, it will be to avoid something simple, like any project using more than X compute. I don't think Conceptually close to AGI = Profitable. There is simple, dumb money-making code. And there is code that contains all the ideas for AGI, but is missing one tiny piece, and so is useless.

I didn't look at the tags before reading. I did notice it was fiction pretty quickly but "is this dath ilan" was still a live question for me until the reveal. (Though Eliezer might want to continue writing some non-dath ilan fiction occasionally, if he wants that to continue to be a likely thought process.)

I think I have that intuition because the great majority of seatbelt unbucklings in my experience happen while traveling at a speed of zero (because they're in cars, not planes). The sentence has no cues to indicate the unusual context of being in a plane (and in fact, figuring that out is the point of the example). So my mental process reading that sentence is "that's obviously false" -> "hmm, wonder if I'm missing something" -> "oh, maybe in a plane?" and the first step there seems a lot more reliable (in other reasoners as well, not just me) than the second or third.

The "inference" "We can also infer that she is traveling at a high speed because she is unbuckling her seatbelt." is also nonsensical. People don't typically unbuckle their seatbelts when traveling at high speed. (Albeit, this does maybe happen to be true for airplane travel because one isn't allowed to unbuckle one's seatbelt while traveling at low speed, i.e. during taxi, takeoff and landing; but that's enough of a non-central case that it needs to be called out for the reasoning not to sound absurd.)

Why is it a non-central example when this is, in fact, about commercial airplane travel where you will be moving fastest at cruising altitude and that is when you're allowed to unbuckle and move about the cabin?

I'm pretty confused that this is as necessary as it is, particularly with writing that involves a lot of math and math notation. I don't understand how people get the insight and motivation necessary to write that kind of thing without explaining what the point of it is or giving examples of how to apply it as part of their expositions.

Your (johnswentworth's) posts don't seem to suffer nearly as much from this, at least from a quick skim of the ones you said on another thread you'd like distilled, but e.g. Infra-Bayesianism seems maybe important (it's abo... (read more)

I'm not saying there is a clearly better structure available for this purpose - I think the weirdness comes from the fact that it's so unclear who should go in the box normally reserved for "Shareholders" or "Voters."

Isn't the obvious answer for nonprofits "Donors"? (Yes, it's not immediately obvious how amount or recency of donations should translate into power to ultimately direct the organization, but this at least tells you who should go in the box.)

That's not to say that this is how nonprofits are run (it seems not to be) but that it would be the obvious translation of the ways that for-profit companies and governments are run.

Creating totally artificial rewards ("if I put away the groceries, I can have a piece of candy") kind of straddles the boundary between "Rewards" and "Make the task less unpleasant", IMO. I've occasionally managed to do N things I didn't want to do by allowing myself to watch N episodes of a show I'm binging, but only if I do a thing between episodes. Clearly this is a second-best solution, but it does seem to stop the "what happens next? Keep poking him until he watches the next episode so we can find out" demon from being quite so disruptive.

One aspect of habits you didn't mention is automaticity. I automatically brush my teeth as part of taking a shower. Often I notice myself doing this, but even if not it's still part of the "taking a shower" mental program.

I agree that this isn't reliable and sometimes one has to muscle through to retain a habit, but I think the balance between automaticity and aversion to losing the habit varies a lot from person to person, and it would be worth having a better understanding of what enhances or inhibits automaticity.

1 Ricardo Santos 1y
Exactly. Sometimes it is easier to swallow multiple habits as "package deals", that is, making habits happen automatically in conjunction or immediately one after the other, once you have one already in place.

If only Fermat had had wider margins...

I wish you (or someone) would make a little book of this.

I can certainly see why one might do that – if you allow requests then refusing a request is going to be quite socially awkward at best.

Can't tell whether you're being sarcastic, but I actually think this is true. I feel like there might have been a time when these things weren't weaponised and even if it were okay to ask, no one would do it unless they were immunocompromised or something. But these days, as an employer who wants employees spending as little time and attention on masking (or lack thereof) as possible, the only maybe-stable equilib... (read more)

I felt dumb recently when I noticed that the determinant is sort of "the absolute value for matrices", considering that it's literally written using the same signs as the absolute value. Although I guess the determinant of the representation of a complex number $z$ as a matrix is $|z|^2$, not $|z|$. The "signed volume" idea seems related to this, insofar as multiplying a complex number by another will stretch / smush it by $|z|$ (in addition to rotating it).
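For reference, the determinant claim can be checked against the usual $2 \times 2$ real-matrix representation of a complex number. The symbols $a$, $b$, $r$, $\theta$ and $M_z$ are notation introduced here for illustration, not taken from the comment.

```latex
z = a + bi = r(\cos\theta + i\sin\theta)
\;\longmapsto\;
M_z = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}
    = r \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},
\qquad
\det M_z = a^2 + b^2 = r^2 = |z|^2 .
```

So multiplication by $z$ rotates by $\theta$ and scales lengths by $|z|$, which scales areas (the "signed volume" in two dimensions) by $|z|^2$.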

Thanks. Yeah, I knew there was some qualifier missing that would make it true, I just couldn't intuit exactly what it was.

Edited to add: Actually I would say that the determinant distributes through multiplication. Commutativity: $ab = ba$. Distributivity: $a(b + c) = ab + ac$. Neither is a perfect analog, because the determinant is a unary operation, but distributivity at least captures that there are two operations involved. But unlike my other comment, this one doesn't actually impair comprehension, as there's not really a different thing you could be... (read more)

Now, we see a connection with the sign of a permutation: it's the only nontrivial way we know (and in fact it's the only way to do it at all!) to assign a scalar value to a permutation, which in this special case we know the determinant must do.

Huh? Off the top of my head, here's another way to assign a scalar value to a permutation: multiply together the lengths of all the cycles it contains. (No idea whether this is useful for anything. Taking the least common multiple of the lengths of all the cycles tells you the order of the permutation, i.e. how many times you have to apply it before you get the identity, though.)

2 Ege Erdil 1y
The assignment has to commute with multiplication, and your proposed assignment would not. Just consider, say, (12)(23)=(123). I've edited the post to make this clearer, thanks for the comment.
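The counterexample above is easy to check mechanically. The sketch below represents permutations as plain dictionaries (a representation and helper names chosen here purely for illustration) and compares three assignments on (12)(23) = (123): the sign, the product of cycle lengths, and the order (least common multiple of cycle lengths).

```python
from math import lcm, prod


def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a dict {x: perm(x)}."""
    seen, lengths = set(), []
    for start in perm:
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        lengths.append(length)
    return lengths


def compose(p, q):
    """Return the permutation 'apply q first, then p' (right-to-left)."""
    return {x: p[q[x]] for x in q}


def sign(perm):
    # A cycle of length k is a product of k - 1 transpositions.
    return (-1) ** sum(k - 1 for k in cycle_lengths(perm))


def cycle_product(perm):
    """The proposed alternative: multiply the lengths of all cycles."""
    return prod(cycle_lengths(perm))


def order(perm):
    """Number of applications needed to reach the identity."""
    return lcm(*cycle_lengths(perm))


swap_12 = {1: 2, 2: 1, 3: 3}          # (12)
swap_23 = {1: 1, 2: 3, 3: 2}          # (23)
both = compose(swap_12, swap_23)      # (12)(23)

print(both)                                          # the 3-cycle 1 -> 2 -> 3 -> 1
print(sign(swap_12) * sign(swap_23), sign(both))     # sign is multiplicative here
print(cycle_product(swap_12) * cycle_product(swap_23), cycle_product(both))
print(order(both))
```

The sign multiplies correctly on this example, while the product of cycle lengths gives 4 for the two factors but 3 for their composition, so it does not commute with multiplication.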

I have a little experience with this. I'm a type 1 diabetic and insulin needs to be kept refrigerated or it denatures and doesn't work.

  • I've just gotten a bunch of EcoFlow products for this. They work well together, but are maybe not the cheapest. Many are out of stock at the moment, anyway.

  • My 400W portable solar panel appears to actually generate around 140W in bright sunlight when pointed in roughly the right direction. This is a substantial amount for powering phones and probably even laptops! Not so much for refrigerators, air conditioners, e

... (read more)

See also Evolution of Modularity. Using your quantilizer to choose training examples from which to learn appears to be a very simple, natural way of accomplishing modularly varying reward functions. (I left a comment there too.)

Aren't "modularly varying reward functions" exactly what D𝜋's Self-Organised Neural Networks accomplish? Each example in the training data is a module of the reward function. By only learning on the training examples that are currently hardest for the network, we make those examples easier and thus implicitly swap them out of the "examples currently hardest for the network" set.
