why it is so good in general (GPT-4)
What examples indicate it performs on complex tasks at the level you would expect from GPT-4? Especially performance clearly attributable to improvements we expect GPT-4 to have? I looked through a bunch of screenshots but haven't seen any so far.
Can confirm I consistently had non-deterministic temp-0 completions on older davinci models accessed through the API last year.
Bloomberg reported today on plans to invest $10B.
Have you seen this implemented in any blogging platform other people can use? I'd love to see this feature in some Obsidian publishing solution like Quartz, but for now they mostly don't care about access management.
Wow, Zvi's example is basically what I've been doing recently with hyperbolic discounting too, after spending a fair amount of time thinking about Joe Carlsmith's "Can You Control the Past?". It seems to work: "it gives me a lot of the kind of evidence about my future behavior that I like" is now the dominant reason behind certain decisions.
How much time do you expect the form, the coding test, and the interview to take for an applicant?
This idea tries to discover translations between the representations of two neural networks, but without necessarily discovering a translation into our representations.
I think this has been under investigation for a few years in the context of model fusion in federated learning, model stitching, and translation between latent representations in general.
Relative representations enable zero-shot latent space communication - an analytical approach to matching representations (though this is new work, it may not be that good, I haven't checked)
Git Re-Basin: Merging Models modulo Permutation Symmetries - recent model stitching work with some nice results
Latent Translation: Crossing Modalities by Bridging Generative Models - some random application of unsupervised translation to translation between autoencoder latent codes (probably not the most representative example)
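To make the stitching idea concrete, here is a minimal toy sketch (not from any of the papers above): assuming two networks whose activations on the same inputs are related by an unknown linear map plus noise, a translation between the two representation spaces can be fit by least squares on paired activations.

```python
import numpy as np

# Toy setup: two "networks" embed the same 500 inputs into different
# 16-dim spaces, related by an unknown linear map M_true plus small noise.
rng = np.random.default_rng(0)
n, d_a, d_b = 500, 16, 16
Z_a = rng.normal(size=(n, d_a))                         # activations from network A
M_true = rng.normal(size=(d_a, d_b))                    # unknown relation between spaces
Z_b = Z_a @ M_true + 0.01 * rng.normal(size=(n, d_b))   # activations from network B

# Fit a stitching map A -> B on half the pairs, evaluate on the held-out half.
M_hat, *_ = np.linalg.lstsq(Z_a[:250], Z_b[:250], rcond=None)
err = np.linalg.norm(Z_a[250:] @ M_hat - Z_b[250:]) / np.linalg.norm(Z_b[250:])
print(f"relative stitching error: {err:.3f}")
```

Real stitching work replaces the linear-relation assumption with trained stitching layers between intermediate activations of actual networks, and the permutation-symmetry line (Git Re-Basin) restricts the map to permutations; the least-squares fit here is only the simplest instance of the same matching problem.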
It doesn't seem like "shorter timelines" in the safest quadrant has much to do with their current strategy: the GPT-4 paper has a section on how they postponed the release to reduce acceleration.