Channel: Non_Interactive – Software & ML
Browsing latest articles
Browse All 13 View Live

Techniques for debugging neural networks

In my last post, I briefly discussed the infuriating fact that a neural network, even when deeply flawed, will often “work” in the sense that it’ll do above-random at classification or a generative...


On the efficiency of human intelligence

A pet peeve of mine that often shows up in ML discourse is the claim that humans are much more data efficient at learning than the models we are currently training. The argument typically goes like...


ICML 2023

I’ve met quite a few amazing people through this blog, most of whom I’ve only had the chance to trade e-mails with. I’m attending ICML next week and would love to grab a coffee or beer with any of...


DALL-E 3

We released DALL-E 3 this week. It has been a labor of love for Aditya, Gabe and me for a little over a year. It really is an impressive machine we have built. It continues to surprise me every...


The State of ML in 2023

I’ve been trying to figure out how to best write this article for most of the last year. Today, I’ve decided to just write down something, rather than continue trying to wordsmith exactly what I mean....


Is the Reversal Curse a generalization problem?

In my last post, I made a claim that the recently discovered reversal curse is not something that worries me. In fact, when I originally learned of it, I can’t say I was very surprised. In this post,...


Compute Multipliers

I’ve listened to a couple of interviews with Dario Amodei, CEO of Anthropic, this year. In both of them, he dropped the term “compute multiplier” a few times. This concept is exceptionally important...


go/rulesofthumb

Google has a neat internal website called “Rules of Thumb”, which compares the marginal cost of computational resources to the unit of a “SWE”. “SWE” refers to “Software Engineer” – which itself is...


Learned Structures

From 2019 to 2021, I was fascinated with neural network architectures. I think a lot of researchers in the field were at the time. The transformer paper had been out for a little while and it was...


Research Code

At my job, I’m currently in a cycle that involves working with software engineers quite a bit. One thing that has happened a number of times is that a software engineer will bring up “research...
