
Hmmm. A few points.

First, I've never heard of program synthesis, and it seems like an interesting topic. Could you point me to some resources so I can learn more about it?

Second, I take issue with this statement:

> "trying to generate new code by modelling old code has the obvious limitation that no genuinely new code can be generated"

I disagree. We've seen GANs generate genuinely new artwork, and we've seen music synthesizers do the same. We've also seen GPT-3 and other generative text engines create genuinely interesting and innovative content; AI Dungeon comes to mind. Sure, it's all in one way or another based on its training data. But that's what humans do too.

Our level of abstraction is just higher, and we're able to generate more "distinct" music/code/songs based on our own training data. But that may not hold in the long term, and it also doesn't mean that current AI models can do nothing but regurgitate. They can generate new, genuinely interesting content and connections.



> First, I've never heard of program synthesis, and it seems like an interesting topic. Could you point me to some resources so I can learn more about it?

I'll leave it to the GP to give a lit review, but will say that CS has, and has always had, a hype-and-bullshit problem, and that knowing your history is a good way to stay sober in this field.

> We've also seen GPT-3 and other generative text engines create genuinely interesting and innovative content.

Making things humans find entertaining is easy. Markov chains could generate genuinely interesting and innovative poetry in the 90s. There's a Reply All episode on generating captions for memes or something where the two hosts gawk in amazement at what's basically early 90s tech.
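For what it's worth, the Markov-chain trick alluded to above fits in a few lines. This is a toy word-level bigram chain of my own, purely illustrative and not taken from any particular 90s system: it records which words follow which in a corpus, then walks those transitions at random.

```python
import random
from collections import defaultdict

def build_chain(corpus):
    """Map each word to the list of words observed to follow it."""
    words = corpus.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length, seed=0):
    """Walk the chain from `start`, picking each successor at random."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < length:
        successors = chain.get(out[-1])
        if not successors:  # dead end: word was never followed by anything
            break
        out.append(rng.choice(successors))
    return " ".join(out)

corpus = ("the sea rose and the sea fell and the moon rose "
          "over the sea and the moon fell into the sea")
chain = build_chain(corpus)
print(generate(chain, "the", 8))
```

Every "poem" it emits is stitched entirely from observed word pairs, which is exactly the point: the output can feel novel to a reader while being a pure recombination of the training text.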

> They can generate new, genuinely interesting content and connections.

Have you ever dropped acid? You can make all sorts of fascinating content and connections while hallucinating. Seriously -- much more than when you're sober. Probably shouldn't push to prod while stoned, though.

Art and rhetoric are easy because there's really no such thing as "Wrong". That's why we've been able to "fake creative intelligence" since the 80s or 90s.

Almost all software that people are paid to write today is either 1) incredibly complicated and domain-specific (think scientists/mathematicians/R&D engineers), or 2) software that interfaces with the real world, via physical machines or, more commonly, via integration into some sort of social process.

For Type 1, "automatic programming" has a loooong way to go, but you could imagine it working out eventually. Previous iterations on "automatic programming" have had huge impacts. E.g., back in the day, the FORTRAN compiler was called an "automatic programmer". Really.

For Type 2, well, wake me up when there's a massive transformer-based model that only generates text like "A tech will be at your house in 3 hours" when a tech is actually on the way. And that's a pretty darn trivial example.

There is a third type of software: boilerplate and repetitive crap. For that third type, I think the wix.com model will always beat out the Copilot model. And frankly, it's the very low end of the market anyway.


> First, I've never heard of program synthesis, and it seems like an interesting topic. Could you point me to some resources so I can learn more about it?

The following is a comprehensive introduction and survey of recent work in the field:

https://www.microsoft.com/en-us/research/wp-content/uploads/...

Regarding novelty, neural networks are excellent at interpolation but awful at extrapolation. What I mean by interpolation is that the models learned by a neural network represent a dense region of Cartesian space circumscribed by the network's training instances. The trained model can recognise and generate instances within this dense region, including instances not in its training set, by interpolation, but it is almost completely incapable of representing instances outside this region by extrapolation [1].

So basically what I'm saying above is that deep learning language model code generators can't extrapolate to programs that are not similar to the programs they've been trained on. Hence, no real novelty. But you can go a long way without such novelty, hence my expectation that e.g. Copilot can be a nice boilerplate generator.
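The interpolation/extrapolation distinction can be made concrete with a deliberately tiny example of my own (not from the linked survey or the Chollet posts): a one-hidden-layer ReLU network whose weights I've chosen by hand so that it interpolates y = x² exactly at the "training" points x = -2, -1, 0, 1, 2. Inside that range it tracks the target closely; outside it, the ReLU pieces force the network to continue along a straight line, and it drifts arbitrarily far from the truth.

```python
def relu(x):
    return max(0.0, x)

def net(x):
    """One-hidden-layer ReLU network, weights fixed by hand so that
    net(x) == x**2 at the 'training' knots x = -2, -1, 0, 1, 2."""
    return (4.0 - 3.0 * (x + 2.0)
            + 2.0 * relu(x + 1.0)
            + 2.0 * relu(x)
            + 2.0 * relu(x - 1.0))

print(net(1.5), 1.5 ** 2)  # inside training range: 2.5 vs true 2.25
print(net(5.0), 5.0 ** 2)  # outside: 13.0 vs true 25.0 -- extrapolation fails
```

Between the knots the network interpolates with small error; past x = 2 it is just the last linear segment extended forever, so the gap to x² grows without bound. Trained networks behave the same way in spirit, only with learned rather than hand-picked weights.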

______________

[1] My standard reference for this is the following article by François Chollet, maintainer of Keras:

https://blog.keras.io/the-limitations-of-deep-learning.html

See also the second part of the series of articles where possible ways are proposed to address this limitation:

https://blog.keras.io/the-future-of-deep-learning.html

The latter is relevant to the conversation because the author basically proposes augmenting current deep learning approaches with a form of program synthesis.


> First, I've never heard of program synthesis, and it seems like an interesting topic. Could you point me to some resources so I can learn more about it?

Look up the work of Cordell Green. He invented the idea and built a company around it.

https://www.kestrel.edu/people/green/



