A few months ago I saw a post on LinkedIn where someone fed the leading LLMs a counter-intuitively drawn circuit with 3 capacitors in parallel and asked what the total capacitance was. Not a single one got it correct: not only did they say the caps were in series (they were not), they even got the series capacitance calculation wrong. I couldn't believe they whiffed it, so I checked myself; sure enough I got the same results as the author, and I tried all types of prompt magic to get the right answer… no dice.
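For reference, the correct arithmetic (the part that makes the whiff so jarring): parallel capacitances simply add, and it's series that uses the reciprocal rule. A quick sketch with made-up values:

    # parallel: C_total = C1 + C2 + C3
    # series:   1/C_total = 1/C1 + 1/C2 + 1/C3
    caps = [10e-6, 22e-6, 47e-6]  # example values in farads, made up for illustration
    c_parallel = sum(caps)                    # 79.0 uF
    c_series = 1 / sum(1 / c for c in caps)   # ~6.0 uF
    print(c_parallel, c_series)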
I also saw an ad for an AI tool that's designed to help you understand schematics. The pitch shows what looks like a fairly generic guitar distortion pedal circuit; the tool does manage to correctly identify a capacitor as blocking DC, but fails to mention that it also functions as part of an RC high-pass filter. I chuckled when the voice-over proudly claims “they didn’t even teach me this in 4 years of Electrical Engineering!” (Really? They don’t teach how capacitors block DC and how RC filters work????)
If you’re in this space you probably need to compile your own carefully curated codex and train something more specialized. The general purpose ones struggle too much.
I would expect an LLM's internal modeling to be on approximately the level of "this is a diagram of a capacitor circuit for some student's homework; electrical component calculations for homework tend to use the adding-in-reciprocal rule, because simple addition would be too straightforward for homework".
> “they didn’t even teach me this in 4 years of Electrical Engineering!” (Really? They don’t teach how capacitors block DC and how RC filters work????)
My experience with being an adult, in general, is that many people who went to university don't believe that any given course taught them anything meaningful.
I can absolutely believe that such people didn't learn and remember anything meaningful from those courses. Whether the course is to blame is far more questionable.
>I can absolutely believe that such people didn't learn and remember anything meaningful from those courses. Whether the course is to blame is far more questionable.
It's the same as all the people who say "Why didn't high school teach me how to balance a checkbook or calculate a mortgage or blah blah?"
In nearly every case, they literally did, but you weren't paying attention.
You also had to cheat off me to pass biology, so I'm going to go ahead and press X to doubt that you "understand the immune system"
We are surrounded by people who failed to invest in their own education, and instead of facing that awful reality, they INSIST that WE are the dumb ones.
"Where are you from?" is difficult, given that we don't know how the alien cop's map is drawn. "Third planet from a star shining at 5700K, 8 kiloparsecs from the galactic center" is only slightly more useful than a lost kid saying that their mom's name is "mommy".
Chemistry of sustenance: we're carbon-based and everything comes from that, but constructing a description of edible food from raw elements is going to take a lot more than drawing some hexagons with C, H, N, and O, along with the other required elements. Before we even get to food and H₂O, we'd need an atmosphere to breathe, and I wouldn't presume the alien cops know to provide an oxygen/nitrogen mix for humans rather than something that's poisonous to us, like CO.
Time is something that's possible to express, though. SI defines the second as 9,192,631,770 periods of the radiation from a caesium-133 hyperfine transition, and 8 hours of sleep is just multiplication.
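(A sketch of that multiplication:)

    # the SI second: 9,192,631,770 periods of the Cs-133 hyperfine transition radiation
    periods_per_second = 9_192_631_770
    eight_hours = 8 * 60 * 60                    # 28,800 seconds
    print(eight_hours * periods_per_second)      # 264,747,794,976,000 periods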
I don't think anybody could describe the what/where/how to an alien cop that doesn't even speak English well enough to get themselves home, or even to avoid dying in an alien atmosphere.
I'm definitely in the 99.9%, which is more like 99.9999999%. In other words, I doubt there's even 10 people on the planet that would survive that scenario.
As an educator at the academic level, the number of times I have to explain absolutely basic "everybody should have learned this in school" physics is staggering.
Why should we expect a general-purpose instruction-tuned LLM to get this right in the first place? I am not at all surprised it didn't work, and I would be more than a little surprised if it did.
> Why should we expect a general-purpose instruction-tuned LLM to get this right in the first place?
The argument goes: language encodes knowledge, so from the vast reams of training data the model will have encoded the fundamentals of electromagnetism. This rests on the belief that LLMs, being adept at manipulating language, are therefore inchoate general intelligences, and that attaining AGI is a matter of scaling parameters and/or training data on the existing LLM foundations.
A huge number of people in academia believe so. The entire self-help literary genre is based upon this concept.
In reality, and with my biases as a self-taught person, experience is crucial. Learning in the field. 10,000 hours of practice. Something LLMs are not very good at: you train them a priori, and then they're a relatively static product compared to how human brains operate and self-adjust.
Yeah but language sucks at encoding the locality relations that represent a 2D picture such as a circuit diagram. Language is a fundamentally 1D concept.
And I'm baffled that HN is not picking up on that and ACTUALLY BELIEVES that you can achieve AGI with a simple language model scaled to billions of parameters.
It's as futile as trying to explain vision to a blind man using "only" a few billion words. There's simply no string of words that can create a meaningful representation in the mind of the blind man.
I dropped EE entirely and switched from Computer Engineering to Computer Science because of my entry level EE course professor. I know I'm not the only person pushed away from EE due to Neil Cotter. Boggles my mind why he's still allowed to be the gateway to that discipline for so many people.
Most entry-level engineering classes (the first three or four semesters) in most of Europe, for all kinds of engineering, are designed to gatekeep.
I graduated in chemistry, and Chemistry 1 in engineering had tests much more difficult than Chemistry 1 in any other faculty. After noticing that the same pattern applied to Physics 1 and Calculus, I started realizing it was an engineering thing, and an associate professor later confirmed to me that it was by design.
I asked him why, and he told me it's a long-established principle: you don't want people who struggle with science fundamentals building bridges, ships or electrical circuits, so the first semesters are very much focused on this weeding.
I came into CS during a year they were trying to rework the intro class. Several of the homework assignments simply did not work. Which taught me that procrastination doesn’t just feel good, it also pays off. If I waited until three days before it was due before I even looked at it, there would be a whole thread about corrections and clarifications. Though in a couple cases they were still sorting things out and people were calling for extensions (one of which I believe we got).
And this at a top ten school for CS.
There are healthy ways to exploit an urge to procrastinate but this is just feeding the monster, and I hope the prof was ashamed of himself.
Ahh, perhaps that explains why I had Stress Analysis and Material Science in the first semester of CE... they were far harder than anything in the following four years. I thought they were filler LOL. This was back in '92.
While I didn’t switch majors, I had a similar experience with my intro EE class. My theory was that it was intentionally a weeder class to push students towards the other engineering concentrations.
Intro EE is kinda brutal in that there’s a lot of theory to cover, and you need to build the intuition on how it applies to real world circuit design on the fly.
I had a bit of an epiphany when I was in a set theory/number theory class and some classmates were breezing through proofs that I struggled with. I was having to do algebraic manipulations in a way that was novel to me, but was intuitive to math nerds. I felt like that guy who didn’t “get” the intuition in an intro programming or circuits class.
But yeah, students often get some context for math or programming in high school, but rarely for circuit design. E&M in physics at best. EE programs have solved this by weeding out anyone who can’t bash their way through the foundational theory… which isn’t great.
If you’re still interested, I would recommend the Student Manual to the Art of Electronics. It’s a very practical, lab-based book that throws out a lot of the math in favor of rules of thumb and gaining intuition for circuit design.
The thing I hated most about EE 101, though, was that the diagrams predated the discovery of the electron, so all the arrows point the wrong way. AND NOBODY BOTHERED TO FIX IT. It felt like taking a racquetball class with my foot stuck in a bucket.
I studied mechatronics and did reasonably well... but in any electrical class I would just scrape by. I loved it but was apparently not suited to it. I remember a whole unit basically about transistors. On the software/mtrx side we were so happy treating MOSFETs as digital. Having to analyse them in more depth did my head in.
I had a similar experience, except Mechanical Engineering being my weakest area. Computer Science felt like a children's game compared to fluid dynamics...
I don’t mind LLMs in the ideation and learning phases, which aren’t reproducible anyway. But I still find it hard to believe engineers of all people are eager to put a slow, expensive, non-deterministic black box right at the core of extremely complex systems that need to be reliable, inspectable, understandable…
You find it hard to believe that non-deterministic black boxes at the core of complex systems are eager to put non-deterministic black boxes at the core of complex systems?
Yes I do! Is that some sort of gotcha? If I can choose between having a script that queries the db and generates a report and “Dave in marketing” who “has done it for years”, I’m going to pick the script. Who wouldn’t? Until machines can reliably understand, operate and self-correct independently, I’d rather not give up debuggability and understandability.
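To be concrete, the script I have in mind is maybe a dozen lines; a minimal sketch, assuming a hypothetical SQLite database with a `sales` table (all names invented):

    import csv
    import sqlite3

    # sketch only: "sales.db" and its schema are hypothetical
    conn = sqlite3.connect("sales.db")
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales "
        "WHERE strftime('%Y-%m', sold_at) = ? GROUP BY region",
        ("2024-11",),
    )
    with open("report.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["region", "total"])   # header row
        writer.writerows(rows)                 # one line per region

Boring, debuggable, and it produces the same report every time.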
I think this comment and the parent comment are talking about two different things. One of you is talking about using nondeterministic ML to implement the actual core logic (an automated script or asking Dave to do it manually), and one of you is talking about using it to design the logic (the equivalent of which is writing that automated script).
LLMs are not good at actually doing the processing; they are not good at math or even text processing at a character level, and they often get logic wrong. But they are pretty good at spotting patterns and finding creative solutions to new inputs (or at least what can appear creative, even if philosophically it's more pattern matching than creativity). So an LLM would potentially be good at writing a first draft of that script, which Dave could then proofread/edit, and which a standard deterministic computer could just run verbatim to actually do the processing. Eventually maybe even Dave's proofreading would be superfluous.
Tying this back to the original article, I don't think anyone is proposing having an LLM inside a chip, processing incoming data non-deterministically. The article is about using AI to design the chips in the first place, and the chips would still be deterministic: the equivalent of the script in this analogy. There are plenty of arguments to make about LLMs not being good enough for that, not being able to follow the logic or optimize it, or come up with novel architectures. But the shape of chip design/Verilog feels like something that, with enough effort, an AI could likely be built to handle pretty well. All of the knowledge that those smart, knowledgeable engineers who are good at writing Verilog have built up can almost certainly be represented in some AI form, and I wouldn't bet against AI getting to a point where it can help in the same way Copilot currently does with code completion. Maybe not perfect anytime soon, but good enough that we could eventually see a path to 100%. It doesn't feel like there's a fundamental reason this is impossible on a long enough time scale.
> So an LLM would potentially be good at writing a first draft of that script, which Dave could then proofread/edit
Right, and there’s nothing fundamentally wrong with this, nor is it a novel method. We’ve been joking about copying code from stack overflow for ages, but at least we didn’t pretend that it’s the peak of human achievement. Ask a teacher the difference between writing an essay and proofreading it.
Look, my entire claim from the beginning is that understanding is important (epistemologically, it may be what separates engineering from alchemy, but I digress). Practically speaking, if we see larger and larger pieces of LLM written code, it will be similar to Dave and his incomprehensible VBA script. It works, but nobody knows why. Don’t get me wrong, this isn’t new at all. It’s an ever-present wet blanket that slowly suffocates engineering ventures who don’t pay attention and actively resist. In that context, uncritically inviting a second wave of monkeys to the nuclear control panels, that’s what baffles me.
> We’ve been joking about copying code from stack overflow for ages
Tangent for a slight pet peeve of mine:
"We" did joke about this, but probably because most of our jobs are not in chip design. "We" also know the limits of this approach.
The fact that Stack Overflow is the most SEO optimised result for "how to center div" (which we always forget how to do) doesn't have any bearing on the times when we have an actual problem requiring our attention and intellect. Say diagnosing a performance issue, negotiating requirements and how they subtly differ in an edge case from the current system behaviour, discovering a shared abstraction in 4 pieces of code that are nearly but not quite the same.
I agree with your posts here, the Stack Overflow thing in general is just a small hobby horse I have.
Also the Stack Overflow thing has more to do with all of us being generalists, not incompetent.
I look up "how do I sort a list in language X" because I know from school that there IS a defined good way to do it, probably built into the language, and it will be extremely idiomatic, but I haven't used language X in five years and the specifics might have changed and I don't remember the specific punctuation.
> So an LLM would potentially be good at writing a first draft of that script, which Dave could then proofread/edit, and which a standard deterministic computer could just run verbatim to actually do the processing
Or Dave could write a first draft of that script, saving him the time needed to translate what the LLM composed.
>If I can choose between having a script that queries the db and generates a report and “Dave in marketing” who “has done it for years”
If you could that would be nice wouldn't it? And if you couldn't?
If people were saying "let's replace Casio Calculators with interfaces to GPT", then that would be crazy and I would wholly agree with you. But by and large, the processes people are scrambling to place LLMs in are ones that typical machines struggle with or fail at and humans excel at or do decently (and that LLMs are making some headway in).
You're making the wrong distinction here. It's not Dave vs your nifty script. It's Dave or nothing at all.
There's no point comparing LLM performance to some hypothetical perfect understanding machine that doesn't exist.
You compare it to the things it's meant to replace - humans. How well can the LLM do this compared to Dave?
100%, and a lot of them are truly terrible use cases for LLMs.
For example, using an LLM to transform structured data into JSON, and running two LLMs in parallel to try to catch the inevitable failures, instead of just writing code that outputs JSON.
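If the input is already structured, the whole two-LLM arrangement collapses to something like this (field names invented for illustration):

    import json

    # deterministic, verifiable, and free: no model calls needed
    records = [
        {"name": "R1", "value_ohms": 4700, "tolerance_pct": 1},
        {"name": "C3", "value_farads": 1e-7, "tolerance_pct": 10},
    ]
    print(json.dumps(records, indent=2))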
If your task was being solved well by a deterministic script/algorithm, you are not going to save money porting to LLMs even if you use Open Source models.
'could' is doing a whole lot of work in that sentence; I'm being charitable. The reality is that LLMs are being crammed into places where it isn't very sensible, under thin justifications, just like the last few big ideas (cf. blockchain).
One great thing about humans is that we have developed ways to be deterministic when we want to. That’s what math is for.
Does an LLM know math? Not like we do. There’s no deductive logic in there; it’s all statistical inferences from language. An LLM doesn’t “work through” a circuit diagram systematically the way a physics student would. It observes the entire diagram at once, and then guesses the most likely next token.
I'm a non-deterministic black box who teaches complex deterministic machines to do stuff and leverages other deterministic machines as tools to do the job.
I like my job.
My job also involves cooperating with other non-deterministic black boxes (colleagues).
I can totally see how artificial non-deterministic black boxes (artificial colleagues) may be useful to replace/augment the biological ones.
For one, artificial colleagues don't get tired and I don't accidentally hurt their feelings or whatnot.
In any case, I'm not looking forward to replacing my deterministic tools with the fuzzy AI stuff.
Intuitively at least it seems to me that these non-deterministic black boxes could really benefit from using the deterministic tools for pretty much the same reasons we do as well.
Can you actually like follow through with this line? I know there are literally tens of thousands of comments just like this at this point, but if you have chance, could you explain what you think this means? What should we take from it? Just unpack it a little bit for us.
An interpretation that makes sense to me: humans are non-deterministic black boxes already at the core of complex systems. So in that sense, replacing a human with AI is not unreasonable.
I’d disagree, though: humans are still easier to predict and understand (and trust) than AI, typically.
With humans we have a decent understanding of what they are capable of. I trust a medical professional to provide me with medical advice and an engineer to provide me with engineering advice. LLMs can be unpredictable at times, and they can make errors in ways you would not imagine. Take the following examples from my tool, which show how GPT-4o and Claude 3.5 Sonnet can screw up.
In this example, GPT-4o cannot tell that GitHub is spelled correctly:
I still believe LLMs are a game changer, and I'm currently working on what I call a "Yes/No" tool which I believe will make trusting LLMs a lot easier (for certain things, of course). The basic idea is that the "Yes/No" tool lets you combine models, samples and prompts to arrive at a Yes or No answer.
Based on what I've seen so far, a model can easily screw up, but it is unlikely that all will screw up at the same time.
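Roughly, the idea looks like this (a sketch; `ask_fns` and `yes_no` are my own placeholder names for whatever model/prompt combinations you configure, not an existing API):

    from collections import Counter

    def yes_no(ask_fns, question, samples=3):
        """Put the same yes/no question to several model/prompt combos,
        several times each, and return the majority answer and its vote share."""
        votes = []
        for ask in ask_fns:                      # each ask(question) -> "yes" or "no"
            for _ in range(samples):
                votes.append(ask(question).strip().lower())
        answer, count = Counter(votes).most_common(1)[0]
        return answer, count / len(votes)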
It's actually a great topic - both humans and LLMs are black boxes. And both rely on patterns and abstractions that are leaky. And in the end it's a matter of trust, like going to the doctor.
But we have had extensive experience with humans, so it is natural that trust in them is better defined; LLMs will become better understood as well. There is no central understander or truth, and that is the interesting part: it's a "blind men and the elephant" situation.
We are entering the nondeterministic programming era, in my opinion. LLM applications will be designed with the idea that we can't be 100% sure, and whichever solution provides the most safeguards will probably be the winner.
Because people are not saying "let's replace Casio Calculators with interfaces to GPT!"
By and large, the processes people are scrambling to place LLMs in are ones that typical machines struggle or fail and humans excel or do decently (and that LLMs are making some headway in).
There's no point comparing LLM performance to some hypothetical perfect understanding machine that doesn't exist. It's nonsensical actually. You compare it to the performance of the beings it's meant to replace or augment - humans.
Replacing non-deterministic black boxes with potentially better performing non-deterministic black boxes is not some crazy idea.
Sure. I mean, humans are very good at building businesses and technologies that are resilient to human fallibility. So when we think of applications where LLMs might replace or augment humans, it’s unsurprising that their fallible nature isn’t a showstopper.
Sure, EDA tools are deterministic, but the humans who apply them are not. Introducing LLMs to these processes is not some radical and scary departure, it’s an iterative evolution.
Ok yeah. I think the thing that trips me up with this argument is that, yes, when you regard humans in a certain neuroscientific frame and consider things like consciousness or language or will, they are fundamentally nondeterministic. But that isn't the frame of mind of the human engineer who does the work or even validates it. When engineers are working, they aren't seeing themselves as some black box which they must feed input and get output; they are thinking about the things in themselves, justifying their work to themselves and others. Just because you can place yourself in some hypothetical third person, one that oversees the model and the human and says "huh, yeah, they are pretty much the same, huh?", doesn't actually tell us anything about what's happening on the ground in either case. At the very least, this same logic would imply fallibility is one-dimensional and always statistical: "the patient may be dead, but at least they got a new heart." Isn't it important to be in love, not just be married? To borrow some Kant, shouldn't we still value what we can do when we think as if we aren't just organic black-box machines? Is there even a question there? How could it be otherwise?
It's really just that the "in principle" part of the overall implication of your comment, and so many others, doesn't make sense. It's very much cutting off your nose to spite your face. How could science itself be possible, much less engineering, if this is how we decided things? If we regarded ourselves always from the outside? How could we even be motivated to debate whether we get the computers to design their own chips? When would anything actually happen? At some point, people do have ideas, in a full, if false, transparency to themselves, that they can write down and share and explain. That is not only the thing that has gotten us this far, it is the very essence of why these models are so impressive in the ways that they are. It doesn't make sense to argue for the fundamental cheapness of the very thing you are ultimately trying to defend. And it imposes this strange perspective where we are not even living inside our own (phenomenal) minds anymore, where it fundamentally never matters what we think, no matter our justification. It's weird!
I'm sure you have a lot of good points and stuff, I just am simply pointing out that this particular argument is maybe not the strongest.
We start from similar places but get to very different conclusions.
I accept that I’m fallible, both in my areas of expertise and in all the meta stuff around it. I code bugs. I omit requirements. Not often, and there are mental and technical means to minimize, but my work, my org’s structure, my company’s processes are all designed to mitigate human fallibility.
I’m not interested in “defending” AI models. I’m just saying that their weaknesses are qualitatively similar to human weaknesses, and as such, we are already prepared to deal with those weaknesses as long as we are aware of them, and as long as we don’t make the mistake of thinking that because they use transistors they should be treated like a mostly deterministic piece of software where one unit test pass means it is good.
I think you’re reading some kind of value judgement on consciousness into what is really just a pragmatic approach to slotting powerful but imperfect agents into complex systems. It seems obvious to me, and without any implications as to human agency.
I took it to be a joke that the description "slow, expensive, non-deterministic black boxes" can apply to the engineers themselves. The engineers would be the ones who would have to place LLMs at the core of the system. To anyone outside, the work of the engineers is as opaque as the operation of LLMs.
>You find it hard to believe that non-deterministic black boxes at the core of complex systems are eager to put non-deterministic black boxes at the core of complex systems?
Hello, fellow tech enthusiasts, just stopping by to announce I performatively can't tell the difference between "Latest big tech product (TM)" and Homo Sapiens Sapiens!!!
I'll be seeing you in the next LLM related message thread with the same exact comment!!! As you were!!!
In a reductive sense, this passage might as well read "You find it hard to believe that entropy is the source of other entropic reactions?"
No, I'm just disappointed in the decision of Black Box A and am bound to be even more disappointed by Black Box B. If we continue removing thoughtful design from our systems because thoughtlessness is the default, nobody's life will improve.
I think I've come to terms with it: engineering and making money from engineering are two completely unrelated things; the latter doesn't even need technology (though scamming is unethical).
100% agree. While I can't find all the sources right now, [1] and its references could be a good starting point for further exploration. I recall there being a proof or conjecture suggesting that it's impossible to build an "LLM firewall" capable of protecting against all possible prompts, though my memory might be failing me.
LLMs can be fully deterministic BTW, depending on the sampling method used. Some methods do not have a random component. As to the rest, yeah - they aren't inspectable or understandable yet.
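E.g. greedy (argmax) decoding has no random component at all; randomness only appears once you sample from the distribution with a temperature. A toy sketch of the difference (toy scores, not a real model):

    import numpy as np

    rng = np.random.default_rng()
    logits = np.array([2.0, 1.0, 0.5, -1.0])        # toy next-token scores

    def greedy(logits):
        return int(np.argmax(logits))               # same input -> same token, every time

    def sample(logits, temperature=1.0):
        p = np.exp(logits / temperature)
        p /= p.sum()
        return int(rng.choice(len(logits), p=p))    # same input -> possibly different tokens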
> Edit: I believe that LLM's are eminently useful to replace experts (of all people) 90% of the time.
What do you mean by "expert"?
Do you mean the pundit who goes on TV and says "this policy will be bad for the economy"?
Or do you mean the seasoned developer who you hire to fix your memory leaks? To make your service fast? Or cut your cloud bill from 10M a year to 1M a year?
Experts of the kind that will be able to talk for hours about the academic consensus on the status quo without once considering how the question at hand might challenge it? Quite likely.
Experts capable of critical thinking and reflecting on evidence that contradicts their world model (and thereby retraining it on the fly)? Most likely not, at least not in their current architecture with all its limitations.
Anything that requires deep “understanding” or novel invention is not a job for a statistical word regurgitator. I’ve yet to see a single example, in any field, of an LLM actually inventing something truly novel (as judged by the experts in that space). Where LLMs shine is in producing boilerplate -- though that is super useful. So far I have yet to see anything resembling an original “thought” from an LLM (and I use AI at work every day).
Taking a quick glance at all of these, they seem to be aspirational or a “brute force” type of search, which computers have always been good at, before AI. Does not seem like any novel research to me. The parameters and methods are set by humans and these systems search within a well defined space.
Experiment: you think LLMs can innovate on chip design? Ask one to do something much simpler: invent a new, better sorting algorithm. We attach names like Timsort or Dijkstra to specific algorithms for a reason: it takes rare human ingenuity to invent such things. If an LLM can't invent a new sorting algorithm that is meaningfully better in some way than existing known algorithms, then good luck with something much harder like chip design.
You can set the bar lower. Have it invent another n log n sorting algorithm. Or omit all merge sort implementations from training data and see if it can re-invent it.
But I certainly agree in general. It’s been years and there are still no independent novel discoveries afaik.
YC doesn't care whether it "makes sense" to use an LLM to design chips. They're as technically incompetent as any other VC, and their only interest is to pump out dogshit startups in the hopes it gets acquired. Gary Tan doesn't care about "making better chips": he cares about finding a sucker to buy out a shitty, hype-based company for a few billion. An old school investment bank would be perfect.
YC is technically incompetent and isn't about making the world better. Every single one of their words is a lie and hides the real intent: make money.
Not how I would word it, but yeah: any VC today is going to pump AI knowing it's the wrong tool, so the more complex they make the application space, the easier it is to find the proverbial sucker.
First, VCs don't get paid when "dogshit startups" get acquired, they get paid when they have true outlier successes. It's the only way to reliably make money in the VC business.
Second, want to give any examples of "shitty, hype-based compan[ies]" (I assume you mean companies with no real revenue traction) getting bought out for "a few billion".
Third, investment banks facilitate sales of assets, they don't buy them themselves.
Maybe sit out the conversation if you don't even know the basics of how VC, startups, or banking work?
I worked on the Qualcomm DSP architecture team for a year, so I have a little experience with this area but not a ton.
The author here is missing a few important things about chip design. Most of the time spent and work done is not writing high-performance Verilog. Designers spend a huge amount of time answering questions, writing documentation, copying around boilerplate, reading obscure manuals and diagrams, etc. LLMs can already help with all of those things.
I believe that LLMs in their current state could help design teams move at least twice as fast, and better tools could probably change that number to 4x or 10x even with no improvement in the intelligence of models. Most of the benefit would come from allowing designers to run more experiments and try more things, to get feedback on design choices faster, to spend less time documenting and communicating, and spend less time reading poorly written documentation.
Author here -- I don't disagree! I actually noted this in the article:
> Well, it turns out that LLMs are also pretty valuable when it comes to chips for lucrative markets -- but they won’t be doing most of the design work. LLM copilots for Verilog are, at best, mediocre. But leveraging an LLM to write small snippets of simple code can still save engineers time, and ultimately save their employers money.
I think designers getting 2x faster is probably optimistic, but I also could be wrong about that! Most of my chip design experience has been at smaller companies, with good documentation, where I've been focused on datapath architecture & design, so maybe I'm underestimating how much boilerplate the average engineer deals with.
Regardless, I don't think LLMs will be designing high-performance datapath or networking Verilog anytime soon.
At large companies with many designers, a lot of time is spent coordinating and planning. LLMs can already help with that.
As far as design/copilot goes, I think there are reasons to be much more optimistic. Existing models haven't seen much Verilog. With better training data it's reasonable to expect that they will improve to perform at least as well on Verilog as they do on python. But even if there is a 10% chance it's reasonable for VCs to invest in these companies.
> With better training data it's reasonable to expect that they will improve to perform at least as well on Verilog as they do on python.
There simply isn't enough of that code in existence.
Writing Verilog code is about mapping the constructs onto your mental model of the underlying hardware. If that were easy, so many engineers wouldn't have so much trouble writing Verilog code that doesn't have faults. You can't write Verilog code just by pasting together Stack Overflow snippets.
Look at the confusion that happens when programmers take their "for-loop" understanding into the world of GPU shaders or HDLs (hardware description languages) where "for-loops" map to hardware and suddenly are both finite and fixed. LLMs exhibit the exact same confusion--only worse.
I’m actually curious if there even is a large enough corpus of Verilog out there. I have noticed that even tools like Copilot tend to perform poorly when working with DSLs that are majority open source code (on GitHub no less!) where the practical application is niche. To put this in other terms, Copilot appears to _specialize_ on languages, libraries and design patterns that have wide adoption, but does not appear to be able to _generalize_ well to previously unseen or rarely seen languages, libraries, or design patterns.
Anyway that’s largely anecdata/sample size of 1, and it could very well be a case of me holding the tool wrong, but that’s what I observed.
I agree with most of the technical points of the article.
But there may still be value in YC calling for innovation in that space. The article correctly shows that there is no easy win in applying LLMs to chip design. Either the market for a given application is too small, in which case LLMs can help but who cares, or the chip is too important, in which case you'd rather use the best engineers. Unlike software, we're not getting much of a long-tail effect in chip design. Taping out a chip is just not something a hacker can do, and even playing with an FPGA has a high cost of entry compared to hacking on your PC.
But if there was an obvious path forward, YC wouldn't need to ask for an innovative approach.
You could say it is the naive arrogance of the beginner's mind.
Seen here as well when George Hotz attempted to overthrow the chip companies with his plan for an AI chip https://geohot.github.io/blog/jekyll/update/2021/06/13/a-bre... little realizing the complexity involved. To his credit, he quickly pivoted to being a software and tinybox maker.
I know several founders who went through YC in the chip design space, so even if the people running YC don't have a chip design background, just like VCs, they learn from hearing pitches of the founders who actually know the space.
There is an obvious path forward, but apparently this is a minority opinion, possibly fringe. It doesn't make the traditional tradeoffs.
A bit-level (non-von Neumann) general-purpose systolic array could greatly speed up AI computations, along with almost everything else. It's a chip for doing general-purpose computation.
The chip design is almost trivial. I'd expect someone with a few years of experience could knock it out in a few days. I hope to field a design in the next TinyTapeout (I'm on a fixed income, so I've had to wait a while).
The real problem is programming. We're talking vast greenfields that go on forever. There's no good way to target the architecture; you certainly wouldn't want to use Verilog or any other HDL.
Even the obvious can be risky. First, it's nice to share the risk; second, more investment comes with more connections.
As for the LLM boom: I think we'll finally realize that an LLM plus algorithms can do much more than an LLM alone. 'Algorithms' is probably a bad word here; I mean assisting tools like databases, algorithms, other models. Then only the access API needs to be trained into the LLM instead of the whole dataset, for example.
I know nothing about chip design.
But saying "Applying AI to field X won't work, because X is complex, and LLMs currently have subhuman performance at this" always sounds dubious.
VCs are not investing in the current LLM-based systems to improve X, they're investing in a future where LLM based systems will be 100x more performant.
Writing is complex, LLMs once had subhuman performance, and yet.
Digital art. Music (see suno.AI)
There is a pattern here.
I didn't get into this in the article, but one of the major challenges with achieving superhuman performance on Verilog is the lack of high-quality training data. Most professional-quality Verilog is closed source, so LLMs are generally much worse at writing Verilog than, say, Python. And even still, LLMs are pretty bad at Python!
That’s what your VC investment would be buying; the model of “pay experts to create a private training set for fine tuning” is an obvious new business model that is probably under-appreciated.
If that’s the biggest gap, then YC is correct that it’s a good area for a startup to tackle.
It would be hard to find any experts that could be paid "to create a private training set for fine tuning".
The reason is that those experts do not own the code that they have written.
The code is owned by big companies like NVIDIA, AMD, Intel, Samsung and so on.
It is unlikely that these companies would be willing to provide the code for training, except for some custom LLM to be used internally by them, in which case the amount of code that they could provide for training might not be very impressive.
Even a designer who works at those companies may have great difficulty seeing significant quantities of archived Verilog/VHDL code, though one can hope that it still exists somewhere.
When I say “pay to create” I generally mean authoring new material, distilling your career’s expertise.
Not my field of expertise but there seem to be experts founding startups etc in the ASIC space, and Bitcoin miners were designed and built without any of the big companies participating. So I’m not following why we need Intel to be involved.
An obvious way to set up the flywheel here is to hire experts to do professional services or consulting on customer-submitted designs while you build up your corpus. While I said “fine-tuning”, there is probably a lot of agent scaffolding to be built too, which disproportionately helps bigger companies with more work throughput. (You can also acquire a company with the expertise and tooling, as Apple did with PA Semi in ~2008, though obviously $100m order of magnitude is out of reach for a startup. https://www.forbes.com/2008/04/23/apple-buys-pasemi-tech-ebi...)
I doubt any real expert would be tempted by an offer to author new material, because that cannot be done in a good way.
One could author some projects that can be implemented in FPGAs, but those do not provide good training material for generating code that could be used to implement a project in an ASIC, because the constraints of the design are very different.
Designing an ASIC is a year-long process and it is never completed before testing some prototypes, whose manufacture may cost millions. Authoring some Verilog or VHDL code for an imaginary product that cannot be tested on real hardware prototypes could result only in garbage training material, like the code of a program that has never been tested to see if it actually works as intended.
Learning to design an ASIC is not very difficult for a human, because a human does not need a huge number of examples the way ML/AI does. Humans learn the rules, and a few examples are enough for them. I have worked at a few companies designing ASICs. While those companies had some internal training courses for their designers, those courses only taught their design methodologies, with practically no code examples from older projects, so very unlike how an LLM would have to be trained.
I would imagine it is a reasonably straightforward thing to create a simulator that generates arbitrary chip designs and the corresponding verilog that can be used as training data. It would be much like how AlphaFold was trained. The chip designs don't need to be good, or even useful, they just need to be valid so the LLM can learn the underlying relationships.
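A toy sketch of the kind of generator I mean, for trivial combinational logic only (a real one would need sequential logic, timing constraints, and far more; everything here is made up for illustration):

    import itertools
    import random

    OPS = ["&", "|", "^"]

    def rand_expr(inputs, depth=2):
        """Build a random combinational expression over the given input names."""
        if depth == 0 or random.random() < 0.3:
            name = random.choice(inputs)
            return random.choice([name, "~" + name])
        op = random.choice(OPS)
        return f"({rand_expr(inputs, depth - 1)} {op} {rand_expr(inputs, depth - 1)})"

    def make_example(inputs=("a", "b", "c")):
        expr = rand_expr(list(inputs))
        ports = ", ".join(f"input {i}" for i in inputs)
        verilog = (
            f"module rand_comb({ports}, output y);\n"
            f"  assign y = {expr};\n"
            "endmodule\n"
        )
        # the "spec": a truth table, evaluated with Python's bitwise operators,
        # whose low bit matches Verilog single-bit semantics for &, |, ^, ~
        table = {
            bits: eval(expr, dict(zip(inputs, bits))) & 1
            for bits in itertools.product((0, 1), repeat=len(inputs))
        }
        return table, verilog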
I have never heard of any company, no matter how big and experienced, where it is possible to decide that an ASIC design is valid by any other means except by paying for a set of masks to be made and for some prototypes to be manufactured, then tested in the lab.
This validation costs millions, which is why it is hard to enter this field, even as a fabless designer.
Many design errors are not caught even during hardware testing, but only after mass production, like the ugly MONITOR/MWAIT bug of Intel Lunar Lake.
Randomly-generated HDL code, even if it does not have syntax errors, and even if some testbench for it does not identify deviations from its specification, is not more likely to be valid when implemented in hardware, than the proverbial output of a typewriting monkey.
Validating an arbitrary design is hard. It's equivalent to the halting problem. Working backwards using specific rules that guarantee validity is much easier. Again, the point is not to produce useful designs. The generated model doesn't need to be perfect, indeed it can't be, it just needs to be able to avoid the same issues that humans are looking for.
I know just enough about chips to be suspicious of "valid". The right solution for a chip at the HDL layer depends on your fab, the process you're targeting, what % of physical space on the chip you want it to take up, and how much you're willing to put into power optimization.
The goal is not to produce the right, or even a good solution. The point is to create a large library of highly variable solutions so the trained model can pick up on underlying patterns. You want it to spit out lots of crap.
That's probably where there's a big advantage to being a company like Nvidia, which has both the proprietary chip design knowledge/data and the resources/money and AI/LLM expertise to work on something specialized like this.
I strongly doubt this - they don't have enough training data either. You are confusing (I think) the scale of their success with the amount of Verilog they possess.
I.e., I think you are wildly underestimating the scale of training data needed, and wildly overestimating the amount of Verilog code possessed by Nvidia.
GPUs work by having moderate-complexity cores (in the scheme of things) that are replicated 8000 times or whatever.
That does not require having 8000 times as much useful verilog, of course.
The folks who have 8000 different chips, or 100 chips that each do 1000 things, would probably have orders of magnitude more verilog to use for training
AI still has subhuman performance for art. It feels like the Venn diagram of people who are bullish on LLMs and people who don't understand logistic curves is a circle.
You ask 100,000 humans each to make a photorealistic rendering of an alpaca playing basketball on the moon in 90 seconds, and an LLM is going to outperform every single one of them.
This is true, but humans are much better at including specified elements in an image with specified spatial relationships. A description like a "A porpoise seated at a desk writing a letter" will reliably produce (terrible) drawings consisting of parts corresponding to the porpoise, parts corresponding to the desk, and parts corresponding to the letter, with the arrangement of the parts roughly corresponding to the description.
Humans being better at one specific aspect of a task is not equivalent to humans being overall better at the task.
I just entered your prompt into an AI image generator and in under a second it gave me an image[0] of what looks to me like an anthropomorphic dolphin sitting at a desk writing a letter in a little study. I then had to google what the difference between a porpoise and a dolphin was because I genuinely thought porpoises looked much more like manatees. While I could nitpick the AI's work for making the porpoise's snout a little too long, had I drawn it the porpoise would have been a vaguely marine looking blob with no anatomy detailed enough to recognize let alone criticize. I am quite confident that if you asked for a large number of images based on that prompt from humans, it would easily rank among the best, and it's unlikely you'd get any which were markedly better. The fact it can generate this image nearly instantaneously though is astounding. If your goal was to get one masterpiece hanging in the Louvre, this particular tool would not suffice, but if your goal was to illustrate children's books, this tool could do in hours what would have taken a team of humans months. That is superhuman performance.
An AI image generator will sometimes do a good job on this sort of prompt, but it fails in different ways to the ways that humans fail.
Whether humans or AI are better at the task overall is probably too vague a question to answer, depending a lot on how you weight different desirables.
What meaningful benchmark would you use? Art, by its nature, is subjectively experienced: what one person considers great, meaningful, soul-moving art, another may consider terrible, meaningless, and empty. Both opinions are equally valid.
But if you're using AI to create art, you're typically not trying to move someone's soul. You're trying to create a work that depicts something in a particular style with a particular fidelity with a certain amount of resource consumption. That is the only metric by which it makes any sense to evaluate the machine designed to do that specific task.
In this specific case, it's hard to see how LLMs get you from here to there. The problem isn't the boilerplate code, like when you build a React website with them, but the really novel architectures (and architectural decisions, more importantly) that you need to make along the way. Some of those can seem very arbitrary and require deep understanding to pull off. You can't just use language-like tokens to reason this out. Fundamental understanding of the laws and rules of thumb is important.
> Writing is complex, LLMs once had subhuman performance,
And now they can easily replace mediocre human performance, and since they are tuned to provide answers that appeal to humans, that is especially true for these subjective-value use cases. Chip design doesn't seem very similar. It seems like a case where specifically trained tools would be of assistance. As much as generalist LLMs have surprised with their skill at specific tasks, it is very hard to see how training on a broad corpus of text could outperform specific tools for some things. As for your first paragraph: do you really think it is not dubious to expect a model trained on text to outperform Stockfish at chess?
When people say LLM, I think they are often thinking of neural-network approaches in general rather than just text-based models, even if the letters do stand for "language model". And there's overlap: e.g. Gemini does language but is multimodal. If you skip past that, you get things like AlphaZero, which did beat Stockfish https://en.wikipedia.org/wiki/AlphaZero
I like this reasoning. It is shortsighted to say that LLMs aren’t well-suited to something (because we cannot tell the future) but it is not shortsighted to say that LLMs are well-suited to something (because we cannot tell the future)
I kinda suspect that things which are expressed better with symbols and connections than with text will always be a poor fit for large LANGUAGE models. Turning what is basically a graph into a linear stream of text descriptions to tokenize and jam into an LLM has to be an incredibly inefficient and not very performant way of letting "AI" do magic on your circuits.
Ever try to get ChatGPT to play Scrabble? Ever try to describe the board to it and then all the letters available to you? Even its fancy-pants o1-preview performs absolutely horribly. Either my prompting completely sucks or an LLM is just the wrong tool for the job.
It's great at scoring something you just created, provided you tell it what bonuses apply to which words and letters. But it has absolutely no concept of the board at all. You cannot use it to optimize your next move based on the board and the letters.
… I mean you might if you were extremely verbose about every letter on the board and every available place to put your tiles, perhaps avoiding coordinates and instead describing each word, its neighbors and relationships to bonus squares. But that just highlights how bad a tool an LLM is for scrabble.
Anyway, I'm sure schematics are very similar. Maybe someday we will invent good machine learning models for such things, but an LLM isn't it.
There are lots of reasons to doubt the present-day ability of LLMs to help with chip design, but I don't think any of these things above are why. Chip design isn't done with schematics. If an LLM can write Python given enough training data, it can write SystemVerilog given a similar amount of training (though the world currently lacks enough high-quality open source SV to reach an equivalent level.) We can debate whether the LLM actually writes Python well. But I don't think there's a reason to expect that writing SV requires a different approach.
The main problem in making a good circuit design, and actually also in writing a good program, is not writing per se.
The main problem is an optimal decomposition of the big project into a collection of interconnected modules and in defining adequate interfaces between modules.
This is not difficult when the purpose of the project is to just take an older project and make some improvements to it, when a suitable structure is already known, but it is always the main difficulty when a really new problem must be solved.
I have yet to see any example where an LLM can be used to help, even in the slightest way, with such an exercise of "divide et impera" for something novel, where novel by definition means that the training set did not contain the solution to an identical project.
There is pretty much no relationship between the 2-dimensional or multi-dimensional structural graph of the interconnected modules, together with the descriptions of their matching interfaces, and the proximity or frequency of tokens in the description of the circuit in a hardware description language. So there is little that an LLM could use to generate any HDL program for an unknown circuit.
What an LLM could do comes only after a good designer has done the difficult job of decomposing the project into modules and defining the interfaces. When given a small module with its defined interfaces, an LLM might be able to find some boilerplate code to speed up the implementation of the module.
However, any good designer would already have templates for the boilerplate code and I can not really imagine how a LLM could do this faster than a designer who just selects the appropriate templates and pastes them into the module.
I get what you are saying. It could be a good ‘commander’ that knows how to delegate to better-suited subsystems. But it is not the only way to be intelligent by any means.
Design automation tooling startups have it incredibly hard: first, customers won't buy from startups, and second, the space of possible exits via acquisition is tiny.
The way I read that, I think they're saying hardware acceleration of specific algorithms can be 100 times faster and more efficient than the same algorithm in software on a general purpose processor, and since automated chip design has proven to be a difficult problem space, maybe we should try applying AI there so we can have a lower bar to specialized hardware accelerators for various tasks.
I do not think they mean to say that an AI would be 100 times better at designing chips than a human, I assume this is the engineering tradeoff they refer to. Though I wouldn't fault anyone for being confused, as the wording is painfully awkward and salesy.
I also think OP is missing the point saying the target applications are too small of a market to be worth pursuing.
They’re too small to pursue any single one as the market cap for a company, but presumably the fictional AI chip startup could pursue many of these smaller markets at once. It would be a long tail play, wouldn’t it?
We have a bunch of AI initiatives in my company, but most of them are about using Copilot to help write scripts to automate the design flow. Our physical design flow is thousands of lines of Tcl and Python code.
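Typical example of the kind of helper script Copilot is good at drafting (a sketch; the report format here is invented, real tool output differs):

    import re
    import sys

    # sketch: pull timing violations out of a (made-up) report format
    slacks = []
    with open(sys.argv[1]) as f:
        for line in f:
            m = re.search(r"Slack\s*\(VIOLATED\)\s*:\s*(-?\d+\.\d+)", line)
            if m:
                slacks.append(float(m.group(1)))

    if slacks:
        print(f"{len(slacks)} violations, worst slack {min(slacks):.3f}")
    else:
        print("timing clean")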
The article mentions High Level Synthesis. I've been reading about this since my first job in the 1990's. I've worked on at least 80 chips and I've never seen any chip use one of these tools except for some tiny section that was written by some academics who didn't want to learn Verilog for reasons.
I've been designing chips for 2 years. One of our very well known third-party IP vendors clearly used HLS. I say clearly, because it was almost a 1:1 translation from OO C++ code: variable names, hierarchies, polymorphism, you name it. Absolutely everything about the Verilog was the complete opposite of how a designer organizes their state machines, etc.
Anyways, their IP very clearly violated the standards of a very well known interface, which could have spelled disaster at tape-out. I had to fight tooth and nail, and spent a lot of my company's time trying to convince this third-party vendor that this was an actual issue. Only months later were they convinced. The revised code kept coming back and failing interface checks, which shows they weren't running those checks on their end. All I could think was, "this can't go well..."
I've never tried SystemC. But after having tried to learn Chisel and friends, and successfully learning Bluespec (and using it in professional projects), I have some insights.
It's fundamentally important when doing hardware design to work in a language that _expresses_ itself like you're designing hardware. Verilog (for all its faults) shines there because it feels like you're writing a slightly higher-level netlist. That's not the case with SC and friends, which don't let you think in hardware. Languages like BSV and SV are functionally similar, but they force you to think in ways similar to Verilog, meaning you can write much tighter high-level code.
I'd be interested in your experience, but I feel that using normal programming languages to build hardware is an abstraction failure. Which is why it performs so poorly.
This is a great article, but the main principle at YC is to assume that technology will continue progressing at an exponential rate and then think about what that would enable. Their proposals always assume the startups will ride some kind of Moore's Law for AI, and hardware synthesis is an obvious use case. So the assumption is that in 2 years there will be a successful AI hardware synthesis company, and all they're trying to do is get ahead of the curve.
I agree they're probably wrong but this article doesn't actually explain why they're wrong to bet on exponential progress in AI capabilities.
One of the consistent problems I'm seeing over and over again with LLMs is people forgetting that they're limited by the training data.
Software engineers get hyped when they see the progress in AI coding and immediately begin to extrapolate to other fields—if Copilot can reduce the burden of coding so much, think of all the money we can make selling a similar product to XYZ industries!
The problem with this extrapolation is that the software industry is pretty much unique in the amount of information about its inner workings that is publicly available for training on. We've spent the last 20+ years writing millions and millions of lines of code that we published on the internet, not to mention answering questions on Stack Overflow (which still has 3x as many answers as all other Stack Exchanges combined [0]), writing technical blogs, hundreds of thousands of emails in public mailing lists, and so on.
Nearly every other industry (with the possible exception of Law) produces publicly-visible output at a tiny fraction of the rate that we do. Ethics of the mass harvesting aside, it's simply not possible for an LLM to have the same skill level in ${insert industry here} as they do with software, so you can't extrapolate from Copilot to other domains.
Yes this is EXACTLY it, and I was discussing this a bit at work (financial services).
In software, we've all self taught, improved, posted Q&A all over the web. Plus all the open source code out there. Just mountains and mountains of free training data.
However software is unique in being both well paying and something with freely available, complete information online.
A lot of the rest of the world remains far more closed, almost an apprenticeship system. In my domain, things like company fundamental analysis, algo/quant trading, etc. There are lots of books you can buy from the likes of Dalio, but no real (good) step-by-step research and investment process information online.
Likewise I'd imagine heavily patented/regulated/IP industries like chip design, drug design, etc are substantially as closed. Maybe companies using an LLM on their own data internally could make something of their data, but its also quite likely there is no 'data' so much as tacit knowledge handed down over time.
Many other industries haven't yet been fully eaten by software. All kinds of data is locked away and in proprietary formats, and is generated by humans without much automation. I don't think we know where exactly the frontiers are, once someone puts in the work to build large datasets, and automates creation of synthetic training data. Whole industries could suddenly flip from 'impossible' to 'easy' for AI.
>The problem with this extrapolation is that the software industry is pretty much unique in the amount of information about its inner workings that is publicly available for training on... millions of lines of code that we published on the internet...
> Nearly every other industry (with the possible exception of Law) produces publicly-visible output at a tiny fraction of the rate that we do.
You are correct! There's lots of information available publicly about certain things like code, and writing SQL queries. But other specialized domains don't have the same kind of information trained into the heart of the model.
But importantly, this doesn't mean the LLM can't provide significant value in these other, more niche domains. They still can, and I see this every day in my day job. But it's a lot of work. We (as AI engineers) have to deeply understand the specialized domain knowledge. The basic process is this (a rough sketch of step 2 follows the list):
1. Learn how the subject matter experts do the work.
2. Teach the LLM to do this, using examples, giving it procedures, walking it through the various steps and giving it the guidance and time and space to think. (Multiple prompts, recipes if you will, loops, external memory...)
3. Evaluation, iteration, improvement
4. Scale up to production
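To make step 2 concrete, here is a minimal sketch (in Python) of the kind of recipe-driven loop I mean. The only assumption is a generic `llm` callable that maps a prompt to a completion; the function names, prompts, and steps are invented for illustration, not any particular vendor's API.

    # Minimal sketch of step 2: walking an LLM through an expert-written recipe,
    # carrying an external memory of intermediate results between prompts.
    from typing import Callable, Dict, List

    def run_recipe(llm: Callable[[str], str], task: str, steps: List[str]) -> Dict[str, str]:
        memory: Dict[str, str] = {}
        for i, step in enumerate(steps, start=1):
            prompt = (
                f"Task: {task}\n"
                f"What we know so far: {memory}\n"
                f"Step {i}: {step}\n"
                "Think it through, then answer for this step only."
            )
            memory[f"step_{i}"] = llm(prompt)   # each step builds on the last
        return memory

    if __name__ == "__main__":
        # Stand-in for a real model so the sketch runs as-is.
        fake_llm = lambda prompt: f"(answer for: {prompt.splitlines()[-2]})"
        steps = [
            "Identify which of the 5 known situation types this is.",
            "List the evidence for that classification.",
            "Recommend the action the experts prescribe for that situation.",
        ]
        print(run_recipe(fake_llm, "Review this customer account", steps))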
In many domains I work in, it can be very challenging to get past step 1. If I don't know how to do it effectively, I can't guide the LLM through the steps. Consider an example question like "what are the top 5 ways to improve my business" -- the subject matter experts often have difficulty teaching me how to do that. If they don't know how to do it, they can't teach it to me, and I can't teach it to the agent. Another example that will resonate with nerds here is being an effective Dungeons and Dragons DM. But if I actually learn how to do it, and boil it down into repeatable steps, and use GraphRAG, then it becomes another thing entirely. I know this is possible, and expect to see great things in that space, but I estimate it'll take another year or so of development to get it done.
But in many domains, I get access to subject matter experts that can tell me pretty specifically how to succeed in an area. These are the top 5 situations you will see, how you can identify which situation type it is, and what you should do when you see that you are in that kind of situation. In domains like this I can in fact make the agent do awesome work and provide value, even when the information is not in the publicly available training data for the LLM.
There's this thing about knowing a domain area well enough to do the job, but not having enough mastery to teach others how to do the job. You need domain experts that understand the job well enough to teach you how to do it, and you as the AI engineer need enough mastery over the agent to teach it how to do the job as well. Then the magic happens.
When we get AGI we can proceed past this limitation of needing to know how to do the job ourselves. Until we get AGI, then this is how we provide impact using agents.
This is why I say that even if LLM technology does not improve any more beyond where it was a year ago, we still have many years worth of untapped potential for AI. It just takes a lot of work, and most engineers today don't understand how to do that work-- principally because they're too busy saying today's technology can't do that work rather than trying to learn how to do it.
> 1. Learn how the subject matter experts do the work.
This will get harder over time, I think, as the low-hanging-fruit domains are picked - the barrier will be people, not technology. Especially if the moat for that domain/company is the very knowledge you are trying to acquire (note: for some industries that's not the moat, and using AI to shed more jobs is a win). Most industries that don't have their workings public on the internet share a couple of characteristics that will make it extremely difficult to perform step 1 on your list. The biggest is that every person on the street now knows, through the mainstream news etc., that it's not great to be a software engineer right now, and most media outlets point straight to "AI". "It sucks to be them," I've heard people say - what was once a respected profession is now "How long do you think you have? 5 years? What will you do instead?"
This creates massive resistance to - and the potential for outright lying about - providing information to AI developers: there is a precedent for what happens if you do, and it isn't good for the person/company with the knowledge. Doctors' associations, apprenticeship schemes, and industry bodies I've worked with are all now starting to care a lot more about information security and proprietary ways of working because of "AI", lest a model accidentally "train on them". It has definitely boosted the demand for cyber people again around here, as one example.
> You are correct! There's lots of information available publicly about certain things like code, and writing SQL queries. But other specialized domains don't have the same kind of information trained into the heart of the model.
The nightmare of anyone who studied and invested in a skill set, according to most people you would meet. I think most practitioners will be conscious of ensuring that the lack of data to train on stays that way for as long as possible - even if it eventually gets there, the slower it happens and the more out of date it is, the more useful the human skill/economic value of that person. How many people would have contributed to open source if they knew LLMs were coming, for example? Some may have, but I think there would have been less, all else being equal. Maybe quite a bit less code, to the point that AI would have been delayed further - tbh, if Google knew that LLMs could scale to be what they are, they wouldn't have let that "attention" paper be released either, IMO. Anecdotally, even the blue-collar workers I know are now hesitant to let anyone near their methods of working and their craft - survival, family, etc. come first. In the end, after all, work is a means to an end for most people.
Unlike us techies, who I find at times are not "rational economic actors", many non-tech professionals don't see AI as an opportunity - they see it as a threat that they need to counter. At best they think they need to adopt AI before others have it, and make sure no one else has it. "No one wants this, but if you don't do it others will and you will be left behind" is a common statement from people I've chatted to. One person likened it to a nuclear weapons arms race - not a good thing, but if you don't do it you will be under threat later.
> This will get harder over time, I think, as the low-hanging-fruit domains are picked - the barrier will be people, not technology. Especially if the moat for that domain/company is the very knowledge you are trying to acquire (note: for some industries that's not the moat, and using AI to shed more jobs is a win).
Also consider that there exist quite a lot of subject matter experts who simply are not AI fanboys - not because they fear for their jobs because of AI, but because they consider the whole AI hype insanely annoying and infuriating. To get them to work with an AI startup, you will thus have to pay them quite a lot of money.
Indeed. I'm already seeing it in software, at least anecdotally: people's willingness to post code as open source, answer Stack Overflow questions, etc. is drying up (i.e., am I working hard just to train someone else's AI?). It might be too little, too late though - there's just too much code out there. This is especially true in niche domains where the advantage isn't the generic code itself but how it is applied (e.g. finance, power, etc. - the list goes on).
After all, in a capitalist economy the last to be disrupted generally gets "all the spoils", as purchasing power (and hence prices/wages) moves from the least scarce/disrupted skills to the more scarce ones, which gives the last to be disrupted more time to accumulate wealth/assets to shield themselves from AI even further.
As a former chip designer (been 16 years, but looks like tools and our arguments about them haven't changed much), I'm both more and less optimistic than OP:
1. More because fine-tuning with enough good Verilog as data should let the LLMs do better at avoiding mediocre Verilog (existing chip companies have more of this data already, though). Plus, non-LLM tools will remain, so you can chain those tools to test that the LLM hasn't produced Verilog that synthesizes to a large area, etc. (a rough sketch of such a tool-chained check follows these two points)
2. Less because when creating more chips for more markets (if that's the interpretation of YC's RFS), the limiting factor will become the cost of using a fab (mask sets cost millions), and then integrating onto a board/system the customer will actually use. A half-solution would be if FPGAs embedded in CPUs/GPUs/SiPs on our existing devices took off.
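As a rough sketch of the tool-chaining in point 1 (Python, assuming yosys is on PATH; the "Number of cells" parsing and the area budget are my own stand-ins for whatever area/timing checks you'd actually run):

    # Gate LLM-generated Verilog on a crude area proxy from a non-LLM tool.
    import re
    import subprocess
    import tempfile

    def synthesized_cell_count(verilog_src: str) -> int:
        with tempfile.NamedTemporaryFile("w", suffix=".v", delete=False) as f:
            f.write(verilog_src)
            path = f.name
        # For multi-module files you may need to pick a top module explicitly.
        result = subprocess.run(
            ["yosys", "-p", f"read_verilog {path}; synth; stat"],
            capture_output=True, text=True, check=True,
        )
        match = re.search(r"Number of cells:\s+(\d+)", result.stdout)
        if match is None:
            raise RuntimeError("could not find a cell count in the synthesis report")
        return int(match.group(1))

    def accept_design(verilog_src: str, max_cells: int) -> bool:
        # Reject anything that synthesizes past the budget; loop back to the LLM otherwise.
        return synthesized_cell_count(verilog_src) <= max_cells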
> (quoting YC) We know there is a clear engineering trade-off: it is possible to optimize especially specialized algorithms or calculations such as cryptocurrency mining, data compression, or special-purpose encryption tasks such that the same computation would happen faster (5x to 100x), and using less energy (10x to 100x).
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers.
I may be confused, but isn’t the author fundamentally misunderstanding YC’s point? I read YC as simply pointing out the benefit of specialized compute, like GPUs, not making any point about the magnitude of improvement LLMs could achieve over humans.
I think the issue is Garry Tan's video RFS merged "LLMs for EDA" with "Purpose Built Compute" for specialized usecases. The title "LLMs for Chip Design" doesn't help either.
From my reading of the RFS (not the video) it appears they are essentially asking for the next Groq or SambaNova.
Personally, this kind of communication issue would give me long pause if I were considering YC for this segment: this is a fairly basic thesis to communicate, and if a basic thesis can get muddled, how strong can the advice be, especially compared to peer early-stage funders in this space?
I'd want to know about the results of these experiments before casting judgement either way. Generative modeling has actual applications in the 3D printing/mechanical industry.
That sounds like good work, but we can't ignore the context. Nvidia can train their own LLM's on proprietary Nvidia designs, which isn't a possibility for a random startup.
If the evaluation of the approach is "it works great if you train it on a few decades of the best designs from a successful fabless semiconductor company", I would say that if you plan to use that method as a startup, you're clearly going to fail. Nobody's going to give away their crown jewels to train an LLM that designs chips for other companies.
The problem _there_ is that there's very little diversity in the training data - it's all NVidia designs which are probably from the same phylogenetic tree. It'll probably end up regurgitating existing NV designs...
Generative models are bimodal - in certain tasks they are crazy terrible, and in certain tasks they are better than humans. The key is to recognize which is which.
And much more important:
- LLMs can suddenly become more competent when you give them the right tools, just like humans. Ever try to drive a nail without a hammer?
- Models with spatial and physical awareness are coming and will dramatically broaden what’s possible
It’s easy to get stuck on what LLMs are bad at. The art is to apply an LLM’s strengths to your specific problem, often by augmenting the LLM with the right custom tools written in regular code.
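A minimal sketch of what that augmentation can look like; the tool set, dispatch format, and names here are all invented for illustration. The model only has to choose a tool and its arguments, and regular code does the actual work.

    import json
    from typing import Callable, Dict, List

    def add_all(args: List[float]) -> float:
        return sum(args)

    def multiply_all(args: List[float]) -> float:
        out = 1.0
        for x in args:
            out *= x
        return out

    TOOLS: Dict[str, Callable[[List[float]], float]] = {
        "add_all": add_all,
        "multiply_all": multiply_all,
    }

    def answer_with_tools(llm: Callable[[str], str], question: str) -> str:
        # The model picks a tool; deterministic code computes the answer.
        plan = llm(
            f"Question: {question}\n"
            f"Available tools: {sorted(TOOLS)}\n"
            'Reply with JSON only: {"tool": <name>, "args": [<numbers>]}'
        )
        call = json.loads(plan)
        result = TOOLS[call["tool"]](call["args"])
        return f'{call["tool"]}({call["args"]}) = {result}'

    if __name__ == "__main__":
        # Stand-in for a real model so the sketch runs as-is.
        fake_llm = lambda _prompt: '{"tool": "add_all", "args": [3, 4, 5]}'
        print(answer_with_tools(fake_llm, "What is 3 + 4 + 5?"))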
I've driven a nail with a rock, a pair of pliers, a wrench, even with a concrete wall and who knows what else!
I didn't need to be told whether these could be used to drive a nail: I looked at what was available, looked for a flat surface and a good grip, considered the hardness, and then simply used them.
So if we only give them the "right" tools, they'll remain very limited by our not thinking of all the possible jobs they'll appear to know how to do but actually don't.
The problem is exactly that: they "pretend" to know how to drive a nail but not really.
No disagreement there, but if we've got the tools, do we really need an LLM to drive them (it still requires building an adapter from LLM to those tools)?
What is the added value of that combo and at what cost?
Glad to see that the author is highlighting verification as the important factor in design productivity.
We at Silogy [0] are directly targeting the problem of verification productivity using AI agents for test debugging. We analyze code (RTL, testbench, specs, etc.) along with logs and waveforms, and incorporate interactive feedback from the engineer as needed to refine the hypothesis.
They (YC) are interested in the use of LLMs to make the process of designing chips more efficient. Nowhere do they talk about LLMs actually designing chips.
I don't know anything about chip design, but like any area in tech I'm certain there are cumbersome and largely repetitive tasks that can't easily be done by algorithms but can be done with human oversight by LLMs. There's efficiency to be gained here if the designer and operator of the LLM system know what they're doing.
I agree LLMs aren't ready to design ASICs. It's likely that in a decade or less, they'll be ready for the times you absolutely need to squeeze out every square nanometer, picosecond, femtojoule, or nanowatt.
Garry Tan was right[1] in that there is a fundamental inefficiency inherent in the von Neumann architecture we're all using. This gross impedance mismatch[4] is a great opportunity for innovation.
Once ENIAC was "improved" from its original structure into a general-purpose compute device in the von Neumann style, it suffered an 83% loss in performance.[2] Everything since is 80 years of premature optimization that we need to unwind. It's the ultimate pile of technical debt.
Instead of throwing maximum effort into making specific workloads faster, why not build a chip that can make all workloads faster instead, and let economy of scale work for everyone?
I propose (and have for a while[3]) a general purpose solution.
A systolic array of simple 4-bit-in, 4-bit-out look-up tables (LUTs), latched so that timing issues are eliminated, could greatly accelerate computation in a far nearer timeframe.
The challenges are that it's a greenfield environment, with no compilers (though it's probable that LLVM could target it), and a bus number of 1.
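To make the building block a bit more concrete, here is a minimal sketch (the encoding and API are mine, purely illustrative): a latched cell that maps 4 input bits to 4 output bits through a 16-entry lookup table, so outputs only change on clock ticks.

    from typing import List

    class LutCell:
        def __init__(self, table: List[int]):
            # One 4-bit output value for each of the 16 possible input patterns.
            assert len(table) == 16 and all(0 <= v < 16 for v in table)
            self.table = table
            self.latched_output = 0  # outputs only change on a clock tick

        def step(self, inputs: int) -> int:
            """One clock tick: look up the new output and latch it."""
            self.latched_output = self.table[inputs & 0xF]
            return self.latched_output

    # Example: program one cell to count how many of its 4 inputs are high.
    popcount_table = [bin(i).count("1") for i in range(16)]
    cell = LutCell(popcount_table)
    print(cell.step(0b1011))  # -> 3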
I find it hard to imagine how you'd implement various simple functions in the bitgrid. It would be interesting if you'd present some simple hand-worked examples.
For example, how it would implement a 1-bit full adder? Like the nitty-gritty details: which input on which cell represents input A, which represents input B, and which represents carry-in? Which output is sum and which is carry-out? What are the functions programmed into each node that it uses?
I think the problem with this particular challenge is that it is incredibly non-disruptive to the status quo. There are already 100s of billions flowing into using LLMs as well as GPUs for chip design. Nvidia has of course laid the ground work with its culitho efforts. This kind of research area is very hot in the research world as well. It’s by no means difficult to pitch to a VC. So why should YC back it? I’d love to see YC identifying areas where VC dollars are not flowing. Unfortunately, the other challenges are mostly the same — govtech, civictech, defense tech. These are all areas where VC dollars are now happily flowing since companies like Anduril made it plausible.
>If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers.
I don't think he's arguing that. More that ASICs can be 100x better than CPUs for say crypto mining and that using LLM type stuff it may be possible to make them for other applications where there is less money available to hire engineers.
Reportedly, they've already hit a dead end: the newest Orion is only marginally better than the previous ChatGPT model (it's also marginally worse in some applications), and there is just no more fresh, non-AI-generated data of reasonably good quality to train on.
Even the serious idea that the article thinks could work is throwing the unreliable LLMs at verification! If there's any place you can use something that doesn't work most of the time, I guess it's there.
That’s because we are still waiting for the 2008 bubble to pop, which was re-inflated by the 2020 bubble. It’s going to be bad. People will blame Trump; Harris would be eating the same shit sandwich.
I think the operators are learning how to hype-edge. You find that sweet spot between promising and 'not just there yet' where you can take lots of investments and iterate forward just enough to keep it going.
It doesn't matter if it can't actually 'get there' as long as people still believe it can.
Come to think about it, a socioeconomic system dependent on population and economic growth is at a fundamental level driven by this balancing act: "We can solve every problem if we just forge ahead and keep enlarging the base of the pyramid - keep reproducing, keep investing, keep expanding the infrastructure".
Only if it fails in the same way. LLMs and the multi-agent approach operate under the assumption that they are programmable agents and each agent is more of a trade off against failure modes. If you can string them together, and if the output is easily verified, it can be a great fit for the problem.
If you're going to do that you need completely different LLMs to base the agents on. The ones I've tried have "mode collapse" - ask them to emulate different agents and they'll all end up behaving the same way. Simple example, if you ask it to write different stories they'll usually end up having the same character names.
It may depend on the domain. I tend to use LLMs for things that are less open ended, more categorization and summarization response than pure novel creation.
In these situations, I’ve been able to sufficiently program the agent that I haven’t seen too much of an issue as you described. Consistency is a feature.
It's similar in regular programming - LLMs are better at writing test code than actual code. Mostly because it's simpler (P vs NP etc), but I think also because it's less obvious when test code doesn't work.
Replace all asserts with expected == expected and most people won't notice.
LLMs are pretty damn useful for generating tests, getting rid of a lot of tedium, but yeah, it's the same as human-written tests: if you don't check that your test doesn't work when it shouldn't (not the same thing as just writing a second test for that case - both those tests need to fail if you intentionally screw with their separate fixtures), then you shouldn't have too much confidence in your test.
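To make the "expected == expected" trap concrete, a toy sketch (names made up): the first test passes no matter how broken the code is, and only the second one actually exercises it.

    def add(a, b):
        return a - b                  # deliberately broken implementation

    def test_vacuous():
        expected = 5
        assert expected == expected   # always true; tells you nothing

    def test_real():
        assert add(2, 3) == 5         # fails against the broken add() above

    if __name__ == "__main__":
        test_vacuous()                # passes silently
        try:
            test_real()
        except AssertionError:
            print("test_real caught the bug; test_vacuous did not")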
If LLMs can generate a test for you, it's because it's a test that you shouldn't need to write. They can't test what is really important, at all.
Some development stacks are extremely underpowered for code verification, so they do patch the design issue. Just like some stacks are underpowered for abstraction and need patching by code generation. Both of those solve an immediate problem, in a haphazard and error-prone way, by adding burden on maintenance and code evolution linearly to how much you use it.
And worse, if you rely too much on them they will lead your software architecture and make that burden superlinear.
BTW, I obviously didn't just type "make a lexer and multi-pass parser that returns multiple errors and then make a single-line instance of a Monaco editor with error reporting, type checking, syntax highlighting and tab completion".
I put it together piece-by-piece and with detailed architectural guidance.
> Replace all asserts with expected == expected and most people won't notice.
Those tests were very common back when I used to work in Ruby on Rails and automatically generating test stubs was a popular practice. These stubs were often just converted into expected == expected tests so that they passed and then left like that.
I mean, define ‘better’. Even with actual human programmers, tests which do not in fact test the thing are already a bit of an epidemic. A test which doesn’t test is worse than useless.
Once it was spices. Then poppies. Modern art. The .com craze. Those blockchain ape images. Blockchain. Now LLM.
All of these had a bit of true value and a whole load of bullshit. Eventually the bullshit disappears and the core remains, and the world goes nuts about the next thing.
Exactly. I’ve seen this enough now to appreciate that oft repeated tech adoption curve. It seems like we are in “peak expectations” phase which is immediately followed by the disillusionment and then maturity phase.
If your LLM is producing a proof that can be checked by another program, then there’s nothing wrong with their reliability. It’s just like playing a game whose rules are a logical system.
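A minimal sketch of that generate-and-check pattern: the generator can be arbitrarily unreliable because nothing is accepted until an independent checker validates it. The "proof" here is just a claimed factorization, standing in for whatever a real proof checker would consume.

    import random
    from typing import Optional, Tuple

    def check(n: int, claim: Tuple[int, int]) -> bool:
        # Cheap, trustworthy verification of the claimed "proof".
        a, b = claim
        return a > 1 and b > 1 and a * b == n

    def unreliable_generator(n: int) -> Tuple[int, int]:
        # Often wrong, occasionally right -- like an LLM on a hard problem.
        a = random.randint(2, n - 1)
        return a, n // a

    def solve(n: int, attempts: int = 10_000) -> Optional[Tuple[int, int]]:
        for _ in range(attempts):
            claim = unreliable_generator(n)
            if check(n, claim):
                return claim          # only verified answers ever escape
        return None

    print(solve(91))  # e.g. (7, 13)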
I had a discussion with a manager at a client last week and was trying to run him through some (technical) issues relating to challenges an important project faces.
His immediate response was that maybe we should just let ChatGPT help us decide the best option. I had to bite my tongue.
OTOH, I'm more and more convinced that ChatGPT will replace managers long before it replaces technical staff.
This makes complete sense from an investor’s perspective, as it increases the chances of a successful exit. While we focus on the technical merits or critique here on HN/YC, investors are playing a completely different game.
To be a bit acerbic, and inspired by Arthur C. Clarke, I might say: "Any sufficiently complex business could be indistinguishable from Theranos".
Theranos was not a "complex business". It was deliberate fraud and deception, with investors who were just gullible. The investors should have demanded to see concrete results.
I expected you to take this with a grain of salt but also to read between the lines: while some projects involve deliberate fraud, others may simply lack coherence and inadvertently follow the principles of the greater fool theory [1]. The use of ambiguous or indistinguishable language often blurs the distinction, making it harder to differentiate outright deception from an unsound business model.
Yes, that's how we progress. This is how the internet boom happened as well: everything became .com, then the real, workable businesses were left and all the unworkable things were gone.
Recently I came across someone advertising an LLM to generate a fashion magazine shoot in Pakistan at 20-25% of the cost. It hit me then that they are undercutting fashion shoots in a country like Pakistan, which are already 90-95% cheaper than in most Western countries. This AI is replacing the work of 10-20 people.
The annoying part is that a lot of money can be funneled into these unworkable businesses in the process, crypto being a good example. And these unworkable businesses tend to keep trying to get at the money somehow regardless. Most recent example was funneling money from Russia into Trump’s campaign.
> The annoying part is that a lot of money can be funneled into these unworkable businesses in the process, crypto being a good example
There was a thread here about why Y Combinator invests in several competing startups. The answer is that success is often more about connections and politics than about the product itself. And crypto, yes, is a good example of this. Musk will get his $1B in bitcoins back for sure.
> Most recent example was funneling money from Russia into Trump’s campaign.
Mostly because they were not making claims that sentient microwaves that would cook your food for you were just around the corner, which the most respected media outlets then parroted uncritically.
Fuzzy logic rice cookers are the result of an unrelated fad in 1990s Japanese engineering companies. They added fuzzy controls to everything from cameras to subways to home appliances. It's not part of the current ML fad.
I mean, they were at one point making pretty extravagant claims about microwaves, but to a less credulous audience. Trouble with LLMs is that they look like magic if you don’t look too hard, particularly to laypeople. It’s far easier to buy into a narrative that they actually _are_ magic, or will become so.
I feel like what makes this a bit different from just regular old sufficiently advanced technology is the combination of two things:
- LLMs are extremely competent at surface-level pattern matching and manipulation of the type we'd previously assumed that only AGI would be able to do.
- A large fraction of tasks (and by extension jobs) that we used to, and largely still do, consider to be "knowledge work", i.e. requiring a high level of skill and intelligence, are in fact surface-level pattern matching and manipulation.
Reconciling these facts raises some uncomfortable implications, and calling LLMs "actually intelligent" lets us avoid these.
LLMs have powered products used by hundreds of millions, maybe billions. Most experiments will fail and that's okay, arguably even a good thing. Only time will tell which ones succeed
> I knew it was bullshit from the get-go as soon as I read their definition of AI agents.
That is one spicy article, it got a few laughs out of me. I must agree 100% that Langchain is an abomination, both their APIs as well as their marketing.
I disagree with the premise of this article. Modern AI can absolutely be very useful and even disruptive when designing FPGAs. Of course, it isn't there today. That does not mean this isn't a solution whose time has come.
I have been working on FPGAs and, in general, programmable logic, for somewhere around thirty years (I started with Intel programmable logic chips like the 5C090 [0] for real-time video processing circuits).
I completely skipped over the whole High-Level Synthesis (HLS) era that tried to use C, etc. for FPGA design. I stuck with Verilog and developed custom tools to speed up my work. My logic was simple: if you try to pound a square peg into a round hole, you might get it done, but the result will be a mess.
FPGA development is hardware development. Not software. If you cannot design digital circuits to begin with, no amount of help from a C-to-Verilog tool is going to get you the kind of performance (both in terms of time and resources) that a hardware designer can squeeze out of the chip.
This is not very different from using a language like Python vs. C or C++ to write software. Python "democratizes" software development at a cost of 70x slower performance and 70x greater energy consumption. Sure, there are places where Python makes sense. I'll admit that much.
Going back to FPGA circuit design, the issue likely has to do with the type, content and approach to training. Once again, the output isn't software; the end product isn't software.
I have been looking into applying my experience in FPGA's across the entire modern AI landscape. I have a number of ideas, none well-formed enough to even begin to consider launching a startup in the sector. Before I do that I need to run through lots of experiments to understand how to approach it.
I don't know the space well enough, but I think the missing piece is that YC's investment horizon is typically 10+ years. Not only could LLMs get massively better, but the chip industry could be massively disrupted with the right incentives. My guess is that that is YC's thesis behind the ask.
This is not my domain so my knowledge is limited, but I wonder if chip designers have some sort of standard library of ready-to-use components. Do you have to design e.g. an ALU every time you design a new CPU, or is there some standard component to use? I think having proven components that can be glued together at a higher level may be the key to productivity here.
Returning to LLMs: I think the problem here may be that there is simply not enough learning material for the LLM. Verilog, compared to C, is a niche with little documentation and even less open source code. If open hardware were more popular, I think LLMs could learn to write better Verilog code. Maybe the key is to persuade hardware companies to share their closed-source code to train LLMs, for the industry's benefit?
Or learning through self-play. Chip design sounds like an area where (this would be hard!) a sufficiently powerful simulator and/or FPGA could allow reinforcement learning to work.
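As a cartoon of that loop (not reinforcement learning proper, just the search-against-a-simulator ingredient it would need), with trivial stand-ins for both the "design" and the simulator:

    import random
    from typing import List

    def simulate(design: List[int]) -> float:
        # Stand-in cost model: pretend a smaller sum means smaller/faster hardware,
        # and only even-parity designs count as functionally correct.
        if sum(design) % 2 != 0:
            return float("inf")
        return float(sum(design))

    def search(n_params: int = 8, iters: int = 5000) -> List[int]:
        best = [random.randint(0, 7) for _ in range(n_params)]
        best_cost = simulate(best)
        for _ in range(iters):
            candidate = best[:]
            candidate[random.randrange(n_params)] = random.randint(0, 7)  # small mutation
            cost = simulate(candidate)
            if cost < best_cost:      # the only feedback comes from the simulator
                best, best_cost = candidate, cost
        return best

    print(search())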
Current LLMs can’t do it, but the assumption that that’s what YC meant seems wildly premature.
The most common thing you see shared is something called IP which does mean intellectual property, but in this context you can think of it like buying ICs that you integrate into your design (ie you wire them up). You can also get Verilog, but that is usually used for verification instead of taping out the peripheral. This is because the company you buy the IP from will tape out the design for a specific node in order to guarantee the specifications. Examples of this would be everything from arm cores to uart and spi controllers as well as pretty much anything you could buy as a standalone IC.
> If an application doesn’t warrant hardware acceleration yet, it’s probably because it’s a small market, and that makes it a poor target for a startup.
But selling shovels that are useful in many small markets can still be a viable play, and that’s how I understand YC’s position here.
>YC did well because they were good at picking ideas, not generating them.
This doesn't line up with the perennial attitude (as discussed by pg) that YC picks people/teams and not ideas, because while ideas and approaches may change, the people are the same and having a good founder, co-founder and team matters the most.
Their M.O. is to avoid getting too attached to an idea because, in the process of actually building the company, pivots may be required. And so the focus is on the team more so than on a business plan - which, again, is not something pg is particularly fond of seeing, especially ones with lengthy (and therefore improbable/unrealistic) forecasts.
I worry that this post assumes LLMs won't get much better over time. This is possible, but YC bets that they will. The right time to start an LLM application layer company is arguably 6-12 months before LLMs get good enough for that purpose, so you can be ahead of the curve.
I did my PhD on trying to use ML for EDA (de novo design/topology generation, because deepmind was doing placement and I was not gonna compete with them as a single EE grad who self taught ML/optimization theory during the PhD).
In my opinion, part of the problem is that training data is scarce (real-world designs are literally called "IP" in the industry, after all...), but more than that, circuit design is basically program synthesis, which means it's _hard_. Even if you try to be clever, dealing with graphs and designing discrete objects involves many APX-hard/APX-complete problems, which is _FUN_ on the one hand, but also means it's tricky to just scale through, when the object you are trying to produce is a design that can cost millions if there's a bug...
I think this whole article is predicated on misinterpreting the ask. It wasn't for the chip to take 100x less power, it was for the algorithm the chip implements. Modern synthesis tools and optimisers extensively look for design patterns the same way software compilers do. That's why there's recommended inference patterns. I think it's not impossible to expect an LLM to expand the capture range of these patterns to maybe suboptimal HDL. As a simple example, maybe a designer got really turned around and is doing some crazy math, and the LLM can go "uh, that's just addition my guy, I'll fix that for you."
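For the software flavor of that rewrite (assuming sympy is available as a stand-in simplifier; the point is the shape of the transformation, not that an HDL flow would literally use sympy):

    import sympy as sp

    a, b = sp.symbols("a b", positive=True)
    crazy_math = (a**2 - b**2) / (a - b)   # the "crazy math" a designer might write
    print(sp.simplify(crazy_math))         # -> a + b: "that's just addition, my guy"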
Was surprised this comment was this far down. I re-read the YC ask three times to make sure I wasn’t crazy. Dude wrote the whole article based on a misunderstanding.
LLMs are wrong for most things imo. LLMs are great conversational assistants, but there is very little linguistic rigor to them, if any. They have almost no generalization ability, and anecdotally they fall for the same syntactic pitfalls they've fallen for since BERT. Models have gotten so good at predicting this n-dimensional "function" that sounds like human speech, we're getting distracted from seeing their actual purpose and trying to apply them to all sorts of problems that rely on more than text-based training data.
Language is cool and immensely useful. LLMs, however, are fundamentally flawed from their basic assumptions about how language works. The distributional hypothesis is good for paraphrasing and summarization, but pretty atrocious for real reasoning. The concept of an idea living in a semantic "space" is incompatible with simple vector spaces, and we are starting to see this actually matter in the minutiae as scaling laws come into play. Chip design is a great example of where we cannot rely on language alone to solve all our problems.
I hope to be proven wrong, but still not sold on AGI being within reach. We'll probably need some pretty significant advancements in large quantitative models, multi-modal models and smaller, composable models of all types before we see AGI
The first two paragraphs contradict my experience working with LLMs. There is definitely some form of reasoning that has emerged. Some people will still find it not convincing enough to be called reasoning, but that's just a quantitative limitation at the moment.
With respect to AGI in its broadest sense: indeed it is not in reach. I think that is for the better!
If a transformer had infinite data and parameters, I'm sure it could simulate human reasoning to a high degree. Humans don't work that way, so we may need to create a more general definition for artificial reasoning
I disagree with most of the reasoning here, and think this post misunderstands the opportunity and economic reasoning at play here.
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers.
This is very obviously not the intent of the passage the author quotes. They are clearly talking about the speedup that can be gained from ASICs for a specific workload, eg dedicated mining chips.
> High-level synthesis, or HLS, was born in 1998, when Forte Design Systems was founded
This sort of historical argument is akin to arguing “AI was bad in the 90s, look at Eliza”. So what? LLMs are orders of magnitude more capable now.
> Ultimately, while HLS makes designers more productive, it reduces the performance of the designs they make. And if you’re designing high-value chips in a crowded market, like AI accelerators, performance is one of the major metrics you’re expected to compete on.
This is the crux of the author's misunderstanding.
Here is the basic economics explanation: creating an ASIC for a specific use is normally cost-prohibitive because the cost of the inputs (chip design) is much higher than the outputs (performance gains) are worth.
If you can make ASIC design cheaper on the margin, and even if the designs are inferior to what an expert human could create, then you can unlock a lot of value. Think of all the places an ASIC could add value if the design was 10x or 100x cheaper, even if the perf gains were reduced from 100x to 10x.
The analogous argument is “LLMs make it easier for non-programmers to author web apps. The code quality is clearly worse than what a software engineer would produce but the benefits massively outweigh, as many domain experts can now author their own web apps where it wouldn’t be cost-effective to hire a software engineer.”
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers. While LLMs are capable of writing functional Verilog sometimes, their performance is still subhuman. [...] LLMs primarily pump out mediocre Verilog code.
What is the quality of Verilog code output by humans? Is it good enough so that a complex AI chip can be created? Or does the human need to use tools in order to generate this code?
I've got the feeling that LLMs will be capable of doing everything a human can do, in terms of thinking. There shouldn't be an expectation that an LLM is able to do everything, which in this context would be thinking about the chip and creating the final files in a single pass and without external help. And with external help I don't mean us humans, but tools which are specialized and also generate some additional data (like embeddings) which the LLM (or another LLM) can use in the next pass to evaluate the design. And if we humans have spent enough time in creating these additional tools, there will come a time when LLMs will also be able to create improved versions of them.
I mean, when I once randomly checked the content of a file in The Pile, I found a Craigslist "ad" for an escort offering her services. No chip-generating AI needs to have this in its parameters in order to do its job. So there is a lot of room for improvement, and this improvement will come over time. Such an LLM doesn't need to know that much about humans.
LLMs only reach the performance they do because of the sheer scale of data they ingest. Training them on less data doesn't work as well, or at least you will overfit like crazy on anything the size of current models. So the question is where are you going to get anywhere near the volume of verilog code as is present in The Pile? The total amount of verilog ever written is almost certainly a few orders of magnitude less.
If cryptocurrency mining could be significantly optimized (one of the example goals in the article) wouldn't that just destroy the value of said currency?
This heavily overlaps with my current research focus for my Ph.D., so I wanted to provide some additional perspective to the article. I have worked with Vitis HLS and other HLS tools in the past to build deep learning hardware accelerators. Currently, I am exploring deep learning for design automation and using large language models (LLMs) for hardware design, including leveraging LLMs to write HLS code. I can also offer some insight from the academic perspective.
First, I agree that the bar for HLS tools is relatively low, and they are not as good as they could be. Admittedly, there has been significant progress in the academic community to develop open-source HLS tools and integrations with existing tools like Vitis HLS to improve the HLS development workflow. Unfortunately, substantial changes are largely in the hands of companies like Xilinx, Intel, Siemens, Microchip, MathWorks (yes, even Matlab has an HLS tool), and others that produce the "big-name" HLS tools. That said, academia has not given up, and there is considerable ongoing HLS tooling research with collaborations between academia and industry. I hope that one day, some lab will say "enough is enough" and create a open-source, modular HLS compiler in Rust that is easy to extend and contribute to—but that is my personal pipe dream. However, projects like BambuHLS, Dynamatic, MLIR+CIRCT, and XLS (if Google would release more of their hardware design research and tooling) give me some hope.
When it comes to actually using HLS to build hardware designs, I usually suggest it as a first-pass solution to quickly prototype designs for accelerating domain-specific applications. It provides a prototype that is often much faster or more power-efficient than a CPU or GPU solution, which you can implement on an FPGA as proof that a new architectural change has an advantage in a given domain (genomics, high-energy physics, etc.). In this context, it is a great tool for academic researchers. I agree that companies producing cutting-edge chips are probably not using HLS for the majority of their designs. Still, HLS has its niche in FPGA and ASIC design (with Siemens's Catapult being a popular option for ASIC flows). However, the gap between an initial, naive HLS design implementation and one refined by someone with expert HLS knowledge is enormous. This gap is why many of us in academia view the claim that "HLS allows software developers to do hardware development" as somewhat moot (albeit still debatable - there is ongoing work on new DSLs and abstractions for HLS tooling which are quite slick and promising). Because of this gap, unless you have team members or grad students familiar with optimizing and rewriting designs to fully exploit HLS benefits while avoiding the tools' quirks and bugs, you won't see substantial performance gains. All that to say, I don't think it is fair to completely write off HLS as a lost cause or unsuccessful.
Regarding LLMs for Verilog generation and verification, there's an important point missing from the article that I've been considering since around 2020 when the LLM-for-chip-design trend began. A significant divide exists between the capabilities of commercial companies and academia/individuals in leveraging LLMs for hardware design. For example, Nvidia released ChipNeMo, an LLM trained on their internal data, including HDL, tool scripts, and issue/project/QA tracking. This gives Nvidia a considerable advantage over smaller models trained in academia, which have much more limited data in terms of quantity, quality, and diversity. It's frustrating to see companies like Nvidia presenting their LLM research at academic conferences without contributing back meaningful technology or data to the community. While I understand they can't share customer data and must protect their business interests, these closed research efforts and closed collaborations they have with academic groups hinder broader progress and open research. This trend isn't unique to Nvidia; other companies follow similar practices.
On a more optimistic note, there are now strong efforts within the academic community to tackle these problems independently. These efforts include creating high-quality, diverse hardware design datasets for various LLM tasks and training models to perform better on a wider range of HLS-related tasks. As mentioned in the article, there is also exciting work connecting LLMs with the tools themselves, such as using tool feedback to correct design errors and moving towards even more complex and innovative workflows. These include in-the-loop verification, hierarchical generation, and ML-based performance estimation to enable rapid iteration on designs and debugging with a human in the loop. This is one area I'm actively working on, both at the HDL and HLS levels, so I admit my bias toward this direction.
For more references on the latest research in this area, check out the proceedings from the LLM-Aided Design Workshop (now evolving into a conference, ICLAD: https://iclad.ai/), as well as the MLCAD conference (https://mlcad.org/symposium/2024/). Established EDA conferences like DAC and ICCAD have also included sessions and tracks on these topics in recent years. All of this falls within the broader scope of generative AI, which remains a smaller subset of the larger ML4EDA and deep learning for chip design community. However, LLM-aided design research is beginning to break out into its own distinct field, covering a wider range of topics such as LLM-aided design for manufacturing, quantum computing, and biology—areas that the ICLAD conference aims to expand on in future years.
The AI hype train is basically investors not understanding tech. Don't get me wrong, AI in itself could be a huge thing if used right, but the things getting the most attention in the current market aren't it.
Please don’t do this, Zach. We need to encourage more investment in the overall EDA market not less. Garry’s pitch is meant for the dreamers, we should all be supportive. It’s a big boat.
Would appreciate the collective energy being spent instead on adding to or refining Garry's request.
The article seems to be based on the current limitations of LLMs. I don't think YC and other VCs are betting on what LLMs can do today; I think they are betting on what they might be able to do in the future.
As we've seen in the recent past, it's difficult to predict what the possibilities are for LLMs and what limitations will hold. Currently it seems pure scaling won't be enough, but I don't think we've reached the limits with synthetic data and reasoning.
>The article seems to be based on the current limitations of LLMs. I don't think YC and other VCs are betting on what LLMs can do today; I think they are betting on what they might be able to do in the future.
Do we know what LLMs will be able to do in the future? And even if we know, the startups have to work with what they have now, until that future comes. The article states that there's not much to work with.
Most successful startups were able to make the thing that they wanted to make, as a startup, with existing tech. It might have a limited market that was expected to become less limited (a web app in 1996, say), but it was possible to make the thing.
This idea of “we’re a startup; we can’t actually make anything useful now, but once the tech we use becomes magic any day now we might be able to make something!” is basically a new phenomenon.
Most? I can list tens of them easily. For example what advancements were required for Slack to be successful? Or Spotify (they got more successful due to smartphones and cheaper bandwidth but the business was solid before that)? Or Shopify?
Slack bet on ubiquitous, continuous internet access. Spotify bet on bandwidth costs falling to effectively zero. Shopify bet on D2C rising because of improved search engines and increased internet shopping (itself a result of several tech trends plus demographic changes).
For a counterexample I think I’d look to non-tech companies. OrangeTheory maybe?
The notion of a startup gaining funding to develop a fantasy into reality is relatively new.
It used to be that startups would be created to do something different with existing tech or to commercialise a newly-discovered - but real - innovation.
Tomorrow, LLMs will be able to perform slightly below-average versions of whatever humans are capable of doing tomorrow. Because they work by predicting what a human would produce based on training data.
This severely discounts the fact that you’re comparing a model that _knows the average about everything_ to a single human’s capability. Also, they can do it instantly, instead of having to coordinate many humans over long periods of time. You can’t straight up compare one LLM to one human.
it seems that's sufficient to do a lot of things better than the average human - including coding, writing, creating poetry, summarizing and explaining things...
Many professions are far less digital than software, protect IP more, and are much more akin to an apprenticeship system.
2) the adaptability of humans in learning vs any AI
Think about how many years we have been trying to train cars to drive, but humans do it with a 50-hour training course.
3) humans ability to innovate vs AIs ability to replicate
A lot of creative work is adaptation, but humans do far more than that in synthesizing different ideas to create completely new works. Could an LLM produce the 37th Marvel movie? Yes probably. Could an LLM create.. Inception? Probably not.
Because YCombinator is all about r-selecting startup ideas, and making it back on a few of them generating totally outsized upside.
I think that LLMs are plateauing, but I'm less confident that this necessarily means the capabilities we're using LLMs for right now will also plateau. That is to say it's distinctly possible that all the talent and money sloshing around right now will line up a new breakthrough architecture in time to keep capabilities marching forward at a good pace.
But if I had $100 million, and could bet $200 thousand that someone can make me billions on machine learning chip design or whatever, I'd probably entertain that bet. It's a numbers game.
> But if I had $100 million, and could bet $200 thousand that someone can make me billions on machine learning chip design or whatever, I'd probably entertain that bet. It's a numbers game.
Problem with this reasoning is twofold: start-ups will overfit to getting your money instead of creating real advances; competition amongst them will drive up the investment costs. Pretty much what has been happening.
> I think they are betting on what they might be able to do in the future.
Yeah, blind hope and a bit of smoke and lighting.
> but I don't think we've reached the limits with synthetic data
Synthetic data, at least for visual stuff, can in some cases provide the majority of the training data. For $work, we can have, say, 100k synthetic video sequences to train a model, which can then be fine-tuned on, say, 2k real videos. That gets it to slightly under the quality you'd get if it had been trained on purely real video.
So I'm not that hopeful that synthetic data will provide a breakthrough.
I think the current architecture of LLMs is the limitation. They are fundamentally a sequence machine and are not capable of short- or medium-term learning. Context windows kinda make up for that, but they don't alter the starting state of the model.
LLMs have a long way to go in the world of EDA.
A few months ago I saw a post on LinkedIn where someone fed the leading LLMs a counter-intuitively drawn circuit with 3 capacitors in parallel and asked what the total capacitance was. Not a single one got it correct - not only did they say the caps were in series (they were not) it even got the series capacitance calculations wrong. I couldn’t believe they whiffed it and had to check myself and sure enough I got the same results as the author and tried all types of prompt magic to get the right answer… no dice.
I also saw an ad for an AI tool that’s designed to help you understand schematics. In its pitch to you, it’s showing what looks like a fairly generic guitar distortion pedal circuit and does manage to correctly identify a capacitor as blocking DC but failed to mention it also functions as a component in an RC high-pass filter. I chuckled when the voice over proudly claims “they didn’t even teach me this in 4 years of Electrical Engineering!” (Really? They don’t teach how capacitors block DC and how RC filters work????)
If you’re in this space you probably need to compile your own carefully curated codex and train something more specialized. The general purpose ones struggle too much.
I would expect an LLM's internal modeling to be on approximately the level of "this is a diagram of a capacitor circuit for some student's homework; electrical component calculations for homework tend to use the adding-in-reciprocal rule, because simple addition would be too straightforward for homework".
> “they didn’t even teach me this in 4 years of Electrical Engineering!” (Really? They don’t teach how capacitors block DC and how RC filters work????)
My experience with being an adult, in general, is that many people who went to university don't believe that any given course taught them anything meaningful.
I can absolutely believe that such people didn't learn and remember anything meaningful from those courses. Whether the course is to blame, is far more questionable.
>I can absolutely believe that such people didn't learn and remember anything meaningful from those courses. Whether the course is to blame, is far more questionable.
It's the same as all the people who say "Why didn't high school teach me how to balance a check book or calculate a mortgage or blah blah?"
In nearly every case, they literally did, but you weren't paying attention.
You also had to cheat off me to pass biology, so I'm going to go ahead and press X to doubt that you "understand the immune system"
We are surrounded by people who failed to invest in their own education, and instead of facing that awful reality, they INSIST that WE are the dumb ones.
It's infuriating.
I keep thinking of a science fiction scenario of being abducted by aliens and then being rescued by alien cops.
“Where are you from?”
“What’s the chemistry of your required sustenance?”
“How long is your sleep cycle as measured with physical time constants?”
And similar basic questions could not be answered by 99.9% of the human population.
Fundamentally, almost none of us can give an accurate answer to what we're made of, where we're from, or what we need to survive.
Those are interesting questions!
Where are you from is difficult given that we don't know how the alien cop's map is drawn. Third planet from a star shining at 5700K that is 8 kiloparsecs from the galactic center is only slightly more useful than lost kid and saying that their mom's name is mommy.
Chemistry of sustenance. We're carbon based and everything comes from that, but constructing a description of edible food from raw elements is going to take a lot more than drawing some hexagons with C H N and O, along with other required elements. Before we get to food and H₂O though, we'd need an atmosphere to breathe, I wouldn't presume the alien cops know to have an oxygen/nitrogen mix for humans, and not something that's poisonous for humans, like CO.
Time is something that's possible to express though. SI defined the second as a number of vibrations of a Cesium-133 atom, 8 hours of sleep is just multiplication.
Don't think anybody could describe what/where/what to an alien cop that doesn't even speak English to get themselves home or even to not die in an alien atmosphere.
I'll eat my hat if you can answer any of those with enough specificity that "random alien cop"s could produce something useful.
I'm definitely in the 99.9%, which is more like 99.9999999%. In other words, I doubt there's even 10 people on the planet that would survive that scenario.
As an educator at the academic level, the number of times I have to explain absolutely basic "everybody should have learned it in school" physics is staggering.
> I couldn’t believe they whiffed it
Why should we expect a general-purpose instruction-tuned LLM to get this right in the first place? I am not at all surprised it didn't work, and I would be more than a little surprised if it did.
> Why should we expect a general-purpose instruction-tuned LLM to get this right in the first place?
The argument goes: Language encodes knowledge, so from the vast reams of training data, the model will have encoded the fundamentals of electromagnetism. This is based in the belief that LLMs being adept at manipulating language, are therefore inchoate general intelligences, and indeed, attaining AGI is a matter of scaling parameters and/or training data on the existing LLM foundations.
Which is like saying that if you read enough textbooks you'll become an engineer/physicist/ballerina/whatever.
A huge number of people in academia believe so. The entire self-help literary genre is based upon this concept.
In reality, and with my biases as a self-taught person, experience is crucial. Learning in the field. 10,000 hours of practice. Something LLMs are not very good at. You train them a priori, and then you get a relatively static product compared to how human brains operate and self-adjust.
This could be up for debate -
https://www.scientificamerican.com/article/you-dont-need-wor...
Yeah but language sucks at encoding the locality relations that represent a 2D picture such as a circuit diagram. Language is a fundamentally 1D concept.
And I'm baffled that HN is not picking up on that and ACTUALLY BELIEVES that you can achieve AGI with a simple language model scaled to billions of parameters.
It's as futile as trying to explain vision to a blind man using "only" a few billion words. There's simply no string of words that can create a meaningful representation in the mind of the blind man.
I still have nightmares about the entry level EE class I was required to take for a CS degree.
RC circuits man.
I dropped EE entirely and switched from Computer Engineering to Computer Science because of my entry level EE course professor. I know I'm not the only person pushed away from EE due to Neil Cotter. Boggles my mind why he's still allowed to be the gateway to that discipline for so many people.
Most entry-level engineering classes (the first 3-4 semesters) in most of Europe (all kinds) are designed to gatekeep.
I graduated in chemistry, and Chemistry 1 in engineering had tests much more difficult than any other Chemistry 1 in any other faculty. After noticing that the same pattern applied to Physics 1 or Calculus, I started realizing it was an engineering thing, which an associate professor later confirmed was by design.
I asked him why, and he told me that it's a long established thing that you don't want people that struggle with science fundamentals to build bridges, ships or electrical circuits so the first semesters are very focused on this weeding.
I came into CS during a year they were trying to rework the intro class. Several of the homework assignments simply did not work. Which taught me that procrastination doesn’t just feel good, it also pays off. If I waited until three days before it was due before I even looked at it, there would be a whole thread about corrections and clarifications. Though in a couple cases they were still sorting things out and people were calling for extensions (one of which I believe we got).
And this at a top ten school for CS.
There are healthy ways to exploit an urge to procrastinate but this is just feeding the monster, and I hope the prof was ashamed of himself.
Ahh perhaps that explains why I had Stress Analysis and Material Science in the first semester of CE... they were far harder than anything in the following four years. I thought they were filler LOL. This was back in '92.
While I didn’t switch majors, I had a similar experience with my intro EE class. My theory was that it was intentionally a weeder class to push students towards the other engineering concentrations.
Intro EE is kinda brutal in that there’s a lot of theory to cover, and you need to build the intuition on how it applies to real world circuit design on the fly.
I had a bit of an epiphany when I was in a set theory/number theory class and some classmates were breezing through proofs that I struggled with. I was having to do algebraic manipulations in a way that was novel to me, but was intuitive to math nerds. I felt like that guy who didn’t “get” the intuition in an intro programming or circuits class.
But yeah, students often get some context for math or programming in high school, but rarely for circuit design. E&M in physics at best. EE programs have solved this by weeding out anyone who can’t bash their way through the foundational theory… which isn’t great.
If you’re still interested, I would recommend the Student Manual to the Art of Electronics. It’s a very practical, lab-based book that throws out a lot of the math in favor of rules of thumb and gaining intuition for circuit design.
The thing I hated most about EE 101, though, was that the diagrams predated the discovery of the electron, so all the arrows point the wrong way. AND NOBODY BOTHERED TO FIX IT. It felt like taking a racquetball class with my foot stuck in a bucket.
I studied mechatronics and did reasonably well... but in any electrical class I would just scrape by. I loved it but was apparently not suited to it. I remember a whole unit basically about transistors. On the software/mtrx side we were so happy treating MOSFETs as digital. Having to analyse them in more depth did my head in.
I had a similar experience, except Mechanical Engineering being my weakest area. Computer Science felt like a children's game compared to fluid dynamics...
Maybe this is why Sussman decided to approach understanding physics by way of programming.
They call it Thermogoddamics for a reason ...
“Oh shit I better remember all that matrix algebra I forgot already!”
…Then takes a class on anything with 3d graphics… “oh shit matrix algebra again!”
…then takes a class on machine learning “urg more matrix math!”
EEs actually had a head start on ML, especially those who took signal processing.
I remember vectors in 3D graphics but I don't recall them in EE 101. Maybe I blotted it out.
From my experience, complex exponentials were a much more important fundamental than matrices.
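For anyone who tuned that part out: a tiny numeric sanity check of the phasor machinery (a minimal sketch using Python's cmath; the 100 nF and 1 kHz values are just illustrative):

    import cmath

    # Euler's formula: e^(j*theta) = cos(theta) + j*sin(theta),
    # the identity that turns sinusoidal circuit analysis into complex arithmetic.
    theta = 0.7
    lhs = cmath.exp(1j * theta)
    rhs = cmath.cos(theta) + 1j * cmath.sin(theta)
    print(abs(lhs - rhs) < 1e-12)          # True

    # Phasor impedance of a capacitor: Z = 1 / (j*omega*C).
    omega = 2 * cmath.pi * 1e3             # 1 kHz
    C = 100e-9                             # 100 nF
    Z = 1 / (1j * omega * C)
    print(abs(Z), cmath.phase(Z))          # ~1591.5 ohms at -pi/2 (i.e. -90 degrees)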
How many words are in The Art of Electronics? Could you give that as context and see if it might help?
I don’t mind LLMs in the ideation and learning phases, which aren’t reproducible anyway. But I still find it hard to believe engineers of all people are eager to put a slow, expensive, non-deterministic black box right at the core of extremely complex systems that need to be reliable, inspectable, understandable…
You find it hard to believe that non-deterministic black boxes at the core of complex systems are eager to put non-deterministic black boxes at the core of complex systems?
Yes I do! Is that some sort of gotcha? If I can choose between having a script that queries the db and generates a report and “Dave in marketing” who “has done it for years”, I’m going to pick the script. Who wouldn’t? Until machines can reliably understand, operate and self-correct independently, I’d rather not give up debuggability and understandability.
I think this comment and the parent comment are talking about two different things. One of you is talking about using nondeterministic ML to implement the actual core logic (an automated script or asking Dave to do it manually), and one of you is talking about using it to design the logic (the equivalent of which is writing that automated script).
LLMs are not good at actually doing the processing; they are not good at math or even text processing at a character level. They often get logic wrong. But they are pretty good at looking at patterns and finding creative solutions to new inputs (or at least what can appear creative, even if philosophically it's more pattern matching than creativity). So an LLM would potentially be good at writing a first draft of that script, which Dave could then proofread/edit, and which a standard deterministic computer could just run verbatim to actually do the processing. Eventually maybe even Dave's proofreading would be superfluous.
Tying this back to the original article, I don't think anyone is proposing having an LLM inside a chip that processes incoming data in a non-deterministic way. The article is about using AI to design the chips in the first place. But the chips would still be deterministic, the equivalent of the script in this analogy. There are plenty of arguments to make about LLMs not being good enough for that, not being able to follow the logic or optimize it, or come up with novel architectures. But chip design/Verilog feels like something that, with enough effort, an AI could likely be built to be pretty good at. All of the knowledge that those smart, knowledgeable engineers who are good at writing Verilog have built up can almost certainly be represented in some AI form, and I wouldn't bet against AI getting to a point where it can be helpful similarly to how Copilot currently is with code completion. Maybe not perfect anytime soon, but good enough that we could eventually see a path to 100%. It doesn't feel like there's a fundamental reason this is impossible on a long enough time scale.
> So an LLM would potentially be good at writing a first draft of that script, which Dave could then proofread/edit
Right, and there’s nothing fundamentally wrong with this, nor is it a novel method. We’ve been joking about copying code from stack overflow for ages, but at least we didn’t pretend that it’s the peak of human achievement. Ask a teacher the difference between writing an essay and proofreading it.
Look, my entire claim from the beginning is that understanding is important (epistemologically, it may be what separates engineering from alchemy, but I digress). Practically speaking, if we see larger and larger pieces of LLM written code, it will be similar to Dave and his incomprehensible VBA script. It works, but nobody knows why. Don’t get me wrong, this isn’t new at all. It’s an ever-present wet blanket that slowly suffocates engineering ventures who don’t pay attention and actively resist. In that context, uncritically inviting a second wave of monkeys to the nuclear control panels, that’s what baffles me.
> We’ve been joking about copying code from stack overflow for ages
Tangent for a slight pet peeve of mine:
"We" did joke about this, but probably because most of our jobs are not in chip design. "We" also know the limits of this approach.
The fact that Stack Overflow is the most SEO optimised result for "how to center div" (which we always forget how to do) doesn't have any bearing on the times when we have an actual problem requiring our attention and intellect. Say diagnosing a performance issue, negotiating requirements and how they subtly differ in an edge case from the current system behaviour, discovering a shared abstraction in 4 pieces of code that are nearly but not quite the same.
I agree with your posts here, the Stack Overflow thing in general is just a small hobby horse I have.
Also the Stack Overflow thing has more to do with all of us being generalists, not incompetent.
I look up "how do I sort a list in language X" because I know from school that there IS a defined good way to do it, probably built into the language, and it will be extremely idiomatic, but I haven't used language X in five years and the specifics might have changed and I don't remember the specific punctuation.
> So an LLM would potentially be good at writing a first draft of that script, which Dave could then proofread/edit, and which a standard deterministic computer could just run verbatim to actually do the processing
Or Dave could write a first draft of that script, saving him the time needed to translate what the LLM composed.
>If I can choose between having a script that queries the db and generates a report and “Dave in marketing” who “has done it for years”
If you could that would be nice wouldn't it? And if you couldn't?
If people were saying, "let's replace Casio Calculators with interfaces to GPT," then that would be crazy and I would wholly agree with you, but by and large, the processes people are scrambling to place LLMs in are ones where typical machines struggle or fail and humans excel or do decently (and where LLMs are making some headway).
You're making the wrong distinction here. It's not Dave vs your nifty script. It's Dave or nothing at all.
There's no point comparing LLM performance to some hypothetical perfect understanding machine that doesn't exist.
You compare it to the things it's meant to replace - humans. How well can the LLM do this compared to Dave?
> by and large, the processes people are scrambling to place LLMs in are ones where typical machines struggle or fail
I'm pretty sure they are scrambling to put them absolutely anywhere it might save or make a buck (or convince an investor that it could)
100%, and a lot of them are truly terrible use cases for LLMs.
For example, using an LLM to transform structured data into JSON, and doing it with two LLMs in parallel to try to catch the inevitable failures, instead of just writing code that outputs JSON.
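For contrast, the boring deterministic version is a handful of lines (a toy example with made-up fields, not anyone's actual pipeline):

    import json

    # Deterministic structured-data -> JSON: no model, no sampling, and the only
    # failure modes are ordinary exceptions you can catch and handle.
    rows = [
        {"sku": "A-100", "qty": 3, "price": 9.99},
        {"sku": "B-205", "qty": 1, "price": 24.50},
    ]
    print(json.dumps(rows, indent=2, sort_keys=True))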
Your example does not make much sense (in response to OP). That's not saving anybody any money.
If your task was being solved well by a deterministic script/algorithm, you are not going to save money porting to LLMs even if you use Open Source models.
'could' is doing a whole lot of work in that sentence; I'm being charitable. The reality is LLMs are being crammed into places where it isn't very sensible, under thin justifications, just like the last few big ideas were (cf. blockchain).
If it can't be solved by a script, then what's the problem with seeing if you can use LLMs?
I guess I just don't see your point. So a few purported applications are not very sensible. So what? This is every breakthrough ever.
One great thing about humans is that we have developed ways to be deterministic when we want to. That’s what math is for.
Does an LLM know math? Not like we do. There’s no deductive logic in there; it’s all statistical inferences from language. An LLM doesn’t “work through” a circuit diagram systematically the way a physics student would. It observes the entire diagram at once, and then guesses the most likely next token.
I'm a non-deterministic black box who teaches complex deterministic machines to do stuff and leverages other deterministic machines as tools to do the job.
I like my job.
My job also involves cooperating with other non-deterministic black boxes (colleagues).
I can totally see how artificial non-deterministic black boxes (artificial colleagues) may be useful to replace/augment the biological ones.
For one, artificial colleagues don't get tired and I don't accidentally hurt their feelings or whatnot.
In any case, I'm not looking forward to replacing my deterministic tools with the fuzzy AI stuff.
Intuitively at least it seems to me that these non-deterministic black boxes could really benefit from using the deterministic tools for pretty much the same reasons we do as well.
Can you actually like follow through with this line? I know there are literally tens of thousands of comments just like this at this point, but if you have chance, could you explain what you think this means? What should we take from it? Just unpack it a little bit for us.
An interpretation that makes sense to me: humans are non-deterministic black boxes already at the core of complex systems. So in that sense, replacing a human with AI is not unreasonable.
I’d disagree, though: humans are still easier to predict and understand (and trust) than AI, typically.
With humans we have a decent understanding of what they are capable of. I trust a medical professional to provide me with medical advice and an engineer to provide me with engineering advice. With LLMs, they can be unpredictable at times, and they can make errors in ways that you would not imagine. Take the following examples from my tool, which show how GPT-4o and Claude 3.5 Sonnet can screw up.
In this example, GPT-4o cannot tell that GitHub is spelled correctly:
https://app.gitsense.com/?doc=6c9bada92&model=GPT-4o&samples...
In this example, Claude cannot tell that GitHub is spelled correctly:
https://app.gitsense.com/?doc=905f4a9af74c25f&model=Claude+3...
I still believe LLM is a game changer and I'm currently working on what I call a "Yes/No" tool which I believe will make trusting LLMs a lot easier (for certain things of course). The basic idea is the "Yes/No" tool will let you combine models, samples and prompts to come to a Yes or No answer.
Based on what I've seen so far, a model can easily screw up, but it is unlikely that all will screw up at the same time.
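Roughly the shape of the "Yes/No" idea (a minimal sketch; ask_model is a hypothetical stand-in for whatever client each model needs, and majority voting is just one possible rule):

    from collections import Counter

    def ask_model(model: str, prompt: str) -> str:
        # Hypothetical adapter: send the prompt to 'model', return "yes" or "no".
        raise NotImplementedError  # wire up the actual model clients here

    def yes_no(prompt: str, models: list[str], samples_per_model: int = 3) -> str:
        # Ask several models the same yes/no question several times and take a
        # majority vote: any one model can screw up, but it is less likely that
        # all of them screw up on the same question at the same time.
        votes = Counter()
        for model in models:
            for _ in range(samples_per_model):
                answer = ask_model(model, prompt).strip().lower()
                if answer in ("yes", "no"):
                    votes[answer] += 1
        return votes.most_common(1)[0][0] if votes else "undecided"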
It's actually a great topic - both humans and LLMs are black boxes. And both rely on patterns and abstractions that are leaky. And in the end it's a matter of trust, like going to the doctor.
But we have had extensive experience with humans, so it is normal to have better-defined trust; LLMs will be better understood as well. There is no central understander or truth, and that is the interesting part: it's a "Blind men and the elephant" situation.
We are entering the nondeterministic programming era in my opinion. LLM applications will be designed with the idea that we can't be 100% sure, and whatever solution can provide the most safeguards will probably be the winner.
Because people are not saying "let's replace Casio Calculators with interfaces to GPT!"
By and large, the processes people are scrambling to place LLMs in are ones where typical machines struggle or fail and humans excel or do decently (and where LLMs are making some headway).
There's no point comparing LLM performance to some hypothetical perfect understanding machine that doesn't exist. It's nonsensical actually. You compare it to the performance of the beings it's meant to replace or augment - humans.
Replacing non-deterministic black boxes with potentially better performing non-deterministic black boxes is not some crazy idea.
Sure. I mean, humans are very good at building businesses and technologies that are resilient to human fallibility. So when we think of applications where LLMs might replace or augment humans, it’s unsurprising that their fallible nature isn’t a showstopper.
Sure, EDA tools are deterministic, but the humans who apply them are not. Introducing LLMs to these processes is not some radical and scary departure, it’s an iterative evolution.
Ok yeah. I think the thing that trips me up with this argument then is just, yes, when you regard humans in a certain neuroscientific frame and consider things like consciousness or language or will, they are fundamentally nondeterministic. But that isn't the frame of mind of the human engineer who does the work or even validates it. When the engineer is working, they aren't seeing themselves as some black box which they must feed input and get output, they are thinking about the things in themselves, justifying to themselves and others their work. Just because you can place yourself in some hypothetical third person here, one that oversees the model and the human and says "huh yeah they are pretty much the same, huh?", doesn't actually tell us anything about what's happening on the ground in either case, if you will. At the very least, this same logic would imply fallibility is one-dimensional and always statistical; "the patient may be dead, but at least they got a new heart." Like isn't it important to be in love, not just be married? To borrow some Kant, shouldn't we still value what we can do when we think as if we aren't just some organic black box machines? Is there even a question there? How could it be otherwise?
It's really just that the "in principle" part of the overall implication with your comment and so many others just doesn't make sense. It's very much cutting off your nose to spite your face. How could science itself be possible, much less engineering, if this is how we decided things? If we regarded ourselves always from the outside? How could we even be motivated to debate whether we get the computers to design their own chips? When would something actually happen? At some point, people do have ideas, in a full, if false, transparency to themselves, that they can write down and share and explain. This is not only the thing that has gotten us this far, it is the very essence of why these models are so impressive in the certain ways that they are. It doesn't make sense to argue for the fundamental cheapness of the very thing you are ultimately trying to defend. And it imposes this strange perspective where we are not even living inside our own (phenomenal) minds anymore, that it fundamentally never matters what we think, no matter our justification. It's weird!
I'm sure you have a lot of good points and stuff, I just am simply pointing out that this particular argument is maybe not the strongest.
We start from similar places but get to very different conclusions.
I accept that I’m fallible, both in my areas of expertise and in all the meta stuff around it. I code bugs. I omit requirements. Not often, and there are mental and technical means to minimize, but my work, my org’s structure, my company’s processes are all designed to mitigate human fallibility.
I’m not interested in “defending” AI models. I’m just saying that their weaknesses are qualitatively similar to human weaknesses, and as such, we are already prepared to deal with those weaknesses as long as we are aware of them, and as long as we don’t make the mistake of thinking that because they use transistors they should be treated like a mostly deterministic piece of software where one unit test pass means it is good.
I think you’re reading some kind of value judgement on consciousness into what is really just a pragmatic approach to slotting powerful but imperfect agents into complex systems. It seems obvious to me, and without any implications as to human agency.
I took it to be a joke that the description "slow, expensive, non-deterministic black boxes" can apply to the engineers themselves. The engineers would be the ones who would have to place LLMs at the core of the system. To anyone outside, the work of the engineers is as opaque as the operation of LLMs.
>You find it hard to believe that non-deterministic black boxes at the core of complex systems are eager to put non-deterministic black boxes at the core of complex systems?
Hello, fellow tech enthusiasts, just stopping by to announce I performatively can't tell the difference between "Latest big tech product (TM)" and Homo Sapiens Sapiens!!!
I'll be seeing you in the next LLM related message thread with the same exact comment!!! As you were!!!
Yes. One does not have to do with the other.
In a reductive sense, this passage might as well read "You find it hard to believe that entropy is the source of other entropic reactions?"
No, I'm just disappointed in the decision of Black Box A and am bound to be even more disappointed by Black Box B. If we continue removing thoughtful design from our systems because thoughtlessness is the default, nobody's life will improve.
I think I've come to terms with it: engineering and making money from engineering are two completely unrelated things; the latter doesn't even need technology (though scamming is unethical).
Zero dollars isn't cool. You know what is? Hundreds of billions of dollars. (quote rescaled for engineering wealth and LLM wealth)
100% agree. While I can’t find all the sources right now, [1] and its references could be a good starting point for further exploration. I recall there being a proof or conjecture suggesting that it’s impossible to build an "LLM firewall" capable of protecting against all possible prompts—though my memory might be failing me.
[1] https://arxiv.org/abs/2410.07283
A human is itself like a slow, expensive, non-deterministic black box...
LLMs can be fully deterministic BTW, depending on the sampling method used. Some methods do not have a random component. As to the rest, yeah - they aren't inspectable or understandable yet.
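For example, greedy decoding is fully deterministic: take the argmax at every step instead of sampling (a toy sketch; logits_fn is a hypothetical stand-in for a model's forward pass, not any particular library's API):

    def greedy_decode(logits_fn, prompt_tokens, max_new_tokens=32, eos_id=None):
        # Always pick the highest-scoring token: same prompt in, same tokens out,
        # as long as logits_fn itself is deterministic.
        tokens = list(prompt_tokens)
        for _ in range(max_new_tokens):
            logits = logits_fn(tokens)  # list of per-token scores for the next position
            next_id = max(range(len(logits)), key=lambda i: logits[i])  # argmax, no randomness
            tokens.append(next_id)
            if next_id == eos_id:
                break
        return tokens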
You mean, like humans have been for many decades now.
Edit: I believe that LLMs are eminently useful to replace experts (of all people) 90% of the time.
> Edit: I believe that LLMs are eminently useful to replace experts (of all people) 90% of the time.
What do you mean by "expert"?
Do you mean the pundit who goes on TV and says "this policy will be bad for the economy"?
Or do you mean the seasoned developer who you hire to fix your memory leaks? To make your service fast? Or cut your cloud bill from 10M a year to 1M a year?
Experts of the kind that will be able to talk for hours about the academic consensus on the status quo without once considering how the question at hand might challenge it? Quite likely.
Experts capable of critical thinking and reflecting on evidence that contradicts their world model (and thereby retraining it on the fly)? Most likely not, at least not in their current architecture with all its limitations.
Change "replace" to "supplement" and I agree. The level of non-determinism is just too great at this stage, imo.
People believed that about expert systems in the 1980s as well.
I don't know if they "eminently" anything at the moment, that's why you feel the need to make the comment, right?
Anything that requires deep “understanding” or novel invention is not a job for a statistical word regurgitator. I’ve yet to see a single example, in any field, of an LLM actually inventing something truly novel (as judged by the experts in that space). Where LLMs shine is in producing boilerplate -- though that is super useful. So far I have yet to see anything resembling an original “thought” from an LLM (and I use AI at work every day).
There are many LLMs that are producing original "thought".
ESM3: https://www.evolutionaryscale.ai/blog/esm3-release
AlphaProof/AlphaGeometry2: https://deepmind.google/discover/blog/ai-solves-imo-problems...
MatPilot discovering new materials: https://arxiv.org/abs/2411.08063
Then of course NVidia Omniverse with their digital-twin learning.
https://blog.google/technology/ai/google-ai-big-scientific-b...
Taking a quick glance at all of these, they seem to be aspirational or a “brute force” type of search, which computers have always been good at, before AI. Does not seem like any novel research to me. The parameters and methods are set by humans and these systems search within a well defined space.
Experiment: you think LLMs can innovate on chip design? Ask one to do something much simpler: invent a new, better sorting algorithm. We use names such as Timsort or Dijkstra for a specific reason: because it requires rare human ingenuity to invent such things. If an LLM can't invent a new sorting algorithm that is meaningfully better in some way than existing known algorithms, then good luck on something much harder like chip design.
You can set the bar lower. Have it invent another n log n sorting algorithm. Or omit all merge sort implementations from training data and see if it can re-invent it.
But I certainly agree in general. It’s been years and there are still no independent novel discoveries afaik.
As long as the chip isn’t expected to count the number of Rs in strawberry, I don’t see why an LLM couldn’t design a better chip.
Define novel
YC doesn't care whether it "makes sense" to use an LLM to design chips. They're as technically incompetent as any other VC, and their only interest is to pump out dogshit startups in the hopes they get acquired. Garry Tan doesn't care about "making better chips": he cares about finding a sucker to buy out a shitty, hype-based company for a few billion. An old-school investment bank would be perfect.
YC is technically incompetent and isn't about making the world better. Every single one of their words is a lie and hides the real intent: make money.
Not how I would word it, but yeah: any VC today is going to pump AI knowing it's the wrong tool, so the more complex they make the application space, the easier it is to find the proverbial sucker.
First, VCs don't get paid when "dogshit startups" get acquired, they get paid when they have true outlier successes. It's the only way to reliably make money in the VC business.
Second, want to give any examples of "shitty, hype-based compan[ies]" (I assume you mean companies with no real revenue traction) getting bought out for "a few billion"?
Third, investment banks facilitate sales of assets, they don't buy them themselves.
Maybe sit out the conversation if you don't even know the basics of how VC, startups, or banking work?
> First, VCs don't get paid when "dogshit startups" get acquired
https://www.reuters.com/article/business/peloton-raises-12-b...
That’s an article about Peloton’s IPO.
I worked on the Qualcomm DSP architecture team for a year, so I have a little experience with this area but not a ton.
The author here is missing a few important things about chip design. Most of the time spent and work done is not writing high-performance Verilog. Designers spend a huge amount of time answering questions, writing documentation, copying around boilerplate, reading obscure manuals and diagrams, etc. LLMs can already help with all of those things.
I believe that LLMs in their current state could help design teams move at least twice as fast, and better tools could probably change that number to 4x or 10x even with no improvement in the intelligence of models. Most of the benefit would come from allowing designers to run more experiments and try more things, to get feedback on design choices faster, to spend less time documenting and communicating, and spend less time reading poorly written documentation.
Author here -- I don't disagree! I actually noted this in the article:
> Well, it turns out that LLMs are also pretty valuable when it comes to chips for lucrative markets -- but they won’t be doing most of the design work. LLM copilots for Verilog are, at best, mediocre. But leveraging an LLM to write small snippets of simple code can still save engineers time, and ultimately save their employers money.
I think designers getting 2x faster is probably optimistic, but I also could be wrong about that! Most of my chip design experience has been at smaller companies, with good documentation, where I've been focused on datapath architecture & design, so maybe I'm underestimating how much boilerplate the average engineer deals with.
Regardless, I don't think LLMs will be designing high-performance datapath or networking Verilog anytime soon.
Thanks for the reply!
At large companies with many designers, a lot of time is spent coordinating and planning. LLMs can already help with that.
As far as design/copilot goes, I think there are reasons to be much more optimistic. Existing models haven't seen much Verilog. With better training data it's reasonable to expect that they will improve to perform at least as well on Verilog as they do on python. But even if there is a 10% chance it's reasonable for VCs to invest in these companies.
> With better training data it's reasonable to expect that they will improve to perform at least as well on Verilog as they do on python.
There simply isn't enough of that code in existence.
Writing Verilog code is about mapping the constructs onto your theory of mind about the underlying hardware. If that were easy, so many engineers wouldn't have so much trouble writing Verilog code that doesn't have faults. You can't write Verilog code just by pasting together Stack Overflow snippets.
Look at the confusion that happens when programmers take their "for-loop" understanding into the world of GPU shaders or HDLs (hardware description languages) where "for-loops" map to hardware and suddenly are both finite and fixed. LLMs exhibit the exact same confusion--only worse.
I’m actually curious if there even is a large enough corpus of Verilog out there. I have noticed that even tools like Copilot tend to perform poorly when working with DSLs that are majority open source code (on GitHub no less!) where the practical application is niche. To put this in other terms, Copilot appears to _specialize_ on languages, libraries and design patterns that have wide adoption, but does not appear to be able to _generalize_ well to previously unseen or rarely seen languages, libraries, or design patterns.
Anyway that’s largely anecdata/sample size of 1, and it could very well be a case of me holding the tool wrong, but that’s what I observed.
I agree with most of the technical points of the article.
But there may still be value in YC calling for innovation in that space. The article is correctly showing that there is no easy win in applying LLMs to chip design. Either the market for a given application is too small, in which case LLMs can help but who cares, or the chip is too important, in which case you'd rather use the best engineers. Unlike software, we're not getting much of a long tail effect in chip design. Taping out a chip is just not something a hacker can do, and even playing with an FPGA has a high cost of entry compared to hacking on your PC.
But if there was an obvious path forward, YC wouldn't need to ask for an innovative approach.
You could say it is the naive arrogance of the beginner's mind.
Seen here as well when George Hotz attempted to overthrow the chip companies with his plan for an AI chip https://geohot.github.io/blog/jekyll/update/2021/06/13/a-bre... little realizing the complexity involved. To his credit, he quickly pivoted into being a software and tinybox maker.
> But if there was an obvious path forward, YC wouldn't need to ask for an innovative approach.
How many experts do YC have on chip design?
I know several founders who went through YC in the chip design space, so even if the people running YC don't have a chip design background, just like VCs, they learn from hearing pitches of the founders who actually know the space.
There is an obvious path forward, but apparently this is a minority opinion, possibly fringe. It doesn't make the traditional tradeoffs.
A bit level (non von Neumann) general purpose systolic array could greatly speed up AI computations, along with almost everything else. It's a chip to do general purpose computation.
The chip design is almost trivial. I'd expect someone with a few years of experience could knock it out in a few days. I hope to field a design in the next TinyTapeout (I'm on a fixed income, so I've had to wait a while)
The real problem is programming. We're talking vast greenfields that go on forever. There's no good way to target the architecture, you certainly wouldn't want to use Verilog or any other HDL.
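For readers who haven't met the term: below is the classic word-level, output-stationary systolic matrix multiply as a toy simulation. It only illustrates the data movement; the bit-level, general-purpose variant described above is a different and more ambitious design.

    def systolic_matmul(A, B):
        # Output-stationary systolic array computing C = A @ B for square N x N
        # matrices: A-values stream left-to-right, B-values stream top-to-bottom,
        # and each cell does one multiply-accumulate per cycle on what arrives.
        N = len(A)
        acc = [[0] * N for _ in range(N)]      # per-cell accumulators (stay put)
        a_reg = [[0] * N for _ in range(N)]    # value each cell forwards rightward
        b_reg = [[0] * N for _ in range(N)]    # value each cell forwards downward

        for t in range(3 * N - 2):             # enough cycles to drain the skewed streams
            prev_a = [row[:] for row in a_reg]
            prev_b = [row[:] for row in b_reg]
            for i in range(N):
                for j in range(N):
                    # Left and top edges are fed skewed by i and j cycles respectively.
                    if j == 0:
                        a_in = A[i][t - i] if 0 <= t - i < N else 0
                    else:
                        a_in = prev_a[i][j - 1]
                    if i == 0:
                        b_in = B[t - j][j] if 0 <= t - j < N else 0
                    else:
                        b_in = prev_b[i - 1][j]
                    acc[i][j] += a_in * b_in
                    a_reg[i][j], b_reg[i][j] = a_in, b_in
        return acc

    print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]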
> But if there was an obvious path forward
Even obvious can be risky. First it's nice to share the risk, second more investments come with more connections.
As for the LLM boom: I think we'll finally realize that an LLM plus tools can do much more than an LLM alone. "Algorithms" is probably a bad word here; I mean assisting tools like databases, algorithms, other models. Then only the access API needs to be trained into the LLM instead of the whole dataset, for example.
I know nothing about chip design. But saying "Applying AI to field X won't work, because X is complex, and LLMs currently have subhuman performance at this" always sounds dubious.
VCs are not investing in the current LLM-based systems to improve X, they're investing in a future where LLM based systems will be 100x more performant.
Writing is complex, LLMs once had subhuman performance, and yet. Digital art. Music (see suno.AI). There is a pattern here.
I didn't get into this in the article, but one of the major challenges with achieving superhuman performance on Verilog is the lack of high-quality training data. Most professional-quality Verilog is closed source, so LLMs are generally much worse at writing Verilog than, say, Python. And even still, LLMs are pretty bad at Python!
That’s what your VC investment would be buying; the model of “pay experts to create a private training set for fine tuning” is an obvious new business model that is probably under-appreciated.
If that’s the biggest gap, then YC is correct that it’s a good area for a startup to tackle.
It would be hard to find any experts that could be paid "to create a private training set for fine tuning".
The reason is that those experts do not own the code that they have written.
The code is owned by big companies like NVIDIA, AMD, Intel, Samsung and so on.
It is unlikely that these companies would be willing to provide the code for training, except for some custom LLM to be used internally by them, in which case the amount of code that they could provide for training might not be very impressive.
Even a designer who works in those companies may have great difficulty seeing significant quantities of archived Verilog/VHDL code, though it can be hoped that it still exists somewhere.
When I say “pay to create” I generally mean authoring new material, distilling your career’s expertise.
Not my field of expertise but there seem to be experts founding startups etc in the ASIC space, and Bitcoin miners were designed and built without any of the big companies participating. So I’m not following why we need Intel to be involved.
An obvious way to set up the flywheel here is to hire experts to do professional services or consulting on customer-submitted designs while you build up your corpus. While I said “fine-tuning”, there is probably a lot of agent scaffolding to be built too, which disproportionately helps bigger companies with more work throughput. (You can also acquire a company with the expertise and tooling, as Apple did with PA Semi in ~2008, though obviously $100m order of magnitude is out of reach for a startup. https://www.forbes.com/2008/04/23/apple-buys-pasemi-tech-ebi...)
I doubt any real expert would be tempted by an offer to author new material, because that cannot be done in a good way.
One could author some projects that can be implemented in FPGAs, but those do not provide good training material for generating code that could be used to implement a project in an ASIC, because the constraints of the design are very different.
Designing an ASIC is a year-long process and it is never completed before testing some prototypes, whose manufacture may cost millions. Authoring some Verilog or VHDL code for an imaginary product that cannot be tested on real hardware prototypes could result only in garbage training material, like the code of a program that has never been tested to see if it actually works as intended.
Learning to design an ASIC is not very difficult for a human, because a human does not need a huge number of examples, like ML/AI. Humans learn the rules and a few examples are enough for them. I have worked in a few companies at designing ASICs. While those companies had some internal training courses for their designers, those courses only taught their design methodologies, but with practically no code examples from older projects, so very unlikely to how a LLM would have to be trained.
I would imagine it is a reasonably straightforward thing to create a simulator that generates arbitrary chip designs and the corresponding verilog that can be used as training data. It would be much like how AlphaFold was trained. The chip designs don't need to be good, or even useful, they just need to be valid so the LLM can learn the underlying relationships.
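In the narrow sense of "syntactically valid, simulatable combinational logic", such a generator really is a few lines (a toy sketch; whether designs like this teach a model anything useful is exactly what the replies dispute):

    import random

    OPS = {"and": "&", "or": "|", "xor": "^"}  # 2-input combinational gates

    def random_combinational_module(name="rand_mod", n_inputs=4, n_gates=8, seed=0):
        # Emit a random but syntactically valid combinational Verilog module:
        # every internal wire is a 2-input gate over previously defined signals,
        # so the result is acyclic by construction and always elaborates.
        # "Valid" does not mean good, useful, or representative of real chips.
        rng = random.Random(seed)
        inputs = [f"in{i}" for i in range(n_inputs)]
        signals = list(inputs)
        ports = ", ".join(f"input wire {p}" for p in inputs) + ", output wire out"
        lines = [f"module {name} ({ports});"]
        for g in range(n_gates):
            a, b = rng.choice(signals), rng.choice(signals)
            op = rng.choice(list(OPS))
            lines.append(f"  wire w{g} = {a} {OPS[op]} {b};")
            signals.append(f"w{g}")
        lines.append(f"  assign out = {signals[-1]};")
        lines.append("endmodule")
        return "\n".join(lines)

    print(random_combinational_module())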
I have never heard of any company, no matter how big and experienced, where it is possible to decide that an ASIC design is valid by any other means except by paying for a set of masks to be made and for some prototypes to be manufactured, then tested in the lab.
This validation costs millions, which is why it is hard to enter this field, even as a fabless designer.
Many design errors are not caught even during hardware testing, but only after mass production, like the ugly MONITOR/MWAIT bug of Intel Lunar Lake.
Randomly-generated HDL code, even if it does not have syntax errors, and even if some testbench for it does not identify deviations from its specification, is not more likely to be valid when implemented in hardware, than the proverbial output of a typewriting monkey.
Validating an arbitrary design is hard. It's equivalent to the halting problem. Working backwards using specific rules that guarantee validity is much easier. Again, the point is not to produce useful designs. The generated model doesn't need to be perfect, indeed it can't be, it just needs to be able to avoid the same issues that humans are looking for.
I know just enough about chips to be suspicious of "valid". The right solution for a chip at the HDL layer depends on your fab, the process you're targeting, what % of physical space on the chip you want it to take up, and how much you're willing to put into power optimization.
The goal is not to produce the right, or even a good solution. The point is to create a large library of highly variable solutions so the trained model can pick up on underlying patterns. You want it to spit out lots of crap.
That's probably where there's a big advantage to being a company like Nvidia, which has both the proprietary chip design knowledge/data and the resources/money and AI/LLM expertise to work on something specialized like this.
I strongly doubt this - they don't have enough training data either - you are confusing (I think) the scale of their success with the amount of Verilog they possess.
I.e., I think you are both wildly underestimating the scale of training data needed and wildly overestimating the amount of Verilog code possessed by Nvidia.
GPUs work by having moderate-complexity cores (in the scheme of things) that are replicated 8000 times or whatever. That does not require having 8000 times as much useful Verilog, of course.
The folks who have 8000 different chips, or 100 chips that each do 1000 things, would probably have orders of magnitude more Verilog to use for training.
AI still has subhuman performance for art. It feels like the venn diagram of people who are bullish on LLMs and people who don't understand logistic curves is a circle.
If you ask 100,000 humans each to make a photorealistic rendering of an alpaca playing basketball on the moon in 90 seconds, an LLM is going to outperform every single one of them.
Diffusion models aren't actually LLMs, they're a different architecture. Which makes it even weirder we invented them at the same time.
Also, they might not be able to do it. eg most models can't generate "horse riding an astronaut" or "upside-down car".
To be fair, most humans can't draw any better than stick figures.
This is true, but humans are much better at including specified elements in an image with specified spatial relationships. A description like a "A porpoise seated at a desk writing a letter" will reliably produce (terrible) drawings consisting of parts corresponding to the porpoise, parts corresponding to the desk, and parts corresponding to the letter, with the arrangement of the parts roughly corresponding to the description.
Humans being better at one specific aspect of a task is not equivalent to humans being overall better at the task.
I just entered your prompt into an AI image generator and in under a second it gave me an image[0] of what looks to me like an anthropomorphic dolphin sitting at a desk writing a letter in a little study. I then had to google what the difference between a porpoise and a dolphin was because I genuinely thought porpoises looked much more like manatees. While I could nitpick the AI's work for making the porpoise's snout a little too long, had I drawn it the porpoise would have been a vaguely marine looking blob with no anatomy detailed enough to recognize let alone criticize. I am quite confident that if you asked for a large number of images based on that prompt from humans, it would easily rank among the best, and it's unlikely you'd get any which were markedly better. The fact it can generate this image nearly instantaneously though is astounding. If your goal was to get one masterpiece hanging in the Louvre, this particular tool would not suffice, but if your goal was to illustrate children's books, this tool could do in hours what would have taken a team of humans months. That is superhuman performance.
[0] https://api.deepai.org/job-view-file/e0b80ca6-d934-42e4-9a7e...
(Sorry if the link doesn't remain good for long)
An AI image generator will sometimes do a good job on this sort of prompt, but it fails in different ways to the ways that humans fail.
Whether humans or AI are better at the task overall is probably too vague a question to answer, depending a lot on how you weight different desirables.
That's not a meaningful benchmark for valuing art or creating art
What meaningful benchmark would you use? Art by its nature is subjectively experienced - what one person considers great, meaningful, soul-moving art, another may consider terrible, meaningless, and empty. Both opinions are equally valid.
But if you're using AI to create art, you're typically not trying to move someone's soul. You're trying to create a work that depicts something in a particular style with a particular fidelity with a certain amount of resource consumption. That is the only metric by which it makes any sense to evaluate the machine designed to do that specific task.
In this specific case, it's hard to see how LLMs can get you from here to there. The problem isn't the boilerplate code, like when you build a React website with them, but the really novel architectures (and architectural decisions, more importantly) that you need to make along the way. Some of those can seem very arbitrary and require deep understanding to pull off. You can't just use language-like tokens to reason this out. Fundamental understanding of the laws and rules of thumb is important.
Okay but I still see no startup here.
If LLMs do well in this space for some use case, it's the established chip designers that will benefit from it, not a small startup.
> Writing is complex, LLMs once had subhuman performance,
And now they can easily replace mediocre human performance, and since they are tuned to provide answers that appeal to humans, that is especially true for these subjective-value use cases. Chip design doesn't seem very similar. It seems like a case where specifically trained tools would be of assistance. For some things, as much as generalist LLMs have surprised with their skill at specific tasks, it is very hard to see how training on a broad corpus of text could outperform specific tools — as for your first paragraph, do you really think it is not dubious to expect a model trained on text to outperform Stockfish at chess?
When people say LLM, I think they are often thinking of neural-network approaches in general rather than just text-based ones, even if the letters do stand for language model. And there's overlap, e.g. Gemini does language but is multi-modal. If you skip that, you get things like AlphaZero, which did beat Stockfish: https://en.wikipedia.org/wiki/AlphaZero
I like this reasoning. It is shortsighted to say that LLMs aren’t well-suited to something (because we cannot tell the future) but it is not shortsighted to say that LLMs are well-suited to something (because we cannot tell the future)
I kinda suspect that things that are expressed better with symbols and connections than with text will always be a poor fit to large LANGUAGE models. Turning what is basically a graph into a linear stream of text descriptions to tokenize and jam into an LLM has to be an incredibly inefficient and not very performant way of letting "AI" do magic on your circuits.
Ever try to get ChatGPT to play scrabble? Ever try to describe the board to it and then all the letters available to you? Even its fancy-pants o1 preview performs absolutely horribly. Either my prompting completely sucks or an LLM is just the wrong tool for the job.
It’s great for asking you to score something you just created provided you tell it what bonuses apply to which words and letters. But it has absolutely no concept of the board at all. You cannot use to optimize your next move based on the board and the letters.
… I mean you might if you were extremely verbose about every letter on the board and every available place to put your tiles, perhaps avoiding coordinates and instead describing each word, its neighbors and relationships to bonus squares. But that just highlights how bad a tool an LLM is for scrabble.
Anyway, I'm sure schematics are very similar. Maybe someday we will invent good machine learning models for such things, but an LLM isn't it.
There are lots of reasons to doubt the present-day ability of LLMs to help with chip design, but I don't think any of these things above are why. Chip design isn't done with schematics. If an LLM can write Python given enough training data, it can write SystemVerilog given a similar amount of training (though the world currently lacks enough high-quality open source SV to reach an equivalent level.) We can debate whether the LLM actually writes Python well. But I don't think there's a reason to expect that writing SV requires a different approach.
The main problem in making a good circuit design, and actually also in writing a good program, is not writing per se.
The main problem is an optimal decomposition of the big project into a collection of interconnected modules and in defining adequate interfaces between modules.
This is not difficult when the purpose of the project is to just take an older project and make some improvements to it, when a suitable structure is already known, but it is always the main difficulty when a really new problem must be solved.
I have yet to see any example where an LLM can be used to help even in the slightest way to solve such a "divide et impera" problem for something novel, where novel by definition means that the training set has not contained the solution for an identical project.
There is pretty much no relationship between the 2-dimensional or multi-dimensional structural graph of the interconnected modules, together with the descriptions of their matching interfaces, and the proximity or frequency of tokens in the description of the circuit by a hardware design language. So there is little that a LLM could use to generate any HDL program for an unknown circuit.
What a LLM could do is only after a good designer has done the difficult job to decompose the project into modules and define the interfaces. When given a small module with its defined interfaces, a LLM might be able to find some boilerplate code to speed up the implementation of the module.
However, any good designer would already have templates for the boilerplate code and I can not really imagine how a LLM could do this faster than a designer who just selects the appropriate templates and pastes them into the module.
I get what you are saying. It could be a good ‘commander’ that knows how to delegate to better-suited subsystems. But it is not the only way to be intelligent by any means.
To a nail, every hammer has a purpose.
YC is just spraying & praying AI, like most investors
Design automation tooling startups have it incredibly hard: first, customers won't buy from startups, and second, the space of possible exits via acquisitions is tiny.
And liable to make money at it, on a "greater fool" basis - a successful sale (exit) is not necessarily a successful, profitable company ...
In the case of YC, their stake is so low that they don't really get any upside unless it's a successful, profitable company.
The way I read that, I think they're saying hardware acceleration of specific algorithms can be 100 times faster and more efficient than the same algorithm in software on a general purpose processor, and since automated chip design has proven to be a difficult problem space, maybe we should try applying AI there so we can have a lower bar to specialized hardware accelerators for various tasks.
I do not think they mean to say that an AI would be 100 times better at designing chips than a human, I assume this is the engineering tradeoff they refer to. Though I wouldn't fault anyone for being confused, as the wording is painfully awkward and salesy.
That’s my read too, if I’m being generous.
I also think OP is missing the point saying the target applications are too small of a market to be worth pursuing.
They’re too small to pursue any single one as the market cap for a company, but presumably the fictional AI chip startup could pursue many of these smaller markets at once. It would be a long tail play, wouldn’t it?
I've been designing chips for almost 30 years.
We have a bunch of AI initiatives in my company but most of them are about using Copilot to help write scripts to automate the design flow. Our physical design flow is thousands of lines of Tcl and Python code.
The article mentions High Level Synthesis. I've been reading about this since my first job in the 1990's. I've worked on at least 80 chips and I've never seen any chip use one of these tools except for some tiny section that was written by some academics who didn't want to learn Verilog for reasons.
I've been designing chips for 2 years. One of our very well known third-party IP vendors clearly used HLS. I say clearly, because it was almost a 1:1 translation from OO C++ code, variable names, hierarchies, polymorphism, you name it. Absolutely everything about the Verilog was the complete opposite about how a designer organizes their state machines, etc.
Anyways, their IP very clearly violated the standards of a very well known interface, which could have spelled disaster at tape-out. I had to fight tooth-and-nail, and spent lots of my company's time trying to convince this third-party vendor that this was an actual issue. Only months later were they convinced. The revised code kept coming back and failing interface checks, which shows that they weren't doing these checks on their end. All I could think is, "this can't go well..."
I've never tried SystemC. But after having tried to learn Chisel and friends, and successfully learning Bluespec (and using it in professional projects), I have some insights.
It's fundamentally important when doing hardware design to work in a language that _expresses_ itself like you're designing hardware. Verilog (for all its faults) shines there because it feels like you're writing a slightly higher level netlist. That's not the case with SC and friends, which don't allow you to think in hardware. Languages like BSV and SV are functionally similar but they force you to think in similar ways to Verilog, meaning you can write much tighter high-level code.
I'd be interested in your experience, but I feel that using normal programming languages to build hardware is an abstraction failure. Which is why it performs so poorly.
This is a great article, but the main principle at YC is to assume that technology will continue progressing at an exponential rate and then to think about what that would enable. Their proposals always assume the startups will ride some kind of Moore's Law for AI, and hardware synthesis is an obvious use case. So the assumption is that in 2 years there will be a successful AI hardware synthesis company and all they're trying to do is get ahead of the curve.
I agree they're probably wrong but this article doesn't actually explain why they're wrong to bet on exponential progress in AI capabilities.
One of the consistent problems I'm seeing over and over again with LLMs is people forgetting that they're limited by the training data.
Software engineers get hyped when they see the progress in AI coding and immediately begin to extrapolate to other fields—if Copilot can reduce the burden of coding so much, think of all the money we can make selling a similar product to XYZ industries!
The problem with this extrapolation is that the software industry is pretty much unique in the amount of information about its inner workings that is publicly available for training on. We've spent the last 20+ years writing millions and millions of lines of code that we published on the internet, not to mention answering questions on Stack Overflow (which still has 3x as many answers as all other Stack Exchanges combined [0]), writing technical blogs, hundreds of thousands of emails in public mailing lists, and so on.
Nearly every other industry (with the possible exception of Law) produces publicly-visible output at a tiny fraction of the rate that we do. Ethics of the mass harvesting aside, it's simply not possible for an LLM to have the same skill level in ${insert industry here} as they do with software, so you can't extrapolate from Copilot to other domains.
[0] https://stackexchange.com/sites?view=list#answers
Yes this is EXACTLY it, and I was discussing this a bit at work (financial services).
In software, we've all self taught, improved, posted Q&A all over the web. Plus all the open source code out there. Just mountains and mountains of free training data.
However software is unique in being both well paying and something with freely available, complete information online.
A lot of the rest of the world remains far more closed, almost an apprenticeship system. In my domain, things like company fundamental analysis, algo/quant trading, etc. Lots of books you can buy from the likes of Dalio, but no real (good) step-by-step research and investment process information online.
Likewise I'd imagine heavily patented/regulated/IP industries like chip design, drug design, etc. are similarly closed. Maybe companies using an LLM on their own data internally could make something of it, but it's also quite likely there is no 'data' so much as tacit knowledge handed down over time.
Many other industries haven't yet been fully eaten by software. All kinds of data is locked away and in proprietary formats, and is generated by humans without much automation. I don't think we know where exactly the frontiers are, once someone puts in the work to build large datasets, and automates creation of synthetic training data. Whole industries could suddenly flip from 'impossible' to 'easy' for AI.
Yep, this is also the reason LLMs can probably work well for a lot more things if we did have the data
>The problem with this extrapolation is that the software industry is pretty much unique in the amount of information about its inner workings that is publicly available for training on... millions of lines of code that we published on the internet...
> Nearly every other industry (with the possible exception of Law) produces publicly-visible output at a tiny fraction of the rate that we do.
You are correct! There's lots of information available publicly about certain things like code, and writing SQL queries. But other specialized domains don't have the same kind of information trained into the heart of the model.
But importantly, this doesn't mean the LLM can't provide significant value in these other, more niche domains. They still can, and I deliver that kind of value every day in my day job. But it's a lot of work. We (as AI engineers) have to deeply understand the special domain knowledge. The basic process is this:
1. Learn how the subject matter experts do the work.
2. Teach the LLM to do this, using examples, giving it procedures, walking it through the various steps and giving it the guidance and time and space to think. (Multiple prompts, recipes if you will, loops, external memory... a rough sketch follows this list.)
3. Evaluation, iteration, improvement
4. Scale up to production
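To make step 2 concrete, here is a minimal sketch of that kind of recipe loop. It assumes a generic call_llm(prompt) -> str helper (a hypothetical stand-in for whatever model API you use); the recipe text and the memory list are illustrative, not any particular product:

```python
# Minimal sketch of step 2: a fixed recipe of prompts with external memory.
from typing import Callable

def run_recipe(task: str, call_llm: Callable[[str], str]) -> str:
    memory: list[str] = []  # external memory carried between prompts

    recipe = [
        "Restate this task and list what information is missing: {task}",
        "Using these notes, identify which known situation type applies:\n{notes}",
        "Given the situation type and notes, produce a step-by-step recommendation:\n{notes}",
    ]

    result = ""
    for step in recipe:
        prompt = step.format(task=task, notes="\n".join(memory))
        result = call_llm(prompt)  # one focused prompt per step, not one giant prompt
        memory.append(result)      # persist intermediate reasoning for later steps
    return result
```

The domain procedure lives entirely in the recipe, which is exactly the part that step 1 has to supply.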
In many domains I work in, it can be very challenging to get past step 1. If I don't know how to do it effectively, I can't guide the LLM through the steps. Consider an example question like "what are the top 5 ways to improve my business" -- the subject matter experts often have difficulty teaching me how to do that. If they don't know how to do it, they can't teach it to me, and I can't teach it to the agent. Another example that will resonate with nerds here is being an effective Dungeons and Dragons DM. But if I actually learn how to do it, and boil it down into repeatable steps, and use GraphRAG, then it becomes another thing entirely. I know this is possible, and expect to see great things in that space, but I estimate it'll take another year or so of development to get it done.
But in many domains, I get access to subject matter experts that can tell me pretty specifically how to succeed in an area. These are the top 5 situations you will see, how you can identify which situation type it is, and what you should do when you see that you are in that kind of situation. In domains like this I can in fact make the agent do awesome work and provide value, even when the information is not in the publicly available training data for the LLM.
There's this thing about knowing a domain area well enough to do the job, but not having enough mastery to teach others how to do the job. You need domain experts that understand the job well enough to teach you how to do it, and you as the AI engineer need enough mastery over the agent to teach it how to do the job as well. Then the magic happens.
When we get AGI we can move past this limitation of needing to know how to do the job ourselves. Until then, this is how we provide impact using agents.
This is why I say that even if LLM technology does not improve any more beyond where it was a year ago, we still have many years worth of untapped potential for AI. It just takes a lot of work, and most engineers today don't understand how to do that work-- principally because they're too busy saying today's technology can't do that work rather than trying to learn how to do it.
> 1. Learn how the subject matter experts do the work.
This will get harder over time, I think, as the low-hanging-fruit domains are picked - the barrier will be people, not technology. Especially if the knowledge you are trying to acquire is the moat for that domain/company (note: in some industries that's not their moat, and using AI to shed more jobs is a win). Most industries that don't have their workings publicly on the internet have a couple of characteristics that will make it extremely difficult to perform step 1 on your list. The biggest is that now every person on the street, through the mainstream news, etc., knows it's not great to be a software engineer right now, and most media outlets point straight to "AI". "It sucks to be them," I've heard people say - what was once a profession of respect is now "how long do you think you have? 5 years? What will you do instead?".
This creates massive resistance, and potentially outright lies, when providing information to AI developers - there is precedent for what happens if you do, and it isn't good for the person/company with the knowledge. Doctors' associations, apprenticeship schemes, and industry bodies I've worked with are all now starting to care a lot more about information security because of "AI", and about proprietary methods of working, lest AI accidentally "train on them". As an example, it has definitely boosted demand for cyber-security people again around here.
> You are correct! There's lots of information available publicly about certain things like code, and writing SQL queries. But other specialized domains don't have the same kind of information trained into the heart of the model.
According to most people you would meet, this is the nightmare of anyone who studied and invested in a skill set. I think most practitioners will be conscious of ensuring that the lack of training data stays that way for as long as possible - even if it eventually gets there, the slower it happens and the more out of date the data is, the more useful the human skill/economic value of that person. How many people would have contributed to open source if they knew LLMs were coming, for example? Some may have, but I think there would have been fewer, all else being equal. Maybe quite a bit less code, to the point that AI would have been delayed further - tbh, if Google had known that LLMs could scale to be what they are, they wouldn't have let that "attention" paper be released either, IMO. Anecdotally, even the blue-collar workers I know are now hesitant to let anyone near their methods of working and their craft - survival, family, etc. come first. In the end, after all, work is a means to an end for most people.
Unlike us techies, whom I find at times not to be "rational economic actors", many non-tech professionals don't see AI as an opportunity - they see it as a threat they need to counter. At best they think they need to adopt AI before others have it, and make sure no one else has it. "No one wants this, but if you don't do it others will and you will be left behind" is a common statement from people I've chatted to. One person likened it to a nuclear arms race - not a good thing, but if you don't do it you will be under threat later.
> This will get harder over time, I think, as the low-hanging-fruit domains are picked - the barrier will be people, not technology. Especially if the knowledge you are trying to acquire is the moat for that domain/company (note: in some industries that's not their moat, and using AI to shed more jobs is a win).
Also consider that there exist quite a lot of subject matter experts who simply are not AI fanboys - not because they are afraid of losing their jobs to AI, but because they consider the whole AI hype to be insanely annoying and infuriating. To get them to work with an AI startup, you will thus have to pay them quite a lot of money.
Indeed. I'm already seeing it in software, at least anecdotally: people's willingness to post open source code, answer Stack Overflow questions, etc. is drying up (i.e. am I working hard just to train someone else's AI?). Might be too little too late, though - there's just too much code out there. This is especially true in niche domains where the advantage isn't the generic code itself but how it is applied (e.g. finance, power, etc. - the list goes on).
After all, in a capitalist economy the last to be disrupted generally gets "all the spoils", as purchasing power (and hence prices/wages) moves from the least scarce/disrupted skills to scarcer ones, which gives the last to be disrupted more time to accumulate wealth/assets to shield themselves from AI even further.
As a former chip designer (been 16 years, but looks like tools and our arguments about them haven't changed much), I'm both more and less optimistic than OP:
1. More, because fine-tuning with enough good Verilog as data should let the LLMs do better at avoiding mediocre Verilog (existing chip companies have more of this data already, though). Plus non-LLM tools will remain, so you can chain those tools to test that the LLM hasn't produced Verilog that synthesizes to a large area, etc. (a rough sketch of such a check follows this list)
2. Less because when creating more chips for more markets (if that's the interpretation of YC's RFS), the limiting factor will become the cost of using a fab (mask sets cost millions), and then integrating onto a board/system the customer will actually use. A half-solution would be if FPGAs embedded in CPUs/GPUs/SiPs on our existing devices took off
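For point 1, a rough sketch of what chaining a non-LLM tool behind the LLM could look like, assuming the open-source Yosys synthesizer is installed; the file name llm_design.v and the cell-count budget are made up for illustration:

```python
# Sketch: reject LLM-generated Verilog that synthesizes to too many cells.
import re
import subprocess

def synthesized_cell_count(verilog_path: str) -> int:
    out = subprocess.run(
        ["yosys", "-p", f"read_verilog {verilog_path}; synth; stat"],
        capture_output=True, text=True, check=True,
    ).stdout
    counts = re.findall(r"Number of cells:\s+(\d+)", out)
    return int(counts[-1]) if counts else -1  # take the final post-synthesis stat

cells = synthesized_cell_count("llm_design.v")
if cells > 5000:  # arbitrary budget for this block
    print(f"Reject: {cells} cells after synthesis, send it back to the LLM for another try")
```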
> (quoting YC) We know there is a clear engineering trade-off: it is possible to optimize especially specialized algorithms or calculations such as cryptocurrency mining, data compression, or special-purpose encryption tasks such that the same computation would happen faster (5x to 100x), and using less energy (10x to 100x).
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers.
I may be confused, but isn’t the author fundamentally misunderstanding YC’s point? I read YC as simply pointing out the benefit of specialized compute, like GPUs, not making any point about the magnitude of improvement LLMs could achieve over humans.
I think the issue is that Garry Tan's video RFS merged "LLMs for EDA" with "Purpose Built Compute" for specialized use cases. The title "LLMs for Chip Design" doesn't help either.
From my reading of the RFS (not the video) it appears they are essentially asking for the next Groq or SambaNova.
Personally, this kind of communication issue would give me long pause if I were considering YC for this segment. This is a fairly basic thesis to communicate, and if a basic thesis can be muddled, can the advice provided really be strong, especially compared to peer early-stage funders in this space?
Nvidia is trying something similar: https://blogs.nvidia.com/blog/llm-semiconductors-chip-nemo/
I'd want to know the results of these experiments before passing judgement either way. Generative modeling has actual applications in the 3D printing/mechanical industry.
That sounds like good work, but we can't ignore the context. Nvidia can train their own LLMs on proprietary Nvidia designs, which isn't a possibility for a random startup.
If the evaluation of the approach is "it works great if you train it on a few decades of the best designs from a successful fabless semiconductor company", I would say that if you plan to use that method as a startup, you're clearly going to fail. Nobody's going to give away their crown jewels to train an LLM that designs chips for other companies.
The problem _there_ is that there's very little diversity in the training data - it's all NVidia designs which are probably from the same phylogenetic tree. It'll probably end up regurgitating existing NV designs...
Generative models are bimodal - in certain tasks they are crazy terrible, and in certain tasks they are better than humans. The key is to recognize which is which.
And much more important:
- LLMs can suddenly become more competent when you give them the right tools, just like humans. Ever try to drive a nail without a hammer?
- Models with spatial and physical awareness are coming and will dramatically broaden what’s possible
It's easy to get stuck on what LLMs are bad at. The art is to apply an LLM's strengths to your specific problem, often by augmenting the LLM with the right custom tools written in regular code.
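A minimal sketch of that augmentation, again assuming a generic call_llm(prompt) -> str helper (hypothetical) and one made-up tool; the point is just that the deterministic work happens in regular code and the LLM only routes to it:

```python
# Sketch of tool augmentation: the LLM picks a tool and arguments as JSON,
# regular code computes the answer, the LLM phrases the final response.
import json

TOOLS = {
    "resistor_divider": lambda vin, r1, r2: vin * r2 / (r1 + r2),
}

def answer(question: str, call_llm) -> str:
    plan = call_llm(
        'Reply with JSON {"tool": ..., "args": {...}} choosing one tool from '
        f"{list(TOOLS)} to answer: {question}"
    )
    choice = json.loads(plan)                         # assumes the model returned valid JSON
    result = TOOLS[choice["tool"]](**choice["args"])  # the real work happens here
    return call_llm(f"Answer the question '{question}' using this computed result: {result}")
```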
> Ever try to drive a nail without a hammer?
I've driven a nail with a rock, a pair of pliers, a wrench, even with a concrete wall and who knows what else!
I didn't need to be told if these can be used to drive a nail, and I looked at things available, looked for a flat surface on them and good grip, considered their hardness, and then simply used them.
So if we only give them the "right" tools, they'll remain very limited, because we won't have thought of every job they appear to know how to do but actually don't.
The problem is exactly that: they "pretend" to know how to drive a nail, but don't really.
Those are all tools!! Congratulations.
If you’re creative enough to figure out different tools for humans, you are creative enough to figure out different tools for LLMs
No disagreement there, but if we've got the tools, do we really need an LLM to drive them (it still requires building an adapter from LLM to those tools)?
What is the added value of that combo and at what cost?
Glad to see that the author is highlighting verification as the important factor in design productivity.
We at Silogy [0] are directly targeting the problem of verification productivity using AI agents for test debugging. We analyze code (RTL, testbench, specs, etc.) along with logs and waveforms, and incorporate interactive feedback from the engineer as needed to refine the hypothesis.
[0] https://silogy.io/
They (YC) are interested in the use of LLMs to make the process of designing chips more efficient. Nowhere do they talk about LLMs actually designing chips.
I don't know anything about chip design, but like any area in tech I'm certain there are cumbersome and largely repetitive tasks that can't easily be done by algorithms but can be done with human oversight by LLMs. There's efficiency to be gained here if the designer and operator of the LLM system know what they're doing.
Except that’s now a very standard pitch for technology across basically any industry, and cheapens the whole idea of YC presenting a grand challenge.
I agree LLMs aren't ready to design ASICs. It's likely that in a decade or less, they'll be ready for the times you absolutely need to squeeze out every square nanometer, picosecond, femtojoule, or nanowatt.
Gary Tan was right[1] in that there is a fundamental inefficiency inherent in the von Neumann architecture we're all using. This gross impedance mismatch[4] is a great opportunity for innovation.
Once ENIAC was "improved" from its original structure into a general-purpose compute device in the von Neumann style, it suffered an 83% loss in performance[2]. Everything since is 80 years of premature optimization that we need to unwind. It's the ultimate pile of technical debt.
Instead of throwing maximum effort into making specific workloads faster, why not build a chip that can make all workloads faster instead, and let economy of scale work for everyone?
I propose (and have for a while[3]) a general purpose solution.
A systolic array of simple 4-bits-in, 4-bits-out look-up tables (LUTs), latched so that timing issues are eliminated, could greatly accelerate computation in a far nearer timeframe (a toy model of one cell is sketched after the references below).
The challenges are that it's a greenfield environment, with no compilers (though it's probable that LLVM could target it), and a bus number of 1.
[1] https://www.ycombinator.com/rfs-build#llms-for-chip-design
[2] https://en.wikipedia.org/wiki/ENIAC#Improvements
[3] https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
[4] https://en.wikipedia.org/wiki/Impedance_matching
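To make the proposal concrete, here is a toy model of one such latched 4-in/4-out LUT cell (my own hand-rolled sketch of the idea above, not code from the linked project):

```python
# Toy model of one latched cell: 4 input bits, 4 output bits, each output bit
# driven by its own 16-entry lookup table over the inputs. Outputs only change
# on clock(), so timing stays local to the cell.
class LutCell:
    def __init__(self, luts):
        assert len(luts) == 4 and all(len(t) == 16 for t in luts)
        self.luts = luts              # one 16-entry truth table per output bit
        self.latched = [0, 0, 0, 0]   # what neighbours see
        self.pending = [0, 0, 0, 0]

    def compute(self, inputs):        # inputs: 4 bits, e.g. from N/E/S/W neighbours
        addr = sum(bit << i for i, bit in enumerate(inputs))
        self.pending = [t[addr] for t in self.luts]

    def clock(self):                  # latch on the clock edge
        self.latched = self.pending

# Program a half adder into the first two outputs, just to show the mechanics:
xor_tab = [(a & 1) ^ ((a >> 1) & 1) for a in range(16)]
and_tab = [(a & 1) & ((a >> 1) & 1) for a in range(16)]
zeros = [0] * 16
cell = LutCell([xor_tab, and_tab, zeros, zeros])
cell.compute([1, 1, 0, 0])
cell.clock()
print(cell.latched)  # [0, 1, 0, 0] -> sum and carry of 1 + 1
```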
I find it hard to imagine how you'd implement various simple functions in the bitgrid. It would be interesting if you'd present some simple hand-worked examples.
For example, how it would implement a 1-bit full adder? Like the nitty-gritty details: which input on which cell represents input A, which represents input B, and which represents carry-in? Which output is sum and which is carry-out? What are the functions programmed into each node that it uses?
Then show how to build a 2-bit adder from there.
IDK about LLMs there either.
A non-LLM monte carlo AI approach: "Pushing the Limits of Machine Design: Automated CPU Design with AI" (2023) https://arxiv.org/abs/2306.12456 .. https://news.ycombinator.com/item?id=36565671
A useful target for whichever approach is most efficient at IP-feasible design:
From https://news.ycombinator.com/item?id=41322134 :
> "Ask HN: How much would it cost to build a RISC CPU out of carbon?" (2024) https://news.ycombinator.com/item?id=41153490
I think the problem with this particular challenge is that it is incredibly non-disruptive to the status quo. There are already hundreds of billions flowing into using LLMs as well as GPUs for chip design. Nvidia has of course laid the groundwork with its cuLitho efforts. This kind of research area is very hot in the research world as well. It's by no means difficult to pitch to a VC. So why should YC back it? I'd love to see YC identifying areas where VC dollars are not flowing. Unfortunately, the other challenges are mostly the same — govtech, civictech, defense tech. These are all areas where VC dollars are now happily flowing since companies like Anduril made it plausible.
>If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers.
I don't think he's arguing that. More that ASICs can be 100x better than CPUs for say crypto mining and that using LLM type stuff it may be possible to make them for other applications where there is less money available to hire engineers.
(the YC request https://www.ycombinator.com/rfs-build#llms-for-chip-design)
It doesn't seem to work for software either, but hey. Who are we?
> While LLMs are capable of writing functional Verilog sometimes, their performance is still subhuman.
The key word here is "still".
We don't know what the limits of LLMs are.
It's possible that they will reach a dead end. But it is also possible that they will be able to do logic and math.
If (or when) they achieve that point, their performance will quickly become "superhuman" in these kinds of engineering tasks.
But the very next step will be the ability to do logic and math.
Reportedly, they've already hit a dead end: the newest Orion is only marginally better than the previous ChatGPT model (and marginally worse in some applications), and there is just no more fresh, non-AI-generated data of reasonably good quality to train on.
They want to throw LLMs at everything even if it does not make sense. Same is true for all the AI agent craze: https://medium.com/thoughts-on-machine-learning/langchains-s...
It feels like the entire world has gone crazy.
Even the serious idea that the article thinks could work is throwing the unreliable LLMs at verification! If there's any place you can use something that doesn't work most of the time, I guess it's there.
This is typical of any hype bubble. Blockchain used to be the answer to everything.
What's after this? Because I really do feel the economy is standing on a cliff right now. I don't see anything after this that can prop stocks up.
That's because we are still waiting for the 2008 bubble to pop, which was inflated further by the 2020 bubble. It's going to be bad. People will blame Trump; Harris would be eating the same shit sandwich.
It’s gonna be bad.
What makes you think he won't just inflate the bubble again?
Should we expect money pumps to generate inflation quicker on this cycle than on the last ones? If so, why?
I think only an ignorant person doesn’t see the train wreck coming, and how making more money won’t fix fuck all.
The post-quantum age. Companies will go post-quantum.
I think the operators are learning how to ride the hype edge. You find that sweet spot between promising and 'not quite there yet' where you can take lots of investment and iterate forward just enough to keep it going.
It doesn't matter if it can't actually 'get there' as long as people still believe it can.
Come to think about it, a socioeconomic system dependent on population and economic growth is at a fundamental level driven by this balancing act: "We can solve every problem if we just forge ahead and keep enlarging the base of the pyramid - keep reproducing, keep investing, keep expanding the infrastructure".
Only if it fails in the same way. LLMs and the multi-agent approach operate under the assumption that they are programmable agents, and each agent is more of a trade-off against failure modes. If you can string them together, and if the output is easily verified, it can be a great fit for the problem.
If you're going to do that you need completely different LLMs to base the agents on. The ones I've tried have "mode collapse" - ask them to emulate different agents and they'll all end up behaving the same way. Simple example, if you ask it to write different stories they'll usually end up having the same character names.
It may depend on the domain. I tend to use LLMs for things that are less open ended, more categorization and summarization response than pure novel creation.
In these situations, I’ve been able to sufficiently program the agent that I haven’t seen too much of an issue as you described. Consistency is a feature.
It's similar in regular programming - LLMs are better at writing test code than actual code. Mostly because it's simpler (P vs NP etc), but I think also because it's less obvious when test code doesn't work.
Replace all asserts with expected == expected and most people won't notice.
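A made-up minimal example of what that looks like - both tests pass, but only the second one can ever fail:

```python
def total_price(items):
    return sum(price for _, price in items)

def test_total_price_vacuous():
    expected = 30
    assert expected == expected  # always true, tells you nothing about total_price

def test_total_price_real():
    assert total_price([("a", 10), ("b", 20)]) == 30  # fails if the sum logic breaks
```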
LLMs are pretty damn useful for generating tests, getting rid of a lot of tedium, but yeah, it's the same as human-written tests: if you don't check that your test doesn't work when it shouldn't (not the same thing as just writing a second test for that case - both those tests need to fail if you intentionally screw with their separate fixtures), then you shouldn't have too much confidence in your test.
If LLMs can generate a test for you, it's because it's a test that you shouldn't need to write. They can't test what is really important, at all.
Some development stacks are extremely underpowered for code verification, so they do patch the design issue. Just like some stacks are underpowered for abstraction and need patching by code generation. Both of those solve an immediate problem, in a haphazard and error-prone way, by adding burden on maintenance and code evolution linearly to how much you use it.
And worse, if you rely too much on them they will lead your software architecture and make that burden superlinear.
Claude wrote the harness and pretty much all of these tests, eg:
https://github.com/williamcotton/search-input-query/blob/mai...
It is a good test suite and it saved me quite a bit of typing!
In fact, Claude did most of the typing for the entire project:
https://github.com/williamcotton/search-input-query
BTW, I obviously didn't just type "make a lexer and multi-pass parser that returns multiple errors and then make a single-line instance of a Monaco editor with error reporting, type checking, syntax highlighting and tab completion".
I put it together piece-by-piece and with detailed architectural guidance.
> Replace all asserts with expected == expected and most people won't notice.
Those tests were very common back when I used to work in Ruby on Rails and automatically generating test stubs was a popular practice. These stubs were often just converted into expected == expected tests so that they passed and then left like that.
> Replace all asserts with expected == expected and most people won't notice.
It’s too resource intensive for all code, but mutation testing is pretty good at finding these sorts of tests that never fail. https://pitest.org/
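pitest is a Java tool; the underlying idea is simple enough to hand-roll. A toy illustration of the concept (not pitest itself): flip an operator in the code under test and check whether the tests notice.

```python
# Mutation testing in miniature: a test that never fails "lets the mutant survive".
def add(a, b):
    return a + b

def mutant_add(a, b):
    return a - b                    # the mutation: one operator flipped

def vacuous_test(fn):
    expected = 3
    return expected == expected     # passes for the original AND the mutant

def real_test(fn):
    return fn(1, 2) == 3            # passes for the original, fails for the mutant

for test in (vacuous_test, real_test):
    survives = test(mutant_add)
    print(test.__name__, "lets the mutant survive" if survives else "kills the mutant")
```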
I mean, define ‘better’. Even with actual human programmers, tests which do not in fact test the thing are already a bit of an epidemic. A test which doesn’t test is worse than useless.
This happens all the time.
Once it was spices. Then poppies. Modern art. The .com craze. Those blockchain ape images. Blockchain. Now LLM.
All of these had a bit of true value and a whole load of bullshit. Eventually the bullshit disappears and the core remains, and the world goes nuts about the next thing.
Exactly. I’ve seen this enough now to appreciate that oft repeated tech adoption curve. It seems like we are in “peak expectations” phase which is immediately followed by the disillusionment and then maturity phase.
If your LLM is producing a proof that can be checked by another program, then there’s nothing wrong with their reliability. It’s just like playing a game whose rules are a logical system.
> They want to throw LLMs at everything [..]
Oh yes.
I had a discussion with a manager at a client last week and was trying to run him through some (technical) issues relating to challenges an important project faces.
His immediate response was that maybe we should just let ChatGPT help us decide the best option. I had to bite my tongue.
OTOH, I'm more and more convinced that ChatGPT will replace managers long before it replaces technical staff.
This makes complete sense from an investor’s perspective, as it increases the chances of a successful exit. While we focus on the technical merits or critique here on HN/YC, investors are playing a completely different game.
To be a bit acerbic, and inspired by Arthur C. Clarke, I might say: "Any sufficiently complex business could be indistinguishable from Theranos".
Theranos was not a "complex business". It was deliberate fraud and deception, and investors that were just gullible. The investors should have demanded to see concrete results
I expected you to take this with a grain of salt but also to read between the lines: while some projects involve deliberate fraud, others may simply lack coherence and inadvertently follow the principles of the greater fool theory [1]. The use of ambiguous or indistinguishable language often blurs the distinction, making it harder to differentiate outright deception from an unsound business model.
[1] https://en.wikipedia.org/wiki/Greater_fool_theory
Yes, that's how we progress. This is how the internet boom happened as well: everything became .com, then the real workable businesses were left and all the unworkable things were gone.
Recently I came across someone advertising an LLM to generate fashion magazine shoots in Pakistan at 20-25% of the cost. It hit me then that they are undercutting fashion shoots in a country like Pakistan, which are already 90-95% cheaper than in most Western countries. This AI is replacing the work of 10-20 people.
The annoying part, a lot of money could be funneled into these unworkable businesses in the process, crypto being a good example. And these unworkable businesses tend to try to continue getting their way into the money somehow regardless. Most recent example was funneling money from Russia into Trump’s campaign.
> The annoying part, a lot of money could be funneled into these unworkable businesses in the process, crypto being a good example
There was a thread here about why YCombinator invests in several competing startups. The answer is that success is often more about connections and politics than the product itself. And crypto, yes, is a good example of this. Musk will get his $1B in bitcoins back for sure.
> Most recent example was funneling money from Russia into Trump’s campaign.
Musk again?
It really feels like we’re close to the end of the current bubble now; the applications being trotted out are just increasingly absurd.
Isn't that the case with every new tech. There was a time in which people tried to cook everything in a microwave
When did OpenMicroWave promise to solve every societal problem if we just gave it enough money to build a larger microwave oven?
Microwave sellers did not become trillion dollar companies off that hype
Mostly because the marginal cost of microwaves was not close to zero.
Mostly because they were not claiming that sentient microwaves that would cook your food for you were just around the corner - claims which the most respected media outlets then parroted uncritically.
Even rice cookers started doing this by advertising "fuzzy logic".
Fuzzy logic rice cookers are the result of an unrelated fad in 1990s Japanese engineering companies. They added fuzzy controls to everything from cameras to subways to home appliances. It's not part of the current ML fad.
Yes. My point is that technology fads aren't new and getting mad at them is a bit like getting mad at fashion or taste.
I mean, they were at one point making pretty extravagant claims about microwaves, but to a less credulous audience. Trouble with LLMs is that they look like magic if you don’t look too hard, particularly to laypeople. It’s far easier to buy into a narrative that they actually _are_ magic, or will become so.
I feel like what makes this a bit different from just regular old sufficiently advanced technology is the combination of two things:
- LLMs are extremely competent at surface-level pattern matching and manipulation of the type we'd previously assumed that only AGI would be able to do.
- A large fraction of tasks (and by extension jobs) that we used to, and largely still do, consider to be "knowledge work", i.e. requiring a high level of skill and intelligence, are in fact surface-level pattern matching and manipulation.
Reconciling these facts raises some uncomfortable implications, and calling LLMs "actually intelligent" lets us avoid these.
https://archive.ph/dLp6t
LLMs have powered products used by hundreds of millions, maybe billions. Most experiments will fail and that's okay, arguably even a good thing. Only time will tell which ones succeed
> I knew it was bullshit from the get-go as soon as I read their definition of AI agents.
That is one spicy article, it got a few laughs out of me. I must agree 100% that Langchain is an abomination, both their APIs as well as their marketing.
Please don't post a link that is behind a paywall!!
https://archive.is/dLp6t
It is a registration wall I think.
Same result. Information locks are verboten.
As annoying as I find them, on this site they're in fact not: https://news.ycombinator.com/item?id=10178989
Please don't complain about paywalls: https://news.ycombinator.com/item?id=10178989
I disagree with the premise of this article. Modern AI can absolutely be very useful and even disruptive when designing FPGAs. Of course, it isn't there today. That does not mean this isn't a solution whose time has come.
I have been working on FPGAs and, in general, programmable logic, for somewhere around thirty years (I started with Intel programmable logic chips like the 5C090 [0] for real-time video processing circuits).
I completely skipped over the whole High Level Synthesis (HLS) era that tried to use C, etc. for FPGA design. I stuck with Verilog and developed custom tools to speed up my work. My logic was simple: if you try to pound a square peg into a round hole, you might get it done, but the result will be a mess.
FPGA development is hardware development. Not software. If you cannot design digital circuits to begin with, no amount of help from a C-to-Verilog tool is going to get you the kind of performance (both in terms of time and resources) that a hardware designer can squeeze out of the chip.
This is not very different from using a language like Python vs. C or C++ to write software. Python "democratizes" software development at a cost of 70x slower performance and 70x greater energy consumption. Sure, there are places where Python makes sense. I'll admit that much.
Going back to FPGA circuit design, the issue likely has to do with the type, content and approach to training. Once again, the output isn't software; the end product isn't software.
I have been looking into applying my experience in FPGA's across the entire modern AI landscape. I have a number of ideas, none well-formed enough to even begin to consider launching a startup in the sector. Before I do that I need to run through lots of experiments to understand how to approach it.
[0] https://www.cpu-galaxy.at/cpu/ram%20rom%20eprom/other_intel_...
I don't know the space well enough, but I think the missing piece is that YC's investment horizon is typically 10+ years. Not only could LLMs get massively better, but the chip industry could be massively disrupted with the right incentives. My guess is that that is YC's thesis behind the ask.
LLM based automated verification surely isn't something that easily works out of the box, but that doesn't mean ventures shouldn't try to work on it.
The purpose of capital is to make progress from where we are now.
This is not my domain, so my knowledge is limited, but I wonder if chip designers have some sort of standard library of ready-to-use components. Do you have to design e.g. an ALU every time you design a new CPU, or is there some standard component to use? I think having proven components that can be glued together at a higher level may be the key to productivity here.
Returning to LLMs, I think the problem here may be that there is simply not enough learning material for the LLM. Verilog, compared to C, is a niche with little documentation and even less open source code. If open hardware were more popular, I think LLMs could learn to write better Verilog code. Maybe the key is to persuade hardware companies to share their closed source code to teach LLMs, for the industry's benefit?
There are component libraries, though they're usually much lower level than an ALU. For example Synopsys Designware:
https://www.synopsys.com/dw/buildingblock.php
Or learning through self-play. Chip design sounds like an area where (this would be hard!) a sufficiently powerful simulator and/or FPGA could allow reinforcement learning to work.
Current LLMs can’t do it, but the assumption that that’s what YC meant seems wildly premature.
The most common thing you see shared is something called IP which does mean intellectual property, but in this context you can think of it like buying ICs that you integrate into your design (ie you wire them up). You can also get Verilog, but that is usually used for verification instead of taping out the peripheral. This is because the company you buy the IP from will tape out the design for a specific node in order to guarantee the specifications. Examples of this would be everything from arm cores to uart and spi controllers as well as pretty much anything you could buy as a standalone IC.
When I think of AI in chip design, optimizations like these come to mind,
https://optics.ansys.com/hc/en-us/articles/360042305274-Inve...
https://optics.ansys.com/hc/en-us/articles/33690448941587-In...
> If an application doesn’t warrant hardware acceleration yet, it’s probably because it’s a small market, and that makes it a poor target for a startup.
But selling shovels that are useful in many small markets can still be a viable play, and that’s how I understand YC’s position here.
The whole concept of "request for startup" is entirely misguided imo.
YC did well because they were good at picking ideas, not generating them.
>YC did well because they were good at picking ideas, not generating them.
This doesn't line up with the perennial attitude (as discussed by pg) that YC picks people/teams and not ideas, because while ideas and approaches may change, the people are the same and having a good founder, co-founder and team matters the most.
Their M.O. is to avoid getting too attached to an idea because, in the process of actually building the company, pivots may be required. And so the focus is on the team more so than on a business plan - which, again, is not something pg is particularly fond of seeing, especially ones with lengthy (and therefore improbable/unrealistic) forecasts.
I worry that this post assumes LLMs won't get much better over time. This is possible, but YC bets that they will. The right time to start an LLM application layer company is arguably 6-12 months before LLMs get good enough for that purpose, so you can be ahead of the curve.
I did my PhD on trying to use ML for EDA (de novo design/topology generation, because deepmind was doing placement and I was not gonna compete with them as a single EE grad who self taught ML/optimization theory during the PhD).
In my opinion, part of the problem is that training data is scarce (real-world designs are literally called "IP" in the industry, after all...), but more than that, circuit design is basically program synthesis, which means it's _hard_. Even if you try to be clever, dealing with graphs and designing discrete objects involves many APX-hard/APX-complete problems, which is _FUN_ on the one hand, but also means it's tricky to just scale through when the object you are trying to produce is a design that can cost millions if there's a bug...
I think this whole article is predicated on misinterpreting the ask. It wasn't for the chip to take 100x less power, it was for the algorithm the chip implements. Modern synthesis tools and optimisers extensively look for design patterns the same way software compilers do. That's why there's recommended inference patterns. I think it's not impossible to expect an LLM to expand the capture range of these patterns to maybe suboptimal HDL. As a simple example, maybe a designer got really turned around and is doing some crazy math, and the LLM can go "uh, that's just addition my guy, I'll fix that for you."
Was surprised this comment was this far down. I re-read the YC ask three times to make sure I wasn’t crazy. Dude wrote the whole article based on a misunderstanding.
Thanks... I had more points earlier but I guess people changed their mind and decided they liked it better his way idk
LLMs are wrong for most things imo. LLMs are great conversational assistants, but there is very little linguistic rigor to them, if any. They have almost no generalization ability, and anecdotally they fall for the same syntactic pitfalls they've fallen for since BERT. Models have gotten so good at predicting this n-dimensional "function" that sounds like human speech, we're getting distracted from seeing their actual purpose and trying to apply them to all sorts of problems that rely on more than text-based training data.
Language is cool and immensely useful. LLMs, however, are fundamentally flawed in their basic assumptions about how language works. The distributional hypothesis is good for paraphrasing and summarization, but pretty atrocious for real reasoning. The concept of an idea living in a semantic "space" is incompatible with simple vector spaces, and we are starting to see this actually matter in the details as scaling laws come into play. Chip design is a great example of where we cannot rely on language alone to solve all our problems.
I hope to be proven wrong, but still not sold on AGI being within reach. We'll probably need some pretty significant advancements in large quantitative models, multi-modal models and smaller, composable models of all types before we see AGI
The first two paragraphs contradict my own results from working with LLMs. There is definitely some form of reasoning that has emerged. Some people will still find it not convincing enough to be called reasoning, but that's just a quantitative limitation at the moment.
With respect to AGI in its broadest sense: indeed it is not in reach. I think that is for the better!
If a transformer had infinite data and parameters, I'm sure it could simulate human reasoning to a high degree. Humans don't work that way, so we may need to create a more general definition for artificial reasoning
LLMs autocomplete text. That's all.
Other intelligent effects are coincidental.
Neurons grow in response to regular patterns of electrical stimulation.
Other intelligent effects are coincidental.
I played around with using genetic algorithms to design on an FPGA (Xilinx 6200, I think) about 25 years ago... nothing came of that...
<sarcasm>You could have the LLM design a chip that accurately documents all the ways an LLM will get chip design wrong.
I disagree with most of the reasoning here, and think this post misunderstands the opportunity and economic reasoning at play here.
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers.
This is very obviously not the intent of the passage the author quotes. They are clearly talking about the speedup that can be gained from ASICs for a specific workload, eg dedicated mining chips.
> High-level synthesis, or HLS, was born in 1998, when Forte Design Systems was founded
This sort of historical argument is akin to arguing “AI was bad in the 90s, look at Eliza”. So what? LLMs are orders of magnitude more capable now.
> Ultimately, while HLS makes designers more productive, it reduces the performance of the designs they make. And if you’re designing high-value chips in a crowded market, like AI accelerators, performance is one of the major metrics you’re expected to compete on.
This is the crux of the author's misunderstanding.
Here is the basic economics explanation: creating an ASIC for a specific use is normally cost-prohibitive because the cost of the inputs (chip design) is much higher than the outputs (performance gains) are worth.
If you can make ASIC design cheaper on the margin, and even if the designs are inferior to what an expert human could create, then you can unlock a lot of value. Think of all the places an ASIC could add value if the design was 10x or 100x cheaper, even if the perf gains were reduced from 100x to 10x.
The analogous argument is “LLMs make it easier for non-programmers to author web apps. The code quality is clearly worse than what a software engineer would produce but the benefits massively outweigh, as many domain experts can now author their own web apps where it wouldn’t be cost-effective to hire a software engineer.”
Software folk underestimating hardware? Surely not.
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers. While LLMs are capable of writing functional Verilog sometimes, their performance is still subhuman. [...] LLMs primarily pump out mediocre Verilog code.
What is the quality of Verilog code output by humans? Is it good enough so that a complex AI chip can be created? Or does the human need to use tools in order to generate this code?
I've got the feeling that LLMs will be capable of doing everything a human can do, in terms of thinking. There shouldn't be an expectation that an LLM is able to do everything, which in this context would be thinking about the chip and creating the final files in a single pass and without external help. And with external help I don't mean us humans, but tools which are specialized and also generate some additional data (like embeddings) which the LLM (or another LLM) can use in the next pass to evaluate the design. And if we humans have spent enough time in creating these additional tools, there will come a time when LLMs will also be able to create improved versions of them.
I mean, when I once randomly checked the content of a file in The Pile, I found a Craigslist "ad" for an escort offering her services. No chip-generating AI needs to have this in its parameters in order to do its job. So there is a lot of room for improvement, and this improvement will come over time. Such an LLM doesn't need to know that much about humans.
LLMs only reach the performance they do because of the sheer scale of data they ingest. Training them on less data doesn't work as well, or at least you will overfit like crazy on anything the size of current models. So the question is where are you going to get anywhere near the volume of verilog code as is present in The Pile? The total amount of verilog ever written is almost certainly a few orders of magnitude less.
Past failures do not rule out the possibility of success in future attempts.
Sufficiently advanced prognosticators are indistinguishable from priests.
The bottleneck for LLM is fast and large memory, not compute power.
Whoever is recommending investing in better chip (ALU) design hasn't done even a basic analysis of the problem.
Tokens per second = memory bandwidth divided by model size.
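A quick worked example of that rule of thumb (illustrative numbers, not a benchmark of any particular system):

```python
# Each generated token has to stream the full set of weights through memory once,
# so decode speed is roughly bandwidth / model size.
bandwidth_bytes_per_s = 1000e9      # ~1 TB/s of memory bandwidth
model_bytes = 70e9 * 2              # 70B parameters at 2 bytes (fp16) each
print(round(bandwidth_bytes_per_s / model_bytes, 1))  # ~7.1 tokens/s per sequence
```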
If cryptocurrency mining could be significantly optimized (one of the example goals in the article) wouldn't that just destroy the value of said currency?
No, they all have escalating difficulty algorithms.
https://en.bitcoin.it/wiki/Difficulty
hi, this is my article! thanks so much for the views, upvotes, and comments! :)
This heavily overlaps with my current research focus for my Ph.D., so I wanted to provide some additional perspective to the article. I have worked with Vitis HLS and other HLS tools in the past to build deep learning hardware accelerators. Currently, I am exploring deep learning for design automation and using large language models (LLMs) for hardware design, including leveraging LLMs to write HLS code. I can also offer some insight from the academic perspective.
First, I agree that the bar for HLS tools is relatively low, and they are not as good as they could be. Admittedly, there has been significant progress in the academic community to develop open-source HLS tools and integrations with existing tools like Vitis HLS to improve the HLS development workflow. Unfortunately, substantial changes are largely in the hands of companies like Xilinx, Intel, Siemens, Microchip, MathWorks (yes, even Matlab has an HLS tool), and others that produce the "big-name" HLS tools. That said, academia has not given up, and there is considerable ongoing HLS tooling research with collaborations between academia and industry. I hope that one day, some lab will say "enough is enough" and create an open-source, modular HLS compiler in Rust that is easy to extend and contribute to—but that is my personal pipe dream. However, projects like BambuHLS, Dynamatic, MLIR+CIRCT, and XLS (if Google would release more of their hardware design research and tooling) give me some hope.
When it comes to actually using HLS to build hardware designs, I usually suggest it as a first-pass solution to quickly prototype designs for accelerating domain-specific applications. It provides a prototype that is often much faster or more power-efficient than a CPU or GPU solution, which you can implement on an FPGA as proof that a new architectural change has an advantage in a given domain (genomics, high-energy physics, etc.). In this context, it is a great tool for academic researchers. I agree that companies producing cutting-edge chips are probably not using HLS for the majority of their designs. Still, HLS has its niche in FPGA and ASIC design (with Siemens's Catapult being a popular option for ASIC flows). However, the gap between an initial, naive HLS design implementation and one refined by someone with expert HLS knowledge is enormous. This gap is why many of us in academia view the claim that "HLS allows software developers to do hardware development" as somewhat moot (albeit still debatable—there is ongoing work on new DSLs and abstractions for HLS tooling which are quite slick and promising). Because of this gap, unless you have team members or grad students familiar with optimizing and rewriting designs to fully exploit HLS benefits while avoiding the tools' quirks and bugs, you won't see substantial performance gains. All that to say, I don't think it is fair to completely write off HLS as a lost cause or a failure.
Regarding LLMs for Verilog generation and verification, there's an important point missing from the article that I've been considering since around 2020 when the LLM-for-chip-design trend began. A significant divide exists between the capabilities of commercial companies and academia/individuals in leveraging LLMs for hardware design. For example, Nvidia released ChipNeMo, an LLM trained on their internal data, including HDL, tool scripts, and issue/project/QA tracking. This gives Nvidia a considerable advantage over smaller models trained in academia, which have much more limited data in terms of quantity, quality, and diversity. It's frustrating to see companies like Nvidia presenting their LLM research at academic conferences without contributing back meaningful technology or data to the community. While I understand they can't share customer data and must protect their business interests, these closed research efforts and closed collaborations they have with academic groups hinder broader progress and open research. This trend isn't unique to Nvidia; other companies follow similar practices.
On a more optimistic note, there are now strong efforts within the academic community to tackle these problems independently. These efforts include creating high-quality, diverse hardware design datasets for various LLM tasks and training models to perform better on a wider range of HLS-related tasks. As mentioned in the article, there is also exciting work connecting LLMs with the tools themselves, such as using tool feedback to correct design errors and moving towards even more complex and innovative workflows. These include in-the-loop verification, hierarchical generation, and ML-based performance estimation to enable rapid iteration on designs and debugging with a human in the loop. This is one area I'm actively working on, both at the HDL and HLS levels, so I admit my bias toward this direction.
For more references on the latest research in this area, check out the proceedings from the LLM-Aided Design Workshop (now evolving into a conference, ICLAD: https://iclad.ai/), as well as the MLCAD conference (https://mlcad.org/symposium/2024/). Established EDA conferences like DAC and ICCAD have also included sessions and tracks on these topics in recent years. All of this falls within the broader scope of generative AI, which remains a smaller subset of the larger ML4EDA and deep learning for chip design community. However, LLM-aided design research is beginning to break out into its own distinct field, covering a wider range of topics such as LLM-aided design for manufacturing, quantum computing, and biology—areas that the ICLAD conference aims to expand on in future years.
Thank god humans are superior at chip design, especially when you have dozens of billions of dollars behind you, just like Intel. Oh wait.
I wonder if it's because the LLM doesn't have access to state-of-the-art Verilog?
I mean I assume the best is heavily guarded.
but.. but.. muh AI
The AI hype train is basically investors not understanding tech. Don't get me wrong, AI in itself could be a huge thing if used right, but the things getting the most attention in the current market aren't it.
The "naive", all-or-nothing view on LLM technology is, frankly, more tiring than the hype.
Had to nop out at "just next token prediction". This article isn't worth your time.
Please don't do this, Zach. We need to encourage more investment in the overall EDA market, not less. Garry's pitch is meant for the dreamers; we should all be supportive. It's a big boat.
Would appreciate the collective energy being spent instead on adding to or refining Garry's request.
The article seems to be based on the current limitations of LLMs. I don't think YC and other VCs are betting on what LLMs can do today, I think they are betting on what they might be able to do in the future.
As we've seen in the recent past, it's difficult to predict what the possibilities are for LLMs and which limitations will hold. Currently it seems pure scaling won't be enough, but I don't think we've reached the limits with synthetic data and reasoning.
>The article seems to be based on the current limitations of LLMs. I don't think YC and other VCs are betting on what LLMs can do today, I think they are betting on what they might be able to do in the future.
Do we know what LLMs will be able to do in the future? And even if we know, the startups have to work with what they have now, until that future comes. The article states that there's not much to work with.
Show me a successful startup that was predicated on the tech they’re working with not advancing?
Most successful startups were able to make the thing that they wanted to make, as a startup, with existing tech. It might have a limited market that was expected to become less limited (a web app in 1996, say), but it was possible to make the thing.
This idea of “we’re a startup; we can’t actually make anything useful now, but once the tech we use becomes magic any day now we might be able to make something!” is basically a new phenomenon.
Most? I can easily list dozens. For example, what advancements were required for Slack to be successful? Or Spotify (they got more successful due to smartphones and cheaper bandwidth, but the business was solid before that)? Or Shopify?
Slack bet on ubiquitous, continuous internet access. Spotify bet on bandwidth costs falling to effectively zero. Shopify bet on D2C rising because improved search engines, increased internet shopping (itself a result of several tech trends plus demographic changes).
For a counterexample I think I’d look to non-tech companies. OrangeTheory maybe?
The notion of a startup gaining funding to develop a fantasy into reality is relatively new.
It used to be that startups would be created to do something different with existing tech or to commercialise a newly-discovered - but real - innovation.
Every single software service that has ever provided an Android or iOS application, for starters.
Tomorrow, LLMs will be able to perform slightly below-average versions of whatever humans are capable of doing tomorrow. Because they work by predicting what a human would produce based on training data.
This severely discounts the fact that you're comparing a model that _knows the average about everything_ to a single human's capability. Also, they can do it instantly, instead of having to coordinate many humans over long periods of time. You can't straight up compare one LLM to one human.
"Knows the average relationship amongst all words in the training data" ftfy
it seems that's sufficient to do a lot of things better than the average human - including coding, writing, creating poetry, summarizing and explaining things...
A human specialized in any of those things vastly outperforms the average human let alone an LLM.
You’re entirely missing the point
It's worth considering
1) all the domains where there is no training data
Many professions are far less digital than software, protect IP more, and are much more akin to an apprenticeship system.
2) the adaptability of humans in learning vs any AI
Think about how many years we have been trying to train cars to drive, but humans do it with a 50-hour training course.
3) humans ability to innovate vs AIs ability to replicate
A lot of creative work is adaptation, but humans do far more than that, synthesizing different ideas to create completely new works. Could an LLM produce the 37th Marvel movie? Yes, probably. Could an LLM create... Inception? Probably not.
You could replace “LLM” in your comment with lots of other technologies. Why bet on LLMs in particular to escape their limitations in the near term?
Because YCombinator is all about r-selecting startup ideas, and making it back on a few of them generating totally outsized upside.
I think that LLMs are plateauing, but I'm less confident that this necessarily means the capabilities we're using LLMs for right now will also plateau. That is to say it's distinctly possible that all the talent and money sloshing around right now will line up a new breakthrough architecture in time to keep capabilities marching forward at a good pace.
But if I had $100 million, and could bet $200 thousand that someone can make me billions on machine learning chip design or whatever, I'd probably entertain that bet. It's a numbers game.
> But if I had $100 million, and could bet $200 thousand that someone can make me billions on machine learning chip design or whatever, I'd probably entertain that bet. It's a numbers game.
Problem with this reasoning is twofold: start-ups will overfit to getting your money instead of creating real advances; competition amongst them will drive up the investment costs. Pretty much what has been happening.
> I think they are betting on what they might be able to do in the future.
Yeah, blind hope and a bit of smoke and lighting.
> but I don't think we've reached the limits with synthetic data
Synthetic data, at least for visual stuff, can in some cases provide the majority of training data. For $work, we can have, say, 100k synthetic video sequences to train a model, which can then be fine-tuned on, say, 2k real videos. That gets it to slightly under the quality it would have if it were trained on purely real video.
So I'm not that hopeful that synthetic data will provide a breakthrough.
I think the current architecture of LLMs is the limitation. They are fundamentally a sequence machine and are not capable of short- or medium-term learning. Context windows kinda make up for that, but they don't alter the starting state of the model.