4 Comments
議事之峰 The Gentleman Tech Bro:

I came across this description of LLMs a couple of months ago, and I really like it: "leaky abstraction". In my experience, the less intelligent the model, the leakier the abstraction.

楓葉國小K:

I feel the same way. Models from a few years ago often behaved in ways that confused users, and the newer ones perform much better. That improvement introduces a new challenge, though: how do we keep the system behaving consistently when we switch to a newer model to take advantage of its better performance?
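Concretely, the kind of thing I have in mind is a small regression check run before swapping models. This is only a sketch; `call_model` is a stand-in for whatever client you actually use, and exact string comparison is the crudest possible check:

```python
def call_model(model_name: str, prompt: str) -> str:
    """Hypothetical wrapper around your LLM provider; replace with the real client."""
    raise NotImplementedError


def regression_check(prompts: list[str], old_model: str, new_model: str) -> list[str]:
    """Return the prompts where the new model's output drifts from the old one's."""
    drifted = []
    for prompt in prompts:
        old_out = call_model(old_model, prompt)
        new_out = call_model(new_model, prompt)
        # Exact-match comparison is deliberately naive; in practice you would
        # compare parsed JSON, run assertions, or score outputs with a judge model.
        if old_out.strip() != new_out.strip():
            drifted.append(prompt)
    return drifted
```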

議事之峰 The Gentleman Tech Bro:

Interesting question. Before you brought it up, I thought things would simply get easier: you can now tell the model what output you need, and it will consistently produce it. Tool use and agentic workflows have made that even easier.

(Agents can cross-check each other, though admittedly additional complexity means additional risk.)

In fact, the main engineering challenge I can think of is ripping out the manual, heuristic safeguards that the dumber models used to require.

Am I missing something?

楓葉國小K:

You're probably right. I'm still fairly new to this field and exploring how Gen AI works in practice. I recently tried building a ReAct agent and noticed that, even with the same prompt, different models can behave quite differently, e.g. one produces the correct output but calls a tool multiple times unnecessarily, which shouldn't happen. For now I've been thinking about a crude guard against that, sketched below.
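Roughly this: memoise (tool name, arguments) pairs so an identical repeated call at least doesn't re-execute. `run_tool` is a stand-in for the real tool dispatcher, not any particular framework's API:

```python
import json


def run_tool(name: str, args: dict) -> str:
    """Hypothetical tool dispatcher; replace with your actual tools."""
    raise NotImplementedError


class ToolCache:
    """Caches tool results so a repeated identical call returns the stored observation."""

    def __init__(self) -> None:
        self._seen: dict[tuple[str, str], str] = {}

    def call(self, name: str, args: dict) -> str:
        key = (name, json.dumps(args, sort_keys=True))
        if key in self._seen:
            # The model asked for the same tool with the same arguments again;
            # return the cached observation instead of re-running the tool.
            return self._seen[key]
        result = run_tool(name, args)
        self._seen[key] = result
        return result
```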

My next step is to experiment with multiple agents, as you mentioned, and have them cross-check each other. That seems like a more promising direction.
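The shape I'm picturing is something like this: one model drafts an answer and a second model grades it before it is accepted. Again just a sketch, with `call_model` as a placeholder for the real client:

```python
def call_model(model_name: str, prompt: str) -> str:
    """Hypothetical LLM call; replace with the real client."""
    raise NotImplementedError


def answer_with_verification(question: str, worker: str, checker: str) -> str:
    """Have `worker` draft an answer and `checker` approve or reject it."""
    draft = call_model(worker, question)
    verdict = call_model(
        checker,
        f"Question: {question}\nProposed answer: {draft}\n"
        "Reply PASS if the answer is correct and complete, otherwise FAIL with a reason.",
    )
    if verdict.strip().upper().startswith("PASS"):
        return draft
    # On FAIL you might retry, ask the worker to revise, or escalate to a human.
    raise ValueError(f"Checker rejected the answer: {verdict}")
```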