Part 4 : Generative AI/ChatGPT Models — State of the Union
Quick Background ( … for those who came in late)
This is Part 4 of “A ChatGPT guide for Techno-Executives”.
This is the last of the 4 weekly blogs: a set of (~5/6/7 soon) Byte-sized, Targeted, Nimble, Visual (with appropriate pictures) and Snackable blogs (each a few short pages, going well with a few shots of a single malt, preferably triple distilled), aiming to be “usefully wrong”, like our protagonist ChatGPT !
- “Part 1 : Are we Doomed” [Here] dealt with how to think about ChatGPT / GenerativeAI / “Increasingly Multi- or General-Purpose AI”
- “Part 2 : ChatGPT Threat Vectors & Guardrails for LLMOps” [Here] introduced concepts on how to think about threats and mitigations. It also looked at the canonical use cases as well as the team, talent & organization needed to support ChatGPT projects. These are a little different from traditional projects !
- “Part 3 : ChatGPT — A Few Good Numbers & Current Wisdom (of the crowds!)” [Here] has a few numbers viz. estimates of the market size, the unicorns, model training resources et al.
- Here we will explore the LLM ecosystem, and keep it very short
- For the next few weeks, I have some ideas on Transformers that might make a good “Part 5 : An Ode to a Transformer” and maybe a “Part 6 : Explainability of the Unexplainable”.
- In between I wrote an interlude blog “Part 5a. ChatGPT- The Smooth Talking Stochastic Parrot” [Here]
Back to the main feature …
The LLM Models
The Transformer-based LLMs have evolved: they are more capable and they are also much larger. As we saw earlier, they cost billions of dollars to train !
I thought raising a kid was expensive; raising an AI is a lot more expensive !
The major players are, of course, Google, OpenAI and startups like Anthropic. Microsoft is a major investor in OpenAI and is adding Generative AI capabilities to Windows 11 as well as the Azure cloud.
Is there a winner ? Do Defensible Moats Exist ?
An interesting internal paper from Google was leaked, basically saying that there is no long-term differentiation and that open source models are catching up.
Prof. Domingos had good observations.
Generative AI will soon become the status quo, an essential component of an organization’s tech stack.
But the barriers to creating an LLM from scratch are extremely high, so defensible moats probably do exist in the Generative AI space.
Let us look at Google. Just after OpenAI released ChatGPT, the chatter was that Google was lost, that its search empire was under siege and that it had lost its edge. Remember, Google started the Transformer architecture and had the edge till then. In May, Google opened up Bard more widely and made a host of announcements.
Then they started calling ChatGPT the floppy disk of AI !
Eventually search, as we know it, might go away; Amazon and Google might become obsolete, and Generative AI might change the ad ecosystem. But for the short term, calamity averted !
In the end, Google did (somewhat) level the playing field. And, of course, OpenAI added more features (plugins et al.) and opened connectivity to other apps.
Microsoft
Soon after, at their Build conference, Microsoft made a host of very advanced announcements, integrating Generative AI into a host of domains, from Windows to developer platforms and the cloud.
Anthropic
One space that needs a lot of advancement is the area of ethics, toxicity et al. Anthropic has Constitutional AI, an approach that embeds ethics in the LLM itself, very reminiscent of Isaac Asimov’s Three Laws of Robotics ! It looks promising. They need to embed more laws, as depicted in the diagram below.
What about Open Source ?
The current consensus is that there will be open source LLMs that are equally good. I like the opinions at unusual.vc (Here). I liked the picture of the robot as well !!
Final word on Generative AI moats
Andreessen Horowitz has a good blog on this topic.
So far, there are no strong, durable moats, though there may be competitive advantage at certain layers of the stack. That makes sense, as the domain is very new and (as we discussed earlier) there are lots of difficult threat vectors.
There won’t be one model to rule them all !
An insightful blog sums it up: there will be a long tail of general use cases (supported by commodity LLMs) and then there will be high-value, specialized systems (maybe because of the domain, maybe due to security requirements) that need deep technical stacks including guardrails, very large models, curated data and so forth.
In a span of a few minutes we went from AI to floppies to dancing bears to moats ! I hope this short blog gave you a broad view of the LLM ecosystem !
In between, on to my interlude blog “Part 5a. ChatGPT- The Smooth Talking Stochastic Parrot” [Here]
Onward to the next blog “Part 5 : An Ode to a Transformer !”
Actually, Part 5 is the blog I wanted to write; it gives me a chance to dive deep into transformers.
I had conducted a Transformers class at the Nvidia GTC 2020 (Here); it is going to take a few more weeks to add the latest developments and summarize.
Over the Memorial Day weekend I teed up the 4 blogs and now plan to do some fine tuning, which I will have done by the time y’all see this blog !