I was in a discussion yesterday about introducing young people (17-18) to generative text models. I noted these thoughts from James, who recently held a Gen AI literacy workshop for older teenagers.
On risks:
One idea I had was to ask a generative model a question and then fact-check its points in front of students, so they see fact-checking as part of the process. It must be clear upfront that while AI-generated text may be convincing, it may not be accurate.
On usage:
Generative text should not be positioned, or used, as a tool that entirely replaces tasks; that could disempower students. Rather, it should be taught as a creativity aid, and such a class should involve an exercise in making something.
What we (people in general, that use the internet, regardless of government/country) need, in large part, is literacy. Not "gen AI literacy" or "media literacy", but simply "literacy".
I'm saying this because a lot of the output of those text generators says a lot without conveying much, connects completely unrelated concepts because they happen to use similar words, makes self-contradictory claims, and so on. Often its statements are completely unrelated to the context at hand. People with good literacy detect those things right off the bat, but people who struggle with basic reading comprehension don't.
The thing that strikes me about LLMs is that they were created to chat. To converse. They're partly influenced by the Turing test, where the objective is to convince someone you're human by keeping up a conversation. They weren't designed to create meaningful or factual content.
People still seem to want to use ChatGPT to create something, and then fix the accuracy as a second step. I say go back to the drawing board and create a tool that analyses statements and tries to create information based on trusted linked open data sources.
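To make that concrete, here is a rough sketch of what querying one trusted linked open data source could look like. It is only an illustration, assuming Wikidata's public SPARQL endpoint and its "head of state" property (P35), not a design for a real fact-checking tool.

import requests

WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"

def head_of_state(country_label: str) -> list[str]:
    """Ask Wikidata who is recorded as head of state (P35) of a country."""
    query = """
    SELECT ?personLabel WHERE {
      ?country rdfs:label "%s"@en ;
               wdt:P35 ?person .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    """ % country_label
    resp = requests.get(
        WIKIDATA_SPARQL,
        params={"query": query, "format": "json"},
        headers={"User-Agent": "fact-check-sketch/0.1"},
    )
    resp.raise_for_status()
    rows = resp.json()["results"]["bindings"]
    return [row["personLabel"]["value"] for row in rows]

# A generated claim about a country's ruler could then be compared
# against what the open data source actually records.
print(head_of_state("Italy"))

A lookup like this is obviously brittle, but it shows the direction: statements grounded in a queryable source rather than in next-word statistics.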
LLMs are not created to chat; they're literally what the name says: language models. They are very complex statistical models of the probability of each next word given the previous words in the context window. There's a common misconception among the wider public that they're "made for chat" because ChatGPT was the first killer application, but they are much more general than that. What's so profound about LLMs to AI and NLP engineers is that they're general purpose: given the right framework, they can be used to complete any task expressible in natural language. It's hard to convey to people just how powerful that is, and I haven't seen software engineers really figure this out yet either.
As an example I keep going back to, I made a library for creating "semantic functions" in Python, which look like this:
@semantic
def list_people(text: str) -> list[str]:
    '''List the people mentioned in the given text.'''
That is the entire function, expressed in the docstring. Ten months ago, this would have been literally impossible. I could approximate it with thousands of lines of code using spaCy and other NLP libraries to do NER, maybe a dictionary of known names with fuzzy matching, some heuristics to rule out city names, or more advanced sentence-structure parsing to catch false positives, and the result would be guaranteed to be worse for significantly more effort. Here, I just tell the AI to do it and it… does. Just like that.
But you can't hype up an algorithm that does boring stuff like NLP, so people focus on the danger of AI (which is real, but laypeople and the news focus on the wrong things), how it's going to take everyone's jobs (it will, but that's a problem with our system, which equates having a job with being allowed to live), how it's super-intelligent, and so on. It's all the business logic, and the things that are hard to program but easy to describe, that will really show off its power.
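To make the "semantic function" example a bit more tangible: the sketch below is not the author's library, just one way such a decorator could be wired up, assuming the OpenAI chat completions client and a model that returns a plain JSON array.

import functools
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def semantic(func):
    """Turn a docstring-only function into a prompt for a language model."""
    instruction = func.__doc__.strip()

    @functools.wraps(func)
    def wrapper(text: str):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # any chat model would do
            messages=[
                {"role": "system",
                 "content": instruction + " Reply with a JSON array of strings and nothing else."},
                {"role": "user", "content": text},
            ],
        )
        # A real library would validate against the annotated return type;
        # here we simply trust the model to return valid JSON.
        return json.loads(resp.choices[0].message.content)

    return wrapper

@semantic
def list_people(text: str) -> list[str]:
    '''List the people mentioned in the given text.'''

print(list_people("Ada Lovelace wrote to Charles Babbage about the Analytical Engine."))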
I also think that they should go back to the drawing board, to add another abstraction layer: conceptualisation.
LLMs simply split words into tokens (similar-ish to morphemes) and, based on the tokens found in the input and preceding answer tokens, they throw a die to pick the next token.
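As a toy picture of that "throw a die" step: the vocabulary and scores below are made up, and a real model does this over tens of thousands of tokens, but the mechanics are the same.

import numpy as np

# Imaginary scores (logits) a model might assign to candidate next tokens
vocab  = ["mat", "roof", "moon", "idea"]
logits = np.array([4.0, 2.5, 0.5, -1.0])

temperature = 0.8                 # lower = more deterministic
probs = np.exp(logits / temperature)
probs /= probs.sum()              # softmax: scores -> probabilities

next_token = np.random.choice(vocab, p=probs)   # the die throw
print(dict(zip(vocab, probs.round(3))), "->", next_token)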
This sort of "automatic morpheme chaining" does happen in human Language¹, but it's fairly minor. More than that: we associate individual morphemes and sets of morphemes with abstract concepts². Then we weigh those concepts against our world knowledge³, assign them a truth value, a moral assessment and so on, and finally we recode them back into words. LLMs do not do anything remotely similar.
Let me give you an example. Consider the following sentence:
The king of Italy is completely bald because his hair is currently naturally green.
A human being can easily see a thousand issues with this sentence. But more importantly, we do it based on the following:
world knowledge: Italy is a republic, thus it has no king.
world knowledge: humans usually don't have naturally green hair.
logic applied to the concepts: complete baldness implies absence of hair. Currently naturally green hair implies presence of hair. One cannot have absence and presence of hair at the same time.
world knowledge and logic: to the best of our knowledge, the colour of someone's hair has nothing to do with baldness.
In all those cases we need to refer to the concepts behind the words, not just the words.
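Here is a toy sketch of what checking at the level of concepts might look like, with a couple of hand-written facts and rules; it is purely illustrative and nothing like a real knowledge system, but the checks operate on propositions rather than on word statistics.

# Hand-coded 'world knowledge' and logic, purely for illustration
world_knowledge = {
    ("Italy", "form_of_government"): "republic",   # a republic has no king
    ("humans", "natural_hair_colours"): {"black", "brown", "blond", "red", "grey", "white"},
}

claim = {
    "subject": "king of Italy",
    "completely_bald": True,
    "hair_colour": "green",
    "hair_present": True,   # "his hair is ... green" presupposes hair
}

problems = []
if world_knowledge[("Italy", "form_of_government")] == "republic":
    problems.append("Italy is a republic, so there is no king of Italy.")
if claim["hair_colour"] not in world_knowledge[("humans", "natural_hair_colours")]:
    problems.append("Humans do not usually have naturally green hair.")
if claim["completely_bald"] and claim["hair_present"]:
    problems.append("Complete baldness (no hair) contradicts having hair of any colour.")

print(problems)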
I do believe that a good text generator could model some conceptualisation, and even world knowledge. If such a generator were created, it would easily surpass LLMs even with considerably less linguistic input.
Notes:
By "Language" with capital L, I mean the human faculty, not stuff like Mandarin or English or Spanish etc.
Structuralism would call those concepts "signified", and the morphemes conveying it "signifier", if you want to look for further info. Saussure should be rather useful for that.
"World knowledge" refers to the set of concepts that we have internalised, that refer to how we believe that the world works.
I am a software engineer; I have literally forked TensorFlow and modified the executor, and I have built neural networks for predicting aquaculture KPIs that were deployed with great success.
I stopped looking for a year, and now I feel AI illiterate. (insert "too afraid to ask" meme)
My experience suggests it's too early to start teaching people. Let the technology do its loop and settle down.
I think the people who really need a crash course in AI literacy are the people my age and older.
I've already heard at least one horror story about someone's boss trying to include a ChatGPT-sourced, error-riddled submission paper in a sensitive bid.
Ha ha, sorry, I see now my comment is meaningless without that context. I'm at the very end of Gen X, the so-called Xennial / Settlers of Catan generation.
So basically I'm saying some of the Gen X people and Boomers in senior roles now seem not to understand the limitations of LLMs and are trying to incorporate them anyway.
Not at all. In my experience they teach you how to use the websites and programs needed to complete assignments and nothing more. The same went for teachers and faculty, who would have no idea how to do things like change inputs on displays, turn on projectors, or tell the difference between online and local versions of software.