From Boolean Search to LLMs
The world of AI prides itself on a constant stream of new jargon that updates weekly. In conversations where words like "agentic AI," "embedding," and "parameters" got tossed around like popcorn, it was easy to feel lost. I needed a shortcut to know just enough so I could do my job effectively.
I thought that the shortcut was my husband, a seasoned AI researcher who co-authored the "word2vec" paper, until I asked him to explain "parameter". His response?
“It is the number of weights and biases contained in the model's tensors.”
So I went to ChatGPT and Gemini and asked: "What do parameters actually look like?" I picked an answer I could visualize in my head and asked my husband to validate it. Of course, the answer was grossly simplified, but a grossly simplified understanding is way better than no understanding at all!
So, in this and upcoming blog posts, I am going to grossly simplify the most fundamental AI concepts by turning them into examples and images. Let's start with what is not AI.
What Is NOT AI?
If you went to law school 20+ years ago, you might remember searching for case law on the 4th Amendment to the U.S. Constitution using Boolean search strings like this:
((("search" OR "seizure") NEAR/20 "probable cause"))
LexisNexis and Westlaw at the time were rule-based systems that applied rules written by humans in the form of IF-THEN statements:
IF a document contains the word search OR seizure WITHIN 20 words of the phrase probable cause,
THEN return that document.
This rule‑based approach is not AI because there is no learning involved.
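To make the IF-THEN rule concrete, here is a minimal sketch of how such a proximity rule could be implemented. The function name and the word-splitting details are my own invention for illustration, not how LexisNexis or Westlaw actually work:

```python
def matches_rule(document, window=20):
    """Sketch of the Boolean rule:
    ("search" OR "seizure") NEAR/20 "probable cause"."""
    # Lowercase the text and strip simple punctuation.
    words = [w.strip('.,;:"()').lower() for w in document.split()]
    # Positions where "search" or "seizure" appears.
    terms = {i for i, w in enumerate(words) if w in ("search", "seizure")}
    # Positions where the two-word phrase "probable cause" starts.
    phrase = {i for i in range(len(words) - 1)
              if words[i] == "probable" and words[i + 1] == "cause"}
    # IF a term appears within `window` words of the phrase, THEN match.
    return any(abs(t - p) <= window for t in terms for p in phrase)
```

Note there is no learning anywhere in this function: every condition was written by a person, and the rule will never get better on its own.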
What Does AI Look Like?
When we talk about AI, we typically refer to AI models, which consist of millions or billions of numbers connected in complex matrices. Unlike rule-based systems, AI models are built with machine-learning techniques: the model learns by "reading" a very large number of examples and discovering patterns on its own. Using the 4th Amendment example above, an AI model would be able to find the sentence below if you searched for "4th amendment probable cause":
"the affiant had reasonable grounds at the time of his affidavit . . . for the belief that the law was being violated on the premises to be searched"
To achieve this, the model would have to "read" thousands (or more) of sentences, some on point for the 4th Amendment and some not. Once it has read enough, it can do a "fuzzy match" between "4th amendment" and "reasonable grounds" and "search".
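One grossly simplified way to picture that fuzzy match: the model represents each phrase as a list of numbers, and phrases with similar meanings get lists that point in similar directions. The three-number vectors below are made up purely for illustration (a real model learns vectors with hundreds or thousands of dimensions), but the comparison math is the standard cosine-similarity formula:

```python
import math

def cosine_similarity(a, b):
    """How closely two number lists point in the same direction (1.0 = identical)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "meaning" vectors, invented for this example only.
vectors = {
    "probable cause":     [0.9, 0.8, 0.1],
    "reasonable grounds": [0.8, 0.9, 0.2],
    "pizza recipe":       [0.1, 0.0, 0.9],
}

query = vectors["probable cause"]
# "reasonable grounds" scores far closer to the query than "pizza recipe",
# even though it shares no words with "probable cause".
```

This is why the model can surface "reasonable grounds" for a "probable cause" search where a Boolean string would come up empty.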
AI is everywhere in our lives. Every time we use our credit cards, the banks are using AI to detect fraudulent transactions. Every time we watch a movie on Netflix, AI learns our preferences and suggests what we might want to watch next.
Before 2022, AI models were mostly used by people with computer or data science degrees. The arrival of ChatGPT in November 2022 changed that. The AI models behind ChatGPT are called Large Language Models or LLMs.
What Do LLMs Look Like?
Large language models are supersized AI models. When researchers increased the size of an AI model from a few million parameters to 175+ billion, it could be trained on the entire internet and more. At that scale, the model can write poems, draft memos, offer travel ideas, or even generate audio and video using knowledge from its training. This ability is called "generative AI" because the model generates new content.
When asked to write a memo about the 4th Amendment using the sentence above, an LLM would continue writing something like this, eventually filling up several pages:
"...and if the apparent facts set out in the affidavit are such that a reasonably discreet and prudent man would be led to believe that there was a commission of the offense charged, there is probable cause justifying the issuance of a warrant."
What Do AI Systems With Rule-Based Components Look Like?
Many AI systems using LLMs still include rule-based components. Using the example above, an AI system might add the following IF-THEN rules before sending the search query to an LLM:
"Block any search from an IP with more than five failed attempts."
"Do not return any case that was decided before 1972."
Rule-based systems are efficient: they cost less to build and run, and they can be implemented quickly without a lot of data. Setting up a rule-based system as the first line of defense can reduce costs and improve performance before handing off more complex tasks to AI models.
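The two rules above could sit in front of the LLM as a simple gate. Everything in this sketch, including the function names, the IP table, and the case data, is invented for illustration:

```python
# Hypothetical tally of failed attempts per IP address (made-up data).
FAILED_ATTEMPTS = {"203.0.113.7": 6}

def prefilter(ip, cases):
    """Apply cheap IF-THEN rules before any expensive LLM call."""
    # Rule 1: block any search from an IP with more than five failed attempts.
    if FAILED_ATTEMPTS.get(ip, 0) > 5:
        return None  # blocked; the LLM is never called
    # Rule 2: do not return any case that was decided before 1972.
    return [case for case in cases if case["year"] >= 1972]
```

Only queries that survive both rules would go on to the LLM, which is where the cost savings come from.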
Conclusion
AI jargon can seem intimidating, but by turning complex technical concepts into grossly simplified examples and images, we can learn just enough to do our jobs effectively.
What is not AI: IF-THEN rule-based systems using techniques such as Boolean search.
e.g., find a 4th Amendment case using ((("search" OR "seizure") NEAR/20 "probable cause"))
What is AI: systems that can learn from a large amount of data.
e.g., fuzzy match between "4th Amendment" and "probable cause" and "search"
What is LLM: supersized AI that can generate text and other content based on its learning.
e.g., ChatGPT, which can write a memo on the 4th Amendment
In my next blog, I will grossly simplify tokens, vectors, embeddings, and parameters. Don't miss out!