I think the issue is that some statements here assert a generalization but cite only a specialization, a special instance, as proof of that generalization.
For example: NNs are not just search engines but are more than that, because:
Proof 1:
I have a simple counterpoint - RETRO is pretty much an explicit “search engine” as you call it, powered by LLMs. Yet it has none of the meta-learning, few-shot capabilities of other models. This idea that LLMs are just search engines is quite a laughable one - it’s like saying the brain is just a bag of electricity.
Proof 2: Cite paper 2
Proof 3: Cite paper X
The generalization should be qualitatively proved first, before showing evidence. A qualitative proof, if one exists, suffices for the generalization; the presentation of evidence alone does not, because the world is stochastic - even science prefers falsification.
Firstly, we should define what a search engine is and make the definition as general as possible, especially if the tone of the assertions claims to generalize (e.g. X is a search engine, Y is not a search engine).
The definition you gave of a search engine is not general because it mentions an implementation detail - the Key. Let me show you why:
Fact: The definition above of a search engine accepts a web search engine such as Google, because Google asks for a search key (e.g. words) and returns results.
Fact: Google uses AI and ML algorithms to search for content from the internet.
Let’s say Google(Key) = Content is a function that takes a key and returns the content searched from the internet. If we expand this function, it will likely look like:
Google(Key) = SomeFunctionConvolution(ClassicSearchFunctions, AISearchFunctions, MLSearchFunctions, SomeSearchFunctionX, SomeSearchFunctionY)
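A rough sketch of that composition in code - every function name here is hypothetical and purely illustrative, not Google’s actual architecture:

```python
# Hypothetical sketch: a modern search engine as a composition of many
# sub-search strategies. All names below are made up for illustration.

def classic_keyword_search(key):
    # stand-in for inverted-index / link-analysis style retrieval
    return [f"classic result for '{key}'"]

def ml_ranking_search(key):
    # stand-in for a learned model re-scoring or retrieving candidates
    return [f"ml-ranked result for '{key}'"]

def search_engine(key):
    # the "SomeFunctionConvolution" part: merge every sub-engine's output
    results = []
    for sub_search in (classic_keyword_search, ml_ranking_search):
        results.extend(sub_search(key))
    return results

print(search_engine("htm"))
```

The point of the sketch: the outer signature still looks like Google(Key) = Content, but the interesting part is the mixture of computations hidden inside.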
If one gets the intuition here, one can see that the search-engine definition above is extremely restrictive and should only be used for specialization purposes - it is a poor basis for comparison with NNs, HTMs, or internet search engines. I think it is safe to use it specifically for relational-database search engines. The Google search engine is more than a classical search engine.
Hence, it is more meaningful to use the term search engine for a system that searches for an item using a specific kind of computational process - a more generalized definition. That computation is either an algorithm explicitly designed to search, or one that presents an emergent searching behavior. The former is obvious for NNs: scientists and engineers invented Gradient Descent, a mathematical algorithm, to search for local minima. The latter is less obvious, because such systems’ core intents can objectively be viewed as serving not search alone but something else, such as the survival of some species.
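To make the “designed to search” case concrete, here is a minimal gradient-descent sketch (toy function and step size chosen for illustration) that literally searches for the local minimum of f(x) = (x - 3)^2:

```python
# Minimal sketch: gradient descent as an explicit *search* procedure.
# It looks for a local minimum of f(x) = (x - 3)^2 by repeatedly
# stepping against the gradient.

def grad(x):
    # derivative of f(x) = (x - 3)^2
    return 2 * (x - 3)

x = 0.0    # starting guess
lr = 0.1   # step size (toy value)
for _ in range(200):
    x -= lr * grad(x)   # move opposite to the gradient

print(round(x, 4))  # converges near the minimum at x = 3
```

Nothing about the loop “knows” the answer in advance; it is an iterative search through the space of candidate x values.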
IOW, I don’t think it is productive to use the S(Key) = Item definition of a search engine, because it leaves out the inherent computational details.
So when @david.pfx said (below) that current DL is a search engine, I agree it inherently is one: it is a system that searches for optimal parameters that can fit/model some dataset, even though it is engineered to predict or classify (which is what we see right away). If we restrict the search-engine definition to S(Key) = Item, then of course DL is not a member of it, and more importantly, neither are today’s internet search engines. Perhaps only a relational-database search engine fits that definition. My best guess is that we are not talking about this naive type of search engine, so it is more productive to use a generalized search-engine definition.
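A toy illustration of “DL as a search over parameters” - the dataset and grid are made up, and brute-force grid search stands in for gradient-based training:

```python
# Toy sketch: the training phase *searches* for the slope w that best
# fits a tiny made-up dataset y ~ w * x; the deployed model then just
# predicts, which is the part we "see right away".

data = [(1, 2.1), (2, 3.9), (3, 6.2)]   # made-up (x, y) pairs, roughly y = 2x

def loss(w):
    # sum of squared errors for candidate parameter w
    return sum((w * x - y) ** 2 for x, y in data)

# the "search" phase: scan candidate parameters 0.00 .. 4.00
candidates = [i / 100 for i in range(0, 401)]
best_w = min(candidates, key=loss)

# the "prediction" phase: apply the found parameter
print(best_w, best_w * 4)
```

Real DL replaces the grid with gradient descent over millions of parameters, but the structure is the same: search first, then use the found parameters.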
FWIW my working hypothesis is:
- Current ML is not AI, it’s mostly just a fancy search engine. It searches a multidimensional space for matches on text/images. It takes a lot of parameters to define interesting regions in that space. Practical ‘AI’ adds an output layer of engineered code to do something useful with the search results.
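The hypothesis above - “it searches a multidimensional space for matches” - can be illustrated with a toy nearest-neighbor lookup; the item names and vectors below are invented for the example:

```python
import math

# Toy sketch of the hypothesis: items live as points in a
# multidimensional space, and "search" means finding the nearest
# match to a query vector. Vectors are made up for illustration.

items = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.4],
}

def nearest(query):
    # brute-force nearest neighbor by Euclidean distance
    def dist(v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(query, v)))
    return min(items, key=lambda name: dist(items[name]))

print(nearest([0.88, 0.12, 0.02]))  # closest point is "cat"
```

An “output layer of engineered code”, in this framing, is whatever the application then does with the matched item.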