Mind blown!

Bitking · March 30, 2018, 1:29am

Zipf's law.
It’s magic!

I have no idea why Discourse wants to start this in the middle. You should watch the whole thing.

abshej · March 30, 2018, 2:26am

I blame grammar rules.

Paul_Lamb · March 30, 2018, 2:39am

… he said quizzaciously

Bitking · March 31, 2018, 4:36pm

Still, I have to ask: what part of speech do these high-repetition words serve as?
How much of human language is devoted to what part(s) of speech?
Is it the same part of speech in other languages?

Ed_Pell · March 31, 2018, 5:48pm

I, you, they, it: pronouns, placeholders, point to a prior

the, a: articles, we know, one of us knows

and, but: conjunctions, set clusivity

be, is, was: verbs of existence

of, to, in, for, with, on: prepositions, form relationships between words

have: verb of contingent existence

as: conjunction, conditional or comparison

that: relative pronoun or demonstrative pronoun, indicating a relative clause or acting as a pointing word

They are not nouns or verbs. They help us to externalize i-language.
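
One way to put rough numbers on "how much of language is devoted to which part of speech" is to tally the tokens of a text against the list above. This is a minimal Python sketch, assuming plain English text and a crude regex tokenizer. The `FUNCTION_WORDS` table just copies the words from the list, and the names `tokenize` and `category_shares` are made up for this illustration.

```python
import re
from collections import Counter

# Word-to-category table copied from the list above. The category
# labels are shortened for this sketch.
FUNCTION_WORDS = {
    "pronoun": {"i", "you", "they", "it"},
    "article": {"the", "a"},
    "conjunction": {"and", "but", "as"},
    "verb of existence": {"be", "is", "was"},
    "preposition": {"of", "to", "in", "for", "with", "on"},
    "verb of contingent existence": {"have"},
    "relative/demonstrative pronoun": {"that"},
}


def tokenize(text):
    """Lowercase the text and return runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def category_shares(text):
    """Return the fraction of tokens falling in each category.

    Tokens that are not in FUNCTION_WORDS are counted as "other".
    """
    tokens = tokenize(text)
    if not tokens:
        return {}
    counts = Counter()
    for tok in tokens:
        for category, words in FUNCTION_WORDS.items():
            if tok in words:
                counts[category] += 1
                break
        else:
            counts["other"] += 1
    return {category: n / len(tokens) for category, n in counts.items()}


if __name__ == "__main__":
    sample = "The cat is on the mat and it was happy"
    for category, share in sorted(category_shares(sample).items()):
        print(f"{category}: {share:.0%}")
```

The lookup is by surface form only, so a word like "that" always lands in one bucket even though it can play several roles. A real answer would need a part-of-speech tagger and a much larger corpus.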

Bitking · March 31, 2018, 5:52pm

Superb, exactly the level needed to do the rest of this.

Do we have the same word list for other languages AND do we have the same level of word description in those languages?

Lists? Check!
https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists
Example:
https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists/Spanish1000

Now on to some sort of automatic dictionary?
I feed Google Translate these words:

que
de
no
a
la
el
es
y
en
lo
un
por
qué
me
una

and I get back this:

that
of
No
a
the
the
is
and
in
what
a
by
what
me
one
Repeat as needed for your favorite language.

Bitking · March 31, 2018, 6:13pm

Arabic - maybe harder to match up.
Lemma, Frequency, Part-of-speech

مِن‎ (min), 3226, preposition
ٱللَّه‎ (l-lah), 2699, proper noun
فِي‎ (fī), 1701, preposition
إِنّ‎ (ʾinn), 1682, accusative particle
عَلَىٰ‎ (ʿalā), 1445, preposition
ٱلَّذِي‎ (llaḏī), 1442, relative pronoun
لَا‎ (lā), 1364, negative particle
مَا‎ (mā), 1266, relative pronoun
رَبّ‎ (rabb), 975, noun
إِلَىٰ‎ (ʾilā), 742, preposition
مَا‎ (mā), 704, negative particle
مَن‎ (man), 606, relative pronoun
إِن‎ (ʾin), 578, conditional particle
أَن‎ (ʾan), 578, subordinating conjunction
إِلَّا‎ (ʾillā), 558, restriction particle
Feed Google Translate this:
مِن‎
ٱللَّه‎
فِي‎
إِنّ‎
عَلَىٰ‎
ٱلَّذِي‎
لَا‎
مَا‎
رَبّ‎
إِلَىٰ‎
مَا‎
مَن‎
إِن‎
أَن‎
إِلَّا‎
Get this back:
From
God
In
The
On
Who?
No
What
Lord
To
What
From
The
That
Only

Keep in mind that this is drawn from the text of the Quran so there is likely to be a god bias in the list.
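
Since the list above comes with actual counts, it is easy to check how Zipf-like it looks. Zipf's law says frequency falls off roughly as 1/rank^s, with s near 1, so log(frequency) plotted against log(rank) should be close to a straight line with slope -s. This is a minimal Python sketch, assuming only the 15 counts quoted above; the function name `zipf_exponent` is made up for this illustration. Fifteen words from a single text is far too small a sample to prove anything, so treat the result as a curiosity.

```python
import math

# Frequencies copied from the list above, in rank order.
FREQS = [3226, 2699, 1701, 1682, 1445, 1442, 1364, 1266,
         975, 742, 704, 606, 578, 578, 558]


def zipf_exponent(freqs):
    """Estimate s in frequency ~ 1 / rank**s.

    Fits a least-squares line to log(frequency) versus log(rank)
    and returns the negated slope.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


if __name__ == "__main__":
    print(f"estimated exponent: {zipf_exponent(FREQS):.2f}")
    # Under an ideal Zipf curve with s = 1, rank * frequency is constant.
    for rank, freq in enumerate(FREQS, start=1):
        print(rank, freq, rank * freq)
```

The same function can be pointed at any of the Wiktionary frequency lists, as long as the counts are supplied in rank order.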

Bitking · March 31, 2018, 6:26pm

Of course - like everything else in neuroscience: when you look, you find that someone has already been doing this. Zipf's law gets a tiny footnote here.

Nice paper on the topic:

Ed_Pell · March 31, 2018, 6:55pm

Keep in mind Russian has no word "the".
There is a lot of conservation across languages. 

|и|         and|
|в (во)|    (+pr) in; (+a) into, to|
|не|        not|
|на|        (+pr) on, at; (+a) onto, to|
|я|         I|
|он|        he|
|что|       what, that|
|с (со)|    (+inst) with; (+g) from, off|
|это|       this, that, it|
|быть (i) (pres: есть)|    to be; there is, there are|
|а|         and, but (slight contrast)|
|весь (f вся, n всё, pl все)|    all|
|они|       they|
|она|       she|
|как|       how, as, like|
|мы|        we|
|к (ко) (+d)|    towards, to|
|у (+g)|    by; at (used in ‘have’ construction)|

Ed_Pell · March 31, 2018, 7:22pm

Arabic http://www.qamus.org/wordlist.htm

from
in
on
about
that
which
to/towards
which
with
this

This is different than English.

Feyn.io · March 31, 2018, 7:29pm

Even more mind-blowing is that it can apply to almost everything. Take my cat: every day, and probably throughout his lifetime, he is 20% awake and 80% asleep.

jake · March 31, 2018, 8:33pm

I would think 80/20 or 20/80 would be a good ratio for a signal-to-noise filter as well…the perfect joke is probably 80 percent predictable with a 20 percent twist.

dotsteve · April 1, 2018, 10:18am

Thanks Mark. My head will be spinning for days.

fine2100 · April 1, 2018, 12:38pm

I have read this discussion with interest, especially the attention given to “words” and word frequencies in different languages. Since this discussion is fundamentally about the relation between the structure of verbal behavior and the brain's systems for processing and creating information about its internal and external environments, I would like to point out some fundamentals that the debate has not touched on but that matter for the questions raised and the answers provided. First of all, the brain asks questions, answers questions, and creates behaviors that show what its questions and answers were or are…humans don't drink water unless the brain has asked “Am I thirsty?” or “Should I drink now to survive in the desert?” or questions of that sort. Proof of this: drinking water is not a random behavior depending on a randomized decision process…only very few behaviors are the result of randomized decisions. Instead, a solid sequence of questions and answers lies behind most behaviors.
So “doubt” (= entropy) is fundamental, and is most probably hardwired into the structure of the brain (in the neocortex) and the subcortex is the machine delivering the answers…
Now we arrive at human languages…it is interesting that all human languages (at least the roughly 50% most important languages I have researched) use the same six question words to articulate doubt: where, what, which, why, when, how…and in Indo-European languages and Japanese these six question words are also the same, though these two language lines separated more than 8000 years ago…so these six question words are all we need to raise qualified doubt…then basically we use nouns, verbs and adjectives to produce one-dimensional strings about multidimensional questions, answers and behaviors…the interesting thing is that there are only six question words across all languages…they are enough to qualify doubt in action (= the verbal behavior)…
Now we can ask why it is possible at all to translate one language into another. This is of course because any language is based on asking and answering these six questions and transforming the answers into action…this is the only fact that can account for any language being meaningfully translatable into another…it is because language models the same physical realities that restrict action all around the world for all peoples. When we see linguistic novelties in the form of new words, it is because we have discovered new distinctions in nature that we can name…
So my advice to you brain scientists who want to use texts as data (to discover something about the text, the writer, the brain???)…you should start by understanding how one could automate the analysis of how many questions and how many answers a given text poses…and how many commands for action…because this is always how texts reflect what goes on in the brain…unfortunately most texts are incomplete and rest on unwritten assumptions and expectations…like this text: I am thirsty, go get me the usual…!!! The first sentence, “I am thirsty”, is an answer…then “go get me the usual”…is difficult…is it a question, or a command for action based on an answer to a question…as it is written it is a command…and the question was “what will get me the usual”…
It is analysis of this kind that will make the correct splits of texts and sentences, especially if the analysis can end up asking the relevant questions that enable the machine to construct a text without any hidden assumptions and expectations…this will be the first step for AI language machines…find and ask the questions…then receive the answers…and then figure out the actions…all this will be based on a logical theory of language, which I have shown is about q/a and resolving doubt…and then on empirical data from language users…and only then on empirical data from decision makers and actors about how to act…
Regards
Finn Gilling
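
As a very rough illustration of what counting questions, answers and commands in a text could look like, here is a minimal Python sketch. It is a naive heuristic stand-in, not the analysis described in the post: a clause counts as a question if it ends in "?" or starts with one of the six question words, as a command if it starts with a verb from a tiny hand-picked list, and as an answer otherwise. The names `split_clauses`, `classify` and `tally`, and the `COMMAND_STARTERS` list, are all assumptions made for this example.

```python
import re
from collections import Counter

# The six question words named in the post.
QUESTION_WORDS = {"where", "what", "which", "why", "when", "how"}

# Hypothetical, deliberately tiny list of imperative verbs,
# used only to keep this sketch self-contained.
COMMAND_STARTERS = {"go", "get", "bring", "give", "take", "stop", "find", "ask"}


def split_clauses(text):
    """Split text into (clause, terminator) pairs on . ! ? … , and ;"""
    pairs = re.findall(r"([^.!?…,;]+)([.!?…,;]*)", text)
    return [(clause.strip(), term) for clause, term in pairs if clause.strip()]


def classify(clause, terminator):
    """Label one clause as "question", "command" or "answer"."""
    words = re.findall(r"[a-z']+", clause.lower())
    if not words:
        return "answer"
    if "?" in terminator or words[0] in QUESTION_WORDS:
        return "question"
    if words[0] in COMMAND_STARTERS:
        return "command"
    return "answer"


def tally(text):
    """Count how many clauses of each kind a text contains."""
    return Counter(classify(clause, term) for clause, term in split_clauses(text))


if __name__ == "__main__":
    print(tally("I am thirsty, go get me the usual…!!!"))
```

On the example sentence from the post this labels "I am thirsty" as an answer and "go get me the usual" as a command, which matches the reading given above. It does nothing about the harder part, recovering the unwritten question behind a command, and it will mislabel clauses such as "What he said was true".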

jake · April 1, 2018, 3:35pm

I bet this is because these words are often stored as predictive strings…differently in different languages. Just a thought…could we do a functional MRI looking directly at the language areas of the cortex of a bilingual or multilingual subject? I expect that as language is learned these predictive strings form in similar ways across languages. This might explain phonics…sounds and letter combinations form into repeated words and words into repeated strings (mannerisms of speech) and so on…as I think I mentioned earlier, predicting a baby’s first word could be done by observing the most common word and object exposures as well as interactions with these items and projecting onto an SDR. I think conventions in writing are probably just the brain’s way of doing data compression on strings of commonly used words, and these develop into mannerisms of speech.
I am thinking particularly about how my father constructs speech in German in very similar ways to that of his English speech patterns.

Bitking · April 1, 2018, 4:59pm

As far as fMRIs to match up word to cortex location, this paper and its references should be a good start if you are interested in continuing this line:
http://www.cell.com/trends/cognitive-sciences/pdf/S1364-6613(13)00122-8.pdf

jake · April 2, 2018, 4:07am

Has anyone ever tried to rank the behaviours of animals in the same way as words? I bet it follows the same curve, because it’s a language.

Emotions are the language before there are words, and behaviour is the externalized manifestation of this language. The neocortex has to take this emotional input because it’s baked in. That, I think, makes sense of a lot of human nature.
Almost like having to be backward compatible with DOS…ha ha…little computer joke…
