Blog posts: more colourless green ideas and sad news

Here are just two blog posts I want to give more visibility to (which makes no sense because I have fewer readers than both blogs negatively combined, but anyway):

One is a blog post on History and Philosophy of the Language Sciences, about predecessors to the famous sentence “Colourless green ideas sleep furiously”, showing that authors such as Russell, Carnap and Tesnière made similar points before Chomsky. Go over there to read about gems such as “Quadruplicity drinks procrastination.”

The other blog post is a sad one: Omer Preminger writes about leaving academia. The reason is unfortunately not uncommon: the inability to live together with your partner if one or both of you are in academia.

I wish Omer all the best for the next step. (Inappropriately and egoistically, I of course hope that this doesn’t necessarily mean that all of his blogging stops for good, but that’s beside the point.) The decision can’t have been an easy one, so I hope everything works out well for him in the future.

Post on abstract concepts and weak/strong generative capacity

There is a guest post over at NYU morphlab on Phases and Phrases: Some thoughts on Weak Equivalence, Strong Equivalence, and Empirical Coverage by Hagen Blix and Adina Williams that I highly recommend everyone read.

They start out with a discussion of theories of the movement of celestial bodies (they comment: “an arena where nobody never quarreled ever, so safe travels!”), pointing out that at least for some time, both geo- and heliocentric approaches could explain the same data, so in a sense they were “weakly equivalent” (ignoring the phases of Venus for now).

They then turn to linguistics (and for some reason avoid any pun centered around linguistic invisible ‘movement’) to explore the analogy between the weak equivalence of physics theories and what that concept usually means in linguistics.

To sum up, before we finally return to linguistics: Both models made assumptions about the way bodies move in the heavens (unobservable at the time). Both models could derive the movements of bodies in the sky (observable) as a projection from one onto the other. Both models also made predictions about the phases of celestial bodies, again based on the same assumptions about orbits. But only for one model were these predictions actually in line with the newly observable data, the phases of Venus. This was the new kind of data that made the heliocentric models win out […]

Now, we may say that two models for the movement of celestial bodies are weakly equivalent if they generate the same movement of bright spots across the sky (2D). But only if they were to also assign them the same movement in the heavens (3D) would we call them strongly equivalent. Astronomers, it turns out, did not care whether geocentric models are weakly equivalent to heliocentric ones. They cared about which one covered the larger range of relevant phenomena. 

This means that equivalency (weak or strong) can only be understood as relative to a particular collection of phenomena (i.e., relative to the data that the theory is here to explain). In short, it is scientifically ridiculous to focus on the fact that geocentric theories are weakly equivalent to heliocentric ones relative to predicting the paths of bright spots across the sky. Clearly, that’s not the only desideratum, and it is rather blinkered to get stuck worrying about only one particular type of data. The astronomers of the day knew this, even though nobody could directly observe any movement in the heavens – at that point a purely theoretical, abstract postulate.

They then drive home the point that in theoretical linguistics, too, one has to postulate “unobservables”, and see how far they can take you:

Syntacticians are, unfortunately, in a position reminiscent of the one that 16th century astronomers found themselves in (possibly worse, but probably with less persecution): Just as astronomers postulated unobservable movement in the heavens to explain observable movement in the sky, syntacticians postulate unobservable phrasal nodes (say, a verb phrase in English) to account for the things we directly observe: Strings such as “Jo kicked the prof” or “Mary had them eat a cake” and their associated meanings. To a syntactician, strings are a little like the bright spots moving across the sky: They are the most immediately observable phenomenon we have. We certainly want to explain them. Like the astronomers, though, we too keep uncovering new phenomena, and we definitely care about whether our previous unobservables (our abstract postulates, our structure of phrasal nodes) can account for all of them.

They illustrate with some toy grammars that some grammars are better than others: they capture more regularities with fewer abstract concepts and make the correct predictions about which syntactic constructions can refer to which phrasal nodes.
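To make the weak/strong distinction concrete, here is a minimal sketch of my own (not one of the toy grammars from the guest post). It assumes two hypothetical grammars for strings of one or more a's: a right-branching one, S -> a S | a, and a left-branching one, S -> S a | a. Both generate the same strings but assign them different trees, so on this sample they come out weakly but not strongly equivalent. All function names are made up for illustration.

```python
def right_branching(n):
    """Tree for a^n under the assumed grammar S -> a S | a."""
    if n == 1:
        return ("S", "a")
    return ("S", "a", right_branching(n - 1))


def left_branching(n):
    """Tree for a^n under the assumed grammar S -> S a | a."""
    if n == 1:
        return ("S", "a")
    return ("S", left_branching(n - 1), "a")


def yield_of(tree):
    """Read the terminal string off a tree; tree[0] is the node label."""
    if isinstance(tree, str):
        return tree
    return "".join(yield_of(child) for child in tree[1:])


lengths = range(1, 6)
trees_right = {right_branching(n) for n in lengths}
trees_left = {left_branching(n) for n in lengths}

# Same strings: weak equivalence (checked only up to length 5 here).
same_strings = {yield_of(t) for t in trees_right} == {yield_of(t) for t in trees_left}
# Same trees: strong equivalence (again only on this finite sample).
same_trees = trees_right == trees_left

print("same strings:", same_strings)
print("same trees:", same_trees)
```

The check only looks at a finite sample, so it illustrates the two notions rather than proving anything about the grammars.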

While I agree with 99% of what they say, here are some minor quibbles:

While they admirably make the point for abstract theoretical concepts, the post comes off as a tiny bit too dismissive of computational complexity results in linguistics for my taste. I do share the sentiment that if your local (hopefully straw person) computational linguist were to say something like “I don’t care what the correct analysis of passive in natural language is, they are all weakly equivalent anyway” or “Two grammatical formalisms are weakly equivalent so why bother showing which one is a more accurate account of language”: this doesn’t cut it.

However, weak generative capacity is more than a concept that we can safely ignore because we have our much stronger razor by Occam. And ignoring it like this makes the theoretical linguist as dismissive as the computational linguist above.

What weak generative capacity does is to provide interesting lower and upper bounds on our grammatical theories. It is absolutely true as said above that this doesn’t help us for our analysis of passive. But I find it an interesting result in its own right that early transformational grammar is too powerful a formalism and could generate anything, and it’s good that we have more clearly defined theories now.

It is also interesting that not all grammar formalisms are equally powerful, e.g. that TAG and CCG are less powerful than Minimalist Grammars (see e.g. Stabler 2009 “Computational models of language universals” for a short overview). If we were to find a construction that unambiguously crosses that boundary, I would view that as a very interesting find.

Again, weak generative capacity only provides upper and lower bounds, while saying hardly anything about what happens within those bounds – but that makes the edge cases all the more interesting, especially since these bounds are independent of the grammatical framework we use to describe them.

It’s also quite intriguing that for phonology, where many more complexity subclasses of the regular languages are known, the frequency of phenomena is inversely correlated with how high they are in the hierarchy (see e.g. this post over at outdex). It would be cool if we had the same for syntax, i.e. a more fine-grained distinction of context-free/sensitive classes in which we can arrange phenomena, a sort of periodic table for syntactic phenomena, independent of the formalism that is used (however, in my amateur understanding, one of the difficulties in finding these subclasses is precisely the fact that in syntax, phenomena are much harder to disentangle from their analyses).

To conclude, apart from these minor issues it’s really an entertaining and illustrative blog post. Go over there and read it!

New (to me) linguistics blogs

Freelance reconstruction is a blog by J. Pystynen about historical linguistics, engaging occasionally with generative grammar, see e.g. the blog posts Analogy Is Not Phonology or “All swans are underlyingly white”.

I don’t know how the blog Wellformedness slipped past my attention. It’s a great blog by computational linguist Kyle Gorman, with posts mostly about phonology (e.g. conspiracies), but also about NLP and the social aspects of linguistics. This is why I continue crying out for linguistic blogs finally adopting blog rolls dammit (shoutout to Omer again for adding a blogroll!).

I’ll add both to my overview page of linguistics blogs.

Congrats JWST!

Today the James Webb Space Telescope has successfully launched and is on its way to its final destination, the Lagrange point L2. It will take a month until it gets there and after that it’s still 5 months until everything is set up and its observations can begin.

You can read up on what JWST can do if everything works at Quanta magazine (as usual!) by the amazing Natalie Wolchover.

As far as I understand, this is how things look: the earliest light after the Big Bang is from the time when the universe became transparent for the first time, at around 379,000 years after the Big Bang. We have observed this light, called the Cosmic Microwave Background. After that, there is no additional source of light (except the 21cm spin line described here), and the universe is just filled with a gas of hydrogen and helium atoms and the leftover radiation from the Big Bang – up until the first stars formed from clouds of gas collapsing under their own gravity, at around 150 million years after the Big Bang.

JWST’s predecessor, the Hubble telescope, could see back to about 500 million years after the Big Bang. JWST, however, could peer into the universe’s past back to about 50 million years after the Big Bang, i.e. earlier than cosmologists expect the first stars to have formed. This means the JWST (absent observations that send theorists completely back to the drawing board, which would be even more amazing) could glimpse the birth of the first structures in the universe!

Among many other amazing things, it is also sensitive enough to potentially detect the chemical composition of the atmospheres of exoplanets, which might give us hints of potential life.

Let’s hope that everything continues to go as smoothly as this morning and a French person will continue to say ‘nominale’ at every step of the journey (the most reassuring word I’ve heard today!) so that we learn more amazing things about our universe! 🙂

Posts and articles

Just to not let this blog die completely, here are some worthwhile blog posts and articles about linguistics I’ve come across recently:

Scientific American has an article about research on bilingualism where they interview Sarah Frances Phillips, a grad student under Liina Pylkkänen.

Bilingualism is for some reason a topic in linguistic research that has made it into the public eye (i.e. worth an article in SciAm that is not about Piraha)!

Language Log has an interview with Cory Stade where they talk about “cognitive fossils”, or rather the question of how strongly the tools humans used are correlated with their cognitive evolution.

Omer has a blog post about modularity in minimalism.

If nothing else, the post should remind you that if you say “It’s interfaces!” about whatever you research, then you should at least specify how exactly those interfaces work, and check whether the information or operations the interface has access to look suspicious.

SM under scrutiny & Résonaances

It turns out that at the LHCb, the hints of violation of lepton flavour universality just got stronger. For years now, measurements of the decays of a certain family of composite particles have shown slight deviations from the Standard Model. What is interesting is that all these deviations point in the same direction. None of these single measurements crosses the magic 5-sigma line, but there is some debate whether one shouldn’t take the combined results into account, which would be very close to the discovery level.
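For a sense of scale, here is a minimal sketch of what a significance in sigma means as a probability. It assumes a Gaussian test statistic and the one-sided convention; both are my assumptions for illustration, not something taken from the LHCb analysis, and the helper name `one_sided_p_value` is made up.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Upper-tail probability of a standard normal beyond n_sigma.

    Assumption: Gaussian statistic, one-sided tail.
    """
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


for n_sigma in (3.0, 5.0):
    p = one_sided_p_value(n_sigma)
    print(f"{n_sigma:.0f} sigma -> p = {p:.3g}")
```

Under these assumptions, 5 sigma corresponds to a chance of roughly 1 in 3.5 million that a fluctuation at least this large shows up if the Standard Model is right.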

In any case, it is one of those measurements where the LHCb achieved more precision with more data, and the deviation remains. As with all potential cracks in the SM, a definitive discovery would be both groundbreaking, because we would finally see errors in the most precise theory in history, and awkward, because we wouldn’t know what is responsible for the error. Since this is high-precision physics that measures whether the fractions of decays deviate from the predicted ones, there is no detection of the actual forces or particles that are responsible for the deviations. The energies required will not be accessible for at least a couple of decades.

Update: Résonaances has a blog post on this measurement, significantly increasing his posting speed! As a huge fan of his blog, I think this is amazing news 😉

The upcoming muon g-2 results

The Muon g-2 experiment is one of the last places where there is a reasonable hope of finding discrepancies with the Standard Model of Particle Physics. This Science Magazine article describes what this is all about and why it is relevant. Jester of Resonaances tweets that at this year’s Rencontres there is a talk “Most recent results from g-2 experiment”. This might just be an update, not necessarily the announcement of a discovery but who knows.

As far as I understand, even in the exciting case that we finally find something that the SM does not predict, this will not really change the current state of particle physics. Since this is a high-precision experiment, we will only know that something is wrong with the SM but not what or even at what energy scale. But even so, it would be the first crack in the SM and therefore a thing to celebrate.

Blog post series on language complexity

David Tanzer on John Baez’ blog Azimuth advertised a new blog, The Signal Beat, where you can find a series on language complexity.
The posts start by viewing a language as a set of all its sentences and then guide you through ideas such as how to decide whether a sentence belongs to that language and how complex that decision procedure is, ending on the P vs NP conjecture.
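As a minimal sketch of that idea (my own toy example, not taken from the series): take the language of all strings consisting of n a's followed by n b's, with n at least 1. A decision procedure is just a function that answers yes or no for any input string, and its complexity is how its running time grows with the length of the input. The function name `in_anbn` is made up for illustration.

```python
def in_anbn(sentence: str) -> bool:
    """Decide membership in the toy language { a^n b^n : n >= 1 }.

    Runs in time linear in the length of the input.
    """
    length = len(sentence)
    if length == 0 or length % 2 != 0:
        return False
    half = length // 2
    # First half must be all a's, second half all b's.
    return sentence[:half] == "a" * half and sentence[half:] == "b" * half


for candidate in ("ab", "aabb", "abab", "aab", ""):
    print(repr(candidate), in_anbn(candidate))
```

This particular language is easy to decide; the interesting questions in the series start when no fast procedure is known.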

David Adger in Nautilus continued

There’s an interview of David Adger in the current issue of Nautilus. It contains everything that the laws of physics dictate a linguistics interview should be about: Universal Grammar, merge, Arrival, Piraha, something something language is such a central part of human life/society blabla.

One (part of an) interview question that I celebrated was the following: Your ideas evoke probably the only controversy in the linguistics world that has spilled over to popular culture—the debate over “universal grammar.”

In general, David Adger is as informative and gentle as ever, using non-inflammatory rhetoric on the topic of UG (which is as always very welcome).

So, cool thing to have actual linguists in science magazines, although the day is still to come when actual in-depth questions about theoretical linguistics are asked.

New measurement of Universe’s expansion

The Atacama Cosmology Telescope just released its measurements of the cosmic microwave background. The inferred expansion rate of the universe agrees with the one inferred from the earlier Planck data.

This is interesting insofar as the rate of the expansion is one of cosmology’s biggest mysteries right now. Measurements from the early universe and measurements from later times, derived from supernova data among other things, do not agree. So this new measurement gives additional credibility to the old Planck data, making experimental error increasingly unlikely.

The Hubble tension, as it’s called, gets tenser and tenser.

