Some of my favorite studies are ones that I’ve done or been involved with, but also similar ones done by other groups that I haven’t – the studies I mentioned before looking at inflectional morphology, these morphemes that go on the ends of words to change things. One particularly well-studied debate within this is the English past tense debate. I guess a lot of people watching this video won’t have English as a first language, and they will have been taught in school that for a regular past tense in English you simply take the verb and put -ed on it, so, you know, “walk-walked” or “talk-talked”. This sounds very simple.
The earlier theories of how English-speaking children learn the past tense were essentially just that: they learn a rule that you put an -ed on the verb, and that’s how it works. What’s interesting is that when you look at it in a very fine-grained way, that rule seems to be an oversimplification. English children don’t always put the -ed on the verb: sometimes they leave it off, and sometimes they put it on verbs where it shouldn’t be. What’s really interesting is that, looked at closely, the likelihood of supplying the past tense marker or not depends on the particular verb, in a way that has to do with the sound of the verb.
It all looks very simple when it’s written down (we write -ed), but consider how it’s actually pronounced – and remember, children learn this long before they learn to read or write, so the fact that it’s all written -ed is no use to them. For a lot of verbs in English, the regular past tense form actually ends in a [t] sound. Think of “missed” or “wished” – as far as a child is concerned, it’s not -ed, it’s [t]. What the data suggest is that children are making a phonological generalization based not on -ed, but on the sound of the word. They’re learning that verbs ending in a fricative, like “miss”, “wish”, “dish” or whatever it might be, take [t] in the past tense form.
What’s interesting is what happens in studies where children are given novel verbs like “to frisk” or “to gliss”. You say: “You know, he’s a man who likes to gliss, he does this every day. Yesterday he…” – and the child will say “glisst”. The likelihood of children doing that depends on how phonologically similar the novel word is to the words they’ve already got stored in memory. These results suggest that it doesn’t look like they’ve got a regular rule they can just apply to any verb; it looks like they’re producing past tense forms by simply storing all the forms that they know. When they want to produce a new form, they’re just making a phonological analogy, a sound-based analogy, across the forms they’ve already got stored.
There’s no debate that children have a tendency to go beyond what they have heard and to say new things. The past tense is a great example of that. My own daughter is right in the middle of this overgeneralization period at the moment: pretty much every time, it’s “we sitted down”, “we goed there”, “we eated that”. She hasn’t really learned any irregular forms yet. So, children definitely have a tendency to go beyond what they’ve heard. The question is how we best explain that.
There are actually three different approaches, I’d say. The first is the generativist-style approach, where there is a formal rule: you take the verb and you stick an -ed on it. The second is the kind of mainstream constructivist approach, where you store all the different forms and then abstract across them to make some kind of generalization – there is still some kind of stored abstraction. Then you have what I’m actually arguing for in my current writing, a more radical constructivist approach called an exemplar approach, where you don’t store any generalizations at all – all you store is the individual forms. So when you generate these novel forms, you’re generating them on the fly, as you go. You don’t have any stored rules or generalizations; all you have is the stored forms. There are generalizations, yes, but they happen on the fly, as you’re producing language, as you’re retrieving these forms, rather than being stored generalizations.
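To make that concrete, here is a minimal sketch of the exemplar idea in Python. It is purely illustrative, not the speaker’s model or anyone’s published one: the toy lexicon, the endings, and the crude similarity measure are all assumptions made up for the example. Note that there is no stored rule anywhere; the generalization happens at retrieval time.

```python
# A toy exemplar model: no stored rule, just stored forms plus
# on-the-fly phonological analogy. The lexicon, endings and the
# similarity measure are invented purely for illustration.

def similarity(a: str, b: str) -> int:
    """Crude phonological similarity: length of the shared final substring."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

# Stored exemplars: (verb, past tense ending as a child would hear it).
lexicon = [
    ("miss", "t"), ("wish", "t"), ("walk", "t"), ("kiss", "t"),
    ("play", "d"), ("hum", "d"), ("call", "d"),
    ("want", "ed"), ("need", "ed"),
]

def past_tense(novel_verb: str) -> str:
    # Generalize on the fly: copy the ending of the most similar stored verb.
    _, ending = max(lexicon, key=lambda pair: similarity(novel_verb, pair[0]))
    return novel_verb + ending

print(past_tense("gliss"))  # -> "glisst", by analogy with "miss" and "kiss"
```

On “gliss” this returns “glisst” by analogy with the stored fricative-final verbs, which mirrors the elicitation results described above.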
Until recently it was thought that there was a critical period for language acquisition, finishing around maybe five or six: if you’re not exposed to a language before that age, then you’ll never be able to learn it to a native-like level. But these were generally very small-scale studies, with maybe 50 or 100 people. What’s really interesting is the work of a guy called Josh Hartshorne, who’s based at Boston College in the US. He recently ran a huge crowdsourced study across the Internet, and I think this is something we’re going to see more and more of, obviously not only in child language acquisition but in science in general – or not even just in science, but in everything. We are seeing the rise of big data, of web-generated datasets.
What Josh did was very clever. He got people to do an online quiz: a neural network would take their answers to an English grammar quiz and try to predict their native language, how long they’d been learning English, and so on. But this was just the hook to get people in, of course. What the quiz was really doing was asking them their native language and how long they’d been learning English, and feeding all of this into a big statistical model that tested this critical-period idea not with a hundred people or a thousand people but with, I think, several million people.
He managed to complete this study, and what he found was very interesting. There does seem to be a critical period, but it doesn’t finish at five or six; it seems to finish around, I think, 16 to 18, or even later – certainly in early adulthood rather than in childhood. Obviously, it makes a difference how you learn: you’re not going to become a native speaker just by sitting in a classroom for an hour a week and doing a bit of homework; you need something like the experience that children get. But it’s very encouraging for everyone around the world who wants to learn English or another second language to know that these critical periods seem to extend much later than we had previously thought.
In terms of the relationship between language and other areas of child development, what’s difficult is that the generativist-versus-constructivist debate I mentioned in child language doesn’t really have parallels in other areas. Obviously I’m completely outside those fields, but take maths, for example. Pretty much everyone agrees that kids just have to learn maths from being taught it, plus whatever they can figure out for themselves. There isn’t really the idea that children are born with a kind of universal innate knowledge of mathematics that they have to map onto what they’re taught. So it’s difficult to draw parallels between language development and other areas, though language certainly overlaps a lot with things like concept development.
The example I mentioned earlier – whether for language we store the individual exemplars or whether we form abstractions – is a debate that has already happened in cognitive psychology, outside of linguistics. Set aside the word “cat” and just think of the concept “cat”. There are three different ways of talking about this. The old view was that we had rules: it must have four legs, it must have a tail, and so on. But this doesn’t work, because you could have a plastic model cat or a picture of a cat, to which none of these things apply.
The next idea was that we have a kind of prototype, an idealized cat. We store all our cats in our head, make an analogy across them, form some kind of perfect idealized cat, and judge other cats with reference to this prototype. The third idea is the exemplar idea: all we actually do is store every encounter we have with a cat or a picture of a cat. We just store all of those in our head – there isn’t any separate abstraction process – and we retrieve the relevant examples of a cat for whatever our task is at that moment. No debate in science is ever settled conclusively, especially not in psychology, where we can’t look inside people’s heads, but I would say the exemplar view of category formation is probably the dominant one now. I think linguistics needs to catch up with the rest of cognitive psychology here.
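As a toy illustration of the difference between the last two views, here is a hedged Python sketch; the two categories, their feature pairs and the distance measure are invented for the example, not taken from any real study. The prototype view keeps one average per category; the exemplar view keeps every encounter and compares against them all.

```python
# Prototype vs. exemplar categorization on invented two-feature data
# (body size, tail length). All numbers are made up for illustration.
import math

# Stored encounters, one feature pair per encounter.
cats = [(30, 25), (35, 30), (28, 27)]
dogs = [(60, 20), (70, 25), (55, 18)]

def prototype_view(x):
    # One stored abstraction per category: the average of its encounters.
    mean = lambda pts: tuple(sum(v) / len(pts) for v in zip(*pts))
    return "cat" if math.dist(x, mean(cats)) < math.dist(x, mean(dogs)) else "dog"

def exemplar_view(x):
    # No stored abstraction: compare against every individual encounter.
    nearest = min(cats + dogs, key=lambda e: math.dist(x, e))
    return "cat" if nearest in cats else "dog"

print(prototype_view((32, 26)), exemplar_view((32, 26)))  # -> cat cat
```

On a typical instance the two views agree; they come apart on atypical members that sit far from the category average but close to some individual stored encounter, which is where exemplar models are usually argued to do better.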
Computational models of child language acquisition are very important in our field. In the old days, you just had your theory – people talk about 1970s box-and-arrow theories. You have a box, the word comes in, and then three arrows go out, you know, to long-term memory, short-term memory, the grammar, and no one really knows what it means. But if you have a computational model, then you’re forced to make your assumptions absolutely explicit.
Some work that I’ve been involved with – I haven’t done the computer side; that was done by a colleague called Felix Engelmann, who works with us at Manchester and Liverpool universities – concerns what I mentioned before about which marker goes on the end of a verb, like “played” and “plays” in English. But it’s actually much more complicated than that in most languages – certainly in Russian, for example. So we did some studies of Polish and also Finnish, and Polish is quite similar to Russian in a lot of these respects. The problem that kids face here is much more complicated than the one facing English-speaking kids, but we still think they solve it in the same way: by storing all the examples and making a phonological analogy across them to generate new forms.
What Felix did was build a computational model that does something like this. It’s given the stem of a word in Polish or Finnish, and it has to produce the fully inflected form of the verb. It does this basically by doing what we’ve been describing: analogizing across the stored forms that it has. A computational model is hugely valuable here, because if I just say we do it by analogizing across the stored forms, people’s natural reaction is: “What the hell does that mean?” Whereas if we can say: “Okay, it works something like this computational model, which stores the forms and can use them to produce the output form”, then what we’re saying at least hopefully starts to look vaguely plausible.
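As a rough illustration of the task, and emphatically not Felix’s actual implementation, here is a toy sketch in Python: given a stem and a paradigm slot, it copies the ending from the phonologically most similar stored stem. The stems, slots and endings are invented stand-ins, not real Polish or Finnish.

```python
# Toy stem -> inflected-form model: analogy over stored whole forms.
# Stems, slots and endings below are invented, not real Polish/Finnish.

def shared_ending(a: str, b: str) -> int:
    """Phonological similarity: length of the shared final substring."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

# Stored exemplars: stem -> {paradigm slot: fully inflected form}.
stored = {
    "grat": {"1sg": "gratam", "2sg": "gratasz"},
    "prow": {"1sg": "prowem", "2sg": "prowesz"},
    "lubi": {"1sg": "lubie",  "2sg": "lubisz"},
}

def inflect(stem: str, slot: str) -> str:
    # The analogy step: find the closest stored stem and reuse the way
    # its inflected form extends that stem.
    model = max(stored, key=lambda s: shared_ending(stem, s))
    ending = stored[model][slot][len(model):]
    return stem + ending

print(inflect("brat", "2sg"))  # -> "bratasz", by analogy with "grat"
```

The real model, of course, faces far more paradigm slots and a graded notion of phonological similarity, but the analogy step it relies on has this general shape.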
A question that comes up a lot, particularly for second language learning, is whether some languages are easier or more difficult than others. I guess there are two ways to answer it. The first is to say that it depends where you’re starting from. If your first language is English, then learning Hebrew is going to be horrendously difficult. But if your first language is Arabic, then learning Hebrew is going to be easier than learning English. Learning a language from within your own language family, or one that’s historically related to your own, is obviously going to be a lot easier.
I guess what people really want to know when they ask this question is: if we could somehow have an alien, would it be simpler for it to learn one language than another? Well again, I guess, it depends on where the alien was coming from. What happens is that languages tend to sacrifice complexity in one part of the grammar and bring it in somewhere else. I think it’s probably easier to get to a basic level in English than in many other languages, just because, as I mentioned when comparing English to Polish or Russian, we don’t have all these different forms of a word. We have “plays-played-playing” and that’s it. We don’t have “играю-играешь-играете” (“I play”, “you play” singular, “you play” plural) and all the different forms that other languages have.
I think, to get to a level where you can just say the words you want to say, where you can retrieve the vocabulary and spit it out in some way that makes some kind of sense, English is probably a little bit easier than what we call highly inflected languages. But to get to the point of convincing someone that you’re a native speaker, I wouldn’t want to say that any language is easier or more difficult than any other, because the nuances that English can’t express with morphology it has to express in other ways, with vocabulary or a more rigid word order, and so on. These things tend to balance themselves out.