1 00:00:00,000 --> 00:00:16,070 *rc3 hacc preroll music* 2 00:00:16,070 --> 00:00:20,080 Herald: All right, we are back again from the HACC or about:future studios on the 3 00:00:20,080 --> 00:00:30,156 rc3 remote Congress 3 this year sadly remote so we'll do as we can. We have a 4 00:00:30,156 --> 00:00:37,366 really nice talk coming up here about type theory in linguistics and about meaning in 5 00:00:37,366 --> 00:00:44,226 linguistics. From daherb. He will be talking about the uses of type theory in 6 00:00:44,226 --> 00:00:52,175 both programing languages and natural languages and maybe draw, like, 7 00:00:52,175 --> 00:01:01,963 similarities between them. He has a PhD in computer science, was in Sweden for 6 years 8 00:01:01,963 --> 00:01:07,234 at a university and is now back in Germany. There is sadly no translation for 9 00:01:07,234 --> 00:01:11,967 this talk because there aren't enough angels to do all the translation work. 10 00:01:11,967 --> 00:01:17,240 There is a pixelflut behind me. It's my own implementation. So if you want to spam 11 00:01:17,240 --> 00:01:23,144 me, you can. I'm not sure if it will work, but apart from that, I think that's enough 12 00:01:23,144 --> 00:01:26,894 of introduction. Let's go to daherb. 13 00:01:26,894 --> 00:01:33,280 daherb: Thanks, and a little bit about me: I would call myself a computational 14 00:01:33,280 --> 00:01:41,200 linguist, even though my Ph.D. officially just says computer science. And I see 15 00:01:41,200 --> 00:01:46,880 myself in the tradition of this slightly outdated xkcd where people try to use 16 00:01:46,880 --> 00:01:56,080 computers to test linguistic theories. In addition to that, I spent my time in 17 00:01:56,080 --> 00:02:02,160 Sweden in a very well known functional programming group and basically next to a 18 00:02:02,160 --> 00:02:08,160 research group dedicated to type theory. So my interests are both into functional 19 00:02:08,160 --> 00:02:14,320 programming, and I got a little bit spoiled also by type theory. And that inspired me 20 00:02:14,320 --> 00:02:22,080 to give this talk. And to start: The question is what is meaning in natural 21 00:02:22,080 --> 00:02:29,360 language? So I came up with a few examples, example sentences. And the 22 00:02:29,360 --> 00:02:38,550 question is: what is the meaning of these sentences? But before we dive into that, a 23 00:02:38,550 --> 00:02:44,479 little bit about linguistics. So there are several subdisciplines in linguistics: one 24 00:02:44,479 --> 00:02:51,670 is syntax, which is mostly people drawing trees. And people are really good at 25 00:02:51,670 --> 00:02:59,549 drawing trees these days. On the slide, you can see various formalisms, how people 26 00:02:59,549 --> 00:03:05,846 draw trees, and that's pretty well established these days. But that's 27 00:03:05,846 --> 00:03:14,275 basically: what are valid sentences in a language? The next question is: what is 28 00:03:14,275 --> 00:03:22,964 the meaning of these sentences? And within linguistics, for a long time, semantics 29 00:03:22,964 --> 00:03:27,645 was something you would not really try to approach because it's really, really 30 00:03:27,645 --> 00:03:37,760 difficult. But actually, semantics is the literal meaning, and in addition to 31 00:03:37,760 --> 00:03:44,480 semantics, we might have the meaning in context. So this is a very great example 32 00:03:44,480 --> 00:03:51,680 of so-called pragmatics. Question: your greatest weakness? Answer of the person: 33 00:03:51,680 --> 00:03:56,000 interpreting semantics of a question, but ignoring the pragmatics. Could you give an 34 00:03:56,000 --> 00:04:04,960 example? Yes. So in this example, the person only looked at the literal meaning. 35 00:04:04,960 --> 00:04:12,960 Can you give an example? Yes, I can give an example. Not interpreted as the request 36 00:04:12,960 --> 00:04:19,280 to actually give an example. So we have the literal meaning and decide that we 37 00:04:19,280 --> 00:04:28,643 even have an additional meaning in context. So if we look at statements in 38 00:04:28,643 --> 00:04:34,043 natural language, we might have challenges in interpreting: what do they mean? 39 00:04:34,043 --> 00:04:37,889 All men are mortal. Every man is mortal. 40 00:04:37,889 --> 00:04:43,154 That we can interpret pretty literal. Yesterday it was raining. 41 00:04:43,154 --> 00:04:50,709 Tomorrow it will be raining. The first one could be a literal fact. The one about the 42 00:04:50,709 --> 00:04:57,761 future. Who knows? Then there are statements about things that could 43 00:04:57,761 --> 00:05:03,732 potentially happen. It could be raining. Then there are statements which in our 44 00:05:03,732 --> 00:05:09,780 world could never actually happen, like John is searching for a unicorn. It's 45 00:05:09,780 --> 00:05:16,020 pretty much a matter of fact that there are no real life unicorns. So if John is 46 00:05:16,020 --> 00:05:20,566 searching for a unicorn, we don't really claim that there might be unicorns in our 47 00:05:20,566 --> 00:05:30,249 world. And then we have questions like, could you pass me the salt where we don't 48 00:05:30,249 --> 00:05:34,773 really have the literal meaning or we don't care about the literal meaning, we 49 00:05:34,773 --> 00:05:41,703 don't care if the answer is yes or no. What we actually – what the person 50 00:05:41,703 --> 00:05:47,154 making the statement actually means is, please hand me the salt. Please pass me 51 00:05:47,154 --> 00:05:54,193 the salt. And in addition, there is kind of an assumption that there is actually 52 00:05:54,193 --> 00:05:59,020 salt on the table. But we don't really assume that if John is searching for a 53 00:05:59,020 --> 00:06:03,101 unicorn, that there is a unicorn, if someone asks the question, can you pass me 54 00:06:03,101 --> 00:06:13,000 the salt? there is actually salt on the table. So meaning is a very philosophical 55 00:06:13,000 --> 00:06:24,328 problem. And why do people care about it? There are relevant questions, for example, 56 00:06:24,328 --> 00:06:29,674 when are two statements equivalent, when do they mean the same? "All men are 57 00:06:29,674 --> 00:06:35,964 mortal" Does it mean the same as every man is mortal? Or the more challenging "I saw 58 00:06:35,964 --> 00:06:42,630 the morning star", "I saw the evening star" "I saw Venus". As a matter of fact, 59 00:06:42,630 --> 00:06:48,553 these three are meaning exactly the same, but if someone says, I saw the morning 60 00:06:48,553 --> 00:06:54,710 star, do we know that they actually also know that they saw Venus, that the morning 61 00:06:54,710 --> 00:07:03,173 star is the same as Venus? And another relevant question is, when does one 62 00:07:03,173 --> 00:07:09,206 statement follow from other statements? And especially in computational 63 00:07:09,206 --> 00:07:19,865 linguistics, we want to find automatic ways to decide these questions. There are 64 00:07:19,865 --> 00:07:28,350 a few more examples. So there is a very well-known example: all men are mortal, 65 00:07:28,350 --> 00:07:33,520 Socrates is a man. So we should come to the conclusion that Socrates is also 66 00:07:33,520 --> 00:07:39,600 mortal. Or more challenging: every European, has the right to live in Europe. 67 00:07:39,600 --> 00:07:43,360 Every European is a person. Every person who has the right to live in Europe can 68 00:07:43,360 --> 00:07:48,240 travel freely in Europe. Can we answer the question, can every European travel 69 00:07:48,240 --> 00:07:56,640 freely within Europe? and hopefully find some way to answer this question with yes. 70 00:07:56,640 --> 00:08:03,040 Humans are – People are usually pretty good at deciding that with a little bit of 71 00:08:03,040 --> 00:08:11,520 logic and a lot of intuition. But computers are really bad about intuition. 72 00:08:11,520 --> 00:08:22,400 So we are interested in models or ways that actually also work for computers. And 73 00:08:22,400 --> 00:08:29,520 these days there are basically two big branches: the one the formal approaches. 74 00:08:29,520 --> 00:08:33,360 And the other are the statistical approaches. The formal approaches use 75 00:08:33,360 --> 00:08:39,920 logic to try to answer the questions, while the statistical approaches, just 76 00:08:39,920 --> 00:08:49,200 look at plenty of data, I shouldn't say "just", they use a lot of data to answer 77 00:08:49,200 --> 00:08:57,440 the questions we are interested in. In this talk, I will only focus on the 78 00:08:57,440 --> 00:09:04,720 formal approaches because I like them more. But both approaches, both kinds of 79 00:09:04,720 --> 00:09:19,124 approaches have the right to exist and can be also very helpful in their own way. So, 80 00:09:19,124 --> 00:09:25,698 back to the kind of the title of the talk now slightly paraphrased: type theory and 81 00:09:25,698 --> 00:09:36,437 semantics. And there are basically two big approaches to type theory. OK, what is 82 00:09:36,437 --> 00:09:45,720 type theory? In computer science type theory is the study of what systems of 83 00:09:45,720 --> 00:09:53,257 types exist and how, what properties do they have? We will shortly see what 84 00:09:53,257 --> 00:10:05,520 that could mean. So that … If we go back in history, then there are simply typed 85 00:10:05,520 --> 00:10:13,760 languages. They go back to Alonzo Church. I think in 1940 he described the 86 00:10:13,760 --> 00:10:20,800 simply typed lambda calculus. And then a few decades later, a guy called Richard 87 00:10:20,800 --> 00:10:30,160 Montague used this approach or adopted this approach, which was purely from logic 88 00:10:30,160 --> 00:10:44,400 to natural languages. And Richard Montague was a very particular person. He was … he 89 00:10:44,400 --> 00:10:53,760 had a great influence, but also he was not an easy person, he openly attacked 90 00:10:53,760 --> 00:11:01,920 colleagues and … but his influence on linguistics started in the 70s. There are 91 00:11:01,920 --> 00:11:15,819 still people trying to work out details about his theory. Now we come to the 92 00:11:15,819 --> 00:11:24,390 one answer: what is meaning in linguistics? And that's so-called truth 93 00:11:24,390 --> 00:11:34,011 conditional or model theoretic semantics. So, what is the meaning of the sentences 94 00:11:34,011 --> 00:11:45,184 like "all men are mortal". And the meaning of this sentence is: can we come up with a 95 00:11:45,184 --> 00:11:52,837 model which is kind of a picture of a world that makes this sentence true? And 96 00:11:52,837 --> 00:12:01,942 here in the picture, we have an example of that: we have the set of all mortal 97 00:12:01,942 --> 00:12:09,961 things. And we have the set of all men in our world, and they are a subset, meaning 98 00:12:09,961 --> 00:12:16,689 everything that's a man is also a mortal thing. And then we have the second 99 00:12:16,689 --> 00:12:23,797 sentence, which says "Socrates is a man", which means we know that Socrates is one 100 00:12:23,797 --> 00:12:31,767 of these objects in the set of all men. And then is the question "is Socrates 101 00:12:31,767 --> 00:12:39,907 mortal?", and because we know that all men are in the set of mortal things, we can 102 00:12:39,907 --> 00:12:47,465 answer this question quite easily with yes. So one approach to meaning of natural 103 00:12:47,465 --> 00:12:55,675 language is: can we find a world in which all the properties are in a way that all 104 00:12:55,675 --> 00:13:01,343 the sentences we are given are true? And because we look at models, it's called 105 00:13:01,343 --> 00:13:18,000 model theoretic semantics. And the big influence of Montague was … As I mentioned 106 00:13:18,000 --> 00:13:22,240 before, there was already the theory by Alonzo Church of the simply typed lambda 107 00:13:22,240 --> 00:13:29,040 calculus, which was an approach in logic, and then there came around Montague and 108 00:13:29,040 --> 00:13:33,120 made the statement, this very bold statement: "There is, in my opinion, no 109 00:13:33,120 --> 00:13:37,120 important theoretical difference between natural languages and the artificial 110 00:13:37,120 --> 00:13:42,480 languages of logicians. Indeed, I consider it possible to comprehend the syntax and 111 00:13:42,480 --> 00:13:48,160 semantics of both kinds of languages that think a single natural and mathematically 112 00:13:48,160 --> 00:13:57,200 precise theory". And that's a really bold statement, because we know that usually if 113 00:13:57,200 --> 00:14:03,760 we, for example, look at programming languages, they are very small and very 114 00:14:03,760 --> 00:14:09,391 well-defined compared to what we can express in natural languages. But yeah, he 115 00:14:09,391 --> 00:14:18,645 made the statement and then he started to demonstrate that this is actually 116 00:14:18,645 --> 00:14:32,560 possible. Then, let's look a bit at times, and I called this a simply typed language, 117 00:14:34,320 --> 00:14:45,680 and we can give … We have infinitely many types, and every object in our 118 00:14:45,680 --> 00:14:51,920 logic language has a type. And the type is either e for an 119 00:14:51,920 --> 00:15:01,040 entity, t for truth value, or if we have some type alpha, and some other type beta, 120 00:15:01,760 --> 00:15:09,760 then we have this pair , which is a type. And then logicians like to be 121 00:15:09,760 --> 00:15:16,165 very precise and they say "nothing else is the type in this language". And. That's, 122 00:15:16,165 --> 00:15:30,560 again, rather abstract. Let's look at some examples. We have certain classes of words 123 00:15:30,560 --> 00:15:39,120 in our language. For example, we have names. And they … our examples are, for 124 00:15:39,120 --> 00:15:47,120 example, John and Mary, and we have the semantic type e, which is a basic type, 125 00:15:49,040 --> 00:15:55,120 and then we have intransitive verbs, which means verbs that don't take a direct 126 00:15:55,120 --> 00:16:02,720 object like "sleeps" or transitive groups that have a direct object like "love". And 127 00:16:02,720 --> 00:16:10,000 then we have sentences and they have the semantic type t, which is as a computer 128 00:16:10,000 --> 00:16:14,400 scientist, I would call it boolean. It's a truth value. It's either true or false. 129 00:16:16,960 --> 00:16:24,110 And intransitive verbs have this type of a pair ("e to t"), which, for example, 130 00:16:24,110 --> 00:16:34,925 could interpret as a function from entities to truth values. And then I have a 131 00:16:34,925 --> 00:16:45,229 few examples down here. So what's in this double square brackets? That's our natural 132 00:16:45,229 --> 00:16:52,665 language. And what's not in italics, not in these double square brackets, that's our 133 00:16:52,665 --> 00:17:03,200 logic language. So for simple things like the name John, we just claim that there's 134 00:17:03,200 --> 00:17:11,680 some object representing John, we call it just "j". The same for Mary and also "sleeps" 135 00:17:11,680 --> 00:17:19,981 and "love", we translate to just some object in our logic language. But what are 136 00:17:19,981 --> 00:17:33,360 these exactly? There are two ways to look at them: sleep' ("sleep prime"), for example, 137 00:17:33,360 --> 00:17:41,920 is either the set of all objects that are sleeping in our world. For example, the 138 00:17:41,920 --> 00:17:49,040 set only consisting of only having the element John or j representing John. So 139 00:17:49,040 --> 00:17:55,120 John is the only sleeping entity in our world, or we can represent it as a 140 00:17:55,120 --> 00:18:04,240 function from entities to truth values and the function returns true, if x is a 141 00:18:04,240 --> 00:18:10,880 sleeping element in our world and false otherwise, and if we define it the 142 00:18:10,880 --> 00:18:16,880 following way, that j is mapped to true and m is mapped to false, then we get 143 00:18:16,880 --> 00:18:25,360 exactly the same representation as with the set before. And that theory, that's 144 00:18:25,360 --> 00:18:30,960 usually called a characteristic function. So either we can treat it as a set or as a 145 00:18:30,960 --> 00:18:38,640 function. In this case, it doesn't really matter, but the next question is: "what is 146 00:18:38,640 --> 00:18:51,840 love?" Or love', to be precise. And while "sleep" is a property of an entity – 147 00:18:51,840 --> 00:18:57,520 for each entity, we can decide if it's sleeping or not. Love is not a property of 148 00:18:57,520 --> 00:19:02,096 one entity, but it's the relation between two entities. So it's either a set of 149 00:19:02,096 --> 00:19:17,120 pairs. Or it is a function which takes two parameters, and as soon as we apply to one 150 00:19:17,120 --> 00:19:22,320 of the parameters, we end up with the second function. So the function with two 151 00:19:22,320 --> 00:19:30,400 parameters represent the relation between two entities or two individuals. But if we 152 00:19:30,400 --> 00:19:37,997 apply to one of the parameters, we get a function, which is the property, for 153 00:19:37,997 --> 00:19:59,040 example, to be loved by someone. Just to keep in mind, for simple predicates, sets 154 00:19:59,040 --> 00:20:07,280 and functions are equivalent, and for these relations, it's either pairs or it's 155 00:20:07,280 --> 00:20:12,240 a function. And if we apply it to one of the arguments, we get another function. 156 00:20:16,080 --> 00:20:23,440 One of the driving forces behind the theory of Montague is compositionality. 157 00:20:24,960 --> 00:20:31,600 And that's usually attributed to Frege, even though he never really expressed it 158 00:20:31,600 --> 00:20:38,240 that way. But one of the most well known interpretations is "the meaning of a 159 00:20:38,240 --> 00:20:41,840 compound expression is a function of the meaning of its parts and the way they are 160 00:20:41,840 --> 00:20:49,360 syntactically combined". So if we go back here, we kind of have our simple and 161 00:20:49,360 --> 00:20:55,520 simplest parts. We have our names and we have our verbs, and we have rather simple 162 00:20:57,040 --> 00:21:03,280 interpretations what they mean. And now the question is how can we combine them to 163 00:21:03,280 --> 00:21:10,400 form … for example, the meaning of more complex sentences, or in general of 164 00:21:10,400 --> 00:21:18,271 sentences. And for that, we need a little bit of syntax. Syntax tells us how we can 165 00:21:18,271 --> 00:21:26,961 combine these elements to form sentences. So the rule syntax 1, for example, says if 166 00:21:26,961 --> 00:21:39,173 we have a name and an intransitive verb, we can combine them to a sentence. And the 167 00:21:39,173 --> 00:21:46,333 second rule, syntax 2, says if you have a transitive verb and a name, we can combine 168 00:21:46,333 --> 00:21:54,720 it to an intransitive verb. Now we want to look at the meaning of sentences like 169 00:21:54,720 --> 00:22:09,200 "John sleeps" and the meaning of "John sleeps" is that if we interpret the 170 00:22:09,200 --> 00:22:15,440 meaning of "sleeps" and that's the predicate which can be applied to an 171 00:22:15,440 --> 00:22:22,880 entity or an individual to give a truth value. And John is of type e, so we can apply 172 00:22:22,880 --> 00:22:30,648 sleeps to John and we get, as a result, sleep' applied to John. Or the 173 00:22:30,648 --> 00:22:40,937 slightly more complex example of "John loves Mary". We can first apply the same 174 00:22:40,937 --> 00:22:53,480 semantic rule that we can apply the rest of the sentence to the subject of the 175 00:22:53,480 --> 00:23:06,080 sentence. And then we have to interpret "loves Mary", which is this function that 176 00:23:06,080 --> 00:23:13,680 takes two parameters and then is first applied to the object and then to the 177 00:23:13,680 --> 00:23:23,200 subject. So the meaning of "loves Mary" is kind of the predicate "to be in in a state 178 00:23:23,200 --> 00:23:30,240 of loving Mary". And then we apply to John and we get the meaning: Love, Mary, John. 179 00:23:34,800 --> 00:23:44,240 And now the question is: how do we actually come up with these semantic 180 00:23:44,240 --> 00:23:52,480 rules? So how do we know that we have to apply "loves Mary" to "John". And that's 181 00:23:52,480 --> 00:24:03,200 where the types give us guidance. So we know that "loves Mary" is of the Category 182 00:24:03,760 --> 00:24:11,840 IV, intransitive verb, which has the type , and we know that John is of the 183 00:24:11,840 --> 00:24:18,160 category "m", which is of semantic type e, and if we see as a function which 184 00:24:18,160 --> 00:24:25,360 takes an "e" as a parameter to produce a "t" as a result, then we know that we can 185 00:24:25,360 --> 00:24:39,760 apply "loves Mary" to John. Then I already before mentioned the word lambda calculus 186 00:24:39,760 --> 00:24:46,400 or lambda expression: Probably not everyone knows what a lambda 187 00:24:46,400 --> 00:24:55,040 expression is. It's kind of the solution to a very … almost nit picky problem. So 188 00:24:55,040 --> 00:25:00,400 if we have some formula, some expression like "x^2 + 2 x", what does it 189 00:25:00,400 --> 00:25:06,720 exactly mean? And then we could come up with just putting numbers instead of x. 190 00:25:09,040 --> 00:25:17,360 And if we put 1, we get the result 3 and if we put 2 the result is 8 and if we put 191 00:25:17,360 --> 00:25:23,840 x=3, we get 15. But what we usually mean is: it's some kind of function 192 00:25:23,840 --> 00:25:32,400 which takes a number and then maps this number to the number's square plus two x. 193 00:25:34,000 --> 00:25:38,080 That's what we mean. But it's not really written down precisely. And that's where 194 00:25:38,960 --> 00:25:48,400 logicians came up with a new operator called lambda. Which can introduce a 195 00:25:48,400 --> 00:25:54,800 variable and turn up expression, which is not completely clear, like the one we had 196 00:25:54,800 --> 00:26:02,960 before into a function. So if you write "lambda x, x^2 + 2 x", we know 197 00:26:02,960 --> 00:26:10,357 that it is a function which has a parameter which we call x, and we use that 198 00:26:10,357 --> 00:26:19,504 in our expression "x^2 + 2 x". And then we can apply this, for example, 199 00:26:19,504 --> 00:26:28,958 to 1, which means we have to replace all "x"s in our expression by 1, and then we 200 00:26:28,958 --> 00:26:37,440 get exactly the same results as we had in our informal expression. But now it's on a 201 00:26:37,440 --> 00:26:44,985 proper foundation. Why do we need that? Because we actually have, kind of higher 202 00:26:44,985 --> 00:27:02,280 order types, so we know that the E to T is kind of a function from individuals to 203 00:27:02,280 --> 00:27:10,750 truth values, that that's the same for common noun like man, bird and so on. But 204 00:27:10,750 --> 00:27:21,868 what is the type of what we call determiners or quantifiers: some, every, 205 00:27:21,868 --> 00:27:30,683 the. So the meaning of man is just man' or man'(x), the same 206 00:27:30,683 --> 00:27:41,227 as we have had with sleep before. And the meaning of "every" now we 207 00:27:41,227 --> 00:27:57,600 use this lambda. Because the meaning of "everything" is for all things "x" that are 208 00:27:57,600 --> 00:28:07,760 individuals, if they meet some predicate P, then they also meet some predicate Q. 209 00:28:12,080 --> 00:28:19,225 That's, again, very abstract, so let's have a look at an example. So the meaning 210 00:28:19,225 --> 00:28:31,040 of "every man" is: We have a new semantics rule meeting all the types, so the type of man 211 00:28:31,040 --> 00:28:41,947 is and the semantic type of "every" is this a slightly more complex type, but 212 00:28:41,947 --> 00:28:46,720 is the function from to something else? So we know we can apply that terminal 213 00:28:47,360 --> 00:29:01,120 to our common noun. Then we do our lamda magic and we end up with the term lamda Q 214 00:29:01,840 --> 00:29:12,720 for all x, man'(x) implies Q(x), meaning for all things: If they are men, then they do 215 00:29:12,720 --> 00:29:21,328 something or they have a certain property. And if we now look at "every man sleeps". 216 00:29:21,328 --> 00:29:29,925 We apply "every man" to "sleeps", we just had this term on the previous slide and the 217 00:29:29,925 --> 00:29:44,341 types for Q and for sleeps match, so we can just put that in there. And the 218 00:29:44,341 --> 00:29:53,296 meaning of "every man sleeps" is for all individuals: If they are men, then they 219 00:29:53,296 --> 00:30:07,600 sleep. And that way we get a theory how we can translate expressions of natural 220 00:30:07,600 --> 00:30:14,800 language into expressions, into logic expression, for which we can try to find 221 00:30:14,800 --> 00:30:22,080 models, so we have this truth conditional semantics, and it is compositional in the 222 00:30:22,080 --> 00:30:28,640 way that the meaning of an complex expression depends on its components. The 223 00:30:28,640 --> 00:30:32,800 meaning of "every man sleeps" depends on the meaning of "every man" and of "sleeps" 224 00:30:34,400 --> 00:30:42,800 and on ways how to combine them. And these ways how to combine them are determined by 225 00:30:42,800 --> 00:30:53,120 the types we use. So we have some very nice method of computing meaning in a 226 00:30:53,120 --> 00:30:58,800 compositional way into a logic, so that looks already pretty much like what 227 00:30:58,800 --> 00:31:11,440 Montague claims, but it has certain drawbacks. And for that reason, and also 228 00:31:11,440 --> 00:31:18,080 because in logic and computers science the research progressed, people came up with 229 00:31:18,080 --> 00:31:26,160 the so-called MTTs "Modern type theories" or the original is called "Martin-Löf type 230 00:31:26,160 --> 00:31:32,560 theory". So this nice bearded guy on the left is kind of the person who invented 231 00:31:32,560 --> 00:31:40,640 everything and the nice person on the right who likes to climb mountains, he 232 00:31:40,640 --> 00:31:47,280 used to be … who is a professor in the group where I was doing my Ph.D. used 233 00:31:47,280 --> 00:31:54,720 these type theories, these modern type theories to express the meaning of natural 234 00:31:54,720 --> 00:32:03,360 language in a slightly different way. And I'll give a quick glance at what changes 235 00:32:03,360 --> 00:32:11,520 they have. But first, one of the big advantages of this – or one of the biggest 236 00:32:11,520 --> 00:32:16,480 implications of these modern type theories is usually in computer science, in proof 237 00:32:16,480 --> 00:32:25,200 systems. So one of the most or very well- known proof assistant is the one with the 238 00:32:25,200 --> 00:32:32,960 logo on the right called Coq. Another dependently type programming language is 239 00:32:32,960 --> 00:32:39,440 the language Agda, which is … both of them are used quite extensively in formalizing, 240 00:32:41,200 --> 00:32:47,760 and verifying, for example, proofs in mathematics, and verifying and proving 241 00:32:47,760 --> 00:32:54,400 properties about programs. But they can also be used to reason about natural 242 00:32:54,400 --> 00:33:05,440 language. So the different approach to meaning in language is the so-called proof 243 00:33:05,440 --> 00:33:14,240 theoretic semantics. So before we looked at what models can be, how models can be 244 00:33:14,240 --> 00:33:21,520 used to express conditions under which things can be can be seen as true. And now 245 00:33:21,520 --> 00:33:28,960 we care more about proofs than about truth. And that goes back to logic and 246 00:33:28,960 --> 00:33:37,520 computer science again. Where it's usually referred to as the Curry-Howard 247 00:33:37,520 --> 00:33:44,800 isomorphism. It's that proofs in mathematics can be expressed as programs 248 00:33:44,800 --> 00:33:49,920 in a computational framework, and that allows us practical reasoning on 249 00:33:49,920 --> 00:33:57,040 computers. And informally, if you want to prove something, A, we can construct a 250 00:33:57,040 --> 00:34:03,920 term or write a program t of this type. And an example is at the top of the slide 251 00:34:06,720 --> 00:34:13,040 where we kind of want – first it's a function and it's the function from a 252 00:34:13,040 --> 00:34:18,825 semantic representation of the sentence "every man walks" to the sentence "John is 253 00:34:18,825 --> 00:34:30,820 a man", and finally the sentence "John walks". So we want to we have the two 254 00:34:30,820 --> 00:34:36,600 sentences "every man walks" and "John is a man". And we want to get to the conclusion 255 00:34:36,600 --> 00:34:43,277 – or to figure out – if it is a conclusion that "John walks". And this is a small 256 00:34:43,277 --> 00:34:50,134 proof in the proof assistant Coq there in a few lines of code. In a few lines of Coq 257 00:34:50,134 --> 00:34:55,483 code we can find the proof that actually from the sentence every man walks and John 258 00:34:55,483 --> 00:35:04,350 is a man. We can conclude that "John walks". The first big extension is: we 259 00:35:04,350 --> 00:35:14,480 have more expressive types. So before we had the type e for entities and the type t 260 00:35:14,480 --> 00:35:23,680 for truth values. So actually the type t doesn't change a lot. We now just called 261 00:35:23,680 --> 00:35:30,160 it "prop" the type of propositions. But for common nouns like man, which were 262 00:35:31,520 --> 00:35:36,960 properties properties of individuals we just introduced new types 263 00:35:36,960 --> 00:35:44,640 like "man", "book" and so on. And for verbs, for example, we define them as 264 00:35:44,640 --> 00:35:49,120 functions from one of these types to properties. So not just from any 265 00:35:49,120 --> 00:35:56,800 individual, but from meaningful domains. So, for example, the meaning of talk 266 00:35:56,800 --> 00:36:03,520 requires that whoever is talking is actually human, because except for a few 267 00:36:05,280 --> 00:36:12,080 situations where other things could talk, it's usually only a human who talks. And 268 00:36:12,080 --> 00:36:18,634 the meaning of mortal is that something has to first be alive to be mortal. That's 269 00:36:18,634 --> 00:36:32,480 already nice, so we can limit our domains of predicates to very meaningful things 270 00:36:32,480 --> 00:36:40,834 where people intuitively would say that makes sense, but then we end up in the 271 00:36:40,834 --> 00:36:48,472 problem. For example, if we have the meaning of "Socrates is of type man" and 272 00:36:48,472 --> 00:36:54,787 wave "mortal", but mortal is of this type: it takes a parameter that's animate and 273 00:36:54,787 --> 00:37:03,760 return a property. And now we want to interpret "Socrates is mortal" in pretty 274 00:37:03,760 --> 00:37:10,400 much the same way as we did it before, or in a similar way as we interpreted things 275 00:37:10,400 --> 00:37:16,800 before. So we want to apply this meaning of "mortal" to Socrates, but we have the 276 00:37:16,800 --> 00:37:23,920 problem, that man doesn't match the type animate. So how does that work? And the 277 00:37:23,920 --> 00:37:32,640 answer to that, to that is we can assume subtypes, or we can introduce subtypes. So 278 00:37:32,640 --> 00:37:41,520 if we assume that humans are animate and men are human. Then we can actually apply 279 00:37:42,640 --> 00:37:47,680 the function that takes an animate object as a parameter to something that's a man, 280 00:37:49,360 --> 00:37:56,560 and also we want to assume that modified objects like a heavy book should still be 281 00:37:56,560 --> 00:38:06,960 a subtype of book. And fortunately, we can do that in the theory. And the final thing 282 00:38:06,960 --> 00:38:19,760 is: some functions, so we can already rule out some some things, with the … here, if 283 00:38:19,760 --> 00:38:25,920 we say that "talk" only works for humans, we can already rule out some nonsensical 284 00:38:25,920 --> 00:38:34,560 things, but we can use a little bit of this proof theoretic a bit more. So if we 285 00:38:35,600 --> 00:38:42,720 have this. I hope it's not too distracting that it's a bit of a weird 286 00:38:42,720 --> 00:38:48,320 syntax. We define a new type which is consumable, and we define two objects that 287 00:38:48,320 --> 00:38:54,720 are consumable. It's bread and wine. And then we define a new type of action with 288 00:38:54,720 --> 00:39:02,880 the two actions. One is eat, one is drink. And then we define a new type performance, 289 00:39:04,160 --> 00:39:13,840 which depends on both a consumable object and an action. And then we can construct 290 00:39:14,400 --> 00:39:21,360 objects of this performance, by using this construct to perform. So the meaning of 291 00:39:21,360 --> 00:39:28,160 "drink wine", for example, could be: perform, wine, drink. But we can also in 292 00:39:28,160 --> 00:39:36,320 this setting define the meaning of "eat wine" as: perform, wine, eat. And that's 293 00:39:36,320 --> 00:39:42,560 usually something we want to rule out. We want to have only to be able to drink wine 294 00:39:42,560 --> 00:39:50,880 or to eat bread, but not to eat wine or drink bread. And that we can do by 295 00:39:52,080 --> 00:39:58,240 defining a new type of, for example, edible and drinkable objects, and we 296 00:39:58,240 --> 00:40:08,400 define eat bread as edible bread and drink wine as drinkable wine, and then we modify 297 00:40:08,400 --> 00:40:17,280 our action and performance, that the action of eating needs an object of this 298 00:40:17,280 --> 00:40:24,800 type edible for the thing we want to consume. And this object is kind of the 299 00:40:24,800 --> 00:40:33,360 proof that whatever consumable we want to eat is actually edible. And then we can 300 00:40:34,320 --> 00:40:42,720 still define drink wine. In a pretty similar way, but we cannot find any way of 301 00:40:43,680 --> 00:40:50,800 performing eat on wine because there is no way of constructing an edible object for 302 00:40:50,800 --> 00:41:04,320 wine. Which is a very powerful thing of modeling how our real world happens to 303 00:41:04,320 --> 00:41:14,616 work, and … that pretty much concludes my talk. Here are a few references. 304 00:41:14,616 --> 00:41:24,337 And now I'm open for questions. Herald: Right, so, so far we have one 305 00:41:24,337 --> 00:41:31,517 question I see. And the question is: so how far can we take this? Has anyone ever 306 00:41:31,517 --> 00:41:37,397 formulated a sufficiently detailed type system for English or at least some 307 00:41:37,397 --> 00:41:44,345 variant of English, which can be used to reason about human-written or even 308 00:41:44,345 --> 00:41:48,729 informal texts like algorithmic language processing? 309 00:41:48,729 --> 00:41:54,930 daherb: That's a very good question. And that's one of the weak spots of this, 310 00:41:54,930 --> 00:42:03,160 that it involves lots of work. And these days a lot of research and effort is more 311 00:42:03,160 --> 00:42:10,806 on statistical models where you hopefully, very hope, that the computer might learn 312 00:42:10,806 --> 00:42:15,991 these things. There are a few there, there's for example a dataset. So some 313 00:42:15,991 --> 00:42:22,960 logicians sat down at some point and made many sentences in this style, like "every 314 00:42:22,960 --> 00:42:33,934 European has the right to live in Europe" and so on. I think it's a few hundred 315 00:42:33,934 --> 00:42:42,664 sentences and that's kind of the benchmark where people try to test the systems . And 316 00:42:42,664 --> 00:42:50,206 that kind of works. I know there are systems that can cover the whole benchmark 317 00:42:50,206 --> 00:42:56,954 and give the right results, but there are other datasets that are not created by 318 00:42:56,954 --> 00:43:02,447 logicians, but by everyday people crowdsourced on the Internet, and there 319 00:43:02,447 --> 00:43:08,224 it's much more difficult to actually agree on what is the conclusion? 320 00:43:08,224 --> 00:43:15,200 Herald: Okay, so the main problem is that there aren't any curated big enough data 321 00:43:15,200 --> 00:43:18,904 sets to to check the algorithms against, or? 322 00:43:18,904 --> 00:43:25,920 daherb: That's part of the problem. But it's also, yeah, it's really difficult to 323 00:43:25,920 --> 00:43:32,720 make that work for the complete language, for really big parts. But if you have a 324 00:43:32,720 --> 00:43:37,120 small application, a small domain, then it's definitely a feasible thing to do. 325 00:43:38,160 --> 00:43:44,800 And it gives you also an explanation of why one sentence follows from another, 326 00:43:44,800 --> 00:43:51,360 either by giving you the model in which these sentences are true or in the modern 327 00:43:51,360 --> 00:43:56,880 type theory by giving you a proof. So there is lots of research in doing that 328 00:43:56,880 --> 00:44:03,440 without logic and just machine learning, and they seem to be pretty good. But then 329 00:44:04,640 --> 00:44:09,280 sometimes these systems just give you the wrong result and you have no idea why. And 330 00:44:10,320 --> 00:44:14,640 yeah, that's the balance. So you get the highly reliable results from this system, 331 00:44:15,200 --> 00:44:20,090 but you have to do a lot of work to get it working, or you use a machine learning 332 00:44:20,090 --> 00:44:23,782 system which works much broader, but if something goes wrong, it's much more 333 00:44:23,782 --> 00:44:26,222 difficult to figure out what is exactly going wrong. 334 00:44:26,222 --> 00:44:31,948 Herald: Okay, thank you very much. We have a question from Gordon in the IRC – 335 00:44:31,948 --> 00:44:39,371 live.hacc.media for anyone who isn't there yet. How you not managed to mention 336 00:44:39,371 --> 00:44:44,480 Lojban? I'm not really sure on how to pronounce that. 337 00:44:44,480 --> 00:44:52,880 daherb: So, um. I don't know too much about Lojban except that it's an 338 00:44:53,760 --> 00:44:59,840 artificial language that I think was designed to have a clear semantics, 339 00:44:59,840 --> 00:45:12,320 probably something like first order logic so that approach is different and, these 340 00:45:12,320 --> 00:45:18,720 people, at least, who work on this, which I presented at least could pretend that 341 00:45:18,720 --> 00:45:28,400 they work with proper English. Even though some people could also turn it around and 342 00:45:28,400 --> 00:45:33,920 say they defined a nice formal language, which just looks very much like English, 343 00:45:33,920 --> 00:45:41,680 but it's actually it's a formal language. Herald: Okay, and we have Bob asking 344 00:45:41,680 --> 00:45:46,240 whether there is any effort to create a combination of a statistical and logical 345 00:45:46,240 --> 00:45:50,560 approach, and if not, why hasn't it been done or what are the challenges? 346 00:45:50,560 --> 00:45:59,760 daherb: There are definitely approaches that try to do that. The question 347 00:45:59,760 --> 00:46:04,720 is, where do you put the machine learning and where do you put the logic and what 348 00:46:04,720 --> 00:46:10,440 parts you can replace by machine learning, for example? But there is definitely 349 00:46:12,269 --> 00:46:15,516 approaches. I think one is called CCG Lamda by people in Japan. 350 00:46:15,516 --> 00:46:24,960 So there are definitely these approaches. Herald: And we have Klondyke asking 351 00:46:24,960 --> 00:46:32,122 whether there is any work on trying to infer types and rules from the corpuses. 352 00:46:32,122 --> 00:46:37,395 daherb: I'm not really aware of anything like that. 353 00:46:37,395 --> 00:46:43,680 Herald: Did you try generating valid sentences by random. 354 00:46:46,800 --> 00:46:52,480 daherb: depends on what you define as valid sentences. One answer to this 355 00:46:52,480 --> 00:46:58,640 question would be I once wrote a Twitter bot which generates random sentences which 356 00:46:58,640 --> 00:47:06,720 are all total tautologies. So I used many of these techniques I presented to 357 00:47:06,720 --> 00:47:12,960 generate natural language sentences, which all have a certain logic structure. 358 00:47:16,640 --> 00:47:22,880 And if you have a system like that, generating is usually not a big issue, 359 00:47:22,880 --> 00:47:33,462 analysis is usually the bigger issue. Herald: Okay, we have one more question 360 00:47:33,462 --> 00:47:39,325 where I can get the context to about the xkcd cartoon – maybe go back to the slide, 361 00:47:39,325 --> 00:47:45,050 and they want to know whether you can say something about why there are many 362 00:47:45,050 --> 00:47:50,849 competing syntax trees for the same language or more like what the tree is 363 00:47:50,849 --> 00:47:57,836 exactly for, what are the differences daherb: In this, the main difference is 364 00:47:57,836 --> 00:48:05,303 how much effort you want to put into the analysis. So the syntax tree at 365 00:48:05,303 --> 00:48:14,160 the top left is basically textbook level. It gives a very simple analysis and 366 00:48:14,960 --> 00:48:18,400 from a linguistic point of view, people could argue what exactly is, for example, 367 00:48:18,400 --> 00:48:29,920 this category for copula. And the other two sentences are kind of – I don't want 368 00:48:29,920 --> 00:48:33,840 to really say competing syntactic theories, so there are lots of theories, 369 00:48:35,330 --> 00:48:40,150 where they try to explain how natural languages work. I said people are really good at 370 00:48:40,400 --> 00:48:47,840 drawing trees, but they still struggle to explain certain things. And over the last 371 00:48:49,440 --> 00:48:54,640 almost hundred years, people came up with lots and lots of theories, how syntax 372 00:48:54,640 --> 00:49:02,560 actually works. And these theories usually have nice advantages and on the other 373 00:49:02,560 --> 00:49:09,120 hand, drawbacks. So the syntax tree on the top right is called combinatory categorial 374 00:49:09,120 --> 00:49:18,960 grammars and it uses categories which are also kind of function types. So NP/N and 375 00:49:18,960 --> 00:49:28,800 can be seen as a function which takes a N as a parameter on the right and generates an 376 00:49:28,800 --> 00:49:35,360 NP. So the syntactic categories are kind of in parallel to the semantic categories. 377 00:49:36,960 --> 00:49:42,000 And that way it's easy to translate from the syntax tree into the semantic 378 00:49:42,000 --> 00:49:48,240 representation while for other syntax formalisms it might be more challenging to 379 00:49:48,240 --> 00:49:52,720 to do that. So it strongly depends what you want to do, how to express that 380 00:49:52,720 --> 00:50:03,280 concept. Yeah, it goes a bit in the direction of this xkcd. I can subscribe to 381 00:50:03,280 --> 00:50:06,560 any dozen of contradictory models and still be taken seriously. 382 00:50:06,560 --> 00:50:15,840 Herald: Okay, are there any issues on these trees with garden path sentences? 383 00:50:15,840 --> 00:50:22,000 daherb: I should know more about garden path entrances because it's one of 384 00:50:22,000 --> 00:50:30,160 the main examples where things go wrong. At the moment. I cannot really answer 385 00:50:30,160 --> 00:50:38,800 that. I try to give a simple high level overview of it. And there are many 386 00:50:38,800 --> 00:50:44,560 interesting problems: garden path sentences, donkey sentences, where syntax 387 00:50:44,560 --> 00:50:52,240 or semantics or both can go wrong. Herald: Okay, so far, I don't see any more 388 00:50:52,240 --> 00:51:00,400 questions, so I would like to know from you, how does one get into type 389 00:51:00,400 --> 00:51:07,840 theory or language analytics or semantics daherb: Hanging out with the wrong people 390 00:51:09,120 --> 00:51:17,440 or, yeah. So computational linguistics is a pretty established field these days, so 391 00:51:18,160 --> 00:51:26,320 it's what people call interdisciplinary, it's taking a lot of computer science and 392 00:51:26,320 --> 00:51:31,680 hopefully still a bit of linguistics, and then you try to tackle certain problems. 393 00:51:31,680 --> 00:51:38,880 And one of the big problems of natural language is what do they mean? And when 394 00:51:38,880 --> 00:51:45,520 you study that, for example, then you get some you learn something about it, and if 395 00:51:45,520 --> 00:51:50,560 you're curious about it, you can put research into it. And to be honest, my 396 00:51:50,560 --> 00:51:56,240 research was something completely different. But the topic of semantics was 397 00:51:56,240 --> 00:52:00,480 kind of one of the reasons why I wanted to do some more research and stay in 398 00:52:00,480 --> 00:52:08,320 academia. Herald: Okay, thank you. Back again to AI 399 00:52:09,120 --> 00:52:15,440 language processing. Do you think there is a kind of new, bigger field coming? Well, 400 00:52:15,440 --> 00:52:20,000 there was the last few years with natural language processing getting more and more 401 00:52:20,000 --> 00:52:26,080 accessible to the public and better. And do you think that there is still a chance 402 00:52:26,080 --> 00:52:32,240 for only analytical approaches, or do you think that AI will be in the end, more 403 00:52:32,240 --> 00:52:37,760 prevalent? daherb: So actually, if you look at the 404 00:52:37,760 --> 00:52:43,120 research output in recent years, then you only find a little bit about what I was 405 00:52:43,120 --> 00:52:52,385 talking about and you find a lot about … what I have listed here as statistical 406 00:52:52,385 --> 00:52:57,922 approaches, especially language models. I guess most people who haven't been away 407 00:52:57,922 --> 00:53:04,275 from the Internet in the last few months have at least heard a little bit about, 408 00:53:04,275 --> 00:53:11,237 for example, GPT-3, which is one of these fancy AI models which you can use for 409 00:53:11,237 --> 00:53:17,040 generating text based on some sentence to start the text. Or people even 410 00:53:17,040 --> 00:53:25,667 implemented a dungeon crawl text adventure using it. So that's the hot shit in in 411 00:53:25,667 --> 00:53:34,328 research, kind of. But these suffer from this problem that as soon as something 412 00:53:34,328 --> 00:53:40,211 goes wrong, it's really difficult to figure out what's going wrong. And you 413 00:53:40,211 --> 00:53:45,535 need shitloads of data. Herald: Okay, we have some more questions. 414 00:53:45,535 --> 00:53:51,013 Just came in. We have some book recommendations in the IRC if anyone is 415 00:53:51,013 --> 00:53:59,210 interested. And we have one person asking whether a function word like conjunctions 416 00:53:59,210 --> 00:54:04,569 or some grammar like that-clauses can be present in type systems. 417 00:54:04,569 --> 00:54:14,080 daherb: So I haven't done anything about conjunction and so on. In the simply 418 00:54:14,080 --> 00:54:22,880 typed language, they are just translated to their logic operators. So you only have 419 00:54:22,880 --> 00:54:28,400 usually conjunctions on the sentence level so you could make sentences. "John loves 420 00:54:28,400 --> 00:54:35,120 Mary" and "Mary sleeps". And because the type of sentence is t, which is truth 421 00:54:35,120 --> 00:54:39,600 value, which is kind of a boolean variable, you can use boolean operations 422 00:54:39,600 --> 00:54:44,880 on it. So you analyze the first sentence and then you analyze the second sentence 423 00:54:44,880 --> 00:54:50,720 and then you check if the logic conjunction holds between them. And if you 424 00:54:50,720 --> 00:54:57,200 translate that in into models, conjunction can also be translated into the set operation 425 00:54:57,200 --> 00:55:05,120 of intersection. So conjunctions and disjunctions like the 426 00:55:05,120 --> 00:55:10,400 function words "and" and "or" are pretty easy, or still rather easy to express. 427 00:55:12,560 --> 00:55:19,360 There is, …, still many, many challenges which into which I couldn't go in the 428 00:55:19,360 --> 00:55:24,480 time. Herald: Okay. I think that answers the 429 00:55:24,480 --> 00:55:30,000 question from before. And there is also a new one: This "for all" thing. I always 430 00:55:30,000 --> 00:55:34,400 thought that required dependent types. Did I miss something or is that actually 431 00:55:34,400 --> 00:55:40,400 wrong? daherb: If it is a simply typed 432 00:55:40,400 --> 00:55:48,560 language, then it doesn't require dependent types. There is is an equivalent 433 00:55:49,520 --> 00:55:57,360 to the "for all" in dependent types – in dependently typed languages. I think that 434 00:55:57,360 --> 00:56:02,857 should be the Pi types which are dependent function types, but in a simply 435 00:56:02,857 --> 00:56:07,872 typed language you just have the forall operator. 436 00:56:07,872 --> 00:56:19,280 Herald: Okay. Are there new questions? I think there are. Can you recommend an intro 437 00:56:19,280 --> 00:56:25,680 to linguistics or a book? I think we just had some, but maybe, you know, some off 438 00:56:25,680 --> 00:56:35,280 the top of your mind. daherb: So except for the ones I have in 439 00:56:35,280 --> 00:56:43,040 my literature list, there are a few books by a professor called Emily Bender, which 440 00:56:43,040 --> 00:56:48,560 she coauthored with other people. One is about syntax, one is about semantics. And 441 00:56:48,560 --> 00:56:52,480 I haven't read them myself, but I think they are pretty good from what I heard. 442 00:56:54,560 --> 00:56:59,120 Because they also bridge between the linguistic knowledge and the requirements 443 00:56:59,120 --> 00:57:01,760 you have these days with more computational. 444 00:57:01,760 --> 00:57:18,320 Herald: Okay. I think, I'm not sure whether the "no" is directed towards me or 445 00:57:18,320 --> 00:57:22,240 you, but apparently some of these were only programing language books, so 446 00:57:22,240 --> 00:57:26,640 probably the ones in the IRC. There is also a question relayed from Twitter 447 00:57:29,280 --> 00:57:32,960 whether semantic web still is a thing in your community. 448 00:57:35,030 --> 00:57:43,520 daherb: It's a community on its own. And I don't know too much about it. It it's 449 00:57:43,520 --> 00:57:48,320 definitely still a thing. There are still people working on it, but I cannot say too 450 00:57:48,320 --> 00:57:55,355 much about it. I think that was probably going in this direction, new attempts for 451 00:57:55,355 --> 00:58:05,348 multilingual Wikipedia, where you use semantic representation for the Wikipedia 452 00:58:05,348 --> 00:58:11,163 page and then use that to generate the articles in various languages. 453 00:58:11,163 --> 00:58:20,548 Herald: Okay, I think the end – daherb: And of course, 454 00:58:20,548 --> 00:58:24,160 Herald: The IRC, you're already satisfied – 455 00:58:24,160 --> 00:58:30,880 daherb: Maybe another answer to the semantics web and in addition, an answer 456 00:58:30,880 --> 00:58:36,240 to a question that was before about actually inferring these types somehow is 457 00:58:36,240 --> 00:58:40,800 so what you can do with these types, especially with expressive 458 00:58:40,800 --> 00:58:45,840 types, is you kind of create something that's called an ontology, which is in a way 459 00:58:45,840 --> 00:58:54,320 a representation of the world. So what I try to do here is I define what is edible, 460 00:58:54,320 --> 00:59:02,800 what is drinkable, and it's tedious to hard code all these things about the 461 00:59:02,800 --> 00:59:09,360 world. And a lot of this is represented in either datasets or, for example, in the 462 00:59:09,360 --> 00:59:14,800 semantic web. So people already put effort into encoding that somehow and then you 463 00:59:14,800 --> 00:59:23,360 just need to extract it and use it. Herald: Okay, we are already at the end 464 00:59:23,360 --> 00:59:28,160 of time, there is three minutes left. We have one more question. I think after that 465 00:59:28,160 --> 00:59:35,920 we'll close it: And are you aware of natural language, hand-wavy formal text 466 00:59:36,640 --> 00:59:39,040 like codes of law which have been type checked? 467 00:59:39,040 --> 00:59:47,840 daherb: I know that there are people who spend their whole Ph.D. on trying to 468 00:59:47,840 --> 00:59:56,901 formalize law texts. I think there was also 469 00:59:56,901 --> 01:00:02,148 some work on formalizing the GDPR, but I heard about a few of these things, but 470 01:00:02,148 --> 01:00:07,563 I'm not fully aware of what's available and where it is available. 471 01:00:07,563 --> 01:00:14,871 Herald: Okay, so if you don't have anything more to say, say so. And apart from that, 472 01:00:14,871 --> 01:00:19,685 I think we are done here, at least for my part. 473 01:00:19,685 --> 01:00:24,990 daherb: Yep, then thanks for listening! 474 01:00:24,990 --> 01:00:27,586 *postroll music* 475 01:00:28,019 --> 01:00:37,799 Subtitles created by c3subtitles.de in the year 2021. Join, and help us!