1 00:00:05,898 --> 00:00:08,180 Hello. Just to check. 2 00:00:08,180 --> 00:00:09,839 Can everyone hear me? 3 00:00:09,839 --> 00:00:11,591 Grand. I've never understood 4 00:00:11,591 --> 00:00:13,640 why that's such a phenomenon when people give talks 5 00:00:13,640 --> 00:00:16,600 because if you can't, what are you meant to say? 6 00:00:16,600 --> 00:00:17,860 (laughter) 7 00:00:18,452 --> 00:00:21,170 But yes, so as said, I'm Os. 8 00:00:21,170 --> 00:00:24,333 I'm a PhD student at the University of Washington, 9 00:00:24,333 --> 00:00:26,250 where, according to the slide, 10 00:00:26,250 --> 00:00:29,818 I study "Gender, Infrastructure and (Counter)Power." 11 00:00:30,130 --> 00:00:32,556 I'd ask you all to do me the indulgence of pretending that 12 00:00:32,556 --> 00:00:37,448 that's some very explicit, nuanced, thoughtful, academic description 13 00:00:37,448 --> 00:00:39,356 and not just what I write as a catch-all, 14 00:00:39,356 --> 00:00:42,385 because I kind of study a thousand different things 15 00:00:42,385 --> 00:00:45,916 and fitting them all into a few words is hard. 16 00:00:46,694 --> 00:00:48,290 But most of the things I study 17 00:00:48,290 --> 00:00:52,309 are around how systems of knowledge enforce particular ideas 18 00:00:52,309 --> 00:00:53,823 of how the world works, 19 00:00:53,823 --> 00:00:55,857 and particular relationships of power 20 00:00:55,857 --> 00:00:58,270 with a specific focus on gender. 21 00:00:58,922 --> 00:01:00,656 I'm also an ex-Wikipedian. 
22 00:01:00,656 --> 00:01:02,657 I spent 15 years as an editor 23 00:01:02,657 --> 00:01:05,904 which is maybe where my interest in the nature of knowledge started, 24 00:01:06,754 --> 00:01:10,274 and I really can't express how happy I was to be invited 25 00:01:10,274 --> 00:01:13,392 and how glad I am to be here with all of you, 26 00:01:13,392 --> 00:01:14,896 but particularly James Forrester 27 00:01:14,896 --> 00:01:16,761 who is probably the only person qualified 28 00:01:16,761 --> 00:01:19,630 to countersign my passport renewal application, 29 00:01:20,609 --> 00:01:23,190 cause it's running out soon and I've been trying to work out... 30 00:01:23,190 --> 00:01:24,326 (laughter) 31 00:01:24,326 --> 00:01:26,205 You move to Seattle. Everything is great. 32 00:01:26,205 --> 00:01:27,771 Then you're like, "Oh, the UK government 33 00:01:27,771 --> 00:01:31,730 requires me to find an ex-priest, civil servant, or member of parliament, 34 00:01:31,730 --> 00:01:33,454 who's known me for at least 2 years 35 00:01:33,454 --> 00:01:35,005 and who I can ship paperwork to." 36 00:01:35,005 --> 00:01:36,569 That sounds plausible. 37 00:01:36,569 --> 00:01:38,440 (laughter) 38 00:01:38,440 --> 00:01:40,189 Anyway, but... 39 00:01:40,189 --> 00:01:43,874 So I'm here as someone who has spent a lot of time of... 40 00:01:45,100 --> 00:01:46,137 a number of years-- 41 00:01:46,137 --> 00:01:49,315 which I don't like to think about because it makes me feel incredibly old-- 42 00:01:49,315 --> 00:01:51,548 wrestling with the nature of knowledge 43 00:01:51,548 --> 00:01:53,456 and the idea of knowledge-- 44 00:01:53,456 --> 00:01:55,324 to talk to you about 45 00:01:55,324 --> 00:01:57,372 what Wikidata looks like 46 00:01:57,372 --> 00:02:01,245 to someone from my background and with my research interests. 
47 00:02:02,433 --> 00:02:05,160 And I'm not going to spend much time on the story of Wikidata itself, 48 00:02:05,160 --> 00:02:08,431 because if you're here, having spent 24 hours 49 00:02:08,431 --> 00:02:10,230 having it brain dumped into you, 50 00:02:10,230 --> 00:02:11,720 you're familiar with it. 51 00:02:11,720 --> 00:02:13,217 It's a big semantic data store 52 00:02:13,217 --> 00:02:15,144 that aims to provide machine-readable knowledge 53 00:02:15,144 --> 00:02:16,939 in a centralized way. 54 00:02:16,939 --> 00:02:21,133 And what this looks like is a series of items 55 00:02:21,133 --> 00:02:23,558 with associated properties or statements. 56 00:02:23,558 --> 00:02:26,840 So the item for "apple" has the property "fruit." 57 00:02:26,840 --> 00:02:28,220 I mean, probably. 58 00:02:28,220 --> 00:02:31,454 It's a Wiki so there's probably a long-running edit war 59 00:02:31,454 --> 00:02:32,994 of whether an apple is a fruit, 60 00:02:32,994 --> 00:02:36,160 and there's 50 people running 300 accounts between them, 61 00:02:36,160 --> 00:02:37,670 and it's been going for years, 62 00:02:37,670 --> 00:02:41,010 and at this point, if you mention the word apple on Wikidata, 63 00:02:41,010 --> 00:02:44,645 you're preemptively banned as someone who, you know, 64 00:02:44,645 --> 00:02:46,104 is secretly a sock puppet 65 00:02:46,104 --> 00:02:48,753 and running an account on one or another side of this. 66 00:02:50,247 --> 00:02:51,941 So as a consequence, 67 00:02:51,941 --> 00:02:54,229 it's also a classification system, right? 68 00:02:54,229 --> 00:02:56,717 A way of sorting and organizing the world. 69 00:02:57,123 --> 00:03:00,070 So, objects or people or concepts 70 00:03:00,070 --> 00:03:03,665 are classified as worth having a Wikidata entry or not. 71 00:03:03,665 --> 00:03:05,186 A fruit or not. 
72 00:03:05,580 --> 00:03:06,620 And in each case 73 00:03:06,620 --> 00:03:08,325 a series of criteria apply 74 00:03:08,325 --> 00:03:10,782 to determine the properties that an object should have, 75 00:03:10,782 --> 00:03:12,619 and the values of these properties 76 00:03:12,619 --> 00:03:15,225 and how the objects all relate to each other. 77 00:03:15,225 --> 00:03:17,776 So Wikidata is really an attempt to build 78 00:03:17,776 --> 00:03:20,378 a universal classification system. 79 00:03:21,912 --> 00:03:25,983 And classification systems have been studied pretty extensively. 80 00:03:25,983 --> 00:03:28,963 One prominent work which I'd really recommend people read 81 00:03:28,963 --> 00:03:32,382 if they're interested in this stuff is *Sorting Things Out,* 82 00:03:32,382 --> 00:03:35,513 which is a book by Geoff Bowker and Susan Leigh Star. 83 00:03:36,332 --> 00:03:38,555 And they found that in an ideal universe, 84 00:03:38,555 --> 00:03:41,062 a classification system, 85 00:03:41,062 --> 00:03:44,300 be it universal or over a particular domain, 86 00:03:44,300 --> 00:03:46,099 has three attributes. 87 00:03:46,099 --> 00:03:49,987 The first is it operates on consistent and unique principles. 88 00:03:49,987 --> 00:03:51,652 So, there's a consistent pattern 89 00:03:51,652 --> 00:03:55,252 of what should be in each category and for what reasons. 90 00:03:55,557 --> 00:03:59,224 The second is all the categories are mutually exclusive. 91 00:03:59,224 --> 00:04:02,060 And the third is that the system is complete. 92 00:04:02,060 --> 00:04:04,566 It contains total coverage of what it describes. 93 00:04:04,566 --> 00:04:06,989 And this doesn't mean it has to have every single object 94 00:04:06,989 --> 00:04:08,975 that fits into the system.
95 00:04:08,975 --> 00:04:10,861 It just means that in the situation 96 00:04:10,861 --> 00:04:13,233 where it lacks an object 97 00:04:13,233 --> 00:04:14,827 and that object then shows up, 98 00:04:14,827 --> 00:04:16,433 there should be a consistent mechanism 99 00:04:16,433 --> 00:04:19,090 to work out whether it should be added or not, 100 00:04:19,090 --> 00:04:20,605 and how it should be described 101 00:04:20,605 --> 00:04:22,114 and so on, and so forth. 102 00:04:22,699 --> 00:04:25,835 There is one small problem with this which is that: 103 00:04:26,685 --> 00:04:29,315 "No real-world working classification system 104 00:04:29,315 --> 00:04:31,575 that we have looked at meets these simple requirements 105 00:04:31,575 --> 00:04:33,332 and we doubt that any ever could." 106 00:04:34,002 --> 00:04:35,274 Or to put it another way, 107 00:04:35,274 --> 00:04:37,321 all classification systems fail. 108 00:04:37,760 --> 00:04:41,208 All classification systems have gaps and exceptions. 109 00:04:42,068 --> 00:04:45,186 And obviously, the same is true for all systems, full stop. 110 00:04:45,186 --> 00:04:47,146 Anyone who has ever coded 111 00:04:47,146 --> 00:04:49,317 or simply worked in an environment, 112 00:04:49,317 --> 00:04:51,672 or studied in an environment, or lived in the world 113 00:04:51,672 --> 00:04:54,720 knows that we've yet to design a single thing 114 00:04:54,720 --> 00:04:57,669 that we've thought all the way through. 
115 00:04:57,929 --> 00:05:00,365 The problem is that when we take a system, 116 00:05:00,365 --> 00:05:01,894 classification, or otherwise, 117 00:05:01,894 --> 00:05:03,199 and put it out into the world 118 00:05:03,199 --> 00:05:07,314 and give it power and authority, and integrate it into other systems, 119 00:05:07,314 --> 00:05:09,329 that already have power and authority, 120 00:05:09,329 --> 00:05:11,373 there are consequences for what happens 121 00:05:11,373 --> 00:05:13,769 when the system inevitably fails, 122 00:05:14,657 --> 00:05:20,189 for how it reinforces or undermines existing relationships of power, 123 00:05:20,189 --> 00:05:21,970 for how it hurts people. 124 00:05:22,370 --> 00:05:25,111 A universal classification system is, in other words, 125 00:05:25,111 --> 00:05:28,952 not merely doomed to failure, it's also doomed to hurt people. 126 00:05:30,083 --> 00:05:32,352 And the way that it is structured 127 00:05:32,352 --> 00:05:37,052 is ultimately a series of ethical and political choices as a result-- 128 00:05:37,585 --> 00:05:39,417 Who do you want to hurt? How much? 129 00:05:39,417 --> 00:05:41,839 What should be done when people are injured? 130 00:05:42,197 --> 00:05:45,166 And those choices have real consequences. 131 00:05:48,292 --> 00:05:51,602 And so making these choices often involves confronting the fact 132 00:05:51,602 --> 00:05:53,733 that there's very rarely a single 133 00:05:53,733 --> 00:05:55,551 simple machine-readable interpretation 134 00:05:55,551 --> 00:05:58,698 of something that's true for all people throughout all history. 135 00:05:58,698 --> 00:05:59,966 Anything in the universe 136 00:05:59,966 --> 00:06:03,135 has multiple meanings, and symbolisms, and nuances 137 00:06:03,135 --> 00:06:07,077 to different people in different contexts at different times.
138 00:06:07,270 --> 00:06:10,119 But designing a classification system and implementing it, 139 00:06:10,119 --> 00:06:11,952 designing a system that can make a claim 140 00:06:11,952 --> 00:06:14,557 to having consistent principles, 141 00:06:14,557 --> 00:06:17,358 and covering everything it discusses, 142 00:06:17,358 --> 00:06:21,148 inevitably involves cutting down on this complexity 143 00:06:21,148 --> 00:06:25,045 and making decisions about what "the" meaning of a thing is going to be, 144 00:06:25,045 --> 00:06:26,860 or what array of possible meanings 145 00:06:26,860 --> 00:06:29,659 should be presented and in what sequence. 146 00:06:30,770 --> 00:06:31,820 And as a result, 147 00:06:31,820 --> 00:06:35,690 it involves silencing voices or rendering voices louder. 148 00:06:35,984 --> 00:06:37,879 Again, this has consequences. 149 00:06:37,879 --> 00:06:40,263 And to see what I mean about this complexity 150 00:06:40,263 --> 00:06:43,855 and context, and reduction, and the consequences of it, 151 00:06:44,174 --> 00:06:47,370 I'd like to step through some examples from Wikidata itself. 152 00:06:47,640 --> 00:06:50,368 The ones I've chosen are all gender-related because again, 153 00:06:50,368 --> 00:06:54,305 gender is both professionally and personally sort of a key interest. 154 00:06:55,360 --> 00:06:58,837 So, the first that I'll start with is transsexualism 155 00:06:58,837 --> 00:07:00,506 which is described as a "condition 156 00:07:00,506 --> 00:07:02,956 in which an individual identifies with a gender 157 00:07:02,956 --> 00:07:06,966 inconsistent or not culturally associated with their biological sex." 158 00:07:08,458 --> 00:07:10,136 Fairly unobjectionable and-- 159 00:07:10,136 --> 00:07:12,306 wait, no, it's classified as a disease, 160 00:07:12,306 --> 00:07:15,271 and a psychiatric disease at that.
161 00:07:15,833 --> 00:07:18,657 Now, I know what you're thinking, which is this is appalling 162 00:07:18,657 --> 00:07:20,462 but actually it's not as simple 163 00:07:20,462 --> 00:07:24,340 as either of these statements being true or false, right? 164 00:07:25,094 --> 00:07:27,753 They're in a category of sort of, "true, except." 165 00:07:28,539 --> 00:07:33,194 So, take "transsexualism is an instance of disease," right? 166 00:07:33,374 --> 00:07:35,680 Technically, this is true, 167 00:07:35,680 --> 00:07:37,939 in so far as transsexualism 168 00:07:37,939 --> 00:07:39,228 is the name of an entry 169 00:07:39,228 --> 00:07:42,428 under the International Classification of Diseases, version 10. 170 00:07:43,070 --> 00:07:45,692 But we should add some complexity and nuance to that. 171 00:07:45,692 --> 00:07:51,160 So, the ICD is a classification of literally 172 00:07:51,160 --> 00:07:53,623 everything in the world that you could have 173 00:07:53,623 --> 00:07:57,929 that was in any way involved at all in someone's injury or death. 174 00:07:58,336 --> 00:08:02,647 It is in fact illegal to die of something that is not listed in the ICD. 175 00:08:02,647 --> 00:08:04,212 (laughter) 176 00:08:04,958 --> 00:08:07,125 So it contains kind of a lot of things, 177 00:08:07,125 --> 00:08:09,071 and transsexualism is listed in it 178 00:08:09,071 --> 00:08:10,534 so we classify it as a disease 179 00:08:10,534 --> 00:08:13,674 because it's in a classification of diseases. 180 00:08:13,674 --> 00:08:16,864 So, here are some other things that the ICD also lists as diseases 181 00:08:16,864 --> 00:08:19,400 that it has specific entries for. 182 00:08:20,079 --> 00:08:22,995 PA80: Shot by accident. 183 00:08:24,166 --> 00:08:28,363 PA40.0: Fell off a boat, drowned. 184 00:08:28,363 --> 00:08:29,887 (laughter) 185 00:08:30,162 --> 00:08:35,106 PA41.1: Fell off a boat, damaged the boat, and drowned.
186 00:08:35,106 --> 00:08:36,798 (laughter) 187 00:08:37,210 --> 00:08:39,975 PA40.1: Fell off the boat, 188 00:08:39,975 --> 00:08:41,380 didn't damage the boat, 189 00:08:41,380 --> 00:08:42,598 didn't drown, 190 00:08:42,598 --> 00:08:44,050 still died of something. 191 00:08:44,050 --> 00:08:45,386 (laughter) 192 00:08:45,732 --> 00:08:48,793 And finally, QD50: Being poor. 193 00:08:48,793 --> 00:08:50,111 (laughter) 194 00:08:50,439 --> 00:08:53,668 So, if any of you have ever fallen off a boat, 195 00:08:53,668 --> 00:08:57,186 I'm very sorry but you have a disease 196 00:08:57,186 --> 00:08:59,265 which you should really talk to a doctor about. 197 00:08:59,265 --> 00:09:01,207 What class of doctor, I'm not sure. 198 00:09:01,207 --> 00:09:03,134 It might be a psychiatrist. 199 00:09:03,134 --> 00:09:04,510 Who knows? 200 00:09:05,713 --> 00:09:07,797 So you know that's disease, right? 201 00:09:07,797 --> 00:09:10,522 What about health specialty: psychiatry? 202 00:09:10,522 --> 00:09:13,703 Well, that's also true, sort of. 203 00:09:13,703 --> 00:09:15,595 So, psychiatrists are the people 204 00:09:15,595 --> 00:09:18,748 who diagnose the presence of gender dysphoria, 205 00:09:18,748 --> 00:09:21,074 a disconnect between one's sense of gender 206 00:09:21,074 --> 00:09:24,184 and one's sort of like, embodied or perceived gender. 207 00:09:24,530 --> 00:09:26,032 But again, context. 208 00:09:26,032 --> 00:09:29,048 For example, saying psychiatrists diagnose it 209 00:09:29,048 --> 00:09:30,970 ignores the fact that none of the treatments 210 00:09:30,970 --> 00:09:32,329 are psychiatric. 211 00:09:32,329 --> 00:09:34,052 You might as well list the specialties 212 00:09:34,052 --> 00:09:38,328 as specialization in hormones 213 00:09:38,328 --> 00:09:41,798 or plastic surgery, or being a personal shopper. 214 00:09:42,118 --> 00:09:45,664 All of these also have some role in people's life trajectories. 215 00:09:46,042 --> 00:09:47,874 They are not listed. 
216 00:09:49,605 --> 00:09:51,704 One other useful potential factoid by the way, 217 00:09:51,704 --> 00:09:54,201 is that the ICD 10 is actually 218 00:09:54,201 --> 00:09:57,477 the old International Classification of Diseases, 219 00:09:57,477 --> 00:10:01,446 and the ICD 11 no longer lists transsexualism at all, 220 00:10:01,446 --> 00:10:03,512 much less as a disease. 221 00:10:04,222 --> 00:10:08,297 But my point here is not that Wikidata sometimes contains outdated information 222 00:10:08,297 --> 00:10:10,512 or sometimes contains false information, 223 00:10:10,512 --> 00:10:14,481 it's that the statements that are constructed from that information 224 00:10:14,481 --> 00:10:16,873 as a consequence of what they leave out 225 00:10:16,873 --> 00:10:18,213 and what the results are, 226 00:10:18,213 --> 00:10:20,524 drop things and add risk. 227 00:10:21,418 --> 00:10:23,132 So, one way of structuring 228 00:10:23,132 --> 00:10:25,478 the information that that entry contained is: 229 00:10:25,478 --> 00:10:29,545 "transsexualism is a psychiatric disease." 230 00:10:29,545 --> 00:10:31,511 And this leaves out a lot of complexity, 231 00:10:31,511 --> 00:10:33,543 some of which we've discussed. 232 00:10:33,543 --> 00:10:36,178 But the greater issue is how it interlocks 233 00:10:36,178 --> 00:10:39,751 and resonates with existing narratives, and existing information. 234 00:10:40,161 --> 00:10:43,340 For example, the idea that transsexualism is a disease. 235 00:10:43,340 --> 00:10:48,478 Does anyone know why the ICD stopped listing it as a disease? 236 00:10:49,830 --> 00:10:51,343 Well, two reasons. 237 00:10:51,343 --> 00:10:55,779 First is because calling being trans a disease is not accurate. 238 00:10:55,779 --> 00:10:58,698 It does not meet the definition of being a disease.
239 00:10:59,932 --> 00:11:02,763 In fact, the only reason that anything to do with being trans 240 00:11:02,763 --> 00:11:07,761 is still in the ICD is not out of some objective 241 00:11:07,761 --> 00:11:11,685 like, you know, examination of biology or psychiatry 242 00:11:11,685 --> 00:11:13,928 but instead purely pragmatism. 243 00:11:14,126 --> 00:11:15,676 That if you stop listing it, 244 00:11:15,676 --> 00:11:18,453 then insurance companies in places like the U.S. 245 00:11:18,453 --> 00:11:20,969 would stop covering medical care 246 00:11:20,969 --> 00:11:23,787 that is associated with being trans. 247 00:11:24,257 --> 00:11:25,787 And the second is that 248 00:11:27,077 --> 00:11:30,335 the stigma associated with having something classified 249 00:11:30,335 --> 00:11:32,761 as a disease is substantive, 250 00:11:33,014 --> 00:11:35,514 and when you list transsexualism as a disease 251 00:11:35,514 --> 00:11:37,038 and a psychiatric one at that, 252 00:11:37,038 --> 00:11:39,373 you tap into really long-standing assumptions 253 00:11:39,373 --> 00:11:41,685 and false beliefs about trans people. 254 00:11:41,685 --> 00:11:43,865 Assumptions and beliefs that have a lot of power. 255 00:11:43,865 --> 00:11:46,753 Like, if it's a disease there must be something wrong 256 00:11:46,753 --> 00:11:49,601 with trans people, something that people should fix. 257 00:11:49,896 --> 00:11:51,487 And if it's a psychiatric condition 258 00:11:51,487 --> 00:11:54,740 then trans people should be therapized out of being trans. 259 00:11:54,740 --> 00:11:59,150 In other words, whatever the raw truth or falseness of the statement, 260 00:11:59,150 --> 00:12:01,930 stripping out its complexity and contextuality, 261 00:12:01,930 --> 00:12:05,276 lets people fit it into their own notions of what it means. 
262 00:12:06,361 --> 00:12:07,374 And that doesn't end 263 00:12:07,374 --> 00:12:10,160 in a neutral objective classification system, 264 00:12:10,160 --> 00:12:13,109 it ends in things like conversion therapy, 265 00:12:13,109 --> 00:12:17,402 and it being legal to beat people to death for being trans 266 00:12:17,402 --> 00:12:19,990 when you find out that they're trans after you slept with them, 267 00:12:19,990 --> 00:12:22,061 because, you know, something's wrong with them. 268 00:12:22,061 --> 00:12:25,917 Like why would you be considered reasonable 269 00:12:25,917 --> 00:12:27,417 to have done this? 270 00:12:29,574 --> 00:12:33,375 So a more accurate framing of this might be this, 271 00:12:33,375 --> 00:12:36,859 which is hard to fit into Wikidata. 272 00:12:37,673 --> 00:12:40,573 And because we can't fit that into Wikidata, 273 00:12:40,573 --> 00:12:41,780 and we strip it down, 274 00:12:41,780 --> 00:12:43,077 and we lose all that complexity, 275 00:12:43,077 --> 00:12:47,943 we open up the possibility to, again, reinforce these really dangerous notions. 276 00:12:49,652 --> 00:12:52,082 So, let's look at another example, also from gender, 277 00:12:52,082 --> 00:12:54,441 and that is the entry for non-binary. 278 00:12:55,592 --> 00:12:58,360 So, as Wikidata informs us, 279 00:12:58,360 --> 00:13:00,505 non-binary is a range of genders 280 00:13:00,505 --> 00:13:03,420 that are neither exclusively man nor woman. 281 00:13:03,420 --> 00:13:07,062 And there are some critiques I have of the "also known as" section, 282 00:13:07,062 --> 00:13:08,616 but that's not the biggest issue here. 283 00:13:08,616 --> 00:13:10,068 No, the biggest issue here 284 00:13:10,068 --> 00:13:14,800 is that at no point does this entire page make any reference to trans people. 285 00:13:14,970 --> 00:13:18,972 So, if you go to the entry for transgender woman, 286 00:13:18,972 --> 00:13:22,064 it says, "opposite to transgender man." 
287 00:13:22,243 --> 00:13:24,093 And if you go to the entry for transgender man 288 00:13:24,093 --> 00:13:26,542 it says, "opposite to transgender woman." 289 00:13:26,916 --> 00:13:28,640 If you go to this entry, 290 00:13:28,998 --> 00:13:32,982 it has absolutely no reference to trans people whatsoever. 291 00:13:32,982 --> 00:13:36,106 There is this complete disconnect and distinction 292 00:13:36,106 --> 00:13:39,331 between non-binary people and trans people. 293 00:13:40,327 --> 00:13:42,324 And this might seem to be 294 00:13:42,324 --> 00:13:44,432 a pedantic thing to be concerned about 295 00:13:44,432 --> 00:13:47,627 but it's actually a really useful example for a couple of reasons. 296 00:13:48,285 --> 00:13:52,821 The first is that how being non-binary relates to being trans 297 00:13:52,821 --> 00:13:54,768 is really hotly debated. 298 00:13:56,248 --> 00:14:00,478 Individual non-binary people may or may not identify as trans. 299 00:14:02,170 --> 00:14:03,762 As a consequence, it's really difficult 300 00:14:03,762 --> 00:14:07,386 to make big categorical judgements about a class of people. 301 00:14:09,120 --> 00:14:13,207 Other people would say that non-binary people aren't trans, 302 00:14:13,207 --> 00:14:16,477 for whatever reason, or that non-binary people are trans. 303 00:14:17,577 --> 00:14:20,093 You know, you have to make a decision at some point. 304 00:14:20,093 --> 00:14:22,107 How are you going to categorize this entry? 305 00:14:22,107 --> 00:14:24,288 What attributes are you going to associate it with?
306 00:14:26,019 --> 00:14:28,047 But it's hard to do that in Wikidata 307 00:14:28,047 --> 00:14:30,841 when by necessity the structure of the platform 308 00:14:30,841 --> 00:14:33,070 is so categorical and so fixed, 309 00:14:33,070 --> 00:14:36,647 that you can't really say like, for some people these things are related 310 00:14:36,647 --> 00:14:39,592 and for others they aren't, and it's actually very politically charged 311 00:14:39,592 --> 00:14:41,333 but you should think about it. 312 00:14:42,473 --> 00:14:44,779 There's no objective fact to fall back on. 313 00:14:44,779 --> 00:14:48,050 It's very contextual and complex, and disputed. 314 00:14:50,193 --> 00:14:53,373 So, how do you fit this in? 315 00:14:53,637 --> 00:14:54,916 Anyone? 316 00:14:57,530 --> 00:15:00,154 But, this reductiveness isn't just a question of, 317 00:15:00,154 --> 00:15:02,370 "Oh well, we haven't fit all the information in 318 00:15:02,370 --> 00:15:04,300 so I guess it's not perfect." 319 00:15:04,300 --> 00:15:08,790 Again, it fits into preexisting discourses and the preexisting world, 320 00:15:08,790 --> 00:15:11,436 and has the potential to cause very real harms. 321 00:15:12,696 --> 00:15:14,180 There's this very long history 322 00:15:14,180 --> 00:15:17,290 of non-binary people not being considered trans, 323 00:15:18,178 --> 00:15:20,974 going back to, in fact, the foundational, 324 00:15:20,974 --> 00:15:24,870 sort of medical and academic, and authoritative works 325 00:15:24,870 --> 00:15:28,852 on what being trans is and how trans people should be treated. 326 00:15:29,633 --> 00:15:30,930 And what this has resulted in 327 00:15:30,930 --> 00:15:35,556 is non-binary people being cut out of access to resources-- 328 00:15:37,094 --> 00:15:41,167 medical care, community membership, any kind of support. 
329 00:15:41,167 --> 00:15:43,861 In fact until 2013, 330 00:15:43,861 --> 00:15:47,083 being non-binary was not a thing you could possibly be 331 00:15:47,083 --> 00:15:51,230 while still getting access to transition-related medical treatment. 332 00:15:51,230 --> 00:15:54,512 If you were, and you wanted access you would have to go to your doctor 333 00:15:54,512 --> 00:15:58,203 and consistently lie, and hopefully get away with it. 334 00:16:00,119 --> 00:16:03,324 So, if you want that diagnosis to happen 335 00:16:03,324 --> 00:16:05,644 so that your health insurance will cover things 336 00:16:05,644 --> 00:16:08,738 or that your national health service will cover things, 337 00:16:08,738 --> 00:16:10,840 you could either be a man or a woman, 338 00:16:10,840 --> 00:16:12,503 and nothing else. 339 00:16:13,986 --> 00:16:16,185 And right now there's a ton of backlash 340 00:16:16,185 --> 00:16:17,730 to non-binary existences 341 00:16:17,730 --> 00:16:20,459 from people who are thinking that we are a threat, 342 00:16:20,459 --> 00:16:23,092 or something new and novel 343 00:16:23,092 --> 00:16:26,758 when we've been around for just as long as any other kind of trans person 344 00:16:26,758 --> 00:16:29,629 and just not discussed. 345 00:16:30,623 --> 00:16:32,285 And again, the consequence of this 346 00:16:32,285 --> 00:16:36,946 is that this silence is reinforcing those preexisting ideas 347 00:16:36,946 --> 00:16:41,523 that being non-binary has nothing to do with being trans whatsoever, 348 00:16:41,523 --> 00:16:47,245 and it creates and reinforces discourses that cut people off from care, 349 00:16:47,245 --> 00:16:49,641 and cut people off from community. 350 00:16:51,273 --> 00:16:55,653 And finally, before I stop harping on things about gender quite so much, 351 00:16:56,352 --> 00:16:57,463 the hijra.
352 00:16:57,463 --> 00:16:59,266 So, according to Wikidata 353 00:16:59,266 --> 00:17:02,239 the hijra are the third gender of South Asian cultures 354 00:17:02,239 --> 00:17:04,869 and a sub class of non-binary. 355 00:17:05,526 --> 00:17:07,258 Now, here's the thing. 356 00:17:07,258 --> 00:17:11,256 Yes, hijra people fall outside a simple man-woman binary, 357 00:17:12,430 --> 00:17:14,280 but pretty much zero hijra people 358 00:17:14,280 --> 00:17:16,236 would ever define themselves as non-binary, 359 00:17:16,236 --> 00:17:18,790 because it just doesn't make any sense. 360 00:17:18,790 --> 00:17:22,705 In a western context, non-binary people are, by definition, 361 00:17:22,705 --> 00:17:24,380 not man or woman 362 00:17:24,380 --> 00:17:27,844 but as a consequence not trans man or trans woman. 363 00:17:28,706 --> 00:17:31,180 Hijra includes trans women, 364 00:17:31,180 --> 00:17:34,160 and also includes all intersex people, 365 00:17:34,160 --> 00:17:37,682 all sterile people, and a large number of gay people 366 00:17:37,682 --> 00:17:40,755 while not including trans men 367 00:17:40,755 --> 00:17:45,422 or people who are non-binary, and were assigned female at birth. 368 00:17:46,704 --> 00:17:48,042 All of this is really complex 369 00:17:48,042 --> 00:17:50,057 and there are literally books written 370 00:17:50,057 --> 00:17:54,578 on the framework of gender and how that fits into it. 371 00:17:54,578 --> 00:17:56,825 But the point is there's not a simple mapping 372 00:17:56,825 --> 00:17:58,803 of western gender notions 373 00:17:58,803 --> 00:18:01,173 to gender notions in the rest of the world. 
374 00:18:02,121 --> 00:18:04,876 Categorizing hijra people 375 00:18:04,876 --> 00:18:09,791 as a subset of non-binary people 376 00:18:09,791 --> 00:18:14,026 ignores the fact that most hijra people do not see themselves that way, 377 00:18:14,026 --> 00:18:15,572 would not see themselves that way, 378 00:18:15,572 --> 00:18:18,891 and that the definitions of hijra and non-binary 379 00:18:18,891 --> 00:18:21,047 are completely incompatible. 380 00:18:22,577 --> 00:18:24,383 But again this has the potential 381 00:18:24,383 --> 00:18:26,711 to cause harm. 382 00:18:26,711 --> 00:18:28,231 Because the fact of the matter 383 00:18:28,231 --> 00:18:31,840 is that western notions of gender are pretty regularly 384 00:18:31,840 --> 00:18:35,116 and over a long period of time exported to the rest of the world 385 00:18:35,116 --> 00:18:36,638 often by violence. 386 00:18:37,261 --> 00:18:39,933 We have these information systems. 387 00:18:39,933 --> 00:18:42,634 We have classification systems. 388 00:18:42,634 --> 00:18:43,638 We have standards. 389 00:18:43,638 --> 00:18:46,648 We have, historically and currently, wars, 390 00:18:46,648 --> 00:18:49,030 all of which are orientated around this idea 391 00:18:49,030 --> 00:18:52,047 that the western way of doing things is the only good way 392 00:18:52,047 --> 00:18:54,306 or is the best way and the standard way, 393 00:18:54,306 --> 00:18:57,097 and everyone should conform. 394 00:18:57,097 --> 00:19:00,795 And so when we have these big projects which are trying to fit the world 395 00:19:00,795 --> 00:19:04,334 into a very westernized idea of knowledge, because they have to, 396 00:19:04,334 --> 00:19:07,736 because that's how classification systems do universally work-- 397 00:19:07,736 --> 00:19:10,533 everything has to fit into one consistent scheme. 398 00:19:11,396 --> 00:19:13,988 It is perpetuating that kind of violence.
399 00:19:17,173 --> 00:19:20,510 So, you could respond to my concerns and examples, 400 00:19:20,510 --> 00:19:22,535 and rambles with kind of a lot. 401 00:19:22,535 --> 00:19:25,480 One line to take would be, "Why does this matter?" 402 00:19:25,480 --> 00:19:28,475 Why does Wikidata participating in and validating 403 00:19:28,475 --> 00:19:32,646 or invalidating particular discourses have an impact on the world? 404 00:19:33,495 --> 00:19:37,126 And the first answer is it actually doesn't matter if it matters. 405 00:19:37,126 --> 00:19:39,385 It matters that you acknowledge it. 406 00:19:39,385 --> 00:19:41,874 So, right now the default framing of Wikidata is 407 00:19:41,874 --> 00:19:45,172 we're just collecting all of the knowledge in a machine-readable form, 408 00:19:45,172 --> 00:19:46,398 but you're not. 409 00:19:46,398 --> 00:19:47,598 You're also making decisions 410 00:19:47,598 --> 00:19:50,124 about what should be included and what shouldn't, 411 00:19:50,124 --> 00:19:52,973 and how knowledge should be represented. 412 00:19:52,973 --> 00:19:55,897 What complexity is worth representing and what isn't. 413 00:19:56,667 --> 00:19:59,330 And those are ethical and political choices, 414 00:19:59,330 --> 00:20:01,790 and framing the project as simply the result 415 00:20:01,790 --> 00:20:04,741 of a million anonymous, and interchangeable monkeys 416 00:20:04,741 --> 00:20:06,748 with an equivalent number of typewriters 417 00:20:06,748 --> 00:20:09,382 makes it impossible for us to have conversations about it. 418 00:20:09,833 --> 00:20:13,450 Wikidata's organizers and users and funders must understand 419 00:20:13,450 --> 00:20:16,877 that they're fundamentally making charged decisions 420 00:20:16,877 --> 00:20:19,443 that are not neutral or objective at all, 421 00:20:20,114 --> 00:20:23,972 and that is not bad but dangerous.
422 00:20:25,991 --> 00:20:28,113 And so, okay, having accepted 423 00:20:28,113 --> 00:20:30,491 that these are ethical and political decisions, 424 00:20:30,491 --> 00:20:32,724 you could say, "Well, if people want their takes 425 00:20:32,724 --> 00:20:35,030 on things included, they should just contribute." 426 00:20:35,352 --> 00:20:38,900 And marginalized communities do contribute a lot, right? 427 00:20:38,900 --> 00:20:41,139 There's a long history of queer communities, 428 00:20:41,139 --> 00:20:44,358 particularly, being very early adopters of technology. 429 00:20:44,358 --> 00:20:47,848 And so people could just contribute to Wikidata. 430 00:20:47,848 --> 00:20:53,148 Like Hijra people could create accounts and start arguing 431 00:20:53,148 --> 00:20:56,260 that actually the entry shouldn't be a subset of non-binary 432 00:20:56,260 --> 00:20:58,124 and so on, and so forth. 433 00:20:58,874 --> 00:21:01,848 The problem is that this is unlikely to help 434 00:21:01,848 --> 00:21:03,698 because they're the minority, 435 00:21:03,698 --> 00:21:05,879 because many of the voices and perspectives 436 00:21:05,879 --> 00:21:07,539 that are currently silenced, 437 00:21:07,539 --> 00:21:10,091 in the political and ethical decisions being made, 438 00:21:10,091 --> 00:21:11,852 are those of minorities. 439 00:21:11,852 --> 00:21:14,180 So, I did some number crunching on this. 440 00:21:14,180 --> 00:21:17,163 Wikidata has 20,000 active editors 441 00:21:17,483 --> 00:21:21,432 from a human population of seven billion give or take, 442 00:21:21,432 --> 00:21:23,919 unless you believe that maths is a lie 443 00:21:23,919 --> 00:21:28,436 and the world governments, controlled by lizards under the Arctic, 444 00:21:28,436 --> 00:21:31,295 is making everything up. 445 00:21:31,295 --> 00:21:33,159 And there are approximately... Um hmm? 446 00:21:33,159 --> 00:21:34,329 (person 1) You mean they're not?
447 00:21:34,329 --> 00:21:35,500 (laughter) 448 00:21:35,500 --> 00:21:37,413 Look, I'll be honest. 449 00:21:37,413 --> 00:21:39,177 If living in the U.S. for the last five years 450 00:21:39,177 --> 00:21:40,236 has taught me anything, 451 00:21:40,236 --> 00:21:44,622 it's that any government assemblage large enough to try and control 452 00:21:44,622 --> 00:21:46,414 a big chunk of the human population 453 00:21:46,414 --> 00:21:50,524 would in no way be consistently competent enough to actually cover it up. 454 00:21:50,524 --> 00:21:51,553 (laughter) 455 00:21:51,553 --> 00:21:53,276 Like we would have found out in three months-- 456 00:21:53,276 --> 00:21:54,522 and it wouldn't even have been 457 00:21:54,522 --> 00:21:57,025 because of some plucky investigative reporter-- 458 00:21:57,025 --> 00:21:58,861 it would have been because one of the lizards 459 00:21:58,861 --> 00:22:00,497 forgot to put on their human suit one day 460 00:22:00,497 --> 00:22:02,806 and accidentally went out to the shops for a pint of milk 461 00:22:02,806 --> 00:22:04,454 (laughter) 462 00:22:04,454 --> 00:22:07,723 and got caught in a TikTok video. 463 00:22:07,723 --> 00:22:09,856 (laughter) 464 00:22:10,880 --> 00:22:13,835 So Wikidata has 20,000 active editors-- 465 00:22:14,497 --> 00:22:16,514 of whom we will assume none are lizards 466 00:22:16,514 --> 00:22:18,419 in human suits or otherwise-- 467 00:22:18,624 --> 00:22:21,331 from a human population of seven billion, 468 00:22:21,770 --> 00:22:24,695 and there are approximately one million Hijra people in the world. 
469 00:22:24,695 --> 00:22:27,494 So if we assume a rate of equal participation-- 470 00:22:27,494 --> 00:22:30,892 setting aside the extreme poverty a lot of Hijra people live in 471 00:22:30,892 --> 00:22:32,245 and the corresponding impact 472 00:22:32,245 --> 00:22:34,669 on access to things like reliable internet coverage-- 473 00:22:35,788 --> 00:22:40,545 then the combined efforts of 20,000 Wikidata editors 474 00:22:40,545 --> 00:22:44,393 would have to be overwhelmed by 2.85 people. 475 00:22:45,881 --> 00:22:48,545 That doesn't seem particularly plausible. 476 00:22:51,830 --> 00:22:53,425 Okay, so then you might say, 477 00:22:53,425 --> 00:22:57,109 "Well, what if we just have other Wikibase instances 478 00:22:57,109 --> 00:22:59,692 isn't that the whole thing we're building towards? 479 00:22:59,990 --> 00:23:03,234 You can set up your own Wikibase with your own perspectives 480 00:23:03,234 --> 00:23:05,940 and your own decisions about how to classify things, 481 00:23:05,940 --> 00:23:07,911 and what to prioritize, and what not to. 482 00:23:08,207 --> 00:23:11,126 Make your own site with your own standard for what constitutes knowledge 483 00:23:11,126 --> 00:23:13,352 and what information is important." 484 00:23:13,352 --> 00:23:15,920 And people could do precisely that. 485 00:23:15,920 --> 00:23:19,311 But the problem is that Wikidata has a lot of heft behind it 486 00:23:19,311 --> 00:23:23,173 which is why the decisions that Wikidata makes have so much import. 487 00:23:23,739 --> 00:23:26,058 There's the fact that it already exists. 488 00:23:26,058 --> 00:23:28,684 It has a first movers advantage. 489 00:23:29,358 --> 00:23:31,378 There's the Wikimedia brand. 490 00:23:31,378 --> 00:23:34,325 There's the funding from places like Google. 491 00:23:34,325 --> 00:23:36,897 There's the relationships with other institutions. 
492 00:23:36,897 --> 00:23:39,018 When the strategic plan for Wikidata 493 00:23:39,018 --> 00:23:42,115 calls for engagement and integration with museums, 494 00:23:42,115 --> 00:23:43,176 that doesn't just result 495 00:23:43,176 --> 00:23:45,390 in getting more data for Wikidata. 496 00:23:45,390 --> 00:23:48,611 That also results in Wikidata 497 00:23:48,611 --> 00:23:52,256 and the decisions its users make permeating more of reality, 498 00:23:52,256 --> 00:23:57,607 becoming more of a standard of how data systems work, 499 00:23:58,120 --> 00:24:01,914 and more of a place that is drawn from to populate other spaces. 500 00:24:04,199 --> 00:24:07,111 So I keep using this line, "Not bad, but dangerous" 501 00:24:07,111 --> 00:24:10,238 to describe classification systems or to describe Wikidata, 502 00:24:10,959 --> 00:24:11,991 and I want to reinforce 503 00:24:11,991 --> 00:24:14,947 that I don't think that Wikidata is inherently bad. 504 00:24:16,171 --> 00:24:19,141 But I do think that its dangers are vast 505 00:24:19,141 --> 00:24:21,134 and are not being properly attended to. 506 00:24:21,134 --> 00:24:23,262 Just by looking at gender, 507 00:24:23,262 --> 00:24:27,498 we saw three examples, which I pulled very, very quickly, 508 00:24:27,498 --> 00:24:31,810 of situations where even setting aside 509 00:24:31,810 --> 00:24:34,950 the sort of objective "accuracy" 510 00:24:34,950 --> 00:24:38,830 of the information that a Wikidata entry might contain, 511 00:24:38,830 --> 00:24:43,645 the information it chooses to contain and chooses to prioritize perpetuates 512 00:24:43,645 --> 00:24:47,255 or silences particular discourses, and particular ideas 513 00:24:47,255 --> 00:24:51,673 that have weight in the rest of the world, that do harm in the rest of the world. 
514 00:24:52,860 --> 00:24:54,230 And I picked those examples 515 00:24:54,230 --> 00:24:57,845 not because they're surprising in any way, 516 00:24:57,946 --> 00:25:00,020 or not because they're unique, 517 00:25:00,020 --> 00:25:04,258 but simply to point out that if I could find that many problems 518 00:25:04,258 --> 00:25:07,038 with resonances in wider violent systems 519 00:25:07,038 --> 00:25:08,987 in such a tiny sliver of content, 520 00:25:08,987 --> 00:25:11,644 imagine how many others are lurking out there. 521 00:25:13,750 --> 00:25:17,507 And the goal of Wikidata, 522 00:25:17,507 --> 00:25:19,385 the goal of universal classification 523 00:25:19,385 --> 00:25:21,577 if these dangers are not attended to 524 00:25:21,577 --> 00:25:24,480 could ultimately result, or will ultimately result, 525 00:25:24,480 --> 00:25:27,661 not in simple like neutral classification, 526 00:25:27,661 --> 00:25:29,134 but imposition. 527 00:25:29,134 --> 00:25:31,673 In saying this is the way the world works 528 00:25:31,673 --> 00:25:33,366 and if you don't like it 529 00:25:33,366 --> 00:25:37,178 then congrats, you should try and fit into it. 530 00:25:38,685 --> 00:25:41,856 And I really wish that I had a sort of simple answer for this. 531 00:25:42,471 --> 00:25:43,526 I don't. 532 00:25:43,526 --> 00:25:44,613 It's one of the advantages 533 00:25:44,613 --> 00:25:45,984 of switching to academia 534 00:25:45,984 --> 00:25:48,116 instead of working in an engineering department. 535 00:25:48,116 --> 00:25:49,378 You can just show up places 536 00:25:49,378 --> 00:25:52,311 and go, "Everything is really complicated." 537 00:25:52,311 --> 00:25:54,089 Someone should do something about that. 538 00:25:54,875 --> 00:25:56,720 Could I have a grant please? 
539 00:25:56,863 --> 00:25:58,283 (laughter) 540 00:25:58,466 --> 00:25:59,604 But all I can really do 541 00:25:59,604 --> 00:26:02,876 is point you back to Bowker and Star's conclusion, 542 00:26:02,876 --> 00:26:06,604 which is that this isn't ultimately about Wikidata, 543 00:26:06,604 --> 00:26:08,255 this isn't a problem with Wikidata, 544 00:26:08,255 --> 00:26:10,855 this is that the class of systems 545 00:26:10,855 --> 00:26:14,244 that Wikidata is a part of has never been done safely 546 00:26:14,244 --> 00:26:16,659 and there is no reason to think it could be. 547 00:26:17,703 --> 00:26:19,585 And so my call is ultimately 548 00:26:19,585 --> 00:26:22,139 not for a particular change, 549 00:26:22,139 --> 00:26:24,482 or for all of you to just go home and give up. 550 00:26:24,933 --> 00:26:27,180 It's for the project collectively 551 00:26:27,180 --> 00:26:29,208 and for you all individually 552 00:26:29,208 --> 00:26:32,435 to determine how comfortable you are 553 00:26:32,435 --> 00:26:35,729 with participating and building a system 554 00:26:35,729 --> 00:26:38,642 that makes a claim to universalism, 555 00:26:38,642 --> 00:26:42,396 that makes a claim to neutrality and truth in data, 556 00:26:43,821 --> 00:26:47,170 when we know that that's neither possible 557 00:26:47,170 --> 00:26:49,661 nor harmless when it fails, 558 00:26:49,661 --> 00:26:53,145 and if you are not comfortable with that, working to articulate 559 00:26:53,145 --> 00:26:55,459 what other ways of doing this there might be. 560 00:26:56,012 --> 00:26:58,789 And these could look like, for example, 561 00:27:00,271 --> 00:27:03,778 giving primacy to those local Wikibase installs.
562 00:27:03,778 --> 00:27:05,822 Saying that ultimately 563 00:27:05,822 --> 00:27:07,866 we need to give individual communities 564 00:27:07,866 --> 00:27:11,316 and individual contexts and spaces primacy 565 00:27:11,316 --> 00:27:13,133 in defining what matters to them, 566 00:27:13,133 --> 00:27:14,842 and how they wish to be defined. 567 00:27:14,842 --> 00:27:19,327 And the conversation about which perspective should be included 568 00:27:19,327 --> 00:27:22,235 in some central repository should wait 569 00:27:22,235 --> 00:27:25,111 until we have the full range of perspectives. 570 00:27:26,755 --> 00:27:28,877 So, that's everything from me. 571 00:27:28,877 --> 00:27:31,397 Thank you, everyone, for sitting through this. 572 00:27:31,713 --> 00:27:35,015 I think we have about 20 to 25 minutes-- 573 00:27:35,015 --> 00:27:38,973 (moderator) 25 minutes for questions, so, please, plentiful. 574 00:27:39,846 --> 00:27:41,280 Thank you very much. 575 00:27:41,893 --> 00:27:44,543 (applause) 576 00:27:46,806 --> 00:27:49,705 (person 2) Thank you so much for this wonderful presentation 577 00:27:49,705 --> 00:27:52,335 about the problems inherent in classification systems. 578 00:27:52,335 --> 00:27:54,772 One of the examples you had is really cool 579 00:27:54,772 --> 00:27:56,415 from a mathematical point of view, 580 00:27:56,415 --> 00:27:58,469 when you were showing that transgender male 581 00:27:58,469 --> 00:28:00,935 is the opposite of transgender female-- 582 00:28:00,935 --> 00:28:03,991 or transgender female is the opposite of transgender male 583 00:28:03,991 --> 00:28:07,386 and the opposite of cisgendered female. 584 00:28:07,386 --> 00:28:11,737 That makes cisgendered female be the same as transgender male, 585 00:28:11,737 --> 00:28:13,390 because opposite of is the same-- 586 00:28:13,390 --> 00:28:16,590 if A is opposite of B and C is the opposite of B, 587 00:28:16,590 --> 00:28:18,178 A and C are the same. 
588 00:28:18,178 --> 00:28:21,034 So actually that's a place where it should be different from 589 00:28:21,034 --> 00:28:22,820 and not opposite of, 590 00:28:22,820 --> 00:28:26,191 and that involves a lot of mathematical issues 591 00:28:26,191 --> 00:28:28,957 when we go to actually ask queries of the database, 592 00:28:28,957 --> 00:28:31,708 so it's really important that you've pointed out things like that. 593 00:28:31,708 --> 00:28:34,093 Yeah, another example of that which I thought was fun 594 00:28:34,093 --> 00:28:38,648 was transsexualism was defined in part further down-- 595 00:28:38,648 --> 00:28:39,879 which I wanted to include, 596 00:28:39,879 --> 00:28:42,394 but couldn't find a way of fitting it into the flow-- 597 00:28:42,394 --> 00:28:45,704 as the same as sex-reassignment surgery. 598 00:28:46,579 --> 00:28:48,302 Which is unintentionally hilarious 599 00:28:48,302 --> 00:28:50,920 because a diagnosis of transsexualism 600 00:28:50,920 --> 00:28:54,951 was historically a prerequisite for sex-reassignment surgery. 601 00:28:55,272 --> 00:28:57,702 So it's not so much a chicken and an egg problem 602 00:28:57,702 --> 00:28:59,648 as the chicken is carrying the egg. 603 00:28:59,648 --> 00:29:01,068 (laughter) 604 00:29:02,159 --> 00:29:04,300 Yeah. So yeah, these-- 605 00:29:04,300 --> 00:29:07,896 When we look at Wikidata and how much it uses mathematical, 606 00:29:07,896 --> 00:29:11,511 or pseudo-mathematical language of, like, 607 00:29:11,511 --> 00:29:16,040 opposite of, distinct from, in the set of... 608 00:29:17,080 --> 00:29:18,890 Yeah, reality is more complex 609 00:29:18,890 --> 00:29:21,826 than the mathematics we have to represent it. 610 00:29:23,441 --> 00:29:25,323 I don't have a smart answer there except to say 611 00:29:25,323 --> 00:29:27,160 that I used to be a quantitative researcher 612 00:29:27,160 --> 00:29:30,417 and I left, and there is a reason for this. 
613 00:29:33,086 --> 00:29:34,403 (moderator) Next question. 614 00:29:34,403 --> 00:29:35,735 Who raised hands? 615 00:29:35,735 --> 00:29:37,165 I see a hand over there? 616 00:29:45,902 --> 00:29:47,279 (person 3) Hello. 617 00:29:47,739 --> 00:29:49,636 First of all. Thank you for this presentation. 618 00:29:49,636 --> 00:29:51,452 It was very eye-opening. 619 00:29:53,417 --> 00:29:55,969 I want to tell you, but first of all-- 620 00:29:55,969 --> 00:29:58,523 there's a Wikimedia-- I don't know if you know 621 00:29:58,523 --> 00:30:00,826 about the community LGBT+ user group. 622 00:30:00,826 --> 00:30:02,006 So it's a user group, 623 00:30:02,006 --> 00:30:03,671 and they have this mailing list, 624 00:30:03,671 --> 00:30:07,589 and they discussing actually the issue of sex and gender in Wikidata, 625 00:30:07,589 --> 00:30:09,069 and there is some proposals made 626 00:30:09,069 --> 00:30:11,651 by LGBT+ people to improve it. 627 00:30:11,651 --> 00:30:14,900 So, but it's not fully done yet. 628 00:30:14,900 --> 00:30:18,431 So, there are some plans, people working on it. 629 00:30:18,431 --> 00:30:20,279 It would be great if you want to chime in there 630 00:30:20,279 --> 00:30:21,578 and give your opinion 631 00:30:21,578 --> 00:30:24,389 because I'm pretty sure you're more expert than most of us. 632 00:30:25,149 --> 00:30:28,001 But I want to give a critique of this thing that you said 633 00:30:28,001 --> 00:30:29,938 about hijra people that said 634 00:30:29,938 --> 00:30:34,277 out of 20,000 editors of Wikidata, 635 00:30:34,277 --> 00:30:36,594 assuming 2.8 of them will be hijra 636 00:30:36,594 --> 00:30:39,747 and they need to overcome all of these 20,000 people 637 00:30:39,747 --> 00:30:41,147 but this is not true. 638 00:30:41,147 --> 00:30:45,379 Lots of people, I say assume 20,000 people 639 00:30:45,379 --> 00:30:47,920 are just unaware of an issue. 
640 00:30:47,920 --> 00:30:49,587 They are not bigots 641 00:30:49,587 --> 00:30:51,926 or they are not going to actively 642 00:30:51,926 --> 00:30:54,070 not let people do this. 643 00:30:54,070 --> 00:30:56,530 And lots of them would help if you tell them. 644 00:30:56,530 --> 00:30:59,558 Like, as you [inaudible] that edits Wikidata, 645 00:30:59,558 --> 00:31:01,785 I have no idea about this issue 646 00:31:01,785 --> 00:31:04,162 and if I knew it I would have fixed it. 647 00:31:04,660 --> 00:31:05,960 So, yeah. 648 00:31:05,960 --> 00:31:08,300 Yeah. I totally get what you mean. 649 00:31:08,878 --> 00:31:11,024 And I want to be clear that I'm not saying 650 00:31:11,334 --> 00:31:12,985 there are 20,000 people, 651 00:31:12,985 --> 00:31:14,278 many of whom are in this room, 652 00:31:14,278 --> 00:31:16,487 although only a tiny percentage 653 00:31:16,487 --> 00:31:19,672 who are vehement bigots and cultural imperialists. 654 00:31:20,341 --> 00:31:22,180 Instead what I'm getting at is the fact 655 00:31:22,180 --> 00:31:26,941 that the consensus model, and discussion-based model 656 00:31:26,941 --> 00:31:31,015 that the WikiProjects are based on 657 00:31:31,015 --> 00:31:33,085 has a couple of flaws, 658 00:31:33,085 --> 00:31:34,304 and one of the big flaws 659 00:31:34,304 --> 00:31:39,238 is that it assumes that all of the voices worth representing are there 660 00:31:39,238 --> 00:31:42,310 and are represented somewhat proportionately. 661 00:31:42,310 --> 00:31:46,192 Consensus started off as a model in Quaker communities 662 00:31:46,192 --> 00:31:49,100 where literally everyone impacted by a decision was in the room, 663 00:31:49,100 --> 00:31:52,690 because everyone impacted by a decision could fit in the room. 
664 00:31:53,810 --> 00:31:58,700 And so my point with this 2.85 number is not to say 665 00:31:58,700 --> 00:32:01,421 you have to argue with the entire population of Wikidata 666 00:32:01,421 --> 00:32:03,497 every time you want to make any decision, 667 00:32:03,497 --> 00:32:07,518 but instead to say that the consensus model 668 00:32:07,518 --> 00:32:11,476 and the majoritarian model of what knowledge should be represented 669 00:32:11,476 --> 00:32:13,846 runs fundamentally into a problem 670 00:32:13,846 --> 00:32:20,693 when the people who are being underrepresented 671 00:32:20,693 --> 00:32:22,680 are underrepresented. 672 00:32:23,634 --> 00:32:26,330 For another example, and a real one, 673 00:32:27,514 --> 00:32:29,633 Myanmar as a country. 674 00:32:30,294 --> 00:32:34,524 The English Wikipedia claims that it was called Burma 675 00:32:34,524 --> 00:32:37,057 until a couple of years ago. 676 00:32:39,028 --> 00:32:41,258 And the reasoning for this was very simple. 677 00:32:42,557 --> 00:32:45,368 The BBC didn't like calling it Myanmar 678 00:32:45,368 --> 00:32:47,568 and a load of editors-- 679 00:32:47,568 --> 00:32:49,376 (person 4) [inaudible] completely wrong. 680 00:32:49,376 --> 00:32:50,485 Sorry. 681 00:32:50,485 --> 00:32:51,917 (laughter) 682 00:32:53,481 --> 00:32:56,057 You run into this issue of like... 683 00:32:56,057 --> 00:32:58,261 I know it's not the precise thing, but it's just... 684 00:32:58,261 --> 00:33:01,553 - (person 4) : [inaudible] it's actually-- - (moderator) I give you the mic, sir. 685 00:33:02,184 --> 00:33:03,290 - Yes? - (person 4) I'm sorry, 686 00:33:03,290 --> 00:33:05,453 that's just incredibly playing being ignorant and that... 687 00:33:05,453 --> 00:33:07,759 - Okay. Go for it. - (person 4) That's an absolute terrible, 688 00:33:07,759 --> 00:33:10,460 terrible mischaracterization of the political situation in Myanmar. 689 00:33:10,460 --> 00:33:11,482 Okay. Go for it. 
690 00:33:11,857 --> 00:33:14,821 (person 4) Anyways, so basically what it is is that the country-- 691 00:33:15,733 --> 00:33:17,339 in the Burmese language 692 00:33:17,339 --> 00:33:20,111 the country can be referred to as *Myanma* or *Bama*. 693 00:33:20,111 --> 00:33:21,121 Yep. 694 00:33:21,121 --> 00:33:22,831 *Myanma* tends to be a more formal register 695 00:33:22,831 --> 00:33:25,074 and *Bama* tends to be a little bit more informal register 696 00:33:25,074 --> 00:33:28,045 but both are acceptable terms for the country. 697 00:33:30,561 --> 00:33:34,759 The term Burma came obviously from the term *Bama*, 698 00:33:36,205 --> 00:33:38,281 but what happened was 699 00:33:38,281 --> 00:33:40,392 there is no official... 700 00:33:41,922 --> 00:33:47,512 The country was officially referred to, in English, as Burma 701 00:33:47,512 --> 00:33:50,186 up until 1988-- 1989, excuse me, 702 00:33:50,977 --> 00:33:53,361 when the military government of the country 703 00:33:53,924 --> 00:33:56,419 basically decided, the military junta of the country decided 704 00:33:56,419 --> 00:33:59,199 that the country should be referred to as *Myanma*. 705 00:33:59,454 --> 00:34:04,526 Ostensibly, this was as an attempt to make the country name 706 00:34:04,526 --> 00:34:07,545 more acceptable to minorities within the country. 707 00:34:08,022 --> 00:34:10,160 However, this is a bit of historical revisionism 708 00:34:10,160 --> 00:34:12,864 because *Myanma* and *Bama* specifically refer 709 00:34:12,864 --> 00:34:15,384 to the majority ethnicity in the country. 710 00:34:15,384 --> 00:34:20,040 So, it was basically the government of Burma at the time-- 711 00:34:20,040 --> 00:34:22,524 trying to make the people equivalent to the country, 712 00:34:22,524 --> 00:34:23,833 therefore implicitly saying-- 713 00:34:23,833 --> 00:34:25,755 (person 4) Almost the opposite, 714 00:34:25,755 --> 00:34:27,494 but in a really weird way. 
715 00:34:27,494 --> 00:34:30,500 They basically declared that *Bama* was in reference to the ethnicity 716 00:34:30,500 --> 00:34:32,625 and *Myanma* was in reference to the country, 717 00:34:32,625 --> 00:34:34,645 when historically they both represent ethnicity 718 00:34:34,645 --> 00:34:35,798 and the country. 719 00:34:35,798 --> 00:34:36,879 That makes sense. 720 00:34:37,060 --> 00:34:42,006 (person 4) But what happened was, because democracy advocates 721 00:34:42,006 --> 00:34:45,032 within the country believed that the military junta 722 00:34:45,032 --> 00:34:46,820 did not have the power 723 00:34:46,820 --> 00:34:48,585 to be able to change the name of the country 724 00:34:48,585 --> 00:34:49,763 in any language, 725 00:34:49,763 --> 00:34:52,183 because they were not empowered by the people of the country 726 00:34:52,183 --> 00:34:57,655 and were explicitly a military junta that they... 727 00:34:57,655 --> 00:34:59,371 therefore the country should continue 728 00:34:59,371 --> 00:35:02,481 to be referred to as Burma in English. 729 00:35:02,481 --> 00:35:05,570 Because of the fact that essentially to call it Myanmar is essentially to say 730 00:35:05,570 --> 00:35:09,692 the government of Burma and Myanmar at the time was legitimate. 731 00:35:10,451 --> 00:35:12,629 After the fall of the-- well not fall, 732 00:35:12,629 --> 00:35:16,776 but after like the semi return of civilian government in 2014, 733 00:35:18,231 --> 00:35:19,732 this question came up, 734 00:35:19,732 --> 00:35:22,262 "Okay, should we call this country Burma or Myanmar in English?" 735 00:35:22,262 --> 00:35:24,705 and essentially, the de facto leader of the country, 736 00:35:24,705 --> 00:35:26,324 Aung San Suu Kyi, 737 00:35:26,324 --> 00:35:29,281 said that there's nothing in the Burmese constitution 738 00:35:29,281 --> 00:35:31,426 that says, you know, what you should call it in English, 739 00:35:31,426 --> 00:35:32,932 so call it whatever you want.
740 00:35:33,124 --> 00:35:34,493 I mean the name of the country 741 00:35:34,493 --> 00:35:38,557 is officially the Union of *Myanma* in Burmese, 742 00:35:38,557 --> 00:35:41,025 but as far as in English you can call it whatever you want. 743 00:35:41,025 --> 00:35:44,916 But generally before the return of the civilian government in Burma, 744 00:35:45,927 --> 00:35:47,807 to refer to it as Myanmar was essentially 745 00:35:47,807 --> 00:35:52,190 to legitimize the military government. 746 00:35:52,466 --> 00:35:53,611 And so therefore, 747 00:35:53,611 --> 00:35:57,340 to call it Burma was generally considered to be a specific political act 748 00:35:57,340 --> 00:35:58,948 to not give that government legitimacy. 749 00:35:58,948 --> 00:36:02,758 Yeah. So, I'm not saying that that isn't a rationale for it. 750 00:36:02,758 --> 00:36:06,113 I'm saying that on the English Wikipedia specifically, 751 00:36:06,113 --> 00:36:10,521 the page went through seven requested move discussions 752 00:36:10,521 --> 00:36:14,660 over four years and a mediation cabal decision, 753 00:36:14,660 --> 00:36:17,092 and an attempted structured mediation, 754 00:36:17,092 --> 00:36:21,179 and a review of one of the closures of the move discussion, 755 00:36:21,179 --> 00:36:23,878 and that when you look at the discussions, 756 00:36:23,878 --> 00:36:26,511 most of the sort of argument back and forth 757 00:36:26,511 --> 00:36:29,074 is not about the nuanced political situation 758 00:36:29,074 --> 00:36:30,205 of the country 759 00:36:30,205 --> 00:36:33,776 but it's instead about what is the common name in media sources 760 00:36:33,776 --> 00:36:36,372 and what do different institutions call it.
761 00:36:36,372 --> 00:36:38,287 And that when you look at the discussion, 762 00:36:38,287 --> 00:36:42,775 you can see a clear point where pretty much every news organization 763 00:36:42,775 --> 00:36:45,225 that isn't the BBC in the English Language, 764 00:36:45,225 --> 00:36:47,941 that's considered like a major western news source 765 00:36:47,941 --> 00:36:49,849 has switched their language sources, 766 00:36:49,849 --> 00:36:53,547 and the debate essentially becomes a debate 767 00:36:53,547 --> 00:36:56,831 of whether we should listen to the *Wall Street Journal* 768 00:36:56,831 --> 00:36:58,372 or the BBC. 769 00:36:58,531 --> 00:37:02,925 So the point I'm making is not about the specific politics 770 00:37:02,925 --> 00:37:04,685 of the situation, but instead the fact 771 00:37:04,685 --> 00:37:07,588 that it's really easy for those decisions 772 00:37:07,588 --> 00:37:12,877 to actually become almost a proxy dispute of how much do we love the BBC, 773 00:37:14,469 --> 00:37:16,139 and that when you look at the discussions 774 00:37:16,139 --> 00:37:18,169 you see this really nice case study 775 00:37:18,169 --> 00:37:21,834 in the issues of having those conversations 776 00:37:21,834 --> 00:37:25,619 and having those nuanced, and often insider perspectives 777 00:37:25,619 --> 00:37:28,919 when most of the discussions are centered around 778 00:37:28,919 --> 00:37:30,421 how much we love the BBC 779 00:37:30,421 --> 00:37:33,742 and are coming from people who are outside the context. 
780 00:37:34,221 --> 00:37:35,734 So, it's not-- 781 00:37:35,734 --> 00:37:37,323 My point in all of this is basically 782 00:37:37,323 --> 00:37:41,219 that even if you're not fighting 20,000 people, 783 00:37:42,009 --> 00:37:44,797 even if you're only arguing with 20 people, 784 00:37:44,797 --> 00:37:47,063 probabilistically, 19 of them 785 00:37:47,063 --> 00:37:50,732 are going to be people who have very strong opinions, 786 00:37:50,732 --> 00:37:53,400 who don't necessarily bear any negative consequences 787 00:37:53,400 --> 00:37:56,095 of whichever change happens, 788 00:37:56,095 --> 00:37:59,542 but have a particular world view and have decided to stick in it, 789 00:37:59,542 --> 00:38:03,526 and so the proposals by the LGBTQ+ group 790 00:38:03,526 --> 00:38:06,320 to change the Wikidata criteria 791 00:38:06,320 --> 00:38:09,327 might be amazing, I might love them, I might not love them, 792 00:38:09,327 --> 00:38:11,251 I haven't read them. 793 00:38:11,688 --> 00:38:14,575 But the base premise of this is... 794 00:38:15,168 --> 00:38:18,395 We got the people who show up on Wikidata right now, 795 00:38:18,395 --> 00:38:22,053 and those are the representatives of all queer people 796 00:38:22,053 --> 00:38:25,912 and this is the universal rule of what should be done 797 00:38:25,912 --> 00:38:27,840 with the content of all queer people 798 00:38:27,840 --> 00:38:31,342 is almost a microcosm of the same problem. 799 00:38:31,512 --> 00:38:34,083 - (moderator) We have another question. - Yep. 800 00:38:34,437 --> 00:38:35,910 (person 5) Hi. 801 00:38:36,141 --> 00:38:38,249 I think there's another problem 802 00:38:38,249 --> 00:38:43,117 with the consensus-based approach we have, 803 00:38:43,117 --> 00:38:45,754 is that sometimes we have consensus 804 00:38:45,754 --> 00:38:48,507 on really difficult issues on how to deal with that 805 00:38:48,507 --> 00:38:52,998 and [inaudible] that on Wikidata, and nobody is reading the discussion. 
806 00:38:53,855 --> 00:38:55,979 Typically, the project Names, 807 00:38:55,979 --> 00:39:00,560 which is a really, really old WikiProject on Wikidata-- 808 00:39:00,560 --> 00:39:05,020 and names are a really, really complicated issue in the world. 809 00:39:05,020 --> 00:39:07,952 Not every people of the world have a given name, 810 00:39:07,952 --> 00:39:12,187 not every people have a family name, not, well, you have an idea. 811 00:39:12,187 --> 00:39:15,259 And there are so many writing systems out there, 812 00:39:15,259 --> 00:39:18,103 and we have, actually, a system 813 00:39:18,103 --> 00:39:22,181 which was working for many cases in the world 814 00:39:22,181 --> 00:39:23,900 on how to use properties, 815 00:39:23,900 --> 00:39:25,904 what items should look like, 816 00:39:25,904 --> 00:39:28,326 how to link these together and everything-- 817 00:39:28,326 --> 00:39:30,024 We have eight pages-- 818 00:39:30,024 --> 00:39:34,303 nobody is reading that, and someone just added 819 00:39:34,303 --> 00:39:39,045 Latin script family names to a Chinese researcher. 820 00:39:39,880 --> 00:39:44,493 So, we don't have the names of these researchers 821 00:39:44,493 --> 00:39:48,790 but we know for sure that the value added was wrong. 822 00:39:48,790 --> 00:39:50,185 I don't have the correct value, 823 00:39:50,185 --> 00:39:52,195 but I know this one is not the correct value. 824 00:39:52,740 --> 00:39:57,241 And it's not just discussing the issue 825 00:39:57,241 --> 00:39:59,363 because we have big discussions 826 00:39:59,363 --> 00:40:01,082 and we have actually modeling 827 00:40:01,082 --> 00:40:07,570 which is mostly working on and even qualifier on things to deal 828 00:40:07,570 --> 00:40:09,548 with more complicated cases 829 00:40:09,548 --> 00:40:13,574 but people are just, "Oh, given names suggest a property, 830 00:40:13,574 --> 00:40:15,321 I will just add that." 831 00:40:16,049 --> 00:40:17,713 - No. - Yeah. 
832 00:40:18,220 --> 00:40:21,354 I think it's not just how to model thing, 833 00:40:21,354 --> 00:40:25,302 it's really how to explain to people the model, 834 00:40:25,302 --> 00:40:30,595 and that's a technical part-- we could have tools with suggestions 835 00:40:30,595 --> 00:40:34,515 and I think the constraint thing which went live last year 836 00:40:34,515 --> 00:40:36,227 is a great thing for that. 837 00:40:36,446 --> 00:40:39,754 But even when we know to model thing, 838 00:40:39,910 --> 00:40:44,569 it's how to make this model known to people. 839 00:40:44,569 --> 00:40:49,029 That's a bit technical issue on how to do that better. 840 00:40:53,566 --> 00:40:55,090 (moderator) So, there was just remark. 841 00:40:55,090 --> 00:40:57,881 There's no real question for you? 842 00:40:58,315 --> 00:40:59,738 Or that's a question to you? 843 00:40:59,738 --> 00:41:02,061 - How to do that. - (person 5) Yeah, it's a question. 844 00:41:02,679 --> 00:41:05,971 (person 5): Sorry, even if we have the discussion, 845 00:41:05,971 --> 00:41:07,486 (moderator) Yeah, sure. 846 00:41:08,346 --> 00:41:10,826 (person 5) My question, if I was not clear, is that 847 00:41:10,826 --> 00:41:12,947 even when everyone is in agreement 848 00:41:12,947 --> 00:41:15,210 on how to model complicated cases, 849 00:41:15,210 --> 00:41:20,375 how do we make technically the model known for project 850 00:41:20,375 --> 00:41:22,385 with the scope of Wikidata, 851 00:41:22,385 --> 00:41:26,690 so people are not adding the wrong value in good faith? 852 00:41:26,690 --> 00:41:30,216 Because our problem is both. 853 00:41:30,216 --> 00:41:33,516 We have trouble modeling complicated realities, 854 00:41:33,516 --> 00:41:39,132 and we have trouble explaining to users, how to follow the model 855 00:41:39,132 --> 00:41:40,530 we actually have. 856 00:41:40,530 --> 00:41:42,157 Yep. 
857 00:41:43,436 --> 00:41:45,675 I will say that if I could solve that problem 858 00:41:45,675 --> 00:41:48,232 which is to reframe it, 859 00:41:48,232 --> 00:41:52,601 how to reliably and consistently enculture new users 860 00:41:52,601 --> 00:41:57,142 into having the same view and understanding 861 00:41:57,142 --> 00:41:59,520 of the project space, 862 00:41:59,520 --> 00:42:01,548 then they would let me graduate 863 00:42:01,548 --> 00:42:03,110 and also give me a job. 864 00:42:03,110 --> 00:42:09,235 It's the second oldest problem in internet spaces is how to do that. 865 00:42:09,235 --> 00:42:12,144 The oldest problem is writing a system 866 00:42:12,144 --> 00:42:14,494 that will automatically detect insults. 867 00:42:16,121 --> 00:42:18,591 I will say that... 868 00:42:18,591 --> 00:42:21,020 You can look back at Wikipedia, 869 00:42:21,020 --> 00:42:22,680 or before that, there was the phenomenon 870 00:42:22,680 --> 00:42:26,897 of Eternal September on Usenet 871 00:42:26,897 --> 00:42:30,295 which was, "Oh these people keep-- AOL disks have gone everywhere 872 00:42:30,295 --> 00:42:31,487 and now there's newcomers 873 00:42:31,487 --> 00:42:34,033 all the time who don't know how things work around here, 874 00:42:34,033 --> 00:42:37,861 and everything is drowning in people hitting 'Reply All.'" 875 00:42:39,804 --> 00:42:42,340 Generally speaking, the place that I would look for that 876 00:42:42,340 --> 00:42:47,691 is there is a discipline called, "Computer-supported collaborative work," 877 00:42:47,944 --> 00:42:49,750 and one of their big questions 878 00:42:49,750 --> 00:42:54,465 is this question of onboarding, and of like... 879 00:42:54,979 --> 00:42:57,907 making the culture known to people. 880 00:42:57,907 --> 00:43:01,203 But it may not be something that is directly solvable, 881 00:43:01,203 --> 00:43:03,301 or that we want to directly solve, right? 
882 00:43:03,301 --> 00:43:06,751 So, Susan Leigh Star who wrote *Sorting Things Out,* 883 00:43:06,751 --> 00:43:08,144 one of her other contributions 884 00:43:08,144 --> 00:43:12,586 was generally the study of infrastructures 885 00:43:12,586 --> 00:43:15,339 of which I would argue Wikidata is definitely one, 886 00:43:15,844 --> 00:43:18,757 and one of the things that she argued 887 00:43:18,757 --> 00:43:21,706 was that infrastructures make themselves known 888 00:43:21,706 --> 00:43:23,012 through using them. 889 00:43:23,012 --> 00:43:27,564 So like, basically the only way to work out how a system works 890 00:43:27,564 --> 00:43:31,730 is to engage with it, and trip over, and fall flat on your face, 891 00:43:31,730 --> 00:43:34,761 and learn not to fall over that way again. 892 00:43:35,156 --> 00:43:39,903 And I think everyone everywhere, including new users, 893 00:43:40,718 --> 00:43:42,906 including people coming from other projects, 894 00:43:42,906 --> 00:43:48,620 wants a way of approaching this where they don't have to fall over. 895 00:43:49,291 --> 00:43:51,411 But I'm not sure if that exists, 896 00:43:51,411 --> 00:43:55,521 and I think that a better place we might look is maybe to ask 897 00:43:56,321 --> 00:43:58,934 what are the consequences of people screwing up 898 00:43:58,934 --> 00:44:02,512 and how do we make screwing up an understandable 899 00:44:02,512 --> 00:44:07,152 and a more expected component of the user experience. 900 00:44:07,517 --> 00:44:09,696 (moderator) Okay thanks. Next question. 901 00:44:10,750 --> 00:44:11,990 (person 6) Thank you. 902 00:44:13,118 --> 00:44:15,478 So, first, thank you very much for your presentation to us. 903 00:44:15,478 --> 00:44:17,486 Again, someone said, eye-opening. 
904 00:44:18,172 --> 00:44:23,195 I was looking at the specific item on transsexualism, 905 00:44:23,836 --> 00:44:27,231 and it's actually even more interesting 906 00:44:27,231 --> 00:44:29,467 because I was looking at different Wikipedias, 907 00:44:29,467 --> 00:44:32,244 how they dealt with the issue. 908 00:44:32,441 --> 00:44:34,512 And I just look at three. 909 00:44:34,682 --> 00:44:38,193 So, apparently, what we are seeing on Wikidata 910 00:44:38,193 --> 00:44:44,390 actually reflects pretty much what happened to some extent 911 00:44:44,830 --> 00:44:47,253 at some level on English Wikipedia, 912 00:44:47,253 --> 00:44:50,818 whereas if you look at Portuguese Wikipedia, 913 00:44:51,384 --> 00:44:55,477 the actual item connects to transgender, 914 00:44:56,429 --> 00:45:02,271 and on French Wikipedia it connects to trans identity 915 00:45:02,835 --> 00:45:07,930 whereas transsexualism is a redirect in both Portuguese and French. 916 00:45:08,569 --> 00:45:14,511 And I was looking at the history of editing on the Wikidata item, 917 00:45:15,185 --> 00:45:18,899 and if you look at-- there were several sort of wars 918 00:45:18,899 --> 00:45:22,424 but the discussion page is actually only one line, 919 00:45:22,719 --> 00:45:26,015 but there were several conflicts between editors, 920 00:45:26,015 --> 00:45:28,369 particularly with the French 921 00:45:28,369 --> 00:45:32,143 that were opposing the use of transsexualism. 922 00:45:32,143 --> 00:45:35,947 If you look at the names of the items on each language, 923 00:45:35,947 --> 00:45:38,924 the only one on which you don't have transsexualism 924 00:45:38,924 --> 00:45:41,182 is French for trans identity, 925 00:45:41,182 --> 00:45:45,100 and then someone came, and did what you said about 926 00:45:45,100 --> 00:45:47,478 it's the opposite [inaudible], trans identity, 927 00:45:47,478 --> 00:45:50,734 and then there is a different item that-- 928 00:45:50,734 --> 00:45:51,940 Oh yeah. 
929 00:45:51,940 --> 00:45:56,356 (person 6) So, it's a complete global fight over... 930 00:45:56,356 --> 00:45:59,498 basically it's reverberating conflicts 931 00:45:59,498 --> 00:46:03,221 that are apparently also 932 00:46:03,221 --> 00:46:08,462 the manifestations of conflicts that happen on each Wikipedia. 933 00:46:08,462 --> 00:46:12,224 Yes, that also reflect conflicts in local cultures, 934 00:46:12,224 --> 00:46:14,462 and in different parts of the world, yeah... 935 00:46:14,757 --> 00:46:16,718 And I'd argue that, I mean, 936 00:46:16,718 --> 00:46:20,524 I'm British so I have a tendency to say, "Wait, fighting with the French?" 937 00:46:20,524 --> 00:46:21,652 "Yes, please!" 938 00:46:21,652 --> 00:46:22,873 (laughter) 939 00:46:22,873 --> 00:46:27,094 But I'd say there's almost something more fundamental than that, 940 00:46:27,094 --> 00:46:29,478 and you can make an argument in the other direction. 941 00:46:29,478 --> 00:46:32,651 I can, as a trans person, make an argument in the other direction and say, 942 00:46:32,651 --> 00:46:36,070 "Actually, it's the French and Portuguese who have it wrong." 943 00:46:36,070 --> 00:46:38,274 Because the actual question is 944 00:46:38,274 --> 00:46:40,456 is the entry transsexualism about 945 00:46:40,456 --> 00:46:44,947 the medical classification, or the state of being, 946 00:46:45,166 --> 00:46:48,343 or the historic medical classification, 947 00:46:48,343 --> 00:46:50,938 or the historic term for the state of being, 948 00:46:50,938 --> 00:46:53,517 or are these different entries, or the same entries? 949 00:46:53,517 --> 00:46:56,418 When are things distinct enough to be different objects, 950 00:46:56,418 --> 00:46:58,646 and how do we negotiate that fight 951 00:46:58,646 --> 00:47:00,780 between people who think that the medical status 952 00:47:00,780 --> 00:47:04,268 and the identity are the same thing, or different things? 
953 00:47:05,273 --> 00:47:08,032 But yeah, there is no easy answer 954 00:47:08,032 --> 00:47:10,496 but yeah, I suspect if you look at a lot of these examples, 955 00:47:10,496 --> 00:47:12,588 and if you look at a lot of controversies, 956 00:47:12,588 --> 00:47:13,829 generally on Wikidata 957 00:47:13,829 --> 00:47:17,548 what you're going to see is these fights over... 958 00:47:17,548 --> 00:47:19,225 These almost negotiations 959 00:47:19,225 --> 00:47:20,861 are the local community norms, 960 00:47:20,861 --> 00:47:23,692 and beyond that are the cultural norms. 961 00:47:23,949 --> 00:47:25,597 Which is a problem because again, 962 00:47:25,597 --> 00:47:28,804 when we're talking about marginalized or minority groups, 963 00:47:29,314 --> 00:47:33,582 we would expect them to also be marginalized within Wiki communities, 964 00:47:33,582 --> 00:47:36,594 and also within Wikidata, 965 00:47:36,594 --> 00:47:38,440 and so Wikidata is sort of... 966 00:47:39,701 --> 00:47:42,901 building on these preexisting prioritizations 967 00:47:42,901 --> 00:47:45,343 of whose knowledge matters, and under what circumstances 968 00:47:45,343 --> 00:47:47,059 and in what form. 969 00:47:48,129 --> 00:47:51,485 (person 7): I wanted to touch on something you mentioned. 970 00:47:52,415 --> 00:47:57,790 Everything is complex and I think modeling it right, 971 00:47:57,790 --> 00:48:00,322 getting it right on Wikidata 972 00:48:00,322 --> 00:48:02,661 is not the sum of the issue. 973 00:48:02,661 --> 00:48:05,263 As you said, Wikidata is infrastructure, 974 00:48:05,263 --> 00:48:08,621 and as [Hermione] said, 975 00:48:08,621 --> 00:48:12,917 we have gotten it right perhaps in some things, in some other topics, 976 00:48:12,917 --> 00:48:15,375 and still can't actually practice it right. 977 00:48:15,375 --> 00:48:16,500 Yep. 
978 00:48:16,500 --> 00:48:18,098 (person 7): So I want to suggest that 979 00:48:18,098 --> 00:48:21,618 this is a prevalent condition of the human race. 980 00:48:22,880 --> 00:48:28,480 And however well we model something, even if we model gender 981 00:48:29,187 --> 00:48:32,900 ten times more complexly than we do today, 982 00:48:33,316 --> 00:48:36,492 most SPARQL queries involving gender would not bother 983 00:48:36,492 --> 00:48:38,159 - with the qualifiers right? - Yeah. 984 00:48:38,159 --> 00:48:42,333 And would still generate very, very flattened, very simplified results. 985 00:48:42,333 --> 00:48:46,862 Google's use of our data in the infamous Google infoboxes 986 00:48:46,862 --> 00:48:49,689 will also flatten the data and ignore qualifiers. 987 00:48:49,929 --> 00:48:51,766 That is not going to change. 988 00:48:51,766 --> 00:48:54,646 Wikidata will continue to be used in simplistic ways. 989 00:48:55,654 --> 00:48:57,179 Indeed, the majority of use, 990 00:48:57,179 --> 00:48:59,397 probably, will be that simplistic thing. 991 00:49:00,020 --> 00:49:03,583 My point is, it's probably not fixable 992 00:49:03,583 --> 00:49:05,441 and we shouldn't stop trying. 993 00:49:06,804 --> 00:49:08,816 I mean we should try to get it right 994 00:49:08,816 --> 00:49:12,753 and understand that a lot of the use is, despite our best efforts, 995 00:49:12,753 --> 00:49:14,643 going to be simplistic and wrong. 996 00:49:14,643 --> 00:49:16,439 Yep. I would agree with that. 997 00:49:16,951 --> 00:49:18,629 I guess I would say that 998 00:49:18,629 --> 00:49:20,469 you know, it's not about like, 999 00:49:20,469 --> 00:49:23,691 my issue here is not about it being you know, 1000 00:49:23,691 --> 00:49:26,259 there is one true incredibly complex answer. 1001 00:49:28,099 --> 00:49:30,432 At some point I just gave up 1002 00:49:30,432 --> 00:49:36,138 even in my thesis which is about transness and technology 1003 00:49:36,138 --> 00:49:37,900 of defining transness. 
1004 00:49:37,900 --> 00:49:39,473 I just gave up. 1005 00:49:39,673 --> 00:49:44,727 And I instead took what is referred to as a pragmatist view, 1006 00:49:44,727 --> 00:49:47,077 which is basically that it is whatever the people 1007 00:49:47,077 --> 00:49:49,278 in the situation that you're studying believe it to be, 1008 00:49:49,278 --> 00:49:53,447 and however they construct the world as if it were, 1009 00:49:54,983 --> 00:49:56,559 and what I'm getting at with this 1010 00:49:56,559 --> 00:49:59,377 is not that there is some universal definition 1011 00:49:59,377 --> 00:50:01,696 of anything which, if sufficiently complicated, 1012 00:50:01,696 --> 00:50:04,760 would be enough, 1013 00:50:04,760 --> 00:50:08,775 but instead that I think that the scale is the problem, 1014 00:50:08,775 --> 00:50:10,898 and the universalism is the problem. 1015 00:50:12,650 --> 00:50:14,560 Maybe we should keep trying, 1016 00:50:14,560 --> 00:50:16,065 or maybe we should stop. 1017 00:50:16,065 --> 00:50:18,846 Maybe we should instead say that, again, 1018 00:50:18,846 --> 00:50:23,442 there should be a Wikibase install in every self-defined community 1019 00:50:23,442 --> 00:50:27,893 that wants it and they can define things, and articulate things 1020 00:50:27,893 --> 00:50:30,220 to their own satisfaction. 1021 00:50:30,800 --> 00:50:32,995 But then we end up in more political 1022 00:50:32,995 --> 00:50:37,023 and fraught debates of reformist versus radical actions, 1023 00:50:37,023 --> 00:50:40,186 and how you open a box with a crowbar that's already inside it, 1024 00:50:40,186 --> 00:50:42,241 and I end up quoting Foucault for an hour, 1025 00:50:42,241 --> 00:50:44,185 and everyone gets sad. 1026 00:50:44,535 --> 00:50:46,929 Including me because I hate Foucault. 1027 00:50:47,495 --> 00:50:49,249 So this might be a discussion for elsewhere. 
1028 00:50:49,249 --> 00:50:51,687 But generally agreed, I just-- 1029 00:50:51,977 --> 00:50:54,236 I would raise questions about 1030 00:50:54,236 --> 00:50:55,691 whether we should keep trying 1031 00:50:55,691 --> 00:50:57,699 for a better form of universalism, 1032 00:50:57,699 --> 00:51:00,145 or whether the problem is that universalism. 1033 00:51:00,704 --> 00:51:03,495 I'm guessing we have time for one more? Yeah. 1034 00:51:04,040 --> 00:51:07,516 (person 8): This is a short question, possibly complex answer. 1035 00:51:07,776 --> 00:51:10,131 One of the most popular 1036 00:51:10,131 --> 00:51:15,352 and used properties is sex or gender on Wikidata. 1037 00:51:16,329 --> 00:51:18,135 Could you speak to whether you find 1038 00:51:18,135 --> 00:51:24,640 that merging useful, productive, problematic? 1039 00:51:26,264 --> 00:51:28,276 Sure, I mean I think it's always 1040 00:51:28,276 --> 00:51:30,209 going to be reductive cause it's a merging. 1041 00:51:31,137 --> 00:51:35,700 But I also think that it is deeply tiresome 1042 00:51:36,790 --> 00:51:38,602 in a way that's kind of interesting 1043 00:51:38,602 --> 00:51:42,423 insofar as it reveals the limitations of Wikidata, 1044 00:51:42,423 --> 00:51:44,035 though Wikidata claims to be building 1045 00:51:44,035 --> 00:51:47,298 towards this like big objective set of knowledge, 1046 00:51:47,298 --> 00:51:49,454 but ultimately kind of smushed these things together 1047 00:51:49,454 --> 00:51:52,648 because I mean they haven't asked 1048 00:51:52,648 --> 00:51:55,862 most people who have entries what their gender is, 1049 00:51:55,862 --> 00:51:57,287 and/or what their sex is, 1050 00:51:57,287 --> 00:51:59,266 and so they just merge them 1051 00:51:59,266 --> 00:52:01,812 so that inference is easier. 
1052 00:52:02,037 --> 00:52:04,811 But generally speaking, yeah, I say that the merging 1053 00:52:04,811 --> 00:52:08,833 of the two together is reductive and dangerous 1054 00:52:08,833 --> 00:52:10,229 but... 1055 00:52:11,517 --> 00:52:13,217 Again it's not... 1056 00:52:13,645 --> 00:52:15,027 There is no good way of doing it. 1057 00:52:15,027 --> 00:52:17,682 I think this is a particularly bad way 1058 00:52:18,400 --> 00:52:22,263 of treating them as interchangeable things, 1059 00:52:22,993 --> 00:52:26,371 and treating them as forever-linked things, 1060 00:52:27,985 --> 00:52:31,728 but I can't suggest a better way that remains-- 1061 00:52:32,147 --> 00:52:33,866 that continues to have Wikidata 1062 00:52:33,866 --> 00:52:36,484 even tracking this information or the information contained 1063 00:52:36,484 --> 00:52:38,134 in that at all. 1064 00:52:38,592 --> 00:52:40,671 (moderator): Okay. I think we have to conclude here. 1065 00:52:40,671 --> 00:52:42,335 I still saw some raised hands 1066 00:52:42,335 --> 00:52:43,565 so hopefully you'll be around. 1067 00:52:43,565 --> 00:52:45,529 Yeah. I am a grad student. 1068 00:52:45,529 --> 00:52:47,316 I have functionally no life, so... 1069 00:52:47,316 --> 00:52:48,359 (laughter) 1070 00:52:48,359 --> 00:52:51,632 (moderator): Perfect. Okay. So please come and talk. 1071 00:52:52,311 --> 00:52:53,801 Thank you very much. 1072 00:52:54,035 --> 00:52:56,181 (applause)