1 00:00:05,929 --> 00:00:09,349 Hi, I am Satdeep. I work with the Foundation in Ben's team. 2 00:00:10,315 --> 00:00:12,851 Here's my friend from India, Bodhi. 3 00:00:12,851 --> 00:00:15,461 He's working with the Centre for Internet and Society, 4 00:00:15,461 --> 00:00:18,541 but he's here in his volunteer capacity. 5 00:00:19,523 --> 00:00:24,769 So, we're going to talk about knowledge gaps and Wikidata today. 6 00:00:25,340 --> 00:00:27,080 So what are knowledge gaps? 7 00:00:27,651 --> 00:00:31,651 As the name suggests, it's a gap in our existent knowledge. 8 00:00:31,651 --> 00:00:36,421 But in terms of Wikidata, we're looking at knowledge gaps 9 00:00:36,421 --> 00:00:38,261 in two different aspects. 10 00:00:38,261 --> 00:00:43,101 One is, how can Wikidata help us in filling the knowledge gaps 11 00:00:43,101 --> 00:00:45,141 in other Wikimedia projects? 12 00:00:45,141 --> 00:00:50,253 And the second is, how do we fill the knowledge gaps within Wikidata? 13 00:00:52,287 --> 00:00:57,187 For the first one, "Filling knowledge gaps with Wikidata." 14 00:00:57,187 --> 00:00:58,887 Wikidata is helping in a number of ways 15 00:00:58,887 --> 00:01:01,177 in filling knowledge gaps on different Wikimedia projects, 16 00:01:01,177 --> 00:01:02,767 for example, ArticlePlaceholder, 17 00:01:02,767 --> 00:01:06,147 or another tool called Scribe is being built, 18 00:01:06,147 --> 00:01:08,908 Wikidata Infoboxes, all of them are-- 19 00:01:08,908 --> 00:01:09,908 (audience reacts) 20 00:01:09,908 --> 00:01:14,218 Yes, there was a session about it early this morning or in the afternoon. 21 00:01:14,713 --> 00:01:18,713 And there are also a lot of different templates 22 00:01:19,700 --> 00:01:22,080 which use Wikidata. 23 00:01:22,429 --> 00:01:28,079 And then there are new templates called [inaudible], 24 00:01:28,773 --> 00:01:34,242 which along with this here are used to make lists like these. 
25 00:01:34,242 --> 00:01:37,342 And if you click on one of the topics on this list, 26 00:01:37,702 --> 00:01:39,522 you get this draft article. 27 00:01:39,522 --> 00:01:42,988 There was a presentation about this in this same room by [inaudible]. 28 00:01:42,988 --> 00:01:46,792 So you get a draft article with some sentences 29 00:01:46,792 --> 00:01:50,152 and the infoboxes from Wikidata. 30 00:01:51,194 --> 00:01:54,988 But this is not what we're going to talk about here today. 31 00:01:55,705 --> 00:01:59,275 We're going to talk about how, in India, 32 00:01:59,275 --> 00:02:03,203 we first have to fill the knowledge gaps within Wikidata, 33 00:02:03,203 --> 00:02:05,563 so then we can do all these amazing things. 34 00:02:05,563 --> 00:02:09,223 So there are knowledge gaps in localization. 35 00:02:09,223 --> 00:02:12,323 We need to add a lot more labels in different languages. 36 00:02:13,256 --> 00:02:16,306 We need to build local data about local places, people, 37 00:02:16,306 --> 00:02:18,576 so that we can do all those awesome things. 38 00:02:18,576 --> 00:02:23,188 But the main aspect there is to build community capacity 39 00:02:23,188 --> 00:02:24,858 to do all that stuff. 40 00:02:24,858 --> 00:02:28,888 So, that's where we come to The Indic Case Study, 41 00:02:28,888 --> 00:02:30,648 which is what this is all about. 42 00:02:30,648 --> 00:02:33,238 And how did it all start? 43 00:02:33,618 --> 00:02:37,133 There is a person sitting right there, Asaf. 44 00:02:37,133 --> 00:02:39,610 He is responsible for all this-- 45 00:02:40,463 --> 00:02:42,826 for bringing Wikidata to India. 46 00:02:43,528 --> 00:02:48,127 So there was the first community capacity development training 47 00:02:49,376 --> 00:02:53,086 with the Tamil community in 2016, where he introduced Wikidata. 48 00:02:53,086 --> 00:02:58,117 And then there was like a bunch of Wikidatans, super users 49 00:02:58,117 --> 00:03:00,447 who started contributing to Wikidata. 
50 00:03:00,447 --> 00:03:04,970 And then, in 2017, on both our requests, 51 00:03:04,970 --> 00:03:07,940 he came to India again and did 52 00:03:07,940 --> 00:03:09,697 (laughs) Wiki-a-Tra-- 53 00:03:09,697 --> 00:03:12,709 it's like Wiki travel in India. 54 00:03:12,709 --> 00:03:16,366 He did that, he went to seven different cities, 55 00:03:16,366 --> 00:03:19,419 seven different communities at least, in India, 56 00:03:19,419 --> 00:03:22,772 where he did Wikidata workshops, 57 00:03:23,192 --> 00:03:26,052 mostly two-day workshops in all those places. 58 00:03:26,052 --> 00:03:30,277 And then, in 2018, again, an Advanced Wikidata workshop. 59 00:03:30,277 --> 00:03:34,737 And that has actually helped in building some sort of Wikidata community 60 00:03:34,737 --> 00:03:36,709 around India. 61 00:03:39,848 --> 00:03:42,498 That also got the community engaged, 62 00:03:42,498 --> 00:03:45,258 and then we started building WikiProject India, 63 00:03:45,258 --> 00:03:47,328 and then some other projects related to that, 64 00:03:47,328 --> 00:03:51,468 such as WikiProject West Bengal, Indian Railways, and Kerala, 65 00:03:51,468 --> 00:03:53,748 which are like some specific regions in India 66 00:03:53,748 --> 00:03:56,038 where the community has been trying to engage themselves 67 00:03:56,038 --> 00:03:58,338 and do some work around it. 68 00:03:58,338 --> 00:04:02,868 And then there have been some more initiatives to engage newbies 69 00:04:02,868 --> 00:04:07,598 such as edit-a-thons, or labelathons, datathons, 70 00:04:08,655 --> 00:04:11,925 with which we've been trying to get more and more people involved. 71 00:04:11,925 --> 00:04:14,895 And some initiatives around education, 72 00:04:14,895 --> 00:04:18,455 workshops in educational institutions-- Asaf also did one of those. 73 00:04:20,697 --> 00:04:22,252 Yeah. Next, Bodhi. 
74 00:04:23,042 --> 00:04:26,432 So, there have been so many workshops in India, 75 00:04:26,432 --> 00:04:30,361 throughout all of India from 2017 to 2019. 76 00:04:30,361 --> 00:04:33,121 And we're also trying to engage, as Satdeep said, 77 00:04:33,121 --> 00:04:35,431 we are trying to engage the newbies in different ways. 78 00:04:35,431 --> 00:04:40,955 But still, the number of power users in India is not very high. 79 00:04:41,915 --> 00:04:48,166 Only very few, maybe five or six people are doing the heavy-duty work. 80 00:04:48,975 --> 00:04:51,795 So one of the reasons for that: 81 00:04:52,891 --> 00:04:58,961 mostly the Wikimedia community is focused in India on other projects, 82 00:04:58,961 --> 00:05:04,183 mostly in Wikipedia and somehow, right now, in Wikisource. 83 00:05:04,183 --> 00:05:08,344 So, there are very few editors who are-- 84 00:05:08,344 --> 00:05:09,644 very few active editors 85 00:05:09,644 --> 00:05:12,344 who are contributing to Wikidata regularly. 86 00:05:13,264 --> 00:05:15,506 India is a multilingual country, 87 00:05:15,506 --> 00:05:18,914 so there are around 22 Wikimedia projects 88 00:05:18,914 --> 00:05:20,203 running in India. 89 00:05:20,203 --> 00:05:22,935 So the workforces are totally divided. 90 00:05:23,630 --> 00:05:28,530 So, we don't have a focused group of people 91 00:05:28,530 --> 00:05:31,439 who are working on specific areas of Wikidata 92 00:05:31,439 --> 00:05:34,540 because they are so much divided into different projects, 93 00:05:34,540 --> 00:05:39,090 that we have to engage-- we're trying to actively engage them 94 00:05:39,090 --> 00:05:40,442 in different ways. 95 00:05:40,442 --> 00:05:42,013 And they are spread over a vast region, 96 00:05:42,013 --> 00:05:44,292 India is the seventh largest country in the world, 97 00:05:44,292 --> 00:05:48,462 and so it's quite difficult to coordinate the intercommunity, 98 00:05:48,462 --> 00:05:53,882 the 22 language communities to work on only one project. 
99 00:05:56,246 --> 00:05:59,898 So, we have adopted a different approach. 100 00:05:59,898 --> 00:06:02,868 Firstly, we're targeting the data gaps, 101 00:06:02,868 --> 00:06:06,736 which is easy because there are huge data gaps in India 102 00:06:07,478 --> 00:06:10,209 on every topic, almost every topic. 103 00:06:10,209 --> 00:06:11,486 And... 104 00:06:12,681 --> 00:06:14,356 (chuckles) 105 00:06:14,356 --> 00:06:15,598 ...start locally. 106 00:06:16,358 --> 00:06:17,770 Sorry. (laughs) 107 00:06:17,770 --> 00:06:21,568 - So, it's 1, 1, 1-- - Everything is a priority! 108 00:06:21,568 --> 00:06:23,678 (laughter) 109 00:06:25,368 --> 00:06:28,028 Anyway. So we start locally. 110 00:06:28,766 --> 00:06:34,666 So we have thought that the entire country-- 111 00:06:34,666 --> 00:06:37,727 data ingestion for the entire country is quite difficult. 112 00:06:37,727 --> 00:06:41,087 And there are huge databases for India, 113 00:06:41,087 --> 00:06:44,596 for example, the science databases, the election databases. 114 00:06:44,596 --> 00:06:48,916 And if we work on the entire country, 115 00:06:48,916 --> 00:06:54,369 then it'd be really impossible for five or six heavy-duty users. 116 00:06:54,369 --> 00:06:58,009 So we target one place at a time. 117 00:06:58,009 --> 00:07:01,779 So that is the map of India, 118 00:07:01,779 --> 00:07:06,315 and you can see the bright pink color that is West Bengal. 119 00:07:06,315 --> 00:07:12,243 So in October 2018 to May 2019, many things happened there. 120 00:07:12,243 --> 00:07:17,288 So lots of data were ingested in that part. 121 00:07:17,288 --> 00:07:20,928 And after this map was generated, 122 00:07:20,928 --> 00:07:24,184 there is a tool for that called Wikidata Analysis-- 123 00:07:24,184 --> 00:07:26,314 built by [inaudible], user: [inaudible]. 124 00:07:26,314 --> 00:07:32,023 And after we got this map, 125 00:07:32,973 --> 00:07:36,323 we shared this with other communities. 
126 00:07:36,323 --> 00:07:39,603 That "We have done this for West Bengal, you can do it for your country. 127 00:07:39,603 --> 00:07:41,043 And this is really cool." 128 00:07:41,043 --> 00:07:43,983 And people have started working-- 129 00:07:44,855 --> 00:07:46,476 that was a direct effect. 130 00:07:46,476 --> 00:07:50,656 WikiProject Kerala was built just at that time, 131 00:07:50,656 --> 00:07:54,656 and they started working on the schools of India-- 132 00:07:54,656 --> 00:07:58,048 schools of Kerala-- and Kerala is situated right here-- 133 00:07:58,048 --> 00:08:01,086 and I couldn't [locate] that in the map right now 134 00:08:01,086 --> 00:08:05,796 because the tool is right now down. 135 00:08:07,316 --> 00:08:10,280 So we just started locally. 136 00:08:11,245 --> 00:08:14,525 We're trying to inspire people from other parts of the country 137 00:08:14,525 --> 00:08:15,729 to contribute. 138 00:08:16,833 --> 00:08:19,962 And that's what happened in West Bengal, 139 00:08:19,962 --> 00:08:25,431 around 40,000 villages with 2001 and 2011 census. 140 00:08:25,431 --> 00:08:28,860 Our data was ingested-- that's complete data. 141 00:08:28,860 --> 00:08:30,639 Almost complete data 142 00:08:30,639 --> 00:08:33,869 which could have been ingested in Wikidata. 143 00:08:34,435 --> 00:08:38,677 And there were 11,000 government hospitals with coordinates 144 00:08:38,677 --> 00:08:40,317 which were ingested, 145 00:08:40,317 --> 00:08:45,260 and there was [inaudible] approach to close to 1 million Bengali labels. 146 00:08:46,191 --> 00:08:47,261 And so on. 147 00:08:47,261 --> 00:08:50,668 There were many things happening, but these were the things 148 00:08:50,668 --> 00:08:53,843 which we've done in West Bengal at that time. 149 00:08:53,843 --> 00:08:58,243 So we also tried to create cool visualizations 150 00:08:58,243 --> 00:09:00,261 from those works we've done 151 00:09:00,261 --> 00:09:02,929 because census and elections, these are boring data. 
152 00:09:02,929 --> 00:09:07,429 These are not paintings, and also so we cannot-- 153 00:09:07,429 --> 00:09:11,239 like these are also not GLAM data and other things. 154 00:09:11,239 --> 00:09:13,029 So these are boring data. 155 00:09:13,029 --> 00:09:19,460 So we need to find some way to make it interesting for people. 156 00:09:19,460 --> 00:09:22,790 So, we have tried some cool queries. 157 00:09:22,790 --> 00:09:24,897 This is one of them. There are many others. 158 00:09:25,321 --> 00:09:27,423 So this is the population growth in West Bengal 159 00:09:27,423 --> 00:09:31,651 between our villages-- around 36,000 villages 160 00:09:31,651 --> 00:09:34,037 between 2001 and 2011. 161 00:09:34,658 --> 00:09:37,778 And not only villages, we have uploaded census data 162 00:09:37,778 --> 00:09:42,547 about every administrative hierarchy, 163 00:09:42,547 --> 00:09:48,353 like community development blocks, districts, municipalities, wards, etc., 164 00:09:48,353 --> 00:09:49,575 cities, towns. 165 00:09:51,669 --> 00:09:56,893 This is a new tool, InteGraality, 166 00:09:57,722 --> 00:10:00,292 and you can see 167 00:10:00,292 --> 00:10:05,487 that this is a count of hospitals 168 00:10:06,970 --> 00:10:08,160 in the world, 169 00:10:08,160 --> 00:10:11,310 and India is right now leading in Wikidata-- 170 00:10:11,310 --> 00:10:15,500 13,466 hospitals. 171 00:10:18,168 --> 00:10:21,218 The blue colors are the data completeness. 172 00:10:22,491 --> 00:10:29,539 But the funny thing is-- it's only one area of India. 173 00:10:29,539 --> 00:10:30,995 It's West Bengal, 174 00:10:30,995 --> 00:10:33,759 there are 11,642 hospitals right now. 
175 00:10:33,759 --> 00:10:38,168 So if we complete all these steps and there are more-- 176 00:10:40,130 --> 00:10:41,740 if we complete all those steps, 177 00:10:41,740 --> 00:10:45,280 there will be a huge amount of data about hospitals 178 00:10:45,280 --> 00:10:48,030 with coordinates which will be there in Wikidata, 179 00:10:48,030 --> 00:10:54,916 and we have a plan to build an app based on that data, 180 00:10:55,934 --> 00:10:59,134 so that when a person gets ill, 181 00:11:02,801 --> 00:11:04,591 using that app, he may find 182 00:11:04,591 --> 00:11:08,718 the nearest location of the hospitals. 183 00:11:15,268 --> 00:11:19,268 So these hospitals are ranging from Primary Health Centers 184 00:11:19,268 --> 00:11:20,966 to [inaudible] Health Cares, 185 00:11:21,941 --> 00:11:26,881 with all sorts of facilities available for them. 186 00:11:26,881 --> 00:11:31,191 So we've tried to ingest all those data in Wikidata, 187 00:11:31,191 --> 00:11:32,541 if possible. 188 00:11:33,251 --> 00:11:37,390 And after completing this task, if we build some app, 189 00:11:37,390 --> 00:11:41,689 then maybe someone, a sick person in a dying urgency 190 00:11:41,689 --> 00:11:45,366 can find the nearest government hospital. 191 00:11:48,795 --> 00:11:52,255 - This is another-- - (Satdeep) Go back. 192 00:11:52,255 --> 00:11:53,743 (Bodhi) Oh, sorry. 193 00:11:54,914 --> 00:12:01,573 Okay. So this is the work which was done for Indian Railways. 194 00:12:01,573 --> 00:12:04,188 It was started there, also from West Bengal. 195 00:12:04,737 --> 00:12:08,994 And you can check the color-- 196 00:12:08,994 --> 00:12:11,094 the blue color is more complete data 197 00:12:11,094 --> 00:12:14,818 and the green color is slightly not complete, 198 00:12:14,818 --> 00:12:18,368 but it's going to get completed soon. 
199 00:12:18,626 --> 00:12:22,863 And there are right now, 9,000 Indian railway stations 200 00:12:22,863 --> 00:12:25,764 with coordinates, obviously, because they are on the map. 201 00:12:25,764 --> 00:12:29,504 Right now, they're being connected with Pakistan and Bangladesh railways. 202 00:12:29,504 --> 00:12:34,682 So we have a plan to connect all Asian railways one day-- 203 00:12:34,682 --> 00:12:35,992 someday, maybe. 204 00:12:35,992 --> 00:12:37,042 (laughs) 205 00:12:37,042 --> 00:12:38,760 But, yeah, we'll do it. 206 00:12:39,334 --> 00:12:44,314 Anyway. So, right now on the table, 207 00:12:44,314 --> 00:12:48,854 we are in the second position after Japan, obviously. 208 00:12:49,947 --> 00:12:53,477 And-- yeah. So this is another cool query. 209 00:12:54,012 --> 00:12:59,487 Visualization showing the flight connections-- 210 00:12:59,726 --> 00:13:01,968 international and domestic flight connections from India, 211 00:13:01,968 --> 00:13:03,146 to and from India. 212 00:13:03,146 --> 00:13:05,803 So it's like kind of messy, but we can filter it 213 00:13:05,803 --> 00:13:09,573 for domestic connections or international connections. 214 00:13:09,573 --> 00:13:11,053 So, anyway. 215 00:13:13,061 --> 00:13:14,641 We have also completed 216 00:13:14,641 --> 00:13:18,141 everything about 2014 Indian General Election data. 217 00:13:18,141 --> 00:13:22,914 The Indian general election is a kind of complex set of data 218 00:13:22,914 --> 00:13:25,314 because there are so many political parties, 219 00:13:25,314 --> 00:13:27,646 so many election-- not like two-party elections. 220 00:13:28,617 --> 00:13:32,407 So there were 6,000 political parties which participate in Indian-- 221 00:13:33,064 --> 00:13:36,449 I think 600 or something. 222 00:13:36,449 --> 00:13:38,826 So, anyway. 223 00:13:38,826 --> 00:13:40,336 So, yeah. 224 00:13:41,461 --> 00:13:44,531 And there were so many candidates, you can imagine. 
225 00:13:45,239 --> 00:13:48,949 And some of them have the same name. 226 00:13:49,522 --> 00:13:51,195 Like in one constituency, 227 00:13:51,195 --> 00:13:53,365 there were like three people with the same name. 228 00:13:53,365 --> 00:13:54,445 (laughs) 229 00:13:54,445 --> 00:13:56,633 So that was like a funny thing. 230 00:13:58,456 --> 00:14:04,213 But we completed those data-- uploading those data in Wikidata. 231 00:14:04,213 --> 00:14:07,501 Right now, only the 2014 Indian general election has been done. 232 00:14:07,501 --> 00:14:13,324 We don't have many users in Wikidata-- heavy-duty users in Wikidata in India. 233 00:14:13,805 --> 00:14:18,624 So currently we're uploading geoshape files of the constituencies. 234 00:14:19,036 --> 00:14:23,884 In West Bengal, we have already uploaded 43 constituencies, 235 00:14:23,884 --> 00:14:28,519 geoshape files of the constituencies, and also the [inaudible]. 236 00:14:28,519 --> 00:14:31,539 There is another part of India that has not been done, 237 00:14:31,539 --> 00:14:33,569 so when it will be completed then-- 238 00:14:33,569 --> 00:14:34,860 when it'll be-- 239 00:14:35,185 --> 00:14:38,861 when we upload other elections that are-- 240 00:14:38,861 --> 00:14:42,529 like 2009 or before that, 241 00:14:42,529 --> 00:14:45,588 we'll create cool animations. 242 00:14:45,588 --> 00:14:51,190 That's showing how the voters have changed their minds 243 00:14:51,190 --> 00:14:54,953 from like centrist to rightist or leftist to rightist, anyway. 244 00:14:54,953 --> 00:14:58,668 So in the pipeline, there are schools, 245 00:14:58,668 --> 00:15:02,818 bank branches, post offices, geoshapes, elections, and many more. 246 00:15:04,124 --> 00:15:06,164 - (man 1) Cinema. - Cinema, yeah. 247 00:15:06,164 --> 00:15:07,224 (laughs) 248 00:15:07,224 --> 00:15:09,039 - Of course, cinema. - And monuments. 249 00:15:09,039 --> 00:15:10,529 And monuments. 
250 00:15:11,211 --> 00:15:14,761 And most of them will be completed within a few months. 251 00:15:16,272 --> 00:15:21,872 And in the not so distant future, we'll try to upload weather data. 252 00:15:22,974 --> 00:15:27,764 There are not many good properties for weather, right now, in Wikidata, 253 00:15:28,274 --> 00:15:31,110 that's why we're not touching it right now, 254 00:15:31,110 --> 00:15:32,448 but we'll do it. 255 00:15:32,862 --> 00:15:35,988 Also bibliographical data 256 00:15:36,415 --> 00:15:40,215 for Indian literature is also very scarce in Wikidata. 257 00:15:41,485 --> 00:15:46,244 And there will be some institutional partnerships. 258 00:15:46,746 --> 00:15:48,606 There were some primary talks already, 259 00:15:48,606 --> 00:15:51,596 and maybe we'll have some good news in the future. 260 00:15:54,140 --> 00:15:55,550 So other ways to engage. 261 00:15:55,550 --> 00:16:00,013 We have created some subpages of WikiProject India. 262 00:16:00,406 --> 00:16:05,256 We have created a skillshare initiative-- started a skillshare initiative 263 00:16:05,256 --> 00:16:08,646 where people who have slightly more knowledge in Wikidata 264 00:16:08,646 --> 00:16:11,876 can share something with other people, 265 00:16:11,876 --> 00:16:16,756 on a one-to-one basis, either online or offline. 266 00:16:16,756 --> 00:16:20,073 We have also started a newsletter, a quarterly newsletter, 267 00:16:20,073 --> 00:16:25,739 the first issue has been published in October [2018], 268 00:16:26,431 --> 00:16:30,512 and we are showcasing cool visualizations on social media 269 00:16:30,512 --> 00:16:34,478 on the Facebook and Twitter channels of Wikidata India, every day. 270 00:16:34,990 --> 00:16:38,420 So these are the links. 271 00:16:38,420 --> 00:16:40,286 You can find them there. 272 00:16:41,602 --> 00:16:43,502 Thank you so much for the... 
273 00:16:45,483 --> 00:16:48,853 As most of you can already guess, 274 00:16:49,784 --> 00:16:54,514 Bodhi is from that part of India, the West Bengal, 275 00:16:54,514 --> 00:16:56,554 where they've done all that work. 276 00:16:57,040 --> 00:16:59,100 (laughs) 277 00:16:59,100 --> 00:17:04,752 So the West Bengali community in India has been really doing this amazing work, 278 00:17:04,752 --> 00:17:07,922 and this needs to go to other parts of India 279 00:17:07,922 --> 00:17:10,598 which need more capacity development, 280 00:17:10,598 --> 00:17:15,704 which need more trainings, also more coordination in India. 281 00:17:16,181 --> 00:17:19,591 And, okay, I would like to end this 282 00:17:19,591 --> 00:17:22,550 with how you can help in identifying some of the knowledge gaps 283 00:17:22,550 --> 00:17:24,765 and taking that conversation forward, 284 00:17:24,765 --> 00:17:26,685 which is not directly related with this topic. 285 00:17:26,685 --> 00:17:30,535 But there is a Wiki project, "Identifying knowledge gaps," 286 00:17:30,535 --> 00:17:34,225 you can join that and share your thoughts. 287 00:17:34,225 --> 00:17:38,859 We are also trying to use-- how can we use property P5008, 288 00:17:38,859 --> 00:17:42,749 which is on the focus list for a specific project-- 289 00:17:42,749 --> 00:17:48,149 how we can use that to surface certain topics for contest 290 00:17:48,149 --> 00:17:50,129 or other events. 291 00:17:50,898 --> 00:17:54,543 And in the end, we'd like to thank you. 292 00:17:55,248 --> 00:18:01,034 Also, we'd like to thank Asaf and Mahir and Tito 293 00:18:01,034 --> 00:18:05,941 who are another two power users of Wikidata. 294 00:18:05,941 --> 00:18:08,131 We'd like to sincerely thank everyone. 295 00:18:08,131 --> 00:18:09,731 Thank you so much. 296 00:18:09,731 --> 00:18:11,621 (applause) 297 00:18:15,014 --> 00:18:16,490 Questions. 298 00:18:17,003 --> 00:18:18,860 (woman 1) Mark here says, "Hi." 
299 00:18:18,860 --> 00:18:20,090 (laughs) 300 00:18:20,090 --> 00:18:23,089 (moderator) So we have only five minutes for questions and answers. 301 00:18:28,457 --> 00:18:30,137 There. There's a question there. 302 00:18:30,137 --> 00:18:31,777 (woman 1) Do I need the microphone? 303 00:18:33,626 --> 00:18:36,062 (woman 1) Thank you so much for your presentation. 304 00:18:36,062 --> 00:18:39,248 Is this census data-- what exactly kind of data is that, 305 00:18:39,248 --> 00:18:40,498 that you've been ingesting? 306 00:18:40,498 --> 00:18:42,930 It's not for individuals, is it? 307 00:18:42,930 --> 00:18:45,733 It's more like populations and stuff like that? 308 00:18:46,148 --> 00:18:49,538 It's population data, mainly. Demographic data. 309 00:18:49,842 --> 00:18:52,952 (woman 1) Are there any other things that have been asked in the census? 310 00:18:56,347 --> 00:18:59,257 (man 1) For village, gender-- 311 00:19:00,921 --> 00:19:04,251 (man 2) I was a little involved with that, so I remember what the data looks like. 312 00:19:04,251 --> 00:19:07,603 Per settlement in India, per village or town. 313 00:19:07,603 --> 00:19:11,926 You have the total population, the male versus female population, 314 00:19:12,481 --> 00:19:15,564 the literate versus illiterate population. 315 00:19:15,564 --> 00:19:18,944 Within that, you have also a separation by gender, 316 00:19:18,944 --> 00:19:20,964 so you know how many illiterate males there are 317 00:19:20,964 --> 00:19:23,093 versus how many illiterate females there are. 318 00:19:23,093 --> 00:19:24,603 It's actually quite detailed. 319 00:19:24,603 --> 00:19:28,923 There are hundreds and hundreds of pieces of data per village. 320 00:19:29,338 --> 00:19:33,228 Only some of them have been modeled on Wikidata. 321 00:19:36,937 --> 00:19:39,517 Just, of course, no individual census data. 322 00:19:41,990 --> 00:19:43,793 (woman 1) Sometimes countries get weird. 
323 00:19:45,108 --> 00:19:48,628 (woman 2) So I wanted to ask you about the label ingestion 324 00:19:48,628 --> 00:19:50,868 or the translations of labels you do. 325 00:19:51,622 --> 00:19:53,932 How did you do that? Do you use tools? 326 00:19:53,932 --> 00:19:56,582 How do you get people to add it in their native language 327 00:19:56,582 --> 00:19:58,367 and translate the labels. 328 00:19:59,425 --> 00:20:02,797 So, mostly TABernacle, 329 00:20:03,638 --> 00:20:06,927 and QuickStatements. 330 00:20:06,927 --> 00:20:08,297 Those we can use, QuickStatements. 331 00:20:08,297 --> 00:20:10,497 (woman 2) Alright. Cool. 332 00:20:10,497 --> 00:20:13,823 But also at the same time, like using labelathons as an activity 333 00:20:13,823 --> 00:20:17,530 to engage more and more people to do that activity. 334 00:20:19,757 --> 00:20:21,234 Asaf. 335 00:20:21,234 --> 00:20:22,286 The hero. 336 00:20:24,698 --> 00:20:26,238 (Asaf) A note on TABernacle. 337 00:20:26,238 --> 00:20:29,266 I just want to mention for anyone who may be not aware, 338 00:20:30,349 --> 00:20:33,399 all of us here use Wikidata-related tools 339 00:20:33,399 --> 00:20:36,607 which means all of us have used tools by Magnus, 340 00:20:36,607 --> 00:20:38,297 the amazing tool builder. 341 00:20:38,297 --> 00:20:41,317 I just wanted to point out that he's here at the conference. 342 00:20:41,317 --> 00:20:43,067 So if you haven't had a chance yet 343 00:20:43,067 --> 00:20:47,780 to thank him for his amazing work that enables so much impact-- 344 00:20:47,780 --> 00:20:48,817 do so today. 345 00:20:48,817 --> 00:20:51,757 I'm not sure he is into hugs, but you can just thank him. 346 00:20:51,757 --> 00:20:53,407 (laughs) 347 00:21:05,300 --> 00:21:08,455 (man 3) Was the skillshare working? 348 00:21:08,455 --> 00:21:11,297 What do you do? What are the results? 349 00:21:11,635 --> 00:21:13,935 So, the response is [still no]. 350 00:21:15,471 --> 00:21:21,766 But, yeah. 
We have five or six people have already requested, 351 00:21:21,766 --> 00:21:23,386 and we have completed those. 352 00:21:24,404 --> 00:21:25,614 (Satdeep) That's going on-- 353 00:21:25,614 --> 00:21:29,726 Like, we just need to surface the value of Wikidata. 354 00:21:29,726 --> 00:21:31,911 I think we haven't really been able to do that. 355 00:21:31,911 --> 00:21:34,841 Also, we haven't been able to connect with other projects 356 00:21:34,841 --> 00:21:36,532 that they are already doing, 357 00:21:36,532 --> 00:21:38,602 like, for example, Wikisource or Wikipedia. 358 00:21:38,602 --> 00:21:42,402 Like how we need to communicate that in a better way 359 00:21:42,402 --> 00:21:45,312 to the larger community who is contributing. 360 00:21:45,312 --> 00:21:49,312 It was just like getting up and creating a Wiki periodical. 361 00:21:49,312 --> 00:21:52,102 Like how do we involve them and bring them here. 362 00:21:52,102 --> 00:21:53,886 That's still a problem. 363 00:21:54,378 --> 00:21:56,810 And Bodhi is showing the census data. 364 00:21:56,810 --> 00:21:58,574 Bodhi, can you please explain? 365 00:22:02,150 --> 00:22:07,213 (Bodhi) So this is population data from the 2011 census, 366 00:22:07,213 --> 00:22:12,486 5007 in 2001 in the census data. 367 00:22:13,126 --> 00:22:14,709 This is one village. 368 00:22:14,709 --> 00:22:17,824 So there are like 36,000 villages or 40,000 villages. 369 00:22:18,520 --> 00:22:21,022 This is the male population, female population, 370 00:22:21,829 --> 00:22:23,129 number of households, 371 00:22:23,129 --> 00:22:28,050 illiterate population with male, female, population qualifiers, 372 00:22:29,358 --> 00:22:31,648 literate population and illiterate populations, 373 00:22:31,648 --> 00:22:32,798 and so on. 374 00:22:32,798 --> 00:22:35,702 And this is the census code for 2001 and 2011. 375 00:22:40,319 --> 00:22:43,557 (woman 3) Okay. 
I just want to say that I loved your presentation, 376 00:22:43,557 --> 00:22:47,447 and I wanted to talk nearly about the same thing tomorrow, 377 00:22:47,447 --> 00:22:51,650 so it'll be great because tomorrow-- I will just [stay] watch from this one, 378 00:22:51,650 --> 00:22:54,230 so making my life easier. 379 00:22:55,544 --> 00:22:58,446 What I wanted to do or to talk about-- 380 00:22:58,446 --> 00:23:02,276 but I think the WikiProject you're starting on Wikidata 381 00:23:02,276 --> 00:23:03,671 will do that-- 382 00:23:03,671 --> 00:23:07,671 is how to engage people not working on India directly, 383 00:23:07,671 --> 00:23:12,607 but like I have tools, names, but I don't deal with Indian names 384 00:23:13,139 --> 00:23:17,139 because I am not sure I understand all there is to them, 385 00:23:17,139 --> 00:23:19,725 and I don't want to do something massively wrong, 386 00:23:19,725 --> 00:23:21,375 so better to be careful. 387 00:23:21,375 --> 00:23:26,624 But I just need to ask someone who understands all the problems, 388 00:23:26,624 --> 00:23:28,984 and I can add an automated tool 389 00:23:28,984 --> 00:23:32,811 and deal with thousands upon thousands of items. 390 00:23:33,047 --> 00:23:35,727 And I think there are many, many tools 391 00:23:35,727 --> 00:23:39,687 already doing some automated description and things like that 392 00:23:39,687 --> 00:23:46,104 for which we don't actually need people every day, 393 00:23:46,104 --> 00:23:50,918 we just need like 10 minutes' time for someone to tell me 394 00:23:50,918 --> 00:23:54,778 or to say family names in those languages, 395 00:23:54,778 --> 00:23:57,278 and then it's just added to the tool. 
396 00:23:57,278 --> 00:24:01,917 And you probably know [automated] description tool, 397 00:24:02,520 --> 00:24:06,750 but if you just ask the people who are using it massively 398 00:24:06,750 --> 00:24:08,810 to just add Indian languages, 399 00:24:08,810 --> 00:24:12,810 then you have all Wikidatans doing the same work for you, 400 00:24:12,810 --> 00:24:15,409 and actually, it is a problem. 401 00:24:16,969 --> 00:24:20,970 I am helping an African community build up their Wikidata 402 00:24:20,970 --> 00:24:23,819 in Wikipedia, so it's not the same problem, 403 00:24:23,819 --> 00:24:25,737 but nearly the same problem. 404 00:24:26,317 --> 00:24:29,347 And that's the problem we have 405 00:24:29,347 --> 00:24:32,257 which is actually bridging the gap 406 00:24:32,257 --> 00:24:35,119 between the biggest Wikidatans-- 407 00:24:35,813 --> 00:24:39,313 I am doing works in languages I don't know a word of, 408 00:24:40,843 --> 00:24:44,526 but it's this kind of adoption system, 409 00:24:44,526 --> 00:24:48,566 like I need a native speaker to tell me 410 00:24:48,566 --> 00:24:51,854 what I can do with all the problems on all the complicated cases. 411 00:24:51,854 --> 00:24:55,752 And everything that I can automate, I will automate. 412 00:24:55,752 --> 00:24:59,752 And it's just an idea, but do you think it will be like 413 00:24:59,752 --> 00:25:04,752 a good idea to create not so specific Wiki knowledge gap on Wikidata, 414 00:25:04,752 --> 00:25:09,462 but a matching system 415 00:25:09,462 --> 00:25:15,821 like, "Hey I am working on this subject, do you want to ask me for that?" 416 00:25:15,821 --> 00:25:20,521 - Like, yeah, a matching tool, like to-- - Connect people. 417 00:25:20,521 --> 00:25:22,980 - (woman 3) To connect people across languages. 418 00:25:24,119 --> 00:25:26,910 Yeah. 
So that was my idea because I think 419 00:25:27,253 --> 00:25:30,273 some of the African communities I am helping, 420 00:25:30,273 --> 00:25:33,177 would really, really love what you're doing, 421 00:25:33,177 --> 00:25:39,253 but none of them speak Indian, and we just need to have pivot people 422 00:25:39,746 --> 00:25:41,266 to create the link 423 00:25:41,523 --> 00:25:44,504 and make all this even more powerful. 424 00:25:44,504 --> 00:25:47,341 And I really, really love what you're doing. So thank you. 425 00:25:47,341 --> 00:25:48,551 Thank you so much. 426 00:25:48,551 --> 00:25:50,619 Thanks to Bodhi for all the awesome work. 427 00:25:51,341 --> 00:25:52,728 (laughs) 428 00:25:52,728 --> 00:25:54,498 And the larger Indian community. 429 00:25:54,498 --> 00:25:57,113 But that's a really good idea, I think we should take that up. 430 00:25:57,683 --> 00:26:03,090 As a movement, we have not been doing the sharing thing pretty good. 431 00:26:03,090 --> 00:26:04,840 We need to figure out how to do that. 432 00:26:04,840 --> 00:26:06,727 Because there are awesome tools, 433 00:26:06,727 --> 00:26:09,028 one is built, but the others don't know about. 434 00:26:09,028 --> 00:26:10,864 That's a larger problem, 435 00:26:10,864 --> 00:26:14,197 and that's a piece that fits into the larger problem. 436 00:26:14,197 --> 00:26:16,061 We should be solving someplace. 437 00:26:16,061 --> 00:26:18,050 Let's figure out where we can do that. 438 00:26:19,421 --> 00:26:20,867 Thank you. 439 00:26:20,867 --> 00:26:23,027 (applause)