0 00:00:00,000 --> 00:00:30,000 Dear viewer, these subtitles were generated by a machine via the service Trint and therefore are (very) buggy. If you are capable, please help us to create good quality subtitles: https://c3subtitles.de/talk/896 Thanks! 1 00:00:15,700 --> 00:00:17,170 On my left side, there is 2 00:00:18,220 --> 00:00:20,619 Dawn Walker. She is 3 00:00:20,620 --> 00:00:22,539 a member of the 4 00:00:22,540 --> 00:00:25,089 Environmental Data and 5 00:00:25,090 --> 00:00:27,459 Governance Initiative, 6 00:00:27,460 --> 00:00:28,579 it's called EDGI, 7 00:00:29,650 --> 00:00:32,048 and 8 00:00:32,049 --> 00:00:34,359 Ph.D. student at the Faculty 9 00:00:34,360 --> 00:00:36,369 of Information of Toronto. 10 00:00:38,080 --> 00:00:40,629 Her talk is "Ensuring Climate 11 00:00:40,630 --> 00:00:42,759 Data Remains Public", 12 00:00:42,760 --> 00:00:44,529 and it's something like the Data 13 00:00:44,530 --> 00:00:46,659 Liberation Front helping us 14 00:00:46,660 --> 00:00:48,789 in these times to get all the 15 00:00:48,790 --> 00:00:51,339 information for all people 16 00:00:51,340 --> 00:00:53,889 on the long distance. Please give her 17 00:00:53,890 --> 00:00:55,000 a warm applause. 18 00:01:02,900 --> 00:01:04,339 Hi, everyone. 19 00:01:04,340 --> 00:01:07,489 Is that on? OK. 20 00:01:07,490 --> 00:01:09,199 I hope you're enjoying your Congress so 21 00:01:09,200 --> 00:01:11,269 far. So, like 22 00:01:11,270 --> 00:01:13,339 I was introduced, this talk 23 00:01:13,340 --> 00:01:16,069 is "Ensuring Climate Data Remains Public", 24 00:01:16,070 --> 00:01:17,599 and I'll speak to the question of 25 00:01:17,600 --> 00:01:19,549 how we keep important environmental and 26 00:01:19,550 --> 00:01:21,769 climate data accessible 27 00:01:21,770 --> 00:01:24,379 amidst political instability and risk.
28 00:01:24,380 --> 00:01:26,269 In particular, in this past year, I think 29 00:01:26,270 --> 00:01:28,459 many of us have been paying attention to 30 00:01:28,460 --> 00:01:29,719 the United States. And I'll speak to 31 00:01:29,720 --> 00:01:31,939 recent data preservation efforts 32 00:01:31,940 --> 00:01:32,940 there. 33 00:01:33,860 --> 00:01:36,349 So the plan is to have an intro 34 00:01:36,350 --> 00:01:38,539 of why I'm talking here today, a bit about 35 00:01:38,540 --> 00:01:40,759 what makes now a pressing moment, 36 00:01:40,760 --> 00:01:43,069 a kind of whirlwind tour of efforts 37 00:01:43,070 --> 00:01:45,199 to identify, preserve and rethink 38 00:01:45,200 --> 00:01:47,539 access to climate data and then hopefully 39 00:01:47,540 --> 00:01:49,879 some sort of rousing call about futures 40 00:01:49,880 --> 00:01:51,439 for climate and environmental data. 41 00:01:52,490 --> 00:01:54,019 This isn't work I've been doing alone. 42 00:01:54,020 --> 00:01:55,579 I'll speak about many projects and 43 00:01:55,580 --> 00:01:57,979 organizations that thousands of people, 44 00:01:57,980 --> 00:02:00,079 variously coordinated, have 45 00:02:00,080 --> 00:02:01,159 worked on. 46 00:02:01,160 --> 00:02:02,659 And if you leave with one impression, I 47 00:02:02,660 --> 00:02:04,429 hope it's that climate science, climate 48 00:02:04,430 --> 00:02:06,589 data collection and the use 49 00:02:06,590 --> 00:02:08,478 of archiving and grassroots organizing 50 00:02:08,479 --> 00:02:10,279 around data are all collaborative 51 00:02:10,280 --> 00:02:11,479 efforts. 52 00:02:11,480 --> 00:02:13,059 My plan was to try and leave room for one 53 00:02:13,060 --> 00:02:14,719 to two burning questions, but I'm really 54 00:02:14,720 --> 00:02:16,819 more than happy to talk afterwards 55 00:02:16,820 --> 00:02:18,109 here by the stage. 56 00:02:18,110 --> 00:02:19,969 And I have stickers, so please find me. 57 00:02:19,970 --> 00:02:20,970 I want to give them to you.
58 00:02:23,600 --> 00:02:24,600 OK, 59 00:02:25,790 --> 00:02:26,929 great. 60 00:02:26,930 --> 00:02:28,999 So first, I'm not an expert 61 00:02:29,000 --> 00:02:30,799 and actually climate science isn't my 62 00:02:30,800 --> 00:02:32,149 background. 63 00:02:32,150 --> 00:02:34,129 I'm a PhD student really interested in 64 00:02:34,130 --> 00:02:35,959 how designing with a framework of data 65 00:02:35,960 --> 00:02:38,689 justice ensures more equitable outcomes, 66 00:02:38,690 --> 00:02:40,279 both in forms of data collected and 67 00:02:40,280 --> 00:02:42,379 access to technologies. To try and 68 00:02:42,380 --> 00:02:43,279 think through this, 69 00:02:43,280 --> 00:02:45,409 I've been looking to those actively using 70 00:02:45,410 --> 00:02:47,389 data to try and push for otherwise. 71 00:02:47,390 --> 00:02:50,269 This takes the form of DIY science, 72 00:02:50,270 --> 00:02:51,919 counter-mapping and, increasingly, 73 00:02:51,920 --> 00:02:53,599 decentralized web projects. 74 00:02:53,600 --> 00:02:55,009 I actually got involved with thinking 75 00:02:55,010 --> 00:02:57,559 about climate data somewhat circuitously, 76 00:02:57,560 --> 00:02:59,539 as a member of a local civic tech meetup, 77 00:02:59,540 --> 00:03:01,879 shout out to Civic Tech Toronto, 78 00:03:01,880 --> 00:03:03,829 that served as a meeting space and anchor 79 00:03:03,830 --> 00:03:05,899 for many of these early efforts. 80 00:03:07,260 --> 00:03:09,169 And so what's EDGI? EDGI, which is kind 81 00:03:09,170 --> 00:03:10,969 of a mouthful, is the Environmental Data and 82 00:03:10,970 --> 00:03:13,039 Governance Initiative, a distributed, 83 00:03:13,040 --> 00:03:15,109 consensus-based organization of more 84 00:03:15,110 --> 00:03:17,599 than 150 scholars, organizers 85 00:03:17,600 --> 00:03:19,039 and nonprofit groups.
86 00:03:19,040 --> 00:03:21,019 EDGI was formed from an email thread 87 00:03:21,020 --> 00:03:23,119 that started in November 2016 in 88 00:03:23,120 --> 00:03:25,219 the immediate wake of the US 89 00:03:25,220 --> 00:03:26,749 presidential elections. 90 00:03:26,750 --> 00:03:28,909 For a little over a year now, 91 00:03:28,910 --> 00:03:30,829 we've been documenting, contextualizing 92 00:03:30,830 --> 00:03:32,599 and analyzing changes to environmental 93 00:03:32,600 --> 00:03:34,669 data and governance 94 00:03:34,670 --> 00:03:35,930 practices in the US. 95 00:03:37,850 --> 00:03:40,009 I've tried to include at least a portion 96 00:03:40,010 --> 00:03:41,179 of the people who've been involved in 97 00:03:41,180 --> 00:03:43,339 EDGI projects on the slide, but many 98 00:03:43,340 --> 00:03:44,340 more exist. 99 00:03:45,590 --> 00:03:47,929 So first, to unpack the data 100 00:03:47,930 --> 00:03:49,459 infrastructures of climate and 101 00:03:49,460 --> 00:03:51,799 environment a bit more: climate 102 00:03:51,800 --> 00:03:53,329 science and environmental data rely on a 103 00:03:53,330 --> 00:03:55,129 collaborative, often state-supported 104 00:03:55,130 --> 00:03:56,329 research infrastructure. 105 00:03:56,330 --> 00:03:58,219 I think there have been many talks earlier 106 00:03:58,220 --> 00:03:59,569 in Congress that have highlighted how 107 00:03:59,570 --> 00:04:01,369 data contributes to knowledge about 108 00:04:01,370 --> 00:04:03,049 climate change: climate modeling, 109 00:04:03,050 --> 00:04:05,119 satellites, building our 110 00:04:05,120 --> 00:04:06,949 own DIY satellite ground station network, 111 00:04:06,950 --> 00:04:08,329 which I now want to get a ground station 112 00:04:08,330 --> 00:04:09,319 up on. 113 00:04:09,320 --> 00:04:11,179 And so I would check 114 00:04:11,180 --> 00:04:12,180 those out for examples.
115 00:04:13,460 --> 00:04:14,779 But I kind of just want to stress the 116 00:04:14,780 --> 00:04:17,389 sort of coordinated global scale 117 00:04:17,390 --> 00:04:19,819 of this collection and processing, 118 00:04:19,820 --> 00:04:21,648 something that scholar Paul Edwards has 119 00:04:21,649 --> 00:04:22,939 described as a global knowledge 120 00:04:22,940 --> 00:04:24,859 infrastructure making global data. 121 00:04:26,540 --> 00:04:27,559 In the United States, 122 00:04:27,560 --> 00:04:29,389 at the federal level, there are a handful 123 00:04:29,390 --> 00:04:31,579 of agencies, departments and institutions 124 00:04:31,580 --> 00:04:33,439 involved with the creation and publishing 125 00:04:33,440 --> 00:04:36,019 of this data: NOAA, USGS, 126 00:04:36,020 --> 00:04:38,629 NASA, DOE, EPA and more. 127 00:04:38,630 --> 00:04:40,339 In addition to these are research 128 00:04:40,340 --> 00:04:42,679 institutions like Columbia University, 129 00:04:42,680 --> 00:04:44,029 where the Center for International Earth 130 00:04:44,030 --> 00:04:45,769 Science Information Network is based. 131 00:04:47,880 --> 00:04:49,589 Given the coordinated collection and 132 00:04:49,590 --> 00:04:50,969 holding of this data, there's certainly 133 00:04:50,970 --> 00:04:53,219 no singular form of public access, 134 00:04:53,220 --> 00:04:54,929 but publishing of data and data products 135 00:04:54,930 --> 00:04:56,489 has been increasingly public through a 136 00:04:56,490 --> 00:04:59,099 combination of policies, portals, 137 00:04:59,100 --> 00:05:01,109 libraries and archives and open 138 00:05:01,110 --> 00:05:03,449 government data initiatives. In the U.S.,
139 00:05:03,450 --> 00:05:05,789 under Title 17, Section 105, 140 00:05:05,790 --> 00:05:07,469 most data, with some exemptions, is 141 00:05:07,470 --> 00:05:09,059 considered a work of the US government 142 00:05:09,060 --> 00:05:11,429 and therefore in the public domain. 143 00:05:11,430 --> 00:05:13,019 Historical climate and environmental 144 00:05:13,020 --> 00:05:14,579 data is critically important to 145 00:05:14,580 --> 00:05:16,949 contextualize and understand current 146 00:05:16,950 --> 00:05:18,329 observed phenomena. 147 00:05:18,330 --> 00:05:20,309 However, in addition to the data itself, 148 00:05:20,310 --> 00:05:22,499 there are reports, summaries and analyses 149 00:05:22,500 --> 00:05:24,029 that really open up the topic to a 150 00:05:24,030 --> 00:05:24,939 broader audience. 151 00:05:24,940 --> 00:05:26,369 And I kind of consider myself that 152 00:05:26,370 --> 00:05:28,469 broader audience, sort 153 00:05:28,470 --> 00:05:30,360 of beyond those with domain expertise. 154 00:05:32,340 --> 00:05:33,779 So I kind of just want to pause for a 155 00:05:33,780 --> 00:05:35,459 moment here and untangle climate and 156 00:05:35,460 --> 00:05:36,529 environmental data. 157 00:05:37,530 --> 00:05:39,119 People can use them interchangeably, and 158 00:05:39,120 --> 00:05:40,979 I've been doing so right now. 159 00:05:40,980 --> 00:05:42,659 But I think there are some differences in 160 00:05:42,660 --> 00:05:44,039 the way certain communities use them that 161 00:05:44,040 --> 00:05:45,119 are important.
162 00:05:45,120 --> 00:05:47,279 So in many cases, when people 163 00:05:47,280 --> 00:05:48,329 say climate data, they're really 164 00:05:48,330 --> 00:05:50,009 referring to atmospheric, weather and 165 00:05:50,010 --> 00:05:51,449 hydrologic conditions data, 166 00:05:52,530 --> 00:05:54,239 whereas when people say environmental 167 00:05:54,240 --> 00:05:56,369 data, they're often explicitly 168 00:05:56,370 --> 00:05:57,689 referring to environmental health and 169 00:05:57,690 --> 00:05:59,699 hazard. This includes air and water 170 00:05:59,700 --> 00:06:01,799 quality, toxics and pollutants as well as 171 00:06:01,800 --> 00:06:02,800 waste. 172 00:06:03,690 --> 00:06:05,249 So both are really vital to help 173 00:06:05,250 --> 00:06:06,719 characterize and navigate our 174 00:06:06,720 --> 00:06:08,519 relationship to our environments. 175 00:06:08,520 --> 00:06:10,409 But I think at times they can maybe feel 176 00:06:10,410 --> 00:06:12,539 at different scales. And so this is the 177 00:06:12,540 --> 00:06:15,629 first of two terrible attempts 178 00:06:15,630 --> 00:06:17,120 to position them against each other. 179 00:06:19,380 --> 00:06:21,389 So access to climate data and methods has 180 00:06:21,390 --> 00:06:23,159 already faced challenges prior to the 181 00:06:23,160 --> 00:06:25,349 past year, in many cases from 182 00:06:25,350 --> 00:06:27,479 those disputing global warming. 183 00:06:27,480 --> 00:06:29,429 And this has led to motivated targeting 184 00:06:29,430 --> 00:06:31,289 of climate scientists and their data sets, 185 00:06:31,290 --> 00:06:33,749 in some cases with financial 186 00:06:33,750 --> 00:06:35,339 support from lobby groups.
187 00:06:35,340 --> 00:06:36,809 One of the more well-known examples is 188 00:06:36,810 --> 00:06:38,609 the hockey stick controversy, where a 189 00:06:38,610 --> 00:06:40,469 graph showing gradual cooling and 190 00:06:40,470 --> 00:06:42,689 then recent rapid warming, roughly 191 00:06:42,690 --> 00:06:44,219 resembling a hockey stick, was 192 00:06:44,220 --> 00:06:46,259 highlighted in an Intergovernmental Panel 193 00:06:46,260 --> 00:06:47,819 on Climate Change report. 194 00:06:47,820 --> 00:06:49,859 Since it was published, in subsequent 195 00:06:49,860 --> 00:06:52,019 years, the results 196 00:06:52,020 --> 00:06:53,789 have been replicated numerous times with 197 00:06:53,790 --> 00:06:55,229 different and additional data. 198 00:06:55,230 --> 00:06:57,239 But at that time, the results were new 199 00:06:57,240 --> 00:06:58,859 and compelling. 200 00:06:58,860 --> 00:07:00,899 As a result, they were then disputed. 201 00:07:00,900 --> 00:07:02,519 Michael Mann and his colleagues wound up 202 00:07:02,520 --> 00:07:04,439 personally targeted online, subject to 203 00:07:04,440 --> 00:07:05,939 Freedom of Information Act requests and 204 00:07:05,940 --> 00:07:07,559 drawn into court proceedings that lasted 205 00:07:07,560 --> 00:07:08,560 many years. 206 00:07:09,570 --> 00:07:10,859 There are more examples, but in the 207 00:07:10,860 --> 00:07:12,269 interest of time, I'll have to skip them. 208 00:07:12,270 --> 00:07:14,459 But I think maybe the other most visible 209 00:07:14,460 --> 00:07:16,589 one would be the 2009 Climategate 210 00:07:16,590 --> 00:07:17,939 email leaks.
211 00:07:17,940 --> 00:07:20,369 My sense is that before 2017, 212 00:07:20,370 --> 00:07:22,289 this form of targeting would have 213 00:07:22,290 --> 00:07:24,689 been identified as the most likely public 214 00:07:24,690 --> 00:07:26,999 risk to climate science: a sort of way 215 00:07:27,000 --> 00:07:29,189 to introduce doubt around climate change 216 00:07:29,190 --> 00:07:30,929 in public opinion through concerted 217 00:07:30,930 --> 00:07:33,089 efforts to discredit results or 218 00:07:33,090 --> 00:07:34,090 scientists. 219 00:07:35,670 --> 00:07:37,109 Just want to say one more time, 220 00:07:37,110 --> 00:07:38,049 shout out to Paul Edwards. 221 00:07:38,050 --> 00:07:39,879 His discussion of environmental data 222 00:07:39,880 --> 00:07:41,489 systems under siege is really 223 00:07:41,490 --> 00:07:42,779 instructive. 224 00:07:42,780 --> 00:07:44,339 In his book, A Vast Machine, as well as 225 00:07:44,340 --> 00:07:46,199 more recent research, he kind of unpacks 226 00:07:46,200 --> 00:07:48,059 the history of climate data. 227 00:07:48,060 --> 00:07:50,819 And I think 228 00:07:50,820 --> 00:07:52,559 his work and these previous 229 00:07:52,560 --> 00:07:54,239 examples raise important 230 00:07:54,240 --> 00:07:56,669 questions about access to climate data. 231 00:07:56,670 --> 00:07:58,549 What Mann's opponents and climate change 232 00:07:58,550 --> 00:08:00,719 skeptics said they wanted in many cases 233 00:08:00,720 --> 00:08:03,029 was the raw data or, like, a full 234 00:08:03,030 --> 00:08:04,030 record. 235 00:08:04,800 --> 00:08:07,079 And in one case in particular, 236 00:08:07,080 --> 00:08:08,609 a project sought to actually audit the 237 00:08:08,610 --> 00:08:10,109 siting of surface temperature 238 00:08:10,110 --> 00:08:11,519 instruments.
239 00:08:11,520 --> 00:08:14,009 However, I think it's important to 240 00:08:14,010 --> 00:08:15,839 note, and something that scientists noted 241 00:08:15,840 --> 00:08:17,909 at the time, how necessary context is 242 00:08:17,910 --> 00:08:19,019 to interpreting data. 243 00:08:19,020 --> 00:08:20,549 And I think we need to better understand 244 00:08:20,550 --> 00:08:21,959 that when working with complex data, 245 00:08:21,960 --> 00:08:23,699 including climate and environmental data. 246 00:08:26,710 --> 00:08:28,089 This moment in particular. 247 00:08:29,960 --> 00:08:32,239 On November 8th, 2016, Donald Trump 248 00:08:32,240 --> 00:08:33,240 was elected. 249 00:08:34,490 --> 00:08:36,048 For many people, there was an immediate 250 00:08:36,049 --> 00:08:37,609 sense that we have to be ready, we have 251 00:08:37,610 --> 00:08:38,779 to do something. 252 00:08:38,780 --> 00:08:40,579 Scientists, environmentalists and 253 00:08:40,580 --> 00:08:42,709 environmental justice organizers saw 254 00:08:42,710 --> 00:08:44,299 statements made during the campaign as 255 00:08:44,300 --> 00:08:46,069 indicating that climate and environmental 256 00:08:46,070 --> 00:08:48,139 data infrastructures could be at risk 257 00:08:48,140 --> 00:08:49,669 and actively targeted. 258 00:08:49,670 --> 00:08:52,039 But this isn't the same as the above. 259 00:08:52,040 --> 00:08:54,199 Instead, the 260 00:08:54,200 --> 00:08:56,269 risk is: how do you ensure continued 261 00:08:56,270 --> 00:08:58,429 access to data about climate and the 262 00:08:58,430 --> 00:08:59,989 environment when the supporting 263 00:08:59,990 --> 00:09:02,119 institutions may no longer be able 264 00:09:02,120 --> 00:09:03,949 or willing to?
265 00:09:03,950 --> 00:09:05,599 However, many from an environmental 266 00:09:05,600 --> 00:09:07,849 justice background have long recognized 267 00:09:07,850 --> 00:09:09,919 existing environmental data 268 00:09:09,920 --> 00:09:11,629 structures as imperfect, 269 00:09:11,630 --> 00:09:13,759 for example, in cases where it's relying 270 00:09:13,760 --> 00:09:15,979 upon industry-reported data or 271 00:09:15,980 --> 00:09:18,049 is non-representative of 272 00:09:18,050 --> 00:09:19,909 communities' embodied 273 00:09:19,910 --> 00:09:22,069 experience of pollution 274 00:09:22,070 --> 00:09:22,999 and toxics. 275 00:09:23,000 --> 00:09:25,069 This put people into a position 276 00:09:25,070 --> 00:09:27,109 of concern for the preservation of 277 00:09:27,110 --> 00:09:29,749 imperfect data to avoid an alternative 278 00:09:29,750 --> 00:09:30,750 of no data. 279 00:09:33,330 --> 00:09:34,739 But wait, you may have been thinking this 280 00:09:34,740 --> 00:09:36,689 whole time: aren't you from Toronto, and 281 00:09:36,690 --> 00:09:38,099 isn't Toronto in Canada? 282 00:09:38,100 --> 00:09:39,149 It is. I am. 283 00:09:40,380 --> 00:09:42,539 However, Canadians experienced 284 00:09:42,540 --> 00:09:45,119 kind of our own mobilizing 285 00:09:45,120 --> 00:09:47,489 moment under our previous prime minister, 286 00:09:47,490 --> 00:09:49,169 Stephen Harper. 287 00:09:49,170 --> 00:09:51,809 And I think this highlighted 288 00:09:51,810 --> 00:09:54,119 the new form of threat to climate and 289 00:09:54,120 --> 00:09:56,459 environmental data infrastructure.
290 00:09:56,460 --> 00:09:58,649 Stephen Harper was able to really quickly 291 00:09:58,650 --> 00:10:00,839 and successfully implement an agenda 292 00:10:00,840 --> 00:10:02,279 of systematically undercutting 293 00:10:02,280 --> 00:10:03,539 environmental and climate research 294 00:10:03,540 --> 00:10:05,639 budgets, closing labs including 295 00:10:05,640 --> 00:10:08,159 an Arctic research station, 296 00:10:08,160 --> 00:10:10,739 weakening government environmental 297 00:10:10,740 --> 00:10:12,179 regulations, and then shutting down 298 00:10:12,180 --> 00:10:14,099 libraries and reducing historical 299 00:10:14,100 --> 00:10:16,409 periodical and record collections. 300 00:10:16,410 --> 00:10:18,389 The speed and immediate impact, I think, 301 00:10:18,390 --> 00:10:19,919 served as a rallying moment and 302 00:10:19,920 --> 00:10:21,989 highlighted facets of vulnerability that 303 00:10:21,990 --> 00:10:23,490 many had not been considering. 304 00:10:26,500 --> 00:10:27,790 So: do something. 305 00:10:29,170 --> 00:10:31,269 For EDGI members, that something 306 00:10:31,270 --> 00:10:33,429 quickly became preserving existing 307 00:10:33,430 --> 00:10:35,349 federal environmental data through 308 00:10:35,350 --> 00:10:37,119 helping facilitate grassroots archiving 309 00:10:37,120 --> 00:10:39,519 efforts, monitoring changes to federal 310 00:10:39,520 --> 00:10:41,739 websites, and documenting the political 311 00:10:41,740 --> 00:10:43,749 transition through interviews and timely 312 00:10:43,750 --> 00:10:44,979 academic analysis. 313 00:10:48,290 --> 00:10:50,719 Between December 2016 and June 314 00:10:50,720 --> 00:10:52,939 2017, local organizers 315 00:10:52,940 --> 00:10:55,009 hosted 49 data rescue events 316 00:10:55,010 --> 00:10:56,269 in cities across the U.S.
317 00:10:56,270 --> 00:10:58,429 and Canada with support from EDGI and 318 00:10:58,430 --> 00:11:00,259 the Data Refuge project at the University 319 00:11:00,260 --> 00:11:01,399 of Pennsylvania. 320 00:11:01,400 --> 00:11:03,199 At events ranging in size from a couple 321 00:11:03,200 --> 00:11:05,479 dozen to over two hundred, people gathered 322 00:11:05,480 --> 00:11:07,759 to nominate key federal environmental 323 00:11:07,760 --> 00:11:09,349 data sets for archiving as part of the 324 00:11:09,350 --> 00:11:11,509 Internet Archive's preexisting 325 00:11:11,510 --> 00:11:12,889 End of Term crawl. 326 00:11:12,890 --> 00:11:15,199 In addition, attendees strategically 327 00:11:15,200 --> 00:11:17,419 organized how to deal with links and data 328 00:11:17,420 --> 00:11:18,769 sets that could not be preserved through 329 00:11:18,770 --> 00:11:19,850 automated methods. 330 00:11:20,900 --> 00:11:22,819 At these events, attendees nominated over 331 00:11:22,820 --> 00:11:24,889 63,000 web pages as seeds 332 00:11:24,890 --> 00:11:26,689 for subsequent crawling. 333 00:11:26,690 --> 00:11:28,489 However, and it's hard not to go into, 334 00:11:28,490 --> 00:11:30,229 like, a really extended conversation about 335 00:11:30,230 --> 00:11:31,849 crawler software here, which I'm probably 336 00:11:31,850 --> 00:11:33,439 not the best person to do, 337 00:11:33,440 --> 00:11:35,149 crawler software is not actually easily 338 00:11:35,150 --> 00:11:37,249 able to fully archive and discover links 339 00:11:37,250 --> 00:11:39,259 to data sets and web pages on all sites, 340 00:11:39,260 --> 00:11:41,329 partially because of underlying 341 00:11:41,330 --> 00:11:42,859 web development practices and Internet 342 00:11:42,860 --> 00:11:44,449 infrastructure and partially because of 343 00:11:44,450 --> 00:11:46,699 resource and storage constraints.
344 00:11:46,700 --> 00:11:48,889 So in addition, more than 22,000 345 00:11:48,890 --> 00:11:50,479 data sets were 346 00:11:50,480 --> 00:11:52,219 identified as candidates for non- 347 00:11:52,220 --> 00:11:54,679 automated preservation, since 348 00:11:54,680 --> 00:11:56,569 we deemed them as not able to be 349 00:11:56,570 --> 00:11:58,609 successfully crawled, several hundred of 350 00:11:58,610 --> 00:11:59,839 which went through a workflow of 351 00:11:59,840 --> 00:12:02,029 developing custom solutions to scrape 352 00:12:02,030 --> 00:12:04,519 links and data sets and upload them to 353 00:12:04,520 --> 00:12:06,739 a Data Refuge repository using an open 354 00:12:06,740 --> 00:12:07,740 source toolkit. 355 00:12:09,230 --> 00:12:10,789 I'm going to use the benefit of hindsight 356 00:12:10,790 --> 00:12:13,009 now to sort of avoid falling into 357 00:12:13,010 --> 00:12:14,809 a narrative that portrays us as underdogs 358 00:12:14,810 --> 00:12:16,639 or as having alone accomplished this 359 00:12:16,640 --> 00:12:18,080 project of a massive scale. 360 00:12:19,520 --> 00:12:21,049 As people in many cases without the 361 00:12:21,050 --> 00:12:23,149 expertise of digital 362 00:12:23,150 --> 00:12:25,099 preservation and archiving or a long 363 00:12:25,100 --> 00:12:27,349 track record in it, we didn't 364 00:12:27,350 --> 00:12:29,449 fully appreciate the scale and, along 365 00:12:29,450 --> 00:12:32,209 the way, rediscovered, and 366 00:12:32,210 --> 00:12:34,339 I want to stress rediscovered, longstanding issues 367 00:12:34,340 --> 00:12:36,529 with archiving and digital preservation 368 00:12:36,530 --> 00:12:39,199 that many groups are already navigating.
369 00:12:39,200 --> 00:12:41,149 So rather than forging ahead alone, we 370 00:12:41,150 --> 00:12:42,919 quickly found affinities with existing 371 00:12:42,920 --> 00:12:45,469 advocates, projects and institutions, 372 00:12:45,470 --> 00:12:47,029 many of which have been operating in 373 00:12:47,030 --> 00:12:48,829 this space for a long time. 374 00:12:48,830 --> 00:12:50,629 So in addition to us and Data Refuge, 375 00:12:50,630 --> 00:12:52,579 Climate Mirror, Project Azimuth and the 376 00:12:52,580 --> 00:12:54,619 Archive Team, who had existed for years 377 00:12:54,620 --> 00:12:56,929 prior, also became rallying projects 378 00:12:56,930 --> 00:12:58,399 for people who wanted to quickly organize 379 00:12:58,400 --> 00:13:00,409 around preserving data. 380 00:13:00,410 --> 00:13:02,959 I'm just going to mention three projects. 381 00:13:02,960 --> 00:13:05,089 There are way too many, but I want 382 00:13:05,090 --> 00:13:06,709 to untangle just some things that I think 383 00:13:06,710 --> 00:13:08,239 are interesting around access, coverage 384 00:13:08,240 --> 00:13:10,789 and risk. So first, Internet Archive. 385 00:13:11,810 --> 00:13:13,519 Internet Archive is an unparalleled 386 00:13:13,520 --> 00:13:15,319 resource for web archiving. 387 00:13:15,320 --> 00:13:17,209 In this particular case, with just the End of 388 00:13:17,210 --> 00:13:19,399 Term crawl, they managed to get over 389 00:13:19,400 --> 00:13:21,289 200 terabytes of the government web. 390 00:13:22,460 --> 00:13:24,589 And because of the additional focus, they 391 00:13:24,590 --> 00:13:26,389 have got sections of websites that might 392 00:13:26,390 --> 00:13:28,519 have been missed based on the 393 00:13:28,520 --> 00:13:30,709 way they configure that crawl.
394 00:13:30,710 --> 00:13:32,329 And so while it may not include an 395 00:13:32,330 --> 00:13:33,889 archived copy of all the data sets for the 396 00:13:33,890 --> 00:13:35,839 reasons mentioned earlier, it provides an 397 00:13:35,840 --> 00:13:37,999 important snapshot of how that data was 398 00:13:38,000 --> 00:13:40,009 presented on websites at the end of the 399 00:13:40,010 --> 00:13:42,049 previous administration and further 400 00:13:42,050 --> 00:13:44,239 provides the ability to browse previous 401 00:13:44,240 --> 00:13:45,709 versions of those sites in a way that 402 00:13:45,710 --> 00:13:47,809 extends how the content was initially 403 00:13:47,810 --> 00:13:48,709 presented. 404 00:13:48,710 --> 00:13:50,749 So I think that kind of opens the 405 00:13:50,750 --> 00:13:52,609 question about what we think about 406 00:13:52,610 --> 00:13:53,900 when we think about access. 407 00:13:55,010 --> 00:13:57,109 The next one is the Code for Science 408 00:13:57,110 --> 00:13:59,389 spearheaded Project Svalbard, named 409 00:13:59,390 --> 00:14:01,459 after the seed vault, a collection of 410 00:14:01,460 --> 00:14:03,769 over thirty eight gigabytes of metadata 411 00:14:03,770 --> 00:14:05,539 to try and create a single catalog of 412 00:14:05,540 --> 00:14:06,829 research data files. 413 00:14:06,830 --> 00:14:09,019 And while data.gov has a 414 00:14:09,020 --> 00:14:11,389 catalog, not all data that could be there 415 00:14:11,390 --> 00:14:13,669 is there. Without a comprehensive 416 00:14:13,670 --> 00:14:15,769 view, assessing where data is 417 00:14:15,770 --> 00:14:17,959 and how much data is preserved is 418 00:14:17,960 --> 00:14:19,760 difficult, as you can imagine.
419 00:14:22,280 --> 00:14:24,679 And then finally, as existing 420 00:14:24,680 --> 00:14:26,899 data center practitioners, the Earth 421 00:14:26,900 --> 00:14:29,059 Science Information Partners made 422 00:14:29,060 --> 00:14:30,739 a case for a collaborative effort to 423 00:14:30,740 --> 00:14:31,879 understand risk, 424 00:14:31,880 --> 00:14:33,619 stressing that existing preservation and 425 00:14:33,620 --> 00:14:35,959 backup methods may not be visible, 426 00:14:35,960 --> 00:14:37,819 particularly for climate data. 427 00:14:37,820 --> 00:14:39,289 They surface different understandings of 428 00:14:39,290 --> 00:14:42,499 risk from public ones, 429 00:14:42,500 --> 00:14:43,939 coming from a data practitioner 430 00:14:43,940 --> 00:14:46,099 perspective. I think it's 431 00:14:46,100 --> 00:14:48,199 really important (that 432 00:14:48,200 --> 00:14:49,999 was a bad slide job there, 433 00:14:50,000 --> 00:14:52,619 sorry) 434 00:14:52,620 --> 00:14:55,219 to pull up this quote that they say. 435 00:14:55,220 --> 00:14:57,349 So they frame these as long-standing 436 00:14:57,350 --> 00:14:58,689 factors of risk, 437 00:14:58,690 --> 00:15:00,109 but I would say there's a new 438 00:15:00,110 --> 00:15:02,989 dimension under certain administrations. 439 00:15:02,990 --> 00:15:04,759 And that is obsolete technology or 440 00:15:04,760 --> 00:15:07,069 data formats, lack of metadata, lack 441 00:15:07,070 --> 00:15:09,169 of expertise, lack of funding to 442 00:15:09,170 --> 00:15:10,099 maintain the data. 443 00:15:10,100 --> 00:15:12,169 And I think in addition, you know, 444 00:15:12,170 --> 00:15:14,269 lack of funding for additional 445 00:15:14,270 --> 00:15:16,939 or extended collection in the future. 446 00:15:16,940 --> 00:15:19,009 So a year later, what happened?
447 00:15:21,410 --> 00:15:22,749 This is kind of a weird transition, just 448 00:15:22,750 --> 00:15:24,579 noticing that we haven't seen a mass 449 00:15:24,580 --> 00:15:25,809 removal of data sets. 450 00:15:25,810 --> 00:15:27,999 There have been a few that 451 00:15:28,000 --> 00:15:29,000 have been taken down 452 00:15:30,210 --> 00:15:32,529 for reasons that are not clearly linkable 453 00:15:32,530 --> 00:15:35,019 to sort of 454 00:15:35,020 --> 00:15:36,609 the goal of removing them from public 455 00:15:36,610 --> 00:15:38,979 access as sort of politically 456 00:15:38,980 --> 00:15:41,319 motivated. Executive orders 457 00:15:41,320 --> 00:15:43,209 and Scott Pruitt's appointment to the EPA 458 00:15:43,210 --> 00:15:45,039 have led to a reversal of a ban on 459 00:15:45,040 --> 00:15:47,439 a neurotoxic 460 00:15:47,440 --> 00:15:48,399 pesticide. 461 00:15:48,400 --> 00:15:50,499 A proposal to rescind Obama's Clean Power 462 00:15:50,500 --> 00:15:52,389 Plan is in the works, and cuts to 463 00:15:52,390 --> 00:15:54,669 important environmental programs, notably 464 00:15:54,670 --> 00:15:56,289 those that protect marginalized and 465 00:15:56,290 --> 00:15:58,659 vulnerable populations, are underway. 466 00:15:58,660 --> 00:16:00,879 Further, budget proposals 467 00:16:00,880 --> 00:16:02,769 aim at severely cutting funding to key 468 00:16:02,770 --> 00:16:03,770 federal agencies 469 00:16:05,740 --> 00:16:06,849 involved with environmental data 470 00:16:06,850 --> 00:16:08,559 collection. In terms of data, we've 471 00:16:08,560 --> 00:16:10,659 actually seen a shift 472 00:16:10,660 --> 00:16:11,979 in how it's presented on federal 473 00:16:11,980 --> 00:16:14,079 websites.
The screenshot on the slide is 474 00:16:14,080 --> 00:16:16,059 actually from a recent EDGI website 475 00:16:16,060 --> 00:16:18,549 monitoring report documenting 476 00:16:18,550 --> 00:16:20,439 the removals and changes in access to 477 00:16:20,440 --> 00:16:22,719 resources on the EPA's climate and energy 478 00:16:22,720 --> 00:16:24,849 resources for state, local and 479 00:16:24,850 --> 00:16:25,850 tribal government. 480 00:16:26,680 --> 00:16:28,869 So since January, 481 00:16:28,870 --> 00:16:31,059 a website monitoring team has released 482 00:16:31,060 --> 00:16:32,709 over twenty five reports like this 483 00:16:32,710 --> 00:16:34,869 documenting changes to how 484 00:16:34,870 --> 00:16:36,789 environmental and climate data is 485 00:16:36,790 --> 00:16:37,790 presented. 486 00:16:39,650 --> 00:16:41,269 So what next? 487 00:16:41,270 --> 00:16:43,099 I think the biggest opportunity I see is 488 00:16:43,100 --> 00:16:44,839 in the public conversation and attention 489 00:16:44,840 --> 00:16:47,419 toward the continued access to this data. 490 00:16:47,420 --> 00:16:48,499 The fact that people who weren't 491 00:16:48,500 --> 00:16:50,749 librarians, web archivists or research 492 00:16:50,750 --> 00:16:52,969 scientists showed up and stayed involved 493 00:16:52,970 --> 00:16:55,249 attests to this, as does EDGI's 494 00:16:55,250 --> 00:16:56,839 website monitoring work as a way to 495 00:16:56,840 --> 00:16:58,939 attempt to mobilize that continued 496 00:16:58,940 --> 00:17:00,259 public conversation. 497 00:17:00,260 --> 00:17:02,359 But there could be more. In the wake 498 00:17:02,360 --> 00:17:04,969 of the recent FCC decision on net neutrality, 499 00:17:04,970 --> 00:17:06,229 I think we're seeing another wave of 500 00:17:06,230 --> 00:17:07,309 public conversation around 501 00:17:07,310 --> 00:17:09,019 infrastructure, but kind of operating at 502 00:17:09,020 --> 00:17:10,129 a lower level.
503 00:17:10,130 --> 00:17:12,229 Since the summer, EDGI has been working 504 00:17:12,230 --> 00:17:14,659 with Protocol Labs, the creator of IPFS, or 505 00:17:14,660 --> 00:17:17,719 InterPlanetary File System, and qri, 506 00:17:17,720 --> 00:17:19,309 a data science company developing data 507 00:17:19,310 --> 00:17:21,858 set research tools on the distributed web, 508 00:17:21,859 --> 00:17:23,838 on a project called Data Together, which 509 00:17:23,839 --> 00:17:25,699 aims to convene a conversation around 510 00:17:25,700 --> 00:17:27,019 building our own and better data 511 00:17:27,020 --> 00:17:28,189 infrastructures. 512 00:17:28,190 --> 00:17:30,259 We want to explore how decentralized web 513 00:17:30,260 --> 00:17:32,509 patterns can support community data 514 00:17:32,510 --> 00:17:34,430 stewardship, in part 515 00:17:35,630 --> 00:17:37,909 through content-addressed web archiving, 516 00:17:37,910 --> 00:17:40,099 and are having those conversations out 517 00:17:40,100 --> 00:17:42,409 in the open where people can join in. 518 00:17:42,410 --> 00:17:44,659 That could be a whole talk in itself, and 519 00:17:44,660 --> 00:17:46,699 I would prefer that someone else give it, 520 00:17:46,700 --> 00:17:48,319 and I have more questions than answers. 521 00:17:48,320 --> 00:17:49,939 So I think it's probably something that 522 00:17:49,940 --> 00:17:51,589 works better as a conversation, and one 523 00:17:51,590 --> 00:17:53,179 I'm hoping at least some of you will want 524 00:17:53,180 --> 00:17:54,180 to participate in.
525 00:17:55,610 --> 00:17:57,859 And maybe just to kind of 526 00:17:59,000 --> 00:18:01,069 suggest as well, I think many in hacker, 527 00:18:01,070 --> 00:18:02,689 free software, open hardware, open 528 00:18:02,690 --> 00:18:04,999 science communities have recognized 529 00:18:05,000 --> 00:18:06,859 the ways that technology is not neutral, 530 00:18:06,860 --> 00:18:09,109 that it can come with embedded bias 531 00:18:09,110 --> 00:18:11,419 and predispositions to be used in certain 532 00:18:11,420 --> 00:18:12,319 ways. 533 00:18:12,320 --> 00:18:14,239 And I think if that recognition can be 534 00:18:14,240 --> 00:18:16,279 coupled with the recognition from 535 00:18:16,280 --> 00:18:17,779 environmental justice advocates and 536 00:18:17,780 --> 00:18:19,669 academics of the ways that data is not 537 00:18:19,670 --> 00:18:22,219 neutral, and then also, 538 00:18:22,220 --> 00:18:23,899 you know, with an attention to the vital 539 00:18:23,900 --> 00:18:25,699 data about climate and environment that 540 00:18:25,700 --> 00:18:27,289 is critical to navigating our changing 541 00:18:27,290 --> 00:18:29,419 relationship to the environment, I really 542 00:18:29,420 --> 00:18:31,069 think we have a chance to build better 543 00:18:31,070 --> 00:18:32,070 data together. 544 00:18:33,470 --> 00:18:35,539 So maybe just in conclusion, I 545 00:18:35,540 --> 00:18:36,919 want to say, you know, EDGI is always 546 00:18:36,920 --> 00:18:37,999 looking for people interested in 547 00:18:38,000 --> 00:18:39,319 volunteering. 548 00:18:39,320 --> 00:18:40,699 We have a range of projects 549 00:18:41,960 --> 00:18:43,489 for people from a variety of backgrounds. 550 00:18:43,490 --> 00:18:45,229 In particular, if you do DevOps, 551 00:18:45,230 --> 00:18:46,909 come find me. We need 552 00:18:46,910 --> 00:18:48,679 serious DevOps help. 553 00:18:48,680 --> 00:18:50,479 And please check out our website and GitHub.
554 00:18:50,480 --> 00:18:52,669 You can sign up to our mailing list. 555 00:18:52,670 --> 00:18:54,199 You can help us. Data Together has a 556 00:18:54,200 --> 00:18:55,879 mailing list, or maybe just have some 557 00:18:55,880 --> 00:18:57,409 conversations about this somewhere 558 00:18:57,410 --> 00:18:58,609 online. 559 00:18:58,610 --> 00:18:59,610 I think that's it. 560 00:19:03,240 --> 00:19:05,519 Well, a big, big 561 00:19:05,520 --> 00:19:07,649 thanks, because this is important 562 00:19:07,650 --> 00:19:09,839 work, we as 563 00:19:09,840 --> 00:19:12,779 we like the best information 564 00:19:12,780 --> 00:19:13,819 we have to achieve. 565 00:19:13,820 --> 00:19:15,929 So let's now come to the 566 00:19:15,930 --> 00:19:17,189 Q&A. 567 00:19:17,190 --> 00:19:19,229 Please go to the microphones. 568 00:19:19,230 --> 00:19:21,449 And if there's a question 569 00:19:21,450 --> 00:19:23,549 from the Internet, I get informed 570 00:19:23,550 --> 00:19:25,680 by the video angels. 571 00:19:27,750 --> 00:19:29,939 Is there any question? 572 00:19:29,940 --> 00:19:31,470 Ah, there is somebody coming. 573 00:19:33,330 --> 00:19:35,459 Microphone one, for 574 00:19:35,460 --> 00:19:36,460 you. 575 00:19:36,930 --> 00:19:38,999 You often hear that 576 00:19:39,000 --> 00:19:40,979 the scientific data sets are very 577 00:19:40,980 --> 00:19:43,139 fragmented or not very easily 578 00:19:43,140 --> 00:19:44,639 accessible. I can imagine that's 579 00:19:44,640 --> 00:19:46,289 certainly something you ran into while 580 00:19:46,290 --> 00:19:48,089 trying to rescue it and curate it. 581 00:19:48,090 --> 00:19:49,379 What has your experience 582 00:19:49,380 --> 00:19:51,059 been there, and did you see any 583 00:19:51,060 --> 00:19:53,429 opportunity to, for example, improve this 584 00:19:53,430 --> 00:19:54,430 in your efforts? 585 00:19:55,680 --> 00:19:58,139 Yeah, so 586 00:19:58,140 --> 00:20:00,569 absolutely.
I think we did run aground 587 00:20:00,570 --> 00:20:02,999 on that fragmentation. 588 00:20:03,000 --> 00:20:05,069 And I think maybe if we could 589 00:20:05,070 --> 00:20:06,809 offer one thing to other people, it's the 590 00:20:06,810 --> 00:20:09,089 experience of this not being 591 00:20:09,090 --> 00:20:10,439 something we were familiar with, and kind 592 00:20:10,440 --> 00:20:11,939 of stumbling through it and making, like, 593 00:20:11,940 --> 00:20:14,159 all the mistakes possible. I mean, like, 594 00:20:14,160 --> 00:20:15,839 where is this? How can we find it? 595 00:20:17,310 --> 00:20:19,589 So in terms of kind 596 00:20:19,590 --> 00:20:21,179 of what we found, or sort of a way 597 00:20:21,180 --> 00:20:23,549 forward, I actually want to flag 598 00:20:23,550 --> 00:20:25,619 that I think there are already processes 599 00:20:25,620 --> 00:20:27,509 to try and address fragmentation. 600 00:20:28,740 --> 00:20:30,659 I think data.gov is becoming 601 00:20:30,660 --> 00:20:31,829 this open data portal. 602 00:20:31,830 --> 00:20:33,059 I think of the way that certain countries 603 00:20:33,060 --> 00:20:35,219 have kind of a 604 00:20:35,220 --> 00:20:37,349 one-stop portal to try and find 605 00:20:37,350 --> 00:20:39,389 data sets, as well as, you know, 606 00:20:39,390 --> 00:20:41,489 coordination between the... 607 00:20:41,490 --> 00:20:44,159 I had the screenshot of the IPCC 608 00:20:44,160 --> 00:20:45,839 Data Distribution Centre. I think 609 00:20:45,840 --> 00:20:48,089 those projects 610 00:20:48,090 --> 00:20:49,609 are one attempt at that.
611 00:20:50,850 --> 00:20:52,799 I mean, I think there's still this 612 00:20:52,800 --> 00:20:55,020 problem of access in the sense that 613 00:20:56,730 --> 00:20:58,979 it's unclear to me how 614 00:20:58,980 --> 00:21:00,929 people who aren't within a certain 615 00:21:00,930 --> 00:21:02,459 community of practice would even 616 00:21:02,460 --> 00:21:04,829 know to go there to get that data. 617 00:21:04,830 --> 00:21:06,539 And, you know, one thing we've heard in 618 00:21:06,540 --> 00:21:08,729 conversation with others, 619 00:21:08,730 --> 00:21:10,889 including the US Climate Alliance 620 00:21:10,890 --> 00:21:13,469 (I had their logo up), 621 00:21:13,470 --> 00:21:14,729 is that there's a 622 00:21:14,730 --> 00:21:16,889 certain set of people who care a lot, and 623 00:21:16,890 --> 00:21:18,059 their decisions and how they 624 00:21:18,060 --> 00:21:19,199 work are going to be really heavily 625 00:21:19,200 --> 00:21:20,999 impacted by climate data. 626 00:21:21,000 --> 00:21:22,379 But they're not going to look at the 627 00:21:22,380 --> 00:21:23,729 data. They're going to look at those 628 00:21:23,730 --> 00:21:26,009 reports. And getting access 629 00:21:26,010 --> 00:21:27,660 to those is extremely important. 630 00:21:29,820 --> 00:21:31,949 I mean, I think portals are a 631 00:21:31,950 --> 00:21:33,719 big help. I think the library 632 00:21:33,720 --> 00:21:35,099 repository programs, those things that 633 00:21:35,100 --> 00:21:37,079 already exist, are really important, 634 00:21:37,080 --> 00:21:38,369 and I don't want to see them go away. 635 00:21:38,370 --> 00:21:40,949 But I still think there's 636 00:21:40,950 --> 00:21:42,539 something slightly missing about 637 00:21:42,540 --> 00:21:44,759 usability, and I'm not entirely 638 00:21:44,760 --> 00:21:45,959 sure how to address that.
639 00:21:45,960 --> 00:21:48,359 But I think 640 00:21:48,360 --> 00:21:50,399 these one-stop things and work around 641 00:21:50,400 --> 00:21:51,959 opening the data sets are a really 642 00:21:51,960 --> 00:21:52,970 good first step. 643 00:21:54,520 --> 00:21:56,769 Thanks for your efforts on this. OK, 644 00:21:56,770 --> 00:21:59,139 we have ten minutes and 645 00:21:59,140 --> 00:22:01,389 three questions: one from the Internet, 646 00:22:01,390 --> 00:22:02,289 two in the room. 647 00:22:02,290 --> 00:22:05,109 So Internet first, please. 648 00:22:05,110 --> 00:22:07,269 OK, so one person 649 00:22:07,270 --> 00:22:09,369 from IRC asks: is 650 00:22:09,370 --> 00:22:11,499 the bar for putting data into 651 00:22:11,500 --> 00:22:13,629 the World Data Center for Climate 652 00:22:13,630 --> 00:22:15,759 too high in terms of providing a number 653 00:22:15,760 --> 00:22:17,859 of metadata fields, which is 654 00:22:17,860 --> 00:22:19,510 a lot of work? 655 00:22:22,320 --> 00:22:23,320 So 656 00:22:25,380 --> 00:22:27,539 my understanding is that it operates 657 00:22:27,540 --> 00:22:29,759 a bit similarly to data.gov in the sense 658 00:22:29,760 --> 00:22:32,339 that it's opt-in, 659 00:22:32,340 --> 00:22:34,649 and it's opt-in at the data publisher 660 00:22:34,650 --> 00:22:35,819 level. 661 00:22:35,820 --> 00:22:37,919 And if I'm incorrect there, I'm 662 00:22:37,920 --> 00:22:40,529 sorry, but working with that assumption, 663 00:22:40,530 --> 00:22:42,119 I think the barrier we found is that not 664 00:22:42,120 --> 00:22:43,169 everyone has opted in. 665 00:22:43,170 --> 00:22:45,239 So if you're a person who 666 00:22:45,240 --> 00:22:46,439 cares about the data, but you're not the 667 00:22:46,440 --> 00:22:48,689 person who made the data, you're 668 00:22:48,690 --> 00:22:51,089 kind of stuck if the publisher 669 00:22:51,090 --> 00:22:53,789 has not included it in these 670 00:22:53,790 --> 00:22:54,790 repositories.
671 00:22:55,990 --> 00:22:58,199 And so, I mean, I think an 672 00:22:58,200 --> 00:23:00,269 interesting approach could be 673 00:23:00,270 --> 00:23:02,249 to figure out ways to incentivize more 674 00:23:02,250 --> 00:23:03,509 people to get it in there. 675 00:23:03,510 --> 00:23:04,859 And I don't know what those hooks could 676 00:23:04,860 --> 00:23:06,119 be, but if there's a way that, like, you 677 00:23:06,120 --> 00:23:08,459 request and they push it there, 678 00:23:08,460 --> 00:23:10,229 and there's a way to motivate that 679 00:23:10,230 --> 00:23:10,799 behavior, 680 00:23:10,800 --> 00:23:12,200 I think that would be awesome. 681 00:23:14,910 --> 00:23:16,349 Microphone five, please. 682 00:23:17,350 --> 00:23:19,949 I have a question regarding 683 00:23:19,950 --> 00:23:22,049 creation of new data. I mean that one of 684 00:23:22,050 --> 00:23:24,179 the concerns is to protect data and 685 00:23:24,180 --> 00:23:26,519 preserve data from the old scientific 686 00:23:26,520 --> 00:23:28,559 research. But what will happen if, for 687 00:23:28,560 --> 00:23:30,509 example, there is a lack of funding 688 00:23:30,510 --> 00:23:32,129 for the next research and, for example, 689 00:23:32,130 --> 00:23:34,469 our long time series for climate 690 00:23:34,470 --> 00:23:36,419 research are lost because of that? 691 00:23:36,420 --> 00:23:37,829 Are there, for example, in this 692 00:23:37,830 --> 00:23:39,899 community, people who try 693 00:23:39,900 --> 00:23:42,029 to reach 694 00:23:42,030 --> 00:23:44,129 a broader audience and tell them that we 695 00:23:44,130 --> 00:23:46,469 need to find funding for preserving data 696 00:23:46,470 --> 00:23:48,539 and for creating the new data and 697 00:23:48,540 --> 00:23:50,879 for still measuring all of these 698 00:23:50,880 --> 00:23:51,880 climate things?
699 00:23:53,270 --> 00:23:55,459 I mean, I agree, I think that's really 700 00:23:55,460 --> 00:23:56,460 critical, 701 00:23:57,570 --> 00:23:58,999 and that's something where I think there 702 00:23:59,000 --> 00:24:01,219 are, in the United States and in Canada, 703 00:24:01,220 --> 00:24:02,599 a lot of people mobilized around this 704 00:24:02,600 --> 00:24:04,699 issue, thinking about 705 00:24:04,700 --> 00:24:06,889 the sort 706 00:24:06,890 --> 00:24:08,569 of knock-on effects of limiting 707 00:24:08,570 --> 00:24:10,369 budgets now, and then a continued 708 00:24:10,370 --> 00:24:12,319 constraining of budgets and a lack of 709 00:24:12,320 --> 00:24:14,149 funding and, you know, 710 00:24:14,150 --> 00:24:15,799 cutting jobs instead of growing jobs. 711 00:24:17,960 --> 00:24:19,749 The group that I'm most familiar with, 712 00:24:19,750 --> 00:24:20,989 who I think is doing really strong 713 00:24:20,990 --> 00:24:23,209 advocacy around that in the States, is the 714 00:24:23,210 --> 00:24:25,189 Union of Concerned Scientists. 715 00:24:25,190 --> 00:24:26,599 So I think there are definitely groups 716 00:24:26,600 --> 00:24:28,729 who are flagging 717 00:24:28,730 --> 00:24:30,919 what those, like, 718 00:24:30,920 --> 00:24:33,529 what the outcomes of the budget proposals 719 00:24:33,530 --> 00:24:35,719 would be, or the impact of those. 720 00:24:35,720 --> 00:24:37,789 And so there are 721 00:24:37,790 --> 00:24:39,530 groups who are advocating for it. 722 00:24:40,760 --> 00:24:42,889 I think, maybe, 723 00:24:42,890 --> 00:24:45,349 while also not being an expert in 724 00:24:45,350 --> 00:24:47,779 government policy 725 00:24:47,780 --> 00:24:49,519 or how budgets are implemented,
726 00:24:49,520 --> 00:24:51,739 I think there are constraints in how 727 00:24:51,740 --> 00:24:53,959 advocacy as a tool can effect change in 728 00:24:53,960 --> 00:24:55,609 what budget gets adopted, which 729 00:24:56,660 --> 00:24:58,729 means that maybe that alone as a 730 00:24:58,730 --> 00:25:00,979 strategy is not going to 731 00:25:00,980 --> 00:25:02,140 prevent it from happening. 732 00:25:04,790 --> 00:25:06,520 OK, microphone one, please. 733 00:25:07,820 --> 00:25:10,219 So is the distributed 734 00:25:10,220 --> 00:25:12,439 data digitally signed? I could 735 00:25:12,440 --> 00:25:13,969 imagine that there are 736 00:25:15,130 --> 00:25:17,029 some groups of people who might be 737 00:25:17,030 --> 00:25:19,159 interested in fiddling around with 738 00:25:19,160 --> 00:25:20,160 it. 739 00:25:20,510 --> 00:25:22,789 Yeah, so 740 00:25:22,790 --> 00:25:25,219 through the data rescue 741 00:25:25,220 --> 00:25:27,289 process, we worked really closely 742 00:25:27,290 --> 00:25:29,569 with the Data Refuge project, 743 00:25:29,570 --> 00:25:31,519 and many of them are librarians. 744 00:25:31,520 --> 00:25:33,079 So there was a strong concern with 745 00:25:33,080 --> 00:25:35,239 maintaining stability of data and also 746 00:25:35,240 --> 00:25:36,409 thinking about integrity and 747 00:25:36,410 --> 00:25:37,410 verification. 748 00:25:39,050 --> 00:25:40,459 I think actually it raised a lot of 749 00:25:40,460 --> 00:25:41,659 really interesting questions for me, at 750 00:25:41,660 --> 00:25:44,449 least, in 751 00:25:44,450 --> 00:25:46,999 how you would imagine a very 752 00:25:47,000 --> 00:25:49,369 volunteer- and human-intensive process 753 00:25:49,370 --> 00:25:51,139 of doing that verification.
754 00:25:51,140 --> 00:25:53,599 So there was a workflow management 755 00:25:53,600 --> 00:25:55,669 tool that was developed where 756 00:25:55,670 --> 00:25:57,529 we would have a log of who had 757 00:25:57,530 --> 00:25:59,419 touched each piece of data 758 00:25:59,420 --> 00:26:01,369 identified at an event, whether a data 759 00:26:01,370 --> 00:26:03,379 set or a page. 760 00:26:03,380 --> 00:26:06,439 And then we used existing 761 00:26:06,440 --> 00:26:09,559 librarian and Library of Congress tools 762 00:26:09,560 --> 00:26:11,899 to generate a 763 00:26:11,900 --> 00:26:13,849 checksum and to ensure that what was 764 00:26:13,850 --> 00:26:15,169 uploaded was what people thought was 765 00:26:15,170 --> 00:26:16,549 uploaded. When you download it, you can 766 00:26:16,550 --> 00:26:17,809 verify that. 767 00:26:17,810 --> 00:26:19,579 And so it was trying to do a parallel 768 00:26:19,580 --> 00:26:21,889 social and technical 769 00:26:21,890 --> 00:26:23,959 implementation to do that. In 770 00:26:23,960 --> 00:26:25,189 the move towards some of the Data 771 00:26:25,190 --> 00:26:26,809 Together work, we actually have a 772 00:26:26,810 --> 00:26:30,229 reference implementation of, 773 00:26:30,230 --> 00:26:31,909 you know, generating WARCs, which is a 774 00:26:31,910 --> 00:26:33,979 web archiving format, and 775 00:26:33,980 --> 00:26:35,899 writing them directly and adding them to 776 00:26:35,900 --> 00:26:38,089 IPFS. And so with IPFS 777 00:26:38,090 --> 00:26:40,309 and content-addressed protocols, there 778 00:26:40,310 --> 00:26:42,859 are additional ways to do 779 00:26:42,860 --> 00:26:44,989 verification and to ensure that what 780 00:26:44,990 --> 00:26:46,519 you retrieve is only the data you think 781 00:26:46,520 --> 00:26:47,929 you're retrieving. 782 00:26:47,930 --> 00:26:49,459 So I think that those are important 783 00:26:49,460 --> 00:26:51,499 questions. Really interesting.
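[Editor's illustration, not part of the spoken talk] The checksum step described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the tooling the speaker's team used: the function names (`sha256_of`, `write_manifest`, `verify_manifest`), the choice of SHA-256, and the "digest, two spaces, relative path" manifest layout are all hypothetical choices made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(data_dir: Path, manifest: Path) -> None:
    """Record '<digest>  <relative path>' for every file under data_dir.

    Assumption for this sketch: the manifest file lives outside data_dir,
    so it is never hashed as part of the payload.
    """
    files = sorted(p for p in data_dir.rglob("*") if p.is_file())
    lines = [
        f"{sha256_of(p)}  {p.relative_to(data_dir).as_posix()}" for p in files
    ]
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")


def verify_manifest(data_dir: Path, manifest: Path) -> list[str]:
    """Return relative paths that are missing or whose digest has changed."""
    mismatched = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split("  ", 1)
        target = data_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            mismatched.append(rel_path)
    return mismatched
```

An uploader would call `write_manifest` once and publish the manifest next to the data; anyone who downloads a copy can run `verify_manifest` and treat a non-empty result as a sign that the copy differs from what was uploaded. Content-addressed systems take the same idea further by using the digest itself as the identifier used to request the data.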
784 00:26:51,500 --> 00:26:53,809 And I think we tried. 785 00:26:53,810 --> 00:26:55,909 I mean, I'm not a 786 00:26:55,910 --> 00:26:57,439 librarian by practice. 787 00:26:57,440 --> 00:26:59,449 So I think there are a lot of tradeoffs 788 00:26:59,450 --> 00:27:01,159 there that I'm probably not as sensitive 789 00:27:01,160 --> 00:27:02,160 to. 790 00:27:02,520 --> 00:27:04,939 Well, that sounds very trustworthy. 791 00:27:04,940 --> 00:27:05,940 Thank you so much. 792 00:27:07,700 --> 00:27:10,759 Yeah. Then please give a big round of applause 793 00:27:10,760 --> 00:27:13,069 for Dawn Walker, 794 00:27:13,070 --> 00:27:15,229 for the 795 00:27:15,230 --> 00:27:16,430 talk about this 796 00:27:17,900 --> 00:27:19,069 public data, 797 00:27:20,660 --> 00:27:23,069 which is necessary for us all, because 798 00:27:23,070 --> 00:27:24,439 the is not allowed to be.