0 00:00:00,000 --> 00:00:30,000 Dear viewer, these subtitles were generated by a machine via the service Trint and therefore are (very) buggy. If you are capable, please help us to create good quality subtitles: https://c3subtitles.de/talk/609 Thanks! 1 00:00:09,510 --> 00:00:11,399 It's actually nice to encrypt your data, 2 00:00:11,400 --> 00:00:13,409 as you may probably know, and it's even 3 00:00:13,410 --> 00:00:15,659 nicer if you can store it online, but 4 00:00:15,660 --> 00:00:17,879 the only drawback is that you 5 00:00:17,880 --> 00:00:19,379 that the cloud storage provider that 6 00:00:19,380 --> 00:00:21,899 actually can't do such operations 7 00:00:21,900 --> 00:00:24,149 on the encrypted data, the next 8 00:00:24,150 --> 00:00:26,819 talk actually presents some solutions 9 00:00:26,820 --> 00:00:27,929 to this problem. 10 00:00:27,930 --> 00:00:29,249 There'll be a smaller and Christian 11 00:00:29,250 --> 00:00:31,439 follow present, different 12 00:00:31,440 --> 00:00:34,049 approaches, how others can search 13 00:00:34,050 --> 00:00:36,269 through encrypted data without 14 00:00:36,270 --> 00:00:38,419 actually knowing the 15 00:00:38,420 --> 00:00:40,229 the key nor the plain text. 16 00:00:40,230 --> 00:00:42,359 So please welcome warmly Christian 17 00:00:42,360 --> 00:00:43,830 Falo and Tobias Malow. 18 00:00:50,630 --> 00:00:53,119 Perfect. Thank you very much and thanks 19 00:00:53,120 --> 00:00:55,519 for being here at day four after 20 00:00:55,520 --> 00:00:57,549 like these many parties and having party 21 00:00:57,550 --> 00:00:59,209 hard and I'm surprised to see so many 22 00:00:59,210 --> 00:01:01,039 people here. And by that, I mean, I'm 23 00:01:01,040 --> 00:01:03,079 surprised that I can actually see so many 24 00:01:03,080 --> 00:01:05,119 people after yesterday's tonight. 25 00:01:05,120 --> 00:01:07,399 And I'm I'm glad that I've made it 26 00:01:07,400 --> 00:01:09,049 up here and this very morning. 
27 00:01:10,490 --> 00:01:11,389 That's correct. 28 00:01:11,390 --> 00:01:13,549 We are concerned with the problem 29 00:01:13,550 --> 00:01:15,619 of encrypting data in such a 30 00:01:15,620 --> 00:01:18,199 way that you can still 31 00:01:18,200 --> 00:01:20,449 search over these ciphertext. 32 00:01:20,450 --> 00:01:22,639 And over the course of this talk, 33 00:01:22,640 --> 00:01:24,859 we will present some well schemes 34 00:01:24,860 --> 00:01:27,049 or solution to that problem, which 35 00:01:27,050 --> 00:01:29,149 will hopefully inspire 36 00:01:29,150 --> 00:01:31,309 inspire you to firstly 37 00:01:31,310 --> 00:01:33,709 demand these services from your 38 00:01:33,710 --> 00:01:34,879 cloud provider. 39 00:01:34,880 --> 00:01:37,159 And secondly, if you're inclined to 40 00:01:37,160 --> 00:01:39,439 to do the programing, then to hack 41 00:01:39,440 --> 00:01:40,999 on these schemes and implement them and 42 00:01:41,000 --> 00:01:42,379 make them practical. 43 00:01:42,380 --> 00:01:44,629 So I'm Toby, this Christian in 44 00:01:44,630 --> 00:01:46,009 the course of the next forty five 45 00:01:46,010 --> 00:01:48,019 minutes. Well, we'll talk about this and 46 00:01:48,020 --> 00:01:50,029 hopefully we'll we'll leave some room for 47 00:01:50,030 --> 00:01:52,129 Q&A. We also intend to before I 48 00:01:52,130 --> 00:01:54,349 forget that we intend to hang 49 00:01:54,350 --> 00:01:56,419 around at the bar just outside this 50 00:01:56,420 --> 00:01:58,729 lecture hall after this talk 51 00:01:58,730 --> 00:02:00,919 in case you want to chat about these 52 00:02:00,920 --> 00:02:02,599 techniques or encryption or just buy us a 53 00:02:02,600 --> 00:02:03,739 beer. 54 00:02:03,740 --> 00:02:06,589 Well, maybe not the 12, but later. 55 00:02:06,590 --> 00:02:09,168 So this is our agenda. 
56 00:02:09,169 --> 00:02:11,689 We want to talk a little bit about 57 00:02:11,690 --> 00:02:13,999 how we see the current world 58 00:02:14,000 --> 00:02:15,979 or the world around the cloud is 59 00:02:15,980 --> 00:02:18,049 organized and where your 60 00:02:18,050 --> 00:02:19,729 position as the customer is and that 61 00:02:19,730 --> 00:02:21,259 architecture. 62 00:02:21,260 --> 00:02:23,419 Then then we present a couple 63 00:02:23,420 --> 00:02:25,519 of schemes which allow you to 64 00:02:25,520 --> 00:02:27,739 well perform searching over, 65 00:02:27,740 --> 00:02:29,539 well, your ciphertext, which you have 66 00:02:29,540 --> 00:02:31,249 uploaded to the cloud. 67 00:02:31,250 --> 00:02:33,769 And then we'll we'll wrap up. 68 00:02:33,770 --> 00:02:34,770 So. 69 00:02:35,810 --> 00:02:38,119 Let's talk a little bit about 70 00:02:38,120 --> 00:02:40,249 the cloud, so 71 00:02:41,570 --> 00:02:43,699 let me ask you, how many of you 72 00:02:43,700 --> 00:02:46,009 use the cloud, any external 73 00:02:46,010 --> 00:02:47,959 third party provider to upload? 74 00:02:47,960 --> 00:02:49,430 Well, data, whatever it is. 75 00:02:50,950 --> 00:02:53,199 That's many and probably 76 00:02:53,200 --> 00:02:55,779 the majority, and 77 00:02:55,780 --> 00:02:58,569 that's how we actually imagine 78 00:02:58,570 --> 00:03:00,819 this world right now to be 79 00:03:00,820 --> 00:03:03,039 that many people use external third 80 00:03:03,040 --> 00:03:05,109 party services to upload 81 00:03:05,110 --> 00:03:06,369 to hold their data. 82 00:03:06,370 --> 00:03:08,229 Yeah, well, for the customers to 83 00:03:09,700 --> 00:03:11,979 avail of this data, well, at any point 84 00:03:11,980 --> 00:03:14,139 in time with any device they have. 85 00:03:14,140 --> 00:03:16,599 So I for myself, 86 00:03:16,600 --> 00:03:19,269 I went with Stan, a trustworthy guy. 
87 00:03:19,270 --> 00:03:21,549 I can upload my data the first 88 00:03:21,550 --> 00:03:23,319 a couple of gigabytes for free. 89 00:03:23,320 --> 00:03:25,989 And he promises me well, if 90 00:03:25,990 --> 00:03:28,059 he promises me that I can at 91 00:03:28,060 --> 00:03:29,889 any point in time download and access my 92 00:03:29,890 --> 00:03:32,229 data and 93 00:03:32,230 --> 00:03:34,029 well, there's many of those rights must 94 00:03:34,030 --> 00:03:35,949 not be him. There's so many companies 95 00:03:35,950 --> 00:03:38,799 offering cloud services 96 00:03:38,800 --> 00:03:40,989 and they not only allow you to store 97 00:03:40,990 --> 00:03:43,089 your files, you can also upload 98 00:03:43,090 --> 00:03:45,129 your contacts or your calendars, your 99 00:03:45,130 --> 00:03:46,360 messages, your files 100 00:03:47,380 --> 00:03:49,929 and all of them 101 00:03:49,930 --> 00:03:52,209 probably promise you that they will 102 00:03:52,210 --> 00:03:54,849 not be malicious or do not 103 00:03:54,850 --> 00:03:57,459 go through your data and run analysis 104 00:03:57,460 --> 00:03:59,349 or even tell your user profile. 105 00:03:59,350 --> 00:04:02,019 Except a couple of providers 106 00:04:02,020 --> 00:04:03,339 actually do so. 107 00:04:03,340 --> 00:04:05,169 And they do tell you straight in your 108 00:04:05,170 --> 00:04:07,299 face that they will mine your data to 109 00:04:07,300 --> 00:04:09,969 present you, for example, better ads. 110 00:04:09,970 --> 00:04:12,069 So they go through your email, 111 00:04:12,070 --> 00:04:14,109 they look at your keywords and they will 112 00:04:14,110 --> 00:04:16,239 determine what ads might 113 00:04:16,240 --> 00:04:18,278 be best for you once you know what you 114 00:04:18,279 --> 00:04:19,720 are most inclined to click at. 
115 00:04:21,110 --> 00:04:23,239 So anything 116 00:04:23,240 --> 00:04:25,249 you can say, well, I'm paying money, 117 00:04:25,250 --> 00:04:27,469 right, I'm using this premium provider 118 00:04:27,470 --> 00:04:30,439 for my email, my calendars, my my data, 119 00:04:30,440 --> 00:04:32,929 so they have their incentive 120 00:04:32,930 --> 00:04:35,899 to cheat on me is not that big. 121 00:04:35,900 --> 00:04:38,569 Well, you might be right, except, 122 00:04:38,570 --> 00:04:40,159 you know, if they could make an extra 123 00:04:40,160 --> 00:04:42,499 profit, they might possibly do so. 124 00:04:42,500 --> 00:04:44,809 And also, we don't we are being 125 00:04:44,810 --> 00:04:45,859 cryptographers. 126 00:04:45,860 --> 00:04:48,319 We don't necessarily want the guarantee 127 00:04:48,320 --> 00:04:50,689 that they don't look through your data 128 00:04:50,690 --> 00:04:51,649 on a piece of paper. 129 00:04:51,650 --> 00:04:53,779 We want to have, well, a rather 130 00:04:53,780 --> 00:04:56,239 mathematical say guarantee or proof 131 00:04:56,240 --> 00:04:58,339 or we want want 132 00:04:58,340 --> 00:05:00,289 the provider to not be actually able 133 00:05:00,290 --> 00:05:01,849 technically to go through the data. 134 00:05:01,850 --> 00:05:04,069 Not only, well, the privacy statement 135 00:05:04,070 --> 00:05:07,159 from this provider should guarantee 136 00:05:07,160 --> 00:05:09,079 that they don't look through the data. 137 00:05:09,080 --> 00:05:10,789 So we are looking for a cryptographic 138 00:05:10,790 --> 00:05:13,039 solutions to that problem of, 139 00:05:13,040 --> 00:05:15,169 well, uploading a data, but 140 00:05:15,170 --> 00:05:16,170 still 141 00:05:17,630 --> 00:05:19,759 still enable you to execute 142 00:05:19,760 --> 00:05:21,270 operations on that encrypted data. 
143 00:05:22,550 --> 00:05:24,739 So the problem, as I've said, is 144 00:05:24,740 --> 00:05:26,869 that these providers 145 00:05:26,870 --> 00:05:28,279 go through the data and they extract 146 00:05:28,280 --> 00:05:30,319 information about you, about your usage, 147 00:05:30,320 --> 00:05:32,059 about your behavior. 148 00:05:32,060 --> 00:05:33,949 And it turns out that mining plaintext 149 00:05:33,950 --> 00:05:35,869 data is actually not that hard. 150 00:05:35,870 --> 00:05:38,089 So the data mining being performed 151 00:05:38,090 --> 00:05:40,609 is, well, relatively 152 00:05:40,610 --> 00:05:42,469 easy because it's plain text and you can 153 00:05:42,470 --> 00:05:44,089 just, you know, run your analysis over 154 00:05:44,090 --> 00:05:45,090 the plaintext. 155 00:05:46,740 --> 00:05:48,899 And we think that we are on 156 00:05:48,900 --> 00:05:50,379 the wrong track. 157 00:05:50,380 --> 00:05:52,529 We think we 158 00:05:52,530 --> 00:05:55,229 as a community, as the cryptographic 159 00:05:55,230 --> 00:05:57,539 or the hacker community should actively 160 00:05:57,540 --> 00:05:59,729 work towards the goal 161 00:05:59,730 --> 00:06:02,009 of providers not being able 162 00:06:02,010 --> 00:06:04,139 to look through your data and 163 00:06:04,140 --> 00:06:06,299 to not well create profiles about 164 00:06:06,300 --> 00:06:07,589 you and to not be able to predict what 165 00:06:07,590 --> 00:06:08,590 you're doing next. 166 00:06:09,280 --> 00:06:10,359 So we will hopefully. 167 00:06:10,360 --> 00:06:11,549 Well, thank you. 
168 00:06:16,850 --> 00:06:18,979 So we ask the question, why, you know, 169 00:06:18,980 --> 00:06:21,079 not encrypt? Why do you not, you 170 00:06:21,080 --> 00:06:23,359 know, put encryption 171 00:06:23,360 --> 00:06:25,709 on your data before you upload it? 172 00:06:25,710 --> 00:06:27,799 Because the encryption will save 173 00:06:27,800 --> 00:06:30,049 you or will save us from the dark side, 174 00:06:30,050 --> 00:06:31,159 you know, which performs all this 175 00:06:31,160 --> 00:06:32,160 analysis. 176 00:06:32,830 --> 00:06:34,989 In our scenario that we 177 00:06:34,990 --> 00:06:36,149 have in mind right now, when 178 00:06:36,150 --> 00:06:38,229 presenting these slides, in 179 00:06:38,230 --> 00:06:40,359 our scenario, the user 180 00:06:40,360 --> 00:06:41,799 has some data, like, locally. 181 00:06:41,800 --> 00:06:44,319 Right. And the user uploads 182 00:06:44,320 --> 00:06:46,809 their data to a third party 183 00:06:46,810 --> 00:06:49,209 and then forgets about the data locally 184 00:06:49,210 --> 00:06:50,799 because, you know, you have uploaded your 185 00:06:50,800 --> 00:06:52,509 massive movie library and then you're 186 00:06:52,510 --> 00:06:54,609 going online with your mobile 187 00:06:54,610 --> 00:06:56,769 and you don't necessarily 188 00:06:56,770 --> 00:06:57,939 know about the full, 189 00:06:58,990 --> 00:07:01,989 well, the full data that you have online. 190 00:07:01,990 --> 00:07:05,289 Yet at a later point in time, you want 191 00:07:05,290 --> 00:07:06,369 to perform a search, right? 192 00:07:06,370 --> 00:07:07,899 You want to know which files you have, 193 00:07:07,900 --> 00:07:10,359 which contacts you have, without having 194 00:07:10,360 --> 00:07:13,359 necessarily seen that data previously 195 00:07:13,360 --> 00:07:15,369 on that device that you want to access 196 00:07:15,370 --> 00:07:17,170 your data with.
197 00:07:18,690 --> 00:07:21,299 What we will do is we will present 198 00:07:21,300 --> 00:07:23,489 certain schemes that allow that, 199 00:07:23,490 --> 00:07:26,429 allow you to perform these operations, 200 00:07:26,430 --> 00:07:28,649 but these well schemes 201 00:07:28,650 --> 00:07:31,739 or these technologies are not necessarily 202 00:07:31,740 --> 00:07:32,789 plug and play. Right. 203 00:07:32,790 --> 00:07:35,429 It's not that you could drop in, 204 00:07:35,430 --> 00:07:37,589 say, the scheme, and then you're done. 205 00:07:37,590 --> 00:07:38,790 You need to, you know, 206 00:07:39,810 --> 00:07:41,879 say manage keys and look 207 00:07:41,880 --> 00:07:43,979 at the rough edges off of cryptographic 208 00:07:43,980 --> 00:07:45,579 schemes in the science track, by the way. 209 00:07:45,580 --> 00:07:47,759 Right. So we are taking academic 210 00:07:47,760 --> 00:07:49,529 work and we're trying to make it 211 00:07:49,530 --> 00:07:52,049 practical and academic work. 212 00:07:52,050 --> 00:07:54,119 Well, sometimes it's 213 00:07:54,120 --> 00:07:56,919 complicated, say, to apply for real life. 214 00:07:56,920 --> 00:07:59,609 So just keep that in mind. 215 00:07:59,610 --> 00:08:00,959 When when listening to these. 216 00:08:00,960 --> 00:08:01,960 To these. 217 00:08:03,230 --> 00:08:04,230 Um. 218 00:08:05,790 --> 00:08:08,249 Yes, we as I said, we are 219 00:08:08,250 --> 00:08:10,529 trying to point you in 220 00:08:10,530 --> 00:08:12,539 a direction of trying to give you some 221 00:08:12,540 --> 00:08:14,699 inspiration as to what 222 00:08:14,700 --> 00:08:16,799 to look for when, well, either 223 00:08:16,800 --> 00:08:19,169 implementing encrypted 224 00:08:19,170 --> 00:08:21,089 search or when demanding encrypted search 225 00:08:21,090 --> 00:08:22,090 from your cloud provider. 
226 00:08:23,210 --> 00:08:25,309 So let me start with a very 227 00:08:25,310 --> 00:08:27,409 simple scheme, a 228 00:08:27,410 --> 00:08:29,719 very simple encryption scheme that allows 229 00:08:29,720 --> 00:08:32,058 you to perform an encrypted 230 00:08:32,059 --> 00:08:34,129 search. As an engineer, I'm 231 00:08:34,130 --> 00:08:36,529 trying to think about the most 232 00:08:36,530 --> 00:08:38,298 minimal solution, the easiest solution 233 00:08:38,299 --> 00:08:39,739 first and then try to build it up to make 234 00:08:39,740 --> 00:08:40,740 it better and better. 235 00:08:41,600 --> 00:08:44,178 So my engineering approach 236 00:08:44,179 --> 00:08:47,119 to that very simple encryption scheme 237 00:08:47,120 --> 00:08:49,220 is to just simply encrypt all the things. 238 00:08:50,930 --> 00:08:53,089 Sounds simple, right? It actually is. 239 00:08:53,090 --> 00:08:55,669 In the scheme, we have our plaintext data 240 00:08:55,670 --> 00:08:57,259 and we simply encrypt each and every 241 00:08:57,260 --> 00:08:59,419 entry of our database 242 00:08:59,420 --> 00:09:00,889 with a secure encryption scheme. 243 00:09:00,890 --> 00:09:02,989 And we upload that data to 244 00:09:02,990 --> 00:09:04,309 the third party. 245 00:09:07,490 --> 00:09:08,970 The right word on any any questions. 246 00:09:10,340 --> 00:09:12,409 So in a way, when you want to perform 247 00:09:12,410 --> 00:09:14,419 your operation, what you need to do is, 248 00:09:14,420 --> 00:09:15,859 well, you need to download all the things, 249 00:09:15,860 --> 00:09:18,619 right? So you need to 250 00:09:18,620 --> 00:09:21,019 ask the database on the Internet, 251 00:09:21,020 --> 00:09:23,929 on the cloud, to give you all the entries. 252 00:09:23,930 --> 00:09:26,389 Then you need to decrypt locally and 253 00:09:26,390 --> 00:09:28,759 then you have the plaintext and then 254 00:09:28,760 --> 00:09:29,989 you can perform whatever operation you 255 00:09:29,990 --> 00:09:30,990 want.
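The "encrypt all the things" baseline can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`keystream`, `encrypt`, `decrypt`, `upload`, `search`) are hypothetical, and the HMAC-SHA256 counter-mode keystream is a toy stand-in for whatever secure encryption scheme would really be used. The point is where the cost lies: a search has to fetch and decrypt every entry.

```python
import hashlib
import hmac
import os


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (HMAC-SHA256 in counter mode); for illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Randomized encryption: a fresh nonce per entry, prepended to the body."""
    nonce = os.urandom(16)
    ks = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ k for p, k in zip(plaintext, ks))


def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    ks = keystream(key, nonce, len(body))
    return bytes(c ^ k for c, k in zip(body, ks))


def upload(key: bytes, entries: list[str]) -> list[bytes]:
    """Encrypt every database entry; the returned list models the cloud."""
    return [encrypt(key, entry.encode()) for entry in entries]


def search(key: bytes, cloud: list[bytes], keyword: str) -> list[str]:
    """Download everything, decrypt locally, then filter in plaintext."""
    downloaded = list(cloud)  # the whole database crosses the network
    plaintexts = [decrypt(key, blob).decode() for blob in downloaded]
    return [entry for entry in plaintexts if keyword in entry]
```

The provider learns nothing beyond entry count and lengths, but the client pays bandwidth proportional to the whole database for every query.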
256 00:09:31,560 --> 00:09:33,179 This scheme, well, it's very simple, 257 00:09:33,180 --> 00:09:35,249 right? It's probably not 258 00:09:35,250 --> 00:09:36,539 what we want. 259 00:09:36,540 --> 00:09:37,949 Why? Because, well, if you're on your 260 00:09:37,950 --> 00:09:39,929 mobile, you don't want to download 261 00:09:39,930 --> 00:09:41,729 your three terabyte movie library first 262 00:09:41,730 --> 00:09:43,439 to find the file. 263 00:09:43,440 --> 00:09:45,659 And so this is not necessarily, 264 00:09:45,660 --> 00:09:47,309 well, a good scheme for us. 265 00:09:47,310 --> 00:09:48,719 So we are trying to build up, 266 00:09:49,740 --> 00:09:51,929 well, solutions which perform 267 00:09:51,930 --> 00:09:53,219 better in that regard. 268 00:09:53,220 --> 00:09:55,049 However, I just want to point out that 269 00:09:55,050 --> 00:09:57,209 this, as far as I'm aware, is being 270 00:09:57,210 --> 00:09:59,339 implemented in commercial products 271 00:09:59,340 --> 00:10:01,169 that you could buy. So as far as I'm 272 00:10:01,170 --> 00:10:03,509 aware, proxy products that, 273 00:10:03,510 --> 00:10:05,789 you know, are sold as 274 00:10:05,790 --> 00:10:07,769 an appliance that you put between you and 275 00:10:07,770 --> 00:10:10,259 Gmail, as far as I'm aware, 276 00:10:10,260 --> 00:10:12,629 they will do exactly that. 277 00:10:12,630 --> 00:10:14,999 So the question now, 278 00:10:15,000 --> 00:10:17,369 being the engineer that I am, 279 00:10:17,370 --> 00:10:19,139 is: can we do better? 280 00:10:20,330 --> 00:10:22,509 And that question I ask Christian: 281 00:10:22,510 --> 00:10:24,079 Christian, can we do better? 282 00:10:24,080 --> 00:10:25,459 Yes, we can do better. 283 00:10:25,460 --> 00:10:27,389 Yes we can. Sure we can. 284 00:10:27,390 --> 00:10:28,610 Yeah, let's do it.
285 00:10:33,630 --> 00:10:35,729 OK, we want that 286 00:10:35,730 --> 00:10:38,309 the cloud, uh, 287 00:10:38,310 --> 00:10:40,709 performs the search for us, because 288 00:10:40,710 --> 00:10:43,679 this is what looks like a cloud service, and 289 00:10:43,680 --> 00:10:45,959 I'm not willing to perform the search 290 00:10:45,960 --> 00:10:48,179 on my computer, especially 291 00:10:48,180 --> 00:10:51,179 when I have big data stuff on the cloud. 292 00:10:51,180 --> 00:10:53,609 And, um, searching on encrypted 293 00:10:53,610 --> 00:10:56,949 data sounds a little bit like magic. 294 00:10:56,950 --> 00:10:58,929 But in fact, it's so simple. 295 00:10:58,930 --> 00:11:01,059 Yeah, I show a very light scheme, 296 00:11:01,060 --> 00:11:03,340 a simple scheme, for how to perform 297 00:11:04,600 --> 00:11:06,699 a search on the 298 00:11:06,700 --> 00:11:09,009 encrypted data, 299 00:11:09,010 --> 00:11:10,689 just by using a deterministic encryption 300 00:11:10,690 --> 00:11:11,690 scheme. 301 00:11:12,100 --> 00:11:14,589 OK, deterministic encryption 302 00:11:14,590 --> 00:11:17,769 means same plaintext, 303 00:11:17,770 --> 00:11:20,229 same ciphertext. So if we have two identical 304 00:11:20,230 --> 00:11:22,329 plaintexts, then we will have 305 00:11:22,330 --> 00:11:24,759 two identical ciphertexts. 306 00:11:24,760 --> 00:11:26,529 Super easy. 307 00:11:26,530 --> 00:11:30,069 We can use such a scheme 308 00:11:30,070 --> 00:11:32,139 to encrypt all our keywords 309 00:11:32,140 --> 00:11:34,329 one by one, and then we have 310 00:11:34,330 --> 00:11:36,109 a ciphertext collection. 311 00:11:36,110 --> 00:11:37,110 OK, that's fine. 312 00:11:38,140 --> 00:11:40,989 And now, uh, now, 313 00:11:40,990 --> 00:11:43,659 uh, it's, uh, it's not working, uh, 314 00:11:43,660 --> 00:11:46,089 it's broken or maybe it's 315 00:11:46,090 --> 00:11:47,249 not that. 316 00:11:47,250 --> 00:11:49,689 Oh, well, 317 00:11:49,690 --> 00:11:50,690 hang on.
318 00:11:52,040 --> 00:11:54,259 So shall I go through 319 00:11:54,260 --> 00:11:56,029 this again to anybody? 320 00:11:56,030 --> 00:11:57,710 Did anybody not understand anything so. 321 00:11:59,060 --> 00:12:00,060 Sorry about that. 322 00:12:01,770 --> 00:12:02,770 We are. 323 00:12:03,920 --> 00:12:04,920 Here, right? 324 00:12:06,140 --> 00:12:07,140 Oh, that's wrong. 325 00:12:08,540 --> 00:12:09,739 It's not just here. 326 00:12:10,820 --> 00:12:11,820 OK, sorry about that. 327 00:12:15,490 --> 00:12:17,379 I'm fed up with crypto, with computers. 328 00:12:21,870 --> 00:12:23,009 OK, I will. 329 00:12:23,010 --> 00:12:25,139 OK, that's a two button interface right 330 00:12:25,140 --> 00:12:26,149 there. 331 00:12:26,150 --> 00:12:27,150 Yes. 332 00:12:29,160 --> 00:12:30,160 Hey, it's a 333 00:12:31,600 --> 00:12:32,759 oh, we should have practiced that 334 00:12:32,760 --> 00:12:33,760 beforehand. Yeah, yeah. 335 00:12:35,250 --> 00:12:36,479 I don't think this computer thing will 336 00:12:36,480 --> 00:12:38,339 ever, you know, we're going to succeed if 337 00:12:38,340 --> 00:12:39,359 it's too complicated, 338 00:12:41,040 --> 00:12:42,040 right. 339 00:12:42,900 --> 00:12:43,900 Yeah. 340 00:12:44,880 --> 00:12:46,229 Well, OK. 341 00:12:46,230 --> 00:12:47,969 Oh yeah. I've received a crash report. 342 00:12:47,970 --> 00:12:49,599 All right. That's that's handy. 343 00:12:49,600 --> 00:12:50,699 Oh yeah. 344 00:12:50,700 --> 00:12:52,799 Please stop off on the hacks right now. 345 00:12:52,800 --> 00:12:55,709 Uh, OK, that's horrible. 346 00:12:55,710 --> 00:12:57,590 But you're going to do it anyway. 347 00:13:00,120 --> 00:13:01,120 Let's get the slide, you 348 00:13:02,790 --> 00:13:04,049 don't press the button. OK, OK, 349 00:13:06,750 --> 00:13:08,039 maybe I press the button just. 350 00:13:09,780 --> 00:13:10,780 Oh. 351 00:13:11,780 --> 00:13:12,780 Isiah's. 
352 00:13:13,540 --> 00:13:15,909 Next, the next 353 00:13:15,910 --> 00:13:16,869 right. 354 00:13:16,870 --> 00:13:18,460 It's easy to say, but it 355 00:13:21,090 --> 00:13:22,929 crashes on that very slide, that's 356 00:13:22,930 --> 00:13:23,930 terrible. 357 00:13:25,430 --> 00:13:27,710 OK, can you tell a joke? 358 00:13:30,320 --> 00:13:32,509 Let me just have this finish the pre 359 00:13:32,510 --> 00:13:33,510 rendering. 360 00:13:35,280 --> 00:13:37,109 Now, it's not that we would have never, 361 00:13:37,110 --> 00:13:38,699 you know, gone through the slides with 362 00:13:38,700 --> 00:13:40,889 that very program, and of course, a random 363 00:13:40,890 --> 00:13:41,890 problem right now. 364 00:13:42,560 --> 00:13:44,139 OK. 365 00:13:44,140 --> 00:13:46,629 Oh, there you go. OK, now it's. 366 00:13:46,630 --> 00:13:47,799 I'm not skipping through this, like I'm 367 00:13:47,800 --> 00:13:48,800 opening it directly. 368 00:13:49,630 --> 00:13:50,630 No, I'm not. 369 00:13:52,030 --> 00:13:53,919 OK, I have a backup plan. 370 00:13:53,920 --> 00:13:56,019 Don't don't worry, it's all 371 00:13:56,020 --> 00:13:58,119 right because, uh, as an 372 00:13:58,120 --> 00:13:59,169 OK from. 373 00:14:01,100 --> 00:14:02,100 Now we are 374 00:14:04,730 --> 00:14:05,989 listening to the latest source would 375 00:14:05,990 --> 00:14:07,070 probably be better in your ear right 376 00:14:08,180 --> 00:14:09,180 now. Yeah. 377 00:14:15,650 --> 00:14:18,889 Finally, OK, 378 00:14:18,890 --> 00:14:21,499 that's deterministic encryption once again: 379 00:14:21,500 --> 00:14:24,229 same plaintext, 380 00:14:24,230 --> 00:14:25,609 same ciphertext. 381 00:14:25,610 --> 00:14:27,439 So it's quite easy to search for a 382 00:14:27,440 --> 00:14:29,539 match. On the other side, 383 00:14:29,540 --> 00:14:31,639 just encrypt the keywords we're looking 384 00:14:31,640 --> 00:14:32,640 for.
385 00:14:33,050 --> 00:14:35,059 And then for each ciphertext in the 386 00:14:35,060 --> 00:14:37,309 collection, uh, 387 00:14:37,310 --> 00:14:39,379 you just check if 388 00:14:39,380 --> 00:14:41,089 it's a match or not, and when there's a 389 00:14:41,090 --> 00:14:42,289 match, it's fine. 390 00:14:42,290 --> 00:14:44,379 Then the keyword is part of 391 00:14:44,380 --> 00:14:46,339 the ciphertext collection. 392 00:14:46,340 --> 00:14:48,499 This is 393 00:14:48,500 --> 00:14:50,559 indeed a little bit too easy. 394 00:14:50,560 --> 00:14:53,769 Uh, OK, let's try it one more time. 395 00:14:53,770 --> 00:14:55,449 Uh huh. 396 00:14:55,450 --> 00:14:57,579 So, OK, there must be 397 00:14:57,580 --> 00:15:00,659 a catch. Yeah, there is 398 00:15:00,660 --> 00:15:03,009 a serious problem, because 399 00:15:03,010 --> 00:15:05,049 deterministic encryption, it's not 400 00:15:05,050 --> 00:15:06,309 really secure. 401 00:15:06,310 --> 00:15:08,799 You cannot consider this as secure 402 00:15:08,800 --> 00:15:11,259 because you all know the, uh, example 403 00:15:11,260 --> 00:15:12,159 of the penguin. Yeah. 404 00:15:12,160 --> 00:15:14,309 I mean, the form 405 00:15:14,310 --> 00:15:15,549 still exists. 406 00:15:15,550 --> 00:15:17,919 And it only 407 00:15:17,920 --> 00:15:19,989 partially, uh, hides the 408 00:15:19,990 --> 00:15:21,159 plaintext. 409 00:15:21,160 --> 00:15:22,160 And so, 410 00:15:23,350 --> 00:15:24,489 can we do better? 411 00:15:24,490 --> 00:15:26,409 Yeah. Because this is not what we want. 412 00:15:26,410 --> 00:15:28,509 Uh, in the end, we want 413 00:15:28,510 --> 00:15:30,519 to hide our data. 414 00:15:30,520 --> 00:15:32,289 And do it well, very well.
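As a rough Python sketch of this deterministic keyword search: the names `det_token`, `build_collection` and `server_search` are hypothetical, and HMAC-SHA256 is used here as a stand-in for the deterministic cipher, which is an assumption of this sketch. It is enough for illustration because the server only ever tests ciphertexts for equality, but unlike a real cipher the tokens cannot be decrypted.

```python
import hashlib
import hmac


def det_token(key: bytes, keyword: str) -> bytes:
    """Deterministic stand-in: the same keyword always gives the same token."""
    return hmac.new(key, keyword.encode(), hashlib.sha256).digest()


def build_collection(key: bytes, keywords: list[str]) -> list[bytes]:
    """Client side: encrypt the keywords one by one before uploading."""
    return [det_token(key, word) for word in keywords]


def server_search(collection: list[bytes], query_token: bytes) -> list[int]:
    """Server side: plain equality test against every stored ciphertext."""
    return [
        position
        for position, stored in enumerate(collection)
        if hmac.compare_digest(stored, query_token)
    ]
```

The catch mentioned in the talk is visible directly: two uploads of the same keyword are byte-identical, so the server can see repetitions without ever receiving a query.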
415 00:15:32,290 --> 00:15:34,359 OK, and now there is an 416 00:15:34,360 --> 00:15:36,669 idea from Song, Wagner 417 00:15:37,780 --> 00:15:40,459 and Perrig from 2001, 418 00:15:40,460 --> 00:15:42,699 and it works like a charm. 419 00:15:42,700 --> 00:15:44,709 You just use 420 00:15:44,710 --> 00:15:46,869 the deterministic encryption scheme, and 421 00:15:46,870 --> 00:15:49,139 then, you know, OK, this is not really secure, 422 00:15:49,140 --> 00:15:51,339 we have to fix something, and 423 00:15:51,340 --> 00:15:53,019 there is a fixing step. 424 00:15:53,020 --> 00:15:55,239 And what we do is just, we 425 00:15:55,240 --> 00:15:57,579 XOR a mask. A mask means a random bit string. 426 00:15:57,580 --> 00:15:58,580 Yeah. If I XOR 427 00:15:59,800 --> 00:16:01,579 a random string, I have something like a 428 00:16:01,580 --> 00:16:03,209 one-time pad, and we all know a one-time 429 00:16:03,210 --> 00:16:05,259 pad works like a charm, it's secure. 430 00:16:05,260 --> 00:16:07,429 OK, if you 431 00:16:07,430 --> 00:16:08,719 don't reuse the mask, or are using a 432 00:16:08,720 --> 00:16:09,919 stream cipher, whatever. 433 00:16:09,920 --> 00:16:10,970 It's not a big problem. 434 00:16:12,390 --> 00:16:14,609 OK, but then again, 435 00:16:14,610 --> 00:16:16,859 we have the problem: if 436 00:16:16,860 --> 00:16:18,870 it's secure encryption, how can we, 437 00:16:19,920 --> 00:16:21,989 uh, perform a search on 438 00:16:21,990 --> 00:16:22,889 the encrypted data? 439 00:16:22,890 --> 00:16:24,689 OK, and this is about the point where we 440 00:16:24,690 --> 00:16:25,649 lose the audience. 441 00:16:25,650 --> 00:16:27,989 Right. So are we, uh, losing 442 00:16:27,990 --> 00:16:28,939 the audience? 443 00:16:28,940 --> 00:16:30,089 We're losing the audience because now 444 00:16:30,090 --> 00:16:31,349 it's getting complicated. 445 00:16:31,350 --> 00:16:33,539 Yeah, OK, OK.
446 00:16:33,540 --> 00:16:35,699 But this means now we 447 00:16:35,700 --> 00:16:37,139 need some magic. 448 00:16:37,140 --> 00:16:39,389 Yeah. And we cannot use just any 449 00:16:39,390 --> 00:16:41,519 mask. We need a magic mask 450 00:16:41,520 --> 00:16:43,739 to perform this kind 451 00:16:43,740 --> 00:16:45,459 of encryption. 452 00:16:45,460 --> 00:16:47,769 OK, and 453 00:16:47,770 --> 00:16:50,049 OK, let's see how we can craft 454 00:16:50,050 --> 00:16:52,149 such a mask. OK, we 455 00:16:52,150 --> 00:16:54,099 divide the mask into the left side and the 456 00:16:54,100 --> 00:16:56,319 right side. For the left 457 00:16:56,320 --> 00:16:58,629 side, we are using random bits. 458 00:16:58,630 --> 00:17:00,789 Yeah, we can generate random bits 459 00:17:00,790 --> 00:17:03,489 once again by using a 460 00:17:03,490 --> 00:17:05,469 stream cipher. Yeah, a boring stream 461 00:17:05,470 --> 00:17:07,299 cipher, but a boring and secure 462 00:17:07,300 --> 00:17:08,229 stream cipher. 463 00:17:08,230 --> 00:17:09,230 No big deal. 464 00:17:10,329 --> 00:17:12,309 Then we perform the deterministic 465 00:17:12,310 --> 00:17:14,710 encryption on our keywords. 466 00:17:15,740 --> 00:17:17,779 And then we divide this encrypted keyword into the 467 00:17:17,780 --> 00:17:19,939 left part and the right part, 468 00:17:19,940 --> 00:17:22,068 and from the left part, we derive 469 00:17:22,069 --> 00:17:24,139 a search key k_i, 470 00:17:24,140 --> 00:17:25,550 and we can do this by 471 00:17:27,200 --> 00:17:29,869 using a keyed hash function. 472 00:17:29,870 --> 00:17:32,449 So, as a keyed hash function, 473 00:17:32,450 --> 00:17:33,689 one can use HMAC. 474 00:17:33,690 --> 00:17:35,569 Yeah, you all know HMAC, you can 475 00:17:35,570 --> 00:17:37,009 use it here.
476 00:17:37,010 --> 00:17:39,949 You just hash the left part 477 00:17:39,950 --> 00:17:42,109 of our search query using HMAC 478 00:17:42,110 --> 00:17:44,539 and then we have our search key. 479 00:17:44,540 --> 00:17:47,059 Yeah. And then with the left side 480 00:17:47,060 --> 00:17:49,279 of our mask and the search key 481 00:17:49,280 --> 00:17:51,409 we craft the right side of 482 00:17:51,410 --> 00:17:53,419 our mask. We derive the right side, 483 00:17:53,420 --> 00:17:55,519 the left side 484 00:17:55,520 --> 00:17:57,649 of our mask is S_i, and the right, like 485 00:17:57,650 --> 00:17:59,929 you see. And the trick is: from 486 00:17:59,930 --> 00:18:01,639 the deterministic encryption, from 487 00:18:01,640 --> 00:18:03,859 our interim ciphertext, and the left side 488 00:18:03,860 --> 00:18:05,989 of the mask, we can compute the right 489 00:18:05,990 --> 00:18:07,519 side of the mask. 490 00:18:07,520 --> 00:18:10,009 And this is handy, 491 00:18:10,010 --> 00:18:12,079 OK? And now we exploit 492 00:18:12,080 --> 00:18:14,149 this to perform, 493 00:18:14,150 --> 00:18:15,859 uh, search on encrypted data. 494 00:18:17,530 --> 00:18:19,779 Yeah, the basic idea, once again: 495 00:18:19,780 --> 00:18:22,119 we can compute the right side of our mask 496 00:18:22,120 --> 00:18:23,120 from the left side. 497 00:18:24,270 --> 00:18:26,519 This is the basic idea, and now 498 00:18:26,520 --> 00:18:29,069 let's watch our, uh, magic mask. 499 00:18:29,070 --> 00:18:30,979 Let's do the magic. 500 00:18:30,980 --> 00:18:33,109 Uh, OK, first of all, 501 00:18:33,110 --> 00:18:35,359 we have to upload our stuff. For 502 00:18:35,360 --> 00:18:36,360 each 503 00:18:37,280 --> 00:18:38,569 plaintext, 504 00:18:38,570 --> 00:18:41,629 we have to first perform the encryption, 505 00:18:41,630 --> 00:18:42,630 then.
506 00:18:43,760 --> 00:18:46,129 We have to XOR 507 00:18:46,130 --> 00:18:48,819 the result with our magic mask. 508 00:18:48,820 --> 00:18:50,959 And then we upload it to the cloud, to 509 00:18:50,960 --> 00:18:52,349 a server. 510 00:18:52,350 --> 00:18:53,350 So far, so good. 511 00:18:54,490 --> 00:18:56,589 And now, imagine 512 00:18:56,590 --> 00:18:58,629 that we want to search at some point in 513 00:18:58,630 --> 00:19:00,759 time. We have to form a search 514 00:19:00,760 --> 00:19:02,979 query, and this means 515 00:19:02,980 --> 00:19:05,049 such a search 516 00:19:05,050 --> 00:19:07,749 query consists of the deterministic 517 00:19:07,750 --> 00:19:09,969 encryption of our keyword 518 00:19:09,970 --> 00:19:12,309 and the search key. So, 519 00:19:12,310 --> 00:19:13,900 each time when we perform 520 00:19:15,160 --> 00:19:17,379 such a query, we upload the 521 00:19:17,380 --> 00:19:19,539 interim ciphertext and 522 00:19:19,540 --> 00:19:22,389 the search key, OK, and then 523 00:19:22,390 --> 00:19:25,329 the cloud can test, 524 00:19:25,330 --> 00:19:27,699 for each and every 525 00:19:27,700 --> 00:19:30,309 ciphertext from our ciphertext collection, 526 00:19:30,310 --> 00:19:32,619 if the XOR with our 527 00:19:32,620 --> 00:19:35,199 deterministic ciphertext 528 00:19:35,200 --> 00:19:37,599 is a magic mask, and if so, 529 00:19:37,600 --> 00:19:39,699 if this turns out to be a magic mask, 530 00:19:39,700 --> 00:19:41,799 we have a hit, a match. 531 00:19:41,800 --> 00:19:44,199 And then we can figure out if 532 00:19:44,200 --> 00:19:46,459 this is part of 533 00:19:46,460 --> 00:19:48,009 our ciphertext collection or not. 534 00:19:48,010 --> 00:19:50,109 So now this enables us to search over 535 00:19:50,110 --> 00:19:51,009 encrypted data. 536 00:19:51,010 --> 00:19:52,449 That's beautiful. 537 00:19:52,450 --> 00:19:53,450 And it's fine. 538 00:19:55,000 --> 00:19:55,929 The talk?
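The magic-mask construction and the server-side test described above can be sketched in Python as follows. This is a simplified illustration, not the scheme as published: all function names are hypothetical, HMAC-SHA256 stands in for the deterministic cipher, the stream cipher and the keyed hash alike, and the 16-byte halves are an arbitrary choice. Because the deterministic step here is a hash rather than a cipher, this sketch supports searching but not decryption.

```python
import hashlib
import hmac

HALF = 16  # bytes in each half of a 32-byte word (assumed block layout)


def prf(key: bytes, message: bytes) -> bytes:
    """Keyed hash (HMAC-SHA256) used for every keyed primitive in this sketch."""
    return hmac.new(key, message, hashlib.sha256).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def det_encrypt(k_det: bytes, keyword: str) -> bytes:
    """Interim ciphertext X: deterministic, 32 bytes (stand-in, not invertible)."""
    return prf(k_det, keyword.encode())


def search_key(k_key: bytes, x: bytes) -> bytes:
    """Search key k_i derived from the left part of the interim ciphertext."""
    return prf(k_key, x[:HALF])


def magic_mask(k_stream: bytes, k_i: bytes, position: int) -> bytes:
    """Left side S_i: pseudo-random bits; right side: keyed hash of S_i under k_i."""
    s_i = prf(k_stream, position.to_bytes(8, "big"))[:HALF]
    return s_i + prf(k_i, s_i)[:HALF]


def encrypt_keywords(keys: tuple[bytes, bytes, bytes], keywords: list[str]) -> list[bytes]:
    """Client side: XOR each interim ciphertext with its magic mask, then upload."""
    k_det, k_key, k_stream = keys
    collection = []
    for position, word in enumerate(keywords):
        x = det_encrypt(k_det, word)
        mask = magic_mask(k_stream, search_key(k_key, x), position)
        collection.append(xor(x, mask))
    return collection


def make_query(keys: tuple[bytes, bytes, bytes], keyword: str) -> tuple[bytes, bytes]:
    """A search query is the interim ciphertext plus the search key."""
    k_det, k_key, _ = keys
    x = det_encrypt(k_det, keyword)
    return x, search_key(k_key, x)


def server_search(collection: list[bytes], query: tuple[bytes, bytes]) -> list[int]:
    """Server side: XOR with X and check whether the result is a valid magic mask."""
    x, k_i = query
    hits = []
    for position, ciphertext in enumerate(collection):
        candidate = xor(ciphertext, x)
        left, right = candidate[:HALF], candidate[HALF:]
        if hmac.compare_digest(prf(k_i, left)[:HALF], right):
            hits.append(position)
    return hits
```

Unlike the purely deterministic variant, two stored ciphertexts of the same keyword differ here because each position gets a fresh left half. The server only learns which positions match once it receives a query for that keyword.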
539 00:19:55,930 --> 00:19:56,859 It's done. 540 00:19:56,860 --> 00:19:58,239 Uh, not really. 541 00:19:58,240 --> 00:19:59,139 There is a problem. 542 00:19:59,140 --> 00:20:00,809 With every blessing comes a curse. 543 00:20:00,810 --> 00:20:01,989 Yeah. 544 00:20:01,990 --> 00:20:04,689 And here we have two curses. 545 00:20:04,690 --> 00:20:06,439 The first curse: 546 00:20:06,440 --> 00:20:08,539 the scheme is not, uh, 547 00:20:08,540 --> 00:20:10,539 super secure. It's vulnerable to 548 00:20:10,540 --> 00:20:11,790 statistical analysis. 549 00:20:13,200 --> 00:20:15,359 What does it mean? OK, if 550 00:20:15,360 --> 00:20:17,819 I know maybe my target, my victim, 551 00:20:19,170 --> 00:20:21,269 is someone from the Wikipedia 552 00:20:21,270 --> 00:20:24,429 community, and the stuff 553 00:20:24,430 --> 00:20:27,029 my target is uploading, my victim, 554 00:20:27,030 --> 00:20:29,639 maybe is connected to Wikipedia, 555 00:20:29,640 --> 00:20:32,159 then I can make some 556 00:20:32,160 --> 00:20:33,929 estimation about the search pattern. 557 00:20:33,930 --> 00:20:36,359 I can make some guesses, wild guesses. 558 00:20:36,360 --> 00:20:38,759 And this means I guess the 559 00:20:38,760 --> 00:20:39,760 frequency 560 00:20:40,940 --> 00:20:42,519 of a search term, yeah. 561 00:20:43,910 --> 00:20:46,759 And then here a large 562 00:20:46,760 --> 00:20:48,680 size means high frequency 563 00:20:49,750 --> 00:20:51,909 and small 564 00:20:51,910 --> 00:20:54,279 size means low frequency, so I 565 00:20:54,280 --> 00:20:55,269 can make an estimation. 566 00:20:55,270 --> 00:20:57,309 And then what next? 567 00:20:57,310 --> 00:20:58,809 I monitor the search behavior. 568 00:20:58,810 --> 00:20:59,810 Yeah. 569 00:21:00,680 --> 00:21:02,679 And this means I'm monitoring the 570 00:21:02,680 --> 00:21:03,829 current search queries. 571 00:21:03,830 --> 00:21:06,200 Yeah, and then after a while.
572 00:21:07,320 --> 00:21:09,659 I can compare 573 00:21:09,660 --> 00:21:12,069 my guess with what I've 574 00:21:12,070 --> 00:21:14,129 monitored and then I 575 00:21:14,130 --> 00:21:17,909 see, oh, this guy looked a lot for, uh, 576 00:21:17,910 --> 00:21:20,639 uh, zero x 577 00:21:20,640 --> 00:21:23,249 nine for, uh, 578 00:21:23,250 --> 00:21:25,319 and on you seven, blah, blah, blah. 579 00:21:25,320 --> 00:21:27,749 Oh, there's a good chance that this, 580 00:21:27,750 --> 00:21:29,759 uh, that he searched for... 581 00:21:29,760 --> 00:21:31,979 This might be the ciphertext 582 00:21:31,980 --> 00:21:34,319 for, the ciphertext 583 00:21:34,320 --> 00:21:36,239 for Wikipedia or Foundation or 584 00:21:36,240 --> 00:21:38,249 so on. 585 00:21:38,250 --> 00:21:40,619 Yeah. And my target 586 00:21:40,620 --> 00:21:42,989 only looked twice for the small 587 00:21:42,990 --> 00:21:45,449 ciphertext, our x uh, the, 588 00:21:45,450 --> 00:21:47,539 uh, eight felines on. 589 00:21:47,540 --> 00:21:49,709 So this, uh, must 590 00:21:49,710 --> 00:21:51,899 be one of the smaller 591 00:21:51,900 --> 00:21:54,449 ones here on the right side, uh, like, 592 00:21:54,450 --> 00:21:56,999 uh, I've got at least license 593 00:21:57,000 --> 00:21:58,619 or whatever. 594 00:21:58,620 --> 00:22:00,299 And this is how I can partially 595 00:22:00,300 --> 00:22:03,059 reconstruct the ciphertext. 596 00:22:03,060 --> 00:22:05,429 So, yeah, this is the problem. 597 00:22:05,430 --> 00:22:07,319 If you perform a lot of 598 00:22:07,320 --> 00:22:09,659 search queries, an observer can 599 00:22:09,660 --> 00:22:11,549 decrypt parts of the ciphertext. 600 00:22:12,660 --> 00:22:14,999 And this can 601 00:22:15,000 --> 00:22:17,279 become a problem.
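The frequency attack described above can be sketched in a few lines. This is only an illustration under assumptions: the function name `guess_keywords`, the token strings and the guessed keyword ranking are made up for the example and do not come from the speakers' implementation.

```python
from collections import Counter


def guess_keywords(observed_tokens, expected_ranking):
    """Pair observed search tokens with guessed keywords by frequency rank.

    observed_tokens: deterministic search tokens seen by the server, one per query.
    expected_ranking: the attacker's guess of keywords, most frequent first.
    Both names are hypothetical and only serve this illustration.
    """
    # Equal keywords always produce equal tokens, so counting tokens
    # reveals how often each (still unknown) keyword was searched.
    by_frequency = [token for token, _ in Counter(observed_tokens).most_common()]
    # Match the most frequent token with the keyword guessed to be most
    # frequent, the second with the second, and so on.
    return dict(zip(by_frequency, expected_ranking))
```

The guess is only as good as the attacker's prior knowledge of the victim, which is why the Wikipedia example in the talk matters: the better the expected ranking, the more tokens are matched correctly.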
602 00:22:17,280 --> 00:22:19,379 OK, and then there is the next 603 00:22:19,380 --> 00:22:21,719 problem: speed. 604 00:22:21,720 --> 00:22:22,720 OK, this is, 605 00:22:23,820 --> 00:22:25,229 uh, symmetric encryption. 606 00:22:25,230 --> 00:22:27,319 It's not as bad as using fully 607 00:22:27,320 --> 00:22:29,489 homomorphic encryption, either. 608 00:22:29,490 --> 00:22:31,559 And OK, first 609 00:22:31,560 --> 00:22:33,809 of all, I 610 00:22:33,810 --> 00:22:36,119 implemented this stuff and, 611 00:22:36,120 --> 00:22:37,499 uh, these are some performance 612 00:22:37,500 --> 00:22:38,399 benchmarks. 613 00:22:38,400 --> 00:22:40,469 And first of all, you see the ciphertext 614 00:22:40,470 --> 00:22:42,479 is about six times larger than the 615 00:22:42,480 --> 00:22:43,859 plaintext. 616 00:22:43,860 --> 00:22:44,860 What happened? 617 00:22:46,640 --> 00:22:49,019 I padded each word 618 00:22:49,020 --> 00:22:51,649 to 32 bytes before 619 00:22:51,650 --> 00:22:54,049 I encrypted it. This padding is 620 00:22:54,050 --> 00:22:56,779 crucial because when you 621 00:22:56,780 --> 00:22:59,179 encrypt, uh, natural 622 00:22:59,180 --> 00:23:01,370 language stuff, words, 623 00:23:02,510 --> 00:23:04,669 uh, you can see the length, and 624 00:23:04,670 --> 00:23:06,259 then you reveal a lot of information. 625 00:23:06,260 --> 00:23:08,209 Yeah. If I encrypt, 626 00:23:08,210 --> 00:23:10,339 uh, "yes" and "no", 627 00:23:10,340 --> 00:23:12,359 and then I look, without padding, 628 00:23:12,360 --> 00:23:13,579 at the length of the ciphertexts: 629 00:23:13,580 --> 00:23:15,319 the ciphertext of "yes" 630 00:23:15,320 --> 00:23:17,449 should be longer than the ciphertext of 631 00:23:17,450 --> 00:23:19,129 "no". And then, from the ciphertext, 632 00:23:19,130 --> 00:23:20,659 from the length of the 633 00:23:20,660 --> 00:23:23,149 ciphertext, I can learn a lot 634 00:23:23,150 --> 00:23:25,309 about the plaintext.
And this is why I just 635 00:23:25,310 --> 00:23:27,859 padded each word to 32 bytes 636 00:23:27,860 --> 00:23:29,299 and then performed the encryption. 637 00:23:29,300 --> 00:23:31,789 OK, just have OK. 638 00:23:31,790 --> 00:23:33,859 This is a factor that depends on what 639 00:23:33,860 --> 00:23:34,759 you encrypt. 640 00:23:34,760 --> 00:23:36,889 OK. The other thing is, uh, time to 641 00:23:36,890 --> 00:23:39,439 encrypt. OK, this is quite fast because 642 00:23:39,440 --> 00:23:41,969 I used AES 643 00:23:41,970 --> 00:23:44,249 with the AES-NI instruction 644 00:23:44,250 --> 00:23:47,749 stuff from Intel, so it's quite fast. 645 00:23:47,750 --> 00:23:49,909 And this is only 646 00:23:49,910 --> 00:23:51,629 on a single machine, on a normal 647 00:23:51,630 --> 00:23:53,709 regular notebook, on a single 648 00:23:53,710 --> 00:23:55,460 core. It's OK. 649 00:23:56,510 --> 00:23:57,589 And the search: 650 00:23:57,590 --> 00:23:59,809 yeah, OK, that might be a problem 651 00:23:59,810 --> 00:24:01,739 because you need linear time. 652 00:24:01,740 --> 00:24:04,039 Yeah. For each and every search 653 00:24:04,040 --> 00:24:06,439 query, I have to go to each entry 654 00:24:06,440 --> 00:24:08,689 of my ciphertext collection and check if 655 00:24:08,690 --> 00:24:10,219 it's a match or not. 656 00:24:10,220 --> 00:24:12,499 And when I do big data 657 00:24:12,500 --> 00:24:14,659 or utilites data you'd say that this 658 00:24:14,660 --> 00:24:16,849 means, uh, I have to wait 659 00:24:16,850 --> 00:24:18,679 a couple of minutes. 660 00:24:18,680 --> 00:24:20,869 It's quite good if you like to drink 661 00:24:20,870 --> 00:24:22,579 coffee a lot and you can make coffee 662 00:24:22,580 --> 00:24:24,709 breaks. But 663 00:24:24,710 --> 00:24:26,629 this is not what you really want, to wait 664 00:24:26,630 --> 00:24:28,099 a couple of minutes before you have the 665 00:24:28,100 --> 00:24:29,100 search results.
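The masking scheme and its linear-time search, as walked through above, might look roughly like the following sketch. Several things here are assumptions and not taken from the talk: all function and parameter names are hypothetical, HMAC-SHA256 serves as the keyed hash, and, because the Python standard library has no AES, a keyed hash also stands in for the deterministic encryption step. That stand-in is one-way, so the sketch covers upload and search only, not decryption.

```python
import hashlib
import hmac
import secrets

HALF = 16  # bytes in each half of a 32-byte word block


def prf(key: bytes, data: bytes) -> bytes:
    """Keyed hash (HMAC-SHA256) truncated to one half block."""
    return hmac.new(key, data, hashlib.sha256).digest()[:HALF]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def interim_ciphertext(enc_key: bytes, word: str) -> bytes:
    # ASSUMPTION: stand-in for the deterministic encryption of the padded
    # word (the talk uses AES). Same word and key give the same 32 bytes.
    return hmac.new(enc_key, word.encode("utf-8"), hashlib.sha256).digest()


def search_key(hash_key: bytes, interim: bytes) -> bytes:
    # The search key is a keyed hash of the LEFT part of the interim ciphertext.
    return prf(hash_key, interim[:HALF])


def encrypt_word(enc_key: bytes, hash_key: bytes, word: str) -> bytes:
    interim = interim_ciphertext(enc_key, word)
    left = secrets.token_bytes(HALF)                  # random left side of the mask
    right = prf(search_key(hash_key, interim), left)  # right side derived from the left
    return xor(interim, left + right)                 # XOR with the "magic mask"


def make_query(enc_key: bytes, hash_key: bytes, word: str):
    interim = interim_ciphertext(enc_key, word)
    return interim, search_key(hash_key, interim)


def server_search(collection, query):
    """Linear scan: test every stored ciphertext against the query."""
    interim, key = query
    hits = []
    for position, ciphertext in enumerate(collection):
        candidate = xor(ciphertext, interim)          # a valid mask if the word matches
        left, right = candidate[:HALF], candidate[HALF:]
        if hmac.compare_digest(right, prf(key, left)):
            hits.append(position)
    return hits
```

Because the left side of the mask is random, two uploads of the same word give different ciphertexts, yet the server can still recognise a match once it holds the query. The loop in `server_search` is the linear-time cost mentioned above: every entry is touched for every query.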
666 00:24:31,420 --> 00:24:32,529 What can we do better? 667 00:24:32,530 --> 00:24:34,659 Can we optimize 668 00:24:34,660 --> 00:24:37,269 the search time? And then, 669 00:24:37,270 --> 00:24:39,519 what we can do, OK, we can look at 670 00:24:39,520 --> 00:24:42,129 what the guys do, 671 00:24:42,130 --> 00:24:44,289 the database guys or the 672 00:24:44,290 --> 00:24:47,169 operating system guys: they have a lot of data 673 00:24:47,170 --> 00:24:50,139 and they have to search quickly, 674 00:24:50,140 --> 00:24:53,019 and they are using indexes, 675 00:24:53,020 --> 00:24:54,020 using an index. 676 00:24:54,960 --> 00:24:57,279 And OK, let's use an index, 677 00:24:57,280 --> 00:24:58,980 too, to speed things up. 678 00:25:01,890 --> 00:25:04,259 OK, then once again, you can 679 00:25:04,260 --> 00:25:06,389 here have the most simple 680 00:25:06,390 --> 00:25:08,489 stuff: to have 681 00:25:08,490 --> 00:25:10,619 a plaintext index on your client 682 00:25:10,620 --> 00:25:11,669 device. 683 00:25:11,670 --> 00:25:13,829 Yeah, right. And then 684 00:25:13,830 --> 00:25:15,869 the plaintext: you 685 00:25:15,870 --> 00:25:17,729 can encrypt the plaintext using a 686 00:25:17,730 --> 00:25:19,229 secure encryption scheme, your favorite 687 00:25:19,230 --> 00:25:20,699 secure encryption scheme. 688 00:25:20,700 --> 00:25:22,529 Just encrypt it and upload it to the 689 00:25:22,530 --> 00:25:23,789 cloud. Everything is fine. 690 00:25:23,790 --> 00:25:26,399 And your index is on a local device. 691 00:25:26,400 --> 00:25:29,129 Yeah.
Nowadays you have 692 00:25:29,130 --> 00:25:32,219 your smartphone, your tablet, 693 00:25:32,220 --> 00:25:34,739 your notebook, your PC, 694 00:25:34,740 --> 00:25:36,809 your server, your whatsoever, 695 00:25:36,810 --> 00:25:39,149 and five fifty thousand 696 00:25:39,150 --> 00:25:41,429 thoughts and 697 00:25:41,430 --> 00:25:43,679 it becomes quite a mess when you 698 00:25:43,680 --> 00:25:45,089 try to synchronize stuff, 699 00:25:45,090 --> 00:25:46,920 the index on each side. 700 00:25:48,360 --> 00:25:50,669 And yeah, 701 00:25:50,670 --> 00:25:52,349 some guys know what I'm talking 702 00:25:52,350 --> 00:25:54,479 about: it does 703 00:25:54,480 --> 00:25:55,480 not work. 704 00:25:56,040 --> 00:25:58,559 Yeah, OK. Next approach: let's 705 00:25:58,560 --> 00:26:00,929 encrypt the index and upload 706 00:26:00,930 --> 00:26:02,699 it to the server. 707 00:26:02,700 --> 00:26:03,679 OK, you can do it. 708 00:26:03,680 --> 00:26:05,189 You just encrypt it 709 00:26:05,190 --> 00:26:07,200 with a secure encryption scheme. 710 00:26:08,510 --> 00:26:10,579 And then, on demand, 711 00:26:10,580 --> 00:26:12,050 you can download the index, 712 00:26:13,570 --> 00:26:16,089 the up-to-date index, and can perform your 713 00:26:16,090 --> 00:26:18,459 search on the index. 714 00:26:18,460 --> 00:26:20,679 Yeah, well, 715 00:26:20,680 --> 00:26:22,899 then, if you're doing, uh, big 716 00:26:22,900 --> 00:26:25,209 data again, your index 717 00:26:25,210 --> 00:26:27,009 can become a couple 718 00:26:27,010 --> 00:26:29,229 of hundred megabytes maybe, and 719 00:26:29,230 --> 00:26:32,499 this is no fun to download at all, 720 00:26:32,500 --> 00:26:34,699 especially when you are 721 00:26:34,700 --> 00:26:36,579 in the countryside in Germany. 722 00:26:36,580 --> 00:26:38,679 And I have a little something 723 00:26:38,680 --> 00:26:41,469 makes not fun at all. 724 00:26:41,470 --> 00:26:42,470 So.
725 00:26:43,090 --> 00:26:45,429 I want to have my index 726 00:26:45,430 --> 00:26:47,619 now and not in 10 727 00:26:47,620 --> 00:26:50,069 minutes, 20 minutes or whatever. 728 00:26:50,070 --> 00:26:52,239 This is not the best 729 00:26:52,240 --> 00:26:53,529 user experience. 730 00:26:53,530 --> 00:26:54,759 We want to have a nice user 731 00:26:54,760 --> 00:26:55,760 experience. 732 00:26:57,270 --> 00:26:59,369 OK, then, to 733 00:26:59,370 --> 00:27:01,979 achieve this, we need a little bit 734 00:27:01,980 --> 00:27:04,809 of advanced crypto stuff. 735 00:27:04,810 --> 00:27:07,239 OK, here, first of all, 736 00:27:07,240 --> 00:27:09,939 we have to generate a special index 737 00:27:09,940 --> 00:27:12,189 that fits our purpose. And 738 00:27:12,190 --> 00:27:14,109 how can we do this? Therefore, we need 739 00:27:14,110 --> 00:27:16,239 a search key and 740 00:27:16,240 --> 00:27:17,679 the index key. 741 00:27:17,680 --> 00:27:20,109 And for this example, we want to 742 00:27:20,110 --> 00:27:22,599 generate an index for the last name 743 00:27:22,600 --> 00:27:24,609 of user data. 744 00:27:24,610 --> 00:27:26,829 And we want here to search for last 745 00:27:26,830 --> 00:27:27,830 names. 746 00:27:28,510 --> 00:27:30,139 And then for each last name, we 747 00:27:30,140 --> 00:27:32,439 generate a search key and the index 748 00:27:32,440 --> 00:27:33,440 key. 749 00:27:34,860 --> 00:27:37,349 And then with the search key, 750 00:27:37,350 --> 00:27:38,400 we just hash 751 00:27:39,910 --> 00:27:41,079 zero.
752 00:27:41,080 --> 00:27:43,239 And that is our search token, so we 753 00:27:43,240 --> 00:27:46,149 derive from zero and the search key 754 00:27:46,150 --> 00:27:48,219 and the hash function our 755 00:27:48,220 --> 00:27:51,009 search token, and then, 756 00:27:51,010 --> 00:27:53,409 with the index key, we just 757 00:27:53,410 --> 00:27:55,509 securely encrypt the row IDs, 758 00:27:55,510 --> 00:27:56,619 in this example, the. 759 00:28:00,400 --> 00:28:03,219 And this works 760 00:28:03,220 --> 00:28:05,049 so: if you want to perform a search, 761 00:28:08,020 --> 00:28:10,329 uh, you perform a search query 762 00:28:10,330 --> 00:28:12,429 by just uploading the search key and the 763 00:28:12,430 --> 00:28:13,969 index key 764 00:28:13,970 --> 00:28:16,929 for our last name, 765 00:28:16,930 --> 00:28:19,089 and then the cloud can perform a look 766 00:28:19,090 --> 00:28:20,530 up in the index. 767 00:28:21,630 --> 00:28:23,130 And when it gets a hit, 768 00:28:25,020 --> 00:28:27,179 it decrypts the index entry 769 00:28:27,180 --> 00:28:29,359 and sends us the results. 770 00:28:29,360 --> 00:28:31,439 Oh, this is quite embarrassing. 771 00:28:31,440 --> 00:28:33,989 OK, there is a problem because 772 00:28:33,990 --> 00:28:36,659 if you have a very common, 773 00:28:36,660 --> 00:28:39,149 uh, last name, like Smith 774 00:28:39,150 --> 00:28:41,339 or Mahala, you will have, 775 00:28:41,340 --> 00:28:43,469 uh, lots 776 00:28:43,470 --> 00:28:45,839 of, uh, well, 777 00:28:45,840 --> 00:28:48,419 you have a lot of hits, a lot of, uh, 778 00:28:48,420 --> 00:28:50,909 values that fit, 779 00:28:50,910 --> 00:28:51,929 a lot of row IDs.
780 00:28:51,930 --> 00:28:54,629 You have a good set of row IDs 781 00:28:54,630 --> 00:28:57,179 that match for Smith, 782 00:28:57,180 --> 00:28:59,639 people named Smith, and therefore 783 00:28:59,640 --> 00:29:02,459 the size of the value for Smith 784 00:29:02,460 --> 00:29:04,529 is larger than the 785 00:29:04,530 --> 00:29:06,839 size of the value for a dollar 786 00:29:06,840 --> 00:29:08,939 because like, 787 00:29:08,940 --> 00:29:11,039 I'm coming, I'm coming him. 788 00:29:11,040 --> 00:29:13,409 So if you get access 789 00:29:13,410 --> 00:29:15,689 to this index, you can, 790 00:29:15,690 --> 00:29:17,969 from the size of the value, you 791 00:29:17,970 --> 00:29:20,599 can try to 792 00:29:20,600 --> 00:29:22,410 estimate the plaintext 793 00:29:23,470 --> 00:29:25,709 last name, by just looking at 794 00:29:25,710 --> 00:29:26,880 the value size. 795 00:29:28,370 --> 00:29:30,289 And this is a problem. 796 00:29:30,290 --> 00:29:32,429 Yeah, we have to hide the size, 797 00:29:32,430 --> 00:29:33,430 so, the 798 00:29:33,920 --> 00:29:36,069 number of occurrences, the frequency of 799 00:29:36,070 --> 00:29:37,389 a last name. 800 00:29:37,390 --> 00:29:39,569 Yeah, if not, you have a bad 801 00:29:39,570 --> 00:29:41,909 time, because we, 802 00:29:41,910 --> 00:29:43,979 uh, want to be able to 803 00:29:43,980 --> 00:29:46,109 lose our, uh, our index 804 00:29:46,110 --> 00:29:49,409 without, uh, if 805 00:29:49,410 --> 00:29:51,929 we want to give our adversary 806 00:29:51,930 --> 00:29:54,119 the index without getting into 807 00:29:54,120 --> 00:29:56,309 trouble. So, uh, this is the whole idea. 808 00:29:56,310 --> 00:29:58,829 If 809 00:29:58,830 --> 00:30:00,329 our assumption is that the 810 00:30:00,330 --> 00:30:01,959 adversary 811 00:30:01,960 --> 00:30:04,149 never, ever has access to 812 00:30:04,150 --> 00:30:05,949 our index.
We don't need to encrypt our 813 00:30:05,950 --> 00:30:07,419 index. Yeah. 814 00:30:07,420 --> 00:30:09,999 OK, so then we assume 815 00:30:10,000 --> 00:30:12,069 that the adversary has access 816 00:30:12,070 --> 00:30:14,139 to our index, and even if he 817 00:30:14,140 --> 00:30:16,389 accesses this index, it would not reveal any 818 00:30:16,390 --> 00:30:18,849 information about 819 00:30:18,850 --> 00:30:21,069 our plaintext. 820 00:30:21,070 --> 00:30:23,140 Therefore, we have to hide the size. 821 00:30:24,710 --> 00:30:27,119 OK, and there's 822 00:30:27,120 --> 00:30:29,509 a cool idea from Cash et al. from last 823 00:30:29,510 --> 00:30:32,099 year, they've published a paper 824 00:30:32,100 --> 00:30:34,099 on how to hide the size. 825 00:30:35,350 --> 00:30:37,839 And this is to flatten out the index, 826 00:30:37,840 --> 00:30:38,910 quite a cool idea. 827 00:30:40,090 --> 00:30:42,419 OK, therefore, we have to remember 828 00:30:42,420 --> 00:30:45,089 the occurrences of a last name. 829 00:30:45,090 --> 00:30:47,819 OK, it starts here. 830 00:30:47,820 --> 00:30:50,129 And if, uh, if it's the first 831 00:30:50,130 --> 00:30:52,499 row, you see it's Foo, 832 00:30:52,500 --> 00:30:54,869 last name Foo, OK, 833 00:30:54,870 --> 00:30:56,999 let's make an entry for Foo 834 00:30:57,000 --> 00:30:59,219 and set occurrences to zero. This means 835 00:30:59,220 --> 00:31:01,619 we hash zero 836 00:31:01,620 --> 00:31:03,509 and then encrypt the row ID, 837 00:31:03,510 --> 00:31:05,039 here one. 838 00:31:05,040 --> 00:31:07,739 OK, then next, we have Bob Foo, 839 00:31:07,740 --> 00:31:10,109 same last name, the last 840 00:31:10,110 --> 00:31:12,179 name occurred once before, 841 00:31:12,180 --> 00:31:15,009 so now we have one instead of zero. 842 00:31:15,010 --> 00:31:16,409 OK. 843 00:31:16,410 --> 00:31:18,329 And here the value is the row 844 00:31:18,330 --> 00:31:19,330 ID.
845 00:31:20,360 --> 00:31:22,879 And then they fire 846 00:31:22,880 --> 00:31:25,519 and then the occurrences so far: 847 00:31:25,520 --> 00:31:26,749 none. 848 00:31:26,750 --> 00:31:28,690 Then again, we just 849 00:31:29,750 --> 00:31:31,999 derive the new search key for the. 850 00:31:34,670 --> 00:31:37,079 OK, and then we encrypt the, uh, 851 00:31:37,080 --> 00:31:39,169 the value, the row ID, and 852 00:31:39,170 --> 00:31:41,419 we always encrypt, for each 853 00:31:41,420 --> 00:31:43,699 and every index entry, only one row 854 00:31:43,700 --> 00:31:45,979 ID, so that 855 00:31:45,980 --> 00:31:48,619 the value length for all of these 856 00:31:48,620 --> 00:31:50,149 is always the same. 857 00:31:50,150 --> 00:31:52,909 And this means our adversary 858 00:31:52,910 --> 00:31:55,919 does not learn much anymore if he gets access, 859 00:31:55,920 --> 00:31:57,700 just a bunch of random values. 860 00:32:00,590 --> 00:32:02,719 OK, and once 861 00:32:02,720 --> 00:32:05,509 again, how this works in reality: 862 00:32:05,510 --> 00:32:08,329 you just encrypt 863 00:32:08,330 --> 00:32:10,279 your plaintext using your favorite 864 00:32:10,280 --> 00:32:11,630 secure encryption scheme 865 00:32:12,820 --> 00:32:14,529 and upload it. 866 00:32:14,530 --> 00:32:16,689 And then you 867 00:32:16,690 --> 00:32:18,539 create your index 868 00:32:18,540 --> 00:32:20,069 and upload the index. 869 00:32:23,180 --> 00:32:25,130 OK, and now 870 00:32:26,150 --> 00:32:28,579 the cloud has the capability 871 00:32:28,580 --> 00:32:31,099 to search over encrypted data. 872 00:32:31,100 --> 00:32:33,409 You, once again, for each search, 873 00:32:33,410 --> 00:32:35,599 compute the search key 874 00:32:35,600 --> 00:32:37,729 and the index key, upload them to 875 00:32:37,730 --> 00:32:39,829 the cloud, and now the 876 00:32:39,830 --> 00:32:41,899 cloud can make lookups in the 877 00:32:41,900 --> 00:32:44,269 index.
First, 878 00:32:44,270 --> 00:32:46,849 we start by making a lookup for zero. 879 00:32:46,850 --> 00:32:48,259 Yeah, we have it. 880 00:32:48,260 --> 00:32:50,569 OK, then let's try one: hit 881 00:32:50,570 --> 00:32:52,849 again. Yeah, let's try two: no hit. 882 00:32:52,850 --> 00:32:55,309 Oh, it's over then. 883 00:32:55,310 --> 00:32:56,749 For the hits, it decrypts 884 00:32:58,010 --> 00:33:00,079 the values and sends 885 00:33:00,080 --> 00:33:01,790 the results to the client. 886 00:33:03,700 --> 00:33:05,799 And this works 887 00:33:05,800 --> 00:33:07,309 like a charm. 888 00:33:07,310 --> 00:33:09,389 So it's OK. 889 00:33:09,390 --> 00:33:12,499 Once again, I used the same plaintext, 890 00:33:12,500 --> 00:33:15,019 the King James Bible. 891 00:33:15,020 --> 00:33:17,149 So OK, and I use the King 892 00:33:17,150 --> 00:33:19,159 James Bible because a lot of other 893 00:33:19,160 --> 00:33:20,659 researchers are doing it. 894 00:33:20,660 --> 00:33:22,789 And it contains a lot of, a lot 895 00:33:22,790 --> 00:33:24,859 of words, about eight hundred 896 00:33:24,860 --> 00:33:26,429 thousand words. 897 00:33:26,430 --> 00:33:29,539 So it's quite OK. 898 00:33:29,540 --> 00:33:31,639 And yeah, I, 899 00:33:31,640 --> 00:33:33,829 I didn't care about 900 00:33:33,830 --> 00:33:35,089 the padding. 901 00:33:37,250 --> 00:33:39,319 I used it as a blob, as a binary blob, 902 00:33:39,320 --> 00:33:41,239 the entire King James Bible. 903 00:33:41,240 --> 00:33:43,249 This is why the ciphertext 904 00:33:43,250 --> 00:33:45,769 is equal in length to the plaintext. 905 00:33:45,770 --> 00:33:47,359 Depending on your scenario, you have to 906 00:33:47,360 --> 00:33:48,169 do some padding. 907 00:33:48,170 --> 00:33:49,819 So it's most likely that the ciphertext 908 00:33:49,820 --> 00:33:51,799 is larger than your plaintext. 909 00:33:51,800 --> 00:33:52,800 That's OK.
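The flattened index and the counter-based lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the speakers' implementation: every name is hypothetical, HMAC-SHA256 plays the role of the hash function, the example last names are placeholders, and the row ID is "encrypted" by XOR with a keyed-hash pad because the Python standard library ships no block cipher.

```python
import hashlib
import hmac


def prf(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def derive_keys(master_key: bytes, last_name: str):
    """Return (search_key, index_key) for one last name."""
    name = last_name.encode("utf-8")
    return prf(master_key, b"search|" + name), prf(master_key, b"index|" + name)


def label(search_key: bytes, counter: int) -> bytes:
    # Search token for the counter-th occurrence of a last name (16 bytes).
    return prf(search_key, counter.to_bytes(8, "big"))[:16]


def seal(index_key: bytes, counter: int, row_id: int) -> bytes:
    # ASSUMPTION: XOR with a keyed-hash pad stands in for a real cipher.
    pad = prf(index_key, counter.to_bytes(8, "big"))[:8]
    return bytes(a ^ b for a, b in zip(row_id.to_bytes(8, "big"), pad))


def unseal(index_key: bytes, counter: int, value: bytes) -> int:
    pad = prf(index_key, counter.to_bytes(8, "big"))[:8]
    return int.from_bytes(bytes(a ^ b for a, b in zip(value, pad)), "big")


def build_index(master_key: bytes, rows):
    """rows: iterable of (row_id, last_name). One fixed-size entry per row."""
    index, occurrences = {}, {}
    for row_id, last_name in rows:
        search_key, index_key = derive_keys(master_key, last_name)
        counter = occurrences.get(last_name, 0)   # how often seen so far
        index[label(search_key, counter)] = seal(index_key, counter, row_id)
        occurrences[last_name] = counter + 1
    return index


def server_lookup(index, search_key: bytes, index_key: bytes):
    """Try counter 0, 1, 2, ... until the first miss; decrypt every hit."""
    results, counter = [], 0
    while True:
        value = index.get(label(search_key, counter))
        if value is None:
            return results
        results.append(unseal(index_key, counter, value))
        counter += 1
```

Since every entry holds exactly one row ID, all values have the same length, so the stored index no longer shows which last name is common. With a hash table, a query costs one lookup per hit plus one final miss.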
910 00:33:53,850 --> 00:33:56,199 OK, what about the index size? You see here, 911 00:33:56,200 --> 00:33:57,809 it's OK. 912 00:33:57,810 --> 00:33:59,999 Uh, depending on your, 913 00:34:00,000 --> 00:34:02,499 uh, on your 914 00:34:02,500 --> 00:34:04,649 plaintext, you'll 915 00:34:04,650 --> 00:34:06,959 need about, uh, thirty 916 00:34:06,960 --> 00:34:08,369 two bytes for each entry: 917 00:34:10,179 --> 00:34:12,399 sixteen bytes for 918 00:34:12,400 --> 00:34:15,738 the search token and 65 to encrypt, 919 00:34:15,739 --> 00:34:17,540 well, the row ID. 920 00:34:19,050 --> 00:34:21,629 And sixty five for its entry. 921 00:34:22,679 --> 00:34:24,749 Might be OK, depending on 922 00:34:24,750 --> 00:34:27,119 your scenario, but the cool thing now 923 00:34:27,120 --> 00:34:29,369 is we can perform the search in, 924 00:34:29,370 --> 00:34:32,069 uh, constant time, more or less. 925 00:34:32,070 --> 00:34:34,859 So I made a lot of 926 00:34:34,860 --> 00:34:37,299 tests and 927 00:34:37,300 --> 00:34:39,809 all the results I got were: 928 00:34:39,810 --> 00:34:41,968 uh, each search 929 00:34:41,969 --> 00:34:44,158 query was less than one millisecond. 930 00:34:44,159 --> 00:34:45,448 Less than one millisecond. 931 00:34:45,449 --> 00:34:47,169 It's, uh, it's quite good. 932 00:34:47,170 --> 00:34:49,468 So it's it's fucking it's 933 00:34:49,469 --> 00:34:50,729 and the speed is fine. 934 00:34:52,570 --> 00:34:55,448 And yeah, so, uh, 935 00:34:55,449 --> 00:34:57,309 everything is sunshine and 936 00:34:57,310 --> 00:35:00,129 rainbows? Not really. 937 00:35:00,130 --> 00:35:03,249 Uh, again, we have the problem with, uh, 938 00:35:03,250 --> 00:35:05,649 statistical analysis, 939 00:35:05,650 --> 00:35:07,869 because the 940 00:35:07,870 --> 00:35:10,479 search key and the index key are the same 941 00:35:10,480 --> 00:35:12,519 if I have multiple queries for the 942 00:35:12,520 --> 00:35:14,559 same last name.
So if I search 20 943 00:35:14,560 --> 00:35:16,779 times for the same last name, 944 00:35:16,780 --> 00:35:19,179 I have 20 times the same search query. 945 00:35:19,180 --> 00:35:21,919 And yeah, OK, there are techniques 946 00:35:21,920 --> 00:35:23,589 to get rid of those. 947 00:35:23,590 --> 00:35:26,119 But then the performance, 948 00:35:26,120 --> 00:35:27,729 the performance breaks down: then you 949 00:35:27,730 --> 00:35:30,039 usually have to wait a 950 00:35:30,040 --> 00:35:33,099 couple of hours or minutes, 951 00:35:33,100 --> 00:35:35,729 or the ciphertext size explodes. 952 00:35:35,730 --> 00:35:37,419 Yeah. And if you want to make it 953 00:35:37,420 --> 00:35:39,719 practical, then, yeah, 954 00:35:39,720 --> 00:35:41,799 until now you have to live 955 00:35:41,800 --> 00:35:43,719 with the statistical analysis. 956 00:35:43,720 --> 00:35:45,699 Maybe we get rid of those in the 957 00:35:45,700 --> 00:35:47,169 future, it depends. 958 00:35:48,180 --> 00:35:50,369 OK, and now, uh, 959 00:35:50,370 --> 00:35:52,079 Toby concludes the talk and gives an 960 00:35:52,080 --> 00:35:54,719 outlook on what's going on in the future. 961 00:35:54,720 --> 00:35:55,720 Thank you very much, Christian. 962 00:36:03,690 --> 00:36:05,759 So that was fascinating, wasn't 963 00:36:05,760 --> 00:36:08,219 it? So you enable 964 00:36:08,220 --> 00:36:11,369 a third party to execute 965 00:36:11,370 --> 00:36:13,829 a search operation on 966 00:36:13,830 --> 00:36:15,989 encrypted data, although, you know, 967 00:36:15,990 --> 00:36:17,099 you've encrypted the data. 968 00:36:17,100 --> 00:36:18,749 How could anyone possibly execute any 969 00:36:18,750 --> 00:36:20,879 operations, any operation on that 970 00:36:20,880 --> 00:36:21,839 encrypted data? 971 00:36:21,840 --> 00:36:23,699 But it's possible.
972 00:36:23,700 --> 00:36:25,889 And these were just a couple of schemes 973 00:36:25,890 --> 00:36:28,199 that we've presented and which were those 974 00:36:28,200 --> 00:36:29,669 that we've implemented. 975 00:36:29,670 --> 00:36:31,829 And there's many more. 976 00:36:31,830 --> 00:36:33,989 And from what we've 977 00:36:33,990 --> 00:36:36,749 shown you, these schemes 978 00:36:36,750 --> 00:36:38,759 have the problem of the deterministic 979 00:36:38,760 --> 00:36:39,899 search token. 980 00:36:39,900 --> 00:36:42,299 So whenever you query 981 00:36:42,300 --> 00:36:44,459 two times for the same, or for 982 00:36:44,460 --> 00:36:46,559 the same keyword, then 983 00:36:46,560 --> 00:36:48,129 you will generate the same token. 984 00:36:48,130 --> 00:36:50,339 And the database or the service 985 00:36:50,340 --> 00:36:52,559 provider might very well 986 00:36:52,560 --> 00:36:53,669 infer 987 00:36:53,670 --> 00:36:55,919 what you are searching for based 988 00:36:55,920 --> 00:36:57,509 only on your queries. 989 00:36:57,510 --> 00:36:58,510 Based only on your. On your. 990 00:37:00,620 --> 00:37:01,640 And there's 991 00:37:02,690 --> 00:37:04,789 attempts to, 992 00:37:04,790 --> 00:37:07,339 or there's other techniques to, 993 00:37:07,340 --> 00:37:09,709 well, deal with the problem, 994 00:37:09,710 --> 00:37:12,649 but making those practical 995 00:37:12,650 --> 00:37:13,939 is a major challenge. 996 00:37:13,940 --> 00:37:16,249 Currently, fully homomorphic encryption has 997 00:37:16,250 --> 00:37:18,559 been, you know, on everyone's mind 998 00:37:18,560 --> 00:37:19,579 for the last couple of years. 999 00:37:19,580 --> 00:37:21,379 And there's massive research efforts 1000 00:37:21,380 --> 00:37:23,359 going on right now.
1001 00:37:23,360 --> 00:37:25,639 But for now, 1002 00:37:25,640 --> 00:37:27,859 you cannot use that because it's 1003 00:37:27,860 --> 00:37:30,349 simply too, well, too demanding 1004 00:37:30,350 --> 00:37:33,379 in terms of performance, computation 1005 00:37:33,380 --> 00:37:34,309 or memory. 1006 00:37:34,310 --> 00:37:35,209 So this is not an option. 1007 00:37:35,210 --> 00:37:37,279 But if you happen to have 1008 00:37:37,280 --> 00:37:39,229 a few spare cycles, you may very well 1009 00:37:39,230 --> 00:37:41,479 enter this area of research and try 1010 00:37:41,480 --> 00:37:43,609 to find solutions, well, 1011 00:37:43,610 --> 00:37:44,610 to these problems. 1012 00:37:46,450 --> 00:37:48,729 We have seen that we've implemented 1013 00:37:48,730 --> 00:37:50,559 those schemes and there's many more. 1014 00:37:50,560 --> 00:37:52,839 And again, if you have a few 1015 00:37:52,840 --> 00:37:54,519 spare cycles but rather want to hack 1016 00:37:54,520 --> 00:37:56,649 instead of research, then go 1017 00:37:56,650 --> 00:37:58,689 off, read these papers and build 1018 00:37:58,690 --> 00:38:01,269 libraries for encrypted search, 1019 00:38:01,270 --> 00:38:04,269 build libraries so that service providers 1020 00:38:04,270 --> 00:38:06,279 can use these libraries and offer 1021 00:38:06,280 --> 00:38:07,389 encrypted services.
1022 00:38:10,780 --> 00:38:13,209 Ideally, we'd have a collaborative 1023 00:38:13,210 --> 00:38:15,609 effort to demand 1024 00:38:15,610 --> 00:38:18,249 encrypted services and to write these 1025 00:38:18,250 --> 00:38:20,409 programs, these libraries for third 1026 00:38:20,410 --> 00:38:22,599 parties to offer these services. We will 1027 00:38:22,600 --> 00:38:24,789 not kick off the new Latson 1028 00:38:24,790 --> 00:38:26,919 initiative, but we 1029 00:38:26,920 --> 00:38:29,499 hopefully will inspire some of you 1030 00:38:29,500 --> 00:38:31,629 to go into that direction and to, 1031 00:38:31,630 --> 00:38:33,699 well, to bring more encryption to 1032 00:38:33,700 --> 00:38:35,099 the Internet, to the cloud. 1033 00:38:37,700 --> 00:38:40,459 We have seen a couple of schemes, 1034 00:38:40,460 --> 00:38:42,109 we have seen the very first 1035 00:38:42,110 --> 00:38:44,529 deterministic, well, encryption scheme. 1036 00:38:44,530 --> 00:38:47,359 It's very easy to set up 1037 00:38:47,360 --> 00:38:49,549 and you can do that, well, with low 1038 00:38:49,550 --> 00:38:52,429 computational effort on your own machine. 1039 00:38:52,430 --> 00:38:54,259 The search, however, does not perform 1040 00:38:54,260 --> 00:38:56,570 very well in terms of security. 1041 00:38:57,920 --> 00:39:00,349 We have seen a, well, 1042 00:39:00,350 --> 00:39:02,539 probably better scheme 1043 00:39:02,540 --> 00:39:03,619 in that regard. 1044 00:39:03,620 --> 00:39:05,719 We can search in O of 1045 00:39:05,720 --> 00:39:07,549 the size of the database because the 1046 00:39:07,550 --> 00:39:09,559 server has to, well, go through each and 1047 00:39:09,560 --> 00:39:12,049 every entry in the database. 1048 00:39:12,050 --> 00:39:14,159 That may or may not be what you want. 1049 00:39:14,160 --> 00:39:16,549 If you don't want to have such a scheme, 1050 00:39:16,550 --> 00:39:18,869 you may want to look into the Cash et al. 1051 00:39:18,870 --> 00:39:20,089 scheme.
1052 00:39:20,090 --> 00:39:21,859 You can search in basically no time 1053 00:39:21,860 --> 00:39:22,969 because you have the index. 1054 00:39:22,970 --> 00:39:25,009 And if you store the index cleverly, then, 1055 00:39:25,010 --> 00:39:26,149 well, if you make it a hash table 1056 00:39:26,150 --> 00:39:27,739 or something, then you can search in O of 1057 00:39:27,740 --> 00:39:28,910 one, in O of one. 1058 00:39:30,290 --> 00:39:32,599 However, well, index: 1059 00:39:32,600 --> 00:39:33,769 whenever you have an index, you need to 1060 00:39:33,770 --> 00:39:35,899 think about what happens when 1061 00:39:35,900 --> 00:39:38,149 you add new entries, when you delete 1062 00:39:38,150 --> 00:39:40,609 entries, when you change entries. 1063 00:39:40,610 --> 00:39:43,099 So, well, if you are going 1064 00:39:43,100 --> 00:39:45,409 the Cash et al. route, then 1065 00:39:45,410 --> 00:39:46,410 keep that in mind. 1066 00:39:48,260 --> 00:39:50,119 There's so many schemes, as I've said 1067 00:39:50,120 --> 00:39:52,519 already. All of them 1068 00:39:52,520 --> 00:39:53,839 have different, slightly different 1069 00:39:53,840 --> 00:39:55,969 features, so depending on what 1070 00:39:55,970 --> 00:39:58,549 you actually want, you can 1071 00:39:58,550 --> 00:40:00,499 build a very efficient scheme. 1072 00:40:00,500 --> 00:40:02,659 So if you cut down 1073 00:40:02,660 --> 00:40:04,879 on the functionality that you expect from 1074 00:40:04,880 --> 00:40:07,189 your scheme, then, 1075 00:40:07,190 --> 00:40:09,679 well, by doing some clever engineering, 1076 00:40:09,680 --> 00:40:13,249 you can cut down on 1077 00:40:13,250 --> 00:40:16,079 runtime and memory demands 1078 00:40:16,080 --> 00:40:17,239 significantly. 1079 00:40:17,240 --> 00:40:19,459 So if you are 1080 00:40:19,460 --> 00:40:21,829 about to build a scheme, think about 1081 00:40:21,830 --> 00:40:24,109 what your actual requirements 1082 00:40:24,110 --> 00:40:25,110 are.
1083 00:40:26,280 --> 00:40:28,469 And as I said, there are many 1084 00:40:28,470 --> 00:40:30,569 more. If you do some 1085 00:40:30,570 --> 00:40:32,459 research on the Internet, "searchable 1086 00:40:32,460 --> 00:40:34,649 encryption", that term will find you, 1087 00:40:34,650 --> 00:40:37,289 well, quite a few academic papers 1088 00:40:37,290 --> 00:40:38,429 on that, 1089 00:40:38,430 --> 00:40:39,659 on that topic. 1090 00:40:39,660 --> 00:40:41,309 You will also find a couple of libraries. 1091 00:40:41,310 --> 00:40:43,289 There are already software implementations 1092 00:40:43,290 --> 00:40:45,269 available. 1093 00:40:45,270 --> 00:40:47,339 I haven't evaluated them all, but, 1094 00:40:47,340 --> 00:40:49,919 um, well, uh, 1095 00:40:49,920 --> 00:40:51,549 some need work, let's put it that way. 1096 00:40:51,550 --> 00:40:54,299 So, again, if you have some spare cycles, 1097 00:40:54,300 --> 00:40:55,589 go and look at these libraries and 1098 00:40:55,590 --> 00:40:57,419 provide patches, make them work, make 1099 00:40:57,420 --> 00:40:58,559 them actually build in the first place. 1100 00:40:58,560 --> 00:41:00,689 That would be good, so 1101 00:41:00,690 --> 00:41:02,219 that we can have nice things in the 1102 00:41:02,220 --> 00:41:03,220 future. 1103 00:41:04,610 --> 00:41:07,069 And as we've hopefully 1104 00:41:07,070 --> 00:41:09,649 presented, searching over encrypted data 1105 00:41:09,650 --> 00:41:11,869 is practical. You can 1106 00:41:11,870 --> 00:41:13,939 build your encrypted database, you can 1107 00:41:13,940 --> 00:41:15,919 have clients that search over encrypted 1108 00:41:15,920 --> 00:41:18,259 data without, well, the server, 1109 00:41:18,260 --> 00:41:19,849 the server side learning either the 1110 00:41:19,850 --> 00:41:21,949 plaintext or the key or 1111 00:41:21,950 --> 00:41:23,300 what you are searching for directly.
1112 00:41:24,440 --> 00:41:27,109 So whenever 1113 00:41:27,110 --> 00:41:29,389 someone, or whenever you are in 1114 00:41:29,390 --> 00:41:30,889 a discussion about whether such a thing 1115 00:41:30,890 --> 00:41:32,869 is possible, now you hopefully know that 1116 00:41:32,870 --> 00:41:34,969 this is possible and we should use that. 1117 00:41:36,880 --> 00:41:38,979 And with that, we'd like to conclude 1118 00:41:38,980 --> 00:41:40,809 and we'd like to thank you very much for 1119 00:41:40,810 --> 00:41:42,909 your attention. Again, before I forget it, 1120 00:41:42,910 --> 00:41:45,249 we will be at the bar in, like, 1121 00:41:45,250 --> 00:41:47,319 at a quarter to 1122 00:41:47,320 --> 00:41:48,239 one or whatever it is. 1123 00:41:48,240 --> 00:41:49,209 So just right after the talk. 1124 00:41:49,210 --> 00:41:50,210 Thank you very much. 1125 00:41:58,780 --> 00:42:00,219 So let's go over to the Q&A. 1126 00:42:00,220 --> 00:42:02,289 We still have plenty of time left. 1127 00:42:02,290 --> 00:42:04,389 Please, if you leave the room now, do 1128 00:42:04,390 --> 00:42:06,699 so quietly because we want to actually 1129 00:42:06,700 --> 00:42:08,829 hear what is being asked into 1130 00:42:08,830 --> 00:42:10,239 the microphones. 1131 00:42:10,240 --> 00:42:12,429 So please do it quietly and take 1132 00:42:12,430 --> 00:42:14,080 all the trash with you. 1133 00:42:15,310 --> 00:42:17,219 So now let's start. 1134 00:42:18,520 --> 00:42:20,619 Microphone over there, please. 1135 00:42:20,620 --> 00:42:22,269 Hi. 1136 00:42:22,270 --> 00:42:24,639 Phillip Rogaway, in the very 1137 00:42:24,640 --> 00:42:26,319 good paper "The Moral Character of 1138 00:42:26,320 --> 00:42:28,389 Cryptographic Work", writes in 1139 00:42:28,390 --> 00:42:30,729 critique of FHE: 1140 00:42:30,730 --> 00:42:32,559 providing strong funding for FHE and 1141 00:42:32,560 --> 00:42:34,959 iO provides risk-free political cover.
1142 00:42:34,960 --> 00:42:36,639 It supports the storyline that cloud 1143 00:42:36,640 --> 00:42:38,749 storage and computing is safe. 1144 00:42:38,750 --> 00:42:40,749 It helps entrench favored values within 1145 00:42:40,750 --> 00:42:43,839 the cryptographic community: speculative, 1146 00:42:43,840 --> 00:42:46,179 theory-centric directions. And it helps 1147 00:42:46,180 --> 00:42:48,309 keep harmless academics who could, if 1148 00:42:48,310 --> 00:42:50,169 they got feisty, start to innovate in more 1149 00:42:50,170 --> 00:42:51,729 sensitive directions. 1150 00:42:51,730 --> 00:42:54,039 So I read this as a critique that 1151 00:42:54,040 --> 00:42:56,799 this work sucks good cryptography 1152 00:42:56,800 --> 00:42:58,989 work into an 1153 00:42:58,990 --> 00:43:01,299 area, where you could actually 1154 00:43:01,300 --> 00:43:02,989 do better work somewhere else. 1155 00:43:02,990 --> 00:43:04,209 What are your thoughts on this? 1156 00:43:04,210 --> 00:43:05,889 Right. That's a very interesting paper. 1157 00:43:05,890 --> 00:43:08,139 By the way, if you have a couple of, 1158 00:43:08,140 --> 00:43:10,089 well, if you have a couple of hours, you 1159 00:43:10,090 --> 00:43:12,099 should go off and read it. It's a couple 1160 00:43:12,100 --> 00:43:13,209 of weeks old, right? 1161 00:43:13,210 --> 00:43:16,029 Like three weeks maybe. And 1162 00:43:16,030 --> 00:43:18,729 the paper in and of itself is, 1163 00:43:18,730 --> 00:43:20,889 well, as you said, criticizing 1164 00:43:20,890 --> 00:43:23,619 the crypto people for being, 1165 00:43:23,620 --> 00:43:25,869 well, way off the real world, 1166 00:43:25,870 --> 00:43:27,249 essentially. Right. That's the 1167 00:43:27,250 --> 00:43:29,409 bottom line that I took away, and 1168 00:43:29,410 --> 00:43:31,989 rightfully so. The fully homomorphic, 1169 00:43:31,990 --> 00:43:35,229 well, strand of research.
1170 00:43:35,230 --> 00:43:37,869 Well, it's complicated in the sense that 1171 00:43:37,870 --> 00:43:39,309 it's very, very demanding. 1172 00:43:39,310 --> 00:43:41,469 And so I think 1173 00:43:41,470 --> 00:43:43,089 it's perfectly right. 1174 00:43:43,090 --> 00:43:44,649 And I think we should focus, we as the 1175 00:43:44,650 --> 00:43:46,269 community should focus on making real 1176 00:43:46,270 --> 00:43:49,359 world things happen and enabling 1177 00:43:49,360 --> 00:43:51,459 these techniques that we have, like, 1178 00:43:51,460 --> 00:43:53,619 for real world usage, for usage 1179 00:43:53,620 --> 00:43:55,359 in such a context such that you could 1180 00:43:55,360 --> 00:43:57,459 actually upload your content and 1181 00:43:57,460 --> 00:43:59,589 still, well, perform your 1182 00:43:59,590 --> 00:44:00,939 queries over the encrypted data. 1183 00:44:00,940 --> 00:44:03,129 So I sympathize with that paper, 1184 00:44:03,130 --> 00:44:05,949 like, to the full extent possible. 1185 00:44:05,950 --> 00:44:08,019 Again, I encourage everyone to read 1186 00:44:08,020 --> 00:44:10,179 that paper and I hope that we don't fall 1187 00:44:10,180 --> 00:44:12,459 into that category of 1188 00:44:12,460 --> 00:44:14,529 good crypto if Dejavu 1189 00:44:14,530 --> 00:44:16,809 called it. I hope this is 1190 00:44:16,810 --> 00:44:19,149 becoming boring crypto in the sense that 1191 00:44:19,150 --> 00:44:20,769 this is a commodity, that you can use it 1192 00:44:20,770 --> 00:44:22,299 just like that. 1193 00:44:22,300 --> 00:44:23,979 And everybody, read the paper. 1194 00:44:23,980 --> 00:44:25,659 Yeah. And oh, actually, on papers, something 1195 00:44:25,660 --> 00:44:27,669 before I forget, for completeness' sake: 1196 00:44:27,670 --> 00:44:29,739 these are the references in case you 1197 00:44:29,740 --> 00:44:31,779 want to go off and read these papers 1198 00:44:31,780 --> 00:44:33,730 yourself. You will find them anyway.
1199 00:44:35,350 --> 00:44:36,369 Uh, is there a question from the 1200 00:44:36,370 --> 00:44:38,499 Internet? Yes, I have two questions. 1201 00:44:38,500 --> 00:44:40,419 The first one is, can you compare this 1202 00:44:40,420 --> 00:44:42,609 scheme with IAMGOLD was private 1203 00:44:42,610 --> 00:44:43,610 information retrieval. 1204 00:44:45,490 --> 00:44:48,369 Uh, I'm 1205 00:44:48,370 --> 00:44:50,879 OK, I'm not a super expert at 1206 00:44:50,880 --> 00:44:52,969 this information, which is a 1207 00:44:52,970 --> 00:44:55,659 major topic, but 1208 00:44:55,660 --> 00:44:57,759 I think that it's just 1209 00:44:57,760 --> 00:44:59,829 I think it's just almost practical and I 1210 00:44:59,830 --> 00:45:02,049 think the stuff is indeed 1211 00:45:02,050 --> 00:45:03,239 practical. 1212 00:45:03,240 --> 00:45:05,209 If the index stuff. 1213 00:45:05,210 --> 00:45:07,589 Uh, usually 1214 00:45:07,590 --> 00:45:10,319 when you have, um, yeah, 1215 00:45:10,320 --> 00:45:12,569 when you use this, uh, these 1216 00:45:12,570 --> 00:45:14,729 other techniques, you have 1217 00:45:14,730 --> 00:45:16,169 problems with this. 1218 00:45:16,170 --> 00:45:18,509 If the ciphertext 1219 00:45:18,510 --> 00:45:20,639 with the length of a search query of the 1220 00:45:20,640 --> 00:45:23,129 indexes are you have problems 1221 00:45:23,130 --> 00:45:25,229 with the complexity or 1222 00:45:25,230 --> 00:45:27,479 its will, this information 1223 00:45:27,480 --> 00:45:29,879 retrieval might not work 1224 00:45:29,880 --> 00:45:32,119 as well. In practice, with 1225 00:45:32,120 --> 00:45:34,229 data at the fingers will be 1226 00:45:34,230 --> 00:45:36,539 there are scaling problems 1227 00:45:36,540 --> 00:45:39,869 so far. 
Smaller stuff might be fine, 1228 00:45:39,870 --> 00:45:42,119 but on a large scale, I 1229 00:45:42,120 --> 00:45:44,369 think it's 1230 00:45:44,370 --> 00:45:46,529 very challenging. And also, 1231 00:45:46,530 --> 00:45:48,209 private information retrieval is a 1232 00:45:48,210 --> 00:45:49,859 different problem. In private 1233 00:45:49,860 --> 00:45:51,089 information retrieval, 1234 00:45:51,090 --> 00:45:53,699 you download something from 1235 00:45:53,700 --> 00:45:56,009 the server or the server network without 1236 00:45:56,010 --> 00:45:57,749 the server, the server network, learning 1237 00:45:57,750 --> 00:45:58,769 what you're downloading. This is 1238 00:45:58,770 --> 00:46:00,539 different from performing search operations 1239 00:46:00,540 --> 00:46:01,540 over encrypted data. 1240 00:46:03,100 --> 00:46:05,589 OK, yeah, here, in 1241 00:46:05,590 --> 00:46:08,009 the schemes we presented, the database, 1242 00:46:08,010 --> 00:46:10,119 the server, is learning 1243 00:46:10,120 --> 00:46:11,590 which rows you access. 1244 00:46:13,050 --> 00:46:15,149 If you hide this, it will 1245 00:46:15,150 --> 00:46:17,069 come with a cost. 1246 00:46:17,070 --> 00:46:19,469 It will be costly, and 1247 00:46:20,520 --> 00:46:22,589 I don't think that you can implement 1248 00:46:22,590 --> 00:46:24,899 it now in any database and it 1249 00:46:24,900 --> 00:46:25,949 will run smoothly. 1250 00:46:25,950 --> 00:46:28,019 I think there is a lot of work to 1251 00:46:28,020 --> 00:46:30,239 do, but that's just my 1252 00:46:30,240 --> 00:46:31,240 opinion. 1253 00:46:33,250 --> 00:46:35,289 Next question on the microphone, please. 1254 00:46:37,450 --> 00:46:40,129 Oh, I have a light on me, so 1255 00:46:40,130 --> 00:46:41,130 oh.
1256 00:46:42,450 --> 00:46:44,889 So one problem that I potentially 1257 00:46:44,890 --> 00:46:46,489 see in this: you mentioned that you 1258 00:46:46,490 --> 00:46:48,729 pad each word to have the same length 1259 00:46:48,730 --> 00:46:50,599 so that you can't analyze the word length. 1260 00:46:51,610 --> 00:46:52,719 That's good. 1261 00:46:52,720 --> 00:46:54,579 There's one other issue in that. 1262 00:46:54,580 --> 00:46:56,589 I don't know if you've heard of 1263 00:46:56,590 --> 00:46:58,029 the distributional hypothesis, but 1264 00:46:58,030 --> 00:47:00,129 basically the co-location 1265 00:47:00,130 --> 00:47:02,229 of words within large 1266 00:47:02,230 --> 00:47:04,119 sets of text kind of defines the 1267 00:47:04,120 --> 00:47:05,979 semantics of the words themselves. 1268 00:47:05,980 --> 00:47:08,169 So if I get 1269 00:47:08,170 --> 00:47:09,670 access to your encrypted data, 1270 00:47:11,020 --> 00:47:12,909 looking at how encrypted words occur with 1271 00:47:12,910 --> 00:47:15,039 each other and knowing that, OK, this 1272 00:47:15,040 --> 00:47:16,779 might be English, or even without knowing 1273 00:47:16,780 --> 00:47:18,609 the language, potentially, I could kind 1274 00:47:18,610 --> 00:47:20,769 of reverse engineer what specific words 1275 00:47:20,770 --> 00:47:22,749 might be just by the fact that they occur 1276 00:47:22,750 --> 00:47:25,189 together or how often 1277 00:47:25,190 --> 00:47:27,579 those words occur 1278 00:47:27,580 --> 00:47:28,699 compared to other ones. 1279 00:47:28,700 --> 00:47:30,759 So you can kind of cluster them. 1280 00:47:30,760 --> 00:47:32,049 And there's a lot of research that's 1281 00:47:32,050 --> 00:47:34,149 going on in deep learning on this, 1282 00:47:34,150 --> 00:47:36,369 word vector modeling basically.
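To make the concern concrete, here is a small sketch under the assumption that every word is replaced by a deterministic token (HMAC-SHA256 is used as a stand-in; the talk's actual word encryption is not specified here). The hypothetical helpers `frequency_profile` and `cooccurring_pairs` show that frequency and adjacency statistics of the tokens are identical to those of the plaintext words, which is the raw material for the clustering attack described in the question.

```python
import hashlib
import hmac
from collections import Counter


def token(key, word):
    """Deterministic stand-in for word-by-word encryption (an assumption)."""
    return hmac.new(key, word.lower().encode("utf-8"), hashlib.sha256).digest()


def frequency_profile(items):
    """Occurrence counts sorted from most to least frequent."""
    return sorted(Counter(items).values(), reverse=True)


def cooccurring_pairs(items):
    """Counts of adjacent pairs, a crude form of co-occurrence statistics."""
    return Counter(zip(items, items[1:]))


if __name__ == "__main__":
    key = b"\x02" * 32  # placeholder key for the demo only
    words = "the cat saw the dog and the dog saw the cat".split()
    tokens = [token(key, w) for w in words]

    # The attacker sees only tokens, yet both profiles match the plaintext's.
    print(frequency_profile(words) == frequency_profile(tokens))
    print(sorted(cooccurring_pairs(words).values())
          == sorted(cooccurring_pairs(tokens).values()))
```

Padding every token to the same length does not change either statistic, which is why the length-hiding step alone does not address this kind of analysis.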
1283 00:47:36,370 --> 00:47:37,629 And the thing there is word vector 1284 00:47:37,630 --> 00:47:39,429 modeling works, even if you don't look at 1285 00:47:39,430 --> 00:47:40,989 the surface form of the word like the 1286 00:47:40,990 --> 00:47:43,089 actual letters. So I 1287 00:47:43,090 --> 00:47:45,219 don't know if you've gone into this 1288 00:47:45,220 --> 00:47:46,989 a lot in your research at all, but it's a 1289 00:47:46,990 --> 00:47:49,269 potential vulnerability that if 1290 00:47:49,270 --> 00:47:51,129 you encrypt your data word by word, just 1291 00:47:51,130 --> 00:47:53,499 the fact that each word is still the same 1292 00:47:53,500 --> 00:47:55,809 token makes them vulnerable 1293 00:47:55,810 --> 00:47:58,029 to being reverse engineered. 1294 00:47:58,030 --> 00:47:59,019 Sort of. 1295 00:47:59,020 --> 00:48:00,399 Yeah, you're totally right. 1296 00:48:00,400 --> 00:48:03,389 Yeah. This is this is a problem. 1297 00:48:03,390 --> 00:48:05,709 But what I want to emphasize once again, 1298 00:48:05,710 --> 00:48:07,899 it's much better 1299 00:48:07,900 --> 00:48:09,649 than upload to the Blendtec stuff. 1300 00:48:09,650 --> 00:48:11,819 So now we upload 1301 00:48:11,820 --> 00:48:13,899 plaintext and they come a plain 1302 00:48:13,900 --> 00:48:16,209 text from plain text is not challenging 1303 00:48:16,210 --> 00:48:17,210 at all. 1304 00:48:17,740 --> 00:48:20,019 So even 1305 00:48:20,020 --> 00:48:22,419 if you perform that stuff, if 1306 00:48:22,420 --> 00:48:24,699 it posts the code 1307 00:48:24,700 --> 00:48:26,320 closely and you have to do it, 1308 00:48:27,400 --> 00:48:29,469 you have to believe that 1309 00:48:29,470 --> 00:48:30,470 they really have to. 1310 00:48:31,510 --> 00:48:33,669 Now, you can just look at the data 1311 00:48:33,670 --> 00:48:35,949 now then if I ask you to 1312 00:48:35,950 --> 00:48:37,449 have do you have to mount a negative 1313 00:48:37,450 --> 00:48:40,159 attack? And I think that's. 
1314 00:48:40,160 --> 00:48:42,319 Yeah, it's, uh, it's 1315 00:48:42,320 --> 00:48:44,759 much better than uploading our problems, 1316 00:48:44,760 --> 00:48:45,760 OK? 1317 00:48:46,910 --> 00:48:49,129 I told you, it's not 1318 00:48:49,130 --> 00:48:50,449 maybe not the final not the best 1319 00:48:50,450 --> 00:48:52,739 solution, but we 1320 00:48:52,740 --> 00:48:55,099 I think we should now shift to 1321 00:48:55,100 --> 00:48:57,499 upload and drop the data, even 1322 00:48:57,500 --> 00:49:00,079 if you can partially decrypt 1323 00:49:00,080 --> 00:49:02,319 it if, uh, if 1324 00:49:02,320 --> 00:49:04,519 you perform a lot of amount of stuff 1325 00:49:04,520 --> 00:49:05,479 of computation. 1326 00:49:05,480 --> 00:49:06,889 Yeah. It still makes it harder to 1327 00:49:06,890 --> 00:49:07,890 analyze. 1328 00:49:08,420 --> 00:49:10,699 Sure. I think it's super hard 1329 00:49:10,700 --> 00:49:13,549 to decrypt all the data 1330 00:49:13,550 --> 00:49:15,979 partially. OK, but 1331 00:49:15,980 --> 00:49:17,789 it's better than uploading plaintext 1332 00:49:17,790 --> 00:49:19,359 here. OK, thanks. 1333 00:49:19,360 --> 00:49:20,389 Please come to the front to pick up your 1334 00:49:20,390 --> 00:49:22,499 seats. We're having our next question 1335 00:49:22,500 --> 00:49:24,799 now in the middle, please. 1336 00:49:24,800 --> 00:49:27,109 Hello. Thanks for the talk. 1337 00:49:27,110 --> 00:49:30,489 I was quite pleasantly. 1338 00:49:30,490 --> 00:49:32,929 I'm interested to see, 1339 00:49:32,930 --> 00:49:35,269 uh, this, uh, talk 1340 00:49:35,270 --> 00:49:38,329 sponsored by a company was started 1341 00:49:38,330 --> 00:49:40,879 by a person from the Chinese National 1342 00:49:40,880 --> 00:49:43,189 Army or People's Army. 1343 00:49:43,190 --> 00:49:44,539 So kudos to that, that they are 1344 00:49:44,540 --> 00:49:47,299 sponsoring such kind of, uh, 1345 00:49:47,300 --> 00:49:49,399 research. 
The question I have is, 1346 00:49:49,400 --> 00:49:51,739 uh, the new anti-terror 1347 00:49:51,740 --> 00:49:54,139 laws in China, how do they affect 1348 00:49:55,820 --> 00:49:57,139 this kind of research? 1349 00:49:57,140 --> 00:49:59,329 Basically, you need to either 1350 00:49:59,330 --> 00:50:01,699 build in a backdoor or have the keys 1351 00:50:01,700 --> 00:50:02,959 sent to China. 1352 00:50:02,960 --> 00:50:05,149 So can you, in your 1353 00:50:06,350 --> 00:50:08,539 working life, use this 1354 00:50:08,540 --> 00:50:10,669 in a way, or is it 1355 00:50:10,670 --> 00:50:13,279 not possible because you need to have 1356 00:50:13,280 --> 00:50:14,989 the backdoor built in? 1357 00:50:14,990 --> 00:50:16,669 Yes, I don't know. 1358 00:50:16,670 --> 00:50:19,009 It's, I, I, 1359 00:50:19,010 --> 00:50:20,659 I'm not concerned with any of that. 1360 00:50:20,660 --> 00:50:22,789 OK, maybe you should look into it. 1361 00:50:27,650 --> 00:50:29,389 Is there a question from the Internet? 1362 00:50:29,390 --> 00:50:30,889 Yes, I have one more question. 1363 00:50:31,940 --> 00:50:34,159 What about context? Like, normal search 1364 00:50:34,160 --> 00:50:36,229 gives you context in the result or 1365 00:50:36,230 --> 00:50:38,479 can consider the context 1366 00:50:38,480 --> 00:50:39,709 when doing the search. 1367 00:50:39,710 --> 00:50:41,360 What's your approach to this? 1368 00:50:45,640 --> 00:50:47,859 Difficult. You mean context 1369 00:50:47,860 --> 00:50:50,079 like, in the morning, like the time 1370 00:50:50,080 --> 00:50:51,529 maybe, or the place where you are 1371 00:50:51,530 --> 00:50:52,959 querying from? 1372 00:50:52,960 --> 00:50:55,419 No, just the words that are around 1373 00:50:55,420 --> 00:50:57,669 the, um, 1374 00:50:57,670 --> 00:50:59,109 the thing you're looking for.
1375 00:50:59,110 --> 00:51:01,269 So some hits might be 1376 00:51:01,270 --> 00:51:02,889 more relevant to the search performed 1377 00:51:02,890 --> 00:51:05,019 than others, despite having 1378 00:51:05,020 --> 00:51:06,020 the same keyword. 1379 00:51:08,740 --> 00:51:11,529 Yeah, once again, yeah, we can 1380 00:51:11,530 --> 00:51:13,209 just might be your problem. 1381 00:51:13,210 --> 00:51:14,210 Yeah, right. 1382 00:51:15,190 --> 00:51:18,519 We have no practical solution 1383 00:51:18,520 --> 00:51:20,649 yet, but once again, 1384 00:51:20,650 --> 00:51:22,400 it's better like uploading a plain text 1385 00:51:24,090 --> 00:51:25,090 file. 1386 00:51:26,080 --> 00:51:28,689 I recommend to encrypted Justis 1387 00:51:28,690 --> 00:51:30,849 techniques and you will 1388 00:51:30,850 --> 00:51:33,429 have got Rothery 1389 00:51:33,430 --> 00:51:35,649 the agencies will have a much 1390 00:51:35,650 --> 00:51:36,969 harder time than now. 1391 00:51:36,970 --> 00:51:39,469 And just looking at the plain text. 1392 00:51:39,470 --> 00:51:40,659 Oh, yeah. 1393 00:51:41,880 --> 00:51:43,979 Or you can yeah, you can 1394 00:51:43,980 --> 00:51:45,929 say, yeah, there is this problem, this 1395 00:51:45,930 --> 00:51:48,059 problem, so don't deal with it but 1396 00:51:48,060 --> 00:51:50,189 don't deal with it means you upload stuff 1397 00:51:50,190 --> 00:51:52,859 in plain text and then you're 1398 00:51:52,860 --> 00:51:53,969 the agencies. 1399 00:51:53,970 --> 00:51:56,099 The military is superheavy if 1400 00:51:56,100 --> 00:51:57,360 you upload plaintext. 1401 00:51:58,570 --> 00:52:00,309 Though you make that the top more 1402 00:52:00,310 --> 00:52:02,549 difficult issue of to perform 1403 00:52:02,550 --> 00:52:03,550 encryption. 1404 00:52:05,820 --> 00:52:07,899 Next question at middle middle 1405 00:52:07,900 --> 00:52:09,299 aged, please. 
1406 00:52:09,300 --> 00:52:11,399 Are you aware of any schemes 1407 00:52:11,400 --> 00:52:13,859 to also offload the index calculation 1408 00:52:13,860 --> 00:52:15,689 to the provider and work on encrypted 1409 00:52:15,690 --> 00:52:16,690 data? 1410 00:52:18,920 --> 00:52:20,599 Whether we are aware of index schemes 1411 00:52:20,600 --> 00:52:23,299 that offload the index calculation, 1412 00:52:23,300 --> 00:52:25,429 as I understood your scheme, 1413 00:52:25,430 --> 00:52:27,319 that client is doing the index 1414 00:52:27,320 --> 00:52:28,320 calculation. 1415 00:52:29,240 --> 00:52:31,399 Now, if the index is to be calculated 1416 00:52:31,400 --> 00:52:33,499 by the service, that's possible to 1417 00:52:33,500 --> 00:52:35,959 make it much easier to create the index 1418 00:52:35,960 --> 00:52:37,099 when you know the plaintext. 1419 00:52:39,410 --> 00:52:40,489 That's something you thought about 1420 00:52:40,490 --> 00:52:41,809 searching, too, in the beginning, 1421 00:52:41,810 --> 00:52:44,809 although I'm not aware of a practical 1422 00:52:44,810 --> 00:52:47,229 scheme, but 1423 00:52:47,230 --> 00:52:49,459 that may be a scheme, 1424 00:52:49,460 --> 00:52:50,659 but I'm not aware of any. 1425 00:52:50,660 --> 00:52:51,660 I don't think you. 1426 00:52:52,680 --> 00:52:54,229 Next question now on this side. 1427 00:52:54,230 --> 00:52:56,259 In the middle, please. 1428 00:52:56,260 --> 00:52:57,579 Thank you for your great talk. 1429 00:52:57,580 --> 00:52:59,499 I was wondering, are there any approaches 1430 00:52:59,500 --> 00:53:01,209 to do a little bit more complicated 1431 00:53:01,210 --> 00:53:03,699 queries like get all the people 1432 00:53:03,700 --> 00:53:05,799 in your database that are older than 30, 1433 00:53:05,800 --> 00:53:06,800 for example? 1434 00:53:08,440 --> 00:53:10,179 OK. Yeah, it's 1435 00:53:11,230 --> 00:53:12,429 the yes, but no, but yes. 
1436 00:53:16,660 --> 00:53:18,309 Well, I mean, there are schemes, right, 1437 00:53:18,310 --> 00:53:21,009 there are schemes to, 1438 00:53:21,010 --> 00:53:24,249 well, for example, order 1439 00:53:24,250 --> 00:53:26,679 your entries, you know, 1440 00:53:27,760 --> 00:53:29,709 you can encrypt in a clever way such that 1441 00:53:29,710 --> 00:53:31,179 you could still order them by ciphertext 1442 00:53:31,180 --> 00:53:32,559 and you would have them ordered. 1443 00:53:32,560 --> 00:53:34,089 If you decrypted them, then you would 1444 00:53:34,090 --> 00:53:36,309 still have the very same order in 1445 00:53:36,310 --> 00:53:37,310 plaintext. But 1446 00:53:38,770 --> 00:53:41,019 it's difficult because some people 1447 00:53:41,020 --> 00:53:43,119 might not necessarily consider that to be, 1448 00:53:43,120 --> 00:53:45,609 well, as secure as you would like 1449 00:53:45,610 --> 00:53:47,709 it to be. If you make 1450 00:53:47,710 --> 00:53:49,899 all the effort of being really 1451 00:53:49,900 --> 00:53:52,179 secure, you have exponential size 1452 00:53:52,180 --> 00:53:54,759 in the ciphertext. And yeah, 1453 00:53:54,760 --> 00:53:55,810 it's not practical. 1454 00:53:57,020 --> 00:53:59,119 Oh, yeah, you can, 1455 00:53:59,120 --> 00:54:01,939 you can make it better 1456 00:54:01,940 --> 00:54:04,069 if you sacrifice some 1457 00:54:04,070 --> 00:54:06,139 security. There are 1458 00:54:06,140 --> 00:54:08,389 some schemes, but 1459 00:54:08,390 --> 00:54:10,509 yeah, 1460 00:54:10,510 --> 00:54:12,739 right now, uh, 1461 00:54:12,740 --> 00:54:14,929 we have made some 1462 00:54:14,930 --> 00:54:17,959 analysis and stuff and tests, 1463 00:54:17,960 --> 00:54:20,029 but, uh, it's not as 1464 00:54:20,030 --> 00:54:22,309 easy as the schemes I 1465 00:54:22,310 --> 00:54:23,249 showed you. Yeah. 1466 00:54:23,250 --> 00:54:25,169 You can do it, but it's much more tricky 1467 00:54:25,170 --> 00:54:26,559 and hairy.
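The ordering idea in this answer can be illustrated with a deliberately naive toy, which is not one of the published order-preserving schemes and should not be used for real data. The hypothetical function `toy_ope_encode` maps a non-negative integer to the running sum of keyed pseudo-random gaps, so larger plaintexts always get larger codes and a server can answer "older than 30" by comparing codes only. As the speakers note, revealing the order is itself a security trade-off.

```python
import hashlib
import hmac


def _gap(key, i, max_gap=1000):
    """Keyed pseudo-random gap of at least 1 for position i."""
    digest = hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()
    return 1 + int.from_bytes(digest[:4], "big") % max_gap


def toy_ope_encode(key, value):
    """Toy order-preserving code: strictly increasing in value, keyed by key.

    Runs in O(value) time and leaks order (and rough magnitude); it is only
    meant to show why range queries become possible on such codes.
    """
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    return sum(_gap(key, i) for i in range(value + 1))


if __name__ == "__main__":
    key = b"\x03" * 32  # placeholder key for the demo only
    ages = [25, 31, 30, 47, 18]
    codes = [toy_ope_encode(key, a) for a in ages]  # what the server stores

    # Client sends one encoded bound; server compares codes without decrypting.
    bound = toy_ope_encode(key, 30)
    matching_rows = [row for row, code in enumerate(codes) if code > bound]
    print(matching_rows)
```

Because every gap is at least 1, the codes are strictly increasing, so sorting or range filtering on the server gives the same rows as it would on the plaintext ages.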
1468 00:54:26,560 --> 00:54:27,560 Yeah. 1469 00:54:28,420 --> 00:54:31,659 You have to consult a lot of people. 1470 00:54:31,660 --> 00:54:32,660 To the teeth, this. 1471 00:54:34,010 --> 00:54:35,749 Next question, please, on that side. You 1472 00:54:35,750 --> 00:54:38,269 have presented two ways for implementing 1473 00:54:38,270 --> 00:54:40,399 index-based searchable 1474 00:54:40,400 --> 00:54:43,159 encryption: one with just the encrypted 1475 00:54:43,160 --> 00:54:45,109 index, and the one 1476 00:54:45,110 --> 00:54:46,879 where you additionally hide the length of 1477 00:54:46,880 --> 00:54:47,809 the index. 1478 00:54:47,810 --> 00:54:50,779 Do you think it's a worthwhile 1479 00:54:50,780 --> 00:54:53,029 step to put the additional effort 1480 00:54:53,030 --> 00:54:55,369 into the second solution, since 1481 00:54:55,370 --> 00:54:57,349 each time you're searching, you're 1482 00:54:57,350 --> 00:54:59,509 basically disclosing the relationship of 1483 00:54:59,510 --> 00:55:01,699 those rows, since you will be 1484 00:55:01,700 --> 00:55:04,069 sending, like, a bunch of search requests? 1485 00:55:04,070 --> 00:55:06,289 And if you just search 1486 00:55:06,290 --> 00:55:07,729 them one by one, you will also get the 1487 00:55:07,730 --> 00:55:09,949 low latency and you get a lot 1488 00:55:09,950 --> 00:55:11,149 of additional network overhead. 1489 00:55:11,150 --> 00:55:12,260 So is it worth it? 1490 00:55:13,730 --> 00:55:14,959 Depends on your needs, I guess. 1491 00:55:14,960 --> 00:55:16,879 I mean, in the first scenario, where you 1492 00:55:16,880 --> 00:55:19,069 don't hide the length, you're still 1493 00:55:19,070 --> 00:55:21,109 vulnerable against the attacker stealing 1494 00:55:21,110 --> 00:55:23,089 your hard disk or database, because then 1495 00:55:23,090 --> 00:55:25,489 they can run the analysis on the size 1496 00:55:25,490 --> 00:55:27,289 of the index values.
1497 00:55:27,290 --> 00:55:28,609 You don't have that in the second scheme 1498 00:55:28,610 --> 00:55:29,959 where you hide the length. 1499 00:55:29,960 --> 00:55:32,059 So if that's, you know, your 1500 00:55:32,060 --> 00:55:34,129 concern, then, well, you'd better 1501 00:55:34,130 --> 00:55:36,199 go for the hiding one instead 1502 00:55:36,200 --> 00:55:38,689 of the cheaper one. 1503 00:55:38,690 --> 00:55:40,219 Again, if you know your 1504 00:55:40,220 --> 00:55:42,289 requirements, you can engineer 1505 00:55:42,290 --> 00:55:44,329 your scheme such that it performs very 1506 00:55:44,330 --> 00:55:45,330 well. 1507 00:55:45,780 --> 00:55:47,609 Do you know how much information this 1508 00:55:47,610 --> 00:55:48,610 leaks? 1509 00:55:49,340 --> 00:55:51,979 It just totally depends on your texts, 1510 00:55:51,980 --> 00:55:53,119 on the structure of the text. 1511 00:55:54,350 --> 00:55:56,719 And also, it's a difficult subject 1512 00:55:56,720 --> 00:55:59,179 matter to define the leakage. 1513 00:56:00,800 --> 00:56:02,929 That's the state of current discussions 1514 00:56:02,930 --> 00:56:04,129 among cryptographers: 1515 00:56:04,130 --> 00:56:05,689 what does leakage actually mean? 1516 00:56:08,080 --> 00:56:09,080 Thank you. 1517 00:56:10,180 --> 00:56:12,669 There is still someone standing 1518 00:56:12,670 --> 00:56:14,079 there, please. 1519 00:56:14,080 --> 00:56:16,239 Um, yeah, so 1520 00:56:16,240 --> 00:56:18,339 for the scenario you briefly mentioned, 1521 00:56:18,340 --> 00:56:20,439 like, I have three terabytes 1522 00:56:20,440 --> 00:56:22,659 of uploaded data and I want to search 1523 00:56:22,660 --> 00:56:25,149 it. Um, I 1524 00:56:25,150 --> 00:56:26,469 see that it makes a lot of sense to 1525 00:56:26,470 --> 00:56:28,089 cooperate with the cloud provider and 1526 00:56:28,090 --> 00:56:29,649 make them search.
1527 00:56:29,650 --> 00:56:32,409 But I sort of wondering 1528 00:56:32,410 --> 00:56:34,599 what the break even is, because if I have 1529 00:56:34,600 --> 00:56:36,669 way less data, let's say, I don't 1530 00:56:36,670 --> 00:56:38,829 know, just a few gigabytes like many 1531 00:56:38,830 --> 00:56:39,880 of us probably have, 1532 00:56:40,940 --> 00:56:43,059 it might also make sense to just 1533 00:56:43,060 --> 00:56:45,009 compute the index on my site, upload the 1534 00:56:45,010 --> 00:56:46,359 index somewhere. I know. 1535 00:56:46,360 --> 00:56:48,369 And then downloading just the index, 1536 00:56:48,370 --> 00:56:50,439 which is way smaller than my actual 1537 00:56:50,440 --> 00:56:51,909 data and then use that. 1538 00:56:51,910 --> 00:56:54,579 So have you looked into 1539 00:56:54,580 --> 00:56:56,919 at which point of of 1540 00:56:56,920 --> 00:56:59,349 the volume of data, um, 1541 00:56:59,350 --> 00:57:01,089 does it start to make sense to employ a 1542 00:57:01,090 --> 00:57:03,129 solution like that? Because I think that 1543 00:57:03,130 --> 00:57:05,679 most people just have a few gigabytes 1544 00:57:05,680 --> 00:57:06,939 and for a few gigabytes. 1545 00:57:06,940 --> 00:57:09,159 I don't I think it's 1546 00:57:09,160 --> 00:57:11,289 faster to just download the index 1547 00:57:11,290 --> 00:57:13,389 search and then download the specific 1548 00:57:13,390 --> 00:57:15,439 part of the encrypted data I want. 1549 00:57:15,440 --> 00:57:17,349 So where to when does it start to pay 1550 00:57:17,350 --> 00:57:18,350 off? 
1551 00:57:19,660 --> 00:57:21,819 Oh, no, we didn't look into 1552 00:57:21,820 --> 00:57:23,919 that because we want to 1553 00:57:25,210 --> 00:57:27,369 build a cloud service. 1554 00:57:27,370 --> 00:57:29,439 That's the sort of stuff 1555 00:57:29,440 --> 00:57:32,049 we're interested in: 1556 00:57:32,050 --> 00:57:34,509 implementing a cloud service that 1557 00:57:34,510 --> 00:57:36,669 can perform search over encrypted data. 1558 00:57:36,670 --> 00:57:39,459 Therefore, we have not looked at 1559 00:57:39,460 --> 00:57:40,460 that solution. 1560 00:57:43,110 --> 00:57:44,649 It's a little bit out of scope of our 1561 00:57:44,650 --> 00:57:45,650 research. 1562 00:57:46,710 --> 00:57:48,509 We still have time for one last question, 1563 00:57:48,510 --> 00:57:49,510 please go ahead. 1564 00:57:50,340 --> 00:57:52,619 Have you ever thought about 1565 00:57:52,620 --> 00:57:54,839 tokenizing all your texts before 1566 00:57:54,840 --> 00:57:56,399 you encrypt them? 1567 00:57:56,400 --> 00:57:58,739 Because, uh, the token list doesn't 1568 00:57:58,740 --> 00:58:00,749 change so much, you can keep it on your 1569 00:58:00,750 --> 00:58:02,879 devices. It would also 1570 00:58:02,880 --> 00:58:06,299 defeat, uh, frequency analysis, 1571 00:58:06,300 --> 00:58:08,489 and the encrypted data would be a lot 1572 00:58:08,490 --> 00:58:09,490 smaller as well. 1573 00:58:11,670 --> 00:58:14,679 Yes, you're right. 1574 00:58:14,680 --> 00:58:16,369 You can do it this way. 1575 00:58:16,370 --> 00:58:17,469 Yeah, well, OK. 1576 00:58:19,120 --> 00:58:20,419 Uh. 1577 00:58:20,420 --> 00:58:22,759 So as it seems, there are no questions 1578 00:58:22,760 --> 00:58:24,759 left. What about the Internet? 1579 00:58:24,760 --> 00:58:27,109 No, the Internet is quiet. So, 1580 00:58:27,110 --> 00:58:28,159 Tobias and Christian, 1581 00:58:28,160 --> 00:58:29,989 thank you very much for your talk. 1582 00:58:29,990 --> 00:58:31,619 Thank you, House.