1 00:00:00,000 --> 00:00:13,047 *rC3 preroll music* 2 00:00:13,047 --> 00:00:17,730 Herald: Our next speaker, Alisa Esage, is an independent vulnerability researcher 3 00:00:17,730 --> 00:00:22,640 and has a notable record of security research achievements, such as the Zero Day 4 00:00:22,640 --> 00:00:29,770 Initiative Silver Bounty Hunter Award 2018. Alisa is going to present her latest 5 00:00:29,770 --> 00:00:36,007 research on the Qualcomm DIAG protocol, which is found abundantly in Qualcomm 6 00:00:36,007 --> 00:00:46,500 Hexagon based cellular modems. Alisa, we're looking forward to your talk now. 7 00:00:46,500 --> 00:00:49,701 Alisa Esage: This is Alisa Esage, you're attending my presentation about Advanced 8 00:00:49,701 --> 00:01:01,010 Hexagon DIAG at Chaos Communication Congress 2020 remote experience. My main 9 00:01:01,010 --> 00:01:06,250 interest as an advanced vulnerability researcher is complex systems and hardened 10 00:01:06,250 --> 00:01:11,920 systems. For the last 10 years I have been researching various classes of software 11 00:01:11,920 --> 00:01:16,280 such as the Windows kernel, browsers, JavaScript engines. And for the last three 12 00:01:16,280 --> 00:01:21,880 years I was focusing mostly on hypervisors. The project that I'm 13 00:01:21,880 --> 00:01:27,970 presenting today was a little side project that I made for distraction a couple of years 14 00:01:27,970 --> 00:01:37,560 ago. The name of this talk, Advanced Hexagon DIAG, is a bit of an understatement 15 00:01:37,560 --> 00:01:45,290 in an attempt to keep this talk a little bit low key on the general internet, 16 00:01:45,290 --> 00:01:50,840 because a big part of the talk will actually be devoted to general 17 00:01:50,840 --> 00:01:56,710 vulnerability research in basebands. But the primary focus of this talk is on the 18 00:01:56,710 --> 00:02:02,899 Hexagon DIAG, also known as QCDM, Qualcomm diagnostic manager.
This is a proprietary 19 00:02:02,899 --> 00:02:09,229 protocol developed by Qualcomm for use in their basebands, and it is included on all 20 00:02:09,229 --> 00:02:18,400 Snapdragon SoCs and modem chips produced by Qualcomm. Modern Qualcomm chips run 21 00:02:18,400 --> 00:02:24,299 on custom silicon with a custom instruction set architecture named 22 00:02:24,299 --> 00:02:30,930 QDSP6 Hexagon. This is important because all the DIAG handlers that we will be 23 00:02:30,930 --> 00:02:41,699 dealing with are written in this instruction set architecture. As usual 24 00:02:41,699 --> 00:02:47,769 with my talks, I have adjusted the materials of this presentation for 25 00:02:47,769 --> 00:02:52,659 the full spectrum of audiences. Specifically, the first part of 26 00:02:52,659 --> 00:03:00,699 the presentation is mostly aimed at research directors and high level 27 00:03:00,699 --> 00:03:06,719 technical staff. And the last part is more deeply technical. And it would be mostly 28 00:03:06,719 --> 00:03:14,510 interesting to specialized vulnerability researchers and low level programmers that 29 00:03:14,510 --> 00:03:25,400 are somehow related to this particular area. Let's start from the top level 30 00:03:25,400 --> 00:03:31,540 overview of cellular technology. This mind map presents a simplified view of various 31 00:03:31,540 --> 00:03:36,739 types of entities that we'd have to deal with with respect to basebands. It's not a 32 00:03:36,739 --> 00:03:44,659 complete diagram, of course; it only presents the classes of entities that 33 00:03:44,659 --> 00:03:51,540 exist in this space. Also, this mind map is specific to the client side equipment, 34 00:03:51,540 --> 00:03:57,109 the user equipment, and it completely omits any server side considerations, which are a 35 00:03:57,109 --> 00:04:02,290 world of their own. There exists quite a large number of cellular protocols on the 36 00:04:02,290 --> 00:04:08,199 planet.
From the user perspective, this is simple. This is usually the shared name 37 00:04:08,199 --> 00:04:15,469 3G, 4G that you see on the mobile screen. But in reality, this simple name, that 38 00:04:15,469 --> 00:04:27,409 generation name encodes - may encode several different distinct technologies. 39 00:04:27,409 --> 00:04:32,620 There are a few key points about cellular protocols that are crucial to understand 40 00:04:32,620 --> 00:04:38,860 before starting to approach this area. The first one is the concept of a generation. 41 00:04:38,860 --> 00:04:45,379 This is simple. This is simply 1G, 2G and so on. The generic name of the family of 42 00:04:45,379 --> 00:04:49,910 protocols that are supported in a particular generation. Generation is 43 00:04:49,910 --> 00:04:55,539 simply a marketing name, for users. It doesn't really have any strict technical 44 00:04:55,539 --> 00:05:02,199 meaning. And generations represent the evolution of cellular protocols in time. 45 00:05:02,199 --> 00:05:06,840 The second most important thing about cellular protocols is the air interface. 46 00:05:06,840 --> 00:05:13,629 This is.. or the protocol, which actually.. this is the lowest level protocol which 47 00:05:13,629 --> 00:05:20,270 defines how exactly the cellular signal is digitized and read from the 48 00:05:20,270 --> 00:05:26,700 electromagnetic wave and how exactly the different players in this field divide the 49 00:05:26,700 --> 00:05:32,990 space. Historically, there existed two main implementations of this low level 50 00:05:32,990 --> 00:05:39,330 code called TDMA and CDMA. TDMA means time division multiple access, which basically 51 00:05:39,330 --> 00:05:43,670 divides the entire electromagnetic spectrum within the radio band into time 52 00:05:43,670 --> 00:05:51,490 slots that are rotated in a round robin manner by various mobile phones so that 53 00:05:51,490 --> 00:06:04,319 they speak in turns. TDMA was the base for the GSM technology. 
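Editorial sketch: the round-robin time slot idea described above can be illustrated with a few lines of Python. This is a toy model, not baseband code. The function name `tdma_schedule` and the phone labels are hypothetical, and the only real-world figure used is that a GSM TDMA frame is divided into 8 time slots.

```python
SLOTS_PER_FRAME = 8  # a GSM TDMA frame is divided into 8 time slots


def tdma_schedule(phones, n_frames):
    """Toy TDMA model: every phone owns one slot and transmits only in it.

    Returns a list of frames; each frame is a list of length
    SLOTS_PER_FRAME holding the phone that talks in that slot,
    or None if the slot is idle.
    """
    if len(phones) > SLOTS_PER_FRAME:
        raise ValueError("more phones than time slots on this carrier")
    frames = []
    for _ in range(n_frames):
        # The same slot assignment repeats frame after frame,
        # so the phones "speak in turns" in a round robin manner.
        frame = [phones[slot] if slot < len(phones) else None
                 for slot in range(SLOTS_PER_FRAME)]
        frames.append(frame)
    return frames


if __name__ == "__main__":
    for number, frame in enumerate(tdma_schedule(["A", "B", "C"], 2)):
        print(number, frame)
```

The point of the model is only that transmissions on one frequency are separated in time; real GSM adds bursts, guard periods and frequency hopping on top of this.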
And GSM was the main 54 00:06:04,319 --> 00:06:09,919 protocol used on this planet for a long time. Another low level implementation is 55 00:06:09,919 --> 00:06:16,689 CDMA. It was a little bit more complex from the beginning. It stands for code 56 00:06:16,689 --> 00:06:24,300 division multiple access. And instead of dividing the spectrum in time slots and 57 00:06:24,300 --> 00:06:32,580 dividing the protocol in bursts, CDMA uses random codes that are assigned to mobile 58 00:06:32,580 --> 00:06:43,060 phones so that this code can be used as an additional randomizing mask against the 59 00:06:43,060 --> 00:06:48,400 modulation protocol. And multiple user equipments can talk on the same frequency 60 00:06:48,400 --> 00:06:57,110 without interrupting each other. Note here that CDMA was developed by Qualcomm and it 61 00:06:57,110 --> 00:07:03,159 was mostly used in the United States. So at the level of 2G, there were two main 62 00:07:03,159 --> 00:07:11,581 protocols, GSM based on TDMA and cdmaOne based on CDMA. In the third 63 00:07:11,581 --> 00:07:17,919 generation of mobile protocols these two branches of development were continued. So 64 00:07:17,919 --> 00:07:24,160 GSM evolved into UMTS, while cdmaOne evolved into CDMA2000. The important point 65 00:07:24,160 --> 00:07:31,029 here is that UMTS has at this point already adopted the low level air 66 00:07:31,029 --> 00:07:37,340 interface protocol from CDMA, and eventually, in the fourth generation of 67 00:07:37,340 --> 00:07:41,240 protocols, these two branches of development came together to create the 68 00:07:41,240 --> 00:07:52,680 LTE technology, and the same goes for 5G.
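Editorial sketch: the code-division idea mentioned in this passage, where several phones share one frequency and are separated by per-phone codes, can be shown with a toy example. The talk speaks of "random codes"; this sketch assumes two short orthogonal (Walsh-style) codes instead, because they make the separation exact and easy to check by hand. All names here (`CODES`, `spread`, `channel`, `despread`) are hypothetical.

```python
# Two orthogonal spreading codes (their dot product is zero).
CODES = {
    "phone_a": [1, 1, 1, 1],
    "phone_b": [1, -1, 1, -1],
}


def spread(bits, code):
    """Multiply every data bit (+1 or -1) by the phone's whole code."""
    return [bit * chip for bit in bits for chip in code]


def channel(*signals):
    """The shared frequency simply adds all transmissions together."""
    return [sum(chips) for chips in zip(*signals)]


def despread(signal, code):
    """Correlate the mixed signal with one code to recover that phone's bits."""
    n = len(code)
    bits = []
    for start in range(0, len(signal), n):
        window = signal[start:start + n]
        correlation = sum(s * c for s, c in zip(window, code))
        bits.append(1 if correlation > 0 else -1)
    return bits


if __name__ == "__main__":
    bits_a, bits_b = [1, -1, 1], [-1, -1, 1]
    mixed = channel(spread(bits_a, CODES["phone_a"]),
                    spread(bits_b, CODES["phone_b"]))
    print(despread(mixed, CODES["phone_a"]))
    print(despread(mixed, CODES["phone_b"]))
```

Because the codes are orthogonal, correlating with one phone's code cancels the other phone's contribution, which is the sense in which both can "talk on the same frequency without interrupting each other".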
This is a bit important for us as from the 69 00:07:52,680 --> 00:07:57,909 offensive perspective, because first of all, all of this technologies including 70 00:07:57,909 --> 00:08:04,999 the air interfaces represents separate bits of code with separate parsing 71 00:08:04,999 --> 00:08:09,900 algorithms within the baseband firmware. And all of them are usually presented in 72 00:08:09,900 --> 00:08:15,099 each baseband, regardless of which one you actually use. Does your mobile provider 73 00:08:15,099 --> 00:08:20,919 actually support. Another important and not obvious thing from the offensive 74 00:08:20,919 --> 00:08:29,940 security perspective here is that because of this, evolutionary development of the.. 75 00:08:29,940 --> 00:08:34,669 protocols are not actually completely distinct. So if you think about LTE, it is 76 00:08:34,669 --> 00:08:39,289 not a completely different protocol from GSM, but instead it is based largely on 77 00:08:39,289 --> 00:08:47,600 the same internal structures. And in fact, if you look at the specifications, some of 78 00:08:47,600 --> 00:08:53,560 them are almost directly relevant. The specifications of the GSM 2G, some of them 79 00:08:53,560 --> 00:08:59,810 are still directly relevant to some extent to LTE. This is also important when you 80 00:08:59,810 --> 00:09:06,350 start analyzing protocols from the offensive perspective. The cellular 81 00:09:06,350 --> 00:09:17,460 protocols are structured in a nested way, in layers. Layers is the official 82 00:09:17,460 --> 00:09:25,120 terminology adopted by the specifications with the exception of level zero. Here I 83 00:09:25,120 --> 00:09:29,980 just edited it for convenience, but it's in the specifications layer start from one 84 00:09:29,980 --> 00:09:34,649 and proceed to three. 
From the offensive perspective, the most interesting is level 85 00:09:34,649 --> 00:09:39,050 three, as you can see from the screenshot of the specifications, because it encodes 86 00:09:39,050 --> 00:09:45,260 most of the high level protocol data, such as handling SMS and GSM. This is the part 87 00:09:45,260 --> 00:09:49,830 of the protocol which actually contains interesting data structures with TLV 88 00:09:49,830 --> 00:09:58,550 values and so on. When people talk about attacking basebands, they usually mean 89 00:09:58,550 --> 00:10:06,010 attacking the baseband over the air: the OTA attack vector, which is definitely one of 90 00:10:06,010 --> 00:10:11,930 the most interesting. But let's take a step back and consider the entire big 91 00:10:11,930 --> 00:10:21,070 picture of the baseband ecosystem. This diagram presents a unified view of the 92 00:10:21,070 --> 00:10:28,009 generalized architecture of a modern baseband with attack surfaces. First of 93 00:10:28,009 --> 00:10:34,680 all, there are two separate distinct processors: the AP, application processor, 94 00:10:34,680 --> 00:10:40,140 and the MP, which is the modem processor. It may be either a DSP or another CPU. 95 00:10:40,140 --> 00:10:45,290 Usually there are two separate processors and each one of them runs a separate 96 00:10:45,290 --> 00:10:51,311 operating system. In the case of the AP, it may be Android or iOS, and the baseband 97 00:10:51,311 --> 00:10:55,940 processor will run on some sort of real-time operating system provided by the 98 00:10:55,940 --> 00:11:03,400 mobile vendor. An important point here is that on modern implementations, the baseband is 99 00:11:03,400 --> 00:11:08,649 actually protected by some sort of secure execution environment, maybe TrustZone on 100 00:11:08,649 --> 00:11:17,100 Android or SEPOS on Apple devices. Which means that the privilege boundary which is 101 00:11:17,100 --> 00:11:22,820 depicted here on the left side is dual sided.
So even if you have kernel access 102 00:11:22,820 --> 00:11:29,740 to the Android kernel, you still are not supposed to be able to read the memory of 103 00:11:29,740 --> 00:11:33,620 the baseband or somehow interfere with its operation, at least on modern 104 00:11:33,620 --> 00:11:38,560 production smartphones. And the same goes for the baseband, which is not 105 00:11:38,560 --> 00:11:45,540 supposed to be able to access the application processor directly. So these two are 106 00:11:45,540 --> 00:11:50,191 mutually distrusting entities that are separated from each other. And so there 107 00:11:50,191 --> 00:12:01,892 exists a privilege boundary, which represents an attack surface. Within 108 00:12:01,892 --> 00:12:07,389 the real-time operating system, there are three large attack surfaces. Starting from 109 00:12:07,389 --> 00:12:14,180 right to left: the rightmost gray box represents the attack surface of the 110 00:12:14,180 --> 00:12:20,639 cellular stacks. This is the code which actually parses the cellular protocols. 111 00:12:20,639 --> 00:12:31,699 It usually runs in several distinct real-time operating system tasks. And this part 112 00:12:31,699 --> 00:12:38,519 of the attack surface handles all the layers of the protocol. There is a huge 113 00:12:38,519 --> 00:12:44,070 amount of parsing that happens here. The second box represents the various 114 00:12:44,070 --> 00:12:50,980 management protocols. The simplest one to think about is the AT command protocol. It 115 00:12:50,980 --> 00:12:56,700 is still widely included in all basebands, and it's even usually exposed in some way 116 00:12:56,700 --> 00:13:01,279 to the application processor. So you can actually send some AT commands to the 117 00:13:01,279 --> 00:13:09,000 cellular modem. A bit more interesting are the vendor specific management 118 00:13:09,000 --> 00:13:16,680 protocols; one of them is the DIAG protocol.
Because modern basebands are 119 00:13:16,680 --> 00:13:22,569 very complex, vendors need some sort of specialized protocol to enable 120 00:13:22,569 --> 00:13:28,910 configuration and diagnostics for the OEMs. In the case of Qualcomm, for example, 121 00:13:28,910 --> 00:13:37,170 DIAG is just one of the many diagnostic protocols involved. The third box is what 122 00:13:37,170 --> 00:13:45,350 I call the RTOS core. It is various core level functionality, such as the 123 00:13:45,350 --> 00:13:57,770 code which implements the interface to the application processor. On the side of 124 00:13:57,770 --> 00:14:04,019 the application operating system, such as Android, there are also 2 attack surfaces 125 00:14:04,019 --> 00:14:10,370 that are attackable from the baseband. The first one is the peripheral drivers, 126 00:14:10,370 --> 00:14:13,579 because the baseband is a separate peripheral. So it requires some 127 00:14:13,579 --> 00:14:21,110 specialized drivers that handle I/O and such things. And the second one is the 128 00:14:21,110 --> 00:14:29,002 attack surface represented by various interface handlers, because the baseband 129 00:14:29,002 --> 00:14:34,800 and the main operating system cannot communicate directly. They use some sort 130 00:14:34,800 --> 00:14:39,839 of a specialized interface to do that. In the case of Qualcomm this is shared memory. 131 00:14:39,839 --> 00:14:44,670 And so these shared memory implementations are usually quite complex and they 132 00:14:44,670 --> 00:14:51,460 represent an attack surface on both sides. And finally, the third piece of this 133 00:14:51,460 --> 00:14:57,319 diagram is in the lowest part. I have depicted two grey boxes which are related 134 00:14:57,319 --> 00:15:03,139 to the trusted execution environment. Because typically a modem runs as a 135 00:15:03,139 --> 00:15:11,379 Trustlet in a secure environment.
So technically, the attack surfaces that 136 00:15:11,379 --> 00:15:16,550 exist within TrustZone or are related to it can also be useful for baseband offensive 137 00:15:16,550 --> 00:15:22,890 research. Here we can distinguish at least two large attack surfaces. The first one 138 00:15:22,890 --> 00:15:31,490 is the secure monitor call handlers, which is the core interface that 139 00:15:31,490 --> 00:15:36,960 handles calls from the application processor to the TrustZone. And the second 140 00:15:36,960 --> 00:15:44,810 one are the Trustlets. They are separate pieces of code which are executed and 141 00:15:44,810 --> 00:15:56,790 protected by the TrustZone. On this diagram, I have also added some 142 00:15:56,790 --> 00:16:02,839 information about data codecs. I'm not sure if they are supposed to be in the RTOS 143 00:16:02,839 --> 00:16:06,319 core, because these things are directly accessible from the cellular stacks 144 00:16:06,319 --> 00:16:14,959 usually, especially ASN.1, in which I have seen some bugs reachable from the over the 145 00:16:14,959 --> 00:16:23,009 air interface. On this diagram, I have shown some examples of vulnerabilities. I 146 00:16:23,009 --> 00:16:26,769 will not discuss them in detail here since it's not the point of the 147 00:16:26,769 --> 00:16:32,480 presentation, but at least for the ones from Pwn2Own, you can find the writeups on 148 00:16:32,480 --> 00:16:46,589 the Internet. To discuss baseband offensive tools and approaches, I have 149 00:16:46,589 --> 00:16:50,720 narrowed down the previous diagram to just one attack surface, the over the air 150 00:16:50,720 --> 00:16:55,620 attack surface. This is the attack surface which is represented by parsing 151 00:16:55,620 --> 00:16:59,480 implementations of various cellular protocols inside the baseband operating 152 00:16:59,480 --> 00:17:06,610 system. And this is the attack surface that we can reach from the air interface.
153 00:17:06,610 --> 00:17:13,390 In order to accomplish that, we need a transceiver such as a software defined radio 154 00:17:13,390 --> 00:17:21,170 or a mobile tester, which is able to talk the specific cellular protocol that we're 155 00:17:21,170 --> 00:17:28,780 planning to attack. The simplest way to accomplish this is to use some sort of a 156 00:17:28,780 --> 00:17:34,730 software defined radio, such as an Ettus Research USRP or bladeRF, and install an open 157 00:17:34,730 --> 00:17:41,240 source implementation of a base station such as OpenBTS or OpenBSC. The thing to 158 00:17:41,240 --> 00:17:50,050 note here is that the software based implementations actually lag behind the 159 00:17:50,050 --> 00:17:54,970 development of technologies. Implementations of GSM base stations are 160 00:17:54,970 --> 00:18:03,630 very well established and popular, such as OpenBTS. And in fact, when I tried to 161 00:18:03,630 --> 00:18:15,140 establish a BTS with my USRP, it was quite simple. For UMTS and LTE, there exist fewer 162 00:18:15,140 --> 00:18:19,950 software based implementations and also there are more constraints on the 163 00:18:19,950 --> 00:18:26,310 hardware. For example, my model of the USRP does not support UMTS due to resource 164 00:18:26,310 --> 00:18:31,690 constraints. And the most interesting thing here is that there does not exist 165 00:18:31,690 --> 00:18:36,580 any software based implementation of CDMA that you can use to establish a base 166 00:18:36,580 --> 00:18:53,270 station. This is a pseudorandom diagram of one of the Snapdragon chips. There exists 167 00:18:53,270 --> 00:18:58,820 a huge number of various models of Snapdragons. This one I have chosen 168 00:18:58,820 --> 00:19:05,680 pseudorandomly when I was searching for some sort of visual diagram. Qualcomm used 169 00:19:05,680 --> 00:19:12,030 to include some high level diagrams of the architecture in their marketing materials 170 00:19:12,030 --> 00:19:19,400 previously.
But they don't do this anymore, and this particular diagram is 171 00:19:19,400 --> 00:19:26,820 from a technical specification of a particular model, the 820. Also this particular 172 00:19:26,820 --> 00:19:34,420 model of Snapdragon is... a bit interesting because it is the first one that included 173 00:19:34,420 --> 00:19:44,790 the artificial intelligence agent, which is also based on Hexagon. For our 174 00:19:44,790 --> 00:19:52,890 purposes, the main interest here is the processors. The majority of Snapdragons 175 00:19:52,890 --> 00:19:59,630 include quite a long list of processors. There are at least 4 ARM-based Kryo CPUs 176 00:19:59,630 --> 00:20:11,480 that actually run the Android operating system. Then there are the Adreno GPUs and 177 00:20:11,480 --> 00:20:16,380 then there are several Hexagons. On the most recent models there is not just one 178 00:20:16,380 --> 00:20:23,360 Hexagon processing unit, but several of them. And they are named according to 179 00:20:23,360 --> 00:20:28,030 their purposes. Each one of these Hexagon cores is responsible for 180 00:20:28,030 --> 00:20:35,770 handling a specific functionality. For example, the MDSP handles the modem and runs the 181 00:20:35,770 --> 00:20:44,260 real-time operating system. The ADSP handles media and the CDSP handles 182 00:20:44,260 --> 00:20:52,540 compute. So the Hexagons actually represent around one half of the 183 00:20:52,540 --> 00:21:08,771 processing power on modern Snapdragons. There are two key points about the Hexagon 184 00:21:08,771 --> 00:21:17,501 architecture from the hardware perspective. First of all, Hexagon 185 00:21:17,501 --> 00:21:25,410 is specialized for parallel processing. And so the first concept is variable size 186 00:21:25,410 --> 00:21:31,000 instruction packets. It means that several instructions can execute 187 00:21:31,000 --> 00:21:42,330 simultaneously in separate execution units.
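Editorial sketch: what "several instructions execute simultaneously" means for someone reading disassembly can be modeled in Python. The assumption, taken from the publicly available Hexagon programmer's reference manual as this editor understands it, is that instructions grouped in one packet all read the register values from before the packet. Registers are modeled as a plain dictionary, and `execute_packet` and `execute_sequential` are hypothetical helper names.

```python
def execute_packet(regs, packet):
    """Run one packet: every instruction sees the pre-packet register state."""
    snapshot = dict(regs)          # values as they were before the packet
    results = {}
    for dest, compute in packet:
        if dest in results:
            raise ValueError("two instructions in a packet write " + dest)
        results[dest] = compute(snapshot)
    updated = dict(regs)
    updated.update(results)        # all writes land together at the end
    return updated


def execute_sequential(regs, instructions):
    """Run the same instructions one after another, as on a scalar CPU."""
    regs = dict(regs)
    for dest, compute in instructions:
        regs[dest] = compute(regs)
    return regs


# Model of a packet like { r0 = r1; r1 = r0 }
SWAP = [
    ("r0", lambda r: r["r1"]),
    ("r1", lambda r: r["r0"]),
]

if __name__ == "__main__":
    start = {"r0": 1, "r1": 2}
    print(execute_packet(start, SWAP))      # the two registers are swapped
    print(execute_sequential(start, SWAP))  # both registers end up equal
```

The practical consequence for reverse engineering is that the instructions inside one pair of curly brackets cannot be read top to bottom like ordinary sequential assembly.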
It also uses hardware 188 00:21:42,330 --> 00:21:48,990 multithreading for the same purposes. On the right side of the slide here is an 189 00:21:48,990 --> 00:22:00,630 example of the Hexagon assembly. It is quite funny at times. These curly brackets 190 00:22:00,630 --> 00:22:07,160 represent the instructions that are executed simultaneously. And these 191 00:22:07,160 --> 00:22:15,500 instructions must be compatible in order to be able to use the distinct processing 192 00:22:15,500 --> 00:22:21,040 slots. And then there is the funny .new notation which actually enables the 193 00:22:21,040 --> 00:22:26,050 instructions to use both the old and the new value of a particular register within 194 00:22:26,050 --> 00:22:32,850 the same instruction cycle. This provides quite a bit of optimization on the lower 195 00:22:32,850 --> 00:22:41,200 level. For more information, I can direct you to the Hexagon specification and 196 00:22:41,200 --> 00:22:53,830 programmer's reference manual, which is available from the Qualcomm website. The 197 00:22:53,830 --> 00:22:59,270 concept of production fusing is quite common. As I said previously, it's a 198 00:22:59,270 --> 00:23:05,590 common practice for mobile device vendors to lock down the devices before they enter 199 00:23:05,590 --> 00:23:11,540 the market to prevent modifications and tinkering. And for the purposes of this 200 00:23:11,540 --> 00:23:17,300 locking down, there are several ways how this can be accomplished. 201 00:23:17,300 --> 00:23:24,356 Usually various advanced diagnostic and debugging functionalities are removed from 202 00:23:24,356 --> 00:23:30,820 either software or hardware or both. It is quite common that these functionalities are 203 00:23:30,820 --> 00:23:37,180 only removed from software while the hardware remains there.
And in such a case, 204 00:23:37,180 --> 00:23:43,869 eventually the researchers will come up with their own software based 205 00:23:43,869 --> 00:23:50,050 implementation of this functionality, as in the case of some custom iOS kernel 206 00:23:50,050 --> 00:23:55,910 debuggers, for example. In the case of Qualcomm, there was at some point a leaked 207 00:23:55,910 --> 00:24:02,416 internal memo which discusses what exactly they are doing for production fusing the 208 00:24:02,416 --> 00:24:15,730 devices. In addition to production fusing, in the case of modern Androids, the 209 00:24:15,730 --> 00:24:22,860 baseband runs within the TrustZone. And on my implementation, it is already quite 210 00:24:22,860 --> 00:24:28,680 locked down. The baseband uses a separate component 211 00:24:28,680 --> 00:24:36,510 named the MBA, which stands for the modem boot authenticator. And this entire thing 212 00:24:36,510 --> 00:24:42,210 is run by the subsystem of the Android kernel named PIL, the peripheral image loader. 213 00:24:42,210 --> 00:24:50,820 You can open the source code and investigate how exactly it looks. And the 214 00:24:50,820 --> 00:24:57,430 purpose of the MBA is to authenticate the modem firmware so that you would not be 215 00:24:57,430 --> 00:25:04,000 able to inject some arbitrary commands into the modem firmware and flash it. This 216 00:25:04,000 --> 00:25:09,250 is another side of the hardening, which makes it very difficult to inject any 217 00:25:09,250 --> 00:25:13,260 arbitrary code into the baseband. Basically, the only way to do this is 218 00:25:13,260 --> 00:25:23,130 through a software vulnerability. During this project I have partially reverse engineered 219 00:25:23,130 --> 00:25:33,360 the Hexagon modem firmware from my implementation, from my Nexus 6P.
The 220 00:25:33,360 --> 00:25:38,770 process of reverse engineering is not very difficult because all you need is to 221 00:25:38,770 --> 00:25:44,950 download the firmware from the website, Google's website in this case. Then you 222 00:25:44,950 --> 00:25:50,960 need to find the binary which corresponds to the modem firmware. This binary is 223 00:25:50,960 --> 00:25:57,680 actually a compound binary that must be divided into separate binaries that 224 00:25:57,680 --> 00:26:04,940 represent specific sections inside the firmware. And for that purpose we can use 225 00:26:04,940 --> 00:26:11,410 the unify_trustlet script. After you have split the baseband firmware into separate 226 00:26:11,410 --> 00:26:18,270 sections, you can load them into IDA Pro. There are several plugins available for 227 00:26:18,270 --> 00:26:26,110 IDA Pro that support Hexagon. I have tried one of them. I think it was GSMK and it 228 00:26:26,110 --> 00:26:35,650 works quite well for basic reverse engineering purposes. Notable here is that 229 00:26:35,650 --> 00:26:41,660 some sections of the modem firmware are compressed and relocated at runtime, so 230 00:26:41,660 --> 00:26:48,350 you would not be able to reverse engineer them unless you can decompress them, 231 00:26:48,350 --> 00:26:52,270 which is also a bit of a challenge because Qualcomm uses some internal 232 00:26:52,270 --> 00:27:02,000 compression algorithm for that. For the reverse engineering the main approach here 233 00:27:02,000 --> 00:27:06,010 is to get started with some root points. For example, because this is a real-time 234 00:27:06,010 --> 00:27:11,290 operating system, we know that it should have some task 235 00:27:11,290 --> 00:27:16,340 structures that we can locate. And from there we can locate some interesting code.
236 00:27:16,340 --> 00:27:20,160 In the case of Hexagon this is a bit non-trivial because, as I said, it doesn't 237 00:27:20,160 --> 00:27:24,930 have any log strings. So even though you may locate something that looks like a 238 00:27:24,930 --> 00:27:30,530 task struct, it's not clear which code it actually represents. So the first 239 00:27:30,530 --> 00:27:43,360 step here is to apply the log strings that were removed from the binary by QShrink. I 240 00:27:43,360 --> 00:27:51,920 think the only way to do it is by using the msg_hash.txt file from the leaked 241 00:27:51,920 --> 00:27:57,590 sources. This file is not supposed to be available either on the mobile devices 242 00:27:57,590 --> 00:28:05,470 or in some open ecosystem. And after you have applied these log strings, you will 243 00:28:05,470 --> 00:28:10,841 be able to rename some functions based on these log strings, because the 244 00:28:10,841 --> 00:28:17,420 log strings often contain the names of the source file, the source module from which the 245 00:28:17,420 --> 00:28:27,090 code was built. So it creates an opportunity to understand what exactly this code is 246 00:28:27,090 --> 00:28:34,920 doing. Debugging was completely unavailable in my case, and I realized 247 00:28:34,920 --> 00:28:44,820 that it would require a couple more months of work to make it work, and the 248 00:28:44,820 --> 00:28:49,490 only way I think, and the best way, is to create a software based debugger similar 249 00:28:49,490 --> 00:28:57,100 to ModKit, the publication that I will be referencing in the references, based on a 250 00:28:57,100 --> 00:29:05,520 software vulnerability in either the modem itself or in some authenticator or in the 251 00:29:05,520 --> 00:29:09,700 TrustZone, so that we can inject software debugger callbacks into the 252 00:29:09,700 --> 00:29:20,180 baseband and connect it to the GDB stub.
This is how the part of the firmware looks 253 00:29:20,180 --> 00:29:28,040 that has log strings stripped out. Here it already has some names applied using an IDA 254 00:29:28,040 --> 00:29:32,940 script. So of course there were no such names initially, only the hashes. Each one 255 00:29:32,940 --> 00:29:38,450 of these hashes represents a log string that you can take from the message hash 256 00:29:38,450 --> 00:29:48,720 file. And here is what you can get after you have applied the textual messages and 257 00:29:48,720 --> 00:29:54,120 renamed some functions. In this case, you would be able to find some hundreds of 258 00:29:54,120 --> 00:29:59,600 procedures that are directly related to the DIAG subsystem. And in a similar way 259 00:29:59,600 --> 00:30:07,460 you can locate various subsystems related to over the air vectors as well. But 260 00:30:07,460 --> 00:30:17,650 unfortunately, the majority of the OTA vectors are located in the segments that are not 261 00:30:17,650 --> 00:30:23,190 immediately available in the firmware, the ones that are compressed and relocated. 262 00:30:23,190 --> 00:30:31,360 Meanwhile, I have tried many different things during this project. One thing 263 00:30:31,360 --> 00:30:37,360 that definitely worked is building the MSM kernel. There is nothing special about 264 00:30:37,360 --> 00:30:44,980 this, just a regular cross-build. Another well known offensive approach is 265 00:30:44,980 --> 00:30:50,280 firmware downgrades, when you take some old firmware that contains a well-known 266 00:30:50,280 --> 00:30:56,070 security vulnerability and flash it and use the bug to create an exploit to 267 00:30:56,070 --> 00:31:06,680 achieve some additional functionality or introspection into the system. This part 268 00:31:06,680 --> 00:31:13,390 definitely works; downgrades are trivial both on the entire firmware and the modem as 269 00:31:13,390 --> 00:31:18,870 well as the TrustZone.
I did try to build the Qualcomm firmware from the leaked 270 00:31:18,870 --> 00:31:23,420 source code. I assigned just a few days to the task since it's not mission- 271 00:31:23,420 --> 00:31:29,700 critical and I ran out of time; it probably was a different version of the source 272 00:31:29,700 --> 00:31:37,820 code. But actually, this is not a critical project because building leaked 273 00:31:37,820 --> 00:31:42,250 firmware is not directly relevant to finding new bugs in the production 274 00:31:42,250 --> 00:31:53,140 firmware. So I just set it aside for some later investigation. I have also 275 00:31:53,140 --> 00:31:58,380 investigated the ramdump ecosystem a little bit, on the software side at least. 276 00:31:58,380 --> 00:32:10,640 And it seems that it's also fused quite reliably. This is when I remembered about 277 00:32:10,640 --> 00:32:16,890 the Qualcomm DIAG. During the initial reconnaissance I stumbled on some 278 00:32:16,890 --> 00:32:23,720 whitepapers and slides that mentioned the Qualcomm diagnostic protocol. And it 279 00:32:23,720 --> 00:32:27,960 seemed like quite a powerful protocol, specifically with respect to reconfiguring 280 00:32:27,960 --> 00:32:33,910 the baseband. So I decided, first of all, to test it in case it would 281 00:32:33,910 --> 00:32:37,810 actually provide some advanced introspection functionality and then 282 00:32:37,810 --> 00:32:48,790 probably to use the protocol for enabling log dumps. Qualcomm DIAG or QCDM 283 00:32:48,790 --> 00:32:53,290 is a proprietary protocol developed by Qualcomm for the purposes of advanced 284 00:32:53,290 --> 00:32:59,910 baseband software configuration and diagnostics. It is mostly aimed at OEM 285 00:32:59,910 --> 00:33:07,410 developers, not at users. The Qualcomm DIAG protocol consists of around 200 286 00:33:07,410 --> 00:33:14,660 commands, at least in theory.
Some of them are quite powerful on paper, such as 287 00:33:14,660 --> 00:33:25,450 downloader mode and read/write memory. Initially DIAG was partially reverse 288 00:33:25,450 --> 00:33:33,580 engineered around 2010 and included in the open source project named ModemManager. 289 00:33:33,580 --> 00:33:39,680 And then it was also exposed in a presentation at the Chaos Communication 290 00:33:39,680 --> 00:33:49,840 Congress 2011 by Guillaume Delugré. I think this presentation popularized it and 291 00:33:49,840 --> 00:33:55,050 this is the one that introduced me to this protocol. Unfortunately, that presentation 292 00:33:55,050 --> 00:34:01,771 is not really relevant - the majority of it - to modern production phones, but it does 293 00:34:01,771 --> 00:34:08,200 provide a high level overview and a general expectation of what you will have 294 00:34:08,200 --> 00:34:15,149 to deal with. From the offensive perspective, the DIAG protocol represents 295 00:34:15,149 --> 00:34:21,240 a local attack vector from the application processor to the baseband. A common 296 00:34:21,240 --> 00:34:27,319 scenario of how it can be useful is unlocking mobile phones which are locked 297 00:34:27,319 --> 00:34:33,269 to a particular mobile carrier. If we find a memory corruption vulnerability in the DIAG 298 00:34:33,269 --> 00:34:40,829 protocol, it may be possible to execute code directly on the baseband and change 299 00:34:40,829 --> 00:34:45,089 some internal settings. Historically, this is usually accomplished through the AT 300 00:34:45,089 --> 00:34:51,429 command handlers, but internal proprietary protocols are also very convenient for 301 00:34:51,429 --> 00:34:59,740 that. The second scenario in which attacking DIAG can be useful is using it for 302 00:34:59,740 --> 00:35:08,750 injecting a software based debugger.
If you can find a bug in DIAG that enables 303 00:35:08,750 --> 00:35:14,440 read/write capability on the baseband, you can inject some debugging hooks and 304 00:35:14,440 --> 00:35:22,509 eventually connect it to a GDB stub. So it makes it possible to create a software based 305 00:35:22,509 --> 00:35:32,450 debugger even when JTAG is not available. What has changed in DIAG in 10 years, based 306 00:35:32,450 --> 00:35:37,750 on some cursory investigation that I did? First of all, the original publication 307 00:35:37,750 --> 00:35:46,390 mentioned Qualcomm basebands based on ARM with the REX operating system. All modern 308 00:35:46,390 --> 00:35:50,770 Qualcomm basebands are based on Hexagon as opposed to ARM. And the REX 309 00:35:50,770 --> 00:35:57,470 operating system was replaced with QuRT, which I think still has some bits of 310 00:35:57,470 --> 00:36:05,359 REX, but in general it's a different operating system. The majority of the super 311 00:36:05,359 --> 00:36:09,921 powerful commands of DIAG, such as downloader mode and memory read/write, were 312 00:36:09,921 --> 00:36:17,369 removed, at least on my device. And also it does not expose any immediately 313 00:36:17,369 --> 00:36:25,579 available interfaces such as a USB channel. I hear that it's possible to enable the 314 00:36:25,579 --> 00:36:37,040 USB DIAG channel by adding some special boot properties, but usually it 315 00:36:37,040 --> 00:36:42,650 wouldn't be available. It shouldn't be expected to be available on all devices. 316 00:36:42,650 --> 00:36:48,599 These observations are based on my test device, a Nexus 6P. And this should be 317 00:36:48,599 --> 00:36:57,150 around a medium level of hardening. More modern devices such as Google Pixels, the 318 00:36:57,150 --> 00:37:02,799 modern ones, should be expected to be even more hardened than that. Especially on the 319 00:37:02,799 --> 00:37:07,720 Google side, because they take hardening very seriously.
By contrast, on the 320 00:37:07,720 --> 00:37:14,631 other side of the spectrum, if you think about some no name modem sticks, these 321 00:37:14,631 --> 00:37:24,329 things can be more open and easier to investigate. The DIAG implementation 322 00:37:24,329 --> 00:37:29,119 architecture is relatively simple. This diagram is based roughly on the same 323 00:37:29,119 --> 00:37:34,319 diagram that I presented at the beginning of the talk. On the left side there is the 324 00:37:34,319 --> 00:37:42,099 Android kernel and on the right side there is the baseband operating system. The DIAG 325 00:37:42,099 --> 00:37:47,160 protocol actually works in both directions. It's not only commands that can be sent by 326 00:37:47,160 --> 00:37:51,000 the application processor to the baseband, but it's also the messages that can be 327 00:37:51,000 --> 00:37:55,730 sent by the baseband to the application processor. So DIAG commands are not really 328 00:37:55,730 --> 00:38:02,150 commands - they're more like tokens that also can be used to encode messages. The 329 00:38:02,150 --> 00:38:10,269 green arrows on this slide represent an example of call flow, of the data flow 330 00:38:10,269 --> 00:38:14,609 originating from the baseband and going to the application processor. So obviously, 331 00:38:14,609 --> 00:38:25,820 in case of commands there would be a reverse call flow or data flow. The main 332 00:38:25,820 --> 00:38:29,810 entity inside the baseband operating system responsible for 333 00:38:29,810 --> 00:38:37,230 DIAG is the DIAG task. It is a separate task which specifically handles various 334 00:38:37,230 --> 00:38:47,210 operations related to the DIAG protocol. The exchange of data between the DIAG task 335 00:38:47,210 --> 00:38:55,390 and other tasks is done through a ring buffer.
So, for example, if some task 336 00:38:55,390 --> 00:39:05,730 needs to log something through DIAG, it will use specialized logging APIs that 337 00:39:05,730 --> 00:39:10,930 will in turn put logging data into the ring buffer. The ring buffer will be 338 00:39:10,930 --> 00:39:20,330 drained either on a timer or on a software based interrupt from the caller. And at 339 00:39:20,330 --> 00:39:28,480 this point the data will be wrapped into the DIAG protocol and from there it will go to 340 00:39:28,480 --> 00:39:37,119 the SIO task, the Serial I/O task, which is responsible for sending the output to a 341 00:39:37,119 --> 00:39:49,529 specific interface. This is based on the modem, on the baseband configuration. The 342 00:39:49,529 --> 00:39:56,549 main interface that I was dealing with is the shared memory, which ends up in the 343 00:39:56,549 --> 00:40:06,130 diagchar driver inside the Android kernel. So in case of sending commands 344 00:40:06,130 --> 00:40:11,809 from the Android kernel to the baseband, it will be the reverse flow. First, you 345 00:40:11,809 --> 00:40:17,420 will need to craft the DIAG protocol data and send it through the 346 00:40:17,420 --> 00:40:21,920 diagchar driver, which will write to the shared memory interface. From there, it 347 00:40:21,920 --> 00:40:28,109 will go to the specialized task in the baseband and eventually end up in the DIAG 348 00:40:28,109 --> 00:40:42,400 task and potentially other responsible tasks. On the Android side, DIAG is 349 00:40:42,400 --> 00:40:47,970 represented with the /dev/diag device, which is implemented with the diagchar 350 00:40:47,970 --> 00:40:54,980 and diagfwd kernel drivers in the MSM kernel. The purpose of the diagchar 351 00:40:54,980 --> 00:41:02,910 driver is to support the DIAG interface. It is quite complex in code, but 352 00:41:02,910 --> 00:41:09,569 functionally it's quite simple.
It contains a basic minimum of DIAG 353 00:41:09,569 --> 00:41:15,310 commands that enable configuration of the interface on the baseband side. And then 354 00:41:15,310 --> 00:41:20,609 it is able to multiplex the DIAG channel to either USB or a memory device. 355 00:41:20,609 --> 00:41:29,680 It also contains some IOCTLs for configuration that can be accessed from 356 00:41:29,680 --> 00:41:36,029 the Android user land. And finally, the driver filters various DIAG commands that 357 00:41:36,029 --> 00:41:43,890 it considers unnecessary. This is a bit important because 358 00:41:43,890 --> 00:41:47,970 when you try to do some tests and send some arbitrary DIAG commands through the DIAG 359 00:41:47,970 --> 00:41:54,980 interface, you will be required to rebuild the actual driver to remove this 360 00:41:54,980 --> 00:42:03,249 masking, otherwise your commands will not make it to the baseband side. At the core, 361 00:42:03,249 --> 00:42:09,299 the diagchar driver is based on the SMD (shared memory device) interface, which is a 362 00:42:09,299 --> 00:42:21,470 core interface specific to Qualcomm modems. So this is where diagchar 363 00:42:21,470 --> 00:42:29,059 is on the diagram. The diagchar driver itself is located in the 364 00:42:29,059 --> 00:42:39,039 application OS's vendor specific drivers. And then there is some shared memory 365 00:42:39,039 --> 00:42:43,759 implementation in the baseband that handles this and the DIAG implementation 366 00:42:43,759 --> 00:42:56,589 itself. The diagchar driver is quite complex in code, but the functionality is quite 367 00:42:56,589 --> 00:43:06,869 simple. It implements a handful of IOCTLs that enable some configuration. I 368 00:43:06,869 --> 00:43:14,529 didn't check what exactly these IOCTLs are responsible for. It exposes the /dev/diag 369 00:43:14,529 --> 00:43:19,430 device, which is available for reading and writing.
However, by default, you are not 370 00:43:19,430 --> 00:43:25,380 able to access the DIAG channel through this device, because 371 00:43:25,380 --> 00:43:33,220 there is the diag_switch_logging function, which switches the channel that 372 00:43:33,220 --> 00:43:41,230 is used for DIAG communications. On the screen there are several modes listed, but 373 00:43:41,230 --> 00:43:45,009 in practice only two of them are supported: the USB mode and the memory 374 00:43:45,009 --> 00:43:53,000 device mode. USB mode is the default, which is why if you just open the 375 00:43:53,000 --> 00:43:58,269 /dev/diag device and try to read something from it, it won't work: 376 00:43:58,269 --> 00:44:07,559 it is tied to USB. And in order to reconfigure it to use the memory device, 377 00:44:07,559 --> 00:44:17,280 you need to send a special IOCTL code. Notice the procedure named 378 00:44:17,280 --> 00:44:24,950 mask_request_validate, which applies quite strict filtering to the DIAG commands 379 00:44:24,950 --> 00:44:31,619 that you try to send through this interface. It filters out basically 380 00:44:31,619 --> 00:44:40,072 everything with the exception of some basic configuration requests. At the core, 381 00:44:40,072 --> 00:44:46,990 the diagchar driver uses the shared memory device to communicate with the baseband. 382 00:44:46,990 --> 00:44:55,079 The SMD implementation is quite complex. It exposes the SMD read API, which is one of the APIs used by 383 00:44:55,079 --> 00:45:02,679 diagchar for reading data from the shared memory. Shared 384 00:45:02,679 --> 00:45:14,309 memory also operates on the abstraction of channels, which are accessed through the 385 00:45:14,309 --> 00:45:19,619 API named smd_named_open_on_edge. You can notice here that there are some DIAG 386 00:45:19,619 --> 00:45:25,120 specific channels that can be opened. Now, let's take a look at the SMD 387 00:45:25,120 --> 00:45:29,730 implementation.
This is a bit important because the shared memory device represents 388 00:45:29,730 --> 00:45:33,420 a part of the attack surface for escalation from the modem to the 389 00:45:33,420 --> 00:45:37,880 application processor. This is a very important attack surface because if you 390 00:45:37,880 --> 00:45:42,509 just achieve code execution on the baseband, it's mostly useless because it 391 00:45:42,509 --> 00:45:49,480 cannot access the main operating system. And in order to make it useful, you'll 392 00:45:49,480 --> 00:45:59,119 need to create an exploit chain and add one more exploit, based on a bug with 393 00:45:59,119 --> 00:46:04,210 privilege escalation from the modem to the application processor. So the shared memory 394 00:46:04,210 --> 00:46:10,559 device is one of the attack surfaces for this. The shared memory device is 395 00:46:10,559 --> 00:46:22,160 implemented as a memory region exposed by the Qualcomm peripheral. The 396 00:46:22,160 --> 00:46:28,619 specialized MSM driver will map it, and here its name is smem_ram_phys, the 397 00:46:28,619 --> 00:46:40,099 base of the shared memory region. The shared memory region operates on the 398 00:46:40,099 --> 00:46:50,519 concept of entries and channels, so it's partitioned into distinct parts that can be 399 00:46:50,519 --> 00:47:00,470 accessed through the procedure smem_get_entry, and one of these entries is 400 00:47:00,470 --> 00:47:08,070 SMEM_CHANNEL_ALLOC_TBL, which contains the list of available channels that can be 401 00:47:08,070 --> 00:47:13,740 opened. From there, we can actually open the channels and use the shared memory 402 00:47:13,740 --> 00:47:25,700 interface.
During this initial research project, it wasn't my goal to research the 403 00:47:25,700 --> 00:47:32,460 entire Qualcomm ecosystem, so while I was preparing for this talk, I noticed 404 00:47:32,460 --> 00:47:37,569 some more interesting things in the source code, such as, for example, the 405 00:47:37,569 --> 00:47:45,859 specialized driver that handles the JTAG memory region, which is presumably exposed 406 00:47:45,859 --> 00:47:53,140 by some Qualcomm systems-on-chip. In the drivers this is mostly used read only, and 407 00:47:53,140 --> 00:47:58,609 I suppose that it will not really work for writing, but it's probably worth checking. 408 00:47:58,609 --> 00:48:07,849 And now, finally, let's take a look at the DIAG protocol itself. One of the first 409 00:48:07,849 --> 00:48:13,119 things that I noticed when researching the DIAG protocol is that it's actually used 410 00:48:13,119 --> 00:48:21,460 in a few places, not only in libqcdm. A popular tool named SnoopSnitch can enable 411 00:48:21,460 --> 00:48:27,460 cellular protocol dumps on rooted devices. And in order to 412 00:48:27,460 --> 00:48:33,349 accomplish this, SnoopSnitch sends an opaque blob of commands to the mobile 413 00:48:33,349 --> 00:48:40,349 device through the DIAG interface. This blob is not documented. So it got me 414 00:48:40,349 --> 00:48:46,740 curious what exactly these commands are doing. But before we can look at the dump, 415 00:48:46,740 --> 00:48:53,780 let's understand the protocol. The DIAG protocol consists of around 200 commands 416 00:48:53,780 --> 00:49:02,365 or tokens. Some of them are documented in the open source, but not all of them. So 417 00:49:02,365 --> 00:49:07,630 you can notice on the screenshots that some of the commands are missing. And one of the 418 00:49:07,630 --> 00:49:21,680 missing commands is actually the token 0x92 hexadecimal, which represents an encoded hash log 419 00:49:21,680 --> 00:49:34,069 message.
The command format is quite simple. The basic primitive here is the 420 00:49:34,069 --> 00:49:42,819 DIAG token number 0x7E; it's not really a delimiter, it's a separate DIAG command, 421 00:49:42,819 --> 00:49:49,519 126. It's missing in the open source, as you can see here. So the DIAG command is 422 00:49:49,519 --> 00:49:57,870 nested. The outer layer consists of this wrapper of 0x7E hexadecimal bytes. Then 423 00:49:57,870 --> 00:50:02,329 there is the main command and then there is some variable length data that can 424 00:50:02,329 --> 00:50:10,839 contain even more subcommands. This entire thing is verified using the CRC and some 425 00:50:10,839 --> 00:50:16,860 bytes are escaped, specifically as you can see on the snippet. One interesting 426 00:50:16,860 --> 00:50:24,539 thing about the DIAG protocol is that it supports subsystem extensions. Basically, 427 00:50:24,539 --> 00:50:29,820 different subsystems in the baseband can register their own DIAG system handlers, 428 00:50:29,820 --> 00:50:38,119 arbitrary ones. And there is a special DIAG command number 75, which simply forwards.. 429 00:50:38,119 --> 00:50:43,419 instructs the DIAG system to forward this command to the respective subsystem. And 430 00:50:43,419 --> 00:50:56,849 then it will be parsed there. There exists quite a large number of subsystems. Not 431 00:50:56,849 --> 00:51:01,480 all of them are documented, and when I started investigating this, I noticed that 432 00:51:01,480 --> 00:51:08,360 there actually exists a DIAG subsystem and a debugging subsystem. The 433 00:51:08,360 --> 00:51:15,089 latter one immediately interested me because I was hoping that it would enable 434 00:51:15,089 --> 00:51:19,700 some more advanced introspection through this debugging subsystem. But it turned 435 00:51:19,700 --> 00:51:25,910 out that the debugging subsystem is quite simple. It only supported one command: 436 00:51:25,910 --> 00:51:35,470 inject crash.
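The framing described above (a 0x7E wrapper byte, a CRC over the payload, and escaping of special bytes) can be sketched in Python as follows. This is a minimal sketch, not the speaker's tool. It assumes the HDLC-style conventions used by the open source libqcdm: a CRC-16 with the reflected polynomial 0x8408 appended in little-endian order, 0x7D as the escape byte, and XOR 0x20 applied to escaped values. The function names are hypothetical.

```python
import struct

DIAG_TERMINATOR = 0x7E   # wrapper byte mentioned in the talk
DIAG_ESCAPE = 0x7D       # assumption: escape byte as used by libqcdm
DIAG_ESCAPE_XOR = 0x20   # assumption: escaped bytes are XORed with 0x20


def crc16_x25(data: bytes) -> int:
    """CRC-16/X-25: reflected poly 0x8408, init 0xFFFF, final XOR 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF


def diag_frame(payload: bytes) -> bytes:
    """Append the CRC, escape special bytes, and add the trailing 0x7E."""
    body = payload + struct.pack("<H", crc16_x25(payload))
    out = bytearray()
    for byte in body:
        if byte in (DIAG_TERMINATOR, DIAG_ESCAPE):
            out += bytes([DIAG_ESCAPE, byte ^ DIAG_ESCAPE_XOR])
        else:
            out.append(byte)
    out.append(DIAG_TERMINATOR)
    return bytes(out)


def diag_unframe(frame: bytes) -> bytes:
    """Reverse diag_frame and verify the CRC; raise ValueError on bad input."""
    if not frame or frame[-1] != DIAG_TERMINATOR:
        raise ValueError("frame does not end with 0x7E")
    body = bytearray()
    it = iter(frame[:-1])
    for byte in it:
        if byte == DIAG_ESCAPE:
            nxt = next(it, None)
            if nxt is None:
                raise ValueError("dangling escape byte")
            byte = nxt ^ DIAG_ESCAPE_XOR
        body.append(byte)
    if len(body) < 3:
        raise ValueError("frame too short")
    payload = bytes(body[:-2])
    (crc,) = struct.unpack("<H", bytes(body[-2:]))
    if crc != crc16_x25(payload):
        raise ValueError("CRC mismatch")
    return payload
```

A payload here is the main command byte followed by its variable-length data; some senders also emit a leading 0x7E before the frame, which this sketch does not handle.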
So you can send a special DIAG command that will inject the crash 437 00:51:35,470 --> 00:51:43,970 into the baseband. I will talk later about this. Now, let's take a look at specific 438 00:51:43,970 --> 00:51:52,410 examples of the DIAG protocol. This is the annotated snippet of the blob of commands 439 00:51:52,410 --> 00:52:00,720 from SnoopSnitch. This blob actually consists of three large logical parts. The 440 00:52:00,720 --> 00:52:04,470 first part is largely irrelevant. It's a bunch of commands that request various 441 00:52:04,470 --> 00:52:10,249 information from the baseband, such as timestamp, version info, build id and so 442 00:52:10,249 --> 00:52:16,839 on. The second batch of commands starts with command number 0x73 hexadecimal. 443 00:52:16,839 --> 00:52:26,529 This is DIAG_CMD_LOG_CONFIG. This is the command which enables protocol dumps and 444 00:52:26,529 --> 00:52:34,390 configures them. And the third part of this blob starts with command number 0x7D 445 00:52:34,390 --> 00:52:38,459 hexadecimal. This is the CMD_EXT_MESSAGE_CONFIG. This is actually 446 00:52:38,459 --> 00:52:43,410 the command that is supposed to enable textual message logging, except that in 447 00:52:43,410 --> 00:52:51,680 the case of SnoopSnitch it disables all of the logging altogether. So how do 448 00:52:51,680 --> 00:52:57,390 cellular protocol dumps actually work? In order to enable the cellular protocol dumps, we need 449 00:52:57,390 --> 00:53:04,210 DIAG_CMD_LOG_CONFIG, number 0x73 hexadecimal. It is partially documented in 450 00:53:04,210 --> 00:53:12,640 libqcdm. The structure of the packet contains the code and the subcommand, 451 00:53:12,640 --> 00:53:18,079 which would be set mask in this case. It also needs an equipment ID, which 452 00:53:18,079 --> 00:53:25,230 corresponds to the specific protocol that we want to dump. And finally, the masks 453 00:53:25,230 --> 00:53:33,369 that are applied to filter some parts of the dump.
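As a rough illustration of the DIAG_CMD_LOG_CONFIG packet just described, the sketch below packs the command code, the set mask subcommand, an equipment ID and a bit mask. The exact field layout is an assumption modelled on the partial definition in libqcdm (one code byte, three padding bytes, then little-endian 32-bit operation, equipment ID and item count, followed by the mask bytes); the value 3 for the set mask operation and the helper name are likewise assumptions and should be checked against a real device.

```python
import struct

DIAG_CMD_LOG_CONFIG = 0x73   # command number given in the talk
LOG_CONFIG_OP_SET_MASK = 3   # assumption: "set mask" operation code


def build_log_config_set_mask(equip_id: int, num_items: int, enabled_items) -> bytes:
    """Build an unframed LOG_CONFIG set-mask request (hypothetical helper).

    equip_id selects the protocol family to dump; enabled_items lists the
    log item indexes (0 <= index < num_items) whose mask bit should be set.
    """
    mask = bytearray((num_items + 7) // 8)
    for item in enabled_items:
        if not 0 <= item < num_items:
            raise ValueError("log item out of range: %d" % item)
        mask[item // 8] |= 1 << (item % 8)
    # '<B3xIII': code byte, 3 pad bytes, then op, equipment id, item count.
    header = struct.pack("<B3xIII", DIAG_CMD_LOG_CONFIG,
                         LOG_CONFIG_OP_SET_MASK, equip_id, num_items)
    return header + bytes(mask)
```

The result still has to be wrapped in the DIAG framing (CRC, escaping, 0x7E) before it is written to the DIAG interface.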
This is relatively 454 00:53:33,369 --> 00:53:41,020 straightforward. And now the second command, DIAG_CMD_EXT_MESSAGE_CONFIG. This 455 00:53:41,020 --> 00:53:48,359 is the one which is supposed to enable textual message logs. The command format 456 00:53:48,359 --> 00:54:00,130 is undocumented. So let's take a closer look at it. The command consists of a 457 00:54:00,130 --> 00:54:06,720 subcommand. In this case, it's subcommand number 4, the set mask. And then there are 458 00:54:06,720 --> 00:54:15,819 two 16 bit integers, SSID start and end. SSID is subsystem ID, which is not the 459 00:54:15,819 --> 00:54:26,099 same as DIAG subsystems. And the last one is the mask. Subsystem IDs are used to 460 00:54:26,099 --> 00:54:31,859 filter the messages based on a specific subsystem, because there is a huge number 461 00:54:31,859 --> 00:54:35,970 of subsystems in the baseband. And if all of them start logging, this is a huge 462 00:54:35,970 --> 00:54:41,720 amount of data. So DIAG provides this capability to filter a little bit, down to a 463 00:54:41,720 --> 00:54:49,569 specific subsystem that you're interested in. The snippet of Python code here is an 464 00:54:49,569 --> 00:54:58,440 example of how to enable textual message logging for all subsystems. You need to set the 465 00:54:58,440 --> 00:55:12,680 mask to all 1s. And this is quite a lot of logging in my experience. Now for parsing 466 00:55:12,680 --> 00:55:18,039 the incoming log messages, there are two types of DIAG tokens, both of them 467 00:55:18,039 --> 00:55:26,399 undocumented. The first one is a legacy message, number 0x79 hexadecimal. This is a 468 00:55:26,399 --> 00:55:32,420 simple ASCII based message that arrives through the DIAG interface, so you can 469 00:55:32,420 --> 00:55:38,509 parse it quite straightforwardly. The second one I called 470 00:55:38,509 --> 00:55:43,640 DIAG_CMD_LOG_HASH, it's number 0x92 hexadecimal.
This is the token which 471 00:55:43,640 --> 00:55:50,650 encodes the log messages that contain only the hashes. With this one, if you 472 00:55:50,650 --> 00:55:57,579 have the msg_hash.txt file, you can match the hash that arrived in 473 00:55:57,579 --> 00:56:02,170 this command against the messages provided in the text file. And you can get the textual 474 00:56:02,170 --> 00:56:08,900 logs. On the lower part of the slide there are two examples of hexdumps from both 475 00:56:08,900 --> 00:56:16,019 commands. Both of them have a similar structure. First, there are 4 bytes 476 00:56:16,019 --> 00:56:23,569 that are essential. The first one is the command itself. And the third byte, which is 477 00:56:23,569 --> 00:56:30,950 quite interesting, is the number of arguments included. Next there is a 64 bit 478 00:56:30,950 --> 00:56:40,470 timestamp value. Next there is the SSID value, 16 bit. Some line number, and I'm 479 00:56:40,470 --> 00:56:48,509 not sure what the next argument is. And finally, after that, there is either an ASCII 480 00:56:48,509 --> 00:56:59,380 encoded log string in plain text or a hash of the log string. And optionally there 481 00:56:59,380 --> 00:57:06,060 may be some arguments included, though in the case of the first, legacy command the 482 00:57:06,060 --> 00:57:10,400 arguments are included before the log message and in the case of the second command 483 00:57:10,400 --> 00:57:16,670 they are included after the MD5 hash of the log message, at least in my version of 484 00:57:16,670 --> 00:57:29,109 this implementation. And this is the DIAG packet that enables you to inject a crash 485 00:57:29,109 --> 00:57:36,970 into the baseband, at least in theory. Because in my case it did not work. And by 486 00:57:36,970 --> 00:57:41,410 not working, I mean that it simply did not enter the baseband. Normally, I would 487 00:57:41,410 --> 00:57:46,470 expect that on a production device it should just reset the baseband.
You will not get 488 00:57:46,470 --> 00:57:53,029 a crash dump or anything like that, just a reset. So I suppose that it should still 489 00:57:53,029 --> 00:57:58,150 be working on some other devices. So it's worth checking. There are a few types of 490 00:57:58,150 --> 00:58:09,789 crashes that you can request in this way. In order to accomplish this, I needed a 491 00:58:09,789 --> 00:58:17,119 very simple tool with basically two functions. First, direct easy access to 492 00:58:17,119 --> 00:58:22,839 the DIAG interface, ideally through some sort of Python shell. And second, the 493 00:58:22,839 --> 00:58:29,779 ability to read and parse data with advanced log strings. For that purpose, I 494 00:58:29,779 --> 00:58:37,999 wrote a simple framework that I named diagtalk, which is based directly on the 495 00:58:37,999 --> 00:58:49,349 DIAG interface in the Android kernel, with a Python harness. So on the left 496 00:58:49,349 --> 00:58:56,970 side, here is the example of some advanced parsing with some leaked values. And on 497 00:58:56,970 --> 00:59:02,014 the right side, here is the example of the advanced message log, which includes the 498 00:59:02,014 --> 00:59:10,589 log strings that were extracted.. that were stripped out from the firmware. The log is 499 00:59:10,589 --> 00:59:16,791 quite fun, as I expected it to be. It has a lot of detailed data, such as, for 500 00:59:16,791 --> 00:59:22,800 example, GPS coordinates and various attempts of the baseband to connect to 501 00:59:22,800 --> 00:59:34,539 different channels. And I think it's quite useful for offensive research purposes; 502 00:59:34,539 --> 00:59:42,960 it even sometimes contains raw pointers, as you can notice on the screenshot.
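For reference, the message logging request discussed earlier (command 0x7D, subcommand 4, an SSID range and a mask set to all 1s) could be built roughly like this. It is only a sketch under stated assumptions: the talk gives the command number, the subcommand and the two 16-bit SSID fields, while the two padding bytes after the range and the use of one 32-bit mask per SSID are assumptions borrowed from how the MSM kernel driver handles message masks. The helper name is hypothetical.

```python
import struct

DIAG_CMD_EXT_MESSAGE_CONFIG = 0x7D  # command number given in the talk
EXT_MSG_SUBCMD_SET_MASK = 4         # subcommand given in the talk


def build_ext_msg_set_mask(ssid_start: int, ssid_end: int,
                           mask: int = 0xFFFFFFFF) -> bytes:
    """Build an unframed request enabling message logging for an SSID range.

    Assumed layout: command, subcommand, SSID start, SSID end, 2 pad bytes,
    then one little-endian 32-bit mask for every SSID in the range.
    """
    if ssid_end < ssid_start:
        raise ValueError("ssid_end must not be smaller than ssid_start")
    count = ssid_end - ssid_start + 1
    header = struct.pack("<BBHHH", DIAG_CMD_EXT_MESSAGE_CONFIG,
                         EXT_MSG_SUBCMD_SET_MASK, ssid_start, ssid_end, 0)
    return header + struct.pack("<%dI" % count, *([mask] * count))
```

With the default mask every level bit is set for each subsystem in the range, which matches the remark that this produces a lot of log output.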
So in 503 00:59:42,960 --> 00:59:50,069 this project, my conclusion was that indeed I was reassured that it was the 504 00:59:50,069 --> 00:59:56,660 right choice and Hexagon seems to be a quite a challenging target, and it would 505 00:59:56,660 --> 01:00:00,940 probably need several more months of work to even begin to do some serious offensive 506 01:00:00,940 --> 01:00:08,500 work. I also started to think about writing a software debugger because it 507 01:00:08,500 --> 01:00:15,640 seems to be the most.. probably the most reliable way to achieve debugging 508 01:00:15,640 --> 01:00:22,140 introspection. And also, I noticed some blank spaces in the field that may require 509 01:00:22,140 --> 01:00:27,839 future work. For Qualcomm Hexagon specifically, there is a lot of things 510 01:00:27,839 --> 01:00:35,539 that can be done. For example, you can take a look at other Qualcomm proprietary 511 01:00:35,539 --> 01:00:40,609 diagnostic protocols of which there are a few, such as QMI for example, I think they 512 01:00:40,609 --> 01:00:49,400 are lesser known than DIAG protocol. And then there is a requirement to create a 513 01:00:49,400 --> 01:00:58,569 full system emulation based on QEMU at least for some chips. And a big problem 514 01:00:58,569 --> 01:01:04,140 about the decompiler, which is a major obstacle to any serious static analysis in 515 01:01:04,140 --> 01:01:14,979 the code and for the offensive research, there are 3 large directions. First one is 516 01:01:14,979 --> 01:01:18,920 enabling debugging. There are different ways for that. For example, software based 517 01:01:18,920 --> 01:01:25,940 debugging or bypassing JTAG fusing, on the other hand. Next, there are explorations 518 01:01:25,940 --> 01:01:33,000 of the over the air attack vectors. And the 3rd one is escalation from the baseband 519 01:01:33,000 --> 01:01:39,369 to the application processor. These are the 3 large offensive research vectors. 
520 01:01:39,369 --> 01:01:44,670 And for basebands in general, there also exist some interesting directions of 521 01:01:44,670 --> 01:01:54,140 future work. First of all, OsmocomBB. It definitely deserves a bit of an 522 01:01:54,140 --> 01:01:59,989 update. It is the only open source implementation of a baseband. And it is so 523 01:01:59,989 --> 01:02:09,040 outdated, and it is based on some really obscure hardware. Another 524 01:02:09,040 --> 01:02:17,677 problem here is that there doesn't exist any software based CDMA implementation. 525 01:02:17,677 --> 01:02:28,660 *No sound* 526 01:02:28,660 --> 01:02:34,067 Herald: Alisa, thank you very much for this nice talk. Um, there are some 527 01:02:34,067 --> 01:02:39,030 questions from the audience. So basically the first one is a little bit of an 528 01:02:39,030 --> 01:02:46,358 icebreaker: Do you use a mobile phone? And do you trust it? 529 01:02:46,358 --> 01:02:51,769 Alisa: No, I don't. I try to use a mobile phone only for Twitter. Does anyone still 530 01:02:51,769 --> 01:03:00,065 use mobile phones nowadays? H: *laughs* Well, no idea. Another 531 01:03:00,065 --> 01:03:07,979 question concerns the other Qualcomm chips. Did you have a look at the Qualcomm 532 01:03:07,979 --> 01:03:15,960 Wi-Fi chipsets? A: As I mentioned during the talk, I had 533 01:03:15,960 --> 01:03:20,509 only one month. It was like a short reconnaissance project, so I didn't really 534 01:03:20,509 --> 01:03:27,020 have time to investigate everything. I did notice that Qualcomm SoCs have a Wi-Fi 535 01:03:27,020 --> 01:03:32,369 chip, which is also based on Hexagon. And more than that, it also shares some of the 536 01:03:32,369 --> 01:03:38,540 same low level technical primitives. So it's definitely worth looking at, but I didn't 537 01:03:38,540 --> 01:03:45,019 investigate it in detail. H: OK, OK, thanks.
There is also a pretty 538 01:03:45,019 --> 01:03:50,820 technical question here: instead of having to go through the rigorous command 539 01:03:50,820 --> 01:03:57,600 checking of the diagchar driver, wouldn't it be possible to mmap /dev/mem 540 01:03:57,600 --> 01:04:04,604 into a userspace process and send over commands directly? It depends a little bit 541 01:04:04,604 --> 01:04:11,799 on what the goal is. A: OK, so it really depends on your 542 01:04:11,799 --> 01:04:16,869 previous background and your goals. The point here is that by default, the diagchar 543 01:04:16,869 --> 01:04:23,420 ecosystem does not allow you to send arbitrary DIAG commands. So either way, 544 01:04:23,420 --> 01:04:28,749 you will have to hack something. One way to hack this is to rebuild the actual 545 01:04:28,749 --> 01:04:33,529 driver, so you would be able to send the commands directly through that DIAG 546 01:04:33,529 --> 01:04:37,859 interface. Another way would be to access the shared memory directly, for example. 547 01:04:37,859 --> 01:04:42,079 But I think it would be more complex because the Qualcomm shared memory 548 01:04:42,079 --> 01:04:47,440 implementation is quite complex. So I think that the easiest way would 549 01:04:47,440 --> 01:04:52,789 actually be to hack the diagchar driver and use the /dev/diag interface for this. 550 01:04:52,789 --> 01:05:00,270 H: OK, thanks. Thanks. There is one question which I'm going to read out, 551 01:05:00,270 --> 01:05:14,870 maybe you can make sense of it: is this typical *[unclear]* security for mobile phones? 552 01:05:14,870 --> 01:05:19,289 A: This level of hardening that I presented, I think, is around medium level. 553 01:05:19,289 --> 01:05:24,270 So usually production phones are even more hardened. If you take a look at things 554 01:05:24,270 --> 01:05:31,249 like the Google Pixel 5 or the latest iPhones, they will be even better hardened than 555 01:05:31,249 --> 01:05:38,640 the one that I discussed. H: Oh, OK.
Yeah, thanks. Thanks then. So it 556 01:05:38,640 --> 01:05:42,900 doesn't look like we have any more questions left. Anyway, so if you want to 557 01:05:42,900 --> 01:05:49,122 get in contact with Alisa, no problem. There is the feedback tab below your 558 01:05:49,122 --> 01:05:56,888 video now at the moment, just drop your questions over there. And that's a way to 559 01:05:56,888 --> 01:06:02,736 get in touch with Alisa. Other than that I would say we're done for today for this 560 01:06:02,736 --> 01:06:07,410 session. Thank you very, very much Alisa for this really nice presentation once 561 01:06:07,410 --> 01:06:14,160 again. *Applause* And I'll transfer now over to the Herald News Show. 562 01:06:14,160 --> 01:06:33,639 *postroll music* 563 01:06:33,639 --> 01:06:54,000 Subtitles created by c3subtitles.de in the year 2021. Join, and help us!