Martina Lindorfer
Associate Prof. Dipl.-Ing.in Dr.in techn. / BSc
I joined the Security and Privacy Research group in 2018 as an Assistant Professor. My research interests lie in applied security, with a special focus on mobile security and privacy.
While this page is a work in progress, please refer to my personal website for up-to-date news, publications, and research artifacts: https://martina.lindorfer.in/
I am always looking for motivated students. If you are interested in working with me, have a look at our open positions and thesis opportunities!
Roles
- Associate Professor
Courses
- Project in Computer Science 1 / PR / 192.021
- Project in Computer Science 2 / PR / 192.022
2025S
- Project in Computer Science 1 / PR / 192.021
- Project in Computer Science 2 / PR / 192.022
- Foundations of System and Application Security / VU / 192.044
- Seminar for PhD Students / SE / 192.060
- Bachelor Thesis / PR / 192.061
2024W
Projects (at TU Wien)
- W4MP: Fixing the Broken Bridge Between Mobile Apps and the Web
  2023 - 2027 / Vienna Science and Technology Fund (WWTF)
- IoTIO: Analyzing and Understanding the Internet of Insecure Things
  2020 - 2025 / Vienna Science and Technology Fund (WWTF)
- SysSec: A European Network of Excellence in Managing Threats and Vulnerabilities in the Future Internet: Europe for the World
  2010 - 2014 / European Commission
- TRUDIE: Trust Relationships in Underground IT Economies
  2009 - 2012 / Austrian Research Promotion Agency (FFG)
- i-Code: Real-time Malicious Code Identification
  2010 - 2012 / European Commission
Publications (created while at TU Wien)
2024
Are You Sure You Want To Do Coordinated Vulnerability Disclosure?
Chen, T.-H., Tagliaro, C., Lindorfer, M., Borgolte, K., & van der Ham-de Vos, J. (2024). Are You Sure You Want To Do Coordinated Vulnerability Disclosure? In 2024 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW) (pp. 307–314).
DOI: 10.1109/EuroSPW61312.2024.00039
The rising numbers of vulnerabilities and security issues stemming from the rapid iteration and development of the Internet of Things (IoT) have introduced new challenges for the involved stakeholders to mitigate them in time. To effectively bring researchers, vendors, and end-users together to address such problems, Coordinated Vulnerability Disclosure (CVD) has become standard practice. Although general CVD procedures for practitioners to follow exist, adapting them to the specific circumstances has proven to be complicated in practice. In this paper, we document our experience of reporting various security vulnerabilities for 15,820 IoT backends. The discovery and scanning have been part of a separate research project, in this contribution we focus on the disclosure to the backends' operators in a large-scale coordinated vulnerability disclosure effort, following the latest disclosure guidelines. We discuss what we have learned to inform others who want to engage in large-scale CVD, we compare the steps and tradeoffs of our effort with current CVD suggestions, based on our measurement before and after the disclosure, and we describe how adapting our approach can improve CVD best practices.
Exploring the Malicious Document Threat Landscape: Towards a Systematic Approach to Detection and Analysis
Saha, A., Blasco Alís, J., & Lindorfer, M. (2024). Exploring the Malicious Document Threat Landscape: Towards a Systematic Approach to Detection and Analysis. In 2024 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW) (pp. 533–544).
DOI: 10.1109/EuroSPW61312.2024.00065
Despite being the most common initial attack vector, document-based malware delivery remains understudied compared to research on malicious executables. This limits our understanding of how attackers leverage document file formats and exploit their functionalities for malicious purposes. In this paper, we perform a measurement study that leverages existing tools and techniques to detect, extract, and analyze malicious Office documents. We collect a substantial dataset of 9,086 malicious samples and reveal a critical gap in the understanding of how attackers utilize these documents. Our in-depth analysis highlights emerging tactics used in both targeted and large-scale cyberattacks while identifying weaknesses in common document analysis methods. Through a combination of analysis techniques, we gain crucial insights valuable for forensic analysts to assess suspicious files, pinpoint infection origins, and ultimately contribute to the development of more robust detection models. We make our dataset and source code available to the academic community to foster further research in this area.
Tabbed Out: Subverting the Android Custom Tab Security Model
Beer, P., Squarcina, M., Veronese, L., & Lindorfer, M. (2024). Tabbed Out: Subverting the Android Custom Tab Security Model. In 2024 IEEE Symposium on Security and Privacy (SP) (pp. 4591–4609).
DOI: 10.1109/SP54263.2024.00105
Mobile operating systems provide developers with various mobile-to-Web bridges to display Web pages inside native applications. A recently introduced component called Custom Tab (CT) provides an outstanding feature to overcome the usability limitations of traditional WebViews: it shares the state with the underlying browser. Similar to traditional WebViews, it can also keep the host application informed about ongoing Web navigations. In this paper, we perform the first systematic security evaluation of the CT component and show how the design of its security model did not consider cross-context state inference attacks when the feature was introduced. Additionally, we show how CTs can be exploited for fine-grained exfiltration of sensitive user browsing data, violation of Web session integrity by circumventing SameSite cookies, and how UI customization of the CT component can lead to phishing and information leakage. To assess the prevalence of CTs in the wild and the practicality of the mitigation strategies we propose, we carry out the first large-scale analysis of CT usage on over 50K Android applications. Our analysis reveals that their usage is widespread, with 83% of applications embedding CTs either directly or as part of a library. We have responsibly disclosed all our findings to Google, which has already taken steps to apply targeted mitigations, assigned three CVEs for the discovered vulnerabilities, and awarded us $10,000 in bounties. Our interaction with Google led to clarifications of the CT security model in the new Chrome Custom Tabs Security FAQ document.
C2Miner: Tricking IoT Malware into Revealing Live Command & Control Servers
Davanian, A., Faloutsos, M., & Lindorfer, M. (2024). C2Miner: Tricking IoT Malware into Revealing Live Command & Control Servers. In ASIA CCS ’24: Proceedings of the 19th ACM Asia Conference on Computer and Communications Security (pp. 112–127).
DOI: 10.1145/3634737.3644992
How can we identify live Command & Control (C2) servers for a given IoT malware binary? An effective solution to this problem constitutes a significant capability towards detecting and containing botnets. This task is not trivial because C2 servers are short-lived, and they use sophisticated and proprietary communication protocols. We propose C2Miner, a novel approach to trick IoT malware binaries into revealing their currently live C2 servers. Our approach weaponizes old disposable IoT malware binaries and uses them to probe active servers. We provide novel solutions to overcome the following challenges: (a) disambiguating the C2-bound traffic generated by the malware and (b) determining if a target IP:port is indeed a C2 server as opposed to a benign server. In our evaluation, based on 3M distinct exploration attempts over 150K distinct IP addresses, we show that we can identify C2 servers within a given IP:port space with an F1 score of 86%. In addition, we show how our approach can be used in practice and at scale. Conducting a large-scale probing campaign has scalability issues given that the number of probes is proportional to the IP addresses, the number of ports, and the number of binaries from distinct families which we want to explore. To address this challenge, we propose a grammar-based method to fingerprint and cluster C2 communications which, among other applications, allows us to select malware binaries for weaponization efficiently. Additionally, we use spatio-temporal features of C2 servers to narrow down our search in the entire IP space. An optimistic observation from our study is that using only 2 (more than 6 months) old IoT malware binaries, we scan 18K IP:port pairs daily for 6 days and find 6 new live C2 servers.
ADAPT it! Automating APT Campaign and Group Attribution by Leveraging and Linking Heterogeneous Files
Saha, A., Blasco, J., Cavallaro, L., & Lindorfer, M. (2024). ADAPT it! Automating APT Campaign and Group Attribution by Leveraging and Linking Heterogeneous Files. In RAID ’24: Proceedings of the 27th International Symposium on Research in Attacks, Intrusions and Defenses (pp. 114–129). Association for Computing Machinery.
DOI: 10.1145/3678890.3678909
Recent years have witnessed a surge in the growth of Advanced Persistent Threats (APTs), with significant challenges to the security landscape, affecting industry, governance, and democracy. The ever-growing number of actors and the complexity of their campaigns have made it difficult for defenders to track and attribute these malicious activities effectively. Traditionally, researchers relied on threat intelligence to track APTs. However, this often led to fragmented information, delays in connecting campaigns with specific threat groups, and misattribution. In response to these challenges, we introduce ADAPT, a machine learning-based approach for automatically attributing APTs at two levels: (1) the threat campaign level, to identify samples with similar objectives and (2) the threat group level, to identify samples operated by the same entity. ADAPT supports a variety of heterogeneous file types targeting different platforms, including executables and documents, and uses linking features to find connections between them. We evaluate ADAPT on a reference dataset from MITRE as well as a comprehensive, label-standardized dataset of 6,134 APT samples belonging to 92 threat groups. Using real-world case studies, we demonstrate that ADAPT effectively identifies clusters representing threat campaigns and associates them with their respective groups.
Comparing Apples to Androids: Discovery, Retrieval, and Matching of iOS and Android Apps for Cross-Platform Analyses
Steinböck, M., Bleier, J., Rainer, M., Urban, T., Utz, C., & Lindorfer, M. (2024). Comparing Apples to Androids: Discovery, Retrieval, and Matching of iOS and Android Apps for Cross-Platform Analyses. In MSR ’24: Proceedings of the 21st International Conference on Mining Software Repositories (pp. 348–360).
DOI: 10.1145/3643991.3644896
For years, researchers have been analyzing mobile Android apps to investigate diverse properties such as software engineering practices, business models, security, privacy, or usability, as well as differences between marketplaces. While similar studies on iOS have been limited, recent work has started to analyze and compare Android apps with those for iOS. To obtain the most representative analysis results across platforms, the ideal approach is to compare their characteristics and behavior for the same set of apps, e.g., to study a set of apps for iOS and their respective counterparts for Android. Previous work has only attempted to identify and evaluate such cross-platform apps to a limited degree, mostly comparing sets of apps independently drawn from app stores, manually matching small sets of apps, or relying on brittle matches based on app and developer names. This results in (1) comparing apps whose behavior and properties significantly differ, (2) limited scalability, and (3) the risk of matching only a small fraction of apps. In this work, we propose a novel approach to create an extensive dataset of cross-platform apps for the iOS and Android ecosystems. We describe an analysis pipeline for discovering, retrieving, and matching apps from the Apple App Store and Google Play Store that we used to create a set of 3,322 cross-platform apps out of 10,000 popular apps for iOS and Android, respectively. We evaluate existing and new approaches for cross-platform app matching against a set of reference pairs that we obtained from Google's data migration service. We identify a combination of seven features from app store metadata and the apps themselves to match iOS and Android apps with high confidence (95.82 %). Compared to previous attempts that identified 14 % of apps as cross-platform, we are able to match 34 % of apps in our dataset. To foster future research in the cross-platform analysis of mobile apps, we make our pipeline available to the community.
Large-Scale Security Analysis of Real-World Backend Deployments Speaking IoT-Focused Protocols
Tagliaro, C., Komsic, M., Continella, A., Borgolte, K., & Lindorfer, M. (2024). Large-Scale Security Analysis of Real-World Backend Deployments Speaking IoT-Focused Protocols. In RAID ’24: Proceedings of the 27th International Symposium on Research in Attacks, Intrusions and Defenses (pp. 561–578).
DOI: 10.1145/3678890.3678899
Internet-of-Things (IoT) devices, ranging from smart home assistants to health devices, are pervasive: Forecasts estimate their number to reach 29 billion by 2030. Understanding the security of their machine-to-machine communication is crucial. Prior work focused on identifying devices’ vulnerabilities or proposed protocol-specific solutions. Instead, we investigate the security of backends speaking IoT protocols, that is, the backbone of the IoT ecosystem. We focus on three real-world protocols for our large-scale analysis: MQTT, CoAP, and XMPP. We gather a dataset of over 337,000 backends, augment it with geographical and provider data, and perform non-invasive active measurements to investigate three major security threats: information leakage, weak authentication, and denial of service. Our results provide quantitative evidence of a problematic immaturity in the IoT ecosystem. Among other issues, we find that 9.44% backends expose information, 30.38% CoAP-speaking backends are vulnerable to denial of service attacks, and 99.84% of MQTT- and XMPP-speaking backends use insecure transport protocols (only 0.16% adopt TLS, of which 70.93% adopt a vulnerable version).
2023
Back-to-the-Future Whois: An IP Address Attribution Service for Working with Historic Datasets
Streibelt, F., Lindorfer, M., Gürses, S., Hernández Gañán, C., & Fiebig, T. (2023). Back-to-the-Future Whois: An IP Address Attribution Service for Working with Historic Datasets. In Passive and Active Measurement: 24th International Conference, PAM 2023, Virtual Event, March 21–23, 2023, Proceedings (pp. 209–226). Springer.
DOI: 10.1007/978-3-031-28486-1_10
Researchers and practitioners often face the issue of having to attribute an IP address to an organization. For current data this is comparably easy, using services like whois or other databases. Similarly, for historic data, several entities like the RIPE NCC provide websites that provide access to historic records. For large-scale network measurement work, though, researchers often have to attribute millions of addresses. For current data, Team Cymru provides a bulk whois service which allows bulk address attribution. However, at the time of writing, there is no service available that allows historic bulk attribution of IP addresses. Hence, in this paper, we introduce and evaluate our ‘Back-to-the-Future whois’ service, allowing historic bulk attribution of IP addresses on a daily granularity based on CAIDA Routeviews aggregates. We provide this service to the community for free, and also share our implementation so researchers can run instances themselves.
Not Your Average App: A Large-scale Privacy Analysis of Android Browsers
Pradeep, A., Feal, Á., Gamba, J., Rao, A., Lindorfer, M., Vallina-Rodriguez, N., & Choffnes, D. (2023). Not Your Average App: A Large-scale Privacy Analysis of Android Browsers. In M. L. Mazurek & M. Sherr (Eds.), Proceedings on Privacy Enhancing Technologies Symposium 2023 (pp. 29–46).
DOI: 10.56553/popets-2023-0003
The privacy-related behavior of mobile browsers has remained widely unexplored by the research community. In fact, as opposed to regular Android apps, mobile browsers may present contradicting privacy behaviors. On the one hand, they can have access to (and can expose) a unique combination of sensitive user data, from users’ browsing history to permission-protected personally identifiable information (PII) such as unique identifiers and geolocation. On the other hand, they are in a unique position to protect users’ privacy by limiting data sharing with other parties by implementing ad-blocking features. In this paper, we perform a comparative and empirical analysis on how hundreds of Android web browsers protect or expose user data during browsing sessions. To this end, we collect the largest dataset of Android browsers to date, from the Google Play Store and four Chinese app stores. Then, we develop a novel analysis pipeline that combines static and dynamic analysis methods to find a wide range of privacy-enhancing (e.g., ad-blocking) and privacy-harming behaviors (e.g., sending browsing histories to third parties, not validating TLS certificates, and exposing PII, including non-resettable identifiers, to third parties) across browsers. We find that various popular apps on both Google Play and Chinese stores have these privacy-harming behaviors, including apps that claim to be privacy-enhancing in their descriptions. Overall, our study not only provides new insights into important yet overlooked considerations for browsers’ adoption and transparency, but also that automatic app analysis systems (e.g., sandboxes) need context-specific analysis to reveal such privacy behaviors.
Of Ahead Time: Evaluating Disassembly of Android Apps Compiled to Binary OATs Through the ART
Bleier, J., & Lindorfer, M. (2023). Of Ahead Time: Evaluating Disassembly of Android Apps Compiled to Binary OATs Through the ART. In J. Polakis & E. van der Kouwe (Eds.), EUROSEC ’23: Proceedings of the 16th European Workshop on System Security (pp. 21–29).
DOI: 10.1145/3578357.3591219
The Android operating system has evolved significantly since its initial release in 2008. Most importantly, in a continuing effort to increase the run-time performance of mobile applications (apps) and to reduce resource requirements, the way code is executed has transformed from being bytecode-based to a binary-based approach: Apps are still mainly distributed as Dalvik bytecode, but the Android Runtime (ART) uses an optimizing compiler to create binary code ahead-of-time (AOT), just-in-time (JIT), or as a combination of both. These changes in the build pipeline, including increasing obfuscation and optimization of the Dalvik bytecode, invalidate assumptions of bytecode-based static code analysis approaches through identifier renaming and code shrinking. Furthermore, customized apps can be distributed pre-compiled with devices’ firmware, sidestepping the bytecode altogether. Finally, Android apps have always relied on native binary code libraries for performance-critical tasks. We propose to narrow the gap between bytecode and binary code by leveraging the ART compiler’s capability to create well-formed ELF binaries, called OATs, as the basis for further static code analysis. To this end, we created a pipeline to automatically and efficiently compile APKs to OATs into a benchmark dataset of 1,339 apps. We then evaluate five popular disassemblers based on how well they can analyze these OATs based on how well they can detect function boundaries. Our results, in particular, compared to the success rate of two bytecode-based analyzers, demonstrate that our OAT-based approach can help to bring a wider set of code analysis tools and techniques to the area of Android app analysis.
Mixed Signals: Analyzing Software Attribution Challenges in the Android Ecosystem
Hageman, K., Feal, A., Gamba, J., Girish, A., Bleier, J., Lindorfer, M., Tapiador, J., & Vallina-Rodriguez, N. (2023). Mixed Signals: Analyzing Software Attribution Challenges in the Android Ecosystem. IEEE Transactions on Software Engineering, 49(4), 2964–2979.
DOI: 10.34726/5296
The ability to identify the author responsible for a given software object is critical for many research studies and for enhancing software transparency and accountability. However, as opposed to other application markets like Apple's iOS App Store, attribution in the Android ecosystem is known to be hard. Prior research has leveraged market metadata and signing certificates to identify software authors without questioning the validity and accuracy of these attribution signals. However, Android application (app) authors can, either intentionally or by mistake, hide their true identity due to: (1) the lack of policy enforcement by markets to ensure the accuracy and correctness of the information disclosed by developers in their market profiles during the app release process, and (2) the use of self-signed certificates for signing apps instead of certificates issued by trusted CAs. In this paper, we perform the first empirical analysis of the availability, volatility and overall aptness of publicly available market and app metadata for author attribution in Android markets. To that end, we analyze a dataset of over 2.5 million market entries and apps extracted from five Android markets for over two years. Our results show that widely used attribution signals are often missing from market profiles and that they change over time. We also invalidate the general belief about the validity of signing certificates for author attribution. For instance, we find that apps from different authors share signing certificates due to the proliferation of app building frameworks and software factories. Finally, we introduce the concept of an attribution graph and we apply it to evaluate the validity of existing attribution signals on the Google Play Store. Our results confirm that the lack of control over publicly available signals can confuse automatic attribution processes.
IoTFlow: Inferring IoT Device Behavior at Scale through Static Mobile Companion App Analysis
Schmidt, D., Tagliaro, C., Borgolte, K., & Lindorfer, M. (2023). IoTFlow: Inferring IoT Device Behavior at Scale through Static Mobile Companion App Analysis. In CCS ’23: Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (pp. 681–695). Association for Computing Machinery.
DOI: 10.1145/3576915.3623211
The number of “smart” devices, that is, devices making up the Internet of Things (IoT), is steadily growing. They suffer from vulnerabilities just as other software and hardware. Automated analysis techniques can detect and address weaknesses before attackers can misuse them. Applying existing techniques or developing new approaches that are sufficiently general is challenging though. Contrary to other platforms, the IoT ecosystem features various software and hardware architectures. We introduce IoTFlow, a new static analysis approach for IoT devices that leverages their mobile companion apps to address the diversity and scalability challenges. IoTFlow combines Value Set Analysis (VSA) with more general data-flow analysis to automatically reconstruct and derive how companion apps communicate with IoT devices and remote cloud-based backends, what data they receive or send, and with whom they share it. To foster future work and reproducibility, our IoTFlow implementation is open source. We analyze 9,889 manually verified companion apps with IoTFlow to understand and characterize the current state of security and privacy in the IoT ecosystem, which also demonstrates the utility of IoTFlow. We compare how these IoT apps differ from 947 popular general-purpose apps in their local network communication, the protocols they use, and who they communicate with. Moreover, we investigate how the results of IoTFlow compare to dynamic analysis, with manual and automated interaction, of 13 IoT devices when paired and used with their companion apps. Overall, utilizing IoTFlow, we discover various IoT security and privacy issues, such as abandoned domains, hard-coded credentials, expired certificates, and sensitive personal information being shared.
Heads in the Clouds? Measuring Universities’ Migration to Public Clouds: Implications for Privacy & Academic Freedom
Fiebig, T., Gürses, S., Hernández Gañán, C., Kotkamp, E., Kuipers, F., Lindorfer, M., Prisse, M., & Sari, T. (2023). Heads in the Clouds? Measuring Universities’ Migration to Public Clouds: Implications for Privacy & Academic Freedom. In M. L. Mazurek & M. Sherr (Eds.), Proceedings on Privacy Enhancing Technologies (pp. 117–150). De Gruyter Open / Sciendo.
DOI: 10.56553/popets-2023-0044
With the emergence of remote education and work in universities due to COVID-19, the 'zoomification' of higher education, i.e., the migration of universities to the clouds, reached the public discourse. Ongoing discussions reason about how this shift will take control over students' data away from universities, and may ultimately harm the privacy of researchers and students alike. However, there has been no comprehensive measurement of universities' use of public clouds and reliance on Software-as-a-Service offerings to assess how far this migration has already progressed. We perform a longitudinal study of the migration to public clouds among universities in the U.S. and Europe, as well as institutions listed in the Times Higher Education (THE) Top100 between January 2015 and October 2022. We find that cloud adoption differs between countries, with one cluster (Germany, France, Austria, Switzerland) showing a limited move to clouds, while the other (U.S., U.K., the Netherlands, THE Top100) frequently outsources universities' core functions and services, starting long before the COVID-19 pandemic. We attribute this clustering to several socio-economic factors in the respective countries, including the general culture of higher education and the administrative paradigm taken towards running universities. We then analyze and interpret our results, finding that the implications reach beyond individuals' privacy towards questions of academic independence and integrity.
Connecting the .dotfiles: Checked-In Secret Exposure with Extra (Lateral Movement) Steps
Jungwirth, G., Saha, A., Schröder, M., Fiebig, T., Lindorfer, M., & Cito, J. (2023). Connecting the .dotfiles: Checked-In Secret Exposure with Extra (Lateral Movement) Steps. In IEEE/ACM 20th International Conference on Mining Software Repositories (MSR) (pp. 322–333).
DOI: 10.1109/MSR59073.2023.00051
Personal software configurations, known as dotfiles, are increasingly being shared in public repositories. To understand the security and privacy implications of this phenomenon, we conducted a large-scale analysis of dotfiles repositories on GitHub. Furthermore, we surveyed repository owners to understand their motivations for sharing dotfiles, and their awareness of the security implications. Our mixed-method approach consisted of two parts: (1) We mined 124,230 public dotfiles repositories and inductively searched them for security and privacy flaws. (2) We then conducted a survey of repository owners (n=1,650) to disclose our findings and learn more about the problems and implications. We found that 73.6 % of repositories leak potentially sensitive information, most commonly email addresses (of which we found 1.2 million), but also RSA private keys, API keys, installed software versions, browsing history, and even mail client inboxes. In addition, we found that sharing is mainly ideological (an end in itself) and to show off ("ricing"), in addition to easing machine setup. Most users are confident about the contents of their files and claim to understand the security implications. In response to our disclosures, a small minority (2.2%) will make their repositories private or delete them, but the majority of respondents will continue sharing their dotfiles after taking appropriate actions. Dotfiles repositories are a great tool for developers to share knowledge and communicate - if done correctly. We provide recommendations for users and platforms to make them more secure. Specifically, tools should be used to manage dotfiles. In addition, platforms should work on more sophisticated tests, to find weaknesses automatically and inform the users or control the damage.
The Threat of Surveillance and the Need for Privacy Protections
Lindorfer, M. (2023). The Threat of Surveillance and the Need for Privacy Protections. In H. Werthner, C. Ghezzi, J. Kramer, J. Nida-Rümelin, B. Nuseibeh, E. Prem, & A. Stanger (Eds.), Introduction to Digital Humanism: A Textbook (pp. 593–609). Springer.
DOI: 10.1007/978-3-031-45304-5_37
In recent years, and since the introduction of the General Data Protection Regulation (GDPR) in particular, we have seen an increased interest (and concern) about the amount of private information that is collected by the applications and services we use in our daily lives. The widespread collection and commodification of personal data has been mainly driven by companies collecting, mining, and selling user profiles for targeted advertisement, a practice also referred to as “surveillance capitalism.” However, as we detail in this chapter, this is not the only form of surveillance and can be necessary and even beneficial by increasing the safety of citizens, if it is aligned with the principles of digital humanism in providing transparency, oversight, and accountability. We also detail mechanisms users can deploy to protect their own privacy, as well as mechanisms that help to develop more privacy-friendly technologies.
I Still Know What You Watched Last Sunday: Privacy of the HbbTV Protocol in the European Smart TV Landscape
Tagliaro, C., Hahn, F., Sepe, R., Aceti, A., & Lindorfer, M. (2023). I Still Know What You Watched Last Sunday: Privacy of the HbbTV Protocol in the European Smart TV Landscape. In Proceedings Network and Distributed System Security (NDSS) Symposium 2023. 30th Annual Network and Distributed System Security Symposium (NDSS) 2023, San Diego, United States of America.
DOI: 10.14722/ndss.2023.24102
The ever-increasing popularity of Smart TVs and support for the Hybrid Broadcast Broadband TV (HbbTV) standard allow broadcasters to enrich content offered to users via the standard broadcast signal with Internet-delivered apps, e.g., ranging from quizzes during a TV show to targeted advertisement. HbbTV works using standard web technologies as transparent overlays over a TV channel. Despite the number of HbbTV-enabled devices rapidly growing, studies on the protocol’s security and privacy aspects are scarce, and no standard protective measure is in place. We fill this gap by investigating the current state of HbbTV in the European landscape and assessing its implications for users’ privacy. We shift the focus from the Smart TV’s firmware and app security, already studied in-depth in related work, to the content transmission protocol itself. Contrary to traditional “linear TV” signals, HbbTV allows for bi-directional communication: in addition to receiving TV content, it also allows for transmitting data back to the broadcaster. We describe techniques broadcasters use to measure users’ (viewing) preferences and show how the protocol’s implementation can cause severe privacy risks by studying its deployment by 36 TV channels in five European countries (Italy, Germany, France, Austria, and Finland). We also survey users’ awareness of Smart TV and HbbTV-related risks. Our results show little understanding of the possible threats users are exposed to. Finally, we present a denylist-based mechanism to ensure a safe experience for users when watching TV and to reduce the privacy issues that HbbTV may pose.
Investigating HbbTV Privacy Invasiveness Across European Countries
Tagliaro, C., Hahn, F., Sepe, R., Aceti, A., & Lindorfer, M. (2023). Investigating HbbTV Privacy Invasiveness Across European Countries. In Learning from Authoritative Security Experiment Results (LASER) 2023. Workshop on Learning from Authoritative Security Experiment Results (LASER 2023), San Diego, United States of America.
DOI: 10.14722/laser-ndss.2023.24102
Smart TVs enable the integration of the traditional broadcast signal with services offered by the Internet. Specifically, the Hybrid Broadcast Broadband TV (HbbTV) protocol allows broadcasters to offer consumers additional features via the Internet (e.g., quizzes and the ability to restart programs), enriching their viewing experience. For broadcasters, its bi-directional nature also makes it possible to measure viewing preferences and provide targeted advertisements (marketed as “Addressable TV”). HbbTV works using standard web technologies as transparent overlays over a TV channel, thus porting web security and privacy concerns to the Smart TV. However, despite the increasing adoption of HbbTV worldwide, studies on security and privacy issues in its deployments are scarce. In this paper, we discuss how we tested a range of 36 channels across five European countries and which challenges we faced. Specifically, every country adopts different ways of delivering the broadcast signal to the TVs. Thus, we provide a common experiment setup and detailed instructions on how we assess the TV channels’ privacy level in each country. We also show how the URLs pointing to the HbbTV applications we extracted can foster further replicability and studies. Finally, to complement our technical experiments, we also measured Italian users’ awareness (N=174) of the security and privacy risks HbbTV introduces, and we discuss our methodology to do so. -
Position Paper: Escaping Academic Cloudification to Preserve Academic Freedom
Fiebig, T., Gürses, S., & Lindorfer, M. (2022). Position Paper: Escaping Academic Cloudification to Preserve Academic Freedom. Privacy Studies Journal, 51–68.
DOI: 10.7146/psj.vi.132713
Especially since the onset of the COVID-19 pandemic, the use of cloud-based tools and solutions, led by the ‘Zoomification’ of education, has picked up attention in the EdTech and privacy communities. In this paper, we take a look at the progressing use of cloud-based educational tools, often controlled by only a handful of major corporations. We analyse how this ‘cloudification’ impacts academics’ and students’ privacy and how it influences the handling of privacy by universities and higher education institutions. Furthermore, we take a critical perspective on how this cloudification may not only threaten users’ privacy, but ultimately may also compromise core values like academic freedom: the dependency relationships between universities and corporations could impact curricula, while also threatening what research can be conducted. Finally, we take a perspective on universities’ cloudification in different western regions to identify policy mechanisms and recommendations that can enable universities to preserve their academic independence, without compromising on digitalization and functionality. -
A Comparative Analysis of Certificate Pinning in Android & iOS
Pradeep, A., Paracha, M. T., Bhowmick, P., Davanian, A., Razaghpanah, A., Chung, T., Lindorfer, M., Vallina-Rodriguez, N., Levin, D., & Choffnes, D. (2022). A Comparative Analysis of Certificate Pinning in Android & iOS. In Proceedings of the 22nd ACM Internet Measurement Conference (pp. 605–618). ACM.
DOI: 10.34726/3505
TLS certificate pinning is a security mechanism used by applications (apps) to protect their network traffic against malicious certificate authorities (CAs), in-path monitoring, and other methods of TLS tampering. Pinning can provide enhanced security to defend against malicious third-party access to sensitive data in transit (e.g., to protect sensitive banking and health care information), but can also hide an app’s personal data collection from users and auditors. Prior studies found pinning was rarely used in the Android ecosystem, except in high-profile, security-sensitive apps, and little is known about its usage on iOS and across mobile platforms. In this paper, we thoroughly investigate the use of certificate pinning on Android and iOS. We collect 5,079 unique apps from the two official app stores: 575 common apps, 1,000 popular apps each, and 1,000 randomly selected apps each. We develop novel, cross-platform, static and dynamic analysis techniques to detect the usage of certificate pinning. Thus, our study offers a more comprehensive understanding of certificate pinning than previous studies. We find certificate pinning as much as 4 times more widely adopted than reported in recent studies. More specifically, we find that 0.9% to 8% of Android apps and 2.5% to 11% of iOS apps use certificate pinning at run time (depending on the aforementioned sets of apps). We then investigate which categories of apps most frequently use pinning (e.g., apps in the “finance” category), which destinations are typically pinned (e.g., first-party destinations vs those used by third-party libraries), which certificates are pinned and how these are pinned (e.g., CA vs leaf certificates), and the connection security for pinned connections vs unpinned ones (e.g., the use of weak ciphers or improper certificate validation).
Lastly, we investigate how many pinned connections are amenable to binary instrumentation to reveal the contents of their connections; for those that are, we analyze the data sent over pinned connections to understand what is protected by pinning. -
Comparing User Perceptions of Anti-Stalkerware Apps with the Technical Reality
Fassl, M., Anell, S., Houy, S., Lindorfer, M., & Krombholz, K. (2022). Comparing User Perceptions of Anti-Stalkerware Apps with the Technical Reality. In Proceedings of the Eighteenth Symposium on Usable Privacy and Security (SOUPS 2022) (pp. 135–154). USENIX Association.
DOI: 10.34726/3902
Every year an increasing number of users face stalkerware on their phones. Many of them are victims of intimate partner surveillance (IPS) who are unsure how to identify or remove stalkerware from their phones. An intuitive approach would be to choose anti-stalkerware from the app store. However, a mismatch between user expectations and the technical capabilities can produce an illusion of security and risk compensation behavior (i.e., the Peltzman effect). We compare users’ perceptions of anti-stalkerware with the technical reality. First, we applied thematic analysis to app reviews to analyze user perceptions. Then, we performed a cognitive walkthrough of two prominent anti-stalkerware apps available on the Google Play Store and reverse-engineered them to understand their detection features. Our results suggest that users base their trust on the look and feel of the app, the number and type of alerts, and the apps’ affordances. We also found that app capabilities do not correspond to the users’ perceptions and expectations, impacting their practical effectiveness. We discuss different stakeholders’ options to remedy these challenges and better align user perceptions with the technical reality. -
Not that Simple: Email Delivery in the 21st Century
Holzbauer, F., Ullrich, J., Lindorfer, M., & Fiebig, T. (2022). Not that Simple: Email Delivery in the 21st Century. In Proceedings of the 2022 USENIX Annual Technical Conference (pp. 295–308). USENIX Association.
DOI: 10.34726/4024
Over the past two decades, the number of RFCs related to email and its security has exploded from below 100 to nearly 500. This embedded the Simple Mail Transfer Protocol (SMTP) into a tree of interdependent and delivery-relevant standards. In this paper, we investigate how far real-world deployments keep up with this increasing complexity of delivery and security options. To gain an in-depth picture of email delivery apart from the giants in the ecosystem (Gmail, Outlook, etc.), we engage people to send emails to eleven differently configured target domains. Our measurements allow us to evaluate core aspects of email delivery, including security features, DNS configuration, and IP version support on the sending side across different types of providers. We find that novel technologies are often insufficiently supported, even by large providers. For example, while 65.4% of email providers can resolve hosts via IPv6, only 44.3% can also deliver emails via IPv6. Concerning security features, we observe that less than half (41.5%) of all providers rely on DNSSEC validating resolvers, and encryption is mostly opportunistic, with 89.7% of providers accepting invalid certificates. TLSA, as a DNS-based certificate verification method, is only used by 31.7% of the providers in our study. Finally, we turned our eye to the impact modern standards have on unsolicited bulk email (SPAM). We found that greylisting is effective, reducing the SPAM volume by roughly half while not impacting regular delivery. However, and interestingly, SPAM delivery currently seems to focus on plaintext IPv4 connections, making IPv6-only, TLS-enforcing inbound email servers a more effective anti-SPAM measure, even though it also means rejecting a major portion of legitimate emails. -
No Spring Chicken: Quantifying the Lifespan of Exploits in IoT Malware Using Static and Dynamic Analysis
Al Alsadi, A. A., Sameshima, K., Bleier, J., Yoshioka, K., Lindorfer, M., van Eeten, M., & Hernández Gañán, C. (2022). No Spring Chicken: Quantifying the Lifespan of Exploits in IoT Malware Using Static and Dynamic Analysis. In Yuji Suga, Kouichi Sakurai, Xuhua Ding, & Kazue Sako (Eds.), ASIA CCS ’22: Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security (pp. 309–321). Association for Computing Machinery.
DOI: 10.1145/3488932.3517408
The Internet of things (IoT) is composed of a wide variety of software and hardware components that inherently contain vulnerabilities. Previous research has shown that it takes only a few minutes from the moment an IoT device is connected to the Internet to the first infection attempts. Still, we know little about the evolution of exploit vectors: Which vulnerabilities are being targeted in the wild, how has the functionality changed over time, and for how long are vulnerabilities being targeted? Understanding these questions can help in the secure development and deployment of IoT networks. We present the first longitudinal study of IoT malware exploits by analyzing 17,720 samples collected from three different sources from 2015 to 2020. Leveraging static and dynamic analysis, we extract exploits from these binaries to then analyze them along the following four dimensions: (1) evolution of infection vectors over the years, (2) exploit lifespan, vulnerability age, and the time-to-exploit of vulnerabilities, (3) functionality of exploits, and (4) targeted IoT devices and manufacturers. Our descriptive analysis uncovers several patterns: IoT malware keeps evolving, shifting from simply leveraging brute force attacks to including dozens of device-specific exploits. Once exploits are developed, they are rarely abandoned. The most recent binaries still target (very) old vulnerabilities. In some cases, new exploits are developed for a vulnerability that has been known for years. We find that the mean time-to-exploit after vulnerability disclosure is around 29 months, much longer than for malware targeting other environments. -
Tarnhelm: Isolated, Transparent & Confidential Execution of Arbitrary Code in ARM's TrustZone
Quarta, D., Ianni, M., Machiry, A., Fratantonio, Y., Gustafson, E., Balzarotti, D., Lindorfer, M., Vigna, G., & Kruegel, C. (2021). Tarnhelm: Isolated, Transparent & Confidential Execution of Arbitrary Code in ARM’s TrustZone. In Proceedings of the 2021 Research on offensive and defensive techniques in the Context of Man At The End (MATE) Attacks. ACM, Austria. ACM.
DOI: 10.1145/3465413.3488571
Protecting the confidentiality of applications on commodity operating systems, both on desktop and mobile devices, is challenging: attackers have unrestricted control over an application’s processes and thus direct access to any of the application’s assets. However, the application’s code itself can be of great commercial value, for example in the case of proprietary code or additional functionality obtained as downloadable content and via in-app purchases, which are widely used to monetize free applications through premium content. Developers still rely heavily on obfuscation to protect their own code from unauthorized tampering or copying, providing an obstacle for an attacker, but not preventing compromise. In this paper, we present Tarnhelm, an approach to offer a practical and transparent primitive to implement code confidentiality by extending ARM’s TrustZone, a TEE that so far provides limited functionality to application developers. Tarnhelm allows developers to easily designate part of their code as confidential through source code annotations. At compile time, Tarnhelm automatically partitions the application into regular application code, executed in the "normal world," and the invisible code, transparently executed in the "secure world." Tarnhelm tightly couples and secures the execution in both worlds without exposing any additional attack surface by combining a number of different techniques, such as secure code loading, system call forwarding, transparent world switching, and the enforcement of inter-world control-flow integrity. We implemented a proof of concept of Tarnhelm and demonstrate its feasibility in a mobile computing setting. -
When Malware is Packin' Heat; Limits of Machine Learning Classifiers Based on Static Analysis Features
Aghakhani, H., Gritti, F., Mecca, F., Lindorfer, M., Ortolani, S., Balzarotti, D., Vigna, G., & Krügel, C. (2020). When Malware is Packin’ Heat; Limits of Machine Learning Classifiers Based on Static Analysis Features. In Network and Distributed System Security Symposium (NDSS). Internet Society.
Machine learning techniques are widely used in addition to signatures and heuristics to increase the detection rate of anti-malware software, as they automate the creation of detection models, making it possible to handle an ever-increasing number of new malware samples. In order to foil the analysis of anti-malware systems and evade detection, malware uses packing and other forms of obfuscation. However, few realize that benign applications use packing and obfuscation as well, to protect intellectual property and prevent license abuse. In this paper, we study how machine learning based on static analysis features operates on packed samples. Malware researchers have often assumed that packing would prevent machine learning techniques from building effective classifiers. However, both industry and academia have published results that show that machine-learning-based classifiers can achieve good detection rates, leading many experts to think that classifiers are simply detecting the fact that a sample is packed, as packing is more prevalent in malicious samples. We show that, different from what is commonly assumed, packers do preserve some information when packing programs that is "useful" for malware classification. However, this information does not necessarily capture the sample’s behavior. We demonstrate that the signals extracted from packed executables are not rich enough for machine-learning-based models to (1) generalize their knowledge to operate on unseen packers, and (2) be robust against adversarial examples. We also show that a naïve application of machine learning techniques results in a substantial number of false positives, which, in turn, might have resulted in incorrect labeling of ground-truth data used in past work. -
FlowPrint: Semi-Supervised Mobile-App Fingerprinting on Encrypted Network Traffic
van Ede, T., Bortolameotti, R., Continella, A., Ren, J., Dubois, D., Lindorfer, M., Choffnes, D., van Steen, M., & Peter, A. (2020). FlowPrint: Semi-Supervised Mobile-App Fingerprinting on Encrypted Network Traffic. In Network and Distributed System Security Symposium (NDSS). Internet Society.
Mobile-application fingerprinting of network traffic is valuable for many security solutions as it provides insights into the apps active on a network. Unfortunately, existing techniques require prior knowledge of apps to be able to recognize them. However, mobile environments are constantly evolving, i.e., apps are regularly installed, updated, and uninstalled. Therefore, it is infeasible for existing fingerprinting approaches to cover all apps that may appear on a network. Moreover, most mobile traffic is encrypted, shows similarities with other apps, e.g., due to common libraries or the use of content delivery networks, and depends on user input, further complicating the fingerprinting process. As a solution, we propose FLOWPRINT, a semi-supervised approach for fingerprinting mobile apps from (encrypted) network traffic. We automatically find temporal correlations among destination-related features of network traffic and use these correlations to generate app fingerprints. Our approach is able to fingerprint previously unseen apps, something that existing techniques fail to achieve. We evaluate our approach for both Android and iOS in the setting of app recognition, where we achieve an accuracy of 89.2%, significantly outperforming state-of-the-art solutions. In addition, we show that our approach can detect previously unseen apps with a precision of 93.5%, detecting 72.3% of apps within the first five minutes of communication. -
TXTing 101: Finding Security Issues in the Long Tail of DNS TXT Records
van der Toorn, O., van Rijswijk-Deij, R., Fiebig, T., Lindorfer, M., & Sperotto, A. (2020). TXTing 101: Finding Security Issues in the Long Tail of DNS TXT Records. In 2020 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). IEEE.
DOI: 10.1109/eurospw51379.2020.00080
The DNS TXT resource record is the one with the most flexibility for its contents, as it is largely unstructured. Although it might be the ideal basis for storing any form of text-based information, it also poses a security threat, as TXT records can also be used for malicious and unintended practices. Yet, TXT records are often overlooked in security research. In this paper, we present the first structured study of the uses of TXT records, with a specific focus on security implications. We are able to classify over 99.54% of all TXT records in our dataset, finding security issues including accidentally published private keys and exploit delivery attempts. We also report on our lessons learned during our large-scale, systematic analysis of TXT records. -
MineSweeper: An In-depth Look into Drive-by Cryptocurrency Mining and Its Defense
Konoth, R. K., Vineti, E., Moonsamy, V., Lindorfer, M., Kruegel, C., Bos, H., & Vigna, G. (2018). MineSweeper: An In-depth Look into Drive-by Cryptocurrency Mining and Its Defense. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. ACM.
DOI: 10.1145/3243734.3243858
A wave of alternative coins that can be effectively mined without specialized hardware and a surge in cryptocurrencies’ market value have led to the development of cryptocurrency mining (cryptomining) services, such as Coinhive, which can be easily integrated into websites to monetize the computational power of their visitors. While legitimate website operators are exploring these services as an alternative to advertisements, they have also drawn the attention of cybercriminals: drive-by mining (also known as cryptojacking) is a new web-based attack, in which an infected website secretly executes JavaScript code and/or a WebAssembly module in the user’s browser to mine cryptocurrencies without her consent. In this paper, we perform a comprehensive analysis on Alexa’s Top 1 Million websites to shed light on the prevalence and profitability of this attack. We study the websites affected by drive-by mining to understand the techniques being used to evade detection, and the latest web technologies being exploited to efficiently mine cryptocurrency. As a result of our study, which covers 28 Coinhive-like services that are widely being used by drive-by mining websites, we identified 20 active cryptomining campaigns. Motivated by our findings, we investigate possible countermeasures against this type of attack. We discuss how current blacklisting approaches and heuristics based on CPU usage are insufficient, and present MineSweeper, a novel detection technique that is based on the intrinsic characteristics of cryptomining code, and, thus, is resilient to obfuscation. Our approach could be integrated into browsers to warn users about silent cryptomining when visiting websites that do not ask for their consent. -
Panoptispy: Characterizing Audio and Video Exfiltration from Android Applications
Pan, E., Ren, J., Lindorfer, M., Wilson, C., & Choffnes, D. (2018). Panoptispy: Characterizing Audio and Video Exfiltration from Android Applications. In Proceedings on Privacy Enhancing Technologies (pp. 33–50). DeGruyter.
DOI: 10.1515/popets-2018-0030
The high-fidelity sensors and ubiquitous internet connectivity offered by mobile devices have facilitated an explosion in mobile apps that rely on multi-media features. However, these sensors can also be used in ways that may violate users’ expectations and personal privacy. For example, apps have been caught taking pictures without the user’s knowledge and passively listening for inaudible, ultrasonic audio beacons. The developers of mobile device operating systems recognize that sensor data is sensitive, but unfortunately existing permission models only mitigate some of the privacy concerns surrounding multimedia data. In this work, we present the first large-scale empirical study of media permissions and leaks from Android apps, covering 17,260 apps from Google Play, AppChina, Mi.com, and Anzhi. We study the behavior of these apps using a combination of static and dynamic analysis techniques. Our study reveals several alarming privacy risks in the Android app ecosystem, including apps that over-provision their media permissions and apps that share image and video data with other parties in unexpected ways, without user knowledge or consent. We also identify a previously unreported privacy risk that arises from third-party libraries that record and upload screenshots and videos of the screen without informing the user and without requiring any permissions. -
GuardION: Practical Mitigation of DMA-Based Rowhammer Attacks on ARM
van der Veen, V., Lindorfer, M., Fratantonio, Y., Padmanabha Pillai, H., Vigna, G., Kruegel, C., Bos, H., & Razavi, K. (2018). GuardION: Practical Mitigation of DMA-Based Rowhammer Attacks on ARM. In Detection of Intrusions and Malware, and Vulnerability Assessment (pp. 92–113). Springer.
DOI: 10.1007/978-3-319-93411-2_5
Over the last two years, the Rowhammer bug transformed from a hard-to-exploit DRAM disturbance error into a fully weaponized attack vector. Researchers demonstrated exploits not only against desktop computers, but also used single bit flips to compromise the cloud and mobile devices, all without relying on any software vulnerability. Since hardware-level mitigations cannot be backported, a search for software defenses is pressing. Proposals made by both academia and industry, however, are either impractical to deploy, or insufficient in stopping all attacks: we present RAMpage, a set of DMA-based Rowhammer attacks against the latest Android OS, consisting of (1) a root exploit, and (2) a series of app-to-app exploit scenarios that bypass all defenses. To mitigate Rowhammer exploitation on ARM, we propose GuardION, a lightweight defense that prevents DMA-based attacks (the main attack vector on mobile devices) by isolating DMA buffers with guard rows. We evaluate GuardION on 22 benchmark apps and show that it has a negligible memory overhead (2.2 MB on average). We further show that we can improve system performance by re-enabling higher order allocations after Google disabled these as a reaction to previous attacks. -
Bug Fixes, Improvements, ... and Privacy Leaks - A Longitudinal Study of PII Leaks Across Android App Versions
Ren, J., Lindorfer, M., Dubois, D. J., Rao, A., Choffnes, D., & Vallina-Rodriguez, N. (2018). Bug Fixes, Improvements, ... and Privacy Leaks - A Longitudinal Study of PII Leaks Across Android App Versions. In Proceedings 2018 Network and Distributed System Security Symposium. Internet Society.
DOI: 10.14722/ndss.2018.23143
Is mobile privacy getting better or worse over time? In this paper, we address this question by studying privacy leaks from historical and current versions of 512 popular Android apps, covering 7,665 app releases over 8 years of app version history. Through automated and scripted interaction with apps and analysis of the network traffic they generate on real mobile devices, we identify how privacy changes over time for individual apps and in aggregate. We find several trends that include increased collection of personally identifiable information (PII) across app versions, slow adoption of HTTPS to secure the information sent to other parties, and a large number of third parties being able to link user activity and locations across apps. Interestingly, while privacy is getting worse in aggregate, we find that the privacy risk of individual apps varies greatly over time, and a substantial fraction of apps see little change or even improvement in privacy. Given these trends, we propose metrics for quantifying privacy risk and for providing this risk assessment proactively to help users balance the risks and benefits of installing new versions of apps. -
Obfuscation-Resilient Privacy Leak Detection for Mobile Apps Through Differential Analysis
Continella, A., Fratantonio, Y., Lindorfer, M., Puccetti, A., Zand, A., Kruegel, C., & Vigna, G. (2017). Obfuscation-Resilient Privacy Leak Detection for Mobile Apps Through Differential Analysis. In Proceedings 2017 Network and Distributed System Security Symposium. Internet Society.
DOI: 10.14722/ndss.2017.23465
Mobile apps are notorious for collecting a wealth of private information from users. Despite significant effort from the research community in developing privacy leak detection tools based on data flow tracking inside the app or through network traffic analysis, it is still unclear whether apps and ad libraries can hide the fact that they are leaking private information. In fact, all existing analysis tools have limitations: data flow tracking suffers from imprecisions that cause false positives, as well as false negatives when the data flow from a source of private information to a network sink is interrupted; on the other hand, network traffic analysis cannot handle encryption or custom encoding. We propose a new approach to privacy leak detection that is not affected by such limitations, and it is also resilient to obfuscation techniques, such as encoding, formatting, encryption, or any other kind of transformation performed on private information before it is leaked. Our work is based on black-box differential analysis, and it works in two steps: first, it establishes a baseline of the network behavior of an app; then, it modifies sources of private information, such as the device ID and location, and detects leaks by observing deviations in the resulting network traffic. The basic concept of black-box differential analysis is not novel, but, unfortunately, it is not practical enough to precisely analyze modern mobile apps. In fact, their network traffic contains many sources of non-determinism, such as random identifiers, timestamps, and server-assigned session identifiers, which, when not handled properly, cause too much noise to correlate output changes with input changes. The main contribution of this work is to make black-box differential analysis practical when applied to modern Android apps.
In particular, we show that the network-based non-determinism can often be explained and eliminated, and it is thus possible to reliably use variations in the network traffic as a strong signal to detect privacy leaks. We implemented this approach in a tool, called AGRIGENTO, and we evaluated it on more than one thousand Android apps. Our evaluation shows that our approach works well in practice and outperforms current state-of-the-art techniques. We conclude our study by discussing several case studies that show how popular apps and ad libraries currently exfiltrate data by using complex combinations of encoding and encryption mechanisms that other approaches fail to detect. Our results show that these apps and libraries seem to deliberately hide their data leaks from current approaches and clearly demonstrate the need for an obfuscation-resilient approach such as ours.
Presentations (created while at TU Wien)
-
2022
-
ART-assisted App Diffing: Defeating Dalvik Bytecode Shrinking, Obfuscation, and Optimization with Android's OAT Compiler
Bleier, J., & Lindorfer, M. (2022, May 23). ART-assisted App Diffing: Defeating Dalvik Bytecode Shrinking, Obfuscation, and Optimization with Android’s OAT Compiler [Poster Presentation]. 43rd IEEE Symposium on Security and Privacy, San Francisco, United States of America.
Android aims to provide a secure and feature-rich, yet resource-saving platform for its applications (apps). To achieve these goals, the compilation to distributable packages shrinks, obfuscates, and optimizes the code by default. As an additional optimization, the Android Runtime (ART) nowadays compiles the app’s bytecode to native code on the device instead of executing it in the Dalvik VM. We study the effects of these changes in the Android build and runtime environment on the problem of calculating app similarity. We compare existing bytecode-based tools to our novel approach of using the recompiled (and optimized) binary form. We propose OATMEAL, an extensible framework to generate reliable ground truth for evaluating app similarity approaches and provide a benchmark dataset to the community. We built this dataset from open-source apps available on F-Droid in various configurations that optimize and obfuscate the bytecode. Using this dataset, we show the limitations of existing Android-specific bytecode analysis approaches when faced with the new optimizing R8 bytecode compiler. We further demonstrate how well BinDiff, a state-of-the-art binary-based alternative, works in scoring the similarity of apps. With OATMEAL, we provide the foundation for integrating and benchmarking further approaches, both for calculating the similarity between apps (based on bytecode or binary code), and for evaluating their robustness to evolving optimization and obfuscation techniques.