On the opportunities and risks of machine learning to online and societal safety

Abdelnabi, Sahar

Please use this identifier to cite or link to this item: doi:10.22028/D291-44404

Title:	On the opportunities and risks of machine learning to online and societal safety
Author(s):	Abdelnabi, Sahar
Language:	English
Year of Publication:	2024
DDC notations:	004 Computer science, internet 600 Technology
Publication type:	Dissertation
Abstract:	Machine learning (ML), with its rapid and continuing advances, has great potential to accelerate decision-making, alleviate some societal problems, and reshape and facilitate our daily lives. However, ML has inherent security vulnerabilities and limitations, and it can itself be exploited and misused to exacerbate those same problems, which calls for a thorough evaluation of capabilities, attacks, and countermeasures. In this thesis, we evaluate the interplay between ML, security, and aspects of online and societal safety, such as misinformation and the risks posed by the use of Large Language Models (LLMs). To counter the risks posed by LLMs and generative models and to help identify the context and provenance of information, we propose watermarking as an active defense against deepfakes and model abuse. To exemplify how ML can promote online safety, we leverage ML to automate multi-modal fact-checking and to identify the underlying context of images that might be used out of context. Conversely, to evaluate how ML can exacerbate misinformation and cause information contamination and poisoning, we comprehensively study attacks against fact-checking models and possible attacks against real-world deployed LLM-integrated search engines. We also broadly discuss LLM-integrated applications and the security risks induced by the indirect prompt injection vulnerability that we uncover. Finally, to proactively evaluate LLMs in interactive setups that better match real-world use cases, such as customer service chatbots, we propose a new benchmark of complex text-based negotiation games to examine LLMs' performance and reasoning in multi-agent setups, including adversarial ones that assume attacks between agents.
Link to this record:	urn:nbn:de:bsz:291--ds-444049 hdl:20.500.11880/40056 http://dx.doi.org/10.22028/D291-44404
Advisor:	Fritz, Mario
Date of oral examination:	19-Dec-2024
Date of registration:	28-May-2025
Faculty:	MI - Fakultät für Mathematik und Informatik
Department:	MI - Informatik
Professorship:	MI - Prof. Dr. Mario Fritz
Collections:	SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Files for this record:

File	Description	Size	Format
PhD_dissertation_Sahar_Abdelnabi.pdf		21.88 MB	Adobe PDF

This item is licensed under a Creative Commons License