Bitte benutzen Sie diese Referenz, um auf diese Ressource zu verweisen: doi:10.22028/D291-42655
Titel: Efficient and differentiable combinatorial optimization for visual computing
VerfasserIn: Abbas, Ahmed
Sprache: Englisch
Erscheinungsjahr: 2024
DDC-Sachgruppe: 004 Informatik
500 Naturwissenschaften
600 Technik
Dokumenttyp: Dissertation
Abstract: Many visual computing tasks involve reasoning over structured domains and discrete objects which can be modeled as combinatorial optimization (CO) problems. Tremendous speed-up in general-purpose CO solvers has allowed to solve these problems in many cases despite being NP-hard. Approaching large-scale structured prediction problems from a combinatorial optimization standpoint however, has been a challenge due to a variety of reasons. These include near real-time performance requirements and lack of differentiability of CO approaches. The latter causes difficulties in harnessing machine learning to specify problem specifications and in developing learnable CO algorithms. We aim to address these shortcomings on multiple avenues. First, we focus on a specific CO problem for clustering known as the multicut problem with many applications in a variety of visual computing tasks. The multicut problem is defined on a weighted graph where edge weights encode preferences of the endpoints belonging to the same or different clusters. We devise methods for reducing human effort in multicut model specification, (i.e., edge weights and graph structure) and for reducing high compute times associated with large problem sizes. For inferring the edge weights, we utilize neural networks and propose strategies for improving gradient estimation through the multicut problem. This allows to train the neural network by imposing a loss on the clustering produced by the multicut problem. Our approach improves performance on large-scale panoptic segmentation tasks. As a second step, we focus on the challenge of graph structure design. We propose a compact formulation of the multicut problem on the complete graph along with efficient algorithms, sidestepping the problem of graph selection. Our approach yields better panoptic segmentation performance as compared to the one obtained by a hand-designed graph structure. Lastly, we address scalability issues of the multicut problem. We devise a massively parallelizable GPU-friendly algorithm resulting in more than an order of magnitude speed-up over existing sequential methods. As a second avenue, we shift our focus to general-purpose CO solvers and tackle challenges related to their scalability. As a first step, we devise parallel algorithms that can harness GPUs gaining up to an order of magnitude speed-up over sequential CPU counterparts. For the second step, we exploit machine learning in solving relaxations of CO problems. We employ differentiable solver primitives and use neural networks inside the solver to guide the solving process. Given a few problem instances from a task of interest, our general-purpose solver can be adapted by learning instead of task-specific solver development. Empirically our learned approach achieves significantly better performance than its non-learned version, better solutions than task-specific solvers, and exhibits better anytime performance compared to a commercial solver (Gurobi). Our method is also applicable to some problems unrelated to structured prediction and offers competitive performance to the commercial solver. In summary, we study methods to make CO for visual computing more practical by devising differentiable, massively parallel, and data-driven methods.
Link zu diesem Datensatz: urn:nbn:de:bsz:291--ds-426550
hdl:20.500.11880/38409
http://dx.doi.org/10.22028/D291-42655
Erstgutachter: Swoboda, Paul
Tag der mündlichen Prüfung: 2-Aug-2024
Datum des Eintrags: 11-Sep-2024
Fakultät: MI - Fakultät für Mathematik und Informatik
Fachrichtung: MI - Informatik
Professur: MI - Keiner Professur zugeordnet
Sammlung:SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Dateien zu diesem Datensatz:
Datei Beschreibung GrößeFormat 
thesis_ahmed_comb_opt_visual_computing.pdfDissertation3,15 MBAdobe PDFÖffnen/Anzeigen


Alle Ressourcen in diesem Repository sind urheberrechtlich geschützt.