Learning by Sampling: A Deep Learning Approach To The Planted Clique Problem With Unlimited Sampling

Najeeb Nabwani, M.Sc. Thesis Seminar

Thursday, 15.10.2020, 16:30

Zoom Lecture: https://technion.zoom.us/j/93620048284

Advisor: Prof. Assaf Schuster

The planted clique problem plays an important role in the field of average-case complexity. The problem consists of detecting graphs that contain a hidden clique of size $k$ that was planted in graphs from the Erdos-Renyi model $G_{n,0.5}$. The planted clique conjecture suggests that there is no polynomial-time algorithm that can solve the planted clique problem when $2\log_2(n) \ll k \ll \sqrt{n}$.
This conjecture has served as the underlying hardness assumption in several different works in fields such as cryptography and machine learning.
We propose a deep learning framework for approximating the planted clique problem. Due to the well-defined random distribution of the planted clique problem, when certain conditions are met, we are able to generate unlimited amounts of training data with constant-time labelling. Our framework leverages that ability and uses polynomial-sized feed-forward neural networks. We evaluate this framework empirically on multiple different planted clique distributions and manage to achieve high accuracy scores that surpass 90\% in some cases.
Our results show that more research is needed in applying deep learning techniques to problems from average-case complexity theory.
We then address the possibility of expanding this framework to the maximum clique problem. The set of graphs with $n$ nodes is broken down to smaller, more focused subsets, and for each subset, we train a network that approximates the maximum clique problem. We assume that for each subset, an oracle machine that labels most of the subset in polynomial time exists. We prove this concept by conducting empirical evaluations on small graphs by using an exponential-time algorithm in order to simulate the oracle machine. The results show that if we can approximate such oracle machine in polynomial time, then the maximum clique problem can be effectively learned from the smaller, more focused subsets.