To join the email distribution list of the cs colloquia, please visit the list subscription page.
Computer Science events calendar in HTTP ICS format for of Google calendars, and for Outlook.
Academic Calendar at Technion site.
Taub 601
DAG-based Byzantine Fault Tolerant (BFT) protocols provide high-throughput consensus under partial synchrony, but existing DAG protocols still require at least a three-message delay to commit decisions.
In contrast, existing Fast-Path BFT protocols can achieve optimal termination with a two-message delay under favorable conditions, though they do not naturally extend to DAGs.
In this seminar we present FinWhale, the first DAG-based BFT protocol with a two-message delay fast path.
FinWhale extends the Mysticeti protocol with a novel fast-path commit mechanism that safely coexists with the protocol’s original slow-path rules.
To preserve safety across different local DAG views, we introduce new commit structures based on fast-path evidence blocks that enable validators to combine fast-path and slow-path reasoning consistently.
FinWhale operates in the partially synchronous model with $n=3f+2p-1$ validators; this matches the known lower-bound for fast Byzantine consensus.
The protocol tolerates up to $f$ Byzantine faults and achieves fast termination whenever at most $p$ validators fail during the fast path ($1 \leq p\leq f$).
Our results show that optimal-latency fast paths can be integrated into uncertified DAG consensus protocols.
The diversity of workloads, performance metrics, and potential system optimizations makes designing an optimal caching middleware a daunting task. For example, cache replacement policies can exhibit widely varying hit ratios depending on workload characteristics and cache sizes. Moreover, the choice of cache replacement policy may impact the computational complexity and meta-data size required for cache management, which may also be impacted by the internal cache organization (e.g., fully associative vs. k-way) and supporting data structures. Consequently, state-of-the-art caching libraries are still hand-crafted by experts and carefully tuned to narrow workload classes and design goals. Adapting a cache to a new setting, or improving it as a robust general-purpose solution, still requires a labor-intensive redesign process that does not scale to the diversity of modern workloads and deployment constraints. We present Evolve, an agentic framework for automatically improving caching middleware implementations.
Starting from an existing design, Evolve searches for superior variants under two regimes: refinement over a fixed workload suite, and workload specialization against a deployment-specific trace distribution. Each cycle proposes a cohort of LLM-implemented variants, validates them through deterministic correctness gates, scores survivors on a multi-dimensional plugin-based fitness vector, retains a Pareto front under a held-out trace suite to control for overfitting.
The framework also stores LLM-generated insights in a knowledge ledger that supports cross-iteration learning and guides future exploration. As a case study, we apply Evolve on a Dash-inspired associative cache, run it across diverse traces in both regimes, and characterize the variants it produces along each chosen fitness dimension.
Taub 601
Constrained Horn Clauses (CHCs) have emerged as a powerful framework for formal verification, enabling reasoning about program safety, invariant synthesis, model checking, and related verification tasks. Their combination of logical expressiveness and algorithmic solvability has made CHCs a central foundation for modern verification techniques. However, despite significant advances in CHC solving, reasoning about low-level bit-precise program semantics remains a major challenge for existing solvers.
Traditionally, a set of CHCs is encoded with respect to a single background theory. In the context of bit-precise verification, program semantics can naturally be encoded as CHCs modulo the theory of fixed-size bit-vectors. Alternatively, bit-precise semantics can be encoded using integer arithmetic by modeling modular and bit-wise behavior through arithmetic constraints. However, neither approach consistently yields an efficient verification procedure: reasoning directly in the bit-vector theory often limits the scalability and generalization capabilities of CHC solvers, whereas arithmetic encodings produce complex constraints that are expensive to process, especially in the presence of bit-wise operations.
To address this limitation, I will present a novel theory-modular framework for CHC satisfiability that enables coordinated reasoning across multiple background theories. The framework decomposes a CHC system into complementary fragments interpreted over different background theories and coordinates reasoning between them through sound cross-theory transformations. This modular approach combines efficient arithmetic reasoning with precise handling of bit-level operations, leveraging the strengths of both theories within a unified CHC-solving framework.
I will describe the theoretical foundations of the approach, including theory transformations, modular CHC encodings, and a satisfiability procedure for coordinated reasoning across theories. I will also present experimental results on bit-manipulating verification benchmarks, demonstrating substantial improvements over monolithic single-theory approaches. Finally, I will discuss how theory-modular reasoning can contribute to future directions in scalable and expressive CHC solving.
High-performance computing software must be ported among serial C++, OpenMP, and CUDA as platforms and deployment constraints change. Such translation requires preserving program behavior while satisfying target-specific constraints on parallel execution, synchronization, memory movement, and API use. Although large language models are flexible code generators, plausible parallel-code translations can fail at compilation, execution, or validation.
We study guidance and adaptation methods that make existing LLMs more reliable without training a generator from scratch. The central method is latent PRM guidance, which guides a frozen latent-reasoning model before code is emitted. A small process reward model scores alternative continuous hidden-state branches during latent reasoning and selects branches expected to lead to executable, behaviorally correct programs. The evaluation also includes lighter-weight assistance methods such as prompting, fine-tuning, and iterative repair with compiler and execution feedback.
On the ParaTrans dataset, LLMs struggle as standalone executable translators, but their reliability improves significantly through task-aware assistance. Latent PRM guidance improves validation over unguided latent reasoning while remaining compatible with iterative repair. The best results are achieved through the cumulative and compounding effects of combining guidance, prompting, and iterative repair to improve executable correctness and validation performance.
Prediction algorithms are increasingly used to inform decisions about humans, but maximizing accuracy - the standard learning objective - does not necessarily maximize user benefits. Instead, we propose optimizing social welfare, defined as the average gain users receive from correct predictions. Welfare enables to express, and therefore account for, heterogeneity in how much users benefit from accuracy. But since these valuations are private and users can gain from overreporting them, learning must simultaneously elicit truthful values and optimize welfare with respect to them. To this end, we propose a novel learning algorithm that incorporates a truthful auction. We show how to compute allocations and prices efficiently, and bound the number of paying users - which surprisingly is independent of the sample size. We conclude with experiments on real and synthetic data that demonstrate our algorithm and explore the connections between welfare and accuracy.
Taub 601
We initiate the study of DNA-based distributed storage systems, where information is encoded across multiple DNA data storage containers to achieve robustness against container failures. In this setting, data are distributed over M containers, and the objective is to guarantee that the contents of any failed container can be reliably reconstructed from the surviving ones. Unlike classical distributed storage systems, DNA data storage containers are fundamentally constrained by sequencing technology, since each read operation yields the content of a uniformly random sampled strand from the container. Within this framework, we consider several erasure-correcting codes and analyze the expected recovery time of the data stored in a failed container. Our results are obtained by analyzing generalized versions of the classical Coupon Collector's Problem, which may be of independent interest.