506, Zisapel Building
Diffusion Transformer models generate images with remarkable fidelity, yet training them at ultra-high resolutions is often cost-prohibitive due to the quadratic scaling of self-attention. In this talk, I will present Dynamic Position Extrapolation (DyPE), a training-free method that enables pre-trained diffusion transformers to synthesize images at resolutions far beyond their training data with no additional sampling cost.
The core of our approach leverages the spectral progression of the diffusion process, where low-frequency structures converge early and high-frequency details emerge in later stages. We introduce a mechanism to dynamically adjust positional encodings at each step, matching the frequency spectrum to the current stage of the generative process. I will demonstrate how DyPE enables models like FLUX to generate images at extreme scales, up to 16 million pixels, while consistently achieving state-of-the-art fidelity on high-resolution benchmarks. https://noamissachar.github.io/DyPE/
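As a rough illustration of adjusting positional encodings per sampling step, the sketch below applies a time-dependent position-interpolation factor to rotary-style frequencies. The schedule, the exponent `gamma`, and all function names are assumptions made for this example; they are not taken from the talk and may differ from what DyPE actually does. The only idea carried over is that early, noisy steps (t near 1) favor low frequencies and get the full extrapolation scale, while late steps (t near 0) return to the unscaled encoding.

```python
import math


def rope_frequencies(dim, base=10000.0):
    """Standard rotary-embedding frequencies for an even embedding size `dim`."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]


def dynamic_scale(t, resolution_ratio, gamma=2.0):
    """Hypothetical schedule: full scale at t=1 (pure noise), no scaling at t=0.

    `resolution_ratio` is target side length divided by training side length.
    `gamma` is an assumed knob controlling how quickly the scaling decays.
    """
    return resolution_ratio ** (t ** gamma)


def scaled_angles(position, t, dim, resolution_ratio):
    """Rotation angles for one position at diffusion time t in [0, 1]."""
    s = dynamic_scale(t, resolution_ratio)
    # Position interpolation: compress positions by the current scale factor.
    return [position / s * f for f in rope_frequencies(dim)]


early = scaled_angles(position=8, t=1.0, dim=4, resolution_ratio=4.0)
late = scaled_angles(position=8, t=0.0, dim=4, resolution_ratio=4.0)
```

With these assumed settings, `early` uses positions compressed by the full factor of 4, while `late` matches the unscaled encoding the model was trained with.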
Noam is a PhD student at the Hebrew University of Jerusalem under the supervision of Prof. Dani Lischinski and Prof. Raanan Fattal. His research interests are visual generative models and their applications.
This research introduces a novel per-timestep input-space adaptation framework designed for multivariate time-series models with fixed weights, addressing the need for secure deployments where regulatory or technical constraints prevent model fine-tuning. By back-propagating task loss through a frozen backbone using target-domain labels, the method enables effective adaptation in both source-free and source-assisted environments without requiring access to original training data. Evaluated against a convolutional backbone across clinical and sensor-based benchmarks, the approach yields substantial performance gains, significantly outperforming existing end-to-end and test-time adaptation baselines. Notably, the adapter matches or exceeds the performance of models trained natively on target data and exhibits unique architectural portability, allowing a single module to be deployed across different frozen predictors in a zero-shot capacity.
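To make the idea of input-space adaptation with fixed weights concrete, here is a minimal toy sketch. It swaps in a linear weighted sum for the convolutional backbone and uses a per-timestep scale and shift as the adapter; the weights, the synthetic distortion, and every name below are hypothetical and chosen only for illustration. Gradients of the task loss are pushed through the frozen predictor by the chain rule, and only the adapter parameters are updated.

```python
import random

FROZEN_W = (0.5, -1.0, 2.0, 1.5)  # hypothetical frozen backbone weights, never updated


def backbone(x):
    """Frozen predictor: a weighted sum over timesteps (toy stand-in)."""
    return sum(w * v for w, v in zip(FROZEN_W, x))


def adapt(x, scale, shift):
    """Per-timestep input adapter: x'_t = scale_t * x_t + shift_t."""
    return [s * v + b for s, v, b in zip(scale, x, shift)]


def mse(data, scale, shift):
    return sum((backbone(adapt(x, scale, shift)) - y) ** 2 for x, y in data) / len(data)


def train_adapter(data, steps=1000, lr=0.01):
    """Gradient descent on adapter parameters only, using target-domain labels."""
    T, n = len(FROZEN_W), len(data)
    scale, shift = [1.0] * T, [0.0] * T  # start from the identity adapter
    for _ in range(steps):
        g_scale, g_shift = [0.0] * T, [0.0] * T
        for x, y in data:
            err = backbone(adapt(x, scale, shift)) - y
            for t in range(T):
                # Chain rule through the frozen backbone: d(pred)/d(x'_t) = w_t.
                g_scale[t] += 2.0 * err * FROZEN_W[t] * x[t] / n
                g_shift[t] += 2.0 * err * FROZEN_W[t] / n
        scale = [s - lr * g for s, g in zip(scale, g_scale)]
        shift = [b - lr * g for b, g in zip(shift, g_shift)]
    return scale, shift


def make_target_data(n=100, seed=0):
    """Synthetic target domain: each timestep has its own gain and offset."""
    rng = random.Random(seed)
    gain, offset = (2.0, 0.5, 1.5, 0.8), (0.3, -0.2, 0.1, 0.0)
    data = []
    for _ in range(n):
        clean = [rng.uniform(-1.0, 1.0) for _ in FROZEN_W]
        distorted = [g * c + o for g, c, o in zip(gain, clean, offset)]
        data.append((distorted, backbone(clean)))
    return data


data = make_target_data()
loss_before = mse(data, [1.0] * 4, [0.0] * 4)
scale, shift = train_adapter(data)
loss_after = mse(data, scale, shift)
```

In this toy setting the adapter learns to undo the per-timestep distortion, so `loss_after` falls below `loss_before` while `FROZEN_W` stays untouched. No source-domain data is used, which loosely mirrors the source-free setting described above.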
Cardiac MRI is clinically valuable but inherently slow, requiring many sequential measurements per frame to build a complete image. In dynamic MRI, where each time-frame must be acquired separately, this becomes especially limiting. This work presents a pipeline built on top of the TEAM-PILOT model that learns to interpolate videos directly in the frequency domain, generating phase-consistent intermediate frames and effectively enlarging the training dataset without any new acquisitions. The approach addresses two challenges simultaneously: accelerating scan time and alleviating data scarcity, which is a general bottleneck in deep learning for medical imaging. We demonstrate that combining 2-shot acquisition (the shot count is proportional to the amount of signal acquired) with 4x temporal densification matches standard 8-shot reconstruction quality across multiple state-of-the-art architectures, achieving a 4x reduction in scan time with no significant loss in image quality.
Real-world deployment of deep learning violates the assumptions of i.i.d. training data and aggregate-metric evaluation: observations are dependent in time, space, and across users; decision latency is bounded by milliseconds; and the cost of a wrong prediction is rarely symmetric. My doctoral research argues that the structural priors hardest for a deep network to recover from data are precisely the ones that should be encoded into its architecture. The position can be stated in one line: what I cannot afford to learn, I encode.
The eleven papers in the dissertation span communication systems, physiological monitoring, physics-informed learning, clinical decision support, and large language models. Across them I make three claims. First, architectural priors generalize when they match the structure of the data: the dissertation traces a continuum from priors in the problem formulation, to priors in features and representations, to priors in the architecture, to priors that are the governing equation. Second, this claim is Pareto rather than monotone: aligned priors dominate where parameters are scarce and are dominated where they are not, a scope condition anchored at a clean empirical crossover in the language-model chapters. Third, aggregate metrics are the wrong audit: calibrated diagnostics (matched effective sample size, effective residual-stream depth, the shuffle gap, cost-conditional thresholds) repeatedly reorder conclusions drawn from AUC, perplexity, and fixed-length comparisons.
The seminar develops the first claim through three case studies, one at each depth of the continuum: AARL, a DRL scheduler for 5G millimeter-wave networks (prior in the problem formulation); Lab to Wrist, a neural-ODE and neural-Kalman framework that embeds cardiovascular physics architecturally for heart-rate and oxygen-consumption prediction on wearables (prior in the architecture); and a differentiable Randers-Finsler eikonal solver applied to cross-scene wildfire propagation (prior is the governing PDE itself).