Session Chairs: Sebastian Angel, University of Pennsylvania, and Malte Schwarzkopf, Brown University, Ishtiyaque Ahmad, Yuntian Yang, Divyakant Agrawal, Amr El Abbadi, and Trinabh Gupta, University of California Santa Barbara. Writing a correct operating system kernel is notoriously hard. Pollux simultaneously considers both aspects. Currently, for large graphs, CPU servers offer the best performance-per-dollar over GPU servers. The OSDI '21 program co-chairs have agreed not to submit their work to OSDI '21. Marius is open-sourced at www.marius-project.org. The overhead of GPT is 5% for memory-intensive workloads (e.g., Redis) and negligible for CPU-intensive workloads (e.g., RV8 and Coremarks). His work has included the Barrelfish multikernel research OS, as well as work on distributed stream processors, and using formal specifications to describe the hardware/software interfaces of modern computer systems. Authors are also encouraged to contact the program co-chairs, osdi21chairs@usenix.org, if needed to relate their OSDI submissions to relevant submissions of their own that are simultaneously under review or awaiting publication at other venues. This is unfortunate because good OS design has always been driven by the underlying hardware, and right now that hardware is almost unrecognizable from ten years ago, let alone from the 1960s when Unix was written. This paper demonstrates that it is possible to achieve s-scale latency using Linux kernel storage stack, even when tens of latency-sensitive applications compete for host resources with throughput-bound applications that perform read/write operations at throughput close to hardware capacity. Papers not meeting these criteria will be rejected without review, and no deadline extensions will be granted for reformatting. In addition, CLP outperforms Elasticsearch and Splunk Enterprise's log ingestion performance by over 13x, and we show CLP scales to petabytes of logs. We also show that Marius can scale training to datasets an order of magnitude beyond a single machine's GPU and CPU memory capacity, enabling training of configurations with more than a billion edges and 550 GB of total parameters on a single machine with 16 GB of GPU memory and 64 GB of CPU memory. (Jan 2019) Our REPT paper won a best paper at OSDI'18 (Oct 2018) I will serve in the SOSP'19 PC. They collectively make the backup fresh, columnar, and fault-tolerant, even facing millions of concurrent transactions per second. People often assume that blockchain has Byzantine robustness, so adding it to any system will make that system super robust against any calamity. DeSearch then introduces a witness mechanism to make sure the completed tasks can be reused across different pipelines, and to make the final search results verifiable by end users. Academic and industrial participants present research and experience papers that cover the full range of theory . 23 artifacts received the Artifacts Functional badge (88%). For example, talks may be shorter than in prior years, or some parts of the conference may be multi-tracked. Submitted November 12, 2021 Accepted January 20, 2022. Hence, kernel developers are constantly refining synchronization within OS kernels to improve scalability at the risk of introducing subtle bugs. Lukas Burkhalter, Nicolas Kchler, Alexander Viand, Hossein Shafagh, and Anwar Hithnawi, ETH Zrich. We introduce a hybrid cryptographic protocol for privacy-adhering transformations of encrypted data. It then feeds those invariants and the desired safety properties to an SMT solver to check if the conjunction of the invariants and the safety properties is inductive. Papers must be in PDF format and must be submitted via the submission form. We have made Fluffy publicly available at https://github.com/snuspl/fluffy to contribute to the security of Ethereum. Pollux is implemented and publicly available as part of an open-source project at https://github.com/petuum/adaptdl. The key insight guiding our design is computation separation. Submissions violating the detailed formatting and anonymization rules will not be considered for review. The key insight in blk-switch is that Linux's multi-queue storage design, along with multi-queue network and storage hardware, makes the storage stack conceptually similar to a network switch. Commonly used log archival and compression tools like Gzip provide high compression ratio, yet searching archived logs is a slow and painful process as it first requires decompressing the logs. Sanitizers detect unsafe actions such as invalid memory accesses by inserting checks that are validated during a programs execution. Log search and log archiving, despite being critical problems, are mutually exclusive. JEL codes: Q18, Q28, Q57 . OSDI '21 Technical Sessions All the times listed below are in Pacific Daylight Time (PDT). This paper presents Dorylus: a distributed system for training GNNs. Even the little publishable OS work that is not based on Linux still assumes the same simplistic hardware model (essentially a multiprocessor VAX) that bears little resemblance to modern reality. A.H. Hunter, Jane Street Capital; Chris Kennelly, Paul Turner, Darryl Gove, Tipp Moseley, and Parthasarathy Ranganathan, Google. When registering your abstract, you must provide information about conflicts with PC members. USENIX discourages program co-chairs from submitting papers to the conferences they organize, although they are allowed to do so. This paper presents the design and implementation of CLP, a tool capable of losslessly compressing unstructured text logs while enabling fast searches directly on the compressed data. HotNets provides a venue for discussing innovative ideas and for debating future research agendas in networking. This paper presents Zeph, a system that enables users to set privacy preferences on how their data can be shared and processed. Although SSDs can be simplified under the current ZNS interface, its counterpart LFS must bear segment compaction overhead. With her students, she had led research in AI, with a focus on robotics and machine learning, having concretely researched and developed a variety of autonomous robots, including teams of soccer robots, and mobile service robots. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 14-16, 2021. We present DistAI, a data-driven automated system for learning inductive invariants for distributed protocols. The hybrid segment recycling chooses a proper block reclaiming policy between segment compaction and threaded logging based on their costs. We propose a new framework for computing the embeddings of large-scale graphs on a single machine. We develop rigorous theoretical foundations to simplify equivalence examination and correction for partially equivalent transformations, and design an efficient search algorithm to quickly discover highly optimized programs by combining fully and partially equivalent optimizations at the tensor, operator, and graph levels. After request completion, an I/O device must decide either to minimize latency by immediately firing an interrupt or to optimize for throughput by delaying the interrupt, anticipating that more requests will complete soon and help amortize the interrupt cost. Typically, monolithic kernels share state across cores and rely on one-off synchronization patterns that are specialized for each kernel structure or subsystem. We develop MAGE, an execution engine for SC that efficiently runs SC computations that do not fit in memory. Consensus bugs are bugs that make Ethereum clients transition to incorrect blockchain states and fail to reach consensus with other clients. Dorylus is up to 3.8 faster and 10.7 cheaper compared to existing sampling-based systems. USENIX Security '21 has three submission deadlines. And yet, they continue to rely on centralized search engines and indexers to help users access the content they seek and navigate the apps. Sep 2021 - Present 1 year 7 months. Tao Luo, Mingen Pan, Pierre Tholoniat, Asaf Cidon, and Roxana Geambasu, Columbia University; Mathias Lcuyer, Microsoft Research. GoJournals goal is to bring the advantages of journaling for code to specs and proofs. A graph embedding is a fixed length vector representation for each node (and/or edge-type) in a graph and has emerged as the de-facto approach to apply modern machine learning on graphs. Professor Veloso earned a Bachelor and Master of Science degrees in Electrical and Computer Engineering from Instituto Superior Tecnico in Lisbon, Portugal, a Master of Arts in Computer Science from Boston University, and Master of Science and PhD in Computer Science from Carnegie Mellon University. Evaluations show that Vegito can perform 1.9 million TPC-C NewOrder transactions and 24 TPC-H-equivalent queries per second simultaneously, which retain the excellent performance of specialized OLTP and OLAP counterparts (e.g., DrTM+H and MonetDB). Instead of choosing among a small number of known algorithms, our approach searches in a "policy space" of fine-grained actions, resulting in novel algorithms that can outperform existing algorithms by specializing to a given workload. Erhu Feng, Xu Lu, Dong Du, Bicheng Yang, and Xueqiang Jiang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Yubin Xia, Binyu Zang, and Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. First, GNNAdvisor explores and identifies several performance-relevant features from both the GNN model and the input graph, and use them as a new driving force for GNN acceleration. We present DPF (Dominant Private Block Fairness) a variant of the popular Dominant Resource Fairness (DRF) algorithmthat is geared toward the non-replenishable privacy resource but enjoys similar theoretical properties as DRF. Accepted papers will be allowed 14 pages in the proceedings, plus references. Devices employ adaptive interrupt coalescing heuristics that try to balance between these opposing goals. This budget is a scarce resource that must be carefully managed to maximize the number of successfully trained models. Distributed systems are notoriously hard to implement correctly due to non-determinism. This kernel is scaled across NUMA nodes using node replication, a scheme inspired by state machine replication in distributed systems. OSDI brings together professionals from academic and industrial backgrounds in what has become a premier forum for discussing the design, implementation, and implications of systems software. Proceedings Front Matter Submitted papers must be no longer than 12 single-spaced 8.5 x 11 pages, including figures and tables, plus as many pages as needed for references, using 10-point type on 12-point (single-spaced) leading, two-column format, Times Roman or a similar font, within a text block 7 wide x 9 deep. We will look at various problems and approaches, and for each, see if blockchain would help. This talk will discuss several examples with very different solutions. Message from the Program Co-Chairs. She is the recipient of several best paper awards, the Einstein Chair of the Chinese Academy of Science, the ACM/SIGART Autonomous Agents Research Award, an NSF Career Award, and the Allen Newell Medal for Excellence in Research. When further combined with a simple caching strategy, our evaluation shows that P3 is able to outperform existing state-of-the-art distributed GNN frameworks by up to 7. Some recent schedulers choose job resources for users, but do so without awareness of how DL training can be re-optimized to better utilize the provided resources. Pollux promotes fairness among DL jobs competing for resources based on a more meaningful measure of useful job progress, and reveals a new opportunity for reducing DL cost in cloud environments. A PC member is a conflict if any of the following three circumstances applies: Institution: You are currently employed at the same institution, have been previously employed at the same institution within the past two years (not counting concluded internships), or are going to begin employment at the same institution during the review period. Title Page, Copyright Page, and List of Organizers | Registering abstracts a week before paper submission is an essential part of the paper-reviewing process, as PC members use this time to identify which papers they are qualified to review. Extensive experiments show that GNNAdvisor outperforms the state-of-the-art GNN computing frameworks, such as Deep Graph Library (3.02 faster on average) and NeuGraph (up to 4.10 faster), on mainstream GNN architectures across various datasets. High-performance tensor programs are critical for efficiently deploying deep neural network (DNN) models in real-world tasks. In this paper, we present P3, a system that focuses on scaling GNN model training to large real-world graphs in a distributed setting. In particular, I'll argue for re-engaging with what computer hardware really is today and give two suggestions (among many) about how the OS research community can usefully do this, and exploit what is actually a tremendous opportunity. Sam Kumar, David E. Culler, and Raluca Ada Popa, University of California, Berkeley. Zeph executes privacy-adhering data transformations in real-time and scales to thousands of data sources, allowing it to support large-scale low-latency data stream analytics. Moreover, to handle dynamic workloads, Nap adopts a fast NAL switch mechanism. She also has made contributions in network security, including scalable data expiration, distributed algorithms despite malicious participants, and DDOS prevention techniques. To enable FL developers to interpret their results in model testing, Oort enforces their requirements on the distribution of participant data while improving the duration of federated testing by cherry-picking clients. Machine learning (ML) models trained on personal data have been shown to leak information about users. Our approach outperforms existing file systems on a block SSD by a wide margin 6.2 on average for metadata-intensive benchmarks. Kyuhwa Han, Sungkyunkwan University and Samsung Electronics; Hyunho Gwak and Dongkun Shin, Sungkyunkwan University; Jooyoung Hwang, Samsung Electronics. We present Storm, a web framework that allows developers to build MVC applications with compile-time enforcement of centrally specified data-dependent security policies. The program co-chairs will use this information at their discretion to preserve the anonymity of the review process without jeopardizing the outcome of the current OSDI submission. Professor Veloso is on leave from Carnegie Mellon University as the Herbert A. Simon University Professor in the School of Computer Science, and the past Head of the Machine Learning Department. Please identify yourself as a presenter and include your mailing address in your email. . For conference information, see: . For general conference information, see https://www.usenix.org/conference/osdi22. Based on the observation that real-world workloads always feature skewed access patterns, Nap introduces a NUMA-aware layer (NAL) on the top of existing concurrent PM indexes, and steers accesses to hot items to this layer. We built an FPGA prototype of the nanoPU fast path by modifying an open-source RISC-V CPU, and evaluated its performance using cycle-accurate simulations on AWS FPGAs. Compared to existing baselines, DPF allows training more models under the same global privacy guarantee. Shaghayegh Mardani, UCLA; Ayush Goel, University of Michigan; Ronny Ko, Harvard University; Harsha V. Madhyastha, University of Michigan; Ravi Netravali, Princeton University. We also verified a simple NFS server using GoJournals specs, which confirms that they are helpful for application verification: a significant part of the proof doesnt have to consider concurrency and crashes. Nico Lehmann and Rose Kunkel, UC San Diego; Jordan Brown, Independent; Jean Yang, Akita Software; Niki Vazou, IMDEA Software Institute; Nadia Polikarpova, Deian Stefan, and Ranjit Jhala, UC San Diego. Therefore, developers typically find data locality issues via dynamic profiling and repair them manually. To adapt to different workloads, prior works mix or switch between a few known algorithms using manual insights or simple heuristics. Overall, the OSDI PC accepted 31 out of 165 submissions. Kirk Rodrigues, Yu Luo, and Ding Yuan, University of Toronto and YScope Inc. This yielded 6% fewer TLB miss stalls, and 26% reduction in memory wasted due to fragmentation. The full program will be available in May 2021. To remedy this, we introduce DeSearch, the first decentralized search engine that guarantees the integrity and privacy of search results for decentralized services and blockchain apps. 1 Acknowledgements: Paper prepared for the post-conference workshop on Food for Thought: Economic Analysis in Anticipation of the Next Farm Bill at the Agricultural and Applied Economics Association annual meeting, Austin, TX . We first introduce two new hardware primitives: 1) Guarded Page Table (GPT), which protects page table pages to support page-level secure memory isolation; 2) Mountable Merkle Tree (MMT), which supports scalable integrity protection for secure memory. Welcome to the 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI '22) submissions site. Finding the inductive invariant of the distributed protocol is a critical step in verifying the correctness of distributed systems, but takes a long time to do even for simple protocols. Youngseok Yang, Seoul National University; Taesoo Kim, Georgia Institute of Technology; Byung-Gon Chun, Seoul National University and FriendliAI. Papers accompanied by nondisclosure agreement forms will not be considered. Author Response Period Her specialties include network routing protocols and network security. Forgot your password? Using this property, MAGE calculates the memory access pattern ahead of time and uses it to produce a memory management plan. Advisor: You have a past or present association as thesis advisor or advisee. Camera-ready submission (all accepted papers): 15 Mars 2022. We build Polyjuice based on our learning framework and evaluate it against several existing algorithms. Thanks to selective profiling, DMons profiling overhead is 1.36% on average, making it feasible for production use. We present the nanoPU, a new NIC-CPU co-design to accelerate an increasingly pervasive class of datacenter applications: those that utilize many small Remote Procedure Calls (RPCs) with very short (s-scale) processing times. SC is being increasingly adopted by industry for a variety of applications. Yuke Wang, Boyuan Feng, Gushu Li, Shuangchen Li, Lei Deng, Yuan Xie, and Yufei Ding, University of California, Santa Barbara. We argue that a key-value interface between a file system and an SSD is superior to the legacy block interface by presenting KEVIN. blk-switch evaluation over a variety of scenarios shows that it consistently achieves s-scale average and tail latency (at both 99th and 99.9th percentiles), while allowing applications to near-perfectly utilize the hardware capacity. These results outperform state-of-the-art HTAP systems by several orders of magnitude on transactional performance, while just incurring little performance slowdown (5% over pure OLTP workloads) and still enjoying data freshness for analytical queries (less than 20 ms of maximum delay) in the failure-free case. As a member of ACCT, I have served two years on the bylaws and governance committee and two years on the finance and audit committee. She has a PhD in computer science from MIT. Ankit Bhardwaj and Chinmay Kulkarni, University of Utah; Reto Achermann, University of British Columbia; Irina Calciu, VMware Research; Sanidhya Kashyap, EPFL; Ryan Stutsman, University of Utah; Amy Tai and Gerd Zellweger, VMware Research. For conference information, . Starting with small invariant formulas and strongest possible invariants avoids large SMT queries, improving SMT solver performance. Swapnil Gandhi and Anand Padmanabha Iyer, Microsoft Research. Such centralized engines are in a perfect position to censor content and violate users privacy, undermining some of the key tenets behind decentralization. Only two types of supplementary material are permitted: source code described in the paper and formal proofs sketched in the paper. By submitting a paper, you agree that at least one of the authors will attend the conference to present it. USENIX NSDI, 2021 Acceptance Rate: 15.99% Fluid: Resource-Aware Hyperparameter Tuning Engine P. Yu*, J. Liu*, M. Chowdhury (*Equal contribution) MLSys, 2021 Acceptance Rate: 23.53% NetLock: Fast, Centralized Lock Management Using Programmable Switches Z. Yu, Y. Zhang, V. Braverman, M. Chowdhury, X. Jin ACM SIGCOMM, 2020 Acceptance Rate: 21.6% We observe that scalability challenges in training GNNs are fundamentally different from that in training classical deep neural networks and distributed graph processing; and that commonly used techniques, such as intelligent partitioning of the graph do not yield desired results. Proceedings Cover | We demonstrate the above using design, implementation and evaluation of blk-switch, a new Linux kernel storage stack architecture. Novel system designs, thorough empirical work, well-motivated theoretical results, and new application areas are all . Haojie Wang, Jidong Zhai, Mingyu Gao, Zixuan Ma, Shizhi Tang, and Liyan Zheng, Tsinghua University; Yuanzhi Li, Carnegie Mellon University; Kaiyuan Rong and Yuanyong Chen, Tsinghua University; Zhihao Jia, Carnegie Mellon University and Facebook. MAGE outperforms the OS virtual memory system by up to an order of magnitude, and in many cases, runs SC computations that do not fit in memory at nearly the same speed as if the underlying machines had unbounded physical memory to fit the entire computation. HotCRP.com signin Sign in using your HotCRP.com account. Furthermore, by combining SanRazor with an existing sanitizer reduction tool ASAP, we show synergistic effect by reducing the runtime cost to only 7.0% with a reasonable tradeoff of security. She is the author of the textbook Interconnections (about network layers 2 and 3) and coauthor of Network Security. For any further information, please contact the PC chairs: pc-chairs-2022@eurosys.org. Mingyu Li, Jinhao Zhu, and Tianxu Zhang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Cheng Tan, Northeastern University; Yubin Xia, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Sebastian Angel, University of Pennsylvania; Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. Prior or concurrent publication in non-peer-reviewed contexts, like arXiv.org, technical reports, talks, and social media posts, is permitted. Paper Submission Information All submissions must be received by 11:59 PM AoE (UTC-12) on the day of the corresponding deadline. The 20th ACM Workshop on Hot Topics in Networks (HotNets 2021) will bring together researchers in computer networks and systems to engage in a lively debate on the theory and practice of computer networking. Submission of a response is optional. Prepublication versions of the accepted papers from the summer submission deadline are available below. Second, Fluffy uses multiple existing Ethereum clients that independently implement the specification as cross-referencing oracles. CLP's gains come from using a tuned, domain-specific compression and search algorithm that exploits the significant amount of repetition in text logs. Indeed, it is a prime target for powerful adversaries such as nation states. While compiler-based techniques have been proposed to improve data locality, they depend on heuristics, which can sometimes hurt performance. For instance, FAST 21 and NSDI 21 have author-notification dates after the OSDI 21 abstract-registration deadline. By monitoring the status of each job during training, Pollux models how their goodput (a novel metric we introduce that combines system throughput with statistical efficiency) would change by adding or removing resources. Existing algorithms are designed to work well for certain workloads. Our evaluation shows that DistAI successfully verifies 13 common distributed protocols automatically and outperforms alternative methods both in the number of protocols it verifies and the speed at which it does so, in some cases by more than two orders of magnitude. She also invented the spanning tree algorithm, which transformed Ethernet from a technology that supported a few hundred nodes, to something that can support large networks. OSDI 2021 papers summary. Yet, existing efforts randomly select FL participants, which leads to poor model and system efficiency. OSDI takes a broad view of the systems area and solicits contributions from many fields of systems practice, including, but not limited to, operating systems, file and storage systems, distributed systems, cloud computing, mobile systems, secure and reliable systems, systems aspects of big data, embedded systems, virtualization, networking as it relates to operating systems, and management and troubleshooting of complex systems. Jiachen Wang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Ding Ding, Department of Computer Science, New York University; Huan Wang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Conrad Christensen, Department of Computer Science, New York University; Zhaoguo Wang and Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Jinyang Li, Department of Computer Science, New York University. For more details on the submission process, and for templates to use with LaTeX, Word, etc., authors should consult the detailed submission requirements. Although the number of submissions is lower than the past, it's likely only due to the late announcement; being in my first OSDI PC, I think the quality of the submitted and accepted papers remains as high as ever. Paper abstracts and proceedings front matter are available to everyone now. Manuela M. Veloso is the Head of J.P. Morgan AI Research, which pursues fundamental research in areas of core relevance to financial services, including data mining and cryptography, machine learning, explainability, and human-AI interaction. Radia Perlman is a Fellow at Dell Technologies. Contact your program co-chairs, osdi21chairs@usenix.org, or the USENIX office, submissionspolicy@usenix.org. Professor Veloso is the Past President of AAAI (the Association for the Advancement of Artificial Intelligence), and the co-founder, Trustee, and Past President of RoboCup. PET then automatically corrects results to restore full equivalence. Authors may use this for content that may be of interest to some readers but is peripheral to the main technical contributions of the paper. Horcrux-compliant web servers perform offline analysis of all the JavaScript code on any frame they serve to conservatively identify, for every JavaScript function, the union of the page state that the function could access across all loads of that page. Unfortunately, because devices lack the semantic information about which I/O requests are latency-sensitive, these heuristics can sometimes lead to disastrous results. Abstract registrations that do not provide sufficient information to understand the topic and contribution (e.g., empty abstracts, placeholder abstracts, or trivial abstracts) will be rejected, thereby precluding paper submission. Researchers from the Software Systems Laboratory bagged Best Paper Awards at the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2021) and the 2021 USENIX Annual Technical Conference (USENIX ATC 2021).. Jay Lepreau Best Paper Award, OSDI'21. For instance, the following are not sufficient grounds to specify a conflict with a PC member: they have reviewed the work before, they are employed by your competitor, they are your personal friend, they were your post-doc advisor or advisee, or they had the same advisor as you. Here, we focus on hugepage coverage. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, vendors and teachers of operating system technology.