    Legate

    Legate is a framework and runtime with an ecosystem of libraries that democratize distributed accelerated computing. It is designed to let all users, including researchers, scientists, data engineers, and AI innovators, scale their applications and scientific projects without having to deal with the complexity of parallel and distributed computing. With Legate and the Legate libraries, users can write a sequential program on a laptop and run that same code, unchanged, on a workstation with GPUs, on GPU instances in public clouds, or on multi-GPU/multi-node supercomputers. AI/ML innovators, scientific computing practitioners, and data scientists can develop and test programs on moderately sized datasets on local machines, then immediately scale up to larger datasets on many nodes in the cloud or on a supercomputer without any code modifications.
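    For illustration, here is the kind of program this enables: a standard NumPy-style script whose only change is the import. This is a minimal sketch that assumes the cupynumeric package is installed; every call shown is standard NumPy API that cuPyNumeric mirrors, and the Legate runtime decides how the work is partitioned and placed.

        # Swap this import back to "import numpy as np" and the program
        # still runs on plain NumPy; with cuPyNumeric, the Legate runtime
        # partitions the arrays and schedules the work on the available
        # CPUs, GPUs, or nodes.
        import cupynumeric as np

        A = np.random.rand(4096, 4096)   # runtime-managed, partitioned array
        x = np.random.rand(4096)

        y = A @ x                        # distributed matrix-vector product
        print(float(y.sum()))            # distributed reduction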


    Benefits

    Productivity

    Boost researcher and developer productivity through implicit parallelism, eliminating the learning curve of GPU programming, parallel processing, and distributed computing.

    Scalability

    Scale from a single laptop to a distributed parallel architecture at data center scale with thousands of nodes and tens of thousands of CPUs and GPUs.

    Flexibility

    Transparently extend popular tools like NumPy, JAX, and RAPIDS on a common foundation that remains compatible and composable, and enable third parties to build libraries on the common core.


    Key Features

    Implicit Parallelism

    Convert simple sequential programs into parallel tasks implicitly.

    cuPyNumeric provides three levels of implicit parallelism:

    1. Asynchronous execution of independent computations (tasks).

    2. Tasks that can be broken into sub-tasks and executed by different processes.

    3. Kernel-level parallelism using CUDA kernels.
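    As a sketch of the first level, independent computations in an ordinary-looking program become tasks that the runtime is free to overlap. The example below assumes cuPyNumeric is installed; the schedule itself is chosen by the runtime, not the programmer.

        import cupynumeric as np

        a = np.random.rand(2048, 2048)
        b = np.random.rand(2048, 2048)

        # These two reductions do not depend on each other, so the runtime
        # may execute them as independent, asynchronous tasks.
        row_sums = a.sum(axis=0)
        col_sums = b.sum(axis=1)

        # This result depends on both, so the runtime orders it after them.
        total = row_sums.sum() + col_sums.sum()
        print(float(total))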

    Unified Data Abstraction

    The Legate framework offers a unified data abstraction for Legate libraries.

    The Legate runtime manages data partitioning, distribution, and synchronization, providing a strong guarantee of data coherency.

    Different libraries are designed to share in-memory buffers of data without unnecessary copying.
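    As a small illustration of this coherency, with cuPyNumeric as the example library, a slice is a view over the same runtime-managed store as its parent, so updates stay visible without explicit copies or synchronization:

        import cupynumeric as np

        a = np.zeros((1000, 1000))

        # A slice is a view over the same runtime-managed data, not a copy.
        top_half = a[:500, :]
        top_half += 1.0

        # The parent array observes the update; the runtime keeps the shared
        # data coherent regardless of which processors executed the work.
        print(float(a.sum()))  # 500 * 1000 = 500000.0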

    Composable Libraries

    Use cuPyNumeric and Legate Sparse for scientific computing.

    Legate IO for distributed, high-throughput I/O.

    Legate Rapids for data scientists and data engineers.

    Legate backend for JAX for AI/ML and large language models (LLMs).

    All libraries share a unified and coherent data representation.
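    As a hedged sketch of this composition, dense data produced with cuPyNumeric can be handed to a SciPy-style sparse matrix from Legate Sparse without an explicit copy. The module name legate_sparse and the csr_matrix constructor are assumptions based on the library's SciPy-like interface; check the Legate Sparse documentation for the exact import and API.

        import cupynumeric as np
        import legate_sparse as sparse  # assumed module name; see the Legate Sparse docs

        dense = np.eye(1024) * 2.0      # dense data built with cuPyNumeric
        x = np.ones(1024)

        # Legate Sparse consumes the same runtime-managed data representation,
        # so constructing the sparse matrix does not require a host-side copy.
        A = sparse.csr_matrix(dense)    # SciPy-style constructor (assumed)
        y = A @ x                       # distributed sparse matrix-vector product
        print(float(y.sum()))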

    Friendly User Interface 

    A single command starts a distributed launch on supercomputers. Run within Jupyter Notebooks and Kubernetes for cloud-native infrastructure. A Dask bootstrap is available for running on Dask clusters.

    Flexible Infrastructure

    CPU and GPU, x86, and ARM.
    Workstation, supercomputer, and public clouds.

    Efficient Communication

    GPUDirect for communication.
    NVLink, InfiniBand, Slingshot, and EFA. Automatically chooses the fastest data path.

    Profiling & Debugging

    Dataflow and event graphs generated for each run.

    Timelines of the program's execution.
    Nsight Systems support.

    SDK to Extend

    Legate provides C++ and Python APIs that developers can use to build composable libraries on Legate.
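    As a rough sketch of the Python side of this SDK, a library-level operation can be written as a Legate task that the runtime partitions and schedules. The decorator, store annotations, and data-access calls below follow the Python task interface described in the Legate documentation, but treat the exact names and signatures as assumptions and confirm them against the current docs.

        # Hedged sketch of a custom Python task; names are assumptions based on
        # the Legate Python task API and should be verified against the docs.
        import numpy as np
        from legate.core.task import task, InputStore, OutputStore


        @task
        def scale(src: InputStore, dst: OutputStore, factor: float) -> None:
            # Each task instance receives only its own partition of the stores;
            # the body operates on local data handed to it by the runtime.
            src_arr = np.asarray(src.get_inline_allocation())
            dst_arr = np.asarray(dst.get_inline_allocation())
            dst_arr[...] = src_arr * factor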


    Product Overview

    As datasets and applications continue to increase in size and complexity, there's a growing need to harness computational resources far beyond what a single CPU-only system can provide. Legate provides an abstraction layer and a runtime system that together enable scalable implementations of popular domain-specific APIs. It offers an API similar to Apache Arrow but guarantees stronger data coherence and synchronization to aid library developers.

    Legate enables major use cases, including AI/ML, scientific computing, and advanced data analytics to run efficiently in a large-scale, multi-GPU/multi-node infrastructure.

    Legate distributed framework and runtime for accelerated computing

    Scientific Computing

    Legate provides augmented alternatives to popular scientific computing libraries like NumPy, SciPy, HDF5, and others, adding seamless multi-GPU/multi-node support. You can use familiar tools to write research programs and let Legate scale them from CPU to GPU to multi-GPU/multi-node without changing any code.
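    For example, a small Jacobi-style relaxation written against the NumPy API runs unchanged from a laptop CPU to multi-GPU nodes once the import points at cuPyNumeric (a sketch assuming cuPyNumeric is installed; the runtime partitions the grid and handles the exchanges between partitions):

        import cupynumeric as np

        n = 1024
        grid = np.zeros((n, n))
        grid[0, :] = 1.0  # fixed boundary condition along one edge

        for _ in range(100):
            # Average of the four neighbors over the interior points; the
            # runtime distributes the array and the stencil updates.
            grid[1:-1, 1:-1] = 0.25 * (
                grid[:-2, 1:-1] + grid[2:, 1:-1] + grid[1:-1, :-2] + grid[1:-1, 2:]
            )

        print(float(grid.mean()))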

    Data Science & Analytics

    Legate Dataframe helps data scientists scale their applications to larger, text-heavy datasets and demanding workloads, complementing existing CPU- and GPU-accelerated data science and AI libraries.

    AI/ML (coming soon…)

    NVIDIA's Legate backend for JAX supports popular and future models. Whether you're building on LLMs like Llama 3, mixture-of-experts (MoE) architectures, or new models, the Legate backend for JAX enables flexible parallelism strategies for AI/DL training, achieving state-of-the-art performance on thousands of GPUs.


    GTC Videos on Demand

    Legate: A Productive Programming Framework for Composable, Scalable, Accelerated Libraries
    GTC24 Talk Session

    How to Create a Distributed GPU-Accelerated Python Library
    GTC23 Talk Session

    Legate: Scaling the Python Ecosystem
    GTC21 Talk Session

    Get Started with Legate

    Legate Core Documentation

    Full technical documentation with details about the Legate architecture, API calls, and developer tools can be found here.

    Legate on GitHub

    Legate software and documentation can be found in this GitHub repository. Check periodically for the latest updates and news.

    cuPyNumeric Library

    Learn more about the cuPyNumeric library for Python developers, providing scalable performance on multi-GPU/multi-node configurations.
