
    NVIDIA cuPyNumeric 25.03 Now Fully Open Source with PIP and HDF5 Support

    NVIDIA cuPyNumeric is a library that aims to provide a distributed and accelerated drop-in replacement for NumPy built on top of the Legate framework. It brings zero-code-change scaling to multi-GPU and multinode (MGMN) accelerated computing.
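
As a minimal sketch of what "drop-in replacement" means in practice, the snippet below changes only the import line and leaves the array code untouched. The `normalize_rows` helper is hypothetical and exists only for illustration, and the fallback to NumPy is an assumption added so the sketch also runs on a machine without cuPyNumeric.

```python
try:
    import cupynumeric as np  # assumes the nvidia-cupynumeric package is installed
except ImportError:
    import numpy as np  # fallback so the sketch still runs without cuPyNumeric


def normalize_rows(a):
    """Scale each row of a 2-D array to unit Euclidean length (hypothetical helper)."""
    norms = np.sqrt((a * a).sum(axis=1, keepdims=True))
    return a / norms


a = np.array([[3.0, 4.0], [6.0, 8.0]])
unit = normalize_rows(a)
print(unit)
```

The body of `normalize_rows` is ordinary NumPy-style code; only the import decides which library executes it.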

    cuPyNumeric 25.03 is a milestone update: the full stack is now open source, the library can be installed with pip, and HDF5 IO is supported natively. This post covers each change in turn.

    Full stack now open source

    With cuPyNumeric 25.03, NVIDIA open-sourced the Legate framework and runtime layer that powers cuPyNumeric under the Apache 2 license, so the entire cuPyNumeric stack is now available under that license. This move aligns with NVIDIA’s commitment to transparency, reproducibility, and collaboration. Contributors can now explore, audit, contribute to, and extend any component of the system without barriers.

    PIP install support

    cuPyNumeric has supported installation through conda from the start. Now users can also install it through pip with the following simple command:

    pip install nvidia-cupynumeric

    This simplifies setup significantly, making it easy to integrate cuPyNumeric into your workflows, virtual environments, and CI pipelines. All major dependencies except MPI are bundled or easily resolvable through PyPI.

    Together with Open MPI and UCX, the cuPyNumeric package on PyPI is multinode and multirank capable. Developers can use cuPyNumeric not only on a single node with multiple GPUs but also on multi-GPU multinode clusters.

    Example installation

    An example of installing and running cuPyNumeric on SLURM Clusters using the PyPI wheel packages is outlined in the following sections.

    Step 1: Environment setup

    After logging in to the cluster, load the essential environment modules, including CUDA and MPI. These are dependencies needed to run cuPyNumeric in a multinode or multirank environment. If they are not available on your cluster, install them manually or contact your system administrator to request installation.

    module purge # clear existing modules
    module load cuda # CUDA toolkit
    module load openmpi # Open MPI

    Next, create and activate a virtual environment (recommended). You can skip this step if you prefer to install the packages into your current Python environment.

    python -m venv legate
    source legate/bin/activate

    Step 2: Package installation

    Install cuPyNumeric and Legate using pip:

    pip install legate nvidia-cupynumeric
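
One way to confirm that the install step took effect is to query the package metadata from Python. The sketch below uses only the standard library. The `installed_version` helper is hypothetical, and the distribution names are taken from the pip command above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as they appear in the pip command above.
for name in ("legate", "nvidia-cupynumeric"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

Running this inside the activated virtual environment shows whether both packages are visible to that interpreter.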

    Step 3: Run applications

    Allocate interactive compute nodes using srun:

    # Request 2 compute nodes with 8 GPUs each on the given partition for
    # 30 minutes, then start an interactive shell.
    srun -p partition-name \
         -N 2 \
         --gres=gpu:8 \
         --time=00:30:00 \
         --pty bash

    Then run a cuPyNumeric program:

    # --gpus: GPUs per process; --ranks-per-node: processes per node;
    # --nodes: total nodes (matches -N above); --launcher: launch with MPI.
    legate --gpus 8 \
           --ranks-per-node 1 \
           --nodes 2 \
           --launcher mpirun \
           ./prog.py
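
The post does not show the contents of `prog.py`. As a hedged illustration, a program of this kind could look like the sketch below: a small Jacobi relaxation written with NumPy-style slicing. The function names, grid size, and iteration count are all assumptions, and the NumPy fallback is there only so the sketch runs without cuPyNumeric.

```python
try:
    import cupynumeric as np  # assumes the nvidia-cupynumeric package is installed
except ImportError:
    import numpy as np  # fallback so the sketch still runs without cuPyNumeric


def jacobi_step(grid):
    """Return a new grid whose interior cells are the mean of their four neighbors."""
    new = grid.copy()
    new[1:-1, 1:-1] = 0.25 * (
        grid[:-2, 1:-1] + grid[2:, 1:-1] + grid[1:-1, :-2] + grid[1:-1, 2:]
    )
    return new


def main(n=64, iters=10):
    """Relax an n x n grid with a fixed top boundary and return the sum of all cells."""
    grid = np.zeros((n, n))
    grid[0, :] = 1.0  # boundary condition on the top edge
    for _ in range(iters):
        grid = jacobi_step(grid)
    return float(grid.sum())


if __name__ == "__main__":
    print(main())
```

Because the code contains no explicit GPU or MPI calls, the same file can be run with plain `python` for a quick check or passed to the `legate` launcher as shown above.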

    Running with SLURM batch job submission is also supported:

    #!/bin/bash
    #SBATCH --job-name=cupynumeric
    #SBATCH --nodes=2
    #SBATCH --gres=gpu:8
    #SBATCH --time=00:30:00
     
    module load cuda openmpi
    source legate/bin/activate
     
    legate --gpus 8 \
           --ranks-per-node 1 \
           --nodes ${SLURM_NNODES} \
           --launcher mpirun \
           ./prog.py
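
When the same launch line is needed in several job scripts, it can help to assemble it programmatically. The sketch below builds the argument list from the flags used above. `build_legate_command` is a hypothetical helper, not part of Legate, and falling back to one node when `SLURM_NNODES` is unset is an assumption.

```python
import os
import shlex


def build_legate_command(program, gpus=8, ranks_per_node=1, nodes=None, launcher="mpirun"):
    """Assemble the legate command line shown above as an argument list."""
    if nodes is None:
        # Inside a Slurm allocation SLURM_NNODES is set; assume 1 node otherwise.
        nodes = int(os.environ.get("SLURM_NNODES", "1"))
    return [
        "legate",
        "--gpus", str(gpus),
        "--ranks-per-node", str(ranks_per_node),
        "--nodes", str(nodes),
        "--launcher", launcher,
        program,
    ]


cmd = build_legate_command("./prog.py", nodes=2)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run` directly, or printed with `shlex.join` and pasted into a batch script.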

    For more information, refer to the cuPyNumeric 25.03 installation guide.

    Native HDF5 IO support

    cuPyNumeric 25.03 provides native support for HDF5 over GPUDirect Storage, enabling efficient handling of large datasets and seamless interoperability with scientific computing environments. With HDF5, you can now read and persist complex data structures to disk in a compact, portable, and performant format.

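    from legate.core.io.hdf5 import from_file
    import cupynumeric as np

    # Read the "x" and "y" datasets from the HDF5 file
    x = from_file("data.h5", dataset_name="x")
    y = from_file("data.h5", dataset_name="y")
    # Convert them to cuPyNumeric arrays
    xx = np.asarray(x)
    yy = np.asarray(y)
    a = 8675.309

    # Compute a * x + y and write the result into yy in place
    yy[:] = a * xx + yy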

    This feature is especially beneficial for high-performance computing and data-intensive applications where IO efficiency is critical.
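
The snippet above expects a `data.h5` file containing datasets named `x` and `y`. One way to produce such a file for experimentation is with h5py and NumPy, as sketched below. Both libraries are assumptions (neither is mentioned in the post), `write_inputs` is a hypothetical helper, and the array contents are arbitrary.

```python
import os
import tempfile

import h5py
import numpy as np


def write_inputs(path, n=1024):
    """Write two 1-D float64 datasets named "x" and "y" to an HDF5 file."""
    x = np.arange(n, dtype=np.float64)
    y = np.ones(n, dtype=np.float64)
    with h5py.File(path, "w") as f:
        f.create_dataset("x", data=x)
        f.create_dataset("y", data=y)


# Write into a temporary directory so the sketch leaves the working tree alone.
path = os.path.join(tempfile.mkdtemp(), "data.h5")
write_inputs(path, n=8)

# Read one dataset back with h5py to confirm the file layout.
with h5py.File(path, "r") as f:
    x_back = f["x"][...]
print(x_back)
```

To feed the result to the cuPyNumeric snippet, write to `data.h5` in the working directory instead of a temporary path.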

    Get started

    NVIDIA cuPyNumeric 25.03 strengthens the cuPyNumeric foundation for both research and production environments. To learn more about the new features and capabilities in the 25.03 release, see the release notes. The team is grateful for the growing community and welcomes feedback, contributions, and ideas for future releases. Join the conversation by submitting issues directly to the nv-legate/cupynumeric GitHub repo.
