    Repositories list

    • Benchmark of n-dimensional sparse tensor contractions
      C++
      0000 Updated Jul 15, 2025
    • fastcc

      Public
      C++
      0000 Updated Jul 14, 2025
    • CSS
      0100 Updated Jun 27, 2025
    • CoNST

      Public
      C++
      0500 Updated May 6, 2025
    • Python
      0000 Updated Apr 10, 2025
    • triton

      Public
      Development repository for the Triton language and compiler
      C++
      2.1k000 Updated Jan 21, 2025
    • For benchmarking the Roller
      C++
      0000 Updated Dec 22, 2024
    • 0000 Updated Dec 19, 2024
    • Python
      0000 Updated Dec 15, 2024
    • Open deep learning compiler stack for cpu, gpu and specialized accelerators
      Python
      3.6k000 Updated Dec 11, 2024
    • This repository contains the figures, tables and source code in the ICS'24 paper: "Accelerated Auto-Tuning of GPU Kernels for Tensor Computations".
      Python
      0820 Updated Dec 5, 2024
    • ics24tvm

      Public
      This repository contains the source code in the ICS'24 paper: "Accelerated Auto-Tuning of GPU Kernels for Tensor Computations".
      Python
      0010 Updated Dec 5, 2024
    • STeF

      Public
      C++
      0000 Updated Nov 14, 2024
    • Repo forked from the official PyTorch implementation of SegFormer, a Vision Transformer-based semantic segmentation model. Optimized further with Triton for a high-resolution remote sensing image dataset (Agrivision).
      Python
      389000 Updated Aug 2, 2024
    • tvm

      Public
      Open deep learning compiler stack for cpu, gpu and specialized accelerators
      Python
      3.6k200 Updated Jul 6, 2024
    • tvm-auto

      Public
      Focus on autoTVM
      Python
      3.6k000 Updated Mar 3, 2024
    • perf-char

      Public
      Performance characterization of auto-tuning data.
      Python
      0000 Updated Feb 26, 2024
    • ytopt

      Public
      ytopt: machine-learning-based search methods for autotuning
      Python
      18000 Updated Aug 28, 2023
    • All benchmarks run using HPTD, in a single place.
      Cuda
      0100 Updated Jul 21, 2023
    • GNN-RDM

      Public
      Python
      17000 Updated Jul 14, 2023
    • A retargetable MLIR-based machine learning compiler and runtime toolkit.
      C++
      726000 Updated Jun 29, 2023
    • A fork of LLVM to carry temporary patches for the IREE project
      12000 Updated Jun 29, 2023
    • C++
      0300 Updated May 14, 2023
    • iree

      Public
      👻
      C++
      726000 Updated Mar 1, 2023
    • C++
      1001 Updated Nov 20, 2022
    • AE-PACT

      Public
      Cuda
      0000 Updated Aug 15, 2022
    • Useful tutorials and recipes for developers doing low-level work with the Graphcore IPU
      C++
      10000 Updated Jul 7, 2022
    • pytorch

      Public
      Tensors and Dynamic neural networks in Python with strong GPU acceleration
      C++
      25k100 Updated Mar 22, 2022
    • Slides & posters of published papers
      1000 Updated Mar 16, 2022
    • TLCBench

      Public
      Benchmark scripts for TVM
      Python
      28000 Updated Mar 15, 2022