Custom GPU kernel implementations in Triton for DL workflows

The project is structured as follows:

.
├── src
│   └── kernels
│       ├── __init__.py
│       ├── dropout.py
│       ├── softmax.py
│       └── ...
├── tests
│   ├── __init__.py
│   ├── dropout_test.py
│   ├── softmax_test.py
│   └── ...

Currently supported kernels:

Softmax
Dropout
Vector addition
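
For readers new to Triton, the sketch below emulates in plain Python the block-wise, masked access pattern that a 1D element-wise kernel such as vector addition typically follows. It is not the code in src/kernels: the names `BLOCK_SIZE`, `add_block` and `vector_add` are hypothetical, and the loop over program ids stands in for what would run in parallel on the GPU.

```python
BLOCK_SIZE = 4  # elements handled per program instance (illustrative value)


def add_block(x, y, out, pid, n_elements):
    """Emulate one program instance: add one block, skipping out-of-range offsets."""
    start = pid * BLOCK_SIZE
    for offset in range(start, start + BLOCK_SIZE):
        # This bounds check plays the role of the load/store mask in a kernel.
        if offset < n_elements:
            out[offset] = x[offset] + y[offset]


def vector_add(x, y):
    """Launch enough program instances to cover every element of x and y."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    n_elements = len(x)
    out = [0.0] * n_elements
    # Ceiling division gives the size of the 1D launch grid.
    num_programs = (n_elements + BLOCK_SIZE - 1) // BLOCK_SIZE
    for pid in range(num_programs):  # sequential here, parallel on a GPU
        add_block(x, y, out, pid, n_elements)
    return out


result = vector_add([1.0, 2.0, 3.0, 4.0, 5.0], [10.0, 20.0, 30.0, 40.0, 50.0])
```

With five elements and a block size of four, two program instances are launched and the second one touches only a single valid offset, which is why the mask is needed.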

More to be implemented. Coming up:

[ ] Matrix addition
[ ] Block-based fused softmax
[ ] Layer norm
[ ] Matrix multiplication
[ ] Fused attention

Testing kernel correctness

To run the tests, run:

make test
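
A kernel correctness test usually compares the kernel output against a trusted reference within a floating-point tolerance. The sketch below shows that pattern in plain Python, assuming a pytest-style test function. Everything in it is hypothetical rather than taken from tests/: `softmax_under_test` is a stand-in for a call into the Triton kernel, and `reference_softmax` and `assert_allclose` are illustrative helpers.

```python
import math


def reference_softmax(row):
    """Numerically stable softmax: subtract the row maximum before exponentiating."""
    row_max = max(row)
    exps = [math.exp(v - row_max) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_under_test(row):
    """Hypothetical stand-in for the kernel under test (log-sum-exp formulation)."""
    row_max = max(row)
    log_sum = row_max + math.log(sum(math.exp(v - row_max) for v in row))
    return [math.exp(v - log_sum) for v in row]


def assert_allclose(actual, expected, rel_tol=1e-6, abs_tol=1e-9):
    """Fail if any pair of elements differs by more than the given tolerances."""
    assert len(actual) == len(expected), "length mismatch"
    for i, (a, e) in enumerate(zip(actual, expected)):
        assert math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol), (
            f"mismatch at index {i}: {a} != {e}"
        )


def test_softmax_matches_reference():
    # Includes large inputs, where a naive exp() without max subtraction overflows.
    rows = ([0.0, 1.0, 2.0], [1000.0, 1000.0], [-5.0, 0.0, 5.0, 10.0])
    for row in rows:
        out = softmax_under_test(row)
        assert_allclose(out, reference_softmax(row))
        assert math.isclose(sum(out), 1.0, rel_tol=1e-9)
```

Exact equality is deliberately avoided: a GPU kernel may reorder floating-point reductions, so results typically match a reference only up to a small tolerance.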

Prepare your environment

python3 -m venv venv
source venv/bin/activate
make install_dev

simondanielsson/custom-triton-kernels

Folders and files

Latest commit

History

Repository files navigation

Custom GPU kernel implementations in Triton for DL workflows

Testing kernel correctness

Prepare your environment

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages