A Shared Encoder Approach to Multimodal Representation Learning

This repository contains the code needed to reproduce the experiments in the paper A Shared Encoder Approach to Multimodal Representation Learning.

Setup

To reproduce the experiments, first clone the repository and move into it:

git clone https://github.com/VectorInstitute/shared-encoder.git
cd shared-encoder

Then, install uv and run the following command from the repository root to install the project's dependencies:

uv sync -n --dev
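The mmlearn_run commands in the sections below assume the project's virtual environment is active. A minimal sketch, assuming uv created its default .venv in the repository root:

source .venv/bin/activate

Alternatively, each command can be prefixed with uv run (e.g., uv run mmlearn_run ...).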

Running Experiments

Pretraining

Before running a pretraining job, set the following environment variable to the root directory of the PMC-OA dataset:

export PMCOA_ROOT_DIR=/path/to/PMC-OA

Then run the following command to pretrain the model:

mmlearn_run --multirun \
    hydra.launcher.mem_per_cpu=5G \
    hydra.launcher.cpus_per_task=8 \
    hydra.launcher.partition=a40 \
    hydra.launcher.qos=normal \
    hydra.launcher.gres=gpu:4 \
    hydra.launcher.tasks_per_node=4 \
    hydra.launcher.nodes=2 \
    hydra.launcher.stderr_to_stdout=true \
    hydra.launcher.timeout_min=960 \
    '+hydra.launcher.additional_parameters={export: ALL}' \
    'hydra.searchpath=[pkg://shared_encoder.configs]' \
    +experiment=PMC-OA-SHARE-MT-20 \
    trainer.num_nodes=2 \
    experiment_name=PMC-OA-SHARE-MT-20-12Layers

Note: This command will schedule a job on a SLURM cluster. To run the job locally instead, remove all arguments that start with hydra.launcher, as well as the --multirun flag.
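For reference, a local equivalent of the pretraining command might look like the sketch below; since a local run uses a single machine, trainer.num_nodes is also set to 1 here:

mmlearn_run 'hydra.searchpath=[pkg://shared_encoder.configs]' \
    +experiment=PMC-OA-SHARE-MT-20 \
    trainer.num_nodes=1 \
    experiment_name=PMC-OA-SHARE-MT-20-12Layers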

Evaluation

Run the following command to evaluate a pretrained model on the test set of the PMC-OA dataset:

mmlearn_run --multirun \
    hydra.launcher.mem_per_cpu=5G \
    hydra.launcher.cpus_per_task=10 \
    hydra.launcher.partition=rtx6000 \
    hydra.launcher.qos=normal \
    hydra.launcher.gres=gpu:1 \
    hydra.launcher.tasks_per_node=1 \
    hydra.launcher.nodes=1 \
    hydra.launcher.stderr_to_stdout=true \
    hydra.launcher.timeout_min=60 \
    '+hydra.launcher.additional_parameters={export: ALL}' \
    'hydra.searchpath=[pkg://shared_encoder.configs]' \
    +experiment=PMC-OA-SHARE-MT-20 \
    job_type=eval \
    datasets@datasets.test=PMCOA \
    datasets.test.split=test \
    +datasets/transforms@datasets.test.transform=med_clip_vision_transform \
    +datasets/tokenizers@dataloader.test.collate_fn.batch_processors.text=HFCLIPTokenizer \
    experiment_name=ZSR-PMC-OA-SHARE-MT-20-12Layers \
    resume_from_checkpoint=<path_to_checkpoint>

Note: To evaluate the model on MIMIC-CXR, DeepEyeNet, and/or Quilt, set one or more of the following environment variables:

export DEY_ROOT_DIR=/path/to/DeepEyeNet/dataset
export MIMICIVCXR_ROOT_DIR=/path/to/MIMIC-CXR/dataset
export QUILT_ROOT_DIR=/path/to/Quilt
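As an illustration, evaluating on MIMIC-CXR would reuse the evaluation command above with the dataset overrides swapped out. The dataset config name MIMICIVCXR below is an assumption inferred from the corresponding environment variable, and the vision transform and tokenizer are carried over unchanged from the PMC-OA command; consult shared_encoder.configs for the exact identifiers:

export MIMICIVCXR_ROOT_DIR=/path/to/MIMIC-CXR/dataset

mmlearn_run 'hydra.searchpath=[pkg://shared_encoder.configs]' \
    +experiment=PMC-OA-SHARE-MT-20 \
    job_type=eval \
    datasets@datasets.test=MIMICIVCXR \
    datasets.test.split=test \
    +datasets/transforms@datasets.test.transform=med_clip_vision_transform \
    +datasets/tokenizers@dataloader.test.collate_fn.batch_processors.text=HFCLIPTokenizer \
    experiment_name=ZSR-MIMIC-CXR-SHARE-MT-20-12Layers \
    resume_from_checkpoint=<path_to_checkpoint>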
