python - Hydra submitit launcher plugin fails to import modules that can be imported normally when omitting the plugin

admin2025-04-30  0

I am using Hydra for config management in my experiments. I am trying to use the Hydra submitit launcher plugin to submit jobs automatically to a Slurm cluster.

The main config file looks like this:

defaults:
    - override hydra/launcher: slurm
foo: 1

The Slurm launcher config file looks like this:

defaults:
  - submitit_slurm

_target_: hydra_plugins.hydra_submitit_launcher.submitit_launcher.SlurmLauncher

submitit_folder: ${hydra.sweep.dir}/.submitit/%j
name: ${hydra.job.name}

The project structure is like this:

|--project
|  |--src
|  |  |--main.py
|  |  |--models
|  |  |  |--__init__.py
|  |  |  |--file1.py
|  |  |  |--file2.py
|  |--scripts
|  |  |--run.sh

In models/__init__.py I define what to import from models, and in main.py I use the import as usual: from models import func

THE PROBLEM: When I comment out the submitit launcher in the main config file, everything works smoothly, but when I uncomment it I get this error message:

submitit ERROR (2025-01-04 21:31:32,789) - Submitted job triggered an exception
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/a/.conda/envs/hinet/lib/python3.12/site-packages/submitit/core/_submit.py", line 11, in <module>
    submitit_main()
  File "/home/a/.conda/envs/hinet/lib/python3.12/site-packages/submitit/core/submission.py", line 76, in submitit_main
    process_job(args.folder)
  File "/home/a/.conda/envs/hinet/lib/python3.12/site-packages/submitit/core/submission.py", line 69, in process_job
    raise error
  File "/home/a/.conda/envs/hinet/lib/python3.12/site-packages/submitit/core/submission.py", line 52, in process_job
    delayed = utils.DelayedSubmission.load(paths.submitted_pickle)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/a/.conda/envs/hinet/lib/python3.12/site-packages/submitit/core/utils.py", line 153, in load
    obj = pickle_load(filepath)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/home/a/.conda/envs/hinet/lib/python3.12/site-packages/submitit/core/utils.py", line 232, in pickle_load
    return pickle.load(ifile)
           ^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'models'

Do you know what might cause this error and how to fix it?

Thanks!
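
For reference, here is a minimal sketch of the mechanism the traceback points at. It is not taken from submitit or Hydra. The traceback fails inside pickle.load, and pickle stores a function as a reference to its module name and function name, not as code, so loading it requires that module to be importable in the process that unpickles it. The sketch assumes the Slurm worker starts with a different sys.path than the shell that launched main.py (for example, without the src directory on it). The package name models_demo and the helper demo are hypothetical and exist only for this illustration.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path


def demo():
    """Pickle a function, then unpickle it with and without its package on sys.path."""
    with tempfile.TemporaryDirectory() as tmp:
        # Build a throwaway package (hypothetical name) that stands in for `models`.
        pkg = Path(tmp) / "models_demo"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("def func():\n    return 'ok'\n")

        # "Launching" side: the package is importable, so pickling succeeds.
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        module = importlib.import_module("models_demo")
        payload = pickle.dumps(module.func)  # stores only module and function name

        # Simulate a fresh worker whose sys.path lacks the package directory.
        sys.path.remove(tmp)
        del sys.modules["models_demo"]
        importlib.invalidate_caches()
        try:
            pickle.loads(payload)
            first = "loaded"
        except ModuleNotFoundError as exc:
            first = str(exc)

        # With the directory back on sys.path, the same payload loads fine.
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            second = pickle.loads(payload)()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("models_demo", None)
    return first, second


if __name__ == "__main__":
    print(demo())
```

If that assumption holds for this project, a possible direction is to make the src directory importable on the worker as well, for example by setting PYTHONPATH in the job environment or by installing the project as a package in the conda environment. Whether that matches the actual cause depends on how the job is started on the cluster.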
