
One compile error that should be fixed, and one improvement to reduce CUDA memory usage #22

Closed
@hongfangyu

Description


1. One compile error that should be fixed

When installing apex, compilation fails with four errors about converting unsigned long to long. To fix them, edit apex_22.01_pp/csrc/mlp.cpp:

(1) line 65:

    auto reserved_space = at::empty({reserved_size}, inputs[0].type());

change to:

    auto reserved_space = at::empty({static_cast<long>(reserved_size)}, inputs[0].type());

(2) line 138:

    auto work_space = at::empty({work_size / sizeof(scalar_t)}, inputs[0].type());

change to:

    auto work_space = at::empty({static_cast<long>(work_size / sizeof(scalar_t))}, inputs[0].type());

Alternatively, you can relax the compiler options instead of patching the source.

2. One improvement to reduce CUDA memory usage

When launching owl_demo.py on a GPU with 16 GB of memory, I ran into a CUDA out-of-memory error. I then edited lines 33 and 34 in interface.py:

    model = model.to(device)
    model = model.to(dtype)

change to:

    model = model.to(dtype)
    model = model.to(device)

After the demo starts, memory usage is about 14 GB, so it runs well on a 16 GB GPU.
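The order matters because model.to(device) uploads the full-precision (fp32) weights to the GPU before model.to(dtype) casts them, so the device holds 4 bytes per parameter at the peak; casting on the CPU first means only the reduced-precision copy is ever transferred. A back-of-the-envelope sketch (the ~7B parameter count and the weights-only peak-memory model are assumptions, not measurements):

```python
FP32_BYTES = 4  # bytes per parameter at full precision
FP16_BYTES = 2  # bytes per parameter after casting to half precision

def peak_device_bytes(n_params: int, device_first: bool) -> int:
    """Crude peak-GPU-memory model for weights only (ignores activations)."""
    if device_first:
        # model.to(device) ships fp32 weights to the GPU; the cast to fp16
        # only happens after they are already resident.
        return n_params * FP32_BYTES
    # model.to(dtype) runs on the CPU, so the GPU only ever receives
    # the fp16 copy.
    return n_params * FP16_BYTES

n = 7_000_000_000  # assumed ~7B-parameter model
print(round(peak_device_bytes(n, device_first=True) / 2**30, 1))   # 26.1 GiB
print(round(peak_device_bytes(n, device_first=False) / 2**30, 1))  # 13.0 GiB
```

With activations and framework overhead on top of the ~13 GiB of fp16 weights, this lines up with the ~14 GB usage observed above.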
