Multiple GPUs for inference error
same error
same error
I'm very sorry, but this problem has troubled me for a long time.
Different devices or different numbers of GPUs always trigger this issue in various ways.
I have a silly workaround: modifying the source code of utils.py in the transformers library to manually move the tensors to the same device.
If there is a better method, please let me know!
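Roughly, the change is just the usual pattern of moving the offending tensor onto the device of the weight it is about to hit, right before the call that crashes. A toy illustration of the pattern (not the actual diff; the real edit goes inside the transformers code where the mismatch is raised, and this toy needs two GPUs to run):

```python
import torch
import torch.nn as nn

# Toy stand-ins: with device_map="auto" the output head can land on a
# different GPU than the hidden states coming out of the last layer.
lm_head = nn.Linear(8, 16).to("cuda:1")
hidden_states = torch.randn(2, 8, device="cuda:0")

# Without this .to(...), the matmul raises
# "Expected all tensors to be on the same device".
hidden_states = hidden_states.to(lm_head.weight.device)
logits = lm_head(hidden_states)
print(logits.device)  # cuda:1
```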
Could you share this please:
"I have a silly workaround, which involves modifying the source code of utils.py in the transformers library, manually moving the tensors to the same device."
@czczup
As a silly person myself, I would also be interested to know about your changes in utils.py. Can you post a diff maybe? Thanks.
Please refer to the new readme code. By placing the input and output layers of the LLM on a single device, it should now work without needing to modify utils.py, and this issue should no longer occur.
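For anyone who lands here later, the gist of that readme code is a custom device_map that spreads the decoder layers across the available GPUs but pins the vision tower, the token embedding, the final norm, and the lm_head to GPU 0, so the tensors entering and leaving the LLM always share a device. Below is a minimal sketch of the idea; the module names (language_model.model.layers, vision_model, mlp1), the layer count, and the checkpoint path are assumptions for an InternVL-style model, so check model.named_modules() (or hf_device_map after loading) for the actual names in your checkpoint.

```python
import math
import torch
from transformers import AutoModel

def split_model(num_layers: int, world_size: int) -> dict:
    """Spread decoder layers across GPUs, but keep every module that touches
    the input ids or the logits on GPU 0 so inputs and outputs share a device."""
    device_map = {}
    layers_per_gpu = math.ceil(num_layers / world_size)
    for i in range(num_layers):
        device_map[f"language_model.model.layers.{i}"] = min(i // layers_per_gpu, world_size - 1)
    device_map["vision_model"] = 0                        # vision tower (assumed module name)
    device_map["mlp1"] = 0                                # vision-to-LLM projector (assumed module name)
    device_map["language_model.model.embed_tokens"] = 0   # input embeddings
    device_map["language_model.model.norm"] = 0           # final norm before the head
    device_map["language_model.lm_head"] = 0              # output head
    return device_map

path = "OpenGVLab/InternVL-Chat-V1-5"  # placeholder checkpoint id
device_map = split_model(num_layers=48, world_size=torch.cuda.device_count())
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map=device_map,
).eval()
print(model.hf_device_map)  # verify where each module actually landed
```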
Hey bud! Just saw that you updated the readme! Wow it works! Thanks a ton man!! You rock!
It works, thanks!


