Frequent Error

#1
by CoolT - opened

I am getting this error:

TypeError: empty() received an invalid combination of arguments - got (tuple, dtype=str, device=NoneType), but expected one of:

  • (tuple of ints size, *, tuple of names names, torch.memory_format memory_format = None, torch.dtype dtype = None, torch.layout layout = None, torch.device device = None, bool pin_memory = False, bool requires_grad = False)
  • (tuple of ints size, *, torch.memory_format memory_format = None, Tensor out = None, torch.dtype dtype = None, torch.layout layout = None, torch.device device = None, bool pin_memory = False, bool requires_grad = False)

Tried on both Colab instances: T4 GPU and v5e-1 TPU.

Thanks for your interest in the model!

This error is caused by a known regression in recent versions of the transformers library (specifically versions 4.49.0+ and some late 4.48.x builds).

It happens because the model's configuration file stores the data type as a string, and the updated library code passes that string directly to torch.empty(), which expects a torch.dtype (hence the "dtype=str" in the error message).
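
The failure mode can be reproduced in isolation. The snippet below is only a sketch, not the transformers code itself: `as_torch_dtype` is a hypothetical helper, and it assumes the config stores the dtype as a name such as "float16" or "torch.float16".

```python
import torch


def as_torch_dtype(value):
    """Return a torch.dtype for a dtype given as a torch.dtype or as a string.

    Hypothetical helper for illustration; not part of transformers or torch.
    """
    if isinstance(value, torch.dtype):
        return value
    # Accept both "float16" and "torch.float16" (assumed string formats).
    name = str(value).removeprefix("torch.")
    dtype = getattr(torch, name, None)
    if not isinstance(dtype, torch.dtype):
        raise ValueError(f"unknown dtype name: {value!r}")
    return dtype


# Passing the string straight through fails with a TypeError
# like the one reported in this thread.
try:
    torch.empty((2, 3), dtype="float16")
except TypeError as err:
    print("TypeError:", str(err).splitlines()[0])

# Converting the string first gives torch.empty() a real torch.dtype.
t = torch.empty((2, 3), dtype=as_torch_dtype("float16"))
print(t.dtype)
```

This only illustrates why a string dtype is rejected; it is not a patch for the library.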

The easiest fix is to downgrade your transformers package until the fix (which already exists) is released. Alternatively, you can install the latest changes directly from GitHub.
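
Before changing anything, it may help to confirm which version is installed. The sketch below classifies a version string using only the range mentioned in this thread (4.49.0+ and some late 4.48.x builds). The helper names `release_tuple` and `regression_status` are made up for illustration, and a later release that already contains the fix would still be flagged by this simple check.

```python
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Parse the leading numeric 'X.Y.Z' part of a version string into ints."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def regression_status(version_string):
    """Classify a transformers version against the range reported in this thread."""
    release = release_tuple(version_string)
    if release >= (4, 49, 0):
        return "reported as affected"
    if release[:2] == (4, 48):
        return "possibly affected"  # only some late 4.48.x builds
    return "not in the reported range"


if __name__ == "__main__":
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        print("transformers is not installed")
    else:
        print(installed, "->", regression_status(installed))
```

In a Colab notebook, restart the runtime after reinstalling so the new version is actually loaded. Installing from source is usually done with `pip install git+https://github.com/huggingface/transformers`.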

Let me know if that helps :)
