* Updated docstrings of bert_padding.py
Added docstrings for missing arguments in the unpad and pad methods.
* Update bert_padding.py
Fixed spelling mistakes
According to the `setup.py` file, the only dependencies are torch and einops, yet `bert_padding.py` requires `numpy` solely to multiply the elements of a `torch.Size` object. This change removes that dependency, allowing FlashAttention to be used without numpy.
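One numpy-free way to take the product of a `torch.Size` is the stdlib `math.prod`, since `torch.Size` is a tuple subclass. This is a sketch of the idea, not necessarily the exact replacement used in the change:

```python
import math
import torch

x = torch.zeros(2, 3, 4)

# torch.Size behaves like a tuple of ints, so math.prod
# replaces numpy.prod for computing the total element count.
n = math.prod(x.shape)

assert n == x.numel() == 24
```

`torch.Tensor.numel()` gives the same result directly when the full shape is involved; `math.prod` is useful when only a slice of the shape (e.g. all but the last dimension) needs to be multiplied.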