feat: simplify spi_transmit, send data without delay

The transmit entity had many states that depended on each other,
which made it too chaotic to follow.
The entity is rewritten around a single main state variable
with only a few states.
The output bit now changes on the falling edge rather than the
rising edge, to comply with t_su and t_hold of the device on the
other end.
The entity now also sends data right away on the first clock cycle,
which did not work before. It can also align transmissions to
WIDTH-bit boundaries, to stay in sync with spi_recv when the two are
used together.