There are two flavors of text-to-video. The first generates entire scenes from scratch (people, rooms, products); this approach is still slow and expensive. The second, much more practical, generates a talking-avatar video: a fixed face speaks your script in a chosen voice and language.
Avatar text-to-video is fast (minutes, not hours), cheap, and perfect for tutorials, product walkthroughs, course content, and internal training where you need a presenter but not a Hollywood scene.