Stats Series - Progress
in Training / PH_525_series on Statistics, Harvard, R, Python, Spyder, Jupyter
Yay!
Halfway through the first course!
It IS taking a bit more time to push through this while mirroring everything in Python, but that is proving invaluable. For many things, doing the work twice lets me notice, catch and fix errors. I’m also using the IDEs to prototype things and Jupyter notebooks more for the final presentation. This is building a library of notebooks for me: a quick reference for how to do particular tasks, as well as for the parallels between R and Python.
I have been using Spyder rather than Rodeo. I plan to switch to Rodeo for the second half of this first course.
Although I have had no real problem so far replicating everything in Python, including all the related plots, there is no easy way to make the respective random number generators behave the same way. I can set a seed in each language to keep its answers consistent from run to run, but the numbers from R and Python rarely match each other.
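To make the seed point concrete, here is a minimal sketch of the Python side, using only the standard library so it runs anywhere. The helper name `draw_sample` and the seed values are my own illustrative choices, not anything from the course. It shows that a fixed seed makes the draws repeatable within Python; it makes no promise about matching R, and I would not expect `set.seed(1)` followed by `rnorm(5)` in R to give these same numbers.

```python
import random
import statistics


def draw_sample(seed, n=5):
    """Return n standard-normal draws from a generator seeded with `seed`.

    `draw_sample` is a hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # separate generator, leaves global state alone
    return [rng.gauss(0, 1) for _ in range(n)]


first = draw_sample(seed=1)
second = draw_sample(seed=1)
other = draw_sample(seed=2)

# Same seed, same language: the draws repeat exactly.
print(first == second)

# A different seed gives a different sample.
print(first == other)

# Summary statistics are therefore reproducible too, but only within Python.
print(statistics.mean(first))
```

When I need the two languages to agree exactly, one workaround is to generate the sample once, save it to a file such as a CSV, and load that same file in both R and Python instead of drawing random numbers separately in each.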
Another benefit is that this forces me to reflect more on what’s really going on rather than just relying on functions and formulae. There’s an enormous danger in the power of these stats functions. Statistics builds upon assumptions and methods in a quick chain. If you just jump to a particular formula without understanding everything that led you to use it, you may not appreciate the assumptions, nor be as able to change approaches when appropriate.