Generating Realistic Random Datasets with Python
Wed 14 February 2018
As a data engineer, once you have written your awesome new data processing application, it is time to test it end-to-end, and for that you need some input data.
As a data scientist, you can benefit from data generation too: it lets you experiment with different ways of exploring datasets, with algorithms and with data visualization techniques, or validate assumptions about the behaviour of a method against many different datasets of your choosing.
In both cases, a tempting option is simply to use real data. One small problem, though, is that production data is typically hard to obtain, even partially, and it is not getting easier with new European laws about privacy and security.
A detailed tutorial has been published on DataCamp here: https://www.datacamp.com/community/tutorials/generate-data-trumania