Python: Create Synthetic Data Using Faker
Faker is a Python library that makes it easy to generate fake data. In this article we will write a script that uses the Faker and csv libraries to generate a CSV file. The file will contain the following columns:
name, last_name, date_of_birth, gender, city, country
Let's see how we can do this!
Install the Faker library
We can install Faker using the pip3 command:
pip3 install Faker
Creating the code
Save the following as fake_data.py
#!/usr/bin/env python3
from faker import Faker
import random
import csv

if __name__ == '__main__':
    fake = Faker()
    num_records = 1000
    data = []
    # Build one row per record using Faker providers
    for _ in range(num_records):
        name = fake.first_name()
        last_name = fake.last_name()
        date_of_birth = fake.date_of_birth(minimum_age=18,
                                           maximum_age=80)
        # Gender is picked from a fixed list of options
        gender = random.choice(['Male', 'Female'])
        city = fake.city()
        country = fake.country()
        data.append([name, last_name, date_of_birth, gender, city, country])
    csv_filename = "fake_data.csv"
    # newline='' lets the csv module control line endings
    with open(csv_filename, 'w', newline='') as csvfile:
        csv_writer = csv.writer(csvfile)
        # Header row first, then all data rows
        csv_writer.writerow(["Name", "Last Name", "Date Of Birth", "Gender", "City", "Country"])
        csv_writer.writerows(data)
Explaining the code
To run the script, enter the following in the terminal:
python3 ./fake_data.py
This will generate a CSV file with 1000 records. Let's examine the most important parts of the code.
This line imports Faker:
from faker import Faker
The lines inside the for loop are the ones that generate the synthetic data, based on simple rules:
name = fake.first_name()
last_name = fake.last_name()
date_of_birth = fake.date_of_birth(minimum_age=18,
                                   maximum_age=80)
gender = random.choice(['Male', 'Female'])
city = fake.city()
country = fake.country()
Faker has a lot of built-in and community providers for many common cases such as first name, last name, etc. For discrete random options, we can use the random.choice() function with a list of options.
Next, with the help of the csv library, we create the CSV file:
with open(csv_filename, 'w', newline='') as csvfile:
    csv_writer = csv.writer(csvfile)
    csv_writer.writerow(["Name", "Last Name", "Date Of Birth", "Gender", "City", "Country"])
    csv_writer.writerows(data)
First, we write the CSV header using the .writerow() function, and then we write all the data rows at once with the .writerows() function.
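If the rows are kept as dictionaries rather than lists, csv.DictWriter is one possible alternative to csv.writer. The sketch below writes two placeholder rows to an in-memory buffer and reads them back with csv.DictReader to check the round trip. The sample values and the write_rows and read_rows helpers are made up for illustration; in the real script the rows would come from Faker and the handle would be the opened file.

```python
import csv
import io

FIELDNAMES = ["Name", "Last Name", "Date Of Birth", "Gender", "City", "Country"]

# Placeholder rows standing in for Faker output (values are made up).
rows = [
    {"Name": "Ada", "Last Name": "Example", "Date Of Birth": "1990-01-15",
     "Gender": "Female", "City": "Springfield", "Country": "Greece"},
    {"Name": "Bob", "Last Name": "Sample", "Date Of Birth": "1985-07-30",
     "Gender": "Male", "City": "Riverton, East", "Country": "Chile"},
]


def write_rows(handle, records):
    """Write a header row followed by one CSV row per dict."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(records)


def read_rows(handle):
    """Read the CSV back into a list of dicts keyed by the header."""
    return list(csv.DictReader(handle))


# An in-memory buffer keeps the example free of side effects on disk.
buffer = io.StringIO(newline="")
write_rows(buffer, rows)
buffer.seek(0)
loaded = read_rows(buffer)
print(len(loaded), loaded[1]["City"])  # prints: 2 Riverton, East
```

Note that the city containing a comma survives the round trip, because the csv module quotes such values automatically.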
Conclusion
Faker is a simple yet powerful library that allows us to create synthetic data with ease! I hope you enjoyed this article as much as I enjoyed writing it!