Python: Fake Data-set

December 9, 2019


Learning: Python to generate a fake data-set
Subject: About the Faker data library

Most of the time when we are working on a project, we need to test our procedures and functions with some data. In most cases we just need dummy data such as dates, names, addresses, and so on.

Last week I was reading on the net and found an article about generating fake data using a library in PHP (PHP is a computer programming language), so I started looking for a similar one in Python. The answer is YES: there is a library that we can import called ‘Faker’. I started to work with it and explore it. This post is about the Faker data-set library.

The library is called ‘Faker’ and we need to install it in our Python environment; I use: pip install Faker to install it. According to its documentation, we can call methods such as name, city, date, job, and others. So if we want to generate a fake name we write this:

# Using lib: Faker to generate a fake name
from faker import Faker

fake = Faker()  # the generator object used in all the examples
print(fake.name())
[Output]: Victoria Campbell

Here is a screenshot from the Jupyter notebook.


To generate more than one name we can use a for loop:

# Using lib: Faker to generate (X) fake names

for x in range(10):
    print(fake.name())
[Output]: Jared Hawkins
Michael Reid
Ricky Brown
Mary Tyler
Kristy Dudley
Karen Cain
Jennifer Underwood
Desiree Jensen
Carla Rivera
Brandon Cooper


Other methods that we can use are: address, company, job, country, date_time and many others, and with all of these we can create a data-set full of fake data.

So if we want to create a fake data-set that contains:
Name, Date-of-birth, Company, Job, Country as one person's data, we can use it like this:

# Using lib: Faker to generate fake data for (X) persons
# Data-set contains: Name, Date-of-birth, Company, Job, Country
p_count = 1
for x in range(p_count):
    print('Name:', fake.name())
    print('DOB:', fake.date())
    print('Company:', fake.company())
    print('Job:', fake.job())
    print('country:', fake.country())


[Output]: 
Name: Crystal Mcconnell
DOB: 2002-09-30
Company: Bailey LLC
Job: Insurance underwriter
country: Pakistan


Now if we want to store the person data in a dictionary variable and use it later, we can do it as follows:

# Using lib: Faker to generate fake data for (X) persons and store it in a dictionary
people_d = {}
p_count = 5
for x in range(p_count):
    ID = x
    people_d[ID] = {
        'name': fake.name(),
        'date': fake.date(),
        'company': fake.company(),
        'job': fake.job(),
        'country': fake.country(),
    }

# To print out the people_d data-set.
for x in people_d:
    print(people_d[x])


In case we want a more complicated ID, we can use a random 8-digit integer, or combine two fake numbers such as fake.zipcode() and fake.postcode(). This makes a duplicate ID unlikely but does not rule it out, so it is worth checking whether the ID already exists in the dictionary before storing the record.

Using the Faker library helps a lot, and it has many methods that can be used to fill a data-set. For more information, you may read its documentation here: Faker Library



To Download my Python code (.py) files Click-Here









By: Ali Radwani