8. Synthetical Test Data With Python

By Bernd Klein. Last modified: 24 Mar 2022.

Definition of Synthetical Data

There is hardly an engineer or scientist who doesn't understand the need for synthetical data, also called synthetic data. But some may have asked themselves what we understand by synthetical test data. There are lots of situations where a scientist or an engineer needs training or test data, but it is hard or impossible to get real data, i.e. a sample from a population obtained by measurement. The challenge of creating synthetical data consists in producing data which resembles, or comes quite close to, the intended "real life" data. Python is an ideal language for easily producing such data, because it has powerful numerical and linguistic functionalities.

Synthetic data are also necessary to satisfy specific needs or certain conditions that may not be found in the "real life" data. Another use case is protecting privacy: synthetical data can stand in for real data that must not be disclosed.

In our previous chapter "Python, Numpy and Probability", we have written some functions, which we will need in the following:

find_interval
weighted_choice
cartesian_choice
weighted_cartesian_choice
weighted_sample

You should be familiar with how these functions work.

We saved the functions in a module with the name bk_random.
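The module bk_random itself is not reproduced on this page. For readers who do not have it at hand, the following is a minimal sketch of stand-ins for two of its functions, built on the standard library's random module. It is an assumption that mirrors only how cartesian_choice and weighted_cartesian_choice are called further down this page; it is not the original implementation.

```python
import random

def cartesian_choice(*iterables):
    """Stand-in (assumption): one uniformly drawn element per iterable."""
    return [random.choice(list(population)) for population in iterables]

def weighted_cartesian_choice(*iterables):
    """Stand-in (assumption): each argument is a (population, weights) pair."""
    result = []
    for population, weights in iterables:
        # random.choices accepts relative weights, so they need not sum to 1
        drawn = random.choices(list(population), weights=list(weights), k=1)
        result.append(drawn[0])
    return result

print(cartesian_choice(["John", "Eve"], ["Smith", "Baker"]))
print(weighted_cartesian_choice((["John", "Eve"], [80, 20]),
                                (["Smith", "Baker"], [0.9, 0.1])))
```

Both functions return a list with one entry per input set, which is the shape the examples on this page rely on.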

Live Python training

Enjoying this page? We offer live Python training courses covering the content of this site.

See our Python training courses

See our Machine Learning with Python training courses

Definition of the Scope of Synthetic Data Creation

We want to provide solutions to the following task:

We have n finite sets containing data of various types:

D₁, D₂, ... D_n

The sets D_i are the data sets from which we want to deduce our synthetical data.

In the actual implementation, the sets will be tuples or lists for practical reasons.

The process of creating synthetic data can be defined by two functions "synthesizer" and "synthesize". Usually, the word synthesizer is used for a computerized electronic device which produces sound. Our synthesizer produces strings or alternatively tuples with data, as we will see later.

The function synthesizer creates the function synthesize:

synthesize = synthesizer( (D₁, D₂, ... D_n) )

The function synthesize, which may also be a generator as in our implementation, takes no arguments. The result of a call synthesize() will be either

a list or a tuple t = (d₁, d₂, ... d_n) where d_i is drawn at random from D_i
or a string which contains the elements str(d₁), str(d₂), ... str(d_n) where d_i is also drawn at random from D_i
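The definition above can be sketched in a few lines. This is only an illustration of the idea, using a plain function instead of a generator and the standard library's random.choice for the drawing; the parameter as_string is a hypothetical name introduced here to switch between the two kinds of result.

```python
import random

def synthesizer(data, as_string=False):
    """Return a function that draws one element from each set in data."""
    def synthesize():
        drawn = tuple(random.choice(d) for d in data)
        if as_string:
            # as_string is a hypothetical switch for this sketch only
            return " ".join(str(x) for x in drawn)
        return drawn
    return synthesize

D1 = ("red", "green", "blue")
D2 = (1, 2, 3)

synthesize = synthesizer((D1, D2))
print(synthesize())                               # a tuple (d1, d2)

synthesize_str = synthesizer((D1, D2), as_string=True)
print(synthesize_str())                           # a string "d1 d2"
```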

Let us start with a simple example. We have a list of first names and a list of surnames. We want to hire employees for an institute or company. Of course, it will be a lot easier in our synthetical Python environment to find and hire specialists than in real life. The function "cartesian_choice" from the bk_random module and the concatenation of the randomly drawn first names and surnames is all it takes.

import bk_random 

firstnames = ["John", "Eve", "Jane", "Paul", 
              "Frank", "Laura", "Robert", 
              "Kathrin", "Roger", "Simone",
              "Bernard", "Sarah", "Yvonne"]
surnames = ["Singer", "Miles", "Moore", 
            "Looper", "Rampman", "Chopman", 
            "Smiley", "Bychan", "Smith",
            "Baker", "Miller", "Cook"]
   
number_of_specialists = 15
    
employees = set()
while len(employees) < number_of_specialists:
    employee = bk_random.cartesian_choice(firstnames, surnames)
    employees.add(" ".join(employee))

print(employees)

OUTPUT:

{'Yvonne Rampman', 'Sarah Miles', 'Paul Cook', 'Frank Baker', 'Jane Looper', 'Frank Cook', 'Simone Chopman', 'Kathrin Singer', 'Sarah Cook', 'Laura Smiley', 'Sarah Smith', 'John Bychan', 'Yvonne Bychan', 'Simone Smith', 'Eve Rampman'}

This was easy enough, but we want to do it now in a more structured way, using the synthesizer approach we mentioned before. The code for the case in which the parameter "weights" is not None is still missing in the following implementation:

import bk_random 

firstnames = ["John", "Eve", "Jane", "Paul", 
              "Frank", "Laura", "Robert", 
              "Kathrin", "Roger", "Simone",
              "Bernard", "Sarah", "Yvonne"]
surnames = ["Singer", "Miles", "Moore", 
            "Looper", "Rampman", "Chopman", 
            "Smiley", "Bychan", "Smith",
            "Baker", "Miller", "Cook"]

def synthesizer( data, weights=None, format_func=None, repeats=True):
    """
    data is a tuple or list of lists or tuples containing the 
    data
    weights is a list or tuple of lists or tuples with the 
    corresponding weights of the data lists or tuples
    format_func is a reference to a function which defines
    how a random result of "synthesize" will be formatted. 
    If None, "synthesize" will yield the list "res".
    If repeats is set to True, the results of "synthesize" will 
    not necessarily be unique
    """

    def synthesize():
        if not repeats:
            memory = set()
        while True:
            res = bk_random.cartesian_choice(*data)
            if not repeats:
                sres = str(res)
                while sres in memory:
                    res = bk_random.cartesian_choice(*data)
                    sres = str(res)
                memory.add(sres)
            if format_func:
                yield format_func(res)
            else:
                yield res
    return synthesize
        
recruit_employee = synthesizer( (firstnames, surnames), 
                                 format_func=lambda x: " ".join(x),
                                 repeats=False)

employee = recruit_employee()
for _ in range(15):
    print(next(employee))

OUTPUT:

Jane Singer
Frank Looper
Eve Rampman
Simone Smith
Kathrin Baker
Paul Smith
Roger Cook
John Smiley
Jane Cook
Frank Moore
Simone Chopman
Frank Miller
Jane Baker
Frank Chopman
Laura Looper
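One caveat of repeats=False as implemented above: once every possible combination has been drawn (13 × 12 = 156 names in this example), the inner while loop can never find a new one and will not terminate. The following sketch shows one possible way to guard against this by ending the generator when the pool is exhausted. It uses random.choice from the standard library instead of bk_random, and the name unique_combinations is hypothetical.

```python
import random
from math import prod

def unique_combinations(data):
    """Yield random combinations without repeats; stop when none are left."""
    # number of distinct combinations that can be formed at all
    total = prod(len(set(d)) for d in data)
    seen = set()
    while len(seen) < total:
        res = tuple(random.choice(d) for d in data)
        if res not in seen:
            seen.add(res)
            yield res

pairs = list(unique_combinations((["John", "Eve"],
                                  ["Smith", "Baker", "Cook"])))
print(len(pairs))   # 6, i.e. all 2 * 3 combinations exactly once
```

Rejection sampling like this becomes slow when nearly all combinations are used up; for small pools, shuffling the full Cartesian product would be an alternative.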

Every name, i.e. every first name and every last name, had the same likelihood of being drawn in the previous example. This is not very realistic, because in countries like the US or England we expect names like Smith and Miller to occur more often than names like Rampman or Bychan. We will extend our synthesizer function with additional code for the "weighted" case, i.e. weights is not None. If weights are given, we have to use the function weighted_cartesian_choice from the bk_random module. If "weights" is set to None, we have to call the function cartesian_choice. We put this decision into a separate subfunction of synthesizer to keep the function synthesize clearer.

We do not want to fiddle around with probabilities between 0 and 1 when defining the weights, so we take a detour via integers, which we normalize afterwards.

from bk_random import cartesian_choice, weighted_cartesian_choice

weighted_firstnames = [ ("John", 80), ("Eve", 70), ("Jane", 2), 
                        ("Paul", 8), ("Frank", 20), ("Laura", 6), 
                        ("Robert", 17), ("Zoe", 3), ("Roger", 8), 
                        ("Edgar", 4), ("Susanne", 11), ("Dorothee", 22),
                        ("Tim", 17), ("Donald", 12), ("Igor", 15),
                        ("Simone", 9), ("Bernard", 8), ("Sarah", 7),
                        ("Yvonne", 11), ("Bill", 12), ("Bernd", 10)]

weighted_surnames = [('Singer', 2), ('Miles', 2), ('Moore', 5),
                     ('Strongman', 5), ('Romero', 3), ("Yiang", 4),
                     ('Looper', 1), ('Rampman', 1), ('Chopman', 1), 
                     ('Smiley', 1), ('Bychan', 1), ('Smith', 150), 
                     ('Baker', 144), ('Miller', 87), ('Cook', 5),
                     ('Joyce', 1), ('Bush', 5), ('Shorter', 6), 
                     ('Wagner', 10), ('Sundigos', 10), ('Firenze', 8),
                     ('Puttner', 20), ('Faulkner', 10), ('Bowman', 11),
                     ('Klein', 1), ('Jungster', 14), ("Warner", 14),
                     ('Tiller', 9), ('Wogner', 10), ('Blumenthal', 16)]


firstnames, weights = zip(*weighted_firstnames)
wsum = sum(weights)
weights_firstnames = [ x / wsum for x in weights]

surnames, weights = zip(*weighted_surnames)
wsum = sum(weights)
weights_surnames = [ x / wsum for x in weights]

weights = (weights_firstnames, weights_surnames)


def synthesizer( data, weights=None, format_func=None, repeats=True):
    """
    "data" is a tuple or list of lists or tuples containing the 
    data.
    
    "weights" is a list or tuple of lists or tuples with the 
    corresponding weights of the data lists or tuples.
    
    "format_func" is a reference to a function which defines
    how a random result of "synthesize" will be formatted. 
    If None, the generator "synthesize" will yield the list "res".
    
    If "repeats" is set to True, the output values yielded by 
    "synthesize" will not be unique.
    """
        
    def choice(data, weights):
        if weights:
            return weighted_cartesian_choice(*zip(data, weights))
        else:
            return cartesian_choice(*data)
        
    def synthesize():
        if not repeats:
            memory = set()
        while True:
            res = choice(data, weights)
            if not repeats:
                sres = str(res)
                while sres in memory:
                    res = choice(data, weights)
                    sres = str(res)
                memory.add(sres)
            if format_func:
                yield format_func(res)
            else:
                yield res
    return synthesize
        


recruit_employee = synthesizer( (firstnames, surnames), 
                                weights = weights,
                                format_func=lambda x: " ".join(x),
                                repeats=False)

employee = recruit_employee()
for _ in range(12):
    print(next(employee))

OUTPUT:

Frank Smith
Eve Baker
Dorothee Miller
Robert Smith
Dorothee Smith
Eve Tiller
John Miller
John Baker
Donald Baker
Roger Blumenthal
John Puttner
Yvonne Baker
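Twelve names are too few to judge whether the weights have the intended effect. As a rough check, we can draw many values and compare the observed frequencies with the normalized weights. The sketch below uses random.choices from the standard library rather than bk_random, and only a shortened list of the surnames from above.

```python
import random
from collections import Counter

weighted_surnames = [("Smith", 150), ("Baker", 144),
                     ("Miller", 87), ("Bychan", 1)]
names, weights = zip(*weighted_surnames)

random.seed(42)          # fixed seed only to make the run repeatable
draws = random.choices(names, weights=weights, k=10_000)
counts = Counter(draws)

total = sum(weights)
for name, weight in weighted_surnames:
    expected = weight / total
    observed = counts[name] / len(draws)
    print(f"{name:8s} expected {expected:.3f}  observed {observed:.3f}")
```

The observed values should lie close to the expected ones, with small deviations caused by the random sampling.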

Wine Example

Let's imagine that you have to describe a dozen wines. That is probably a pleasant thought for many, but I have to admit that it is not for me. The main reason is that I am not a wine drinker!

We can write a little Python program which uses our synthesizer function to automatically create "sophisticated criticisms" like this one:

This wine is light-bodied with a conveniently juicy bouquet leading to a lingering flamboyant finish!

Try to find some adverbs, like "seamlessly", "assertively", and some adjectives, like "fruity" and "refined", to describe the aroma.

If you have defined your lists, you can use the synthesize function.

Here is our solution, in case you don't want to do it on your own:

import bk_random

body = ['light-bodied', 'medium-bodied', 'full-bodied']
    
adverbs = ['appropriately', 'assertively', 'authoritatively', 
           'compellingly', 'completely', 'continually', 
           'conveniently', 'credibly', 'distinctively', 
           'dramatically', 'dynamically', 'efficiently', 
           'energistically', 'enthusiastically', 'fungibly', 
           'globally', 'holisticly', 'interactively', 
           'intrinsically', 'monotonectally', 'objectively', 
           'phosfluorescently', 'proactively', 'professionally', 
           'progressively', 'quickly', 'rapidiously', 
           'seamlessly', 'synergistically', 'uniquely']

noun = ['aroma', 'bouquet', 'flavour']

aromas = ['angular', 'bright', 'lingering', 'butterscotch', 
          'buttery', 'chocolate', 'complex', 'earth', 'flabby', 
          'flamboyant', 'fleshy', 'flowers', 'food friendly', 
          'fruits', 'grass', 'herbs', 'jammy', 'juicy', 'mocha', 
          'oaked', 'refined', 'structured', 'tight', 'toast',
          'toasty', 'tobacco', 'unctuous', 'unoaked', 'vanilla', 
          'velvetly']
          
example = """This wine is light-bodied with a completely buttery 
bouquet leading to a lingering fruity  finish!"""

def describe(data):
    body, adv, adj, noun, adj2 = data
    format_str = "This wine is %s with a %s %s %s\nleading to"
    format_str += " a lingering %s finish!"
    return format_str % (body, adv, adj, noun, adj2)  
    
t = bk_random.cartesian_choice(body, adverbs, aromas, noun, aromas)

data = (body, adverbs, aromas, noun, aromas)
# "synthesizer" is the function defined in the previous section
synthesize = synthesizer( data, weights=None, format_func=describe, repeats=True)
criticism = synthesize()

for i in range(1, 13):
    print("{0:d}. wine:".format(i))
    print(next(criticism))
    print()

OUTPUT:

1. wine:
This wine is medium-bodied with a completely flowers aroma
leading to a lingering food friendly finish!

2. wine:
This wine is medium-bodied with a professionally herbs flavour
leading to a lingering angular finish!

3. wine:
This wine is full-bodied with a completely food friendly flavour
leading to a lingering tight finish!

4. wine:
This wine is light-bodied with a intrinsically angular flavour
leading to a lingering food friendly finish!

5. wine:
This wine is full-bodied with a fungibly bright flavour
leading to a lingering flowers finish!

6. wine:
This wine is medium-bodied with a professionally vanilla aroma
leading to a lingering tobacco finish!

7. wine:
This wine is full-bodied with a rapidiously refined flavour
leading to a lingering angular finish!

8. wine:
This wine is medium-bodied with a globally vanilla bouquet
leading to a lingering structured finish!

9. wine:
This wine is light-bodied with a synergistically food friendly aroma
leading to a lingering earth finish!

10. wine:
This wine is medium-bodied with a energistically jammy flavour
leading to a lingering food friendly finish!

11. wine:
This wine is full-bodied with a dynamically complex bouquet
leading to a lingering jammy finish!

12. wine:
This wine is light-bodied with a completely flabby flavour
leading to a lingering oaked finish!
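Some of the generated sentences read "a intrinsically" or "a energistically", because the format string always uses the article "a". A possible refinement of describe is sketched below; the helper with_article is a hypothetical name. It decides by the first letter only, which is a rough heuristic: it would still get words like "uniquely" wrong, since the correct article depends on pronunciation rather than spelling.

```python
def with_article(word):
    """Prefix word with 'an' if it starts with a vowel letter, else 'a'."""
    article = "an" if word[0].lower() in "aeiou" else "a"
    return f"{article} {word}"

def describe(data):
    body, adv, adj, noun, adj2 = data
    return (f"This wine is {body} with {with_article(adv)} {adj} {noun}\n"
            f"leading to a lingering {adj2} finish!")

print(describe(("light-bodied", "intrinsically", "angular",
                "flavour", "tight")))
```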

Exercise: International Disaster Operation

It would be wonderful if the problem described in this exercise were purely synthetic, i.e. if there were no further catastrophes in the world. Completely unrealistic, but a nice daydream. So, the task of this exercise is to provide synthetical test data for an international disaster operation. The countries taking part in this mission might be, e.g., France, Switzerland, Germany, Canada, The Netherlands, The United States, Austria, Belgium and Luxembourg.

We want to create a file with random entries of aides. Each line should consist of:

UniqueIdentifier, FirstName, LastName, Country, Field

For example:

001, Jean-Paul,  Rennier, France, Medical Aid
002, Nathan, Bloomfield, Canada, Security Aid
003, Michael, Mayer, Germany, Social Worker

For practical reasons, we will reduce the countries to France, Italy, Switzerland and Germany in the following example implementation:

from bk_random import cartesian_choice, weighted_cartesian_choice

countries = ["France", "Italy", "Switzerland", "Germany"]

w_firstnames = { "France" : [ ("Marie", 10), ("Thomas", 10), 
                            ("Camille", 10), ("Nicolas", 9),
                            ("Léa", 10), ("Julien", 9), 
                            ("Manon", 9), ("Quentin", 9), 
                            ("Chloé", 8), ("Maxime", 9), 
                            ("Laura", 7), ("Alexandre", 6),
                            ("Clementine", 2), ("Grégory", 2), 
                            ("Sandra", 1), ("Philippe", 1)],
               "Switzerland": [ ("Sarah", 10), ("Hans", 10), 
                            ("Laura", 9), ("Peter", 8),
                            ("Mélissa", 9), ("Walter", 7), 
                            ("Océane", 7), ("Daniel", 7), 
                            ("Noémie", 6), ("Reto", 7), 
                            ("Laura", 7), ("Bruno", 6),
                            ("Eva", 2), ("Urli", 4), 
                            ("Sandra", 1), ("Marcel", 1)],
               "Germany": [ ("Ursula", 10), ("Peter", 10), 
                            ("Monika", 9), ("Michael", 8),
                            ("Brigitte", 9), ("Thomas", 7), 
                            ("Stefanie", 7), ("Andreas", 7), 
                            ("Maria", 6), ("Wolfgang", 7), 
                            ("Gabriele", 7), ("Manfred", 6),
                            ("Nicole", 2), ("Matthias", 4), 
                            ("Christine", 1), ("Dirk", 1)],
               "Italy" : [ ("Francesco", 20), ("Alessandro", 19), 
                            ("Mattia", 19), ("Lorenzo", 18),
                            ("Leonardo", 16), ("Andrea", 15), 
                            ("Gabriele", 14), ("Matteo", 14), 
                            ("Tommaso", 12), ("Riccardo", 11), 
                            ("Sofia", 20), ("Aurora", 18),
                            ("Giulia", 16), ("Giorgia", 15), 
                            ("Alice", 14), ("Martina", 13)]}       
                        

w_surnames = { "France" : [ ("Matin", 10), ("Bernard", 10), 
                          ("Camille", 10), ("Nicolas", 9),
                          ("Dubois", 10), ("Petit", 9), 
                            ("Durand", 8), ("Leroy", 8), 
                            ("Fournier", 7), ("Lambert", 6), 
                            ("Mercier", 5), ("Rousseau", 4),
                            ("Mathieu", 2), ("Fontaine", 2), 
                            ("Muller", 1), ("Robin", 1)],
               "Switzerland": [ ("Müller", 10), ("Meier", 10), 
                            ("Schmid", 9), ("Keller", 8),
                            ("Weber", 9), ("Huber", 7), 
                            ("Schneider", 7), ("Meyer", 7), 
                            ("Steiner", 6), ("Fischer", 7), 
                            ("Gerber", 7), ("Brunner", 6),
                            ("Baumann", 2), ("Frei", 4), 
                            ("Zimmermann", 1), ("Moser", 1)],
               "Germany": [ ("Müller", 10), ("Schmidt", 10), 
                            ("Schneider", 9), ("Fischer", 8),
                            ("Weber", 9), ("Meyer", 7), 
                            ("Wagner", 7), ("Becker", 7), 
                            ("Schulz", 6), ("Hoffmann", 7), 
                            ("Schäfer", 7), ("Koch", 6),
                            ("Bauer", 2), ("Richter", 4), 
                            ("Klein", 2), ("Schröder", 1)],
               "Italy" : [ ("Rossi", 20), ("Russo", 19), 
                            ("Ferrari", 19), ("Esposito", 18),
                            ("Bianchi", 16), ("Romano", 15), 
                            ("Colombo", 14), ("Ricci", 14), 
                            ("Marino", 12), ("Grecco", 11), 
                            ("Bruno", 10), ("Gallo", 12),
                            ("Conti", 16), ("De Luca", 15), 
                            ("Costa", 14), ("Giordano", 13),
                            ("Mancini", 14), ("Rizzo", 13),
                            ("Lombardi", 11), ("Moretto", 9)]}

# separate names and weights
synthesize = {}
identifier = 1
for country in w_firstnames:
    firstnames, weights = zip(*w_firstnames[country])
    wsum = sum(weights)
    weights_firstnames = [ x / wsum for x in weights]
    w_firstnames[country] = [firstnames, weights_firstnames]

    surnames, weights = zip(*w_surnames[country])
    wsum = sum(weights)
    weights_surnames = [ x / wsum for x in weights]
    w_surnames[country] = [surnames, weights_surnames]

    synthesize[country] = synthesizer( (firstnames, surnames), 
                                       (weights_firstnames, 
                                        weights_surnames),
                                 format_func=lambda x: " ".join(x),
                                 repeats=False)
nation_prob = [("Germany", 0.3), 
               ("France", 0.4), 
               ("Switzerland", 0.2),
               ("Italy", 0.1)]

profession_prob = [("Medical Aid", 0.3), 
                   ("Social Worker", 0.6), 
                   ("Security Aid", 0.1)]

helpers = []
for _ in range(200):
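    # zip(*nation_prob) yields the pair (names, probabilities)
    country = weighted_cartesian_choice(zip(*nation_prob))
    profession = weighted_cartesian_choice(zip(*profession_prob))
    country, profession = country[0], profession[0]
    # Calling synthesize[country]() starts a fresh generator each time,
    # so repeats=False cannot prevent duplicate names across iterations.
    s = synthesize[country]()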
    uid = "{id:05d}".format(id=identifier)
    helpers.append((uid, country, next(s), profession ))
    identifier += 1
    
print(helpers)

OUTPUT:

[('00001', 'Germany', 'Gabriele Müller', 'Medical Aid'), ('00002', 'France', 'Camille Lambert', 'Social Worker'), ('00003', 'Switzerland', 'Sarah Schmid', 'Medical Aid'), ('00004', 'France', 'Marie Mathieu', 'Social Worker'), ('00005', 'Italy', 'Leonardo Rizzo', 'Medical Aid'), ('00006', 'France', 'Julien Camille', 'Medical Aid'), ('00007', 'France', 'Manon Camille', 'Medical Aid'), ('00008', 'France', 'Manon Bernard', 'Social Worker'), ('00009', 'Germany', 'Peter Koch', 'Social Worker'), ('00010', 'Germany', 'Ursula Müller', 'Social Worker'), ('00011', 'Switzerland', 'Océane Schmid', 'Social Worker'), ('00012', 'Germany', 'Matthias Schmidt', 'Social Worker'), ('00013', 'France', 'Laura Petit', 'Social Worker'), ('00014', 'France', 'Marie Durand', 'Social Worker'), ('00015', 'France', 'Léa Petit', 'Social Worker'), ('00016', 'France', 'Laura Bernard', 'Social Worker'), ('00017', 'Germany', 'Manfred Hoffmann', 'Security Aid'), ('00018', 'Switzerland', 'Sarah Schneider', 'Social Worker'), ('00019', 'Switzerland', 'Sarah Weber', 'Social Worker'), ('00020', 'France', 'Camille Leroy', 'Social Worker'), ('00021', 'France', 'Marie Mathieu', 'Medical Aid'), ('00022', 'Italy', 'Lorenzo Gallo', 'Medical Aid'), ('00023', 'France', 'Laura Bernard', 'Social Worker'), ('00024', 'Italy', 'Sofia Marino', 'Social Worker'), ('00025', 'Switzerland', 'Sarah Schneider', 'Security Aid'), ('00026', 'Germany', 'Ursula Hoffmann', 'Medical Aid'), ('00027', 'Italy', 'Aurora Rossi', 'Social Worker'), ('00028', 'France', 'Nicolas Durand', 'Social Worker'), ('00029', 'France', 'Léa Lambert', 'Social Worker'), ('00030', 'France', 'Julien Matin', 'Medical Aid'), ('00031', 'France', 'Chloé Bernard', 'Security Aid'), ('00032', 'Switzerland', 'Reto Steiner', 'Medical Aid'), ('00033', 'France', 'Clementine Matin', 'Medical Aid'), ('00034', 'France', 'Alexandre Matin', 'Social Worker'), ('00035', 'France', 'Nicolas Dubois', 'Medical Aid'), ('00036', 'Switzerland', 'Hans Huber', 'Social Worker'), 
('00037', 'Germany', 'Andreas Wagner', 'Social Worker'), ('00038', 'Italy', 'Tommaso Colombo', 'Social Worker'), ('00039', 'Italy', 'Sofia Conti', 'Social Worker'), ('00040', 'Switzerland', 'Noémie Fischer', 'Social Worker'), ('00041', 'France', 'Chloé Lambert', 'Social Worker'), ('00042', 'France', 'Quentin Dubois', 'Social Worker'), ('00043', 'Italy', 'Mattia Grecco', 'Security Aid'), ('00044', 'France', 'Maxime Dubois', 'Social Worker'), ('00045', 'Italy', 'Mattia Rossi', 'Medical Aid'), ('00046', 'Switzerland', 'Laura Huber', 'Security Aid'), ('00047', 'Germany', 'Wolfgang Schneider', 'Medical Aid'), ('00048', 'Germany', 'Stefanie Müller', 'Social Worker'), ('00049', 'France', 'Quentin Petit', 'Social Worker'), ('00050', 'France', 'Quentin Nicolas', 'Social Worker'), ('00051', 'Germany', 'Wolfgang Müller', 'Social Worker'), ('00052', 'Switzerland', 'Eva Brunner', 'Social Worker'), ('00053', 'Germany', 'Brigitte Meyer', 'Medical Aid'), ('00054', 'Switzerland', 'Hans Keller', 'Social Worker'), ('00055', 'Switzerland', 'Noémie Schmid', 'Medical Aid'), ('00056', 'Germany', 'Ursula Schulz', 'Social Worker'), ('00057', 'France', 'Maxime Durand', 'Social Worker'), ('00058', 'France', 'Maxime Nicolas', 'Social Worker'), ('00059', 'France', 'Alexandre Bernard', 'Security Aid'), ('00060', 'Germany', 'Manfred Müller', 'Social Worker'), ('00061', 'Switzerland', 'Sarah Meyer', 'Medical Aid'), ('00062', 'Germany', 'Peter Becker', 'Social Worker'), ('00063', 'Italy', 'Aurora Conti', 'Medical Aid'), ('00064', 'Italy', 'Sofia Esposito', 'Medical Aid'), ('00065', 'France', 'Camille Mercier', 'Social Worker'), ('00066', 'France', 'Nicolas Nicolas', 'Security Aid'), ('00067', 'Germany', 'Wolfgang Müller', 'Medical Aid'), ('00068', 'France', 'Sandra Camille', 'Social Worker'), ('00069', 'France', 'Chloé Dubois', 'Social Worker'), ('00070', 'Switzerland', 'Laura Gerber', 'Social Worker'), ('00071', 'France', 'Chloé Lambert', 'Social Worker'), ('00072', 'Switzerland', 'Peter 
Schneider', 'Social Worker'), ('00073', 'France', 'Maxime Petit', 'Social Worker'), ('00074', 'France', 'Grégory Nicolas', 'Medical Aid'), ('00075', 'Switzerland', 'Peter Schneider', 'Social Worker'), ('00076', 'Switzerland', 'Urli Schneider', 'Social Worker'), ('00077', 'France', 'Julien Matin', 'Social Worker'), ('00078', 'Germany', 'Monika Wagner', 'Medical Aid'), ('00079', 'France', 'Thomas Leroy', 'Social Worker'), ('00080', 'France', 'Marie Mercier', 'Medical Aid'), ('00081', 'France', 'Julien Bernard', 'Social Worker'), ('00082', 'Germany', 'Matthias Hoffmann', 'Social Worker'), ('00083', 'France', 'Marie Durand', 'Medical Aid'), ('00084', 'Switzerland', 'Peter Müller', 'Social Worker'), ('00085', 'Germany', 'Ursula Schneider', 'Social Worker'), ('00086', 'Switzerland', 'Mélissa Brunner', 'Medical Aid'), ('00087', 'France', 'Chloé Rousseau', 'Social Worker'), ('00088', 'Switzerland', 'Urli Keller', 'Social Worker'), ('00089', 'France', 'Clementine Leroy', 'Social Worker'), ('00090', 'France', 'Manon Durand', 'Social Worker'), ('00091', 'France', 'Thomas Durand', 'Medical Aid'), ('00092', 'France', 'Quentin Dubois', 'Medical Aid'), ('00093', 'France', 'Manon Rousseau', 'Medical Aid'), ('00094', 'Switzerland', 'Noémie Steiner', 'Social Worker'), ('00095', 'Switzerland', 'Mélissa Steiner', 'Social Worker'), ('00096', 'Germany', 'Manfred Becker', 'Social Worker'), ('00097', 'Italy', 'Alice Gallo', 'Medical Aid'), ('00098', 'France', 'Quentin Muller', 'Social Worker'), ('00099', 'France', 'Chloé Fontaine', 'Medical Aid'), ('00100', 'Germany', 'Gabriele Fischer', 'Medical Aid'), ('00101', 'Switzerland', 'Mélissa Huber', 'Social Worker'), ('00102', 'Switzerland', 'Noémie Steiner', 'Social Worker'), ('00103', 'Switzerland', 'Laura Huber', 'Social Worker'), ('00104', 'Switzerland', 'Peter Gerber', 'Social Worker'), ('00105', 'France', 'Sandra Mathieu', 'Medical Aid'), ('00106', 'France', 'Marie Lambert', 'Social Worker'), ('00107', 'Germany', 'Andreas Müller', 
'Social Worker'), ('00108', 'France', 'Marie Bernard', 'Social Worker'), ('00109', 'Germany', 'Ursula Koch', 'Social Worker'), ('00110', 'France', 'Léa Lambert', 'Social Worker'), ('00111', 'France', 'Manon Mercier', 'Social Worker'), ('00112', 'France', 'Maxime Lambert', 'Social Worker'), ('00113', 'Germany', 'Peter Müller', 'Social Worker'), ('00114', 'Switzerland', 'Laura Keller', 'Social Worker'), ('00115', 'France', 'Laura Leroy', 'Security Aid'), ('00116', 'France', 'Léa Camille', 'Social Worker'), ('00117', 'Switzerland', 'Sarah Schneider', 'Medical Aid'), ('00118', 'France', 'Alexandre Nicolas', 'Security Aid'), ('00119', 'Italy', 'Martina Giordano', 'Social Worker'), ('00120', 'France', 'Camille Bernard', 'Social Worker'), ('00121', 'Germany', 'Stefanie Hoffmann', 'Social Worker'), ('00122', 'France', 'Marie Nicolas', 'Social Worker'), ('00123', 'Germany', 'Thomas Müller', 'Security Aid'), ('00124', 'Germany', 'Nicole Weber', 'Social Worker'), ('00125', 'Germany', 'Andreas Meyer', 'Social Worker'), ('00126', 'France', 'Alexandre Dubois', 'Social Worker'), ('00127', 'Germany', 'Brigitte Klein', 'Social Worker'), ('00128', 'Switzerland', 'Peter Schneider', 'Security Aid'), ('00129', 'Germany', 'Peter Becker', 'Medical Aid'), ('00130', 'Switzerland', 'Urli Schneider', 'Security Aid'), ('00131', 'Germany', 'Monika Richter', 'Social Worker'), ('00132', 'France', 'Chloé Camille', 'Social Worker'), ('00133', 'France', 'Thomas Fournier', 'Medical Aid'), ('00134', 'Germany', 'Peter Schulz', 'Social Worker'), ('00135', 'Switzerland', 'Océane Fischer', 'Social Worker'), ('00136', 'Germany', 'Michael Fischer', 'Medical Aid'), ('00137', 'France', 'Thomas Bernard', 'Medical Aid'), ('00138', 'France', 'Sandra Dubois', 'Social Worker'), ('00139', 'Germany', 'Andreas Hoffmann', 'Medical Aid'), ('00140', 'France', 'Maxime Mercier', 'Social Worker'), ('00141', 'France', 'Léa Leroy', 'Social Worker'), ('00142', 'Switzerland', 'Mélissa Meyer', 'Social Worker'), ('00143', 
'Switzerland', 'Hans Frei', 'Social Worker'), ('00144', 'Switzerland', 'Laura Müller', 'Medical Aid'), ('00145', 'Germany', 'Wolfgang Schneider', 'Social Worker'), ('00146', 'Germany', 'Thomas Weber', 'Medical Aid'), ('00147', 'Switzerland', 'Noémie Meyer', 'Social Worker'), ('00148', 'France', 'Manon Dubois', 'Social Worker'), ('00149', 'France', 'Marie Bernard', 'Social Worker'), ('00150', 'France', 'Maxime Nicolas', 'Social Worker'), ('00151', 'Germany', 'Andreas Meyer', 'Medical Aid'), ('00152', 'France', 'Laura Mathieu', 'Social Worker'), ('00153', 'Switzerland', 'Sandra Keller', 'Medical Aid'), ('00154', 'France', 'Alexandre Mercier', 'Social Worker'), ('00155', 'Switzerland', 'Océane Müller', 'Social Worker'), ('00156', 'Switzerland', 'Sarah Gerber', 'Medical Aid'), ('00157', 'France', 'Maxime Lambert', 'Medical Aid'), ('00158', 'Germany', 'Wolfgang Wagner', 'Medical Aid'), ('00159', 'France', 'Laura Leroy', 'Social Worker'), ('00160', 'France', 'Laura Dubois', 'Social Worker'), ('00161', 'Switzerland', 'Peter Weber', 'Social Worker'), ('00162', 'Italy', 'Mattia De Luca', 'Social Worker'), ('00163', 'Germany', 'Gabriele Becker', 'Social Worker'), ('00164', 'France', 'Quentin Camille', 'Social Worker'), ('00165', 'Germany', 'Manfred Bauer', 'Medical Aid'), ('00166', 'Italy', 'Lorenzo Ricci', 'Social Worker'), ('00167', 'France', 'Quentin Petit', 'Social Worker'), ('00168', 'France', 'Thomas Lambert', 'Social Worker'), ('00169', 'Germany', 'Matthias Richter', 'Medical Aid'), ('00170', 'France', 'Thomas Nicolas', 'Social Worker'), ('00171', 'Switzerland', 'Océane Frei', 'Social Worker'), ('00172', 'France', 'Quentin Dubois', 'Social Worker'), ('00173', 'France', 'Léa Nicolas', 'Social Worker'), ('00174', 'Germany', 'Gabriele Schulz', 'Medical Aid'), ('00175', 'Germany', 'Monika Meyer', 'Medical Aid'), ('00176', 'Italy', 'Sofia Moretto', 'Social Worker'), ('00177', 'France', 'Marie Durand', 'Social Worker'), ('00178', 'Switzerland', 'Laura Meyer', 'Social 
Worker'), ('00179', 'Germany', 'Monika Schmidt', 'Social Worker'), ('00180', 'Germany', 'Stefanie Richter', 'Social Worker'), ('00181', 'France', 'Thomas Camille', 'Social Worker'), ('00182', 'France', 'Maxime Bernard', 'Social Worker'), ('00183', 'Germany', 'Gabriele Schneider', 'Security Aid'), ('00184', 'Germany', 'Peter Schmidt', 'Social Worker'), ('00185', 'France', 'Léa Matin', 'Medical Aid'), ('00186', 'Germany', 'Nicole Richter', 'Medical Aid'), ('00187', 'France', 'Philippe Bernard', 'Social Worker'), ('00188', 'Italy', 'Sofia Ricci', 'Medical Aid'), ('00189', 'Switzerland', 'Sandra Keller', 'Medical Aid'), ('00190', 'Germany', 'Gabriele Schulz', 'Social Worker'), ('00191', 'Italy', 'Alessandro Rossi', 'Social Worker'), ('00192', 'France', 'Quentin Mercier', 'Social Worker'), ('00193', 'Italy', 'Francesco Costa', 'Social Worker'), ('00194', 'Germany', 'Peter Müller', 'Social Worker'), ('00195', 'France', 'Nicolas Fontaine', 'Social Worker'), ('00196', 'Germany', 'Wolfgang Richter', 'Medical Aid'), ('00197', 'France', 'Thomas Dubois', 'Social Worker'), ('00198', 'Switzerland', 'Walter Baumann', 'Medical Aid'), ('00199', 'France', 'Alexandre Bernard', 'Medical Aid'), ('00200', 'Germany', 'Peter Wagner', 'Social Worker')]

with open("disaster_mission.txt", "w") as fh:
    fh.write("Reference number,Country,Name,Function\n")
    for el in helpers:
        fh.write(",".join(el) + "\n")
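Joining the fields with "," works here because none of the generated values contains a comma. If that cannot be guaranteed, the csv module of the standard library is a safer choice, since it quotes fields where necessary. The following sketch writes two sample records to an in-memory buffer, which stands in for the file so that the example is self-contained, and reads them back.

```python
import csv
import io

helpers = [("00001", "Germany", "Gabriele Müller", "Medical Aid"),
           ("00002", "Italy", "Mattia De Luca", "Social Worker")]

# The buffer stands in for open("disaster_mission.txt", "w", newline="")
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Reference number", "Country", "Name", "Function"])
writer.writerows(helpers)

# Read the data back to check that nothing was lost
buffer.seek(0)
rows = list(csv.reader(buffer))
print(rows[1])
```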
