Headline

The process or the art of creating a data model

Illustration

We go through a hypothetical software evolution timeline for a human resources system like System:Company, covering different data modeling options along the way. Similar variations are exercised in Script:Data_modeling_in_Haskell. (In fact, this illustration has been designed to support the script on data modeling in Haskell.) In what follows, we focus on the changes to the data-type definitions; see the repo for the complete co-changes, as they also affect the sample data and the computations (such as totaling salaries).

V1: A company as a list of employees

Let's say that this is the data model of our initial system.

type Company = [Employee]
type Employee = (String, Float)

Here is some sample data and a function for totaling salaries:

sampleCompany :: Company
sampleCompany = [e1, e2, e3]
  where
    e1 = ("Max", 42)
    e2 = ("Nina", 77)
    e3 = ("Sean", 66)
 
total :: Company -> Float
total [] = 0
total ((_, s):es) = s + total es

V2: Flat companies versus companies with departments

Now let's enable departments; these are named units that group employees, with one employee serving as manager. In fact, the data model provides for two options: flat companies (as before) and companies with departments. To this end, we use the Either type constructor. Let's also make space for giving a name to companies.

type Company = Either [Employee] [Department]
type Department = (String, Employee, [Employee])
...
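
The totaling function has to follow this type change. Here is a minimal sketch, assuming the tuple-based Employee type from V1; the helper names 'totalEmployees' and 'totalDepartment' and the sample data are hypothetical and not taken from the repo.

```haskell
type Company = Either [Employee] [Department]
type Department = (String, Employee, [Employee])
type Employee = (String, Float)

-- Sum the salaries in a plain list of employees.
totalEmployees :: [Employee] -> Float
totalEmployees es = sum [s | (_, s) <- es]

-- A department contributes its manager's salary plus those of its employees.
totalDepartment :: Department -> Float
totalDepartment (_, (_, managerSalary), es) = managerSalary + totalEmployees es

-- 'either' takes one function per alternative (Left and Right).
total :: Company -> Float
total = either totalEmployees (sum . map totalDepartment)

-- Hypothetical sample data for both options.
flatCompany :: Company
flatCompany = Left [("Max", 42), ("Nina", 77), ("Sean", 66)]

departmentCompany :: Company
departmentCompany = Right [("Research", ("Nina", 77), [("Max", 42), ("Sean", 66)])]
```

Both sample companies contain the same three employees, so 'total' should agree on them even though the data is shaped differently.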

V3: Use of extra type synonyms for clarity

This is just a refactoring. For clarity, we introduce a few type synonyms. We dedicate type synonyms to names and salaries so that we don't confuse them too easily with other uses of strings and floats. Also, we dedicate a type synonym to the manager role of an employee. In all fairness, type synonyms don't add any type-checking benefits here, because a synonym and its definition are interchangeable, but the code is arguably better documented.

type Company = Either [Employee] [Department]
type Employee = (Name, Salary)
type Department = (Name, Manager, [Employee])
type Manager = Employee
type Name = String
type Salary = Float
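
The limits of type synonyms can be seen in a small sketch. The function 'raise' and the value 'officeTemperature' are hypothetical names introduced only for this illustration.

```haskell
type Employee = (Name, Salary)
type Name = String
type Salary = Float

-- The signature documents the intent, but Salary is just another name for Float.
raise :: Salary -> Employee -> Employee
raise amount (n, s) = (n, s + amount)

-- A plain Float that was never meant to be a salary.
officeTemperature :: Float
officeTemperature = 21.5

-- Accepted by the type checker, because Float and Salary are the same type.
oddRaise :: Employee
oddRaise = raise officeTemperature ("Max", 42)
```

The synonym makes the signature of 'raise' easier to read, but it does not stop a caller from passing an unrelated float.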

V4: Making department manager optional

The manager position in a department may not always be filled, so we had better make it optional. To this end, we use the Maybe type constructor.

...
type Manager = Maybe Employee
...
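
Code that consumes a department now has to handle the vacant case. Here is a minimal sketch, assuming the V3 synonyms; 'managerSalary', 'totalDepartment', and the sample departments are hypothetical.

```haskell
type Department = (Name, Manager, [Employee])
type Manager = Maybe Employee
type Employee = (Name, Salary)
type Name = String
type Salary = Float

-- A vacant manager position (Nothing) contributes no salary.
managerSalary :: Manager -> Salary
managerSalary = maybe 0 snd

-- Total of a department: the manager, if any, plus all employees.
totalDepartment :: Department -> Salary
totalDepartment (_, m, es) = managerSalary m + sum (map snd es)

-- Hypothetical sample data: one managed and one unmanaged department.
research, development :: Department
research = ("Research", Just ("Nina", 77), [("Max", 42)])
development = ("Development", Nothing, [("Sean", 66)])
```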

V5: Enabling nested department structure

Let's also allow for sub-departments and sub-sub-departments, etc. Here is our first attempt.

...
type Department = (Name, Manager, [Employee], [Department])
...

This declaration is invalid in Haskell: type synonyms must not be recursive, because they carry a macro-expansion semantics of sorts. That is, any reference to a type synonym is internally replaced by its definition, so a recursive synonym would expand forever. We need to use a form of type that is fit for recursion. In Haskell, we use algebraic data types.

...
data Department = MakeDepartment Name Manager [Employee] [Department]
...

The left-hand side occurrence of 'Department' introduces a new type. The right-hand side occurrence of 'MakeDepartment' is a constructor, which groups the various constructor components that were previously grouped as a product (a tuple type). For what it's worth, in Haskell, it is idiomatic to use the same name for data type and constructor when we only have one constructor. (We will see an example of a data type with more than one constructor in a second.) Thus, we go with the following version:

...
data Department = Department Name Manager [Employee] [Department]
...
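
A recursive data type is typically consumed by a recursive function. Here is a minimal sketch, assuming the V4 synonyms; 'totalDepartment' and the sample data are hypothetical.

```haskell
data Department = Department Name Manager [Employee] [Department]
type Manager = Maybe Employee
type Employee = (Name, Salary)
type Name = String
type Salary = Float

-- The function recurses exactly where the data type does: on sub-departments.
totalDepartment :: Department -> Salary
totalDepartment (Department _ m es subs) =
  maybe 0 snd m + sum (map snd es) + sum (map totalDepartment subs)

-- Hypothetical sample data: a department with one sub-department.
research :: Department
research = Department "Research" (Just ("Nina", 77)) [("Max", 42)] [labs]
  where
    labs = Department "Labs" Nothing [("Sean", 66)] []
```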

V6: Replacing Either by a problem-specific type

The Either type in the definition of 'Company' can also be modeled with a problem-specific algebraic data type whose constructor names may better convey the meaning of the two options.

data Company
  = FlatCompany [Employee]
  | HierarchicalCompany [Department]
...
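
With a problem-specific type, functions pattern match on descriptive constructor names rather than on Left and Right. Here is a minimal sketch, assuming the tuple-based Employee and the Department type from V5; the functions and the sample data are hypothetical.

```haskell
data Company
  = FlatCompany [Employee]
  | HierarchicalCompany [Department]

data Department = Department Name Manager [Employee] [Department]
type Manager = Maybe Employee
type Employee = (Name, Salary)
type Name = String
type Salary = Float

-- One equation per constructor replaces the Left/Right case distinction.
total :: Company -> Salary
total (FlatCompany es) = sum (map snd es)
total (HierarchicalCompany ds) = sum (map totalDepartment ds)

totalDepartment :: Department -> Salary
totalDepartment (Department _ m es subs) =
  maybe 0 snd m + sum (map snd es) + sum (map totalDepartment subs)

-- Hypothetical sample data for both constructors.
flat, hierarchical :: Company
flat = FlatCompany [("Max", 42), ("Nina", 77)]
hierarchical = HierarchicalCompany [Department "Research" (Just ("Nina", 77)) [("Max", 42)] []]
```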

V7: Replacing a tuple type by problem-specific type

We can also make a type distinction for employees so that we don't represent them as pairs. We may use an algebraic data type again:

...
data Employee = Employee Name Salary
...

When using plain constructors of algebraic data types (in Haskell), the different constructor components are only accessible by position. We can also switch to record types, which provide selectors for accessing constructor components. In this manner, the role of each component is documented even better, and we may even drop some of the extra type synonyms, as the selectors may sufficiently document the purpose of the components. In Haskell, record types are really sugared constructors of algebraic data types.

...
data Employee = Employee {
    getEmployeeName :: String,
    getSalary :: Float
}
...

We can also sugar the type for departments as a record type:

...
data Department = Department {
    getDepartmentName :: String,
    getManager :: Maybe Employee,
    getEmployees :: [Employee],
    getSubDepartments :: [Department]
}
...
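
Selectors are ordinary functions, and record syntax also provides field-wise update. Here is a minimal sketch that uses both; 'totalDepartment', 'cut', the deriving clauses, and the sample data are hypothetical additions.

```haskell
data Employee = Employee
  { getEmployeeName :: String
  , getSalary :: Float
  } deriving (Eq, Show)

data Department = Department
  { getDepartmentName :: String
  , getManager :: Maybe Employee
  , getEmployees :: [Employee]
  , getSubDepartments :: [Department]
  } deriving (Eq, Show)

-- Selectors replace positional pattern matching.
totalDepartment :: Department -> Float
totalDepartment d =
  maybe 0 getSalary (getManager d)
    + sum (map getSalary (getEmployees d))
    + sum (map totalDepartment (getSubDepartments d))

-- Record update copies a value while replacing the named field.
cut :: Employee -> Employee
cut e = e { getSalary = getSalary e / 2 }

-- Hypothetical sample data; positional construction (Employee "Max" 42) still works.
research :: Department
research = Department
  { getDepartmentName = "Research"
  , getManager = Just (Employee "Nina" 77)
  , getEmployees = [Employee "Max" 42, Employee "Sean" 66]
  , getSubDepartments = []
  }
```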

V8: Adopting a flexible department structure

Suppose we could have multiple managers per department. In this case, we would go from 'Maybe' to a list type for the manager position. Let's go further than that and make the department structure very flexible. That is, we submit that a department is a collection of units, where a unit can be a manager, a regular employee, or a sub-department. When we say 'regular employee', let's be more concrete and call it an 'engineer'.

...
data Department = Department {
    getDepartmentName :: String,
    getDepartmentUnits :: [DepartmentUnit]
}
data DepartmentUnit 
  = ManagerUnit Employee
  | EngineerUnit Employee
  | SubDepartmentUnit Department
...
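
With this structure, a function over departments has one case per kind of unit. Here is a minimal sketch; 'totalUnit', 'totalDepartment', 'managers', and the sample data are hypothetical.

```haskell
data Employee = Employee
  { getEmployeeName :: String
  , getSalary :: Float
  }

data Department = Department
  { getDepartmentName :: String
  , getDepartmentUnits :: [DepartmentUnit]
  }

data DepartmentUnit
  = ManagerUnit Employee
  | EngineerUnit Employee
  | SubDepartmentUnit Department

-- One equation per kind of unit; sub-departments recurse.
totalUnit :: DepartmentUnit -> Float
totalUnit (ManagerUnit e) = getSalary e
totalUnit (EngineerUnit e) = getSalary e
totalUnit (SubDepartmentUnit d) = totalDepartment d

totalDepartment :: Department -> Float
totalDepartment = sum . map totalUnit . getDepartmentUnits

-- The direct managers of a department; there may be none, one, or several.
managers :: Department -> [Employee]
managers d = [e | ManagerUnit e <- getDepartmentUnits d]

-- Hypothetical sample data: two managers, one engineer, one sub-department.
research :: Department
research = Department "Research"
  [ ManagerUnit (Employee "Nina" 77)
  , ManagerUnit (Employee "Max" 42)
  , EngineerUnit (Employee "Sean" 66)
  , SubDepartmentUnit (Department "Labs" [])
  ]
```

The flexibility has a price: the type no longer rules out departments without any manager, so such constraints would have to be checked by code.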

V9: Making a type distinction for salaries

This is just a refactoring. We would like to make sure that we don't confuse salaries with other uses of floats too easily. Thus, we set up a newtype; this is a limited form of algebraic data type with exactly one constructor and one constructor component. Newtypes serve no purpose other than making type distinctions; general algebraic data types additionally provide choices (by means of different constructors) and grouping of data (by means of several constructor components).

...
data Employee = Employee {
    getEmployeeName :: String,
    getSalary :: Salary
}
newtype Salary = Salary Float
...
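
Unlike a type synonym, the newtype forces explicit wrapping and unwrapping. Here is a minimal sketch; 'fromSalary', 'total', 'cut', and the deriving clause are hypothetical additions.

```haskell
newtype Salary = Salary Float
  deriving (Eq, Show)

data Employee = Employee
  { getEmployeeName :: String
  , getSalary :: Salary
  }

-- Unwrapping is explicit: a Salary is not a Float.
fromSalary :: Salary -> Float
fromSalary (Salary f) = f

-- Totaling unwraps each salary before summing.
total :: [Employee] -> Float
total = sum . map (fromSalary . getSalary)

-- Halving a salary unwraps, computes, and wraps again.
cut :: Employee -> Employee
cut e = e { getSalary = Salary (fromSalary (getSalary e) / 2) }

-- Rejected by the type checker: a Float is given where a Salary is expected.
--   Employee "Max" (21.5 :: Float)
-- A bare literal such as 42 is rejected as well, unless Salary is given a Num instance.
```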


Ralf Lämmel edited this article at Thu, 14 May 2020 22:45:14 +0200