Contribution:
haskellAcceptor
Headline
Parsing (acceptance only) in Haskell with Parsec
Motivation
The implementation demonstrates parsing (acceptance) in Haskell with the Parsec library of parser combinators. A concrete textual syntax for companies is assumed. Only acceptance is considered; thus, no abstract syntax is constructed. We set up basic parsers for quoted strings and floating-point numbers. Further, we compose parsers for companies, departments, and employees using appropriate parser combinators for sequences, alternatives, and optionality. By design, the acceptor is kept simple in terms of the programming techniques it uses; in particular, monadic style and applicative functors are avoided to the extent possible.
Illustration
We would like to process a textual representation of companies; "..." indicates an elision:
company "Acme Corporation" {
  department "Research" {
    manager "Craig" {
      address "Redmond"
      salary 123456.0
    }
    employee "Erik" {
      address "Utrecht"
      salary 12345.0
    }
    employee "Ralf" {
      address "Koblenz"
      salary 1234.0
    }
  }
  department "Development" {
    ...
  }
}
Let's assume that the textual representation is defined by the following context-free grammar:
company = "company" literal "{" department* "}"
department = "department" literal "{" manager subunit* "}"
subunit = nonmanager | department
manager = "manager" employee
nonmanager = "employee" employee
employee = literal "{" "address" literal "salary" float "}"
We can now apply a mapping from the grammar to a functional program in the following way:
- Each nonterminal becomes a function that is of Parsec's parser type.
- The function definition composes parsers following the production's structure.
- We may need to deal with lexical trivia such as spaces.
- We may want to check for the end-of-file to be sure to have looked at the complete input.
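The last two points can be sketched as follows. This is only an illustration under stated assumptions: the definition of parseString given here (a keyword followed by optional spaces) is a guess at a helper whose definition is not shown on this page, and wholeInput and accepts are hypothetical names introduced for the example. The Acceptor type is the one defined further down on this page.

```haskell
import Text.Parsec

-- The acceptor type as defined on this page
type Acceptor = Parsec String () ()

-- Assumed definition: accept a fixed keyword or symbol, then skip trailing spaces
parseString :: String -> Acceptor
parseString s = string s >> spaces

-- Hypothetical helper: skip leading spaces, run an acceptor, require end of input
wholeInput :: Acceptor -> Acceptor
wholeInput p = spaces >> p >> eof

-- Hypothetical helper: True if the acceptor accepts the complete input
accepts :: Acceptor -> String -> Bool
accepts p input =
  either (const False) (const True) (parse (wholeInput p) "input" input)
```

With these definitions, accepts (parseString "company") "company x" would be False, because eof rejects the leftover "x" even though the keyword itself was accepted.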
Here is the parser function for departments:
-- Accept a department
parseDepartment :: Acceptor
parseDepartment
  =  parseString "department"
  >> parseLiteral
  >> parseString "{"
  >> parseManager
  >> many parseSubUnit
  >> parseString "}"
The definition uses ">>" for sequential composition, just as the production for departments uses juxtaposition to sequence its terminals and nonterminals. The type Acceptor is defined as a Parsec parser type whose result type (the type of parse trees) is "()":
-- The parser type for simple acceptors
type Acceptor = Parsec String () ()
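To make the role of the "()" result concrete, here is a minimal sketch of how such an acceptor might be run. The names toAcceptor, parseWord, and runAcceptor are hypothetical and not part of the contribution; parse is Parsec's standard entry point.

```haskell
import Text.Parsec

-- The parser type for simple acceptors, as above
type Acceptor = Parsec String () ()

-- Hypothetical helper: forget the result of any parser, keeping only acceptance
toAcceptor :: Parsec String () a -> Acceptor
toAcceptor p = p >> return ()

-- Toy acceptor (not part of the contribution): one or more letters
parseWord :: Acceptor
parseWord = toAcceptor (many1 letter)

-- Running an acceptor yields Right () on success and Left with a ParseError otherwise
runAcceptor :: Acceptor -> String -> Either ParseError ()
runAcceptor p = parse p "input"
```

Since every successful run returns the same value, the only information an acceptor conveys is whether the input was accepted and, on failure, the error that Parsec reports.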
We also need parsers for the basic units of input: literals (strings) and floats. Here is the parser for floats:
-- Accept a float
parseFloat :: Acceptor
parseFloat
  =  many1 digit
  >> char '.'
  >> many1 digit
  >> spaces
  >> return ()
That is, a float is defined to start with a non-empty sequence of digits, followed by ".", followed by another non-empty sequence of digits. In addition, any pending spaces are consumed as well. Finally, "()" is returned as the trivial parse tree of such an acceptor.
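For orientation, the remaining productions of the grammar could be written in the same style. The following is a sketch under stated assumptions, not the contribution's actual code: parseString and parseLiteral are given assumed definitions (in particular, literals are taken to be double-quoted with no escape sequences), and the names parseCompany, parseNonmanager, parseEmployee, and acceptCompany are chosen for this example. The alternative in the subunit production is expressed with Parsec's "<|>"; no backtracking is needed because the two alternatives start with different characters.

```haskell
import Text.Parsec

type Acceptor = Parsec String () ()

-- Assumed helper: accept a keyword or symbol and skip trailing spaces
parseString :: String -> Acceptor
parseString s = string s >> spaces

-- Assumed literal syntax: double quotes around any characters
-- other than a double quote (no escape sequences)
parseLiteral :: Acceptor
parseLiteral = char '"' >> many (noneOf "\"") >> char '"' >> spaces

-- Float acceptor as shown on this page
parseFloat :: Acceptor
parseFloat = many1 digit >> char '.' >> many1 digit >> spaces >> return ()

-- company = "company" literal "{" department* "}"
parseCompany :: Acceptor
parseCompany
  =  parseString "company"
  >> parseLiteral
  >> parseString "{"
  >> many parseDepartment
  >> parseString "}"

-- department = "department" literal "{" manager subunit* "}"
parseDepartment :: Acceptor
parseDepartment
  =  parseString "department"
  >> parseLiteral
  >> parseString "{"
  >> parseManager
  >> many parseSubUnit
  >> parseString "}"

-- subunit = nonmanager | department
parseSubUnit :: Acceptor
parseSubUnit = parseNonmanager <|> parseDepartment

-- manager = "manager" employee
parseManager :: Acceptor
parseManager = parseString "manager" >> parseEmployee

-- nonmanager = "employee" employee
parseNonmanager :: Acceptor
parseNonmanager = parseString "employee" >> parseEmployee

-- employee = literal "{" "address" literal "salary" float "}"
parseEmployee :: Acceptor
parseEmployee
  =  parseLiteral
  >> parseString "{"
  >> parseString "address"
  >> parseLiteral
  >> parseString "salary"
  >> parseFloat
  >> parseString "}"

-- Hypothetical entry point: skip leading spaces and require end of input
acceptCompany :: String -> Bool
acceptCompany =
  either (const False) (const True)
    . parse (spaces >> parseCompany >> eof) "input"
```

Under these assumptions, acceptCompany rejects a department without a manager, since the grammar requires exactly one manager before any subunits.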
Relationships
- Contribution:haskellParsec advances this acceptor into a proper parser.
- Contribution:antlrAcceptor and others use the same textual representation.
Architecture
There are these modules:
- Main: acceptance test
- Company/Parser: the actual parser (acceptor)
Usage
See https://github.com/101companies/101haskell/blob/master/README.md.