1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
|
March 2019
# Applicative vs. Monadic Parsing
Some _real world parsers_ as _Parsec_, let us choose between applicative and
monadic parsing. The applicative style should be sufficient to parse
_context-free_ languages and is easier to reason about, but it is not capable
of parsing _context-sensitive_ grammars.
## Functor, Applicative, Alternative, Monad
Before we discuss Applicative and Monadic Parsing, it is important to
understand what Functors, Applicative Functors, Alternatives and Monads
have to offer. When in doubt simply look up the type-class definition.
Two well-known data types which are instances of all of the above are
**Maybe** and **List**. Try it yourself in _GHCi_.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines}
-- Maybe as Functor
fmap (+20) (Just 10)
fmap (+20) (Nothing)
-- List as Functor
fmap (+20) [1..10]
-- Maybe as Applicative Functor
Nothing <*> Just (22)
Just (+1) <*> Nothing
Just (+15) <*> Just (22)
-- List as Applicative Functor
[]<*>[1..5]
[(+1),(+15)]<*>[]
[(+1),(+15)]<*>[1..5]
-- Maybe as Alternative
Nothing <|> Just 5
Just 5 <|> Nothing
Just 22 <|> Just 10
-- List as Alternative
[1..5] <|> [5..10]
-- Maybe as Monad
Just 5 >>= Just . (+10)
Just 5 >>= return . (+10)
Just 10 >>= return . (+5) >>=return . (+10)
Nothing >>= return . (+5) >>=return . (+10)
Just 10 >>= \x-> return (x*2) >>=return . (+x`div`2)
-- List as Monad
[1..10] >>= (:[]) . (*5)
[1..5] >>= return . (^2)
[[1..5],[5..10],[],[12]]>>=id
[[1..5],[5..10],[],[12]]>>= map (*4)
[1..5] >>= \a -> [100..100+a] >>= \b -> return (a+b)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Simple Parser
Our simple toy Parser simply encapsulates a parsing function that returns
_Nothing_ if parsing fails. Otherwise a _pair_ consisting of the parsed and
unparsed input inside its respective elements.
Notice that we made our Parser instance of all the type-classes already.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines}
data Parser a b = Parser (a->Maybe (b,a))
instance Functor (Parser a) where
fmap f p = Parser g
where g inp = case parse p inp of Just (x,y) -> Just (f x,y)
_ -> Nothing
instance Applicative (Parser a) where
pure v = Parser (\inp -> Just (v,inp))
pg <*> px = Parser g
where g inp = case parse pg inp of Just (f,b) -> parse (fmap f px) b
_ -> Nothing
instance Alternative (Parser a) where
empty = Parser (\_ -> Nothing)
p1 <|> p2 = Parser g
where g inp = case parse p1 inp of Nothing -> parse p2 inp
x -> x
instance Monad (Parser a) where
return = pure
p >>= f = Parser g
where g inp = case parse p inp of Just (a,b) -> parse (f a) b
_ -> Nothing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Primitives
We also need at least a minimal set of primitives to parse something.
Note that the primitives can give you the effects of Applicative or
Alternative even if the Parser itself is not an instance of these.
(Thanks to dmwit for pointing this out on #haskell).
The _num_ parser makes use of the _some_ combinator which requires our
Parser to be an instance of _Alternative_ as well. _Applicative_
alone is of limited use in practice since without it, we also lack
the <|> combinator.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines}
satisfy :: (Char -> Bool) -> Parser String Char
satisfy f = Parser g where g (x:xs) | f x = Just (x,xs)
g _ = Nothing
space :: Parser String Char
space = satisfy (==' ')
char :: Char -> Parser String Char
char c = satisfy (==c)
notChar :: Char -> Parser String Char
notChar c = satisfy (/=c)
num :: Parser String Int
num = read <$> some (satisfy isDigit)
word :: String -> Parser String String
word (c:cs) = (:) <$> char c <*> word cs
word _ = pure ""
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Applicative / Alternative parsing
Now we are able to parse any context free grammar, given our
_satisfy_ Parser along with Applicative, Alternative, and recursion.
Let's parse something!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines}
-- we parse, by simply applying the function encapsulated in our Parser on the input.
parse (Parser f) inp = f inp
-- well-formed parentheses are the canonical example of a context-free grammar
parsePar = concat <$> some parseP where parseP = word "()" <|> (char '(' *> parsePar <* char ')')
parse parsePar "((()())()())"
-- a simple context-free language with matching pairs of a's and b's, which is not regular
parseAB = (\a x b -> a:x++"b") <$> char 'a' <*> parseAB <*> char 'b' <|> word "ab"
parse parseAB "aaaaabbbbb"
-- parse basic mathematical operations and calulate the results.
parseOp op sig = (pure (op) <* many space <*> num <* many space <* char sig <* many space <*> num )
parse (parseOp (+) '+' <|> parseOp (-) '-' <|> parseOp (*) '*' <|> parseOp (div) '/') " 111 * 747 "
parse (parseOp (+) '+' <|> parseOp (-) '-' <|> parseOp (*) '*' <|> parseOp (div) '/') "111+747"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Monadic parsing of context-sensitive grammars
The canonical non-context-free language can not be caputred by applicative parsing anymore:
\$\$\\{a^n b^n c^n : n \\geqslant 1\\}$\$
The following monadic parser will work here:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines}
parseABC = length <$> (many $ char 'a') >>= \la ->
length <$> (many $ char 'b') >>= \lb ->
length <$> (many $ char 'c') >>= \lc ->
if la==lb&&lb==lc then pure ("ABCok") else empty
parse parseABC "aaaaabbbbbccccc"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Git Repo
Here you can find the complete source code of this little toy parser
along with the examples presented in this article:
<https://gitweb.softwarefools.com/?p=miguel/haskell.git;a=blob;f=parser/main.hs>
## Summary
As usual, use the simplest tool that will suffice :smile:
## Ref / Further Reading
* <http://dev.stephendiehl.com/fun/002_parsers.html>
* <https://eli.thegreenplace.net/2017/deciphering-haskells-applicative-and-monadic-parsers/>
* <https://byorgey.wordpress.com/2012/01/05/parsing-context-sensitive-languages-with-applicative/>
* <https://arunraghavan.net/2018/02/applicative-functors-for-fun-and-parsing/>
|