summaryrefslogtreecommitdiff
path: root/00_blog/00040_Haskell/00150_Applicative-vs-Monadic-Parsing
diff options
context:
space:
mode:
authorMiguel <m.i@gmx.at>2019-03-17 18:14:32 +0100
committerMiguel <m.i@gmx.at>2019-03-17 18:14:32 +0100
commit0e4810dcfb132bf276a282e25b8523a4009ae08b (patch)
treedac6dce820f0a35d9ed7ea7676982a0f86fd0edb /00_blog/00040_Haskell/00150_Applicative-vs-Monadic-Parsing
parentad6411e9ec256b03f20b9195e25cb128fe02c628 (diff)
rename blog dir
Diffstat (limited to '00_blog/00040_Haskell/00150_Applicative-vs-Monadic-Parsing')
-rw-r--r--00_blog/00040_Haskell/00150_Applicative-vs-Monadic-Parsing/index.md175
1 files changed, 175 insertions, 0 deletions
diff --git a/00_blog/00040_Haskell/00150_Applicative-vs-Monadic-Parsing/index.md b/00_blog/00040_Haskell/00150_Applicative-vs-Monadic-Parsing/index.md
new file mode 100644
index 0000000..6e4418b
--- /dev/null
+++ b/00_blog/00040_Haskell/00150_Applicative-vs-Monadic-Parsing/index.md
@@ -0,0 +1,175 @@
+ March 2019
+# Applicative vs. Monadic Parsing
+
+Some _real world parsers_ as _Parsec_, let us choose between applicative and
+monadic parsing. The applicative style should be sufficient to parse
+_context-free_ languages and is easier to reason about, but it is not capable
+of parsing _context-sensitive_ grammars.
+
+## Functor, Applicative, Alternative, Monad
+
+Before we discuss Applicative and Monadic Parsing, it is important to
+understand what Functors, Applicative Functors, Alternatives and Monads
+have to offer. When in doubt simply look up the type-class definition.
+
+Two well-known data types which are instances of all of the above are
+**Maybe** and **List**. Try it yourself in _GHCi_.
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines}
+-- Maybe as Functor
+fmap (+20) (Just 10)
+fmap (+20) (Nothing)
+
+-- List as Functor
+fmap (+20) [1..10]
+
+-- Maybe as Applicative Functor
+Nothing <*> Just (22)
+Just (+1) <*> Nothing
+Just (+15) <*> Just (22)
+
+-- List as Applicative Functor
+[]<*>[1..5]
+[(+1),(+15)]<*>[]
+[(+1),(+15)]<*>[1..5]
+
+-- Maybe as Alternative
+Nothing <|> Just 5
+Just 5 <|> Nothing
+Just 22 <|> Just 10
+
+-- List as Alternative
+[1..5] <|> [5..10]
+
+-- Maybe as Monad
+Just 5 >>= Just . (+10)
+Just 5 >>= return . (+10)
+Just 10 >>= return . (+5) >>=return . (+10)
+Nothing >>= return . (+5) >>=return . (+10)
+Just 10 >>= \x-> return (x*2) >>=return . (+x`div`2)
+
+-- List as Monad
+[1..10] >>= (:[]) . (*5)
+[1..5] >>= return . (^2)
+[[1..5],[5..10],[],[12]]>>=id
+[[1..5],[5..10],[],[12]]>>= map (*4)
+[1..5] >>= \a -> [100..100+a] >>= \b -> return (a+b)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+## Simple Parser
+
+Our simple toy Parser simply encapsulates a parsing function that returns
+_Nothing_ if parsing fails. Otherwise a _pair_ consisting of the parsed and
+unparsed input inside its respective elements.
+
+Notice that we made our Parser instance of all the type-classes already.
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines}
+data Parser a b = Parser (a->Maybe (b,a))
+
+instance Functor (Parser a) where
+ fmap f p = Parser g
+ where g inp = case parse p inp of Just (x,y) -> Just (f x,y)
+ _ -> Nothing
+
+instance Applicative (Parser a) where
+ pure v = Parser (\inp -> Just (v,inp))
+ pg <*> px = Parser g
+ where g inp = case parse pg inp of Just (f,b) -> parse (fmap f px) b
+ _ -> Nothing
+
+instance Alternative (Parser a) where
+ empty = Parser (\_ -> Nothing)
+ p1 <|> p2 = Parser g
+ where g inp = case parse p1 inp of Nothing -> parse p2 inp
+ x -> x
+
+instance Monad (Parser a) where
+ return = pure
+ p >>= f = Parser g
+ where g inp = case parse p inp of Just (a,b) -> parse (f a) b
+ _ -> Nothing
+
+parse :: Parser a b -> a -> Maybe (b,a)
+parse (Parser f) inp = f inp
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+## Primitives
+
+We also need at least a minimal set of primitives to parse something.
+Note that the primitives can give you the effects of Applicative or
+Alternative even if there's no instance (as dmwit pointed out on #haskell).
+E.g. the _word_ primitive below, could not be composed, by combining the
+_char_ parsers in an applicative style only.
+
+The _num_ parser makes use of the _some_ combinator which requires our
+Parser to be an instance of _Alternative_ as well. _Applicative_
+alone is of limited use in practice since without it, we also lack
+the <|> combinator.
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines}
+word :: String -> Parser String String
+word w = Parser g where l = length w
+ g inp = if take l inp == w then Just (w,drop l inp)
+ else Nothing
+
+satisfy :: (Char -> Bool) -> Parser String Char
+satisfy f = Parser g where g (x:xs) | f x = Just (x,xs)
+ g _ = Nothing
+
+space :: Parser String Char
+space = satisfy (==' ')
+
+char :: Char -> Parser String Char
+char c = satisfy (==c)
+
+notChar :: Char -> Parser String Char
+notChar c = satisfy (/=c)
+
+num :: Parser String Int
+num = read <$> some (satisfy isDigit)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+## Applicative / Alternative parsing
+
+Let's parse something!
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines}
+-- well-formed parentheses are the canonical example of a context-free grammar
+parsePar = concat <$> some parseP where parseP = word "()" <|> (char '(' *> parsePar <* char ')')
+parse parsePar "((()())()())"
+
+-- a simple context-free language with matching pairs of a's and b's, which is not regular
+parseAB = (\a x b -> a:x++"b") <$> char 'a' <*> parseAB <*> char 'b' <|> word "ab"
+parse parseAB "aaaaabbbbb"
+
+-- parse basic mathematical operations and calulate the results.
+parseOp op sig = (pure (op) <* many space <*> num <* many space <* char sig <* many space <*> num )
+parse (parseOp (+) '+' <|> parseOp (-) '-' <|> parseOp (*) '*' <|> parseOp (div) '/') " 111 * 747 "
+parse (parseOp (+) '+' <|> parseOp (-) '-' <|> parseOp (*) '*' <|> parseOp (div) '/') "111+747"
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+## Monadic parsing of context-sensitive grammars
+
+The canonical non-context-free language can not be caputred by applicative parsing anymore: $$\{a^n b^n c^n : n \geqslant 1\}$$
+
+The following monadic parser will work here:
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines}
+parseABC = length <$> (many $ char 'a') >>= \la ->
+ length <$> (many $ char 'b') >>= \lb ->
+ length <$> (many $ char 'c') >>= \lc ->
+ if la==lb&&lb==lc then pure ("ABCok") else empty
+
+parse parseABC "aaaaabbbbbccccc"
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+## Summary
+
+As usual, use the simplest tool that will suffice :smile:
+
+## Ref / Further Reading
+
+* <http://dev.stephendiehl.com/fun/002_parsers.html>
+* <https://eli.thegreenplace.net/2017/deciphering-haskells-applicative-and-monadic-parsers/>
+* <https://byorgey.wordpress.com/2012/01/05/parsing-context-sensitive-languages-with-applicative/>