diff options
| author | Miguel <m.i@gmx.at> | 2019-03-17 18:14:32 +0100 |
|---|---|---|
| committer | Miguel <m.i@gmx.at> | 2019-03-17 18:14:32 +0100 |
| commit | 0e4810dcfb132bf276a282e25b8523a4009ae08b (patch) | |
| tree | dac6dce820f0a35d9ed7ea7676982a0f86fd0edb /00_blog/00040_Haskell/00150_Applicative-vs-Monadic-Parsing | |
| parent | ad6411e9ec256b03f20b9195e25cb128fe02c628 (diff) | |
rename blog dir
Diffstat (limited to '00_blog/00040_Haskell/00150_Applicative-vs-Monadic-Parsing')
| -rw-r--r-- | 00_blog/00040_Haskell/00150_Applicative-vs-Monadic-Parsing/index.md | 175 |
1 files changed, 175 insertions, 0 deletions
diff --git a/00_blog/00040_Haskell/00150_Applicative-vs-Monadic-Parsing/index.md b/00_blog/00040_Haskell/00150_Applicative-vs-Monadic-Parsing/index.md new file mode 100644 index 0000000..6e4418b --- /dev/null +++ b/00_blog/00040_Haskell/00150_Applicative-vs-Monadic-Parsing/index.md @@ -0,0 +1,175 @@ + March 2019 +# Applicative vs. Monadic Parsing + +Some _real world parsers_ as _Parsec_, let us choose between applicative and +monadic parsing. The applicative style should be sufficient to parse +_context-free_ languages and is easier to reason about, but it is not capable +of parsing _context-sensitive_ grammars. + +## Functor, Applicative, Alternative, Monad + +Before we discuss Applicative and Monadic Parsing, it is important to +understand what Functors, Applicative Functors, Alternatives and Monads +have to offer. When in doubt simply look up the type-class definition. + +Two well-known data types which are instances of all of the above are +**Maybe** and **List**. Try it yourself in _GHCi_. + +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines} +-- Maybe as Functor +fmap (+20) (Just 10) +fmap (+20) (Nothing) + +-- List as Functor +fmap (+20) [1..10] + +-- Maybe as Applicative Functor +Nothing <*> Just (22) +Just (+1) <*> Nothing +Just (+15) <*> Just (22) + +-- List as Applicative Functor +[]<*>[1..5] +[(+1),(+15)]<*>[] +[(+1),(+15)]<*>[1..5] + +-- Maybe as Alternative +Nothing <|> Just 5 +Just 5 <|> Nothing +Just 22 <|> Just 10 + +-- List as Alternative +[1..5] <|> [5..10] + +-- Maybe as Monad +Just 5 >>= Just . (+10) +Just 5 >>= return . (+10) +Just 10 >>= return . (+5) >>=return . (+10) +Nothing >>= return . (+5) >>=return . (+10) +Just 10 >>= \x-> return (x*2) >>=return . (+x`div`2) + +-- List as Monad +[1..10] >>= (:[]) . (*5) +[1..5] >>= return . (^2) +[[1..5],[5..10],[],[12]]>>=id +[[1..5],[5..10],[],[12]]>>= map (*4) +[1..5] >>= \a -> [100..100+a] >>= \b -> return (a+b) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +## Simple Parser + +Our simple toy Parser simply encapsulates a parsing function that returns +_Nothing_ if parsing fails. Otherwise a _pair_ consisting of the parsed and +unparsed input inside its respective elements. + +Notice that we made our Parser instance of all the type-classes already. + +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines} +data Parser a b = Parser (a->Maybe (b,a)) + +instance Functor (Parser a) where + fmap f p = Parser g + where g inp = case parse p inp of Just (x,y) -> Just (f x,y) + _ -> Nothing + +instance Applicative (Parser a) where + pure v = Parser (\inp -> Just (v,inp)) + pg <*> px = Parser g + where g inp = case parse pg inp of Just (f,b) -> parse (fmap f px) b + _ -> Nothing + +instance Alternative (Parser a) where + empty = Parser (\_ -> Nothing) + p1 <|> p2 = Parser g + where g inp = case parse p1 inp of Nothing -> parse p2 inp + x -> x + +instance Monad (Parser a) where + return = pure + p >>= f = Parser g + where g inp = case parse p inp of Just (a,b) -> parse (f a) b + _ -> Nothing + +parse :: Parser a b -> a -> Maybe (b,a) +parse (Parser f) inp = f inp +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +## Primitives + +We also need at least a minimal set of primitives to parse something. +Note that the primitives can give you the effects of Applicative or +Alternative even if there's no instance (as dmwit pointed out on #haskell). +E.g. the _word_ primitive below, could not be composed, by combining the +_char_ parsers in an applicative style only. + +The _num_ parser makes use of the _some_ combinator which requires our +Parser to be an instance of _Alternative_ as well. _Applicative_ +alone is of limited use in practice since without it, we also lack +the <|> combinator. + +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines} +word :: String -> Parser String String +word w = Parser g where l = length w + g inp = if take l inp == w then Just (w,drop l inp) + else Nothing + +satisfy :: (Char -> Bool) -> Parser String Char +satisfy f = Parser g where g (x:xs) | f x = Just (x,xs) + g _ = Nothing + +space :: Parser String Char +space = satisfy (==' ') + +char :: Char -> Parser String Char +char c = satisfy (==c) + +notChar :: Char -> Parser String Char +notChar c = satisfy (/=c) + +num :: Parser String Int +num = read <$> some (satisfy isDigit) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +## Applicative / Alternative parsing + +Let's parse something! + +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines} +-- well-formed parentheses are the canonical example of a context-free grammar +parsePar = concat <$> some parseP where parseP = word "()" <|> (char '(' *> parsePar <* char ')') +parse parsePar "((()())()())" + +-- a simple context-free language with matching pairs of a's and b's, which is not regular +parseAB = (\a x b -> a:x++"b") <$> char 'a' <*> parseAB <*> char 'b' <|> word "ab" +parse parseAB "aaaaabbbbb" + +-- parse basic mathematical operations and calulate the results. +parseOp op sig = (pure (op) <* many space <*> num <* many space <* char sig <* many space <*> num ) +parse (parseOp (+) '+' <|> parseOp (-) '-' <|> parseOp (*) '*' <|> parseOp (div) '/') " 111 * 747 " +parse (parseOp (+) '+' <|> parseOp (-) '-' <|> parseOp (*) '*' <|> parseOp (div) '/') "111+747" +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +## Monadic parsing of context-sensitive grammars + +The canonical non-context-free language can not be caputred by applicative parsing anymore: $$\{a^n b^n c^n : n \geqslant 1\}$$ + +The following monadic parser will work here: + +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines} +parseABC = length <$> (many $ char 'a') >>= \la -> + length <$> (many $ char 'b') >>= \lb -> + length <$> (many $ char 'c') >>= \lc -> + if la==lb&&lb==lc then pure ("ABCok") else empty + +parse parseABC "aaaaabbbbbccccc" +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +## Summary + +As usual, use the simplest tool that will suffice :smile: + +## Ref / Further Reading + +* <http://dev.stephendiehl.com/fun/002_parsers.html> +* <https://eli.thegreenplace.net/2017/deciphering-haskells-applicative-and-monadic-parsers/> +* <https://byorgey.wordpress.com/2012/01/05/parsing-context-sensitive-languages-with-applicative/> |
