Java Parser Combinator (JPC)

Thu, 24 January, 2002
atze@cs.uu.nl
What&why | Writing a Parser | Downloading

Home

Home CS
Courses & Education

Software

Research, Publications & Projects

Links & References

Personal
Educational portfolio
 
 

These pages written, maintained, and © by Atze Dijkstra.

 

What?

The package nl.uu.cs.jpc offers parser combinators written in Java. The current version is based on version 2.56 of the errorcorrecting parsercombinators written by Doaitse Swierstra. Currently, only a demo applet is available. A paper is under construction, a small tutorial can be found further.

Note: The previous version of the package is based on upon the combinators written for Hugs used in the course Grammatica's en Ontleden, including some of the examples used in the course. It is not supported but left here for the time being.

Why?

Ever used yacc + lex or equivalents? And thrilled with the speed or integration with the used language (like C, C++, Java) but frustrated by the difficulty of manipulating higher-order concepts like abstract syntax trees? Or you did use a functional programming language like Haskell in which parsercombinators can be used easily to create and manipulate grammars and the parsing of text described by a parser. But you could not find a way to incorporate a parser combinator in a production program written in an imperative language? This package hopes to combine best of both worlds: parser combinators written in Java.

Writing a Parser

The basic idea of writing a parser in Java using the parsercombinator package is to write a parser using the errorcorrecting parsercombinators written in Haskell, checking them for correctness and then handcompile the resulting parser to Java. The functional behavior is simulated using the JFC library, existing parsers from the errorcorrecting parsercombinators are found in the JPC library.

As an example the code for a (very) simple expression parser is used:

module Expr1Simple where import UU_Parsing instance Symbol Char pDigit = (\d -> ord d - ord '0') <$> pAnyOf ['0'..'9'] pNat = foldl (\a b -> a*10 + b) 0 <$> pSome pDigit pParens = pPacked (pSym '(') (pSym ')') pFact = pNat <|> pParens pExpr pTerm = pChainl ( (*) <$ pSym '*' <|> div <$ pSym '/' ) pFact pExpr = pChainl ( (+) <$ pSym '+' <|> (-) <$ pSym '-' ) pTerm t p inp = let (r, rest, errors) = parse p null inp in do putStr (show r) putStr (show errors) t1 = t pExpr "3*4-6" t2 = t pExpr "3**4-6" main :: IO () main = do putStr "Enter expression: " inp <- getLine let (r, rest, errors) = parse pExpr null inp putStr (show r) putStr (show errors) main

Code in a functional language consists of writing down function applications. The code for pDigit is handcompiled to something similar:

Eval pDigit =
	UUParsing.pApp.apply2
		( StdFuncs.charToDigit
		, UUParsing.pAnyOf.apply1( Str.valueOf( "0123456789" ) )
		) ;

Each application is written down as an invocation of applyX(). The applyX() returns an Apply object, subclass of Eval. The applyX() is invoked on a predefined Eval object, which may be a Function (subclass of Eval) or another Apply.

The parser pNat, for natural numbers is defined in a similar way, however, it is passed an explicitly defined lambda expression. A lambda expression is written down as a subclass of a Function:

Function pNat_foldl_func =
	new Function2()
	{
		public Object eval2( Object a, Object b )
		{
			return Int.valueOf( evalToI(a) * 10 + evalToI(b) ) ;
		}
	} ;

Eval pNat =
	UUParsing.pApp.apply2
		( StdFuncs.foldl.apply2
			( pNat_foldl_func
			, StdVals.zero
			)
		, UUParsing.pSome.apply1( pDigit )
		) ;

The newly defined function is passed as a parameter to the foldl.

The remaining definitions are written down in a similar fashion. The demo applet includes source code of a bit more complicated parser and its translation to Java.

The mechanics and internal workings are to be explained in a forthcoming paper.

Availability & Downloading

The final version is not yet available. For the time being one can download or look at the following:

  • Preliminary release (zip 400K) including parsercombinator + lazy/function library sources, apidoc and library jar file.
  • Online api documentation.
  • Example demo applet.