Parser

 

 

Parser 

  •  Accepts string of tokens from lexical analyzer (usually one token at a time)
     
  •  Verifies whether or not string can be generated by grammar
     
  • Reports syntax errors (recovers if possible)

 

Parser obtains a string of tokens from the lexical analyzer and verifies that it can be generated by the language for  the source program. The parser should report any syntax errors in an intelligible fashion.                            

 

     The two types of parsers  employed are:

  1.   Top down parser: which build parse trees from top(root) to bottom(leaves)
     
  2.   Bottom up parser: which build parse trees from leaves and work up the root.

Therefore there are two types of parsing methods– top-down parsing and bottom-up parsing
 

Parse Trees

  • Nodes are non-terminals.
     
  •  Leaves are terminals.
     
  •  Branching corresponds to rules of the grammar.
     
  •  The leaves give a sentence of the input language.
     
  •  For every sentence of the language there is at least one parse tree.
     
  •  Sometimes we have more then one parse tree for a sentence.
     
  •  Grammars which allow more than one parse tree for some sentences are called ambiguous and are usually not good for compilation.
     

 

Recursive descent parsing:  corresponds to finding a leftmost  derivation for an input string
 

Equivalent to constructing parse tree in pre-order
 

Example:
 

Grammar: S |cAd A | ab | a
 

Input: cad
 

Problems:
 

1. backtracking involved ()buffering of tokens required)
 

2. left recursion will lead to infinite looping
 

3. left factors may cause several backtracking steps
 

 

Top Down Parsing

  •   Can be viewed two ways:
     
  •  Attempt to find leftmost derivation for input string
     
  •  Attempt to create parse tree, starting from at root, creating nodes in preorder
     
  •   General form is recursive descent parsing


Predictive Parsing
 

  • May require backtracking
  • Backtracking parsers not used frequently because not needed

     A special case of recursive-descent parsing that does not require backtracking
     

      Must always know which production to use based on current input symbol
     

     Can often create appropriate grammar:
     

    removing left-recursion
     

    left factoring the resulting grammar