Skip to content

Start parser with our token inputs with Bison #3

Open
@leonardoaraujosantos

Description

@leonardoaraujosantos
Owner

Introduction

assignstatement

The parser’s job is to figure out the relationship(grammar) among the input tokens. Those tokens come from the scanner (flex).
A grammar is a series of rules that the parser uses to recognize syntactically valid input.

Bison only deal with the syntax no the semantic, for instance the code bellow has a valid syntax but is invalid, bison will think that this is fine

int var = "Leo"

Consider the following grammar on bison. NAME, NUMBER, '+', '-', are tokens that flex gives to us.

statement: NAME '=' expression
expression: NUMBER '+' NUMBER
 | NUMBER '−' NUMBER

Every line here is a rule, so the rule for a statement is the pattern "NAME = expression". And the rule for expression is the pattern "NUMBER+NUMBER" OR "NUMBER-NUMBER"

Every grammar includes a start symbol, the one that has to be at the root of the parse tree. In this grammar, statement is the start symbol.

So the following source code line:

fred = 12 + 13

Would produce from the scanner the following tokens:
NAME,'=',NUMBER,'+',NUMBER

This AST tree would be created:
simple_expression_ast

Where "fred =" is a statement. And "12+13" is an expression.

Now what would happen if we would want to parse a longer instruction

fred = 14 + 2311 + 7

To solve this can use a Recursive rule, where the rule call itself.

statement: NAME '=' expression
expression: NUMBER
 | expression '+' NUMBER
 | expression '−' NUMBER

Now the tree would grow to this:
recursiverule

To start use our token inputs, starting from a simple example.

function result = justSum(a,b)
  result = a + b; # Simple comment
end

Tokens generated with out scanner

FUNCTION: function
Identifier: result
ASSIGN: =
Identifier: justSum
Open Parenthesis: (
Identifier: a
Comma: ,
Identifier: b
Close Parenthesis: )

Identifier: result
ASSIGN: =
Identifier: a
Plus: +
Identifier: b
EndLine: ;

END: end

And build an AST tree

Class diagram for parser Nodes

Considering the class diagram bellow, we need to create a rule for every class that inherites from Node.
parserdiagram

References

Activity

changed the title [-]Start parser with our token inputs[/-] [+]Start parser with our token inputs with Bison[/+] on Sep 18, 2016
leonardoaraujosantos

leonardoaraujosantos commented on Sep 18, 2016

@leonardoaraujosantos
OwnerAuthor

BNF

Backus Normal Form is one notation techniques for context-free grammars, often used to describe the syntax of programming languages like C, or Python

Example:
bnf_0
This translates into English as:

  • A postal address consists of a name-part, followed by a street-address part, followed by a zip-code part.
  • A name-part consists of either: a personal-part followed by a last name followed by an optional suffix (Jr., Sr., or dynastic number) and end-of-line, or a personal part followed by a name part (this rule illustrates the use of recursion in BNFs, covering the case of people who use multiple first and middle names and/or initials).
  • A personal-part consists of either a first name or an initial followed by a dot.
  • A street address consists of a house number, followed by a street name, followed by an optional apartment specifier, followed by an end-of-line.
  • A zip-part consists of a town-name, followed by a comma, followed by a state code, followed by a ZIP-code followed by an end-of-line.
  • A opt-suffix-part consists of a suffix, such as "Sr.", "Jr." or a roman-numeral, or an empty string (i.e. nothing).

On Bison

stmt: var_decl | func_decl | expr { /* Any Action*/ }

Read as ”stmt is a var_decl or func_decl or expr” and if this grammar match execute something on the ”action block”. Inside the action block we mount our AST Tree

if_stmt : IF '(' condition ')' block { /* do stuff when this rule is encountered */ }
        | IF '(' condition ')'       { /*do some other stuff on this other rule*/ }
        ;

This would create the grammar for the if statement, like if (condition) block, or if (condition)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @leonardoaraujosantos

        Issue actions

          Start parser with our token inputs with Bison · Issue #3 · leonardoaraujosantos/Matlab_LLVM_Frontend