
Title Creating a Lexical Analyzer in C
Author Fahad Bader Al-buhairi
Author Email xxxq8xxx [at] hotmail.com
Description The token classes correspond to the following regular definitions, except <letter> and <digit>, plus the class <keyword>. (The ellipsis "…" is used in the usual sense, "and so on". The spaces in the definitions are used for better readability; they are not valid parts of the definitions.)

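<semicolon> ::= ;
<colon> ::= :
<comma> ::= ,
<arithmetic_operator> ::= + | - | * | /
<comparison_operator> ::= < | <= | = | <> | >= | >
<parenthesis> ::= ( | )
<special> ::= % | ! | @ | ~ | $
<letter> ::= a | A | b | B | … | z | Z
<digit> ::= 0 | 1 | … | 9
<identifier> ::= <letter> ( <letter> | <digit> )*
<num> ::= <digit>+ ( . <digit>+ )? ( E ( + | - )? <digit>+ )?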
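
The number definition above (one or more digits, an optional fraction, an optional exponent) can be checked with a small hand-written scanner. The following is only a sketch, not the author's code: the name `match_num` is hypothetical, and it assumes the rule ends with the exponent digits, i.e. digit+ ( . digit+ )? ( E (+|-)? digit+ )?. It returns the length of the longest matching prefix, or 0 if the text does not start with a number.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: length of the longest prefix of s that matches
   digit+ ( . digit+ )? ( E (+|-)? digit+ )?   (0 if there is no match). */
size_t match_num(const char *s)
{
    size_t i = 0, j;

    assert(s != NULL);

    /* Integer part: digit+ */
    while (isdigit((unsigned char)s[i]))
        i++;
    if (i == 0)
        return 0;

    /* Optional fraction: the '.' must be followed by at least one digit. */
    if (s[i] == '.' && isdigit((unsigned char)s[i + 1])) {
        i += 2;
        while (isdigit((unsigned char)s[i]))
            i++;
    }

    /* Optional exponent: accepted only if digits follow the optional sign. */
    if (s[i] == 'E') {
        j = i + 1;
        if (s[j] == '+' || s[j] == '-')
            j++;
        if (isdigit((unsigned char)s[j])) {
            while (isdigit((unsigned char)s[j]))
                j++;
            i = j;
        }
    }
    return i;
}
```

An incomplete fraction or exponent is simply left unconsumed, so for the input "12." only "12" is taken as the number.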

The following lexemes should be recognized as keywords, not as identifiers:

procedure, is, begin, end, var, cin, cout, if, then, else, and, or, not, loop, exit, when, while, until
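
One way to separate keywords from identifiers is to scan the lexeme as an identifier first and then look it up in a table of reserved words. The sketch below illustrates that idea; `KEYWORDS` and `is_keyword` are hypothetical names, and the comparison is assumed to be case-sensitive because the description does not say otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The reserved words listed in the description. */
static const char *const KEYWORDS[] = {
    "procedure", "is", "begin", "end", "var", "cin", "cout", "if", "then",
    "else", "and", "or", "not", "loop", "exit", "when", "while", "until"
};

/* Hypothetical helper: returns 1 if lexeme is a reserved word, else 0.
   Assumption: matching is case-sensitive. */
int is_keyword(const char *lexeme)
{
    size_t i;
    size_t n = sizeof KEYWORDS / sizeof KEYWORDS[0];

    assert(lexeme != NULL);
    for (i = 0; i < n; i++)
        if (strcmp(lexeme, KEYWORDS[i]) == 0)
            return 1;
    return 0;
}
```

A lexeme such as "whilex" is not a keyword under this scheme; it falls through and is treated as an identifier.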

The whitespace characters " " (space symbol) and the end-of-line symbol are to be skipped. Comments (any text enclosed between the braces "{" and "}") are to be skipped as well. Comments do not extend over several lines (they do not contain the end-of-line symbol).
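
The skipping rule can be sketched as a single loop that is run before each token is read. This is an illustration under stated assumptions, not the author's routine: `skip_blanks` is a hypothetical name, the input is taken from a string rather than a file, and the end-of-line symbol is assumed to be '\n'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: advances past spaces, end-of-line characters and
   {...} comments. *line is incremented for every '\n' that is skipped.
   Returns a pointer to the first character of the next lexeme (or to the
   terminating '\0'). */
const char *skip_blanks(const char *s, int *line)
{
    assert(s != NULL && line != NULL);
    for (;;) {
        if (*s == ' ') {
            s++;
        } else if (*s == '\n') {
            (*line)++;
            s++;
        } else if (*s == '{') {
            /* A comment ends at '}' and never spans several lines. */
            while (*s != '\0' && *s != '\n' && *s != '}')
                s++;
            if (*s == '}')
                s++;
        } else {
            return s;
        }
    }
}
```

Counting lines inside the same loop keeps the line number in step with the input, which the symbol table needs later.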

The input for the lexical analyzer is a text file SOURCE.TXT consisting of several lines of text (a "program") that form a correctly formed sequence of lexemes corresponding to the above definitions, whitespace, and comments.

The output of your lexical analyzer consists of two text files, ST.TXT and TOKENS.TXT.
1. ST.TXT is the symbol table created by the lexical analyzer. Each line consists of three parts:
- line number
- the lexeme (string)
- type (string), being one of the following: keyword, identifier, num
2. TOKENS.TXT is the list of tokens produced by the lexical analyzer, with the following structure:
- one line of input (in the order of appearance in SOURCE.TXT)
- the corresponding pairs "token, attribute", each on a separate line, in the order in which they occur in the line
- blank line
The attribute of a keyword, identifier or number is the line number in the symbol table. The attribute of any other token is the lexeme itself.
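
The attribute rule can be illustrated with a small formatting routine. This is a sketch only: `format_pair` is a hypothetical name, and the exact text layout of a pair ("token, attribute") is an assumption based on the wording of the description.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes one "token, attribute" pair into out.
   For keyword, identifier and num the attribute is the symbol-table line
   number (st_line); for every other token it is the lexeme itself.
   Returns the value of snprintf (the length of the formatted text). */
int format_pair(char *out, size_t n, const char *type,
                const char *lexeme, int st_line)
{
    assert(out != NULL && type != NULL && lexeme != NULL);
    if (strcmp(type, "keyword") == 0 ||
        strcmp(type, "identifier") == 0 ||
        strcmp(type, "num") == 0)
        return snprintf(out, n, "%s, %d", type, st_line);
    return snprintf(out, n, "%s, %s", type, lexeme);
}
```

For an identifier stored on line 3 of the symbol table this produces "identifier, 3", while a semicolon produces "semicolon, ;".
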
The longest prefix of the input that can match any regular expression p_i is taken as the next token.
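
The longest-prefix rule matters most for the comparison operators, where "<" is a prefix of both "<=" and "<>". A minimal sketch of the rule for that one token class follows; `match_comparison` is a hypothetical name and the routine is not taken from the listing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length of the comparison operator at the start of s.
   Two-character operators are tried first so that the longest prefix wins:
   2 for <=, <> and >=; 1 for <, = and >; 0 if s starts with none of them. */
size_t match_comparison(const char *s)
{
    assert(s != NULL);
    if (s[0] == '<' && (s[1] == '=' || s[1] == '>'))
        return 2;
    if (s[0] == '>' && s[1] == '=')
        return 2;
    if (s[0] == '<' || s[0] == '=' || s[0] == '>')
        return 1;
    return 0;
}
```

With the input "<=5" the scanner takes "<=" as one token instead of "<" followed by "=".
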
Category C » Beginners / Lab Assignments
Code
/***************************************************************
 Program description :
 =====================
 This program is for creating a Lexical Analyzer in C.

 Created by :
 =============
 Name  : Fahad Bader Al-buhairi
 Email : q8_government@hotmail.com
 phone : 009657991000

 Note: in this copy of the listing the escape sequences (such as
 \n and \t) were lost. They are restored from context below and
 should be read as assumptions.
*****************************************************************/

/****************************************************************
 Necessary header files used in program.
*****************************************************************/
#include <stdio.h>
#include <stdlib.h>    //malloc(), exit()
#include <string.h>
#include <ctype.h>

/****************************************************************
 Functions prototype.
*****************************************************************/
void Open_File(void);
void Demage_Lexeme(void);
int  Search(char[256],int);
void analyze(void);
void Skip_Comment(void);
void Read_String(void);
void Is_Keyword_Or_Not(void);
void Is_Identifier_Or_Not(void);
void Is_Operator_Or_Not(void);
void Read_Number(void);
void Is_Special_Or_Not(void);
void Is_Comparison_Or_Not(void);
void Add_To_Lexical(char[256],int,char[256]);
void Print_ST(void);
void Print_TOKEN(void);
void Token_Attribute(void);

/****************************************************************
 Data structure used in program.
*****************************************************************/
struct lexical
{
    char data[256];          //Value of token.
    int  line[256];          //Line #s on which the token appears in the input file.
    int  times;              //# of times the token appears in the input file.
    char type[256];          //Type of each token.
    struct lexical *next;
};

typedef struct lexical Lex;
typedef Lex *lex;

/****************************************************************
 File pointers for accessing the files.
*****************************************************************/
FILE *fp;
FILE *st;
FILE *token;
char lexeme[256];
int ch;                      //int, so that EOF returned by fgetc() can be stored.
int f,flag,line=1,i=1;

lex head=NULL,tail=NULL;

/****************************************************************
 Array holding all keywords for checking.
*****************************************************************/
char *keywords[]={"procedure","is","begin","end","var","cin","cout","if",
                  "then","else","and","or","not","loop","exit","when",
                  "while","until"};

/****************************************************************
 Array holding all arithmetic operations for checking.
*****************************************************************/
char arithmetic_operator[]={'+','-','*','/'};

/****************************************************************
 Array holding all comparison operations for checking.
*****************************************************************/
char *comparison_operator[]={"<",">","=","<=","<>",">="};

/****************************************************************
 Array holding all special characters for checking.
*****************************************************************/
char special[]={'%','!','@','~','$'};

/****************************************************************
 **************
 *MAIN PROGRAM*
 **************
*****************************************************************/
int main(void)
{
    Open_File();
    analyze();
    fclose(fp);
    Print_ST();
    Print_TOKEN();
    return 0;
}

/****************************************************************
 This function opens the input source file.
*****************************************************************/
void Open_File(void)
{
    fp=fopen("source.txt","r");    //provide path for source.txt here
    if(fp==NULL)
    {
        printf("!!!Can't open input file - source.txt!!!\n");
        exit(EXIT_FAILURE);
    }
}

/****************************************************************
 Function to add an item to the linked list that stores the data
 and information of lexical items.
*****************************************************************/
void Add_To_Lexical (char value[256],int line,char type[256])
{
    lex new_lex;

    if (!Search(value,line))    //Search() returns 0 when the token is not found.
    {
        new_lex=malloc(sizeof(Lex));
        if (new_lex!=NULL)
        {
            strcpy(new_lex->data,value);
            new_lex->line[0]=line;
            new_lex->times=1;
            strcpy(new_lex->type,type);
            new_lex->next=NULL;
            if (head==NULL)
                head=new_lex;
            else
                tail->next=new_lex;
            tail=new_lex;
        }
    }
}

/****************************************************************
 Function to search for a token. If it is found, the current line
 is recorded for it and 1 is returned; otherwise 0 is returned.
*****************************************************************/
int Search (char value[256],int line)
{
    lex x=head;
    int flag=0;

    while (x!=NULL && !flag)    //Covers the empty list and the last node too.
    {
        if (strcmp(x->data,value)==0)
        {
            if (x->times<256)   //Do not write past the end of line[].
            {
                x->line[x->times]=line;
                x->times++;
            }
            flag=1;
        }
        x=x->next;
    }
    return flag;
}

/****************************************************************
 Function to print the ST.TXT .
*****************************************************************/
void Print_ST(void)
{
    lex x=head;
    int j;

    if ((st=fopen("ST.TXT","w"))==NULL)
        printf("The file ST.TXT cannot be opened.\n");
    else
    {
        fprintf(st,"\t %s \t %s \t %s \n","Line#","Lexeme","Type");
        fprintf(st,"\t ---- \t ------ \t ---- \n");
        while (x!=NULL)
        {
            if ((strcmp(x->type,"num")==0) ||
                (strcmp(x->type,"keyword")==0) ||
                (strcmp(x->type,"identifier")==0))
            {
                fprintf(st,"\t ");
                for (j=0;j<x->times;j++)
                {
                    fprintf(st,"%d",x->line[j]);
                    if (j!=x->times-1)      //This condition prevents a comma
                        fprintf(st,",");    //from being printed after the last line #.
                }
                fprintf(st,"\t %-6s \t %-6s \n",x->data,x->type);
            }
            x=x->next;
        }
        fclose(st);
    }
}

/****************************************************************
 Function to print the TOKENS.TXT .
 Every '\n' is handled on its own so that the counter i stays in
 step with the line counter used in analyze().
*****************************************************************/
void Print_TOKEN(void)
{
    int pending=0;    //1 while the current line has unflushed characters.

    fp=fopen("source.txt","r");
    if(fp==NULL)
    {
        printf("!!!Can't open input file - source.txt!!!\n");
        exit(EXIT_FAILURE);
    }
    if ((token=fopen("TOKENS.TXT","w"))==NULL)
    {
        printf("The file TOKENS.TXT cannot be opened.\n");
        fclose(fp);
        return;
    }
    i=1;
    while ((ch=fgetc(fp))!=EOF)
    {
        if (ch=='\n')    //End of an input line: print its pairs.
        {
            fprintf(token,"\n");
            Token_Attribute();
            i++;
            pending=0;
        }
        else
        {
            fprintf(token,"%c",ch);
            pending=1;
        }
    }
    if (pending)         //Last line without a trailing end-of-line.
    {
        fprintf(token,"\n");
        Token_Attribute();
    }
    fclose(fp);
    fclose(token);
}

/****************************************************************
 Function to put the token and attribute in TOKENS.TXT .
*****************************************************************/
void Token_Attribute(void)
{
    lex x=head;

    while (x!=NULL)
    {
        if (x->line[0]==i)
        {
            fprintf(token,"token : %-4s\t",x->type);
            if ((strcmp(x->type,"num")==0) ||
                (strcmp(x->type,"keyword")==0) ||
                (strcmp(x->type,"identifier")==0))
            {
                fprintf(token,"attribute : line#=%-4d\n",i);
            }
            else
            {
                fprintf(token,"attribute : %-4s\n",x->data);
            }
        }
        x=x->next;
    }
    fprintf(token,"\n");
}

/****************************************************************
 Function to create lexical analysis.
*****************************************************************/
void analyze(void)
{
    ch=fgetc(fp);                      //Read character.
    while(!feof(fp))                   //While the end of the file is not reached.
    {
        if(ch=='\n')                   //Compute # of lines in source.txt .
        {
            line++;
            ch=fgetc(fp);
        }
        if(isspace(ch) && ch=='\n' )
        {
            line++;
            ch=fgetc(fp);
        }
        if(isspace(ch) && ch!='\n' )   //The character is space.
            ch=fgetc(fp);
        if(ch=='/' || ch=='"')         //Function for skipping comments in the file
            Skip_Comment();            //and '"' with display statements.
        if(isalpha(ch))                //The character is a letter.
        {
            Read_String();
            Is_Keyword_Or_Not();
            Is_Operator_Or_Not();
            Is_Identifier_Or_Not();
        }
        if(isdigit(ch))                //The character is a digit.
            Read_Number();
        if (ch==';')                   //The character is a semicolon.
            Add_To_Lexical(";",line,"semicolon");
        if (ch==':')                   //The character is a colon.
            Add_To_Lexical(":",line,"colon");
        if (ch==',')                   //The character is a comma.
            Add_To_Lexical(",",line,"comma");
        if (ch=='(')                   //The character is a parenthesis.
            Add_To_Lexical("(",line,"parenthesis");
        if (ch==')')                   //The character is a parenthesis.
            Add_To_Lexical(")",line,"parenthesis");
                                       //The character is a comparison_operator.
        if (ch=='<' || ch=='=' || ch=='>')
            Is_Comparison_Or_Not();
        Is_Special_Or_Not();           //After failed scanning in the cases above,
                                       //check whether the character is special or not.
        Demage_Lexeme();
        if(isspace(ch) && ch=='\n' )
        {
            line++;
            ch=fgetc(fp);
        }
        else
            ch=fgetc(fp);
    }
}

/****************************************************************
 This function reads all characters of a string.
*****************************************************************/
void Read_String(void)
{
    int j=0;

    do
    {
        if (j<255)                     //Leave room for the terminating '\0'.
            lexeme[j++]=(char)ch;
        ch=fgetc(fp);
    } while(isalpha(ch));
    if (ch!=EOF)
        ungetc(ch,fp);                 //Put the look-ahead character back.
    lexeme[j]='\0';
}

/* The listing breaks off at this point on the page; the remaining
   functions declared in the prototypes above are not shown. */

Copyright ©2003-2024 SourceCodesWorld.com, All Rights Reserved.
Page URL: http://www.sourcecodesworld.com/source/show.asp?ScriptID=1244

