Title Creating a Lexical Analyzer in C
Author Fahad Bader Al-buhairi
Author Email xxxq8xxx [at]
Description The token classes correspond to the following regular definitions, except <letter> and <digit>, plus the class <keyword>. (The ellipsis "…" is used in the usual sense of "and so on". The spaces in the definitions are used for better readability; they are not valid parts of the definitions.)

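<semicolon> ::= ;
<colon> ::= :
<comma> ::= ,
<arithmetic_operator> ::= + | - | * | /
<comparison_operator> ::= < | <= | = | <> | >= | >
<parenthesis> ::= ( | )
<special> ::= % | ! | @ | ~ | $
<letter> ::= a | A | b | B | … | z | Z
<digit> ::= 0 | 1 | … | 9
<identifier> ::= <letter> ( <letter> | <digit> )*
<num> ::= <digit>+ (. <digit>+ )? (E(+|-)? <digit>+ )?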
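
The last two definitions can be read as small hand-written matchers. The following is a minimal sketch, assuming an identifier is a letter followed by any number of letters or digits, and a number is one or more digits with an optional fraction and an optional exponent introduced by an upper-case E. The function names `match_identifier` and `match_num` are hypothetical and are not part of the submitted program.

```c
#include <ctype.h>
#include <stddef.h>

/* Length of the longest prefix of s that matches
   letter (letter | digit)*, or 0 if s does not start with a letter. */
size_t match_identifier(const char *s)
{
    size_t n = 0;
    if (!isalpha((unsigned char)s[0]))
        return 0;
    while (isalnum((unsigned char)s[n]))
        n++;
    return n;
}

/* Length of the longest prefix of s that matches
   digit+ (. digit+)? (E (+|-)? digit+)?, or 0 if s does not start with a digit. */
size_t match_num(const char *s)
{
    size_t n = 0, k;
    while (isdigit((unsigned char)s[n]))
        n++;
    if (n == 0)
        return 0;
    /* Optional fraction: accepted only if a digit follows the '.'. */
    if (s[n] == '.' && isdigit((unsigned char)s[n + 1])) {
        n++;
        while (isdigit((unsigned char)s[n]))
            n++;
    }
    /* Optional exponent: accepted only if at least one digit follows. */
    if (s[n] == 'E') {
        k = n + 1;
        if (s[k] == '+' || s[k] == '-')
            k++;
        if (isdigit((unsigned char)s[k])) {
            while (isdigit((unsigned char)s[k]))
                k++;
            n = k;
        }
    }
    return n;
}
```

Both functions return a length rather than a flag, so a caller can compare candidates and keep the longest match.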

The following lexemes should be recognized as keywords, not as identifiers:

procedure, is, begin, end, var, cin, cout, if, then, else, and, or,
loop, exit, when, while, until
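
A keyword check can be a plain table lookup performed after a lexeme has been read. This is a minimal sketch that uses only the seventeen keywords listed above and assumes the comparison is case-sensitive; the names `KEYWORDS` and `is_keyword` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Keywords exactly as listed in the description. */
static const char *const KEYWORDS[] = {
    "procedure", "is", "begin", "end", "var", "cin", "cout", "if", "then",
    "else", "and", "or", "loop", "exit", "when", "while", "until"
};

/* Returns 1 if lexeme is one of the keywords, 0 otherwise. */
int is_keyword(const char *lexeme)
{
    size_t i;
    for (i = 0; i < sizeof KEYWORDS / sizeof KEYWORDS[0]; i++)
        if (strcmp(lexeme, KEYWORDS[i]) == 0)
            return 1;
    return 0;
}
```

A lexeme that matches the identifier definition but fails this lookup would then be classified as an identifier.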

The whitespace characters " " (space symbol) and the end-of-line symbol
are to be skipped.
Comments (any text enclosed between braces "{" and "}") are to be skipped
as well. Comments do not extend over several lines (they do not contain
the end-of-line symbol).
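
The skipping rules above can be sketched on an in-memory buffer as follows. This is an illustration under assumptions, not the submitted program's method: the function name `skip_blanks_and_comments` is hypothetical, the input is taken to be a NUL-terminated string, and only the space and end-of-line characters count as whitespace, as in the description.

```c
/* Advances p past spaces, end-of-line characters and {...} comments.
   *line is incremented once for every end-of-line character skipped.
   Returns a pointer to the first character of the next lexeme,
   or to the terminating NUL. */
const char *skip_blanks_and_comments(const char *p, int *line)
{
    for (;;) {
        if (*p == ' ') {
            p++;
        } else if (*p == '\n') {
            (*line)++;
            p++;
        } else if (*p == '{') {
            /* A comment never contains an end-of-line, so stop there too. */
            while (*p != '\0' && *p != '}' && *p != '\n')
                p++;
            if (*p == '}')
                p++;
        } else {
            return p;
        }
    }
}
```

Counting lines inside the same routine keeps the line number available for symbol table entries.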

The input for the lexical analyzer is a text file SOURCE.TXT consisting
of several lines of text (a "program") that form a correct sequence
of lexemes corresponding to the above definitions, whitespace and comments.

The output of your lexical analyzer consists of 2 text files, ST.TXT and TOKENS.TXT.
1. ST.TXT is the symbol table created by the lexical analyzer. Each line
consists of three parts:
- line number
- the lexeme (string)
- type (string), being one of the following: keyword, identifier, num
2. TOKENS.TXT is the list of tokens produced by the lexical analyzer, with
the following structure:
- one line of input (in the order of appearance in SOURCE.TXT)
- the corresponding pairs "token, attribute", each on a separate line, in the
order in which they occur in the line
- blank line
The attribute of a keyword, identifier or number is its line number in
the symbol table. The attribute of any other token is the lexeme itself.
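
The attribute rule can be sketched as a small formatting helper. The function names `has_table_entry` and `format_pair`, the parameter `st_line` (the entry's line number in the symbol table) and the exact "type, attribute" output layout are assumptions made for illustration; the description does not fix a layout.

```c
#include <stdio.h>
#include <string.h>

/* Keywords, identifiers and numbers are the classes stored in the symbol table. */
static int has_table_entry(const char *type)
{
    return strcmp(type, "keyword") == 0 ||
           strcmp(type, "identifier") == 0 ||
           strcmp(type, "num") == 0;
}

/* Writes one "token, attribute" pair into out.
   For symbol table classes the attribute is st_line;
   for every other token it is the lexeme itself. */
int format_pair(char *out, size_t size, const char *type,
                const char *lexeme, int st_line)
{
    if (has_table_entry(type))
        return snprintf(out, size, "%s, %d", type, st_line);
    return snprintf(out, size, "%s, %s", type, lexeme);
}
```

Calling `format_pair` once per token and writing each result on its own line would produce the per-line token list described for TOKENS.TXT.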
The longest prefix of the input that can match any regular expression
is taken as the next token.
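
The longest-prefix rule matters most for the comparison operators, where "<" is a prefix of both "<=" and "<>". A minimal sketch, assuming the operator set < | <= | = | <> | >= | > given above, tries the two-character operators before the one-character ones. The function name `match_comparison` is hypothetical.

```c
#include <stddef.h>

/* Length of the longest comparison operator at the start of s:
   2 for "<=", "<>" and ">=", 1 for "<", "=" and ">", 0 otherwise. */
size_t match_comparison(const char *s)
{
    if (s[0] == '<' && (s[1] == '=' || s[1] == '>'))
        return 2;
    if (s[0] == '>' && s[1] == '=')
        return 2;
    if (s[0] == '<' || s[0] == '=' || s[0] == '>')
        return 1;
    return 0;
}
```

With this ordering the input "<=" yields one token "<=" rather than "<" followed by "=".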
Category C » Beginners / Lab Assignments
Code
/***************************************************************
  Program description :
  =====================
  This program creates a lexical analyzer in C.

  Created by :
  =============
  Name  : Fahad Bader Al-buhairi
  Email :
  Phone : 009657991000
****************************************************************/

/***************************************************************
  Necessary header files used in the program.
****************************************************************/
#include <stdio.h>
#include <stdlib.h>   /* malloc, exit */
#include <string.h>
#include <ctype.h>

/***************************************************************
  Function prototypes.
****************************************************************/
void Open_File(void);
void Demage_Lexeme(void);
int  Search(const char *,int);
void analyze(void);
void Skip_Comment(void);
void Read_String(void);
void Is_Keyword_Or_Not(void);
void Is_Identifier_Or_Not(void);
void Is_Operator_Or_Not(void);
void Read_Number(void);
void Is_Special_Or_Not(void);
void Is_Comparison_Or_Not(void);
void Add_To_Lexical(const char *,int,const char *);
void Print_ST(void);
void Print_TOKEN(void);
void Token_Attribute(void);

/***************************************************************
  Data structure used in the program.
****************************************************************/
struct lexical
{
    char data[256];       //Value of the token.
    int  line[256];       //Line #s on which the token appears in the input file.
    int  times;           //# of times the token appears in the input file.
    char type[256];       //Type of the token.
    struct lexical *next;
};

typedef struct lexical Lex;
typedef Lex *lex;

/***************************************************************
  File pointers for accessing the files.
****************************************************************/
FILE *fp;
FILE *st;
FILE *token;
char lexeme[256],ch;
int f,flag,line=1,i=1;
lex head=NULL,tail=NULL;

/***************************************************************
  Array holding all keywords for checking.
****************************************************************/
const char *keywords[]={"procedure","is","begin","end","var","cin","cout","if",
                        "then","else","and","or","not","loop","exit","when",
                        "while","until"};

/***************************************************************
  Array holding all arithmetic operators for checking.
****************************************************************/
char arithmetic_operator[]={'+','-','*','/'};

/***************************************************************
  Array holding all comparison operators for checking.
****************************************************************/
const char *comparison_operator[]={"<",">","=","<=","<>",">="};

/***************************************************************
  Array holding all special characters for checking.
****************************************************************/
char special[]={'%','!','@','~','$'};

/***************************************************************
                        **************
                        *MAIN PROGRAM*
                        **************
****************************************************************/
int main(void)
{
    Open_File();
    analyze();
    fclose(fp);
    Print_ST();
    Print_TOKEN();
    return 0;
}

/***************************************************************
  This function opens the input source file.
****************************************************************/
void Open_File(void)
{
    fp=fopen("source.txt","r");   //provide path for source.txt here
    if(fp==NULL)
    {
        printf("!!!Can't open input file - source.txt!!!\n");
        getchar();
        exit(EXIT_FAILURE);
    }
}

/***************************************************************
  Function to add an item to the linked list that stores the
  data and information of lexical items.
****************************************************************/
void Add_To_Lexical(const char *value,int line,const char *type)
{
    lex new_lex;
    if (!Search(value,line))      //Search returns 0 when the token is not found.
    {
        new_lex=malloc(sizeof(Lex));
        if (new_lex!=NULL)
        {
            strcpy(new_lex->data,value);
            new_lex->line[0]=line;
            new_lex->times=1;
            strcpy(new_lex->type,type);
            new_lex->next=NULL;
            if (head==NULL)
                head=new_lex;
            else
                tail->next=new_lex;
            tail=new_lex;
        }
    }
}

/***************************************************************
  Function to search for a token. Returns 1 if it is found.
****************************************************************/
int Search(const char *value,int line)
{
    lex x=head;
    int flag=0;
    while (x!=NULL && !flag)      //Also covers an empty list and the last node.
    {
        if (strcmp(x->data,value)==0)
        {
            if (x->times<256)     //Do not write past the end of line[].
            {
                x->line[x->times]=line;
                x->times++;
            }
            flag=1;
        }
        x=x->next;
    }
    return flag;
}

/***************************************************************
  Function to print ST.TXT.
****************************************************************/
void Print_ST(void)
{
    lex x=head;
    int j;
    if ((st=fopen("ST.TXT","w"))==NULL)
        printf("The file ST.TXT cannot be opened.\n");
    else
    {
        fprintf(st,"\t %s \t %s \t %s \n","Line#","Lexeme","Type");
        fprintf(st,"\t ---- \t ------ \t ---- \n");
        while (x!=NULL)
        {
            if ((strcmp(x->type,"num")==0) ||
                (strcmp(x->type,"keyword")==0) ||
                (strcmp(x->type,"identifier")==0))
            {
                fprintf(st,"\t ");
                for (j=0;j<x->times;j++)
                {
                    fprintf(st,"%d",x->line[j]);
                    if (j!=x->times-1)    //Do not print a comma
                        fprintf(st,",");  //after the last line #.
                }
                fprintf(st," \t %-6s \t %-6s \n",x->data,x->type);
            }
            x=x->next;
        }
        fclose(st);
    }
}

/***************************************************************
  Function to print TOKENS.TXT.
****************************************************************/
void Print_TOKEN(void)
{
    int flag=0;
    fp=fopen("source.txt","r");
    if(fp==NULL)
    {
        printf("!!!Can't open input file - source.txt!!!\n");
        getchar();
        exit(EXIT_FAILURE);
    }
    if ((token=fopen("TOKENS.TXT","w"))==NULL)
        printf("The file TOKENS.TXT cannot be opened.\n");
    else
    {
        ch=fgetc(fp);
        while (!(feof(fp)))
        {
            if (ch==' ' && !flag)         //Collapse the first run of spaces on a line.
            {
                do
                    ch=fgetc(fp);
                while (ch==' ');
                fseek(fp,-2,SEEK_CUR);
                ch=fgetc(fp);
                flag=1;
            }
            if (ch!='\n' && ch!='\t')
                fprintf(token,"%c",ch);
            if (ch=='\n')
            {
                fprintf(token,"\n");
                Token_Attribute();
                i++;
                flag=0;
            }
            ch=fgetc(fp);
        }
        fclose(token);
    }
    fclose(fp);
}

/***************************************************************
  Function to put the token and attribute in TOKENS.TXT.
****************************************************************/
void Token_Attribute(void)
{
    lex x=head;
    while (x!=NULL)
    {
        if (x->line[0]==i)
        {
            fprintf(token,"token : %-4s \t",x->type);
            if ((strcmp(x->type,"num")==0) ||
                (strcmp(x->type,"keyword")==0) ||
                (strcmp(x->type,"identifier")==0))
            {
                fprintf(token,"attribute : line#=%-4d \n",i);
            }
            else
            {
                fprintf(token,"attribute : %-4s \n",x->data);
            }
        }
        x=x->next;
    }
    fprintf(token,"\n");
}

/***************************************************************
  Function to perform the lexical analysis.
****************************************************************/
void analyze(void)
{
    ch=fgetc(fp);                         //Read a character.
    while(!feof(fp))                      //While the end of file is not reached.
    {
        if(ch=='\n')                      //Count the lines in source.txt.
        {
            line++;
            ch=fgetc(fp);
        }
        if(isspace((unsigned char)ch) && ch=='\n')
        {
            line++;
            ch=fgetc(fp);
        }
        if(isspace((unsigned char)ch) && ch!='\n')   //The character is a space.
            ch=fgetc(fp);
        if(ch=='/' || ch=='"')            //Skip comments in the file
            Skip_Comment();               //and '"' with display statements.
        if(isalpha((unsigned char)ch))    //The character is a letter.
        {
            Read_String();
            Is_Keyword_Or_Not();
            Is_Operator_Or_Not();
            Is_Identifier_Or_Not();
        }
        if(isdigit((unsigned char)ch))    //The character is a digit.
            Read_Number();
        if (ch==';')                      //The character is a semicolon.
            Add_To_Lexical(";",line,"semicolon");
        if (ch==':')                      //The character is a colon.
            Add_To_Lexical(":",line,"colon");
        if (ch==',')                      //The character is a comma.
            Add_To_Lexical(",",line,"comma");
        if (ch=='(')                      //The character is a parenthesis.
            Add_To_Lexical("(",line,"parenthesis");
        if (ch==')')                      //The character is a parenthesis.
            Add_To_Lexical(")",line,"parenthesis");
        //The character is a comparison operator.
        if (ch=='<' || ch=='=' || ch=='>')
            Is_Comparison_Or_Not();
        Is_Special_Or_Not();              //After the cases above have failed,
                                          //check whether the character is special.
        Demage_Lexeme();
        if(isspace((unsigned char)ch) && ch=='\n')
        {
            line++;
            ch=fgetc(fp);
        }
        else
            ch=fgetc(fp);
    }
}

/***************************************************************
  This function reads all characters of a string.
****************************************************************/
void Read_String(void)
{
    int j=0;
    do
    {
        lexeme[j++]=ch;
        ch=fgetc(fp);
    } while(isalpha((unsigned char)ch) && j<255);
    if (!feof(fp))                        //Push back the character that ended the
        fseek(fp,-1,SEEK_CUR);            //string, but never step back from end of file.
    lexeme[j]='\0';
}

/* The listing ends here; the remaining functions declared in the
   prototypes above are not included in this page. */

Source Codes is a part of Vyom Network.

Copyright ©2003-2024, All Rights Reserved.