Title: Harvest-NG
Platform: n/a
Price: Free
Category: Perl & CGI » Scripts and Programs » Searching » Web Indexing
Hits: 674
Description: Harvest-NG is a collection of Perl modules and scripts that provide a web crawling and summarizing agent. The code aims to provide an open source, standards-compliant tool for fetching content from a wide variety of information sources, summarizing it into a set of resource descriptions, and storing these in an easily accessible database from which search services can be built and statistical information compiled.
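
The fetch, summarize, and store pipeline described above can be illustrated with a minimal Python sketch. This is not Harvest-NG's code or API (Harvest-NG itself is written in Perl); the names `SummaryParser`, `summarize`, and `store`, the record fields, and the `resources` table are all assumptions made for illustration. It reduces an HTML page to a small resource description and saves it in an SQLite database that a search service could later query.

```python
import sqlite3
from html.parser import HTMLParser


class SummaryParser(HTMLParser):
    """Collects the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name, content = attrs.get("name"), attrs.get("content")
            if name and content is not None:
                self.meta[name.lower()] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarize(url, html):
    """Reduce one fetched page to a small resource description (hypothetical fields)."""
    parser = SummaryParser()
    parser.feed(html)
    return {
        "url": url,
        "title": parser.title.strip(),
        "description": parser.meta.get("description", ""),
    }


def store(conn, record):
    """Insert or update the resource description in a hypothetical `resources` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS resources "
        "(url TEXT PRIMARY KEY, title TEXT, description TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO resources VALUES (:url, :title, :description)",
        record,
    )
    conn.commit()
```

In this sketch the page content is passed in as a string, so the fetching step (for example with `urllib.request`) is left to the caller. Keying the table on the URL means a re-crawl of the same page replaces its earlier description instead of duplicating it.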

Scripts Related to Harvest-NG

Internet Spy (i-spy)

I-Spy is a Perl script that identifies new files on remote FTP and web sites by fetching and comparing the contents of FTP directories and web pages. It then compiles a report and sends it by e-mail, saves it as a web page, or both. E-mail reports can be plain text or HTML. I-Spy logs its activity as it runs. You may specify the log directory; otherwise I-Spy tries to find one automatically. For web page reports, I-Spy attempts to store the log where the report can reference it and the web server can serve it.
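
The compare-and-report idea can be sketched in a few lines of Python. This is only an illustration of the technique, not I-Spy's actual code (I-Spy is a Perl script); the function names `new_entries` and `render_report` and the report layout are assumptions. It takes two snapshots of a directory listing, finds the entries that appeared since the previous run, and renders them as plain text or HTML.

```python
from html import escape


def new_entries(previous, current):
    """Return the entries present in `current` but absent from `previous`, sorted."""
    return sorted(set(current) - set(previous))


def render_report(site, added, as_html=False):
    """Render the list of new files as plain text or as a small HTML fragment."""
    if as_html:
        items = "".join(f"<li>{escape(name)}</li>" for name in added)
        return f"<h1>New files on {escape(site)}</h1><ul>{items}</ul>"
    lines = [f"New files on {site}:"] + [f"  {name}" for name in added]
    return "\n".join(lines)
```

A caller would keep the previous listing on disk between runs, fetch the current one (for example with `ftplib.FTP.nlst`), and pass both to `new_entries`.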

WebAwk

This is a proof of concept of a tool for automating web browsing and data collection. It works like AWK, except that it operates on HTML pages and hyperlinks instead of files and lines. It is meant to be run as a command-line script and provides the following variables: base_url, the URL the script was initially invoked on; base_path, the root of the saved data tree; url, the current URL being processed; linked_from, the parent of the current URL; and content, the actual data corresponding to the current URL.
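
To make the AWK analogy concrete, here is a rough Python analogue. It does not reproduce WebAwk's own syntax, which the description above does not show; the `walk` function, the `(pattern, action)` rule format, and the `fetch` callback are assumptions. Pages play the role of input lines, and each rule's action runs on every page whose record matches its pattern. The record carries the five variables named in the description.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def walk(base_url, fetch, rules, base_path="data", max_pages=50):
    """Breadth-first walk from base_url, applying each (pattern, action) rule per page.

    `fetch(url)` must return the page content as a string, or None to skip the URL.
    """
    queue = deque([(base_url, None)])
    seen = {base_url}
    visited = 0
    while queue and visited < max_pages:
        url, linked_from = queue.popleft()
        content = fetch(url)
        if content is None:
            continue
        visited += 1
        record = {
            "base_url": base_url,
            "base_path": base_path,
            "url": url,
            "linked_from": linked_from,
            "content": content,
        }
        # AWK-style dispatch: run every action whose pattern matches this page.
        for pattern, action in rules:
            if pattern(record):
                action(record)
        parser = LinkParser()
        parser.feed(content)
        for href in parser.links:
            target = urljoin(url, href)
            if target not in seen:
                seen.add(target)
                queue.append((target, url))
```

Passing `fetch` in as a parameter keeps the traversal independent of the network: a real run could wrap `urllib.request.urlopen`, while a dry run can use a dictionary of canned pages.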

Copyright ©2003-2014 SourceCodesWorld.com, All Rights Reserved.
Page URL: http://www.sourcecodesworld.com/scripts/Harvest-NG-10390/default.asp

