!Project Description
The Distributed Indexing Framework (DIF) is a framework that makes it easier to write a distributed web indexing engine from scratch. DIF provides a server, the Index Manager, that manages the work and the distributed workers.
!The Distributed Indexing Framework consists of:
- The Index Manager manages the work and the distributed workers. In the current version, a unit of work is a link (URL). The Index Manager exposes a WCF service that the workers use to communicate with it.
- The Index Client, or worker, receives the work and processes it (for example, downloads the page, indexes it, and saves the result).
- The Storage Service is a simple service to use if you want to centralize the indexed data: the Index Client can ship its indexed data to this service.
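The way the three components relate can be sketched in a few lines of Python. This is only an in-process illustration, not the framework's actual WCF contracts: the class names mirror the components above, but every method name (`next_work`, `store`, `index`, `run_once`) is a hypothetical stand-in, and the "indexing" step is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class StorageService:
    """Optional central store for indexed data (in-memory stand-in)."""
    documents: dict = field(default_factory=dict)

    def store(self, url: str, indexed: dict) -> None:
        self.documents[url] = indexed


@dataclass
class IndexManager:
    """Holds the pending work; each work item is a URL."""
    pending: list = field(default_factory=list)

    def next_work(self):
        # Return the next URL, or None when nothing is pending.
        return self.pending.pop(0) if self.pending else None


@dataclass
class IndexClient:
    """Worker: takes a URL, indexes it, and ships the result to storage."""
    manager: IndexManager
    storage: StorageService

    def index(self, url: str) -> dict:
        # Placeholder for "download the page and index it".
        return {"url": url, "length": len(url)}

    def run_once(self) -> bool:
        url = self.manager.next_work()
        if url is None:
            return False
        self.storage.store(url, self.index(url))
        return True
```

In the real framework the calls between these objects would cross process boundaries through the WCF service; here they are plain method calls so the data flow is easy to follow.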
!Framework Features:
- It is completely open source, and you can change and improve it to fit your needs.
- Communication within the Framework is based on WCF.
- The Framework is highly reliable because it uses Queued File to save the work (links) to files (see http://www.Codeplex.com/QF). The system also has a backup and recovery mechanism that allows work to be recovered and resumed after a failure (power loss, Windows restart, etc.). In addition, the Index Client can enter a hibernate mode if the Index Manager or the Storage Service stops, and it stays in that mode until they are running again.
- The Framework supports an unlimited number of distributed workers to make the indexing work fast.
- The Index Manager can manage many index tasks, and a large number of workers for every task, at the same time.
- and more...
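The reliability feature rests on keeping pending links on disk so they survive a restart. The sketch below illustrates that general idea only. It is not the Queued File library's API: `FileBackedQueue` and its methods are hypothetical names, and the JSON file format is an assumption made for the example.

```python
import json
import os


class FileBackedQueue:
    """Keeps pending links in a JSON file so they survive a restart."""

    def __init__(self, path: str):
        self.path = path
        self.items = []
        if os.path.exists(path):
            # Recovery: reload whatever was pending before the shutdown.
            with open(path, "r", encoding="utf-8") as f:
                self.items = json.load(f)

    def _flush(self) -> None:
        # Write to a temp file, then rename it over the real file, so a
        # crash mid-write cannot leave a half-written queue file behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.items, f)
        os.replace(tmp, self.path)

    def enqueue(self, link: str) -> None:
        self.items.append(link)
        self._flush()

    def dequeue(self):
        # Return the oldest link, or None when the queue is empty.
        if not self.items:
            return None
        link = self.items.pop(0)
        self._flush()
        return link

    def __len__(self) -> int:
        return len(self.items)
```

Creating a second `FileBackedQueue` on the same path after a simulated crash picks up the links that were still pending, which is the resume behavior described above. Rewriting the whole file on every operation is simple but slow; a production queue would likely append or use segment files instead.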
!The work scenario:
- The Index Manager starts and, if there is work that needs recovery, resumes it.
- The workers register and start requesting new work.
- The workers finish the work and then request new work.
- and so on.
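The scenario above can be simulated as a small loop. This is a minimal sketch under stated assumptions: `Manager`, `register`, `request_work`, and `run_worker` are hypothetical names, the recovered work is passed in as a plain list, and the actual download and indexing step is replaced by a stand-in.

```python
from collections import deque


class Manager:
    def __init__(self, recovered_links=()):
        # Step 1: resume any work recovered from a previous run.
        self.queue = deque(recovered_links)
        self.workers = set()

    def register(self, worker_id: str) -> None:
        # Step 2: a worker announces itself before asking for work.
        self.workers.add(worker_id)

    def request_work(self, worker_id: str):
        if worker_id not in self.workers:
            raise ValueError(f"unregistered worker: {worker_id}")
        # Hand out the next link, or None when nothing is left.
        return self.queue.popleft() if self.queue else None


def run_worker(manager: Manager, worker_id: str) -> list:
    manager.register(worker_id)
    done = []
    # Steps 3 and 4: finish one item, then request the next, and so on.
    while (link := manager.request_work(worker_id)) is not None:
        done.append(link)  # stand-in for download + index + save
    return done
```

For example, `run_worker(Manager(["u1", "u2"]), "w1")` processes both links in order and returns when the manager has no more work to hand out.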