|
Zaval File Search
Version 1.3
User's Guide
Zaval Creative Engineering Group

Contents
Introduction to Zaval File Search
What Can You Do with Zaval File Search
When To Use Zaval File Search
Installing and Configuring the Zaval File Search
Operating the Zaval File Search
Zaval File Search: Search process notes
Zaval File Search Command Syntax
Zaval File Search Modes and Options

Introduction to Zaval File Search
The Zaval File Search tool is designed to provide easy and powerful indexing and search facilities in corporate networks with SMB/MS Network shares and FTP servers. Similar products include MS Indexing Service and Napster-like tools. Zaval File Search software has a multi-tier client-server architecture, where
In brief, the architecture is shown in the following picture:

What Can You Do with Zaval File Search
The Zaval File Search provides all the facilities needed to build an index by scanning SMB shares and FTP servers and then to search through this index using a user-friendly web-based interface. It supports many features, such as regular expressions and searches based on custom or predefined extensions. Starting from v1.3.1, the Zaval File Search engine supports an incremental index building scheme: if some hosts were turned off during the scanning process, they can be added to the index later via an incremental update (hosts that are already indexed are not re-scanned, to avoid network overload). This is really useful in large networks where you can't turn on every computer at the same time. The only thing you need to choose is the scanning period. In almost all cases a scanning period of one hour is enough to keep the database relevant, with only a minor delay in updating file links.

When To Use Zaval File Search
The Zaval File Search is best used for casual, irregular searches for various files in a local network based on FTP and SMB file sharing. The powerful and flexible search engine and indexing service allow you to retrieve a list of unique files placed on the network via both MS Network and FTP shares.

Installing and Configuring the Zaval File Search
Requirements
In order to run the Zaval File Search solution you need a Unix/Linux machine with the following tools installed:
All these tools come with almost any Linux distribution, so you probably already have them installed.

Installation
The Zaval File Search is currently distributed in three forms: Debian source packages, RPM-based source packages, and tarballs. All packages can be installed with standard, well-known package manager commands. In most cases the default settings can be left unchanged. The Zaval File Search contains two parts:
Note: both parts come as separate packages and can be downloaded from http://www.zaval.org/products/file-search/

The typical installation commands are

    apt-get install zfsearch-spider
    apt-get install zfsearch-client

or, by packages,

    rpm -i zfsearch-spider-1.2.2.i386.rpm
    rpm -i zfsearch-client-1.2.2.i386.rpm

or similar commands for dpkg. Sometimes you may get a message that several packages are missing; this can happen when you have installed those packages from sources. If you are sure that all packages are already installed, you can use the '--force' option for rpm.

Spider installation notes: The package manager will register the main script file in crontab to provide full-featured network scanning by default (the default settings usually work fine). However, you can make any necessary changes in the configuration files.

Client installation notes: The package manager will put the 'search.pl' script and related files into Apache's DocumentRoot in a separate virtual host, so all existing settings will be left unchanged. Make sure you have CGI.pm and mod_perl installed to run the 'search.pl' script. This script does not write to any files.

Alternatively, you can build proper packages from the corresponding src.rpm files as the rpm/make commands describe. These are the preferred ways to install the Zaval File Search tools. You can also compile it manually from the source tree with the following commands:

    make
    make install
    make install_docs

This way is not recommended.

Configuring spider
If you decide to make changes, make sure you set a proper timeout value (the interval between two consecutive network scans), because scanning a large network (more than 500 computers) takes several hours to complete. The exact value depends on the number of files on the shares and the local network speed.

Note: The script tries to scan several hosts at one time, so make sure there is enough free space and RAM on the host to avoid possible problems. You can change all paths and options in the spider.conf file.
Make sure the corresponding client part has the same settings inside its Perl scripts too if you've changed the default settings. An example is listed below:

    WINS=moon
    DATADIR=/var/spool/zfsearch
    HOST_LIST=/etc/zfsearch/hosts
    EXCLUDE_SHARES=/etc/netscan/smb_ignore_shares
    FTP_HOST_LIST=/etc/zfsearch/ftp_hosts
    FTP_USER=ftp
    FTP_PASSWORD=ftp@aol.com
    LANG=ru_RU.UTF-8
    TRANSLATE_COMMAND_SMB="iconv -f koi8-r -t UTF-8"
    TRANSLATE_COMMAND_FTP="iconv -f CP1251 -t UTF-8"
    TIME_RESCAN=12
    TIME_ALLOW=72

where
There is also an smbclient.conf file where you should specify domain options. See the example below:

    username=search
    password=searchpwd
    domain=workgroup

where
Configuring client
There is only one thing you need to change: the IP address of the virtual host in the /etc/zfsearch/zfsearch-httpd.conf file (change the IP 127.0.0.1 to your real address). After that, include this file in your Apache configuration file (httpd.conf) with the "Include" directive. Example:

    Include "/etc/zfsearch/zfsearch-httpd.conf"

All other settings can be left unchanged. You can change all paths, options and the SMB login in the spider.conf file; however, make sure the /etc/zfsearch/client.conf file has the same settings too.

Operating the Zaval File Search
To use the Zaval File Search you need a browser (even lynx will do). The JavaScript code can be ignored in all cases; it is used only to make the interface friendlier (currently only for the 'enable/disable' behavior). Specify the keywords to search for, choose the appropriate settings and use the 'Sniff!' button to run the search.

Note: you have to turn on "Always send URLs as UTF-8" in your browser for requests in languages other than English to work correctly. In MS Internet Explorer this feature can be found in Internet options -> Advanced.

The actual time you wait for the results depends on the number of files on the shares in your local network and the settings you have chosen. For a network of about 200-300 computers this time can be about 2 seconds for reasonable requests on a PII-450/128M. A better computer can operate much faster.

Zaval File Search has its own command syntax, similar to that of well-known Internet search engines, including grouping and logical operations. Perl-style regular expressions (without modifiers) and an easy mode similar to the well-known DOS meta characters are also available. All wildcards, options and custom file extensions are accumulated into the request where possible. The mutually exclusive options get enabling/disabling behavior in modern browsers such as IE/Mozilla to make option manipulation user-friendly. Use the appropriate options and modes below to make your request precise.
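The idea behind an easy mode based on DOS meta characters can be illustrated with a short Python sketch. It is not the code used by 'search.pl'; the function names, the case-insensitive matching and the exact meaning given to '*' and '?' are assumptions made for the illustration only.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a DOS-style wildcard pattern into an anchored regex string.

    Assumption for this sketch: '*' stands for any run of characters,
    '?' for exactly one character, and everything else is literal.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            # Escape characters such as '.' so they match literally.
            parts.append(re.escape(ch))
    return "^" + "".join(parts) + "$"


def match_names(pattern, names):
    """Return the names that match the wildcard pattern (case-insensitive)."""
    rx = re.compile(wildcard_to_regex(pattern), re.IGNORECASE)
    return [name for name in names if rx.match(name)]


if __name__ == "__main__":
    sample = ["report.doc", "Report2.DOC", "notes.txt"]
    print(wildcard_to_regex("*.doc"))
    print(match_names("*.doc", sample))
```

Translating the wildcard into a regular expression first means that a single matching routine can serve both the easy mode and the regular expression mode.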
Zaval File Search: Search process notes
Zaval File Search provides non-standard search capabilities. We have decided to divide the search process into two stages. The first stage searches file names only, without share names, full paths, sizes or other attributes, so that only unique names are enumerated (we say "names" here because unique files are not displayed at this stage). The second stage displays all files that correspond to the name selected. This design provides the fastest search for advanced users. Some names and name wildcards are very popular, so the files corresponding to a single name can produce a huge list of locations. The authors think that a list of tens of thousands of files with the same name in different places is not an effective way to manage and search files in a network; a short list of names is better. This feature lets users make non-strict, informal requests for file names with wildcards for the fastest navigation through the database. Logical operations bring additional flexibility: users can apply logical operations to wildcards within one request. In addition to the two-dimensional search, the Zaval File Search engine provides a relevant yet efficient search process.

Zaval File Search Command Syntax
The following syntax constructions are supported:
Zaval File Search Modes and Options
The global modes are:
In Smart search mode you can use several options to make your requests more accurate and precise:
All these options restrict the types of files to search through. All of them are translated to the following regex construction:

    <your command>(.)*\.(<ext1>|<ext2>|...)$

Regular expressions usage
This topic describes the syntax of regular expressions in the search engine. All features documented in perlre(1) are applicable. Matching operations can have various modifiers. Modifiers that relate to the interpretation of the regular expression inside are listed below. The following options are used in all cases:
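Returning briefly to the extension-filter construction shown at the end of the Modes and Options section: the sketch below assembles that construction for a list of extensions. It is written in Python rather than Perl, the function name is hypothetical, and it assumes the command is already a valid regular expression fragment; the real engine may build the expression differently.

```python
import re


def build_extension_regex(command, extensions):
    """Assemble '<command>(.)*' + a literal dot + '(<ext1>|<ext2>|...)$'.

    Assumption for this sketch: the command is used as a regex fragment
    unchanged, while each extension is escaped and matched literally.
    """
    alternatives = "|".join(re.escape(ext) for ext in extensions)
    return command + r"(.)*\.(" + alternatives + r")$"


if __name__ == "__main__":
    rx = build_extension_regex("song", ["mp3", "ogg"])
    print(rx)
    for name in ["my_song_live.mp3", "song.txt"]:
        print(name, bool(re.search(rx, name)))
```

The trailing "$" ties the extension group to the end of the file name, so a name such as "song.mp3.txt" is not reported for the "mp3" extension.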
The patterns used in the search engine are the same as those used in Perl pattern matching, which derive from those supplied in the Version 8 regex routines. See the appropriate documentation for details. In particular the following metacharacters have their standard meanings:
By default, the "^" character is guaranteed to match only the beginning of the string (file name), the "$" character only the end (or before the newline at the end). The following standard quantifiers are recognized:
The "*" modifier is equivalent to `{0,}', the "+" modifier to `{1,}', and the "?" modifier to `{0,1}'. A quantified subpattern is "greedy", that is, it will match as many times as possible (given a particular starting location) while still allowing the rest of the pattern to match. If you want it to match the minimum number of times possible, follow the quantifier with a "?". Note that the meanings don't change, just the "greediness":
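The effect of greediness can be seen in a short sketch. It uses Python's re module instead of Perl, which treats a "?" placed after a quantifier in the same way; the helper name and the sample file name are illustrative assumptions.

```python
import re


def stem(name, greedy=True):
    """Return the text captured before a dot, or None if there is no dot.

    With the greedy '+' the group runs up to the last dot in the name;
    with the minimal '+?' it stops at the first dot.
    """
    pattern = r"(.+)\." if greedy else r"(.+?)\."
    match = re.match(pattern, name)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(stem("backup.tar.gz"))                # greedy match
    print(stem("backup.tar.gz", greedy=False))  # minimal match
```

Both patterns accept the same file names; only the amount of text taken by the quantified subpattern differs.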
In addition, Perl defines the following:
A `\w' matches a single alphanumeric character, not a whole word. Use `\w+' to match a string of Perl-identifier characters (which isn't the same as matching an English word). If `use locale' is in effect, the list of alphabetic characters generated by `\w' is taken from the current locale. See the perllocale manpage. You may use `\w', `\W', `\s', `\S', `\d', and `\D' within character classes, but if you try to use them as endpoints of a range, that's not a range, and the "-" is understood literally. The POSIX character class syntax is also supported. See the perlre documentation for details.

Product limitations
This product was designed for irregular use in small and medium Intranets, and the engine was not optimized to hold hundreds of millions of files in its database. The engine was tested in an Intranet with 300-500 computers with a lot of shares available (approximately 500 000 - 1 000 000 files in indexes only). A large database with thousands of available shares can cause significant slowdown when several users search for something like "all files containing the 'a' symbol" at one time (especially if the host computer is not very fast), so if you are looking for a reliable Internet-related search engine (such as filez.com has), our engine is not for you. However, it works fine for a small company with 200-300 computers. ;)

Further product plans
The current tool implementation follows the minimalist computing concept. The following features will probably be added:
Support available
All requests for support with software installation and problems should be sent directly to support@zaval.org with 'Re: Zaval File Search Support' in the subject line and plain text in the message body, describing your request and/or your problem. Since this software is distributed under the General Public License and is maintained by its authors on a non-commercial basis, your request will be answered as soon as possible, but no later than 5 business days. The Zaval Creative Engineering Group carries out software customization and new software development on a regular basis. For more info contact us at info@zaval.org.

Stay informed!
Now you can receive information on the latest product updates and hotfixes via email.
|
Original idea: Victor Krapivin. Developed under Zaval Creative Process(TM). |
Copyright © Zaval Creative Engineering Group, 2000-2005 |
Distributed under GPL v2. All Rights Reserved. |