Internet data search programs. Professional search for information on the Internet

For professional Internet searches you need specialized software, as well as specialized search engines and search services.

PROGRAMS

http://dr-watson.wix.com/home – the program is designed to study arrays of text information in order to identify entities and connections between them. The result of the work is a report on the object under study.

http://www.fmsasg.com/ – Sentinel Visualizer, one of the best programs in the world for visualizing connections and relationships. The company has fully localized its products into Russian and set up a Russian-language hotline.

http://www.newprosoft.com/ – “Web Content Extractor” is the most powerful, easy-to-use software for extracting data from web sites. It also has an effective Visual Web spider.

SiteSputnik – a software package with no analogues in the world that lets you search the Visible and Invisible Internet with all the search engines you need and process the results.

WebSite-Watcher – lets you monitor web pages (including password-protected ones), forums, RSS feeds, newsgroups and local files. It has a powerful system of filters. Monitoring runs automatically, and the results are delivered in a user-friendly form. The version with advanced functions costs 50 euros. Constantly updated.

http://www.scribd.com/ is the most popular platform in the world and increasingly used in Russia for posting various kinds of documents, books, etc. for free access with a very convenient search engine for titles, topics, etc.

http://www.atlasti.com/ is the most powerful and effective tool for qualitative information analysis available to individual users, small and even medium-sized businesses. The program is multifunctional and therefore useful. It combines the ability to create a unified information environment for working with various text, tabular, audio and video files as a single whole, as well as tools for qualitative analysis and visualization.

Ashampoo ClipFinder HD – an ever-increasing share of the information flow comes from video, so competitive intelligence officers need tools for working with this format. One such product is this free utility. It searches for videos by specified criteria on video hosting sites such as YouTube. The program is easy to use and displays all search results on one page with detailed information: title, duration, upload time, etc. There is a Russian interface.

http://www.advego.ru/plagiatus/ – the program was made by SEO specialists, but is quite suitable as an Internet intelligence tool. Plagiatus shows the degree of uniqueness of a text, its sources, and the percentage of matching text. It also checks the uniqueness of a specified URL. The program is free.

http://neiron.ru/toolbar/ – includes an add-on that combines Google and Yandex search, and also supports competitive analysis based on assessing the effectiveness of sites and contextual advertising. Implemented as a plugin for Firefox and Google Chrome.

http://web-data-extractor.net/ is a universal solution for obtaining any data available on the Internet. Setting up data extraction from any page takes a few mouse clicks: you just select the data area that you want to save, and Datacol automatically picks a formula for extracting that block.

CaptureSaver is a professional Internet research tool. A simply irreplaceable working program that lets you capture, store and export any Internet information, including not only web pages and blogs, but also RSS news, email, images and much more. It has very broad functionality, an intuitive interface and a very low price.

http://www.orbiscope.net/en/software.html – web monitoring system at more than affordable prices.

http://www.kbcrawl.co.uk/ – software for working on the Internet, including the “Invisible Internet”.

http://www.copernic.com/en/products/agent/index.html – the program allows you to search using more than 90 search engines, using more than 10 parameters. Allows you to combine results, eliminate duplicates, block broken links, and show the most relevant results. Comes in free, personal and professional versions. Used by more than 20 million users.

Maltego is a fundamentally new software that allows you to establish the relationship of subjects, events and objects in real life and on the Internet.

SERVICES

new – web browser with dozens of pre-installed tools for OSINT.

– an effective search engine-aggregator for finding people on major Russian social networks.

https://hunter.io/ is an effective service for detecting and checking email.

https://www.whatruns.com/ – an easy-to-use but efficient scanner that lets you discover what works and doesn't work on a website and what its security holes are. Also implemented as a plugin for Chrome.

https://www.crayon.co/ is an American budget platform for market and competitive intelligence on the Internet.

http://www.cs.cornell.edu/~bwong/octant/ – host identifier.

https://iplogger.ru/ – a simple and convenient service for determining someone else’s IP.

http://linkurio.us/ is a powerful new product for economic security workers and corruption investigators. Processes and visualizes huge amounts of unstructured information from financial sources.

http://www.intelsuite.com/en – English-language online platform for competitive intelligence and monitoring.

http://yewno.com/about/ is the first operating system for translating information into knowledge and visualizing unstructured information. Currently supports English, French, German, Spanish and Portuguese.

https://start.avalancheonline.ru/landing/?next=%2F – forecasting and analytical services by Andrey Masalovich.

https://www.outwit.com/products/hub/ – a full set of stand-alone programs for professional work on the web.

https://github.com/search?q=user%3Acmlh+maltego – extensions for Maltego.

http://www.whoishostingthis.com/ – search engine for hosting, IP addresses, etc.

http://appfollow.ru/ – analysis of applications based on reviews, ASO optimization, and positions in the top charts and search results for the App Store, Google Play and Windows Phone Store.

http://spiraldb.com/ is a service implemented as a plugin for Chrome, which allows you to get a lot of valuable information about any electronic resource.

https://millie.northernlight.com/dashboard.php?id=93 - free service that collects and structures key information by industry and company. Information dashboards based on text analysis are available.

http://byratino.info/ – collection of factual data from publicly available sources on the Internet.

http://www.datafox.co/ – CI platform collects and analyzes information on companies of interest to clients. There is a demo.

https://unwiredlabs.com/home - a specialized application with an API for searching by geolocation of any device connected to the Internet.

http://visualping.io/ – a service for monitoring sites and, first of all, the photographs and images on them. Even if a photo appeared for only a second, it will be e-mailed to the subscriber. Has a plugin for Google Chrome.

http://spyonweb.com/ is a research tool that allows for in-depth analysis of any Internet resource.

http://bigvisor.ru/ – the service allows you to track advertising campaigns for certain segments of goods and services, or specific organizations.

http://www.itsec.pro/2013/09/microsoft-word.html – instructions by Artem Ageev on using Windows programs for competitive intelligence needs.

http://granoproject.org/ is an open-source tool for researchers who track networks of connections between individuals and organizations in politics, economics, crime, etc. It lets you connect, analyze and visualize information obtained from various sources, and show significant connections.

http://imgops.com/ – a service for extracting metadata from graphic files and working with them.

http://sergeybelove.ru/tools/one-button-scan/ – a small online scanner for checking security holes in websites and other resources.

http://isce-library.net/epi.aspx – service for finding primary sources from a fragment of text in English.

https://www.rivaliq.com/ is an effective tool for conducting competitive intelligence in Western, primarily European and American markets for goods and services.

http://watchthatpage.com/ is a service that allows you to automatically collect new information from monitored Internet resources. The service is free.

http://falcon.io/ is a kind of Rapportive for the Web. It is not a replacement for Rapportive, but provides additional tools. Unlike Rapportive, it gives a general profile of a person, as if glued together from data from social networks and mentions on the web.

https://addons.mozilla.org/ru/firefox/addon/update-scanner/ – add-on for Firefox. Monitors web page updates. Useful for websites that do not have news feeds (Atom or RSS).

http://agregator.pro/ – aggregator of news and media portals. Used by marketers, analysts, etc. to analyze news flows on certain topics.

http://price.apishops.com/ – automated web service for monitoring prices for selected product groups, specific online stores and other parameters.

http://www.la0.ru/ is a convenient and relevant service for analyzing links and backlinks to an Internet resource.

www.recordedfuture.com is a powerful tool for data analysis and visualization, implemented as an online service built on cloud computing.

http://advse.ru/ is a service with the slogan “Find out everything about your competitors.” Allows you to obtain competitors' websites in accordance with search queries and analyze competitors' advertising campaigns in Google and Yandex.

http://spyonweb.com/ – the service lets you identify sites that share the same characteristics, such as the same Google Analytics identifiers, IP addresses, etc.

http://www.connotate.com/solutions – a line of products for competitive intelligence, managing information flows and converting information into information assets. It includes both complex platforms and simple, cheap services that allow for effective monitoring along with information compression and obtaining only the necessary results.

http://www.clearci.com/ - competitive intelligence platform for businesses of various sizes, from start-ups and small companies to Fortune 500 companies. Delivered as SaaS.

http://startingpage.com/ is a Google add-on that allows you to search on Google without recording your IP address. Fully supports all Google search capabilities, including in Russian.

http://newspapermap.com/ – a unique service, very useful for competitive intelligence work. It links geolocation with an online media search engine. That is, you select the region, or even the city, or the language you are interested in, see the place on the map along with a list of online versions of newspapers and magazines, click the appropriate button and read. Supports Russian; very user-friendly interface.

http://infostream.com.ua/ is a very convenient news monitoring system “Infostream”, distinguished by a first-class selection and quite accessible to any wallet, from one of the classics of Internet search, D.V. Lande.

http://www.instapaper.com/ is a very simple and effective tool for saving the necessary web pages. Can be used on computers, iPhones, iPads, etc.

http://screen-scraper.com/ – allows you to automatically extract all information from web pages, download the vast majority of file formats, and automatically enter data into various forms. Saves downloaded files and pages in databases, performs many other extremely useful functions. Works on all major platforms, has fully functional free and very powerful professional versions.

http://www.mozenda.com/ - a web service with several pricing plans for multifunctional web monitoring and delivery of the information the user needs from selected sites; affordable even for small businesses.

http://www.recipdonor.com/ - the service allows you to automatically monitor everything that happens on competitors' websites.

http://www.spyfu.com/ – the same, for when your competitors are foreign.

www.webground.su is a service for monitoring the Runet created by Internet search professionals, which includes all the major providers of information, news, etc., and is capable of individual monitoring settings to suit the user’s needs.

SEARCH ENGINES

https://www.idmarch.org/ is the best search engine for the world archive of pdf documents in terms of quality. Currently, more than 18 million pdf documents have been indexed, ranging from books to secret reports.

http://www.marketvisual.com/ is a unique search engine that allows you to search for owners and top management by full name, company name, position, or a combination thereof. The search results contain not only the objects you are looking for, but also their connections. Designed primarily for English-speaking countries.

http://worldc.am/ is a search engine for freely accessible photographs linked to geolocation.

https://app.echosec.net/ is a public search engine that describes itself as the most advanced analytical tool for law enforcement and security and intelligence professionals. Allows you to search for photos posted on various sites, social platforms and social networks in relation to specific geolocation coordinates. There are currently seven data sources connected. By the end of the year their number will be more than 450. Thanks to Dementy for the tip.

http://www.quandl.com/ is a search engine for seven million financial, economic and social databases.

http://bitzakaz.ru/ – search engine for tenders and government orders with additional paid functions

Website-Finder - makes it possible to find sites that Google does not index well. The only limitation is that it only searches 30 websites for each keyword. The program is easy to use.

http://www.dtsearch.com/ is a powerful search engine that allows you to process terabytes of text. Works on desktop, web and intranet. Supports both static and dynamic data. Allows you to search in all MS Office programs. The search is carried out using phrases, words, tags, indexes and much more. The only federated search engine available. It has both paid and free versions.

http://www.strategator.com/ – searches, filters and aggregates information about the company from tens of thousands of web sources. Searches in the USA, Great Britain, major EEC countries. It is highly relevant, user-friendly, and has free and paid options ($14 per month).

http://www.shodanhq.com/ is an unusual search engine. Immediately after its appearance, it received the nickname “Google for hackers.” It does not search for pages, but determines IP addresses and the types of routers, computers, servers and workstations located at a particular address, traces chains of DNS servers, and offers many other functions of interest for competitive intelligence.

http://search.usa.gov/ – search engine for the sites and open databases of all US government agencies. The databases contain a lot of practically useful information, including for use in our country.

http://visual.ly/ – today visualization is increasingly used to present data. This is the first infographic search engine on the Web. Along with the search engine, the portal has powerful data visualization tools that do not require programming skills.

http://go.mail.ru/realtime – search for discussions of topics, events, objects, subjects in real or customizable time. The previously highly criticized search in Mail.ru works very effectively and provides interesting, relevant results.

Zanran has only just launched but already works great: the first and only search engine for data, extracting it from PDF files, Excel tables and HTML pages.

http://www.ciradar.com/Competitive-Analysis.aspx is one of the world's best information retrieval systems for competitive intelligence on the deep web. Retrieves almost all types of files in all formats on the topic of interest. Implemented as a web service. The prices are more than reasonable.

http://public.ru/ – Effective search and professional analysis of information, media archive since 1990. The online media library offers a wide range of information services: from access to electronic archives of Russian-language media publications and ready-made thematic press reviews to individual monitoring and exclusive analytical research based on press materials.

Cluuz is a young search engine with ample opportunities for competitive intelligence, especially on the English-language Internet. Allows you not only to find, but also to visualize and establish connections between people, companies, domains, e-mails, addresses, etc.

www.wolframalpha.com – the search engine of tomorrow. In response to a search query, it displays the statistical and factual information available on the object of the query, including visualized information.

www.ist-budget.ru – universal search in databases of government procurement, tenders, auctions, etc.

To say that in our age of information technology and endless growth in the volume of data available to both the individual and society there are many problems with processing information and searching for it is by now a platitude. Who doesn't raise this topic? So as not to burden you with subjective and, in part, objective judgments on the problem drawn from various sources, I will move straight to its solution. Today we'll talk about search: about programs and serious information systems that find the documents and data we need.

Upgrade "direct search"

Not so long ago, when the trees were tall and even an enterprise's local network held relatively little information, any search was carried out by simply going through a handful of available files and checking their names and contents one after another. Such a search is called direct, and programs (utilities) that use direct search technology are traditionally present in all operating systems and tool packages. But even the power of modern computers is not enough for a quick and adequate direct search through gigantic volumes of data. Searching a couple of hundred documents on a disk and searching a huge library plus several dozen mailboxes are two different things. Therefore, direct search programs are clearly fading into the background today, at least as universal tools.

Of course, this type of search has not been in demand in the corporate sector for a long time. The volumes are not the same. For many years now, and lately in particular, technologies capable of fast and accurate search across documents of various formats and from various sources have been more than relevant. Not so long ago, Microsoft's “father” Bill Gates, apparently envious of the phenomenal success of the Internet search engine Google, announced at a press conference the desire of the software industry (and not only it) to contribute in every possible way to developing and deepening search engines and technologies. But it is too early to speak of any phenomenally working program from Microsoft or of a competitive server on the Internet (MSN still doesn't measure up to Google). Therefore, let's turn to existing developments.

Index, query, relevance

Modern search technologies are based on two fundamental processes: first, indexing the available information; second, processing the query and outputting the results. As for the first, any program (be it a desktop search engine, a corporate information system or an Internet search engine) creates its own search area. That is, it processes documents and generates an index of them (an organized structure that holds information about the processed data). From then on, it is the index that is used for the work: quickly obtaining the list of documents that match a query. What follows, although by no means simple technologically, is quite understandable to the average user. The program processes the query (a keyword phrase) and displays a list of documents that contain this phrase. Since the information is held in a structured index, query processing is much faster (tens and hundreds of times!) than with direct search: documents are selected not by enumerating files, but by analyzing the text information in the index.
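The two processes just described can be sketched in a few lines of Python. This is only a minimal illustration of an inverted index, not the internals of any program in this review; the function names (`build_index`, `search`) and the sample documents are assumptions made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"\w+", text.lower())


def build_index(documents):
    """Indexing step: map every word to the set of documents containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Query step: return the documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        # Intersect with the documents of each further word; no file is re-read.
        result &= index.get(word, set())
    return result


# Hypothetical sample data, used only to exercise the sketch.
documents = {
    "report.txt": "Quarterly sales report for the search team",
    "memo.txt": "Memo about the new desktop search engine",
    "notes.txt": "Meeting notes on quarterly planning",
}
index = build_index(documents)
print(sorted(search(index, "search")))
print(sorted(search(index, "quarterly report")))
```

The point of the design is that the documents are read once, at indexing time; every later query only looks words up in the index, which is where the speed advantage over direct search comes from.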

The program displays the found documents in the results list by relevance, that is, by how well the document matches the query text. Different technologies, of course, use different methods for searching and for determining relevance (the number of occurrences of a word in the document, the ratio of that number to the total number of words in the document, the distance between the words of the query phrase in the searched files, and so on). Based on these parameters, the “weight” of the document is determined, and this weight decides the position at which a particular file appears in the list of results. In the case of Internet search, the situation is even more complicated, since many other factors must be taken into account (Google's PageRank is an example). But this is a topic for a separate article, so we won't touch the Internet.

Review of search engines

This material discusses the capabilities of several popular search programs that boast both decent speed and good functionality. But showing off in brochures is one thing; standing up to an expert's scrutiny is quite another. And our experts were, no more and no less, an office full of people who like to put software through its paces. On a test computer (Athlon 2.2 GHz, 1 GB of RAM, a 160 GB Seagate 7200 rpm IDE hard drive, Windows XP) a set of programs was installed: dtSearch Desktop, Ischeyka Prof Deluxe, Google Desktop Search, SearchInform, Copernic Desktop Search and ISYS Desktop. For the tests, a text database of documents in doc, txt and html formats was compiled, with a total size of no less than 20 gigabytes. A group of comrades under the leadership of your humble servant tested, compared and shared their subjective impressions of each program. Read a summary of the findings below.

dtSearch Desktop

A program that, according to its developers, claims to be the fastest, most convenient and best search engine. As, in general, does every other program in this review. The dtSearch interface is quite simple, but some windows and tabs are overloaded with elements, which makes the program seem difficult to use. In reality there are no particular difficulties. The only really unpleasant point is the lack of a Russian interface: although the program can search documents in several languages, its interface is exclusively English.

But dtSearch is one of the few programs that can index web pages to a user-specified “depth” (albeit only with the separately purchased dtSearch Spider add-on). This is in addition to supporting files on disk in various text formats and emails from Outlook mailboxes. At the same time, the program cannot work with databases, which are such a tasty morsel for search engines because of the large volumes of information they hold and their wide use in companies, and therefore in corporate networks. The indexing speed of dtSearch turned out to be at the proper level. Looking ahead, I will say that this program handled the indexing of the given amount of information on a par with another competitor, ISYS, and shared with it second place in the list of the fastest systems. dtSearch indexed the test 20 gigabytes of information in 6 hours and 13 minutes, creating an index of 7.9 GB for subsequent searches.

As for the search capabilities, they are at the proper level. First, dtSearch has morphological search (searching for a word in all its morphological forms). With it, you free yourself from thoughts such as “in what grammatical case was the word used in the document I need?” Morphological search is almost always justified, so it should be present in any professional search engine.
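To make the idea concrete, here is a deliberately naive sketch in Python: every word is reduced to a crude base form by stripping a few English suffixes, and the comparison is done on base forms. This is an assumption-laden toy, not how dtSearch implements morphology; real morphological search, especially for Russian, relies on dictionaries and much richer rules. The names `normalize` and `morphological_search` and the suffix list are invented for the example.

```python
# A tiny, hypothetical suffix list; real systems use full morphological dictionaries.
SUFFIXES = ("ing", "ed", "es", "s")


def normalize(word):
    """Reduce a word to a crude base form by stripping one known suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three letters so that short words are not mangled.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def morphological_search(documents, query_word):
    """Return names of documents containing any form of the query word."""
    target = normalize(query_word)
    return sorted(
        name
        for name, text in documents.items()
        if any(normalize(word) == target for word in text.split())
    )


# Hypothetical sample data.
documents = {
    "a.txt": "She searched the archive",
    "b.txt": "Searching takes time",
    "c.txt": "Nothing relevant here",
}
print(morphological_search(documents, "searches"))
```

The query form and the document forms never match literally here; they meet only after both have been normalized, which is the essence of the feature.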

Search by sound is a non-standard feature even for professional search engines. Its essence is that the program searches for words that sound the same as the word you entered. And the best part is that this function also works for the Russian language! For example, when you type a word in a search query, the results will include not only that word but also differently spelled words that sound the same.
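One classic way to compare words by sound is the Soundex code, sketched below in Python. The review does not say which algorithm dtSearch uses, so treat this purely as an illustration of the principle; note also that Soundex was designed for English, and a phonetic search for Russian would need different rules. The function name `soundex` is simply the conventional one for this algorithm.

```python
# Consonant groups of the classic (American) Soundex scheme.
CODES = {}
for letters, digit in (
    ("bfpv", "1"),
    ("cgjkqsxz", "2"),
    ("dt", "3"),
    ("l", "4"),
    ("mn", "5"),
    ("r", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the four-character Soundex code of a word ('' for no letters)."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    encoded = first.upper()
    previous = CODES.get(first, "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        # Emit a digit only when it differs from the previous letter's code.
        if code and code != previous:
            encoded += code
        # 'h' and 'w' do not separate equal codes; vowels do.
        if ch not in "hw":
            previous = code
    return (encoded + "000")[:4]


for name in ("Robert", "Rupert", "Smith", "Smyth"):
    print(name, soundex(name))
```

Two words are treated as sounding alike when their codes are equal, so a phonetic index can simply store the code next to each word.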

Search with error correction is a very important function. It is used to find words containing spelling errors: these can be typos or, for example, errors in documents produced by character recognition systems. A simple example: you are looking for the word “keyboard”. Some document contains the word “keybord”; obviously this is really the word “keyboard”, and the person just made a typo when typing. An error-correcting search will detect the document with the word “keybord” and include it in the results. dtSearch also has a setting that determines the permitted number of erroneous characters.
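A common way to implement this kind of tolerance is the edit (Levenshtein) distance: the number of single-character insertions, deletions and substitutions needed to turn one word into another. The Python sketch below is an illustration under that assumption, not a description of dtSearch's own algorithm; the `max_errors` parameter is a hypothetical stand-in for the fuzziness setting mentioned above.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete a character from a
                    current[j - 1] + 1,      # insert a character into a
                    previous[j - 1] + cost,  # substitute (or keep) a character
                )
            )
        previous = current
    return previous[-1]


def fuzzy_find(words, query, max_errors=1):
    """Return the words within max_errors edits of the query (case-insensitive)."""
    query = query.lower()
    return [w for w in words if edit_distance(w.lower(), query) <= max_errors]


# "keybord" is one deletion away from "keyboard", so it is accepted.
print(fuzzy_find(["keybord", "keyboard", "mouse"], "keyboard"))
```

Raising `max_errors` finds more misspellings but also more unrelated words, which is why such a threshold is usually left to the user.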

Search using synonyms. This feature uses a list of synonyms for various words. For example, if you enter the word “fast”, the program will also find “high-speed” and other synonyms of “fast”, provided, of course, that they are present in the list of synonyms. A ready-made list of synonyms is not supplied with dtSearch; however, you can use lists from the Internet (which requires a connection, and that is not always convenient) or create your own list.
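In code, synonym search boils down to expanding the query before matching. A minimal Python sketch follows; the synonym list, the function names and the sample documents are all made up for the illustration, playing the role of the user-created list mentioned above.

```python
# A hypothetical user-supplied synonym list.
SYNONYMS = {
    "fast": {"quick", "high-speed", "rapid"},
}


def expand_query(word, synonyms):
    """Return the query word together with its listed synonyms."""
    word = word.lower()
    return {word} | synonyms.get(word, set())


def search_with_synonyms(documents, word, synonyms):
    """Return names of documents containing the word or any of its synonyms."""
    terms = expand_query(word, synonyms)
    return sorted(
        name
        for name, text in documents.items()
        if terms & set(text.lower().split())
    )


# Hypothetical sample data.
documents = {
    "a.txt": "a high-speed train",
    "b.txt": "a slow boat",
    "c.txt": "fast food",
}
print(search_with_synonyms(documents, "fast", SYNONYMS))
```

A word missing from the list is simply searched for as-is, which matches the caveat that synonyms are found only if they are present in the list.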

In addition to the listed capabilities, dtSearch can search using phrases made up of words connected by logical operations. Each word in a query can be assigned its own “weight”, that is, significance. A useful option is a dictionary of insignificant words that are ignored during the search; however, this dictionary is also empty, and you will have to fill it yourself.
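The three ideas in this paragraph (logical operations, per-word weights and a list of ignored words) can be shown together in a short Python sketch. It does not reproduce dtSearch's query syntax; the function names, the ignored-word list and the weights are hypothetical and serve only to show how the pieces interact.

```python
# A hypothetical list of insignificant words that the search ignores.
NOISE_WORDS = {"the", "of", "and", "a"}


def tokens(text, noise_words=NOISE_WORDS):
    """Split text into lowercase words, dropping the ignored words."""
    return [w for w in text.lower().split() if w not in noise_words]


def matches_all(text, terms):
    """Logical AND: every term must occur in the text."""
    words = set(tokens(text))
    return all(term in words for term in terms)


def matches_any(text, terms):
    """Logical OR: at least one term must occur in the text."""
    words = set(tokens(text))
    return any(term in words for term in terms)


def weighted_score(text, weights):
    """Sum of (weight x number of occurrences) over the weighted query terms."""
    words = tokens(text)
    return sum(weight * words.count(term) for term, weight in weights.items())


text = "the index of the search engine and the search cache"
print(matches_all(text, ["search", "engine"]))
print(matches_any(text, ["mouse", "cache"]))
print(weighted_score(text, {"search": 2, "index": 5}))
```

Because ignored words are removed before matching, a query term such as "the" contributes nothing to the score however large its weight.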

Next, let's look at the program's capabilities when working on a network. In fact, dtSearch does not offer any specific capabilities for this. However, it is quite possible to use it on a network. For example, you can create an index and put it in a public (shared) folder. The program itself can be installed on each user's computer, or it can also be placed in a shared folder, with special shortcuts created for each user using command-line parameters, whose purpose is described in the help file supplied with the program. There is also the possibility of installing the program over the network automatically using an MSI file. In that case the settings for each connected user are taken into account.

In general, it is a good program in the category of professional search engines. It can lay claim to a good rating, but gaining users' trust and respect may not be easy for dtSearch because of certain factors (not everything is smooth with the interface, Russian-speaking users are left out, and there are no standout features for working on a network). As for the search itself, the program had no problems with Russian text, with the declared morphology, or with fuzzy search. The system found the necessary documents quite adequately both with a simple one-word query and with a couple of paragraphs or a whole document as the key phrase.

Official site:
Distribution size: 23 MB

Ischeyka Prof Deluxe

Judging by the name (Ischeyka is Russian for “bloodhound”), you can guess that this program supports the Russian language. This is already nice. As for the interface, it is somewhat unusual but very attractive in appearance. Convenience is another matter. It is a very debatable criterion, but a multi-window solution is probably not the most successful option (the query is entered in one window, the result is displayed in another, and so on).

Ischeyka likewise uses indexes to perform a quick search, but its indexing is much slower than that of the other programs. This is very strange, especially considering that its capabilities for processing search queries are very weak, so the index structure is not complex. Most likely, this is due to unoptimized algorithms. The program turned out to be a clear outsider in indexing and search speed: the time spent on creating an index is six times longer than that of dtSearch and ISYS. Indexing 20 gigabytes of texts took Ischeyka 38 hours and 46 minutes. And the created “search area” took up almost as much space on the hard drive as the original data: 19 gigabytes.

Ischeyka can be seen as an alternative to the standard Windows search; it is unlikely to be capable of more. That its primary task is the simplest search for files is indicated not only by the small number of functions for analyzing the text of search queries and by the advanced search on file attributes, but even by the results window, which provides direct links to the files found as well as to the folders containing them. The results window is not very informative in the sense that you can read the entire found file only by opening it; there is no built-in file viewer. But an excerpt from the file where the searched word was found is displayed; in general, this display scheme is very reminiscent of Internet search engines.

Speaking about specific capabilities for processing search queries, it is worth noting that there is no such thing as a “search text” here; the most that can be searched for is a phrase, if only because there is no multi-line input field. The entered phrase can, however, be analyzed, and Ischeyka offers the standard search set here: logical operations, mask search and quoted-phrase search... not a lot. The program contains some rudiments of morphological search, but they are so crude that they rather interfere with correct operation (during testing, many errors caused by incorrect use of morphology were noticed).

But the program allows you to specify file attributes when searching (document date, file name, folder name), and in these queries you can use the same search set. You can also search for emails by specifying parameters (From, Subject, etc.).

So, we have figured out the search itself. What else is interesting about the program, for which, according to the official website, it received so many awards? It's hard to say what is so special about it; most likely, it is the attractive interface (attractive in appearance, that is, not in usability).

Operations with indexes are quite standard; a nice feature is the ability to update indexes on a schedule. Indexes can also be used over a network. This deserves a closer look.

Despite the primitiveness of its search queries, the program can be used to search for files, so its use in networks can be justified. Although this is a stretch, since in a large network, with its huge amount of information, the priority is quick search using complex queries, and the program clearly has problems with search speed. I must say that network support in Ischeyka is thought out properly. A separate application, Ischeyka Server, is designed specifically for this. It works the same way as plain Ischeyka (they share the same search engine), only for documents hosted on a central server or on shared resources in a corporate network. Ischeyka Server creates new indexes on shared resources or uses previously created ones. Any user of the corporate network can connect to the search server and use it to access any document (covered by the current index) through an Internet browser. Agree, this scheme is extremely convenient: files on your own network can be searched in the same way as information on the Internet through, for example, Google.

Weighing all the advantages and disadvantages of this program, the conclusion suggests itself that its capabilities are most likely not enough for corporate networks (despite the good organization of network support), but for a home computer or even a home network it might, in principle, do. Although neither the speed of work nor the search capabilities inspire optimism...

Official website in Russian:
Distribution size: 6 MB

Google Desktop Search + GDS Enterprise

Of course, we couldn’t ignore such a famous developer. The name Google already says a lot. People who have been using the most powerful Internet search engine for years will certainly decide, without a moment's doubt, to install this particular search engine on their computer. Just think: Google on your home computer! However, without giving in to the pull of a widely promoted brand, let’s try to consider the capabilities of Google's “desktop” search engine soberly and, most importantly, objectively.

The first thing that catches your eye is that the program has no shell of its own. Google Desktop Search still lives in the browser window, so the entire interface of the desktop version is inherited from its older Internet brother. Whether this is good or bad is a moot point: some people like the minimalism in the design of this search engine, while others want to see a full-fledged application filled with all kinds of buttons and so on.

What catches your eye right after the design? The fact that Google Desktop Search begins to index everything on the computer without asking! And, most interestingly, it is not possible to choose indexing paths using Google Desktop Search itself. You will have to download a separate program (TweakGDS), which slightly expands the Google Desktop settings, including specifying the locations to index. By the time you figure all this out, however, it will already have indexed a standard hard drive, so this setting is more likely to be needed when working with large amounts of data, which is very important in corporate networks (the Enterprise version). Nor is it certain that downloading TweakGDS will solve your problems: to work, it needs the Microsoft .NET Framework and Microsoft Scripting Runtime. Yes... the installation, as well as access to the settings, could have been made simpler, although the developers can probably be understood: why write something new when there is a ready-made search engine? Port it to the local computer and let the user “enjoy” it, and a famous name will make another masterpiece out of “this.” But let's end this lyrical digression and move on to the search.

As for analyzing search queries and delivering results, everything here is identical to Google on the Internet: the same system for displaying results and the same standard set of logical operations for search queries. In general, Google Desktop Search, like the previous program, is intended exclusively for finding files; it does not, of course, have an internal viewer for them. The number of file formats supported by Google Desktop Search is quite sufficient, and it is also nice that it searches visited Internet pages, taking data from the cache. Search and indexing speeds are quite acceptable, at least for home use: Google Desktop Search coped with an impressive 20 gigabytes of texts in 8 hours and 17 minutes. But spending several days processing information from the corporate network of a large enterprise is not something any system administrator would enjoy. On the plus side, the size of the created index (4.5 GB) was on the same level as that of another search engine tested in this review, SearchInform.

The big advantage (or disadvantage - you decide) of Google Desktop Search is that it supports plugins, which can change a lot for the better. Another matter is that connecting and configuring plugins complicates the installation of the search engine so much that you begin to wonder whether all this is necessary when you can install a normal, full-fledged program in which everything is already present. After all, to use each feature you will have to install a new plugin. Even for the program to work fully with archives, a separate module is needed. The appealing part is that all these additional modules are free. However, leaving the desktop version aside, competent configuration of GDS Enterprise may be beyond you: it is not for nothing that Google's specialists offer to set up their own software on your network for only $10,000.

If you do go through the setup and installation procedure (or pay $10,000 to a rapid response team from Google), you will find that the complexity of the installation is more than compensated for by very flexible settings for corporate networks. An important aspect of using Google Desktop on a corporate network is the use of group policies, which makes it possible to configure settings for each user.

To summarize, the most reasonable use for this program is a home or work computer. After all, on a regular computer you just need to install the program - it will do the rest itself (it won’t even ask you anything).

However, Google Desktop Search Enterprise will be acceptable in cases where there is an urgent need for flexible network policy configuration for the search engine, where the ability to process search queries comes second in importance, and the time (or money) spent on setting up the program comes first.

Official site:
Distribution size including TweakGDS: 1.2 MB

Copernic Desktop Search

The program interface makes an extremely positive impression: everything is done in accordance with generally accepted standards, with nothing superfluous; in a word, a pleasant design. A beginner will find the Copernic Desktop Search interface very easy to understand. It is somewhat puzzling, though, that the designers clearly built the interface on the assumption that the program would run under the standard Windows XP theme. With the classic theme, the program no longer looks so pretty. But this is more a matter of taste.

At the first launch, the program prompts you to create indexes for search. It seemed somewhat unusual that after the folders for indexing were selected, the program did not offer any button such as “Start indexing”, and indexing did not start automatically; only later did it become clear that Copernic tries to index while the computer is idle. You'll have to dig a little deeper into the program's options to configure everything properly. There are quite wide options for customizing automatic index creation: a built-in scheduler and the ability to index while the computer is idle, in the background, with low priority. Indexing was not too fast, 10 hours 51 minutes, which is slower than the other search engines (except for Bloodhound, though Copernic is still several times faster than that product from iSleuthHound Technologies).

Now about the structure of the index. In general, there is nothing special about it. File types can be selected both broadly and in detail. That is, you first choose what you want to index: Documents, Images, Videos, Music. On another tab of the options window you can select specific file types by extension. Additionally, you can configure the index so that, for example, pictures smaller than 16x16 or sound files shorter than 10 seconds are not indexed. Besides files in folders, Copernic can work with emails and contacts from the Microsoft Outlook and Microsoft Outlook Express address books, and it can index Favorites and History from Internet Explorer.
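The index rules described above (selection by extension, a minimum picture size, a minimum sound length) amount to a simple filter applied to each candidate file. The Python sketch below shows one way such a filter could look; it is not Copernic's code, and the `FileInfo` record, the `INDEXED_EXTENSIONS` set and the reading of "smaller than 16x16" as "either side under 16 pixels" are assumptions for illustration.

```python
import os
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FileInfo:
    """Hypothetical description of a candidate file."""
    name: str
    image_size: Optional[Tuple[int, int]] = None  # (width, height) in pixels
    duration_sec: Optional[float] = None          # length of a sound file


# Assumed example selection; in the program the user picks the extensions.
INDEXED_EXTENSIONS = {".doc", ".txt", ".jpg", ".mp3"}


def should_index(f, min_image=(16, 16), min_audio_sec=10):
    """Return True if the file passes all index rules."""
    ext = os.path.splitext(f.name)[1].lower()
    if ext not in INDEXED_EXTENSIONS:
        return False
    if f.image_size is not None:
        width, height = f.image_size
        if width < min_image[0] or height < min_image[1]:
            return False  # picture too small, e.g. an icon
    if f.duration_sec is not None and f.duration_sec < min_audio_sec:
        return False      # sound too short, e.g. a system beep
    return True
```

For example, `should_index(FileInfo("icon.jpg", image_size=(12, 12)))` is rejected by the size rule, while an ordinary photo or a full-length song passes.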

As for the search capabilities, they are very weak here. Tests even revealed that the program does not search the contents of Russian-language documents in txt and html formats, finding them only by title. The only thing the program provides to improve search efficiency is a standard set of logical operations, and even this feature was discovered experimentally, since it is not documented. By the way, not everything is in order with the program’s help either: it is available only via the Internet, which, you will agree, is very inconvenient, and even there the reference information is sparse. Apparently, the developers decided that a simple interface makes proper help unnecessary. Returning to search capabilities, it should be noted that, despite the weak query analysis, the program offers an interesting search scheme: the user selects a file type (images, videos, music, etc.), enters a search query and sets attributes specific to the selected file type. For sound files these can be values from mp3 tags (artist, album, date, etc.); for images you can, for example, select their size (by resolution); in general, each type has its own settings. After a search for a specific file type, the program displays a very informative list in the results window, and if your query also matches files of other types, you can open them by clicking the corresponding link.

Separately, it is worth mentioning the results display window. Below the list of found files, the contents of the selected file are displayed (a similar layout is often used in mail clients). True, documents can be viewed only in their native format; there is no plain text display mode, which is not always convenient, since opening a document this way takes more time. On the other hand, since Copernic can search for images and music, these multimedia files can be previewed as well.

The basic principles of the program have been described; now let's see what Copernic Desktop Search can offer for working with a network... In principle, you can look for a very long time, but you will hardly see anything. In other words, this program was never intended for network use. Copernic Desktop Search is exclusively a home search engine.

Obviously, the only (and most logical) application of this program is a home computer. Here it will fully cope with simple user queries of one or two words and find the necessary information, while the division of search by file type, support for multimedia files, low-priority background indexing and a pleasant interface only help the program gain the trust of inexperienced users.

Official site
Distribution size: 2.6 MB

ISYS Desktop

A very powerful program. In terms of features, it comes close to SearchInform, the next search system on the list. At the same time, the installation file is more than 40 MB in size! It’s hard to say what could have been squeezed into it, because SearchInform, with similar functionality, takes up 15 MB.

The installation process here is also not very pleasant, or rather, what precedes it. Even before downloading the program you will be asked to register; there is no way around it. Next, the interface. It is very nicely made, with nothing unnecessary catching the eye; however, these are the impressions of a person who is already somewhat used to it. It will not be easy for a beginner to figure out what is located where, where to click and where, finally, to search. It is highly recommended to read the help before starting work: you will save a lot of nerves and time. On top of everything else, the program has no Russian language support at all. Not good. The windows here are not overloaded with controls, but the price for this is multiple modules and additional windows. For example, search queries are entered in one program, while indexes are managed in another. Search queries are also entered in separate pop-up windows. It’s hard to say which is better, an overloaded interface or ubiquitous multiple windows; it is rather a matter of taste.

When it comes to creating indexes, the program offers features that simplify setting options for a new index. These include several ready-made templates for creating indexes: “My Documents”, “Mail”, “Mail and Documents”, “Specific Folder”, “Folder with a selection of file types”, etc. Such templates simplify index creation at the first stage. The utility for working with indexes does not have a very good interface and is somewhat intimidating in its complexity (a very subjective assessment, to be honest); on closer inspection, however, it provides many useful options, and on the whole using it does not cause much difficulty. ISYS Desktop can index data from various sources and provides many flexible settings for such indexing. Additional indexing features include support for SQL, FTP, TRIM Context, WORLDOX 2002 and scripts. If you selected the “Folder with a selection of file types” item when creating an index, you can choose the file types to index manually (by extension). The number of supported file types is simply huge, but you cannot add your own type (extension) to the existing list. There is also an indexing scheduler. Creating an index over 20 gigabytes of information took ISYS Desktop 6 hours and 13 minutes, a good time, and the resulting index was 7.9 GB in size.

The search capabilities of this program are quite good. What is used in ISYS is much more powerful than conventional support for logical operations. Among the advanced search capabilities, the program offers the use of synonyms and a sorting filter (by path, name and date of file creation). The set of logical operators is somewhat wider than the standard set. In addition to logical operations, the program allows you to work with many other operators, which, in principle, can replace some types of search; for example, search with parsing can be completely replaced by using special operators. I was very surprised that the program does not have a search using morphology. This is a serious omission, since search efficiency is greatly improved when using morphological analysis. In addition, there is no list of significant words, but there is an extensive list of insignificant words. Search functions such as “approximate search” and “heuristic analysis” are also announced.

ISYS offers a choice of several types of search query, though they differ only visually: different kinds of windows are used for entering queries, but in fact none of them allows the use of technologies other than those listed above.

The search results are very informative and are displayed as a list of documents sorted by relevance. A preview of the selected document is displayed below. Unlike Copernic Desktop Search, preview here is available only in the form of plain text; it was not possible to display documents in their native format, be it Word, Html or PDF, although this, in principle, is not too critical. The program allows you to divide found documents into groups according to certain criteria (by default they are divided by relevance). You can also view already found documents by selecting individual folders (this is convenient when the result produces a very large number of documents).

Using the program on a corporate network is also very justified, since it provides good opportunities for organizing network search. The search system is based on the creation of a public index that contains indexed data from publicly available online resources.

In fact, the program from ISYS is worthy of attention, or at least of a closer acquaintance. It is a mature project with a huge number of functions (not all of them are needed always or by everyone, of course, but still). Whether the program will see improvements in search query processing is unknown, but at the moment it can be recommended for almost universal use. And given that it is still too heavy for home systems, the main place for its installation is corporate networks.

Official site:
Distribution size: 40 MB

SearchInform

It’s probably not worth starting right away with a description of the SearchInform interface. We should first describe the installation process, or rather one of its details: you cannot install the program without an Internet connection. The fact is that before the first launch, the program requires user registration (free) and sends all entered data to the server. Apparently, the developers had to take such measures in the fight against piracy, but this did not have a positive effect on the ease of installation.

The program interface follows all the generally accepted rules, but at first glance it is somewhat cumbersome. On first use the program seems too complicated, and it is sometimes hard to remember which menu or tab holds the desired option; with longer use, however, the interface no longer seems so terribly complex. The main thing is to read the help first.

Having understood the interface a little, you can start creating an index. The process itself is very simple, and the indexing speed, even judged by eye, is significantly higher than that of all the other search engines in the review. The test figures show that SearchInform is almost twice as fast as dtSearch and ISYS in indexing speed! The program indexed the provided 20 gigabytes of data in a record 3 hours 17 minutes. The size of the created index also turned out to be the smallest, 4.4 GB, which is 100 megabytes less than Google Desktop Search.
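The speed comparison above can be checked with a few lines of arithmetic. The Python sketch below uses only the indexing times reported in this review for the 20-gigabyte test collection; the function names are made up for the example.

```python
DATA_GB = 20  # size of the test collection

# (hours, minutes) of indexing time, as reported in this review
TIMES = {
    "SearchInform": (3, 17),
    "dtSearch": (6, 3),
    "ISYS Desktop": (6, 13),
    "Google Desktop Search": (8, 17),
    "Copernic Desktop Search": (10, 51),
}


def minutes(name):
    """Indexing time converted to minutes."""
    h, m = TIMES[name]
    return 60 * h + m


def throughput_gb_per_hour(name):
    """Average indexing throughput over the whole run."""
    return DATA_GB / (minutes(name) / 60)


def time_ratio_vs_searchinform(name):
    """How many times longer than SearchInform a program took."""
    return minutes(name) / minutes("SearchInform")


for name in TIMES:
    print(f"{name}: {throughput_gb_per_hour(name):.1f} GB/h, "
          f"{time_ratio_vs_searchinform(name):.2f}x SearchInform's time")
```

With these figures dtSearch and ISYS take about 1.84 and 1.89 times as long as SearchInform, which averages roughly 6.1 GB per hour.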

In addition to regular files and folders, the program also supports indexing emails and connecting and indexing databases (!) and other external sources (DMS, CRM); a dictionary for morphological search can be specified right at indexing time, and all file attributes can be indexed. After creating the index, when you try your first test search, you may be somewhat confused: “there are two types of search here, but which one do I need?” As mentioned earlier, the main thing is to read the help, and then everything becomes clear. The program can indeed carry out two types of search: phrase search and search for documents similar in content to the query text.

A description of all the main functions for analyzing a search query was given above, so here we only list the search capabilities this program provides. Let's start with phrase search: morphological search, of course, citation search, logical operations, search with word parsing (matching at the beginning of a word, at the end, in the middle, or a complete match), mixed citation search (all words from the query must be present in the document, but not necessarily in the entered order), search with error correction, use of synonyms, “almost citation search” (the entered phrase is searched for as a citation, but other words may occur between the entered words), etc. Some of these options have their own specific settings. In addition, a dictionary of insignificant words can be used, and the program already ships with a ready-made list of such words; you can also use a dictionary of priority words for searching (which, of course, you will have to fill in yourself).

Here, in principle, we briefly reviewed all the main features of phrase search.
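Two of the phrase-search modes listed above are easy to confuse, so here is a small Python sketch of how “mixed citation search” and “almost citation search” could be expressed. It is an illustration under assumptions, not SearchInform's implementation: the function names and the `max_gap` parameter (how many other words may stand between neighbouring query words) are hypothetical.

```python
import re


def words(text):
    """Split text into lowercase words (hypothetical tokenizer)."""
    return re.findall(r"\w+", text.lower())


def mixed_citation(text, query):
    """All query words occur in the text, in any order."""
    return set(words(query)) <= set(words(text))


def almost_citation(text, query, max_gap=2):
    """Query words occur in the entered order, with at most max_gap
    other words between neighbouring query words."""
    t, q = words(text), words(query)
    if not q:
        return True

    def extend(pos, k):
        # pos is the index in t where query word k-1 was matched
        if k == len(q):
            return True
        last = min(len(t), pos + 2 + max_gap)
        return any(t[j] == q[k] and extend(j, k + 1)
                   for j in range(pos + 1, last))

    return any(t[i] == q[0] and extend(i, 1) for i in range(len(t)))


doc = "fast desktop search for the corporate network"
```

In `doc`, the query "network search" matches as a mixed citation but not as an almost-citation (wrong order), while "search network" matches as an almost-citation only when `max_gap` is at least 3.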

Let's move on to the distinguishing feature of this program: the search for similar documents. The developers insist that this is by no means a simple text search but precisely a “search for similar ones”; this is how it is described everywhere, but call it what you like, what matters is the substance. A quick search on the Internet reveals that so-called "similar search" is a new development in the field of text analysis. The system finds texts that are similar in semantic content. Most pleasingly, test queries showed that theory matches practice quite well! The program really does find documents with similar content and displays them in a list sorted by percentage of similarity.
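The review does not say how SearchInform measures similarity, so the following Python sketch substitutes a textbook technique, cosine similarity between bag-of-words vectors, purely to show what "a list sorted by percentage of similarity" can look like. The document names and texts are invented for the example.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words vector: word -> number of occurrences."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity of two word-count vectors, from 0.0 to 1.0."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_similar(query_text, docs):
    """Return (name, similarity in percent) pairs, most similar first."""
    q = vectorize(query_text)
    scored = [(name, round(100 * cosine(q, vectorize(text)), 1))
              for name, text in docs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented sample documents
docs = {
    "a.txt": "indexing speed of desktop search engines",
    "b.txt": "desktop search engines compared",
    "c.txt": "recipe for apple pie",
}
ranked = rank_similar("indexing speed of desktop search engines", docs)
```

A word-count model like this only captures shared vocabulary; a system that claims to match semantic content would need something richer, such as morphology or synonym handling.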

Next, let's look at what SearchInform (in particular, its corporate version, SearchInform Corporate) offers for working on a corporate network. There are two applications: a server side and a user side. The server part processes the specified indexes on its own, and users can search them according to the access rights assigned to them. Users can be configured automatically from Windows accounts (in professional terms, SearchInform uses NTFS Windows authentication) or manually (users will have to be added one by one). Each user can be allowed or denied access to particular indexes, and users can also be combined into groups. On the whole, SearchInform’s network settings are ahead of Google in flexibility and of Bloodhound Server in convenience and simplicity.

Official site:
Distribution size: 14.7 MB

Comparison of indexing speeds

Search system	Indexing time	Index size
Bloodhound Prof Deluxe 4.5	38 hours 46 minutes	19 GB
Isys Desktop 7.0	6 hours 13 minutes	7.9 GB
DtSearch 7.0	6 hours 3 minutes	8.6 GB
Google Desktop Search Enterprise	8 hours 17 minutes	4.5 GB
Copernic Desktop Search *	10 hours 51 minutes	7 GB
SearchInform 1.5.02	3 hours 17 minutes	4.4 GB

* Most of the .html and .txt documents containing Russian text, although indexed, could not be found except by their names.

Summary

All programs are worthy of attention.

Based on the tests and a careful examination of each program in the review, certain conclusions can be drawn. Google Desktop Search and Copernic Desktop Search are quite suitable for the inexperienced user as home information search systems. They do a good job with simple queries, do not overload the user with settings and, moreover, are completely free. Google's attempt to enter the corporate search market is not yet very convincing: to work properly the program has to be equipped with additional modules, and it is far from easy to set up. So, true to the self-explanatory “Desktop Search” in their names, Copernic and Google keep the niche of “desktop” search engines.

True, the more powerful solutions, dtSearch, ISYS and SearchInform, are no slouches either and offer users their own “desktop” versions, though at a (reasonable) price, unlike the free software from Google and Copernic. Of course, you have to pay for power, speed and functionality. But the main focus of the developers of dtSearch, ISYS and SearchInform is, of course, the corporate sector. Networking, functionality, and indexing and search speed are what distinguish these products from their “competitors.” Based on the test results, a favorite emerged: SearchInform. The program can search for similar documents, has the fastest indexing and search speeds, and has a good set of functions.

1. DuckDuckGo

What is this

DuckDuckGo is a fairly well-known open source search engine. Servers are located in the USA. In addition to its own robot, the search engine uses results from other sources: Yahoo, Bing, Wikipedia.

Why it is better

DuckDuckGo positions itself as a search engine that provides maximum privacy and confidentiality. The system does not collect any data about the user, does not store logs (there is no search history), and keeps its use of cookies to a minimum.

DuckDuckGo does not collect or share personal information from users. This is our privacy policy.
Gabriel Weinberg, founder of DuckDuckGo

Why do you need this

All large search engines try to personalize search results based on data about the person in front of the monitor. This phenomenon is called the “filter bubble”: the user sees only those results that are consistent with his preferences, or that the system deems to be so.

DuckDuckGo forms an objective picture that does not depend on your past behavior on the Internet, and spares you the Google and Yandex thematic advertising based on your queries. With DuckDuckGo it is easy to search for information in foreign languages, while Google and Yandex by default give preference to Russian-language sites, even if the query is entered in another language.

2. not Evil

What is this

not Evil is a system that searches the anonymous Tor network. To use it, you need to enter this network, for example by launching a specialized browser.

not Evil is not the only search engine of its kind. There is also LOOK (the default search in the Tor browser, accessible from the regular Internet), TORCH (one of the oldest search engines on the Tor network) and others. We settled on not Evil because of its clear allusion to Google (just look at the start page).

Why it is better

It searches where Google, Yandex and other search engines have no access at all.

Why do you need this

The Tor network contains many resources that cannot be found on the law-abiding Internet. And their number will grow as government control over the content of the Internet tightens. Tor is a kind of network within the Internet with its own social networks, torrent trackers, media, trading platforms, blogs, libraries and so on.

3. YaCy

What is this

YaCy is a decentralized search engine that works on the principle of P2P networks. Each computer on which the main software module is installed scans the Internet independently, that is, it is analogous to a search robot. The results obtained are collected into a common database that is used by all YaCy participants.

Why it is better

It’s difficult to say whether this is better or worse, since YaCy is a completely different approach to organizing search. The absence of a single server and owner company makes the results completely independent of anyone's preferences. The autonomy of each node eliminates censorship. YaCy is capable of searching the deep web and non-indexed public networks.

Why do you need this

If you are a supporter of open source software and a free Internet, not subject to the influence of government agencies and large corporations, then YaCy is your choice. It can also be used to organize a search within a corporate or other autonomous network. And even though YaCy is not very useful in everyday life, it is a worthy alternative to Google in terms of the search process.

4. Pipl

What is this

Pipl is a system designed to search for information about a specific person.

Why it is better

The authors of Pipl claim that their specialized algorithms search more efficiently than “regular” search engines. In particular, priority is given to social network profiles, comments, member lists, and various databases that publish information about people, such as databases of court decisions. Pipl's leadership in this area is confirmed by assessments from Lifehacker.com, TechCrunch and other publications.

Why do you need this

If you need to find information about a person living in the US, Pipl will be much more effective than Google. The databases of Russian courts are apparently inaccessible to the search engine, so it does not cope as well with Russian citizens.

5. FindSounds

What is this

FindSounds is another specialized search engine. It searches open sources for various sounds: home, nature, cars, people, and so on. The service does not support queries in Russian, but there is an impressive list of Russian-language tags that you can use to search.

Why it is better

The output contains only sounds and nothing extra. In the settings you can set the desired format and sound quality. All sounds found are available for download. There is a search by pattern.

Why do you need this

If you need to quickly find the sound of a musket shot, the drumming of a sapsucker woodpecker, or the cry of Homer Simpson, this service is for you. And that is only what we picked from the available Russian-language queries. In English the range is even wider.

Seriously, a specialized service requires a specialized audience. But what if it comes in handy for you too?

6. Wolfram|Alpha

What is this

Wolfram|Alpha is a computational search engine. Instead of links to articles containing keywords, it provides a ready-made answer to the user’s request. For example, if you enter “compare the populations of New York and San Francisco” into the search form in English, Wolfram|Alpha will immediately display tables and graphs with the comparison.

Why it is better

This service is better than others for finding facts and calculating data. Wolfram|Alpha collects and organizes knowledge available on the Web from a variety of fields, including science, culture and entertainment. If this database contains a ready-made answer to a search query, the system displays it; if not, it calculates and displays the result. In either case, the user sees only what is needed and nothing superfluous.

Why do you need this

If you are a student, analyst, journalist, or researcher, for example, you can use Wolfram|Alpha to find and calculate data related to your work. The service does not understand all requests, but it is constantly developing and becoming smarter.

7. Dogpile

What is this

The Dogpile metasearch engine displays a combined list of results from Google, Yahoo and other popular systems.

Why it is better

First, Dogpile displays fewer ads. Second, the service uses a special algorithm to find and show the best results from different search engines. According to the Dogpile developers, their system generates the most complete search results on the entire Internet.

Why do you need this

If you can't find information on Google or another standard search engine, look for it in several search engines at once using Dogpile.

8. BoardReader

What is this

BoardReader is a system for text search in forums, question and answer services and other communities.

Why it is better

The service allows you to narrow your search field to social platforms. Thanks to special filters, you can quickly find posts and comments that match your criteria: language, publication date and site name.

Why do you need this

BoardReader can be useful for PR specialists and other media specialists who are interested in the opinion of the masses on certain issues.

Finally

The life of alternative search engines is often fleeting. Lifehacker asked the former general director of the Ukrainian branch of Yandex, Sergei Petrenko, about the long-term prospects of such projects.

Sergey Petrenko

Former General Director of Yandex.Ukraine.

As for the fate of alternative search engines, it is simple: to be very niche projects with a small audience, therefore without clear commercial prospects or, conversely, with complete clarity of their absence.

If you look at the examples in the article, you can see that such search engines either specialize in a narrow but popular niche, which, perhaps, has not yet grown enough to be noticeable on the radars of Google or Yandex, or they are testing an original hypothesis in ranking, which is not yet applicable in regular search.

For example, if a search on Tor suddenly turns out to be in demand, that is, results from there are needed by at least a percentage of Google’s audience, then, of course, ordinary search engines will begin to solve the problem of how to find them and show them to the user. If the behavior of the audience shows that for a significant proportion of users in a significant number of queries, results given without taking into account factors depending on the user seem more relevant, then Yandex or Google will begin to produce such results.

“Be better” in the context of this article does not mean “be better at everything.” Yes, in many aspects our heroes are far from Yandex (even far from Bing). But each of these services gives the user something that the search industry giants cannot offer. Surely you also know similar projects. Share with us - let's discuss.

PROFESSIONAL INFORMATION SEARCH ON THE INTERNET

Internet search is an important part of working on the Internet. Hardly anyone knows the exact number of web resources on the modern Internet; in any case, the count is in the billions. To use the information you need at a given moment, whether for work or entertainment, you first have to find it in this constantly growing ocean of resources.

For an Internet search to be successful, two conditions must be met: queries must be well formulated, and they must be asked in appropriate places. In other words, the user needs, on the one hand, the ability to translate search interests into the language of a search query and, on the other hand, a good knowledge of search engines and other available search tools, with their advantages and disadvantages, in order to choose the most suitable tool in each specific case.

Currently, there is no single resource that satisfies all Internet search requirements. Therefore, if you take your search seriously, you inevitably have to use different tools, using each in the most appropriate case.

Basic Internet search tools can be divided into the following main groups:

Search engines;

Web directories;

Reference resources;

Local programs for searching the Internet.

The most popular search tools are search engines (Search Engines). The top three leaders on a global scale are quite stable: Google, Yahoo! and Bing. In many countries, local search engines optimized for working with local content are added to this list. With their help, you can theoretically find any specific word on the pages of many millions of sites. From the user's point of view, the main disadvantage of search engines is the inevitable presence of information noise in the results. This is the customary name for results that end up in the search list for one reason or another but do not correspond to the request.

Despite many differences, all Internet search engines operate on similar principles and, from a technical point of view, consist of similar subsystems. The first structural part of a search engine is the special programs used to automatically find and then index web pages. Such programs are usually called spiders, or bots. They look at the code of web pages, find the links located on them, and thereby discover new web pages. There is also an alternative way to get a site into the index: many search engines let resource owners add a site to the database themselves. In either case, the web pages are then downloaded, analyzed and indexed: structural elements are extracted, keywords are found, and connections with other sites and web pages are determined. Other operations are also performed, and the result is the search engine's index database. This database is the second main element of any search engine. Currently, there is no single, absolutely complete index database containing information about all Internet content. Because different search engines use different programs to find web pages and build their indexes using different algorithms, their index databases can vary significantly. Some sites are indexed by several search engines, but there is always a certain percentage of resources included in the database of only one search engine. The presence of such an original, non-overlapping part of the index in each search engine leads to an important practical conclusion: if you use only one search engine, even the largest one, you will inevitably miss a certain percentage of useful links.
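The spider-and-index cycle described above can be sketched roughly as follows. This is a simplified illustration, not the design of any real search engine: the pages are a hypothetical in-memory dictionary instead of being downloaded, and the index is a plain word-to-pages mapping with no handling of scripts, styles or markup structure.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndTextParser(HTMLParser):
    """Collects link targets and words from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the address of the page.
                self.links.append(urljoin(self.base_url, href))

    def handle_data(self, data):
        self.words.extend(re.findall(r"\w+", data.lower()))


def crawl(pages, start_url):
    """Follow links breadth-first through `pages` (url -> html) and index the words."""
    index = defaultdict(set)  # word -> set of urls that contain it
    queue, seen = [start_url], {start_url}
    while queue:
        url = queue.pop(0)
        parser = LinkAndTextParser(url)
        parser.feed(pages.get(url, ""))
        for word in parser.words:
            index[word].add(url)
        for link in parser.links:
            # Newly discovered pages are queued once; unknown addresses are skipped.
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


# Hypothetical three-page site.
pages = {
    "http://site.example/": '<a href="/poetry.html">Poetry</a> <a href="/pc.html">Computers</a>',
    "http://site.example/poetry.html": "<p>Classic poetry archive</p>",
    "http://site.example/pc.html": '<p>Computer hardware reviews</p> <a href="/">home</a>',
}
index = crawl(pages, "http://site.example/")
print(sorted(index["poetry"]))
```

Starting from a single address, the sketch discovers the other two pages purely through links, which is how a spider extends its coverage.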

The next part of the Internet search engine is the actual search and sorting programs. These programs solve two main tasks: first, they find pages and files in the database that match the incoming request, and then sort the resulting data array in accordance with various criteria. Success in achieving search goals largely depends on the effectiveness of their work.
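The two tasks, matching and sorting, can be illustrated with a toy example. The scoring rule used here (total number of occurrences of the query words) is an assumption chosen for brevity; real engines rank by many more criteria, and the documents are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def search(documents, query):
    """Return (url, score) pairs for pages containing every query word, best first."""
    terms = tokenize(query)
    results = []
    for url, text in documents.items():
        counts = Counter(tokenize(text))
        # Task 1, matching: the page must contain every query word.
        if terms and all(counts[t] > 0 for t in terms):
            # Scoring: total number of occurrences of the query words.
            results.append((url, sum(counts[t] for t in terms)))
    # Task 2, sorting: highest score first, url as a tie-breaker.
    results.sort(key=lambda item: (-item[1], item[0]))
    return results


# Hypothetical indexed documents.
documents = {
    "a.example/1": "web search engines index web pages",
    "b.example/2": "a directory of web sites",
    "c.example/3": "cooking recipes",
}
print(search(documents, "web"))
```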

The last element of an Internet search engine is the user interface. In addition to the usual requirements of aesthetics and convenience that apply to any website, search engine interfaces face another important requirement: they must offer various tools for composing and refining queries, as well as for sorting and filtering results. The advantages of search engines are excellent coverage of sources, relatively fast updating of database content and a good selection of additional functions.

The main tool for working with search engines is a query.

Special applications installed on the local computer are also used for Internet searches. These can be simple programs or quite complex systems for data search and analysis. The most common are search plugins for browsers, browser panels designed to work with a specific search service, and metasearch packages with capabilities for analyzing results.

Web directories are resources in which sites are divided into thematic categories. While the user works with search engines only through queries, a catalog makes it possible to browse thematic sections in their entirety. The second fundamental difference between directories and automatic search engines is that, as a rule, people are directly involved in filling them, viewing resources and assigning each site to one category or another. Web directories are usually divided into universal and thematic. Universal ones try to cover as many topics as possible. You can find anything in them, from websites about poetry to computer resources. In other words, their search breadth is maximal. Thematic directories specialize in a specific topic, providing maximum search depth by reducing the breadth of resource coverage.

The advantage of catalogs is the comparatively high quality of their resources, since each site in them is viewed and selected by a person. Thematic grouping keeps sites on similar topics together. This mode of operation is good for discovering sites that are new to you on a topic of interest: it is more accurate than using a search engine. Web catalogs are recommended for a first acquaintance with any subject area, as well as for vague queries, since you can “wander” through the sections of the catalog and determine more precisely what exactly you need.

The disadvantages of web directories are well known. First of all, the database is replenished slowly, since including a site in the catalog requires human participation. In terms of how quickly new material appears, a web directory is no rival to search engines. In addition, web directories are significantly inferior to search engines in database size.

When talking about Internet search, we cannot ignore a number of terms that are closely related to this area and are often used to describe and evaluate search engines, for example the breadth and depth of an Internet search. A broad search is one that captures as many sources of information as possible. In this case, at least a mention of one or another site suitable for the request is considered sufficient. Search depth refers to the detail of indexing and subsequent searching of each specific resource. For example, many search engines approach indexing different sites differently. Large and popular sites are indexed to the maximum extent; robots try not to miss a single page of such a resource. At the same time, on other sites, only the title page and a couple of content pages may be indexed. These circumstances naturally affect subsequent searches. Deep search works on the principle “it is better to include unnecessary information in the results than to miss any data relevant to the search topic.”

Quite often you can come across such concepts as global and local Internet search. Local Internet searches take into account the user's geographic location and give preference to results that are somehow related to a specific country or locality. During a global search, this information is not taken into account, and the search is carried out in all available resources.

When querying Internet search engines, various search modes are used. Typical modes found on most search engines include simple and advanced search. A simple search allows you to specify only one search criterion per request. Advanced search makes it possible to build a query from several conditions, linking them with logical operators.
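The difference between the two modes can be shown on a small in-memory index. This is a sketch under stated assumptions: the parameter names `all_of`, `any_of` and `none_of` are hypothetical stand-ins for the AND, OR and NOT operators, and the documents are invented.

```python
def build_index(documents):
    """Map every word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in documents.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)
    return index


def simple_search(index, word):
    """Simple mode: a single search criterion."""
    return index.get(word.lower(), set())


def advanced_search(index, all_of=(), any_of=(), none_of=()):
    """Advanced mode: several conditions linked by AND, OR and NOT."""
    result = set().union(*index.values())  # start from every known document
    for word in all_of:                    # AND
        result &= simple_search(index, word)
    if any_of:                             # OR
        result &= set().union(*(simple_search(index, w) for w in any_of))
    for word in none_of:                   # NOT
        result -= simple_search(index, word)
    return result


# Hypothetical documents.
documents = {
    1: "forum search engine",
    2: "image search engine",
    3: "forum software download",
    4: "search plugin for browsers",
}
index = build_index(documents)
print(simple_search(index, "forum"))
print(advanced_search(index, all_of=["search", "engine"], none_of=["image"]))
```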

Various filters are used to refine search queries. Filters are auxiliary means of composing a query that do not relate to the content of the query conditions but limit the search results by some formal feature. For example, when using a file type filter, the user gives the system no information about the topic of the request, but simply limits the results to the file type specified in the query condition.
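A file type filter of this kind might look as follows when applied to a list of result addresses. This is a minimal sketch that judges the type only by the extension in the URL path, which is an assumption; the URLs are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def filter_by_filetype(urls, extension):
    """Keep only URLs whose path ends with the given extension (case-insensitive)."""
    wanted = "." + extension.lower().lstrip(".")
    return [
        url for url in urls
        # The query string is ignored; only the path part is inspected.
        if PurePosixPath(urlparse(url).path).suffix.lower() == wanted
    ]


# Hypothetical search results.
results = [
    "http://a.example/report.pdf",
    "http://a.example/index.html",
    "http://b.example/docs/Manual.PDF?lang=en",
]
print(filter_by_filetype(results, "pdf"))
```

The filter says nothing about the topic of the documents; it only narrows an existing result list by a formal feature.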

For most users, universal search engines are the main, and often the only means of Internet search. They offer good coverage of sources, as well as a set of tools sufficient to solve basic search problems.

The market for universal search engines is quite large. We tried to analyze the most famous search engines, and presented the results in Table 1.

When choosing a universal search engine, the quality of the resources found with its help plays an important role. You can determine the preferred search engine for specific tasks using the “marker method”. Its essence is that first a certain thematic search query is compiled, after which a group of people - experts in this field - is surveyed to identify the best, in their opinion, Internet resources on the chosen topic. Based on the survey data, a list of marker sites is generated that are guaranteed to be relevant to the request and contain high-quality information. The request is then sent to the tested search engines. The logic of the assessment is simple: the higher the marker sites are located in the search results, the better a particular resource is suitable for searching for information on a test topic.
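The scoring step of the marker method can be sketched as below. The text only says that higher positions of marker sites are better; averaging the positions and assigning a penalty rank to marker sites that are missing from the results are assumptions made for this illustration, and all site names are hypothetical.

```python
def marker_score(results, marker_sites, not_found_rank=None):
    """Average 1-based position of the marker sites in a result list (lower is better)."""
    if not_found_rank is None:
        # Assumption: a missing marker site counts as one place below the last result.
        not_found_rank = len(results) + 1
    positions = []
    for site in marker_sites:
        rank = results.index(site) + 1 if site in results else not_found_rank
        positions.append(rank)
    return sum(positions) / len(positions)


# Marker sites named by the surveyed experts (hypothetical).
markers = ["m1.example", "m2.example", "m3.example"]

# Result lists returned by two tested engines for the same query (hypothetical).
engine_results = {
    "engine_a": ["m1.example", "x.example", "m2.example", "y.example", "m3.example"],
    "engine_b": ["x.example", "y.example", "m1.example", "z.example", "w.example"],
}

scores = {name: marker_score(res, markers) for name, res in engine_results.items()}
best = min(scores, key=scores.get)
print(scores, best)
```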

Checking a nickname across dozens of services at a time, counting reposts on Facebook and visualizing Twitter account connections.

Social media content analysis is a hot topic among startups. More and more services for searching posts and people appear every year. But many of them either disappear quickly, are available in an unfinished state, or are expensive to use.

This article covers a few of them that let you quickly, and free of charge, get genuinely useful or simply interesting information.

1. Search for profiles

The Snitch search system lets you look for a person’s profile across four dozen services, including the websites of the world’s leading universities and the US criminal database.

Unfortunately, some of the sites offered as checkboxes no longer work; Google Uncle Sam, for example, closed five years ago. Despite this and other flaws, Snitch is a useful service that saves considerable time when searching for information about a person.

If a blank screen is displayed for some service instead of blocks with search results, follow the Open a new window link to view them.

2. Search for hashtags

It's very easy to use. You need to enter the desired hashtag into the search form and in a second a list of recent posts tagged with it in six social networks will appear.

3. Analysis of recent tweets

The service allows you to get a list of the last hundred tweets containing the search word, hashtag or account name. It also provides some analytical information about the people who made these tweets and the time they were created.

Let's say you want to identify which user caused an unusually high number of clicks to an article from Twitter. We look at the latest 100 tweets and see which of the people who mentioned the search term have the most followers.
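Once such a list of tweets has been exported, the same check is easy to repeat by hand. The sketch below assumes a hypothetical record format with `user` and `followers` fields, newest tweet first; it does not use any real service's API.

```python
def top_authors(tweets, limit=3):
    """Return the authors of the latest 100 tweets, ordered by follower count."""
    followers = {}
    for tweet in tweets[:100]:  # assumption: the list is ordered newest first
        user = tweet["user"]
        followers[user] = max(followers.get(user, 0), tweet["followers"])
    ranked = sorted(followers.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Hypothetical exported tweets mentioning the article.
tweets = [
    {"user": "alice", "followers": 120, "text": "Interesting article"},
    {"user": "bob", "followers": 15400, "text": "Must read"},
    {"user": "carol", "followers": 980, "text": "Agree with this"},
    {"user": "bob", "followers": 15400, "text": "One more thought on it"},
]
print(top_authors(tweets, limit=2))
```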

Paid subscribers can analyze a larger number of tweets.

4. Twitter account analysis

On Mentionapp you can enter the account name and get information about it (who retweets most often, what hashtags it uses, etc.) in the form of a connection diagram.

5. Search for tweets on the map

If you click on any place on the map, you can read the latest tweets made nearby:

6. Number of mentions on social networks

Sharedcount helps to evaluate the popularity of an article or site on social networks. You enter the URL and in a couple of seconds get statistics of mentions on Facebook, Google+, Pinterest, LinkedIn and StumbleUpon.

7. Search the forums

Boardreader is a search engine for forums and message boards:

An assessment of the scale of the disaster showed that there are almost 4 responses on this portal per resident of Russia.

8. Checking a login across social networks

We go to knowem.com and enter the person’s nickname. In response, we receive a list of the services on which it is registered.

9. Determine a person’s name by email

If you are still looking for people by typing their email addresses into Google, you should abandon this method. After all, there is pipl.com. You enter an email address (or nickname) and get a list of profiles on social networks.

The information is not always accurate or complete, but the service is extremely useful.

That's all. It would also have been worth mentioning Socialmention (unfinished analysis of reviews), Yomapic (search for photos from VK and Instagram on a map) and yandex.