Mailing lists: civilized collection and cleaning

This post would fit better in the "Spam and antispam" hub, but on the whole it still belongs under "I'm doing PR".

Recently I had the task of collecting the email addresses of schools in order to invite them to take part in inter-school competitions. Habr has had a few posts about collecting email addresses from commercial and non-commercial sites. I have not seen a really effective and civilized automatic or semi-automatic option, although the need arises from time to time. 99% of the tools are email generators, "corrupt databases", or buggy desktop programs that I have no desire to use.

A lyrical digression. The line between spam and anti-spam is a very fine one, so I will give a definition: a civilized (or sensitive) way is one that respects the people who will receive the newsletter. Compiling the list manually is the best option, but the pace of modern life forces you to automate everything you can, because the task of any mailing is to inform a large number of people in minimal time.

A couple of weeks ago I was approached by the developer of the service spider-post.com, which solves this problem. He asked me to test the resource and post a review on Habr. I agreed because the topic interests me and I have not found similar tools. I'd love to see links to other services in the comments.

All your questions will be passed on to the authors of the service. The answers will appear in the comments.

I saw a compromise solution to the problem of collecting email addresses like this:
  • select sites that are relevant to your business according to certain criteria;
  • take the email addresses from them;
  • check the addresses for validity;
  • manually purge the addresses whose form raises doubts;
  • do a test mailing with an offer to subscribe on an ongoing basis.
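The "take the addresses" and "check for validity" steps can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a description of how any particular service works: the regular expression is deliberately simple and does not cover the full address grammar, the validity check is purely syntactic, and the names `extract_emails` and `is_plausible` are hypothetical.

```python
import re

# Deliberately simple pattern (an assumption for illustration),
# not a complete parser for every legal address form.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(page_text):
    """Return unique, lower-cased addresses in order of first appearance."""
    seen = set()
    result = []
    for match in EMAIL_RE.findall(page_text):
        address = match.lower()
        if address not in seen:
            seen.add(address)
            result.append(address)
    return result


def is_plausible(address):
    """Cheap syntactic check; it does not prove that the mailbox exists."""
    if address.count("@") != 1 or len(address) > 254:
        return False
    local, domain = address.split("@")
    if not local or len(local) > 64:
        return False
    if local.startswith(".") or local.endswith(".") or ".." in local:
        return False
    labels = domain.split(".")
    if len(labels) < 2:
        return False
    if any(not lb or lb.startswith("-") or lb.endswith("-") for lb in labels):
        return False
    tld = labels[-1]
    # Assumption: a top-level label is alphabetic (two or more characters)
    # or a punycode label starting with "xn--".
    return (tld.isalpha() and len(tld) >= 2) or tld.startswith("xn--")
```

A syntactic filter like this only removes obviously malformed strings. Whether the mailbox is alive and whether its owner wants the newsletter still has to be settled by the manual review and the test mailing.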


Spider Post uses a similar approach.

  • you select a region and specify a list of key phrases that describe your business;
  • the service finds sites matching the specified parameters in search engines and collects the email lists. The finished list is available within a few hours. According to the developer, the service analyzes what comes after the "@" and checks whether the website and the email address are alive, how old the resource is, and whether it is commercial;
  • the lists can be downloaded and cleaned manually (experience shows that cleaning is effective).
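To illustrate what "analyzing what comes after the @" might involve, here is a hedged sketch. The service's actual checks are not described in detail, so everything below is an assumption: `FREE_MAIL_DOMAINS` is a small hypothetical sample, and `site_host` and `classify_address` are illustrative names. The sketch only compares the domain of an address with the host of the page where it was found. Checking whether the site or mailbox is alive would require network access and is left out.

```python
from urllib.parse import urlsplit

# Hypothetical sample of public mail providers; a real list would be longer.
FREE_MAIL_DOMAINS = {"gmail.com", "mail.ru", "yandex.ru"}


def site_host(url):
    """Host name of the page URL without a leading 'www.' ('' if absent)."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def classify_address(address, source_url):
    """Label an address by comparing its domain with the source site."""
    domain = address.rsplit("@", 1)[-1].lower()
    host = site_host(source_url)
    if domain in FREE_MAIL_DOMAINS:
        return "free-mail"
    if host and (
        domain == host
        or host.endswith("." + domain)
        or domain.endswith("." + host)
    ):
        return "same-site"
    return "other"
```

An address labelled "same-site" is more likely to belong to the organization that owns the page, while "other" entries are natural candidates for a closer look during manual cleaning.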


I tested several topics that I know something about (secondary schools, phosphors, perimeter protection). The results and conclusions are below.

A screenshot of the completed order


Detailed information on the results:


My impression is mixed.
  1. You need to use narrowly specialized queries to minimize the chance of garbage getting into the final list.
  2. In all cases the result came to tens of thousands of addresses, including a large number of "strange-looking" ones.
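With tens of thousands of addresses, manual cleaning goes faster if the "strange-looking" entries are flagged first. The sketch below shows one way to do that. The rules and thresholds are my own assumptions, not something the service is known to apply, and `strangeness_reasons` is a hypothetical name. One rule targets a common scraping artifact: asset file names such as `logo@2x.png`, which look like addresses to a naive pattern.

```python
import re

# File suffixes that suggest a scraped asset name rather than a mailbox.
ASSET_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp", ".css", ".js")


def strangeness_reasons(address):
    """Return a list of reasons why an address deserves manual review."""
    local, _, domain = address.lower().rpartition("@")
    reasons = []
    if domain.endswith(ASSET_SUFFIXES):
        reasons.append("looks like a file name")
    if len(local) > 30:  # threshold is an arbitrary assumption
        reasons.append("very long local part")
    if re.search(r"\d{5,}", local):
        reasons.append("long digit run")
    if not re.search(r"[a-z]", local):
        reasons.append("no letters in local part")
    return reasons
```

An empty list means no rule fired, not that the address is good. The flags only sort the review queue, and the final decision stays with a person.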


Some advice to the developers. Here is what I would like to see added to the functionality:
  1. the ability to use the search engines' query language, which would narrow the set of sites selected;
  2. collection of additional information: besides the site address, the title and description from a directory or search engine;
  3. an option to specify exactly which sites to collect addresses from (for example, for my task with schools these are federal resources).

These simple additions would reduce the percentage of garbage in the lists and make further cleaning easier.
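The first and third suggestions can be sketched as follows. This is only an illustration under assumptions: it relies on the `site:` operator that the major search engines support, the domain `edu.example` is a placeholder, and `build_query` and `host_allowed` are hypothetical names, not part of any service's API.

```python
from urllib.parse import urlsplit


def build_query(phrase, site=None):
    """Quote the phrase and optionally restrict the search to one site."""
    query = '"{}"'.format(phrase)
    if site:
        query += " site:{}".format(site)
    return query


def host_allowed(url, allowed_suffixes):
    """True if the URL's host is one of the allowed domains or a subdomain."""
    host = (urlsplit(url).hostname or "").lower()
    return any(
        host == suffix or host.endswith("." + suffix)
        for suffix in allowed_suffixes
    )
```

For example, `build_query("secondary school", "edu.example")` produces `"secondary school" site:edu.example`, and `host_allowed` can then drop any collected page whose host falls outside the chosen domains.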
Article based on information from habrahabr.ru
