From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <72c521ff-f60a-4bef-971d-9bf70ea8df29@ya.ru>
Date: Tue, 9 Jul 2024 14:03:38 +0400
From: Vitaly Zdanevich
To: gentoo-user@lists.gentoo.org, Michael
Reply-to: gentoo-user@lists.gentoo.org
List-Id: Gentoo Linux mail
Subject: Re: [gentoo-user] Emails are no indexable
References: <2945403.e9J7NaK4W3@lenovo>
In-Reply-To: <2945403.e9J7NaK4W3@lenovo>
User-Agent: Mozilla Thunderbird
MIME-Version: 1.0
Content-Language: en-US
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

In https://marc.info/robots.txt I see

User-agent: *
Disallow: /

It looks bad.

On 7/8/24 21:41, Michael wrote:
> On Monday, 8 July 2024 16:07:59 BST Vitaly Zdanevich wrote:
>> Hi, I tried to google in "exact match" a few sentences from this email
>> list - and nothing found. For example this mirroring
>> https://marc.info/?l=gentoo-user&m=171984189706185&w=2 - and nothing in
>> Google. Is it excluded from search? This is bad, because people google
>> problems that are already solved in these emails :(
>>
>> Is it a known issue?
> It depends what Google or other web crawlers have decided to list in their
> search results and what to exclude.  You should be able to run a search within
> the content of a single website, but only if it has been ranked/listed by
> Google, e.g. say you want to find posts about blurred fonts, in a gentoo M/L,
> but not Debian, contained in marc.info.  You can search Google like so:
>
> "blurred fonts" +gentoo -debian site:marc.info
>
> This should include Gentoo M/L posts about blurred fonts found in the
> marc.info domain, but exclude any Debian related search results.
> Unfortunately, this relies on Google first ranking them in their results to
> allow you to see them.
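
For anyone who wants to check what those two robots.txt lines mean for a crawler, here is a minimal sketch using Python's standard urllib.robotparser. It assumes the two rules quoted in the message are the whole file, and it parses them from a string instead of fetching marc.info. The helper name is_crawlable and the user-agent strings are only illustrative.

```python
from urllib.robotparser import RobotFileParser

# The two rules quoted from https://marc.info/robots.txt in the message.
# Assumption: these are the only rules in the file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# The archived post mentioned in the thread.
ARCHIVE_URL = "https://marc.info/?l=gentoo-user&m=171984189706185&w=2"


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper for illustration; it only wraps RobotFileParser.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative user-agent names; the "*" rule applies to any of them.
    for agent in ("Googlebot", "Bingbot"):
        print(agent, is_crawlable(ROBOTS_TXT, agent, ARCHIVE_URL))
```

Both agents are reported as not allowed, because "Disallow: /" under "User-agent: *" covers every path for every crawler that honours robots.txt. That would be consistent with exact-match searches finding nothing from marc.info, although the sketch only shows what the rules say, not what any search engine actually does.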