public inbox for gentoo-portage-dev@lists.gentoo.org
* [gentoo-portage-dev] [PATCH] repoman: warn when herd's email appears in <maintainer><email> section
From: Sergei Trofimovich @ 2014-07-23 14:15 UTC
  To: gentoo-portage-dev; +Cc: Sergei Trofimovich

Manuel Rüger noticed that the 'metadata.xml' of most Haskell packages
contains duplicate information:
    <herd>haskell</herd>
    <maintainer><email>haskell@gentoo.org</email></maintainer>

I've added a check against the email aliases listed in 'herds.xml'.

Now repoman warns about such redundancy:
  metadata.warning              1
   dev-haskell/text/metadata.xml: use <herd>haskell</herd> instead of maintainer 'haskell@gentoo.org'
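
For readers who want to try the idea outside repoman, here is a minimal standalone sketch of the same lookup. It is not the code from this patch: the two XML snippets are trimmed-down, hypothetical stand-ins for 'herds.xml' and a package's 'metadata.xml', and the helper names herd_email_map() and redundant_maintainers() are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-ins for herds.xml and metadata.xml.
HERDS_XML = """
<herds>
  <herd>
    <name>haskell</name>
    <email>haskell@gentoo.org</email>
    <maintainer><email>slyfox@gentoo.org</email></maintainer>
  </herd>
  <herd>
    <name>sci</name>
    <email>sci@gentoo.org</email>
  </herd>
</herds>
"""

METADATA_XML = """
<pkgmetadata>
  <herd>haskell</herd>
  <maintainer><email>haskell@gentoo.org</email></maintainer>
  <maintainer><email>slyfox@gentoo.org</email></maintainer>
</pkgmetadata>
"""


def herd_email_map(herds_root):
    """Map each herd's alias email to the herd name."""
    mapping = {}
    for herd in herds_root.findall('herd'):
        # 'email' only matches the herd's direct child, not maintainer/email
        name = herd.findtext('name')
        email = herd.findtext('email')
        if name and email:
            mapping[email.strip()] = name.strip()
    return mapping


def redundant_maintainers(metadata_root, email_to_herd):
    """Return (herd, email) pairs where a maintainer email is a herd alias."""
    found = []
    for maintainer in metadata_root.findall('maintainer'):
        email = (maintainer.findtext('email') or '').strip()
        herd = email_to_herd.get(email)
        if herd is not None:
            found.append((herd, email))
    return found


email_to_herd = herd_email_map(ET.fromstring(HERDS_XML))
warnings = redundant_maintainers(ET.fromstring(METADATA_XML), email_to_herd)
for herd, email in warnings:
    print("use <herd>%s</herd> instead of maintainer '%s'" % (herd, email))
```

Only the herd alias is reported; a developer's personal address that merely belongs to the herd is left alone, which matches the intent of the check.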

A quick scan [1] of the tree revealed many non-Haskell packages
with the same redundancy: samba, xemacs, sci, gpe, etc.

[1]: https://github.com/trofi/gentoo-qa/blob/master/check_herd.sh
Run in the root of the gentoo-x86 tree:
    gentoo-x86 $ ~/portage/gentoo-qa/check_herd.sh  | wc -l
    571
    # some examples:
    gentoo-x86 $ ~/portage/gentoo-qa/check_herd.sh
    ./app-admin/haskell-updater/metadata.xml:  <email>haskell@gentoo.org</email>
    ./app-editors/xemacs/metadata.xml:    <email>xemacs@gentoo.org</email>
    ./app-i18n/ibus-table-chinese/metadata.xml:             <email>cjk@gentoo.org</email>
    ./app-portage/fquery/metadata.xml:              <email>haskell@gentoo.org</email>
    ./app-text/glosung/metadata.xml:  <email>theology@gentoo.org</email>
    ... # a lot of haskell
    ./dev-lang/tcl/metadata.xml:    <email>tcltk@gentoo.org</email>
    ./dev-libs/Ice/metadata.xml:            <email>cpp@gentoo.org</email>
    ./dev-libs/cloog/metadata.xml:          <email>toolchain@gentoo.org</email>
    ./dev-libs/cvector/metadata.xml:    <email>sci@gentoo.org</email>
    ./dev-libs/iniparser/metadata.xml:      <email>samba@gentoo.org</email>
    ./dev-libs/isl/metadata.xml:    <email>toolchain@gentoo.org</email>
    ...
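
A tree-wide scan of this kind could also be sketched in Python. This is not a port of check_herd.sh (its contents are not reproduced here); it is an illustration that assumes the set of herd alias addresses is already known. HERD_EMAILS, scan_tree(), the second maintainer address, and the two-package temporary tree are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Assumption: herd alias addresses collected beforehand, e.g. from herds.xml.
HERD_EMAILS = {'haskell@gentoo.org', 'sci@gentoo.org'}

TEMPLATE = ("<pkgmetadata><maintainer><email>%s</email>"
            "</maintainer></pkgmetadata>\n")


def scan_tree(tree_root, herd_emails):
    """Return sorted (relative path, email) pairs for herd aliases used as maintainers."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(tree_root):
        if 'metadata.xml' not in filenames:
            continue
        path = os.path.join(dirpath, 'metadata.xml')
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            # malformed files are a separate problem; skip them here
            continue
        for maintainer in root.findall('maintainer'):
            email = (maintainer.findtext('email') or '').strip()
            if email in herd_emails:
                hits.append((os.path.relpath(path, tree_root), email))
    return sorted(hits)


# Build a tiny throwaway tree with one redundant and one clean package.
with tempfile.TemporaryDirectory() as tree:
    packages = [('dev-haskell/text', 'haskell@gentoo.org'),
                ('app-misc/example', 'someone@gentoo.org')]
    for pkg, email in packages:
        pkg_dir = os.path.join(tree, *pkg.split('/'))
        os.makedirs(pkg_dir)
        with open(os.path.join(pkg_dir, 'metadata.xml'), 'w') as f:
            f.write(TEMPLATE % email)
    hits = scan_tree(tree, HERD_EMAILS)

for path, email in hits:
    print("./%s: <email>%s</email>" % (path, email))
```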

Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
---
 bin/repoman              | 13 +++++++++++++
 pym/repoman/herdbase.py  | 21 +++++++++++++++++----
 pym/repoman/utilities.py | 19 +++++++++++++++++++
 3 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/bin/repoman b/bin/repoman
index c36ace1..74a4d44 100755
--- a/bin/repoman
+++ b/bin/repoman
@@ -1762,6 +1762,19 @@ for x in effective_scanlist:
 				fails["metadata.bad"].append("%s/metadata.xml: %s" % (x, e))
 				del e
 
+			# check if 'metadata.xml' contains redundant
+			#   '<maintainer><email>some-herd@gentoo.org</email></maintainer>'
+			# email address. Instead it should contain
+			#   '<herd>some-herd</herd>'
+			if herd_base is not None:
+				for m_email in utilities.get_maintainer_emails_from_metadata(_metadata_xml):
+					herd_name = herd_base.herd_by_herd_email(m_email)
+					if herd_name is not None:
+						stats["metadata.warning"] += 1
+						fails["metadata.warning"].append("%s/metadata.xml:"
+							" use <herd>%s</herd> instead of maintainer '%s'" %
+							(x, herd_name, m_email))
+
 		# Only carry out if in package directory or check forced
 		if xmllint_capable and not metadata_bad:
 			# xmlint can produce garbage output even on success, so only dump
diff --git a/pym/repoman/herdbase.py b/pym/repoman/herdbase.py
index c5b88ff..ae7c4e4 100644
--- a/pym/repoman/herdbase.py
+++ b/pym/repoman/herdbase.py
@@ -34,9 +34,10 @@ def _make_email(nick_name):
 
 
 class HerdBase(object):
-	def __init__(self, herd_to_emails, all_emails):
+	def __init__(self, herd_to_emails, all_emails, herd_email_to_herd):
 		self.herd_to_emails = herd_to_emails
 		self.all_emails = all_emails
+		self.herd_email_to_herd = herd_email_to_herd
 
 	def known_herd(self, herd_name):
 		return herd_name in self.herd_to_emails
@@ -47,6 +48,9 @@ class HerdBase(object):
 	def maintainer_in_herd(self, nick_name, herd_name):
 		return _make_email(nick_name) in self.herd_to_emails[herd_name]
 
+	def herd_by_herd_email(self, email):
+		return self.herd_email_to_herd.get(email)
+
 class _HerdsTreeBuilder(xml.etree.ElementTree.TreeBuilder):
 	"""
 	Implements doctype() as required to avoid deprecation warnings with
@@ -57,6 +61,7 @@ class _HerdsTreeBuilder(xml.etree.ElementTree.TreeBuilder):
 
 def make_herd_base(filename):
 	herd_to_emails = dict()
+	herd_email_to_herd = dict()
 	all_emails = set()
 
 	try:
@@ -82,6 +87,11 @@ def make_herd_base(filename):
 		herd_name = _herd_name.text.strip()
 		del _herd_name
 
+		_herd_email = h.find('email')
+		# guard against a herd that carries no <email> alias
+		if _herd_email is not None and _herd_email.text:
+			herd_email_to_herd[_herd_email.text.strip()] = herd_name
+
 		maintainers = h.findall('maintainer')
 		herd_emails = set()
 		for m in maintainers:
@@ -95,7 +105,7 @@ def make_herd_base(filename):
 
 		herd_to_emails[herd_name] = herd_emails
 
-	return HerdBase(herd_to_emails, all_emails)
+	return HerdBase(herd_to_emails, all_emails, herd_email_to_herd)
 
 
 if __name__ == '__main__':
@@ -104,12 +114,15 @@ if __name__ == '__main__':
 	assert(h.known_herd('sound'))
 	assert(not h.known_herd('media-sound'))
 
-	assert(h.known_maintainer('sping'))
-	assert(h.known_maintainer('sping@gentoo.org'))
+	assert(h.known_maintainer('slyfox'))
+	assert(h.known_maintainer('slyfox@gentoo.org'))
 	assert(not h.known_maintainer('portage'))
 
 	assert(h.maintainer_in_herd('zmedico@gentoo.org', 'tools-portage'))
 	assert(not h.maintainer_in_herd('pva@gentoo.org', 'tools-portage'))
 
+	assert(h.herd_by_herd_email('haskell@gentoo.org') == 'haskell')
+	assert(h.herd_by_herd_email('slyfox@gentoo.org') is None)
+
 	import pprint
 	pprint.pprint(h.herd_to_emails)
diff --git a/pym/repoman/utilities.py b/pym/repoman/utilities.py
index 415825e..2815529 100644
--- a/pym/repoman/utilities.py
+++ b/pym/repoman/utilities.py
@@ -17,6 +17,7 @@ __all__ = [
 	"get_commit_message_with_editor",
 	"get_commit_message_with_stdin",
 	"get_committer_name",
+	"get_maintainer_emails_from_metadata",
 	"have_ebuild_dir",
 	"have_profile_dir",
 	"parse_metadata_use",
@@ -215,6 +216,24 @@ def parse_metadata_use(xml_tree):
 
 	return uselist
 
+def get_maintainer_emails_from_metadata(xml_tree):
+	"""
+	Returns a list of explicit emails of maintainers
+	"""
+	m_emails = list()
+
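+	# findall() returns an empty list (never None) when nothing matches
+	maintainers = xml_tree.findall('maintainer')
+	for m in maintainers:
+		_m_email = m.find('email')
+		# skip <maintainer> entries without a usable <email>
+		if _m_email is None or _m_email.text is None:
+			continue
+		m_emails.append(_m_email.text.strip())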
+
+	return m_emails
+
+
 class UnknownHerdsError(ValueError):
 	def __init__(self, herd_names):
 		_plural = len(herd_names) != 1
-- 
1.8.5.5




* [gentoo-portage-dev] [PATCH] repoman: warn when herd's email appears in <maintainer><email> section
From: Sergei Trofimovich @ 2014-07-24 22:11 UTC
  To: gentoo-portage-dev; +Cc: Sergei Trofimovich


* Re: [gentoo-portage-dev] [PATCH] repoman: warn when herd's email appears in <maintainer><email> section
From: Alexander Berntsen @ 2014-09-03 12:08 UTC
  To: gentoo-portage-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 25/07/14 00:11, Sergei Trofimovich wrote:
> -	assert(h.known_maintainer('sping'))
> -	assert(h.known_maintainer('sping@gentoo.org'))
> +	assert(h.known_maintainer('slyfox'))
> +	assert(h.known_maintainer('slyfox@gentoo.org'))
>  	assert(not h.known_maintainer('portage'))
>  
>  	assert(h.maintainer_in_herd('zmedico@gentoo.org', 'tools-portage'))
>  	assert(not h.maintainer_in_herd('pva@gentoo.org', 'tools-portage'))
>  
> +	assert(h.herd_by_herd_email('haskell@gentoo.org') == 'haskell')
> +	assert(h.herd_by_herd_email('slyfox@gentoo.org') is None)
> +
This really doesn't belong in the same patch.

If you split those, the patch looks good to me. Michał said it was fine too.

However, I can't merge this until Tom gives me an ACK with regard to the repoman refactor. You also need a QA member (like Tom) to ACK it.

I'll merge your reworked patches as soon as you have the appropriate ACKs.
- -- 
Alexander
bernalex@gentoo.org
https://secure.plaimi.net/~alexander
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iF4EAREIAAYFAlQHBMwACgkQRtClrXBQc7UJJwD/YUCoq5RuZrx31z+D/Q75P6uE
97fpP01Efj3tI7cz4hwA/R/19w9aQ5ke5pnatRbAWSeQfU5UxooUXwtr6sN713XW
=5KzR
-----END PGP SIGNATURE-----

