public inbox for gentoo-commits@lists.gentoo.org
* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-09-07 20:21 Mart Raudsepp
  0 siblings, 0 replies; 36+ messages in thread
From: Mart Raudsepp @ 2016-09-07 20:21 UTC
  To: gentoo-commits

commit:     724bb757e8b08382dcbdd460cbef533b91e6338f
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Wed Sep  7 20:17:51 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Wed Sep  7 20:17:51 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=724bb757

Don't double-quote debug output for full atoms from %r usage

 backend/lib/models.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/backend/lib/models.py b/backend/lib/models.py
index 5088e3e..8e47d56 100644
--- a/backend/lib/models.py
+++ b/backend/lib/models.py
@@ -16,7 +16,7 @@ class Package(db.Model):
     category = db.relationship('Category', backref=db.backref('packages', lazy='dynamic'))
 
     def __repr__(self):
-        return "<Package %r/%r>" % (self.category.name, self.name)
+        return "<Package '%s/%s'>" % (self.category.name, self.name)
 
 class PackageVersion(db.Model):
     id = db.Column(db.Integer, primary_key=True)
@@ -25,4 +25,4 @@ class PackageVersion(db.Model):
     package = db.relationship('Package', backref=db.backref('versions', lazy='dynamic'))
 
     def __repr__(self):
-        return "<PackageVersion %s/%r-%r>" % (self.package.category.name, self.package.name, self.version)
+        return "<PackageVersion '%s/%s-%s'>" % (self.package.category.name, self.package.name, self.version)
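The effect of this change can be illustrated with plain string formatting. This is a minimal sketch; the category and package names are made-up stand-ins for `self.category.name` and `self.name`, not values from the real database.

```python
# Hypothetical values standing in for self.category.name and self.name.
category_name = "dev-lang"
package_name = "python"

# %r formats each argument with repr(), so every string gets its own quotes.
old_repr = "<Package %r/%r>" % (category_name, package_name)

# %s formats with str(); the quotes around the full atom come from the template.
new_repr = "<Package '%s/%s'>" % (category_name, package_name)

print(old_repr)  # <Package 'dev-lang'/'python'>
print(new_repr)  # <Package 'dev-lang/python'>
```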


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-09-24  7:02 Mart Raudsepp
From: Mart Raudsepp @ 2016-09-24  7:02 UTC
  To: gentoo-commits

commit:     6113941adc9693cac0a4aa12cdac82f75c7921bd
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Sat Sep 24 07:01:30 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Sat Sep 24 07:01:30 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=6113941a

Add a full_name property to package and remove some debug spam on sync

 backend/lib/models.py | 4 ++++
 backend/lib/sync.py   | 1 -
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/backend/lib/models.py b/backend/lib/models.py
index 8e47d56..8f7637d 100644
--- a/backend/lib/models.py
+++ b/backend/lib/models.py
@@ -15,6 +15,10 @@ class Package(db.Model):
     category_id = db.Column(db.Integer, db.ForeignKey('category.id'), nullable=False)
     category = db.relationship('Category', backref=db.backref('packages', lazy='dynamic'))
 
+    @property
+    def full_name(self):
+        return "%s/%s" % (self.category.name, self.name)
+
     def __repr__(self):
         return "<Package '%s/%s'>" % (self.category.name, self.name)
 

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 6dcb6b9..a6aef23 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -26,7 +26,6 @@ def sync_categories():
 def sync_packages():
     for category in Category.query.all():
         existing_packages = category.packages.all()
-        print("Existing packages in DB for category %s: %s" % (category.name, existing_packages,))
         data = http_session.get(url_base + "categories/" + category.name + ".json")
         if not data:
             print("No JSON data for category %s" % category.name) # FIXME: Better handling; mark category as inactive/gone?


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-11-10 15:50 Mart Raudsepp
From: Mart Raudsepp @ 2016-11-10 15:50 UTC
  To: gentoo-commits

commit:     c11a833333cc5a9e9b0ce885caddef5a3b593fc4
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Thu Nov 10 15:50:27 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Thu Nov 10 15:50:27 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=c11a8333

Normalize subproject inherit-members to True or False during parsing

 backend/lib/sync.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 7139119..291d701 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -43,8 +43,8 @@ def sync_projects():
                 if 'ref' in elem.attrib:
                     if 'subprojects' not in proj:
                         proj['subprojects'] = []
-                    # subprojects will be a list of (subproject_email, inherit-members) tuples where inherit-members is None, 0 or 1 (if dtd is followed). TODO: Might change if sync code will want it differently
-                    proj['subprojects'].append((elem.attrib['ref'], elem.attrib['inherit-members'] if 'inherit-members' in elem.attrib else None))
+                    # subprojects will be a list of (subproject_email, inherit-members) tuples where inherit-members is True or False. TODO: Might change if sync code will want it differently
+                    proj['subprojects'].append((elem.attrib['ref'], True if ('inherit-members' in elem.attrib and elem.attrib['inherit-members'] == '1') else False))
                 else:
                     print("Invalid <subproject> tag inside project %s - required 'ref' attribute missing" % proj['email'] if 'email' in proj else "<unknown>")
             else:
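The normalization in this commit can be tried in isolation. The sketch below is not the project's code: the helper name `inherits_members` and the example.org addresses are assumptions for illustration. It applies the same rule as the diff: only the literal attribute value '1' counts as True, while '0' or a missing attribute becomes False.

```python
import xml.etree.ElementTree as ET


def inherits_members(elem):
    """Hypothetical helper: True only when inherit-members is exactly '1'."""
    return elem.attrib.get('inherit-members') == '1'


# Made-up fragment shaped like a <project> entry in projects.xml.
doc = ET.fromstring(
    '<project>'
    '<subproject ref="a@example.org" inherit-members="1"/>'
    '<subproject ref="b@example.org" inherit-members="0"/>'
    '<subproject ref="c@example.org"/>'
    '</project>'
)

# Build the (subproject_email, inherit_members) tuples described in the comment.
subprojects = [(e.attrib['ref'], inherits_members(e)) for e in doc.iter('subproject')]
print(subprojects)
```

Using `attrib.get()` folds the "attribute present" check and the comparison into one expression, which is equivalent to the `True if (...) else False` form in the diff.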


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-11-11  1:22 Mart Raudsepp
From: Mart Raudsepp @ 2016-11-11  1:22 UTC
  To: gentoo-commits

commit:     5972da09a9d9faaa7dbf45929a6c09a0d07d0691
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Fri Nov 11 01:22:04 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Fri Nov 11 01:22:04 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=5972da09

Add parsed project members to the result dict

 backend/lib/sync.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 291d701..fbc653a 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -38,6 +38,9 @@ def sync_projects():
                         member[member_tag] = member_elem.text
                 if 'email' in member:
                     # TODO: Sync the members (it's valid as email is given) - maybe at the end, after we have synced the project data, so we can add him to the project directly
+                    if 'members' not in proj:
+                        proj['members'] = []
+                    proj['members'].append(member)
                     pass
             elif tag == 'subproject':
                 if 'ref' in elem.attrib:


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-04  4:56 Mart Raudsepp
From: Mart Raudsepp @ 2016-12-04  4:56 UTC
  To: gentoo-commits

commit:     20275e6f354929fe3d702fb9b296f828704eb5a1
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Sun Dec  4 04:48:07 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Sun Dec  4 04:48:07 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=20275e6f

models: Use sqlalchemy Unicode columns instead of String

 backend/lib/models.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/backend/lib/models.py b/backend/lib/models.py
index 8f7637d..57f3e64 100644
--- a/backend/lib/models.py
+++ b/backend/lib/models.py
@@ -3,15 +3,15 @@ from .. import db
 
 class Category(db.Model):
     id = db.Column(db.Integer, primary_key=True)
-    name = db.Column(db.String(30), unique=True, nullable=False)
-    description = db.Column(db.String(500))
+    name = db.Column(db.Unicode(30), unique=True, nullable=False)
+    description = db.Column(db.Unicode(500))
 
     def __repr__(self):
         return "<Category %r>" % self.name
 
 class Package(db.Model):
     id = db.Column(db.Integer, primary_key=True)
-    name = db.Column(db.String(128), nullable=False)
+    name = db.Column(db.Unicode(128), nullable=False)
     category_id = db.Column(db.Integer, db.ForeignKey('category.id'), nullable=False)
     category = db.relationship('Category', backref=db.backref('packages', lazy='dynamic'))
 
@@ -24,7 +24,7 @@ class Package(db.Model):
 
 class PackageVersion(db.Model):
     id = db.Column(db.Integer, primary_key=True)
-    version = db.Column(db.String(128), nullable=False)
+    version = db.Column(db.Unicode(128), nullable=False)
     package_id = db.Column(db.Integer, db.ForeignKey('package.id'), nullable=False)
     package = db.relationship('Package', backref=db.backref('versions', lazy='dynamic'))
 


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-04  5:26 Mart Raudsepp
From: Mart Raudsepp @ 2016-12-04  5:26 UTC
  To: gentoo-commits

commit:     df4ddb601efbef157147fcfd6057afd01636acab
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Sun Dec  4 05:26:10 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Sun Dec  4 05:26:10 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=df4ddb60

sync: Initial projects syncing to DB without members

 backend/lib/sync.py | 35 ++++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index fbc653a..6ed8e01 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -2,19 +2,19 @@ import xml.etree.ElementTree as ET
 from flask import json
 import requests
 from .. import app, db
-from .models import Category, Package, PackageVersion
+from .models import Category, Maintainer, Package, PackageVersion
 
 proj_url = "https://api.gentoo.org/metastructure/projects.xml"
 pkg_url_base = "https://packages.gentoo.org/"
 http_session = requests.session()
 
-def sync_projects():
+def get_project_data():
     data = http_session.get(proj_url)
     if not data:
         print("Failed retrieving projects.xml")
         return
     root = ET.fromstring(data.text)
-    projects = []
+    projects = {}
     # Parsing is based on http://www.gentoo.org/dtd/projects.dtd as of 2016-11-10
     if root.tag.lower() != 'projects':
         print("Downloaded projects.xml root tag isn't 'projects'")
@@ -53,12 +53,33 @@ def sync_projects():
             else:
                 print("Skipping unknown <project> subtag <%s>" % tag)
         if 'email' in proj:
-            projects.append(proj)
+            projects[proj['email']] = proj
         else:
             print("Skipping incomplete project data due to lack of required email identifier: %s" % (proj,))
-    from pprint import pprint
-    print("Found the following projects and data:")
-    pprint(projects)
+    return projects
+
+def sync_projects():
+    projects = get_project_data()
+    existing_maintainers = {}
+    # TODO: Use UPSERT instead (on_conflict_do_update) if we can rely on postgresql:9.5
+    for maintainer in Maintainer.query.all():
+        existing_maintainers[maintainer.email] = maintainer
+    for email, data in projects.items():
+        if email in existing_maintainers:
+            print ("Updating project %s" % email)
+            existing_maintainers[email].is_project = True
+            if 'description' in data:
+                existing_maintainers[email].description = data['description']
+            if 'name' in data:
+                existing_maintainers[email].name = data['name']
+            if 'url' in data:
+                existing_maintainers[email].url = data['url']
+        else:
+            print ("Adding project %s" % email)
+            new_maintainer = Maintainer(email=data['email'], is_project=True, description=data['description'], name=data['name'], url=data['url'])
+            db.session.add(new_maintainer)
+    db.session.commit()
+
 
 def sync_categories():
     url = pkg_url_base + "categories.json"


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-04  5:26 Mart Raudsepp
From: Mart Raudsepp @ 2016-12-04  5:26 UTC
  To: gentoo-commits

commit:     a46c779bf33cf558d287f8bcf11a5e483046bb17
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Sun Dec  4 05:24:45 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Sun Dec  4 05:25:29 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=a46c779b

models: Add Maintainer model

As this is a new table, just re-doing "./manage.py init" should add it to db,
while keeping old data.

 backend/lib/models.py | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/backend/lib/models.py b/backend/lib/models.py
index 57f3e64..bc6cd20 100644
--- a/backend/lib/models.py
+++ b/backend/lib/models.py
@@ -30,3 +30,14 @@ class PackageVersion(db.Model):
 
     def __repr__(self):
         return "<PackageVersion '%s/%s-%s'>" % (self.package.category.name, self.package.name, self.version)
+
+class Maintainer(db.Model):
+    id = db.Column(db.Integer, primary_key=True)
+    email = db.Column(db.Unicode(50), nullable=False, unique=True)
+    is_project = db.Column(db.Boolean, nullable=False, server_default='f', default=False)
+    name = db.Column(db.Unicode(128))
+    url = db.Column(db.Unicode())
+    description = db.Column(db.Unicode(500))
+
+    def __repr__(self):
+        return "<Maintainer %s '%s'>" % ("project" if self.is_project else "individual", self.email)


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-04  6:27 Mart Raudsepp
From: Mart Raudsepp @ 2016-12-04  6:27 UTC
  To: gentoo-commits

commit:     d1965a898e3f92f94accb630d4daf68d156a0d0c
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Sun Dec  4 06:26:47 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Sun Dec  4 06:26:47 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=d1965a89

sync: Project members and subprojects syncing to DB

 backend/lib/sync.py | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 6ed8e01..57a7cb1 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -78,9 +78,33 @@ def sync_projects():
             print ("Adding project %s" % email)
             new_maintainer = Maintainer(email=data['email'], is_project=True, description=data['description'], name=data['name'], url=data['url'])
             db.session.add(new_maintainer)
+            existing_maintainers[email] = new_maintainer
+        members = []
+        if 'subprojects' in data:
+            for subproject_email, inherit_members in data['subprojects']:
+                # TODO: How should we handle inherit_members?
+                if subproject_email in existing_maintainers:
+                    members.append(existing_maintainers[subproject_email])
+                else:
+                    print("Creating new project entry for subproject: %s" % subproject_email)
+                    new_subproject = Maintainer(email=subproject_email, is_project=True)
+                    db.session.add(new_subproject)
+                    existing_maintainers[subproject_email] = new_subproject
+                    members.append(new_subproject)
+        if 'members' in data:
+            for member in data['members']:
+                if member['email'] in existing_maintainers:
+                    members.append(existing_maintainers[member['email']])
+                else:
+                    print("Adding individual    %s" % member['email'])
+                    new_maintainer = Maintainer(email=member['email'], is_project=False, name=member['name'] if 'name' in member else None)
+                    db.session.add(new_maintainer)
+                    existing_maintainers[member['email']] = new_maintainer
+                    members.append(new_maintainer)
+            # TODO: Include role information in the association?
+        existing_maintainers[email].members = members
     db.session.commit()
 
-
 def sync_categories():
     url = pkg_url_base + "categories.json"
     data = http_session.get(url)


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-04  6:27 Mart Raudsepp
From: Mart Raudsepp @ 2016-12-04  6:27 UTC
  To: gentoo-commits

commit:     a0e5f8b3559f243236d9dd1170a00d4405042631
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Sun Dec  4 06:24:39 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Sun Dec  4 06:24:39 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=a0e5f8b3

models: Add association table and ORM relationship between projects and members

 backend/lib/models.py | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/backend/lib/models.py b/backend/lib/models.py
index bc6cd20..f842a8a 100644
--- a/backend/lib/models.py
+++ b/backend/lib/models.py
@@ -31,6 +31,11 @@ class PackageVersion(db.Model):
     def __repr__(self):
         return "<PackageVersion '%s/%s-%s'>" % (self.package.category.name, self.package.name, self.version)
 
+maintainer_project_membership_rel_table = db.Table('maintainer_project_membership_rel',
+    db.Column('project_id', db.Integer, db.ForeignKey('maintainer.id')),
+    db.Column('maintainer_id', db.Integer, db.ForeignKey('maintainer.id')),
+)
+
 class Maintainer(db.Model):
     id = db.Column(db.Integer, primary_key=True)
     email = db.Column(db.Unicode(50), nullable=False, unique=True)
@@ -39,5 +44,12 @@ class Maintainer(db.Model):
     url = db.Column(db.Unicode())
     description = db.Column(db.Unicode(500))
 
+    members = db.relationship("Maintainer",
+        secondary=maintainer_project_membership_rel_table,
+        primaryjoin=id==maintainer_project_membership_rel_table.c.project_id,
+        secondaryjoin=id==maintainer_project_membership_rel_table.c.maintainer_id,
+        backref='projects')
+    # projects relationship backref ^^
+
     def __repr__(self):
         return "<Maintainer %s '%s'>" % ("project" if self.is_project else "individual", self.email)


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-04  7:44 Mart Raudsepp
From: Mart Raudsepp @ 2016-12-04  7:44 UTC
  To: gentoo-commits

commit:     080e857b7081db90f874c73fd271d8bd699195d6
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Sun Dec  4 07:43:13 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Sun Dec  4 07:43:13 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=080e857b

sync: Update individual maintainer names during projects sync for the time being

... as long as we don't have master data for this that we shouldn't overwrite.
Also remove a now-done TODO item and tweak a debug string I messed up pre-commit.

 backend/lib/sync.py | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 57a7cb1..4894315 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -37,7 +37,6 @@ def get_project_data():
                     if member_tag in ['email', 'name', 'role']:
                         member[member_tag] = member_elem.text
                 if 'email' in member:
-                    # TODO: Sync the members (it's valid as email is given) - maybe at the end, after we have synced the project data, so we can add him to the project directly
                     if 'members' not in proj:
                         proj['members'] = []
                     proj['members'].append(member)
@@ -94,9 +93,12 @@ def sync_projects():
         if 'members' in data:
             for member in data['members']:
                 if member['email'] in existing_maintainers:
+                    # TODO: Stop overwriting the name from master data, if/once we have a proper sync source for individual maintainers (Gentoo LDAP?)
+                    if 'name' in member:
+                        existing_maintainers[member['email']].name = member['name']
                     members.append(existing_maintainers[member['email']])
                 else:
-                    print("Adding individual    %s" % member['email'])
+                    print("Adding individual maintainer %s" % member['email'])
                     new_maintainer = Maintainer(email=member['email'], is_project=False, name=member['name'] if 'name' in member else None)
                     db.session.add(new_maintainer)
                     existing_maintainers[member['email']] = new_maintainer


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-04  8:04 Mart Raudsepp
From: Mart Raudsepp @ 2016-12-04  8:04 UTC
  To: gentoo-commits

commit:     dac532df96cb16626f4f1656b5aa2f82b8383c8d
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Sun Dec  4 07:59:39 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Sun Dec  4 07:59:39 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=dac532df

sync: Fix UTF-8 handling for projects.xml import

Need to feed response.content bytestring into ElementTree, not response.text.
With the latter ET seems to figure it's already decoded and goes all latin-1 on us.
From response.content bytestream it notices the UTF-8 encoding XML markup and does
things right.

Diagnosed-by: Doug Freed <dwfreed <AT> mtu.edu>

 backend/lib/sync.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 4894315..22419bf 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -13,7 +13,7 @@ def get_project_data():
     if not data:
         print("Failed retrieving projects.xml")
         return
-    root = ET.fromstring(data.text)
+    root = ET.fromstring(data.content)
     projects = {}
     # Parsing is based on http://www.gentoo.org/dtd/projects.dtd as of 2016-11-10
     if root.tag.lower() != 'projects':
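The difference between parsing bytes and parsing already-decoded text can be reproduced without any network access. This is a minimal sketch under the assumption that the text handed to the parser had been decoded as Latin-1 beforehand, which matches the symptom described in the commit message; the sample document and name are made up.

```python
import xml.etree.ElementTree as ET

# Made-up document with an explicit UTF-8 declaration and a non-ASCII name.
raw = '<?xml version="1.0" encoding="UTF-8"?><name>Jõgeva</name>'.encode("utf-8")

# Bytes input: the parser reads the encoding declaration and decodes correctly.
from_bytes = ET.fromstring(raw).text

# Text input that was decoded with the wrong codec (assumed here: Latin-1).
# The parser trusts the str as-is, so the mojibake survives parsing.
from_text = ET.fromstring(raw.decode("latin-1")).text

print(from_bytes)
print(from_text)
```

Handing the parser the raw bytes leaves charset detection to the XML layer, which has the declaration to go by.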


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-04  8:04 Mart Raudsepp
From: Mart Raudsepp @ 2016-12-04  8:04 UTC
  To: gentoo-commits

commit:     9664464413b7cd59f861eff01148454974e23030
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Sun Dec  4 08:02:10 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Sun Dec  4 08:02:10 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=96644644

sync: use requests response.json() directly instead of json.loads

This should ensure requests will handle UTF-8 fully correctly for us

Suggested-by: Doug Freed <dwfreed <AT> mtu.edu>

 backend/lib/sync.py | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 22419bf..2d6244c 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -1,5 +1,4 @@
 import xml.etree.ElementTree as ET
-from flask import json
 import requests
 from .. import app, db
 from .models import Category, Maintainer, Package, PackageVersion
@@ -111,7 +110,7 @@ def sync_categories():
     url = pkg_url_base + "categories.json"
     data = http_session.get(url)
     # TODO: Handle response error (if not data)
-    categories = json.loads(data.text)
+    categories = data.json()
     existing_categories = {}
     # TODO: Use UPSERT instead (on_conflict_do_update) if we can rely on postgresql:9.5
     for cat in Category.query.all():
@@ -131,7 +130,7 @@ def sync_packages():
         if not data:
             print("No JSON data for category %s" % category.name) # FIXME: Better handling; mark category as inactive/gone?
             continue
-        packages = json.loads(data.text)['packages']
+        packages = data.json()['packages']
         # TODO: Use UPSERT instead (on_conflict_do_update)
         existing_packages = {}
         for pkg in Package.query.all():
@@ -151,5 +150,5 @@ def sync_versions():
             print("No JSON data for package %s" % package.full_name) # FIXME: Handle better; e.g mark the package as removed if no pkgmove update
             continue
         from pprint import pprint
-        pprint(json.loads(data.text))
+        pprint(data.json())
         break


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-05 17:46 Mart Raudsepp
From: Mart Raudsepp @ 2016-12-05 17:46 UTC
  To: gentoo-commits

commit:     8c264ac120faebd8463f9b6fadde65f40df2ddb0
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Mon Dec  5 17:44:25 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Mon Dec  5 17:44:25 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=8c264ac1

sync: return empty dict on projects retrieval error, so the caller won't error

 backend/lib/sync.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 2d6244c..e53fa9b 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -8,16 +8,16 @@ pkg_url_base = "https://packages.gentoo.org/"
 http_session = requests.session()
 
 def get_project_data():
+    projects = {}
     data = http_session.get(proj_url)
     if not data:
         print("Failed retrieving projects.xml")
-        return
+        return projects
     root = ET.fromstring(data.content)
-    projects = {}
     # Parsing is based on http://www.gentoo.org/dtd/projects.dtd as of 2016-11-10
     if root.tag.lower() != 'projects':
         print("Downloaded projects.xml root tag isn't 'projects'")
-        return
+        return projects
     for proj_elem in root:
         if proj_elem.tag.lower() != 'project':
             print("Skipping unknown <projects> subtag <%s>" % proj_elem.tag)
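Why the empty dict matters for the caller can be shown with simplified stand-ins. The function names `fetch_projects` and `collect_emails` and the `fetch_ok` flag are hypothetical; they only mimic the shape of `get_project_data()` and a caller that iterates over `.items()`.

```python
def fetch_projects(fetch_ok):
    """Hypothetical stand-in for get_project_data(): always returns a dict."""
    projects = {}
    if not fetch_ok:
        print("Failed retrieving projects.xml")
        return projects  # a bare `return` would yield None here
    projects["proj@example.org"] = {"name": "Example project"}
    return projects


def collect_emails(fetch_ok):
    """Hypothetical caller: iterating works even when nothing was fetched."""
    emails = []
    for email, data in fetch_projects(fetch_ok).items():
        emails.append(email)
    return emails


print(collect_emails(True))
print(collect_emails(False))
```

With a bare `return`, the failing case would hand `None` to the caller and `.items()` would raise `AttributeError`; with the empty dict the loop simply runs zero times.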


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-07  0:34 Mart Raudsepp
From: Mart Raudsepp @ 2016-12-07  0:34 UTC
  To: gentoo-commits

commit:     f1a5e9bb01bb7fd802e7cf87b4e9dd675e910140
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Wed Dec  7 00:30:06 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Wed Dec  7 00:30:06 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=f1a5e9bb

models: Add description and last_sync_ts columns for Package

We will get the description from the packages.g.o per-package detailed JSON;
last_sync_ts will be used to record when that detailed JSON was
last pulled, so that we can rate-limit as needed.

If still using sqlite, can DROP TABLE package; and re-create with
./manage.py init
or add the columns manually
ALTER TABLE package ADD COLUMN description VARCHAR(500);
ALTER TABLE package ADD COLUMN last_sync_ts TIMESTAMP NOT NULL;

though NOT NULL without a database-side default (the default currently lives
only on sqlalchemy's side) might pose an issue; solving that is an easy
exercise for those who prefer it to recreating the table.

 backend/lib/models.py | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/backend/lib/models.py b/backend/lib/models.py
index f842a8a..e06dcf8 100644
--- a/backend/lib/models.py
+++ b/backend/lib/models.py
@@ -1,3 +1,4 @@
+from datetime import datetime
 from .. import db
 
 
@@ -14,6 +15,8 @@ class Package(db.Model):
     name = db.Column(db.Unicode(128), nullable=False)
     category_id = db.Column(db.Integer, db.ForeignKey('category.id'), nullable=False)
     category = db.relationship('Category', backref=db.backref('packages', lazy='dynamic'))
+    description = db.Column(db.Unicode(500))
+    last_sync_ts = db.Column(db.TIMESTAMP, nullable=False, default=datetime.utcfromtimestamp(0))
 
     @property
     def full_name(self):
@@ -31,6 +34,7 @@ class PackageVersion(db.Model):
     def __repr__(self):
         return "<PackageVersion '%s/%s-%s'>" % (self.package.category.name, self.package.name, self.version)
 
+
 maintainer_project_membership_rel_table = db.Table('maintainer_project_membership_rel',
     db.Column('project_id', db.Integer, db.ForeignKey('maintainer.id')),
     db.Column('maintainer_id', db.Integer, db.ForeignKey('maintainer.id')),
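The NOT NULL caveat from the commit message can be checked against an in-memory SQLite database. This is a sketch, not a migration for the real schema: the table is reduced to two columns, and the literal epoch default is an assumed stand-in for the Python-side `datetime.utcfromtimestamp(0)` default in the model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of the package table with one existing row.
conn.execute("CREATE TABLE package (id INTEGER PRIMARY KEY, name VARCHAR(128) NOT NULL)")
conn.execute("INSERT INTO package (name) VALUES ('grumpy')")

# SQLite refuses to add a NOT NULL column that has no non-NULL default.
try:
    conn.execute("ALTER TABLE package ADD COLUMN last_sync_ts TIMESTAMP NOT NULL")
    plain_not_null_ok = True
except sqlite3.OperationalError as exc:
    plain_not_null_ok = False
    print("rejected:", exc)

# A nullable column is fine; a NOT NULL column works once a default is given.
conn.execute("ALTER TABLE package ADD COLUMN description VARCHAR(500)")
conn.execute(
    "ALTER TABLE package ADD COLUMN last_sync_ts TIMESTAMP NOT NULL "
    "DEFAULT '1970-01-01 00:00:00'"  # assumed epoch literal, for illustration
)

row = conn.execute("SELECT name, description, last_sync_ts FROM package").fetchone()
print(row)
```

Existing rows pick up the default, so the package sync treats them as never synced.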


* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-07  1:58 Mart Raudsepp
From: Mart Raudsepp @ 2016-12-07  1:58 UTC
  To: gentoo-commits

commit:     ed46487bc107c4f404d23e6429e0e4050616459b
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Wed Dec  7 01:55:18 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Wed Dec  7 01:55:18 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=ed46487b

models: Add package maintainers relationship table and ORM relationships

 backend/lib/models.py | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/backend/lib/models.py b/backend/lib/models.py
index e06dcf8..ba20622 100644
--- a/backend/lib/models.py
+++ b/backend/lib/models.py
@@ -10,6 +10,11 @@ class Category(db.Model):
     def __repr__(self):
         return "<Category %r>" % self.name
 
+package_maintainer_rel_table = db.Table('package_maintainer_rel',
+    db.Column('package_id', db.Integer, db.ForeignKey('package.id')),
+    db.Column('maintainer_id', db.Integer, db.ForeignKey('maintainer.id')),
+)
+
 class Package(db.Model):
     id = db.Column(db.Integer, primary_key=True)
     name = db.Column(db.Unicode(128), nullable=False)
@@ -17,6 +22,9 @@ class Package(db.Model):
     category = db.relationship('Category', backref=db.backref('packages', lazy='dynamic'))
     description = db.Column(db.Unicode(500))
     last_sync_ts = db.Column(db.TIMESTAMP, nullable=False, default=datetime.utcfromtimestamp(0))
+    maintainers = db.relationship("Maintainer",
+        secondary=package_maintainer_rel_table,
+        backref='directly_maintained_packages')
 
     @property
     def full_name(self):
@@ -54,6 +62,7 @@ class Maintainer(db.Model):
         secondaryjoin=id==maintainer_project_membership_rel_table.c.maintainer_id,
         backref='projects')
     # projects relationship backref ^^
+    # directly_maintained_packages backref - list of packages maintained directly by given project or individual maintainer (as opposed to a bigger list that includes packages maintained by parent/child projects or projects the given individual maintainer is part of)
 
     def __repr__(self):
         return "<Maintainer %s '%s'>" % ("project" if self.is_project else "individual", self.email)



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-07  1:58 Mart Raudsepp
  0 siblings, 0 replies; 36+ messages in thread
From: Mart Raudsepp @ 2016-12-07  1:58 UTC (permalink / raw
  To: gentoo-commits

commit:     dde4a3a9c8fbe76897219886f21d046392d65730
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Wed Dec  7 01:56:00 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Wed Dec  7 01:56:00 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=dde4a3a9

sync: Add package description and maintainers sync

Maintains a per-package sync timestamp so that recently synced packages
are skipped; if a previous run got stuck, a restart does not redo them
too soon.
Commits the DB transaction after every 100 packages, because packages.g.o
seems to rate-limit us; this way progress is saved to the DB periodically
and we can cancel and restart when we get stuck.

 backend/lib/sync.py | 49 +++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index e53fa9b..567da2d 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -1,8 +1,11 @@
 import xml.etree.ElementTree as ET
 import requests
+import time
+from datetime import datetime
 from .. import app, db
 from .models import Category, Maintainer, Package, PackageVersion
 
+SYNC_BUFFER_SECS = 30*60
 proj_url = "https://api.gentoo.org/metastructure/projects.xml"
 pkg_url_base = "https://packages.gentoo.org/"
 http_session = requests.session()
@@ -144,11 +147,49 @@ def sync_packages():
     db.session.commit()
 
 def sync_versions():
-    for package in Package.query.all():
+    cnt = 0
+    ts = datetime.utcfromtimestamp(time.time() - SYNC_BUFFER_SECS)
+    now = datetime.utcnow()
+    existing_maintainers = {}
+    for maintainer in Maintainer.query.all():
+        existing_maintainers[maintainer.email] = maintainer
+
+    for package in Package.query.filter(Package.last_sync_ts < ts).all():
+        cnt += 1
         data = http_session.get(pkg_url_base + "packages/" + package.full_name + ".json")
         if not data:
             print("No JSON data for package %s" % package.full_name) # FIXME: Handle better; e.g mark the package as removed if no pkgmove update
             continue
-        from pprint import pprint
-        pprint(data.json())
-        break
+
+        pkg = data.json()
+
+        print ("Updating package: %s" % package.full_name)
+        if 'description' in pkg:
+            package.description = pkg['description']
+
+        maintainers = []
+        if 'maintainers' in pkg:
+            for maint in pkg['maintainers']:
+                if 'email' not in maint:
+                    print("WARNING: Package %s was told to have a maintainer without an e-mail identifier" % package.full_name)
+                    continue
+                if maint['email'] in existing_maintainers: # FIXME: Some proxy-maintainers are using mixed case e-mail address, right now we'd be creating duplicates right now if the case is different across different packages
+                    maintainers.append(existing_maintainers[maint['email']])
+                else:
+                    is_project = False
+                    if 'type' in maint and maint['type'] == 'project':
+                        is_project = True
+                    print("Adding %s maintainer %s" % ("project" if is_project else "individual", maint['email']))
+                    new_maintainer = Maintainer(email=maint['email'], is_project=is_project, name=maint['name'] if 'name' in maint else None)
+                    db.session.add(new_maintainer)
+                    existing_maintainers[maint['email']] = new_maintainer
+                    maintainers.append(new_maintainer)
+
+        # Intentionally outside if 'maintainers' in pkg, because if there are no maintainers in JSON, it's falled to maintainer-needed and we need to clean out old maintainer entries
+        package.maintainers = maintainers # TODO: Retain order to know who is primary; retain description associated with the maintainership
+        package.last_sync_ts = now
+
+        if not cnt % 100:
+            print("%d packages updated, committing DB transaction" % cnt)
+            db.session.commit()
+            now = datetime.utcnow()



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-07  2:10 Mart Raudsepp
  0 siblings, 0 replies; 36+ messages in thread
From: Mart Raudsepp @ 2016-12-07  2:10 UTC (permalink / raw
  To: gentoo-commits

commit:     32483c9459bcfc4f7e3848b3c0e3dc6c1c41829d
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Wed Dec  7 02:08:03 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Wed Dec  7 02:08:03 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=32483c94

sync: Order package details syncing based on how old the last sync was

This way, if we got stuck and re-run much later (or the buffer time
constant has been exceeded), we sync the oldest ones first, so the oldest
sync timestamp in the DB is always as recent as it can be.

 backend/lib/sync.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 567da2d..0250fba 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -154,7 +154,7 @@ def sync_versions():
     for maintainer in Maintainer.query.all():
         existing_maintainers[maintainer.email] = maintainer
 
-    for package in Package.query.filter(Package.last_sync_ts < ts).all():
+    for package in Package.query.filter(Package.last_sync_ts < ts).order_by(Package.last_sync_ts).all():
         cnt += 1
         data = http_session.get(pkg_url_base + "packages/" + package.full_name + ".json")
         if not data:
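
A minimal sketch of the selection logic in the hunk above, written against plain dicts rather than the SQLAlchemy query so it runs on its own. The helper name `select_stale` and the sample package names are hypothetical; the 30-minute buffer mirrors `SYNC_BUFFER_SECS` as it stood at this commit.

```python
from datetime import datetime, timedelta

# Mirrors SYNC_BUFFER_SECS = 30*60 from sync.py at this point in history.
SYNC_BUFFER = timedelta(seconds=30 * 60)


def select_stale(packages, now, buffer=SYNC_BUFFER):
    """Return packages last synced before (now - buffer), oldest first."""
    cutoff = now - buffer
    stale = [p for p in packages if p["last_sync_ts"] < cutoff]
    # Oldest timestamps first, like .order_by(Package.last_sync_ts)
    return sorted(stale, key=lambda p: p["last_sync_ts"])


now = datetime(2016, 12, 7, 3, 0, 0)
packages = [
    {"name": "cat-a/recent", "last_sync_ts": datetime(2016, 12, 7, 2, 45)},
    {"name": "cat-a/yesterday", "last_sync_ts": datetime(2016, 12, 6, 12, 0)},
    # The model default (epoch) marks a package that was never synced.
    {"name": "cat-a/never", "last_sync_ts": datetime(1970, 1, 1)},
]
stale_names = [p["name"] for p in select_stale(packages, now)]
```

With this ordering an interrupted run has always handled the most out-of-date entries first, so a later restart picks up where the staleness is greatest.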



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-07  2:40 Mart Raudsepp
  0 siblings, 0 replies; 36+ messages in thread
From: Mart Raudsepp @ 2016-12-07  2:40 UTC (permalink / raw
  To: gentoo-commits

commit:     c6f4ea5ccc10c9441345f83d9ea6b0d2a121ede4
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Wed Dec  7 02:39:40 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Wed Dec  7 02:39:40 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=c6f4ea5c

sync: Don't forget to commit db transaction after all packages are synced

Sometimes we don't need to cancel out, so also save to the DB the updates
made after the last multiple-of-100 commit :)

 backend/lib/sync.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 0250fba..8c687c6 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -193,3 +193,5 @@ def sync_versions():
             print("%d packages updated, committing DB transaction" % cnt)
             db.session.commit()
             now = datetime.utcnow()
+
+    db.session.commit()
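
The commit pattern can be sketched without a database as follows. `sync_in_batches`, `process` and `commit` are hypothetical stand-ins for the loop body and `db.session.commit()`; only the modulo check and the trailing commit are taken from the diff.

```python
def sync_in_batches(items, process, commit, batch_size=100):
    """Process items, committing every batch_size items and once at the end."""
    cnt = 0
    for item in items:
        cnt += 1
        process(item)
        if not cnt % batch_size:
            commit()  # periodic save, so an interrupted run keeps its progress
    commit()  # final save for the items after the last full batch
    return cnt


processed = []
commits = []  # records how many items had been processed at each commit
total = sync_in_batches(
    range(250),
    processed.append,
    lambda: commits.append(len(processed)),
)
```

Here the commits land after 100, 200 and 250 items. When the item count is an exact multiple of the batch size, the trailing commit simply has nothing new to save.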



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-07  2:53 Mart Raudsepp
  0 siblings, 0 replies; 36+ messages in thread
From: Mart Raudsepp @ 2016-12-07  2:53 UTC (permalink / raw
  To: gentoo-commits

commit:     0522c4ccf0f4ca737572b8164cde6bb9c498ba7f
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Wed Dec  7 02:52:48 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Wed Dec  7 02:52:48 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=0522c4cc

sync: Increase the sync delta to 1 hour and print the sync count and oldest TS at start

 backend/lib/sync.py | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 8c687c6..7ba583d 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -5,7 +5,7 @@ from datetime import datetime
 from .. import app, db
 from .models import Category, Maintainer, Package, PackageVersion
 
-SYNC_BUFFER_SECS = 30*60
+SYNC_BUFFER_SECS = 60*60 #1 hour
 proj_url = "https://api.gentoo.org/metastructure/projects.xml"
 pkg_url_base = "https://packages.gentoo.org/"
 http_session = requests.session()
@@ -154,7 +154,10 @@ def sync_versions():
     for maintainer in Maintainer.query.all():
         existing_maintainers[maintainer.email] = maintainer
 
-    for package in Package.query.filter(Package.last_sync_ts < ts).order_by(Package.last_sync_ts).all():
+    packages_to_sync = Package.query.filter(Package.last_sync_ts < ts).order_by(Package.last_sync_ts).all()
+    print("Going to sync %d packages%s" % (len(packages_to_sync), (" (oldest sync UTC timestamp: %s)" % packages_to_sync[0].last_sync_ts if len(packages_to_sync) else "")))
+
+    for package in packages_to_sync:
         cnt += 1
         data = http_session.get(pkg_url_base + "packages/" + package.full_name + ".json")
         if not data:



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-07  4:42 Mart Raudsepp
  0 siblings, 0 replies; 36+ messages in thread
From: Mart Raudsepp @ 2016-12-07  4:42 UTC (permalink / raw
  To: gentoo-commits

commit:     5f3073d21e0748a9414fbd516c3e032d0456ab35
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Wed Dec  7 04:41:46 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Wed Dec  7 04:41:46 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=5f3073d2

sync: Always handle e-mails in lower case to not end up with duplicates

Suggested-by: Doug Freed <dwfreed <AT> mtu.edu>

 backend/lib/models.py |  1 +
 backend/lib/sync.py   | 24 ++++++++++++++----------
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/backend/lib/models.py b/backend/lib/models.py
index ba20622..2eb9e8c 100644
--- a/backend/lib/models.py
+++ b/backend/lib/models.py
@@ -50,6 +50,7 @@ maintainer_project_membership_rel_table = db.Table('maintainer_project_membershi
 
 class Maintainer(db.Model):
     id = db.Column(db.Integer, primary_key=True)
+    # TODO: This has to be unique case insensitive. Currently we have to always force lower() to guarantee this and find the proper maintainer entry; later we might want to use some sort of NOCASE collate rules here to keep the capitalization as preferred per master data
     email = db.Column(db.Unicode(50), nullable=False, unique=True)
     is_project = db.Column(db.Boolean, nullable=False, server_default='f', default=False)
     name = db.Column(db.Unicode(128))

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 7ba583d..744811b 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -28,7 +28,9 @@ def get_project_data():
         proj = {}
         for elem in proj_elem:
             tag = elem.tag.lower()
-            if tag in ['email', 'name', 'url', 'description']:
+            if tag in ['email']:
+                proj[tag] = elem.text.lower()
+            if tag in ['name', 'url', 'description']:
                 proj[tag] = elem.text
             elif tag == 'member':
                 member = {}
@@ -36,19 +38,20 @@ def get_project_data():
                     member['is_lead'] = True
                 for member_elem in elem:
                     member_tag = member_elem.tag.lower()
-                    if member_tag in ['email', 'name', 'role']:
+                    if member_tag in ['email']:
+                        member[member_tag] = member_elem.text.lower()
+                    if member_tag in ['name', 'role']:
                         member[member_tag] = member_elem.text
                 if 'email' in member:
                     if 'members' not in proj:
                         proj['members'] = []
                     proj['members'].append(member)
-                    pass
             elif tag == 'subproject':
                 if 'ref' in elem.attrib:
                     if 'subprojects' not in proj:
                         proj['subprojects'] = []
                     # subprojects will be a list of (subproject_email, inherit-members) tuples where inherit-members is True or False. TODO: Might change if sync code will want it differently
-                    proj['subprojects'].append((elem.attrib['ref'], True if ('inherit-members' in elem.attrib and elem.attrib['inherit-members'] == '1') else False))
+                    proj['subprojects'].append((elem.attrib['ref'].lower(), True if ('inherit-members' in elem.attrib and elem.attrib['inherit-members'] == '1') else False))
                 else:
                     print("Invalid <subproject> tag inside project %s - required 'ref' attribute missing" % proj['email'] if 'email' in proj else "<unknown>")
             else:
@@ -77,7 +80,7 @@ def sync_projects():
                 existing_maintainers[email].url = data['url']
         else:
             print ("Adding project %s" % email)
-            new_maintainer = Maintainer(email=data['email'], is_project=True, description=data['description'], name=data['name'], url=data['url'])
+            new_maintainer = Maintainer(email=email, is_project=True, description=data['description'], name=data['name'], url=data['url'])
             db.session.add(new_maintainer)
             existing_maintainers[email] = new_maintainer
         members = []
@@ -176,16 +179,17 @@ def sync_versions():
                 if 'email' not in maint:
                     print("WARNING: Package %s was told to have a maintainer without an e-mail identifier" % package.full_name)
                     continue
-                if maint['email'] in existing_maintainers: # FIXME: Some proxy-maintainers are using mixed case e-mail address, right now we'd be creating duplicates right now if the case is different across different packages
-                    maintainers.append(existing_maintainers[maint['email']])
+                email = maint['email'].lower()
+                if email in existing_maintainers:
+                    maintainers.append(existing_maintainers[email])
                 else:
                     is_project = False
                     if 'type' in maint and maint['type'] == 'project':
                         is_project = True
-                    print("Adding %s maintainer %s" % ("project" if is_project else "individual", maint['email']))
-                    new_maintainer = Maintainer(email=maint['email'], is_project=is_project, name=maint['name'] if 'name' in maint else None)
+                    print("Adding %s maintainer %s" % ("project" if is_project else "individual", email))
+                    new_maintainer = Maintainer(email=email, is_project=is_project, name=maint['name'] if 'name' in maint else None)
                     db.session.add(new_maintainer)
-                    existing_maintainers[maint['email']] = new_maintainer
+                    existing_maintainers[email] = new_maintainer
                     maintainers.append(new_maintainer)
 
         # Intentionally outside if 'maintainers' in pkg, because if there are no maintainers in JSON, it's falled to maintainer-needed and we need to clean out old maintainer entries
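
A small sketch of the lookup-or-create step with the key lowered up front, assuming a plain dict in place of `existing_maintainers` and a hypothetical `get_or_create` helper and `factory` callable in place of the `Maintainer` constructor.

```python
def get_or_create(existing, email, factory):
    """Look up a maintainer by lower-cased e-mail, creating it if missing."""
    key = email.lower()
    if key not in existing:
        # Store the lowered form so later lookups in any case hit this entry.
        existing[key] = factory(key)
    return existing[key]


existing = {}
first = get_or_create(existing, "Proxy-Maint@Example.org", lambda e: {"email": e})
second = get_or_create(existing, "proxy-maint@example.org", lambda e: {"email": e})
```

Both calls return the same object, so differently capitalised addresses across packages no longer produce duplicate rows.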



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2016-12-07  7:10 Mart Raudsepp
  0 siblings, 0 replies; 36+ messages in thread
From: Mart Raudsepp @ 2016-12-07  7:10 UTC (permalink / raw
  To: gentoo-commits

commit:     8d90fa100941d73a026a7270f64d16fbe65dc8a5
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Wed Dec  7 07:09:52 2016 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Wed Dec  7 07:09:52 2016 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=8d90fa10

models: Add preliminary model and fields for keyword and p.mask storage

 backend/lib/models.py | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/backend/lib/models.py b/backend/lib/models.py
index 2eb9e8c..010d58f 100644
--- a/backend/lib/models.py
+++ b/backend/lib/models.py
@@ -2,6 +2,18 @@ from datetime import datetime
 from .. import db
 
 
+class Keyword(db.Model):
+    id = db.Column(db.Integer, primary_key=True)
+    # current longest entries would be of length 16 with "~sparc64-freebsd" and "~sparc64-solaris"
+    name = db.Column(db.Unicode(20), unique=True, nullable=False) # TODO: Force lower case?
+
+    @property
+    def stable(self):
+        return not self.name.startswith('~')
+
+    def __repr__(self):
+        return "<Keyword %r>" % self.name
+
 class Category(db.Model):
     id = db.Column(db.Integer, primary_key=True)
     name = db.Column(db.Unicode(30), unique=True, nullable=False)
@@ -19,12 +31,13 @@ class Package(db.Model):
     id = db.Column(db.Integer, primary_key=True)
     name = db.Column(db.Unicode(128), nullable=False)
     category_id = db.Column(db.Integer, db.ForeignKey('category.id'), nullable=False)
-    category = db.relationship('Category', backref=db.backref('packages', lazy='dynamic'))
+    category = db.relationship('Category', backref=db.backref('packages', lazy='select'))
     description = db.Column(db.Unicode(500))
     last_sync_ts = db.Column(db.TIMESTAMP, nullable=False, default=datetime.utcfromtimestamp(0))
     maintainers = db.relationship("Maintainer",
         secondary=package_maintainer_rel_table,
         backref='directly_maintained_packages')
+    # versions backref
 
     @property
     def full_name(self):
@@ -33,11 +46,18 @@ class Package(db.Model):
     def __repr__(self):
         return "<Package '%s/%s'>" % (self.category.name, self.name)
 
+package_version_keywords_rel_table = db.Table('package_version_keywords_rel',
+    db.Column('package_version_id', db.Integer, db.ForeignKey('package_version.id')),
+    db.Column('keyword_id', db.Integer, db.ForeignKey('keyword.id')),
+)
+
 class PackageVersion(db.Model):
     id = db.Column(db.Integer, primary_key=True)
     version = db.Column(db.Unicode(128), nullable=False)
     package_id = db.Column(db.Integer, db.ForeignKey('package.id'), nullable=False)
-    package = db.relationship('Package', backref=db.backref('versions', lazy='dynamic'))
+    package = db.relationship('Package', backref=db.backref('versions', lazy='select'))
+    keywords = db.relationship("Keyword", secondary=package_version_keywords_rel_table)
+    masks = db.Column(db.UnicodeText, nullable=True) # Concatenated mask reasons if p.masked, NULL if not a masked version. TODO: arch specific masks
 
     def __repr__(self):
         return "<PackageVersion '%s/%s-%s'>" % (self.package.category.name, self.package.name, self.version)
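
For illustration, the `stable` property above reduces to a single prefix check. The standalone function `is_stable` below is a hypothetical equivalent that can be tried without the ORM.

```python
def is_stable(keyword_name):
    """Same check as Keyword.stable: anything not starting with '~'."""
    return not keyword_name.startswith('~')


results = {name: is_stable(name) for name in ["amd64", "~sparc64-freebsd", "-*"]}
```

Note that, as written, the check also reports a name such as `-*` as stable, since it only looks for the `~` prefix; whether such names are ever stored in this table is not stated in the commit.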



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 11:04 Gilles Dartiguelongue
  0 siblings, 0 replies; 36+ messages in thread
From: Gilles Dartiguelongue @ 2017-01-22 11:04 UTC (permalink / raw
  To: gentoo-commits

commit:     5f53c4b92b93e9206089a15ff3851925ed3b8952
Author:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 11:04:12 2017 +0000
Commit:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 11:04:40 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=5f53c4b9

sync: use dict-comprehension in sync_packages

 backend/lib/sync.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 48629cc..d292291 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -143,9 +143,9 @@ def sync_packages():
             continue
         packages = data.json()['packages']
         # TODO: Use UPSERT instead (on_conflict_do_update)
-        existing_packages = {}
-        for pkg in Package.query.all():
-            existing_packages[pkg.name] = pkg
+
+        existing_packages = {pkg.name: pkg for pkg in Package.query.all()}
+
         for package in packages:
             if package['name'] in existing_packages:
                 continue # TODO: Update description once we keep that in DB



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 11:04 Gilles Dartiguelongue
  0 siblings, 0 replies; 36+ messages in thread
From: Gilles Dartiguelongue @ 2017-01-22 11:04 UTC (permalink / raw
  To: gentoo-commits

commit:     cd3166150bd42dc8b516e2776d4093418b19d423
Author:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 11:03:03 2017 +0000
Commit:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 11:04:36 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=cd316615

sync: fix broken sync_packages

I think there is a problem in the logic here but at least this gets me
past the initial sync.

 backend/lib/sync.py | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 744811b..48629cc 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -131,7 +131,12 @@ def sync_categories():
 
 def sync_packages():
     for category in Category.query.all():
-        existing_packages = category.packages.all()
+        if not category.packages:
+            print('Category %s has no packages' % category.name)
+            existing_packages = []
+        else:
+            existing_packages = category.packages.all()
+
         data = http_session.get(pkg_url_base + "categories/" + category.name + ".json")
         if not data:
             print("No JSON data for category %s" % category.name) # FIXME: Better handling; mark category as inactive/gone?



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 11:59 Mart Raudsepp
  0 siblings, 0 replies; 36+ messages in thread
From: Mart Raudsepp @ 2017-01-22 11:59 UTC (permalink / raw
  To: gentoo-commits

commit:     fab9c6f0ce09830aa95fc3bdfe09c03663094660
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 11:57:24 2017 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 11:57:24 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=fab9c6f0

sync: Fix pkg sync for packages that have a same named pkg in another category

Also fixes an InstrumentedList issue: commit 8d90fa1009 changed the
categories.packages relationship from dynamic loading to select, which
broke this code earlier.

 backend/lib/sync.py | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index d292291..7c499b5 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -131,12 +131,6 @@ def sync_categories():
 
 def sync_packages():
     for category in Category.query.all():
-        if not category.packages:
-            print('Category %s has no packages' % category.name)
-            existing_packages = []
-        else:
-            existing_packages = category.packages.all()
-
         data = http_session.get(pkg_url_base + "categories/" + category.name + ".json")
         if not data:
             print("No JSON data for category %s" % category.name) # FIXME: Better handling; mark category as inactive/gone?
@@ -144,7 +138,7 @@ def sync_packages():
         packages = data.json()['packages']
         # TODO: Use UPSERT instead (on_conflict_do_update)
 
-        existing_packages = {pkg.name: pkg for pkg in Package.query.all()}
+        existing_packages = {pkg.name: pkg for pkg in category.packages}
 
         for package in packages:
             if package['name'] in existing_packages:
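
The same-name problem can be reproduced with plain dicts. This is only a sketch: the category and package names are made up, and `missing_names` is a hypothetical helper standing in for the `if package['name'] in existing_packages` check.

```python
def missing_names(upstream_names, index):
    """Names reported upstream that are not yet in the given index."""
    return [name for name in upstream_names if name not in index]


# Only cat-a/foo is in the database so far.
all_packages = [{"category": "cat-a", "name": "foo"}]
upstream_cat_b = ["foo", "bar"]  # what the category JSON for cat-b lists

# Index over every package, keyed by bare name (the behaviour being fixed).
global_index = {p["name"]: p for p in all_packages}
# Index restricted to the category being synced (the behaviour after the fix).
cat_b_index = {p["name"]: p for p in all_packages if p["category"] == "cat-b"}

missed_with_global = missing_names(upstream_cat_b, global_index)
missed_with_scoped = missing_names(upstream_cat_b, cat_b_index)
```

With the global index, `foo` in `cat-b` is treated as already present because `cat-a/foo` exists, so it would never be added; the per-category index reports both names as missing.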



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 12:00 Gilles Dartiguelongue
  0 siblings, 0 replies; 36+ messages in thread
From: Gilles Dartiguelongue @ 2017-01-22 12:00 UTC (permalink / raw
  To: gentoo-commits

commit:     793722996da7f8c9120c678b16350363d30c6bf1
Author:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 11:39:41 2017 +0000
Commit:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 12:00:20 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=79372299

sync: use assert for GLEP67 compliance check

Should never be raised actually but who knows.

 backend/lib/sync.py | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 7c499b5..ba31477 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -175,15 +175,17 @@ def sync_versions():
         maintainers = []
         if 'maintainers' in pkg:
             for maint in pkg['maintainers']:
-                if 'email' not in maint:
-                    print("WARNING: Package %s was told to have a maintainer without an e-mail identifier" % package.full_name)
-                    continue
+                assert (
+                    'email' in maint and 'type' in maint,
+                    "Package %s maintainer %s entry not GLEP 67 valid" % (package.full_name, maint)
+                )
+
                 email = maint['email'].lower()
                 if email in existing_maintainers:
                     maintainers.append(existing_maintainers[email])
                 else:
                     is_project = False
-                    if 'type' in maint and maint['type'] == 'project':
+                    if maint['type'] == 'project':
                         is_project = True
                     print("Adding %s maintainer %s" % ("project" if is_project else "individual", email))
                     new_maintainer = Maintainer(email=email, is_project=is_project, name=maint['name'] if 'name' in maint else None)
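
One caveat on the form used in this hunk: wrapping the condition and the message in a single pair of parentheses makes Python evaluate a two-element tuple, and a non-empty tuple is always truthy, so the assertion can never fail (asserts are also skipped entirely under `python -O`). The sketch below shows the truthiness and one explicit alternative; `check_maintainer` is a hypothetical helper name, and the sample entries are made up.

```python
def check_maintainer(full_name, maint):
    """Raise ValueError unless the entry carries both 'email' and 'type'."""
    if 'email' not in maint or 'type' not in maint:
        raise ValueError(
            "Package %s maintainer %s entry not GLEP 67 valid" % (full_name, maint)
        )


# What a parenthesised (condition, message) assert actually tests: a tuple.
condition_and_message = (False, "never shown")
tuple_is_truthy = bool(condition_and_message)  # such an assert cannot fail

try:
    check_maintainer("cat-a/foo", {"email": "dev@example.org"})  # no 'type' key
    rejected = False
except ValueError:
    rejected = True
```

An explicit `if ... raise` keeps the check active regardless of interpreter flags.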



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 12:00 Gilles Dartiguelongue
  0 siblings, 0 replies; 36+ messages in thread
From: Gilles Dartiguelongue @ 2017-01-22 12:00 UTC (permalink / raw
  To: gentoo-commits

commit:     24047d7602bbdbaae60f88e6811dc8570227161f
Author:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 11:58:33 2017 +0000
Commit:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 12:00:24 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=24047d76

sync: use ORM magics in sync_packages

The ORM knows how to map objects to ids through relationships, so skip the
details and focus on the thing you want to do.

 backend/lib/sync.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index ba31477..dbb44c2 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -144,7 +144,7 @@ def sync_packages():
             if package['name'] in existing_packages:
                 continue # TODO: Update description once we keep that in DB
             else:
-                new_pkg = Package(category_id=category.id, name=package['name'])
+                new_pkg = Package(category=category, name=package['name'])
                 db.session.add(new_pkg)
     db.session.commit()
 



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 12:08 Mart Raudsepp
  0 siblings, 0 replies; 36+ messages in thread
From: Mart Raudsepp @ 2017-01-22 12:08 UTC (permalink / raw
  To: gentoo-commits

commit:     ed727d30df105b6852f5118baa5a454965b6f4ba
Author:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 12:07:48 2017 +0000
Commit:     Mart Raudsepp <leio <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 12:07:48 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=ed727d30

sync: Use dict comprehension in sync_categories as well

 backend/lib/sync.py | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index dbb44c2..c837c23 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -117,10 +117,8 @@ def sync_categories():
     data = http_session.get(url)
     # TODO: Handle response error (if not data)
     categories = data.json()
-    existing_categories = {}
     # TODO: Use UPSERT instead (on_conflict_do_update) if we can rely on postgresql:9.5
-    for cat in Category.query.all():
-        existing_categories[cat.name] = cat
+    existing_categories = {cat.name: cat for cat in Category.query.all()}
     for category in categories:
         if category['name'] in existing_categories:
             existing_categories[category['name']].description = category['description']



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 12:24 Gilles Dartiguelongue
  0 siblings, 0 replies; 36+ messages in thread
From: Gilles Dartiguelongue @ 2017-01-22 12:24 UTC (permalink / raw
  To: gentoo-commits

commit:     e8f79bda15a675e5802b0daad41144b082d20247
Author:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 12:07:52 2017 +0000
Commit:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 12:23:56 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=e8f79bda

sync: sort imports according to PEP8

 backend/lib/sync.py | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index c837c23..5e8240d 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -1,7 +1,9 @@
-import xml.etree.ElementTree as ET
-import requests
 import time
+import xml.etree.ElementTree as ET
 from datetime import datetime
+
+import requests
+
 from .. import app, db
 from .models import Category, Maintainer, Package, PackageVersion
 
@@ -10,6 +12,7 @@ proj_url = "https://api.gentoo.org/metastructure/projects.xml"
 pkg_url_base = "https://packages.gentoo.org/"
 http_session = requests.session()
 
+
 def get_project_data():
     projects = {}
     data = http_session.get(proj_url)



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 12:24 Gilles Dartiguelongue
  0 siblings, 0 replies; 36+ messages in thread
From: Gilles Dartiguelongue @ 2017-01-22 12:24 UTC (permalink / raw
  To: gentoo-commits

commit:     29a6bea1536dd23adbc84454aacb2c81d0499f82
Author:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 12:21:36 2017 +0000
Commit:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 12:23:56 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=29a6bea1

sync: replace assert with ValueError raise

Simpler expression, probably here to stay.

 backend/lib/sync.py | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 0aab3bc..429d14b 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -178,10 +178,11 @@ def sync_versions():
         maintainers = []
         if 'maintainers' in pkg:
             for maint in pkg['maintainers']:
-                assert (
-                    'email' in maint and 'type' in maint,
-                    "Package %s maintainer %s entry not GLEP 67 valid" % (package.full_name, maint)
-                )
+                if 'email' not in maint or 'type' not in maint:
+                    raise ValueError(
+                        "Package %s maintainer %s entry not GLEP 67 valid" %
+                        (package.full_name, maint)
+                    )
 
                 email = maint['email'].lower()
                 if email in existing_maintainers:
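
A note on why the raise form is also safer, inferred from the removed lines rather than stated in the commit message: the old assert wrapped its condition and message in one pair of parentheses, which Python evaluates as a two-element tuple. A non-empty tuple is always truthy, so that check could never fail. The sketch below reproduces this with a hypothetical maintainer entry; the function names are illustrative only and are not part of sync.py.

```python
def validate_with_tuple_assert(maint):
    # Equivalent to `assert (cond, msg)`: the operand is a 2-tuple, and any
    # non-empty tuple is truthy, so this assertion never fails.
    operand = ('email' in maint and 'type' in maint, "entry not GLEP 67 valid")
    assert operand
    return True


def validate_with_raise(maint):
    # Explicit check, mirroring the form introduced by the commit.
    if 'email' not in maint or 'type' not in maint:
        raise ValueError("maintainer %s entry not GLEP 67 valid" % (maint,))
    return True


# Hypothetical entry missing the required 'type' key.
bad_entry = {'email': 'dev@example.org'}

print(validate_with_tuple_assert(bad_entry))  # the invalid entry passes
try:
    validate_with_raise(bad_entry)
except ValueError as exc:
    print("rejected:", exc)
```

The explicit raise also keeps working when Python runs with -O, which strips assert statements entirely.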



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 12:24 Gilles Dartiguelongue
  0 siblings, 0 replies; 36+ messages in thread
From: Gilles Dartiguelongue @ 2017-01-22 12:24 UTC (permalink / raw
  To: gentoo-commits

commit:     5e7347647516660603dddeedcf570d0cfef27b1a
Author:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 12:18:00 2017 +0000
Commit:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 12:23:56 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=5e734764

sync: define project keys default values

Costs less than checking for it in each loop iteration and does no
harm later to loop on empty lists.

 backend/lib/sync.py | 64 +++++++++++++++++++++++++++--------------------------
 1 file changed, 33 insertions(+), 31 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 5e8240d..0aab3bc 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -28,7 +28,10 @@ def get_project_data():
         if proj_elem.tag.lower() != 'project':
             print("Skipping unknown <projects> subtag <%s>" % proj_elem.tag)
             continue
-        proj = {}
+        proj = {
+            'members': [],
+            'subprojects': [],
+        }
         for elem in proj_elem:
             tag = elem.tag.lower()
             if tag in ['email']:
@@ -46,14 +49,11 @@ def get_project_data():
                     if member_tag in ['name', 'role']:
                         member[member_tag] = member_elem.text
                 if 'email' in member:
-                    if 'members' not in proj:
-                        proj['members'] = []
                     proj['members'].append(member)
             elif tag == 'subproject':
                 if 'ref' in elem.attrib:
-                    if 'subprojects' not in proj:
-                        proj['subprojects'] = []
-                    # subprojects will be a list of (subproject_email, inherit-members) tuples where inherit-members is True or False. TODO: Might change if sync code will want it differently
+                    # subprojects will be a list of (subproject_email, inherit-members) tuples where inherit-members is True or False.
+                    # TODO: Might change if sync code will want it differently
                     proj['subprojects'].append((elem.attrib['ref'].lower(), True if ('inherit-members' in elem.attrib and elem.attrib['inherit-members'] == '1') else False))
                 else:
                     print("Invalid <subproject> tag inside project %s - required 'ref' attribute missing" % proj['email'] if 'email' in proj else "<unknown>")
@@ -86,32 +86,34 @@ def sync_projects():
             new_maintainer = Maintainer(email=email, is_project=True, description=data['description'], name=data['name'], url=data['url'])
             db.session.add(new_maintainer)
             existing_maintainers[email] = new_maintainer
+
         members = []
-        if 'subprojects' in data:
-            for subproject_email, inherit_members in data['subprojects']:
-                # TODO: How should we handle inherit_members?
-                if subproject_email in existing_maintainers:
-                    members.append(existing_maintainers[subproject_email])
-                else:
-                    print("Creating new project entry for subproject: %s" % subproject_email)
-                    new_subproject = Maintainer(email=subproject_email, is_project=True)
-                    db.session.add(new_subproject)
-                    existing_maintainers[subproject_email] = new_subproject
-                    members.append(new_subproject)
-        if 'members' in data:
-            for member in data['members']:
-                if member['email'] in existing_maintainers:
-                    # TODO: Stop overwriting the name from master data, if/once we have a proper sync source for individual maintainers (Gentoo LDAP?)
-                    if 'name' in member:
-                        existing_maintainers[member['email']].name = member['name']
-                    members.append(existing_maintainers[member['email']])
-                else:
-                    print("Adding individual maintainer %s" % member['email'])
-                    new_maintainer = Maintainer(email=member['email'], is_project=False, name=member['name'] if 'name' in member else None)
-                    db.session.add(new_maintainer)
-                    existing_maintainers[member['email']] = new_maintainer
-                    members.append(new_maintainer)
-            # TODO: Include role information in the association?
+
+        for subproject_email, inherit_members in data['subprojects']:
+            # TODO: How should we handle inherit_members?
+            if subproject_email in existing_maintainers:
+                members.append(existing_maintainers[subproject_email])
+            else:
+                print("Creating new project entry for subproject: %s" % subproject_email)
+                new_subproject = Maintainer(email=subproject_email, is_project=True)
+                db.session.add(new_subproject)
+                existing_maintainers[subproject_email] = new_subproject
+                members.append(new_subproject)
+
+        for member in data['members']:
+            if member['email'] in existing_maintainers:
+                # TODO: Stop overwriting the name from master data, if/once we have a proper sync source for individual maintainers (Gentoo LDAP?)
+                if 'name' in member:
+                    existing_maintainers[member['email']].name = member['name']
+                members.append(existing_maintainers[member['email']])
+            else:
+                print("Adding individual maintainer %s" % member['email'])
+                new_maintainer = Maintainer(email=member['email'], is_project=False, name=member['name'] if 'name' in member else None)
+                db.session.add(new_maintainer)
+                existing_maintainers[member['email']] = new_maintainer
+                members.append(new_maintainer)
+
+        # TODO: Include role information in the association?
         existing_maintainers[email].members = members
     db.session.commit()
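
The pattern in this commit can be sketched in isolation as follows. This is a simplified illustration, not the sync.py code: parse_project and its list of (tag, value) pairs are hypothetical stand-ins for the XML parsing loop. Seeding the dict with empty lists removes the per-iteration existence checks, and consumers can loop over the lists unconditionally.

```python
def parse_project(elements):
    # `elements` is a hypothetical list of (tag, value) pairs standing in
    # for the children of a <project> element.
    proj = {
        'members': [],
        'subprojects': [],
    }
    for tag, value in elements:
        if tag == 'email':
            proj['email'] = value.lower()
        elif tag == 'member':
            # No "if 'members' not in proj" guard needed any more.
            proj['members'].append(value)
        elif tag == 'subproject':
            proj['subprojects'].append(value)
    return proj


proj = parse_project([('email', 'GNOME@gentoo.org')])
# Looping over the defaults is harmless: both loops run zero times.
for member in proj['members']:
    print(member)
for subproject in proj['subprojects']:
    print(subproject)
```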
 



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 12:36 Gilles Dartiguelongue
  0 siblings, 0 replies; 36+ messages in thread
From: Gilles Dartiguelongue @ 2017-01-22 12:36 UTC (permalink / raw
  To: gentoo-commits

commit:     b888c93b7892c532385626c9d2a55a8b11661e99
Author:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 12:35:17 2017 +0000
Commit:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 12:35:17 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=b888c93b

sync: use dict facilities for key retrieval with a default

 backend/lib/sync.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 4cbfe1b..723c3af 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -192,7 +192,7 @@ def sync_versions():
                     if maint['type'] == 'project':
                         is_project = True
                     print("Adding %s maintainer %s" % ("project" if is_project else "individual", email))
-                    new_maintainer = Maintainer(email=email, is_project=is_project, name=maint['name'] if 'name' in maint else None)
+                    new_maintainer = Maintainer(email=email, is_project=is_project, name=maint.get('name'))
                     db.session.add(new_maintainer)
                     existing_maintainers[email] = new_maintainer
                     maintainers.append(new_maintainer)



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 12:36 Gilles Dartiguelongue
  0 siblings, 0 replies; 36+ messages in thread
From: Gilles Dartiguelongue @ 2017-01-22 12:36 UTC (permalink / raw
  To: gentoo-commits

commit:     f969ccffe04df2d1eeb014dfe67d58177da476fb
Author:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 12:34:13 2017 +0000
Commit:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 12:34:13 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=f969ccff

sync: reduce unneeded conditional evaluation

Once a tag matches the first if, it cannot match these later branches, so switch them to elif.

 backend/lib/sync.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 429d14b..4cbfe1b 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -36,7 +36,7 @@ def get_project_data():
             tag = elem.tag.lower()
             if tag in ['email']:
                 proj[tag] = elem.text.lower()
-            if tag in ['name', 'url', 'description']:
+            elif tag in ['name', 'url', 'description']:
                 proj[tag] = elem.text
             elif tag == 'member':
                 member = {}
@@ -46,7 +46,7 @@ def get_project_data():
                     member_tag = member_elem.tag.lower()
                     if member_tag in ['email']:
                         member[member_tag] = member_elem.text.lower()
-                    if member_tag in ['name', 'role']:
+                    elif member_tag in ['name', 'role']:
                         member[member_tag] = member_elem.text
                 if 'email' in member:
                     proj['members'].append(member)



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 12:36 Gilles Dartiguelongue
  0 siblings, 0 replies; 36+ messages in thread
From: Gilles Dartiguelongue @ 2017-01-22 12:36 UTC (permalink / raw
  To: gentoo-commits

commit:     c71c75d3fbf28528c844f8280e0ef499dacb1819
Author:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 12:35:58 2017 +0000
Commit:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 12:35:58 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=c71c75d3

sync: use dict facilities for key retrieval with a default

 backend/lib/sync.py | 39 +++++++++++++++++++--------------------
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 723c3af..02e1116 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -176,26 +176,25 @@ def sync_versions():
             package.description = pkg['description']
 
         maintainers = []
-        if 'maintainers' in pkg:
-            for maint in pkg['maintainers']:
-                if 'email' not in maint or 'type' not in maint:
-                    raise ValueError(
-                        "Package %s maintainer %s entry not GLEP 67 valid" %
-                        (package.full_name, maint)
-                    )
-
-                email = maint['email'].lower()
-                if email in existing_maintainers:
-                    maintainers.append(existing_maintainers[email])
-                else:
-                    is_project = False
-                    if maint['type'] == 'project':
-                        is_project = True
-                    print("Adding %s maintainer %s" % ("project" if is_project else "individual", email))
-                    new_maintainer = Maintainer(email=email, is_project=is_project, name=maint.get('name'))
-                    db.session.add(new_maintainer)
-                    existing_maintainers[email] = new_maintainer
-                    maintainers.append(new_maintainer)
+        for maint in pkg.get('maintainers', []):
+            if 'email' not in maint or 'type' not in maint:
+                raise ValueError(
+                    "Package %s maintainer %s entry not GLEP 67 valid" %
+                    (package.full_name, maint)
+                )
+
+            email = maint['email'].lower()
+            if email in existing_maintainers:
+                maintainers.append(existing_maintainers[email])
+            else:
+                is_project = False
+                if maint['type'] == 'project':
+                    is_project = True
+                print("Adding %s maintainer %s" % ("project" if is_project else "individual", email))
+                new_maintainer = Maintainer(email=email, is_project=is_project, name=maint.get('name'))
+                db.session.add(new_maintainer)
+                existing_maintainers[email] = new_maintainer
+                maintainers.append(new_maintainer)
 
         # Intentionally outside if 'maintainers' in pkg, because if there are no maintainers in JSON, it's falled to maintainer-needed and we need to clean out old maintainer entries
         package.maintainers = maintainers # TODO: Retain order to know who is primary; retain description associated with the maintainership
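
For reference, the two dict.get forms used in this commit and the previous one behave as sketched below. The helper names and sample data are hypothetical and only serve to make the snippet runnable on its own.

```python
def maintainer_name(maint):
    # dict.get returns None when the key is absent, which matches the removed
    # `maint['name'] if 'name' in maint else None` expression.
    return maint.get('name')


def maintainer_emails(pkg):
    # An explicit [] default lets the loop run zero times when the
    # 'maintainers' key is missing, so no surrounding `if` is needed.
    return [maint['email'].lower() for maint in pkg.get('maintainers', [])]


print(maintainer_name({'email': 'dev@example.org'}))
print(maintainer_emails({}))
print(maintainer_emails({'maintainers': [{'email': 'Dev@Example.org'}]}))
```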



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 17:13 Gilles Dartiguelongue
  0 siblings, 0 replies; 36+ messages in thread
From: Gilles Dartiguelongue @ 2017-01-22 17:13 UTC (permalink / raw
  To: gentoo-commits

commit:     01fe45522776507f8b9e5d973c2982f66d78b6db
Author:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 17:12:53 2017 +0000
Commit:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 17:12:53 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=01fe4552

sync: add detail points to sync_versions

 backend/lib/sync.py | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 02e1116..22008ea 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -152,6 +152,12 @@ def sync_packages():
     db.session.commit()
 
 def sync_versions():
+    """Synchronize packages version data from packages.gentoo.org.
+
+    For each package that has not been updated in the last SYNC_BUFFER_SECS,
+    pull package information and refresh its description, maintainers,
+    versions and keywords.
+    """
     cnt = 0
     ts = datetime.utcfromtimestamp(time.time() - SYNC_BUFFER_SECS)
     now = datetime.utcnow()
@@ -172,9 +178,12 @@ def sync_versions():
         pkg = data.json()
 
         print ("Updating package: %s" % package.full_name)
+
+        # 1. refresh description
         if 'description' in pkg:
             package.description = pkg['description']
 
+	# 2. refresh maintainers
         maintainers = []
         for maint in pkg.get('maintainers', []):
             if 'email' not in maint or 'type' not in maint:
@@ -198,6 +207,12 @@ def sync_versions():
 
         # Intentionally outside if 'maintainers' in pkg, because if there are no maintainers in JSON, it's falled to maintainer-needed and we need to clean out old maintainer entries
         package.maintainers = maintainers # TODO: Retain order to know who is primary; retain description associated with the maintainership
+
+        # TODO: 3. refresh versions
+
+        # TODO: 4. refresh keywords
+
+        # 5. mark package as refreshed
         package.last_sync_ts = now
 
         if not cnt % 100:



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-22 17:46 Gilles Dartiguelongue
  0 siblings, 0 replies; 36+ messages in thread
From: Gilles Dartiguelongue @ 2017-01-22 17:46 UTC (permalink / raw
  To: gentoo-commits

commit:     edc09cb3b2f3862e6fc5d5277041fbce091d3281
Author:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
AuthorDate: Sun Jan 22 17:45:56 2017 +0000
Commit:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
CommitDate: Sun Jan 22 17:45:56 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=edc09cb3

sync: add version and keyword synchronization

 backend/lib/sync.py | 42 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 38 insertions(+), 4 deletions(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 22008ea..25b6ea0 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -5,7 +5,7 @@ from datetime import datetime
 import requests
 
 from .. import app, db
-from .models import Category, Maintainer, Package, PackageVersion
+from .models import Category, Keyword, Maintainer, Package, PackageVersion
 
 SYNC_BUFFER_SECS = 60*60 #1 hour
 proj_url = "https://api.gentoo.org/metastructure/projects.xml"
@@ -165,6 +165,8 @@ def sync_versions():
     for maintainer in Maintainer.query.all():
         existing_maintainers[maintainer.email] = maintainer
 
+    all_keywords = {kwd.name: kwd for kwd in Keyword.query.all()}
+
     packages_to_sync = Package.query.filter(Package.last_sync_ts < ts).order_by(Package.last_sync_ts).all()
     print("Going to sync %d packages%s" % (len(packages_to_sync), (" (oldest sync UTC timestamp: %s)" % packages_to_sync[0].last_sync_ts if len(packages_to_sync) else "")))
 
@@ -183,7 +185,7 @@ def sync_versions():
         if 'description' in pkg:
             package.description = pkg['description']
 
-	# 2. refresh maintainers
+        # 2. refresh maintainers
         maintainers = []
         for maint in pkg.get('maintainers', []):
             if 'email' not in maint or 'type' not in maint:
@@ -208,9 +210,41 @@ def sync_versions():
         # Intentionally outside if 'maintainers' in pkg, because if there are no maintainers in JSON, it's falled to maintainer-needed and we need to clean out old maintainer entries
         package.maintainers = maintainers # TODO: Retain order to know who is primary; retain description associated with the maintainership
 
-        # TODO: 3. refresh versions
+        # 3.1. refresh versions
+        pkg_versions = {pkgver.version: pkgver for pkgver in package.versions}
+        for version in pkg['versions']:
+            if version['version'] not in pkg_versions:
+                pkgver = PackageVersion(version=version['version'],
+                                        package=package)
+                db.session.add(pkgver)
+            else:
+                pkgver = pkg_versions[version['version']]
+
+            pkg_keywords = {kwd.name: kwd for kwd in pkgver.keywords}
+
+            # 4.1. synchronize new keywords
+            for keyword in version['keywords']:
+                if keyword in pkg_keywords:
+                    continue
+
+                # TODO: keywords should be initialized earlier to not have to
+                # worry about their existence here
+                if keyword not in all_keywords:
+                    kwd = Keyword(name=keyword)
+                    db.session.add(kwd)
+                    all_keywords[keyword] = kwd
+
+                pkgver.keywords.append(all_keywords[keyword])
+
+            # 4.2. cleanup removed keywords
+            for keyword, kwd_obj in pkg_keywords.items():
+                if keyword not in version['keywords']:
+                    db.session.delete(kwd_obj)
 
-        # TODO: 4. refresh keywords
+        # 3.2 cleanup dead revisions
+        for version, ver_obj in pkg_versions:
+            if version not in pkg['versions']:
+                db.session.delete(ver_obj)
 
         # 5. mark package as refreshed
         package.last_sync_ts = now
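
A reader's observation on step 3.2, assuming pkg['versions'] is a list of dicts as the loop in step 3.1 implies (it reads version['version'] from each entry): the cleanup test compares a version string against those dicts, so the string is never found and every previously stored version would be selected for deletion. The sketch below uses a hypothetical payload and plain strings in place of PackageVersion rows, and shows one possible way to compare by name instead; it is not taken from the repository.

```python
# Hypothetical API payload shaped like the loop in step 3.1 expects.
pkg = {'versions': [{'version': '1.0', 'keywords': ['amd64']},
                    {'version': '1.1', 'keywords': ['~amd64']}]}

# Stand-in for {pkgver.version: pkgver for pkgver in package.versions}.
pkg_versions = {'0.9': 'row-0.9', '1.0': 'row-1.0'}


def dead_versions_naive(pkg, pkg_versions):
    # A str is never equal to a dict, so every stored version looks dead.
    return [v for v in pkg_versions if v not in pkg['versions']]


def dead_versions_by_name(pkg, pkg_versions):
    # Collect the upstream version names first, then compare like with like.
    upstream = {entry['version'] for entry in pkg['versions']}
    return [v for v in pkg_versions if v not in upstream]


print(sorted(dead_versions_naive(pkg, pkg_versions)))
print(sorted(dead_versions_by_name(pkg, pkg_versions)))
```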



* [gentoo-commits] proj/grumpy:master commit in: backend/lib/
@ 2017-01-23  0:06 Gilles Dartiguelongue
  0 siblings, 0 replies; 36+ messages in thread
From: Gilles Dartiguelongue @ 2017-01-23  0:06 UTC (permalink / raw
  To: gentoo-commits

commit:     fe8d9eedef4fa5b406f304c83e064d62860d35df
Author:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
AuthorDate: Mon Jan 23 00:06:46 2017 +0000
Commit:     Gilles Dartiguelongue <eva <AT> gentoo <DOT> org>
CommitDate: Mon Jan 23 00:06:46 2017 +0000
URL:        https://gitweb.gentoo.org/proj/grumpy.git/commit/?id=fe8d9eed

sync: fix a missing .items to iterate on dict

 backend/lib/sync.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/backend/lib/sync.py b/backend/lib/sync.py
index 25b6ea0..c3ed83c 100644
--- a/backend/lib/sync.py
+++ b/backend/lib/sync.py
@@ -242,7 +242,7 @@ def sync_versions():
                     db.session.delete(kwd_obj)
 
         # 3.2 cleanup dead revisions
-        for version, ver_obj in pkg_versions:
+        for version, ver_obj in pkg_versions.items():
             if version not in pkg['versions']:
                 db.session.delete(ver_obj)
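
To illustrate what this fix addresses: iterating a dict directly yields only its keys, so unpacking each item into two names tries to split the key string itself. The sketch below uses plain strings as stand-ins for the PackageVersion objects; the function names are illustrative.

```python
# Stand-in for the version-name -> PackageVersion mapping.
pkg_versions = {'1.0': 'row-1.0', '2.0-r1': 'row-2.0-r1'}


def pairs_without_items(mapping):
    # Iterating the dict yields keys only; unpacking a key such as '1.0'
    # into two names raises ValueError (three characters, two targets).
    return [(version, obj) for version, obj in mapping]


def pairs_with_items(mapping):
    # .items() yields (key, value) pairs, which unpack as intended.
    return [(version, obj) for version, obj in mapping.items()]


try:
    pairs_without_items(pkg_versions)
except ValueError as exc:
    print("unpacking failed:", exc)

print(pairs_with_items(pkg_versions))
```

Note that a key of exactly two characters would unpack silently into its two characters, which is arguably worse than the exception.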
 



Thread overview: 36+ messages
2016-12-04  6:27 [gentoo-commits] proj/grumpy:master commit in: backend/lib/ Mart Raudsepp
  -- strict thread matches above, loose matches on Subject: below --
2017-01-23  0:06 Gilles Dartiguelongue
2017-01-22 17:46 Gilles Dartiguelongue
2017-01-22 17:13 Gilles Dartiguelongue
2017-01-22 12:36 Gilles Dartiguelongue
2017-01-22 12:36 Gilles Dartiguelongue
2017-01-22 12:36 Gilles Dartiguelongue
2017-01-22 12:24 Gilles Dartiguelongue
2017-01-22 12:24 Gilles Dartiguelongue
2017-01-22 12:24 Gilles Dartiguelongue
2017-01-22 12:08 Mart Raudsepp
2017-01-22 12:00 Gilles Dartiguelongue
2017-01-22 12:00 Gilles Dartiguelongue
2017-01-22 11:59 Mart Raudsepp
2017-01-22 11:04 Gilles Dartiguelongue
2017-01-22 11:04 Gilles Dartiguelongue
2016-12-07  7:10 Mart Raudsepp
2016-12-07  4:42 Mart Raudsepp
2016-12-07  2:53 Mart Raudsepp
2016-12-07  2:40 Mart Raudsepp
2016-12-07  2:10 Mart Raudsepp
2016-12-07  1:58 Mart Raudsepp
2016-12-07  1:58 Mart Raudsepp
2016-12-07  0:34 Mart Raudsepp
2016-12-05 17:46 Mart Raudsepp
2016-12-04  8:04 Mart Raudsepp
2016-12-04  8:04 Mart Raudsepp
2016-12-04  7:44 Mart Raudsepp
2016-12-04  6:27 Mart Raudsepp
2016-12-04  5:26 Mart Raudsepp
2016-12-04  5:26 Mart Raudsepp
2016-12-04  4:56 Mart Raudsepp
2016-11-11  1:22 Mart Raudsepp
2016-11-10 15:50 Mart Raudsepp
2016-09-24  7:02 Mart Raudsepp
2016-09-07 20:21 Mart Raudsepp
