* [gentoo-dev] RFP: System to account users configurations
@ 2002-06-16 20:16 Rufiao
From: Rufiao @ 2002-06-16 20:16 UTC
To: gentoo-dev
As stated in bug #3778 (http://bugs.gentoo.org/show_bug.cgi?id=3778):

1. Rationale

This system is inspired by Debian's popularity-contest package
(http://packages.debian.org/stable/misc/popularity-contest.html) with some
important enhancements. The key idea is to give the Gentoo community a way
to keep track of the most used packages, hardware configurations, kernel
versions, compile flags and profiles. Additionally, this system aims to
provide the following advantages:

- Allow the creation of CD layouts which include the most used packages for
  each profile
- Allow the developers to investigate the most used configurations and focus
  on them when setting priorities, documentation, standard kernel
  configurations, etc.
- Give some figures about the number of active users in the community

2. Description

The system comprises 2 subsystems:

- A client-side system that runs periodically through cron to grab information
  from users' configurations and post it to the server system through HTTP.
  This system does not require any user intervention beyond the initial
  configuration.
- A server-side system running on the gentoo.org domain, capable of receiving
  the information provided by the users, storing it in a database and
  creating statistics from it. It also provides a web front-end to query the
  database.
The following information will be processed by the system:
- Packages installed, including their versions (as in
`qpkg -nc -I -v` from the gentoolkit package)
- Flags in make.conf (as in
`egrep "^(USE|CHOST|CFLAGS|CXXFLAGS)" /etc/make.conf`)
- CPU info (as in `egrep "^(model name|cpu MHz)" /proc/cpuinfo`)
- System memory (as in `egrep "^MemTotal:" /proc/meminfo`)
- PCI devices (as in
`lspci | colrm 1 8 | sed 's/\(.*\)(.*/\1/'` from the pciutils package)
- USB devices (as in `lsusb | grep iProduct | colrm 1 28` from the
usbutils package)
- Kernel version (as in `uname -r`)
- Profile being used (as in
`ls -ld /etc/make.profile | awk '{print $NF}' | awk -F/ '{print $NF}'`)
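
As a rough illustration of the list above, here is a minimal Python sketch of how a client might gather some of these items. It only mirrors the three `egrep` filters; the function names (`grep_prefix`, `collect`) and the shape of the returned dictionary are assumptions made for this example and are not part of the proposal.

```python
import re


def grep_prefix(text, prefixes):
    """Return the lines of text that start with one of the given prefixes.

    This imitates egrep "^(A|B|C)" on the contents of a file.
    """
    pattern = re.compile("^(" + "|".join(re.escape(p) for p in prefixes) + ")")
    return [line for line in text.splitlines() if pattern.search(line)]


def collect(make_conf, cpuinfo, meminfo, kernel):
    """Build one report from the raw text of the files named in the list.

    The dictionary layout is hypothetical; the proposal does not define one.
    """
    return {
        "flags": grep_prefix(make_conf, ["USE", "CHOST", "CFLAGS", "CXXFLAGS"]),
        "cpu": grep_prefix(cpuinfo, ["model name", "cpu MHz"]),
        "memory": grep_prefix(meminfo, ["MemTotal:"]),
        "kernel": kernel,
    }
```

On a real box the three texts would be read from /etc/make.conf, /proc/cpuinfo and /proc/meminfo, and the kernel version could come from `platform.release()`. The package, PCI and USB lists would still have to come from the external tools named above.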
On the client side, the procedure to provide data to the system is the
following:

- The user emerges the package, which:
  - Sets a crontab entry to let the system run periodically, possibly
    requiring user intervention to specify when the system should run
  - Points to a URL (in the gentoo.org domain) for signup
- The user goes to the provided URL, which asks for the user's e-mail
  address and asks the user to transcribe a random 4-letter message, shown
  as an image, into a text box. These requirements are used to ensure, as
  far as possible, the authenticity of the data and to avoid automated
  signups
- The server-side system will e-mail the user a key, which must be
  placed in the config file
- To post the information to the server-side system, the client-side system
  can use the proxy settings defined in /etc/make.conf
- When the server-side system receives the first set of data, it will e-mail
  a message to the user to let him know the system is running fine

Note that it is not guaranteed the system will have internet connectivity
when it gets run. In this case, it may keep periodically checking in the
background for a route to the server.
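
The "keep checking for a route" behaviour could look roughly like the Python sketch below. It is only an illustration under assumptions: `send` is a hypothetical callable that performs the HTTP POST and raises `OSError` when the server cannot be reached, and the number of attempts and the delay are made-up defaults.

```python
import time


def post_with_retry(send, payload, attempts=5, delay=60.0, sleep=time.sleep):
    """Call send(payload) until it succeeds or the attempts run out.

    send is a hypothetical function supplied by the caller; it is expected
    to raise OSError (for example, no route to host) on failure.
    Returns True if the data was posted, False otherwise.
    """
    for attempt in range(attempts):
        try:
            send(payload)
            return True
        except OSError:
            # Wait before the next try, but not after the last one.
            if attempt + 1 < attempts:
                sleep(delay)
    return False
```

One way to write `send` would be on top of `urllib.request`, whose `URLError` is a subclass of `OSError`, with a `ProxyHandler` fed from the proxy settings in /etc/make.conf. That part is left out here because the proposal does not fix the wire format.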
The following variables can be set in the config file:

- Key: as discussed above
- Acknowledge flag: send an e-mail to the user every time a set of data from
  him is processed (defaults to false)
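
For illustration only, the two settings could be read with something like the Python sketch below. The file syntax (`name = value` lines with `#` comments) and the setting names `key` and `acknowledge` are assumptions; the proposal does not specify a config format.

```python
def parse_config(text):
    """Parse a hypothetical "name = value" config file.

    Unknown names are ignored. The acknowledge flag defaults to False,
    as stated in the proposal.
    """
    settings = {"key": None, "acknowledge": False}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        name = name.lower()
        if name == "key":
            settings["key"] = value
        elif name == "acknowledge":
            settings["acknowledge"] = value.lower() in ("1", "yes", "true", "on")
    return settings
```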
* Re: [gentoo-dev] RFP: System to account users configurations
@ 2002-06-16 21:12 Faust Tanasescu
2002-06-16 23:11 ` Rufiao
From: Faust Tanasescu @ 2002-06-16 21:12 UTC
To: gentoo-dev
>From: Rufiao <rufiao@gmx.net>
>Reply-To: gentoo-dev@gentoo.org
>To: gentoo-dev@gentoo.org
>Subject: [gentoo-dev] RFP: System to account users configurations
>Date: Sun, 16 Jun 2002 17:16:21 -0300
>[...]
>
>On the client side, the procedure to provide data to the system is the
>following:
>
>- The user emerges the package, which:
>  - Sets a crontab entry to let the system run periodically, possibly
>    requiring user intervention to specify when the system should run
>  - Points to a URL (in the gentoo.org domain) for signup
>- The user goes to the provided URL, which asks for the user's e-mail
>  address and asks the user to transcribe a random 4-letter message, shown
>  as an image, into a text box. These requirements are used to ensure, as
>  far as possible, the authenticity of the data and to avoid automated
>  signups

Users are required to 1) want to participate in this survey, 2) say when
the information grab should run, 3) go to a URL to subscribe to the service,
4) get a magic key from the server, 5) set up the client system and 6) check
that it runs well.

We don't have many users, and the setup is too complicated for my taste for
something that brings nothing to me as a Gentoo user. And we want people to
use this; the more, the better.
I don't know about this, but as a Gentoo user, if a system like this were
available I would not bother installing it. It is way too lengthy and I get
nothing out of it as an individual.

I propose making this whole process a lot simpler for the client. What we
must keep in mind is that no system is perfect, and we should not fall into
paranoia. I therefore propose shortening the setup of this survey system to
something smaller:

1) The user is required to emerge the package.
2) They are asked when the collection should run.

And that's it.

Now, how to keep people from abusing this system is a whole new question,
and I think we should treat it separately. However, I'd like to propose
something as well.

It's the server's duty to protect itself from idiots. When a client connects
to the server to upload its information file, the server sends the client a
unique key that expires after a week or a couple of days, depending on how
often we want input. If the client tries to send input again, it could of
course remove the key file and claim it's new to the service; that's why the
submitter's IP address needs to be recorded for first-time users as well.
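
One possible reading of this idea, sketched in Python: a submission is accepted only if it comes with an expired key, or with no key from an IP address that has not claimed to be new recently. The class name `KeyRegistry`, the one-week lifetime and the in-memory dictionaries are all assumptions made for the example; a real server would need persistent storage.

```python
import secrets
import time


class KeyRegistry:
    """Hands out expiring keys and remembers the IPs of first-time clients."""

    def __init__(self, lifetime=7 * 24 * 3600, clock=time.time):
        self.lifetime = lifetime   # seconds a key stays valid (assumed: 1 week)
        self.clock = clock         # injectable clock, handy for testing
        self.issued = {}           # key -> expiry timestamp
        self.first_time_ips = {}   # ip -> time of last key-less submission

    def submit(self, ip, key=None):
        """Return a fresh key if the submission is accepted, else None."""
        now = self.clock()
        if key is not None:
            expiry = self.issued.get(key)
            if expiry is None or now < expiry:
                return None        # unknown key, or already sent this period
            del self.issued[key]
        else:
            last = self.first_time_ips.get(ip)
            if last is not None and now < last + self.lifetime:
                return None        # "new" client from a recently seen IP
            self.first_time_ips[ip] = now
        new_key = secrets.token_hex(16)
        self.issued[new_key] = now + self.lifetime
        return new_key
```

A client that deletes its key file and resubmits from the same address is turned away, while one that waits for the key to expire gets a new key.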

Of course the system is not perfect... the idiot could change his IP address,
no problem... he could disconnect from and reconnect to his ISP or something
similar, but that would be really stupid. I don't think that many people
would actually attempt that.

I think that if anyone ever attempts this, it will be because our user base
has grown very, very large, and his impact on our system would be minimal.

This is just an idea... I'm sure there are better ones.
* Re: [gentoo-dev] RFP: System to account users configurations
2002-06-16 21:12 Faust Tanasescu
@ 2002-06-16 23:11 ` Rufiao
2002-06-18 10:37 ` George Shapovalov
From: Rufiao @ 2002-06-16 23:11 UTC
To: gentoo-dev
The abuse of this kind of system should be taken into account, since it may be quite easy for someone to create a bot (or whatever) capable of feeding the system with fake data and, as a consequence, destroying its reputation.
However, I agree this issue should not complicate the system setup. There are problems with the approach I've described, in particular for users who maintain more than a couple of Gentoo boxes (it may be inconvenient even for people who run more than one machine, because one key is needed per machine).
Debian's popularity-contest uses SMTP as its transport, both to avoid the need for a constant internet connection and to have some means to ensure the identity of every contributing machine. I'm not sure SMTP can help with the identification of users at all, and it may complicate the setup even more for users who don't have a local MTA spool set up (and who want to participate but don't have constant connectivity), so I've discarded it.
Also, using the machine's IP addresses as a measure of abuse (by investigating how many posts occur for a given address) may lead to bad results, since some users have more than one machine under a 1:n NAT.
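
To make the NAT concern concrete, here is a small Python sketch of a naive per-address counter. The function name, the `(ip, machine_id)` tuples and the limit are hypothetical; the addresses are documentation-range examples.

```python
from collections import Counter


def suspicious_addresses(posts, limit):
    """Return the IPs that sent more than limit posts.

    posts is an iterable of (ip, machine_id) pairs; machine_id is only
    there to show which box each post really came from.
    """
    counts = Counter(ip for ip, _machine in posts)
    return sorted(ip for ip, n in counts.items() if n > limit)


# Three honest machines behind one NAT address, plus one standalone box.
posts = [
    ("203.0.113.7", "desktop"),
    ("203.0.113.7", "laptop"),
    ("203.0.113.7", "server"),
    ("198.51.100.2", "standalone"),
]
flagged = suspicious_addresses(posts, limit=2)
```

With a limit of 2, the shared address is flagged even though each machine posted exactly once, which is the kind of bad result described above.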
In the end, it may be better to simply avoid the signup and use a 'loose' approach, which is to ask for the user's e-mail address and use it only when abuse is detected (of course a 'bad' user could provide a fake e-mail address, but in this case, after the detection of abuse and an unsuccessful attempt to contact the user, all the data he provides can be set to be automatically rejected by the server-side system).
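
A minimal Python sketch of that 'loose' approach, under stated assumptions: the class name `Intake`, the in-memory storage and the choice to also drop data already received from a rejected address are all illustrative, not something the proposal specifies.

```python
class Intake:
    """Accepts reports tagged with an e-mail address, minus rejected senders."""

    def __init__(self):
        self.rejected = set()   # normalized addresses whose data is refused
        self.accepted = []      # (email, data) pairs kept for statistics

    @staticmethod
    def _normalize(email):
        return email.strip().lower()

    def reject_sender(self, email):
        """Refuse future data from this address and drop what was stored."""
        address = self._normalize(email)
        self.rejected.add(address)
        self.accepted = [
            (e, d) for e, d in self.accepted if self._normalize(e) != address
        ]

    def receive(self, email, data):
        """Store the report unless the sender has been rejected."""
        if self._normalize(email) in self.rejected:
            return False
        self.accepted.append((email, data))
        return True
```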
But there may well be a better approach to this whole problem. Any thoughts?
* Re: [gentoo-dev] RFP: System to account users configurations
@ 2002-06-17 0:01 Faust Tanasescu
2002-06-17 0:12 ` Rufiao
From: Faust Tanasescu @ 2002-06-17 0:01 UTC
To: gentoo-dev
I'm thinking of lots of glue, a Perl script for the client and an HTTPS
server on gentoo.org to allow SSL (Secure Sockets Layer) communication
between client and server. It's a fresh approach to solve just this
problem... Well, fresh is relative here ;)
Here's a link:
http://developer.netscape.com/docs/manuals/security/sslin/contents.htm
* Re: [gentoo-dev] RFP: System to account users configurations
2002-06-17 0:01 Faust Tanasescu
@ 2002-06-17 0:12 ` Rufiao
From: Rufiao @ 2002-06-17 0:12 UTC
To: gentoo-dev
Using HTTPS is not a big deal, but how would it help with this problem?
* Re: [gentoo-dev] RFP: System to account users configurations
2002-06-16 23:11 ` Rufiao
@ 2002-06-18 10:37 ` George Shapovalov
From: George Shapovalov @ 2002-06-18 10:37 UTC
To: gentoo-dev
Hi guys.
Nice to see voting/user feedback discussed here!
I spent some time a few months ago thinking about a similar issue. I would
like to point out this link
http://www.its.caltech.edu/~georges/gentoo/epsp/vote0.html
where I present a few thoughts about a possible voting system.
I should immediately note that I was "designing" (well, the "0" in the file
name should give you an idea about its status :)) that system for a very
specific purpose - to enhance quality control over ebuilds by allowing all
users to cast their votes indicating ebuild stability and (optionally)
popularity.
Accumulating additional information may provide nice statistics. However, I
would like to add one "feature request" to Rufiao's proposal: namely, the
ability to configure what kinds of information are collected and sent
upstream.
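
A rough Python sketch of what such a setting could mean on the client: only the collectors the user has enabled are ever run. The collector names and the `build_report` function are hypothetical, chosen only to match the kinds of data listed in the original proposal.

```python
def build_report(collectors, enabled):
    """Run only the collectors whose names appear in enabled.

    collectors maps a name to a zero-argument function returning the data;
    enabled is the set of names the user agreed to send upstream.
    Disabled collectors are never called, so their data is never gathered.
    """
    return {name: fn() for name, fn in collectors.items() if name in enabled}


# Stand-in collectors; real ones would call qpkg, lspci, uname and so on.
collectors = {
    "kernel": lambda: "2.4.19",
    "packages": lambda: ["vim-6.1"],
    "pci": lambda: ["VGA compatible controller"],
}
report = build_report(collectors, enabled={"kernel", "packages"})
```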
As I see it, there are two possible approaches to the design of a voting
system:
1. active system - passive users
2. passive system - active users
In reality any reasonable voting system should include elements of both; the
question is more about proportions :). In that text I was leaning more
towards the second option. You will find a few of the arguments behind my
thinking there. Nonetheless, the system I ended up describing seems to be
very similar to the one proposed by Rufiao :), including the concerns about
the use of IPs for identification and the requirement to register, though
the overall procedure looks a bit simpler.
Along the same lines, there are two "boundary" positions WRT how much
information is collected and processed:
1. Collect info about individual systems in a central location and use that
to build statistics (pretty much necessary for the 1st approach).
2. Only keep statistical info centrally and update it when a user votes.
This may play well with the 2nd approach if done correctly.
As was pointed out, the second position raises abuse concerns. However, I
would still prefer such an approach if some care could turn up a reasonably
secure model. At least it would be worth trying it as a first
implementation, as it is much lighter on resources and easier to implement.
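
To show what the second position amounts to, here is a minimal Python sketch of a server that keeps only running totals. The class name `AggregateStats` and its methods are assumptions for illustration; nothing here comes from the linked design.

```python
from collections import Counter


class AggregateStats:
    """Keeps per-package totals only; no per-system record is stored."""

    def __init__(self):
        self.package_counts = Counter()
        self.submissions = 0

    def record_vote(self, packages):
        """Add one vote; a package counts once per vote even if repeated."""
        self.package_counts.update(set(packages))
        self.submissions += 1

    def most_used(self, n):
        """Return the n most common packages as (name, count) pairs."""
        return self.package_counts.most_common(n)


stats = AggregateStats()
stats.record_vote(["vim", "gcc"])
stats.record_vote(["gcc", "gcc"])
```

Because nothing ties a total back to the system that contributed it, the server cannot tell a repeated vote from a new one, which is where the abuse concern comes from.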
Sorry about this rough posting, I just wanted to bring up that link in case
you are able to find anything useful there :). I will try to get back to
this topic and maybe write something more detailed :).
George