Subject: Re: [gentoo-dev] [GLEP] Web Application Installation
From: Stuart Herbert @ 2003-08-04 22:16 UTC
  To: Max Kalika, Troy Dack, gentoo-dev


On Monday 04 August 2003 6:11 pm, Max Kalika wrote:
> Good morning!

Evening, Max.

> See, I don't think that running two (or more) copies of an application is
> supporting virtual hosts.  Upgrading all the files is a pain.  Why not have
> one installation of the core files and multiple installations of the config
> files?  This is, of course, very application specific -- most of them will
> still need database upgrades, but that has to be done by the sysadmin.

Because that won't quite work, unless the app is aware of virtual hosting - 
and many (if not most) aren't.  Think about it.

Here's an example.  Imagine hosting (oh, I don't know) www.iammax.com and 
www.maxisgreat.com on the same physical box.  Document roots for each are 
/var/www/<host>/public_html/ for argument's sake.

Now imagine running phpBB on both domains, to provide separate forums.  As far 
as phpBB is concerned, when it accesses, say, login.php on www.iammax.com, 
the URL is http://www.iammax.com/phpbb/login.php, which translates through as 
/var/www/www.iammax.com/public_html/phpbb/login.php.  Similarly, login.php on 
www.maxisgreat.com translates through as 
/var/www/www.maxisgreat.com/public_html/phpbb/login.php
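
The URL-to-path translation in this example can be sketched in a few lines. This is only an illustration of the mapping described above: the function name and the template constant are made up for the sketch, and the document root layout is the one assumed in the example, not anything a web server mandates.

```python
from pathlib import PurePosixPath

# Assumption: the document root layout used in the example above.
DOCROOT_TEMPLATE = "/var/www/{host}/public_html"


def url_to_path(host: str, url_path: str) -> str:
    """Map a request path on a virtual host to a file under its document root."""
    docroot = PurePosixPath(DOCROOT_TEMPLATE.format(host=host))
    # Strip the leading slash so the path joins below the document root
    # instead of replacing it.
    return str(docroot / url_path.lstrip("/"))


if __name__ == "__main__":
    for host in ("www.iammax.com", "www.maxisgreat.com"):
        print(url_to_path(host, "/phpbb/login.php"))
```

The same request path ends up at a different file for each host, which is why the application sees two separate installations.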

Given this scenario, how do you make these two sites share the same phpBB code 
files?  Here are a few possible ways, and the problems that they cause.  Chip
in with others - because this is the core problem.

a) A bit of mod_alias magic, and you make the /phpbb/ directories aliases for 
/usr/share/webapps/phpBB-<version>/files/.  If you do this, though, how do 
you get each copy of phpBB to use a separate set of configuration files?  
What happens when the app needs write-access to the directories?  The 
directories are on /usr, and we're all agreed that /usr should be mountable 
read-only.  And does every webserver have something like mod_alias in the 
first place?

If I've understood your eclass correctly, this is what it tries to do, yes?

b) A bit of .htaccess magic, and you have the directory structure of the 
webapp on /var, but directives telling PHP where to find the .php files by 
setting the include_path.  The config file problem is solved, because you can 
drop in local config files, and there are real directories that can be made 
writable.  Only problem with this approach is that not every type of web 
server has the equivalent of .htaccess files - and the PHP SAPI for each type 
of web server doesn't necessarily support configuration directives in config 
files either.  And that's before we think about Perl, Python and other ways 
that webapps can be implemented.

I believe that this is Robin's basic idea, if I've understood correctly.  It's 
a neat solution, but perhaps not a universal one.

c) A bit of find and cp -l, and you've got the directory structure of the 
webapp on /var, plus links back to the original on /usr.  Again, the config 
file problem is solved, because local copies can be parachuted in, and again 
there are real directories that can be made writable.  No webserver-specific 
tricks are required - as far as the web server is concerned, each domain has 
its own installation of phpBB.
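
As a rough sketch of option (c), the following mirrors a master copy into a per-domain directory, hard-linking code files and copying the files that each site needs to edit. It is written in Python rather than find and cp -l purely for illustration; link_tree and its local_files parameter are hypothetical names, not part of any eclass. One caveat applies to any hard-link approach: both trees must live on the same filesystem, so os.link fails if the master and the instance are on separate mounts.

```python
import os
import shutil


def link_tree(master: str, instance: str, local_files: frozenset = frozenset()) -> None:
    """Mirror `master` under `instance`: real directories, hard-linked files.

    Relative paths listed in `local_files` (for example config files) are
    copied instead of linked, so each instance can edit them without
    touching the master copy. Hard links need both trees on one filesystem.
    """
    for dirpath, _dirnames, filenames in os.walk(master):
        rel_dir = os.path.relpath(dirpath, master)
        target_dir = os.path.normpath(os.path.join(instance, rel_dir))
        os.makedirs(target_dir, exist_ok=True)  # a real, chmod-able directory
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            rel_file = os.path.normpath(os.path.join(rel_dir, name))
            if rel_file in local_files:
                shutil.copy2(src, dst)  # private, editable copy
            else:
                os.link(src, dst)  # shares the master's inode
```

A call such as link_tree(master, instance, frozenset({"config.php"})) would leave config.php as a site-local file while every other file still points at the shared copy.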

d) ?? There must be other solutions to the problem that haven't been discussed 
in this thread.  Please - contribute!

> How do they handle db updates?  How do they handle config file updates?

I've never asked, and I don't want to know ;-)  Maybe Debian just doesn't 
change that much from year to year <grin>.

> > 1) Your eclass doesn't correctly (as I understand correct usage to be!)
> > support the apache2 USE flag.  Easy enough to do - see my webapp-apache
> > eclass for an example.
>
> Why base it on the flag?  If the webserver is installed and is supported,
> configure for it.

Because that's what USE flags are there for.  If the user puts '-apache2' in 
their USE flags, it's the job of the ebuild to respect that.  Otherwise, the 
ebuild is broken - and probably in breach of policy too.
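
To make the point concrete, here is a simplified sketch of what "respecting the flag" means. It is not how Portage itself evaluates USE (the real logic also stacks profiles and other sources, and handles entries such as -*); use_enabled, apache_target and the "apache1" label are hypothetical names for this illustration only.

```python
def use_enabled(use: str, flag: str, default: bool = False) -> bool:
    """Return whether `flag` is enabled in a space-separated USE string.

    Later entries override earlier ones, and a leading '-' disables a flag.
    Simplification: wildcard entries such as '-*' are not handled.
    """
    enabled = default
    for token in use.split():
        if token == flag:
            enabled = True
        elif token == "-" + flag:
            enabled = False
    return enabled


def apache_target(use: str) -> str:
    """Pick the server to configure from the flag, not from what happens to be installed."""
    return "apache2" if use_enabled(use, "apache2") else "apache1"


if __name__ == "__main__":
    print(apache_target("php -apache2 mysql"))
    print(apache_target("php apache2"))
```

With "-apache2" in the string, the sketch never selects Apache 2, regardless of which servers are present on the box.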

> > 2) Your eclass doesn't specify the permissions that the source files
> > should be  installed under.  Again, easy enough to fix.
>
> This is very straightforward to add.

Agreed.

> > 3) Your eclass doesn't provide support for running multiple copies of the
> > same  app on the same machine.  This is a showstopper.
>
> This whole thing needs a separate discussion.  There's no portage support
> for dealing with more than one installation of the same version of the same
> package.

Why a separate discussion?  If this GLEP isn't here to address this 
fundamental issue, then I'd say that the GLEP has the wrong scope.

You're right - Portage currently can't do all of this on its own.  Perhaps it 
never will be able to.  As I understand it, that's why Robin's volunteered to 
write and maintain additional tools to bridge the gap.

> > 4) Your eclass requires the admin to stop and start Apache as part of the
> > install.  This is a showstopper.  Not every site will want to stop and
> > start  their web server just because they've installed a new app.
> > Imagine a site  hosting hundreds of domains, and having to take them
> > *all* offline at the  same time just because phpBB's been upgraded (for
> > example!).  Robin's idea of  creating .htaccess files under the document
> > root deals with this much more manageably - although I think we're gonna
> > end up using symlinks, as that'll  make it easy to support multiple web
> > servers.
>
> .htaccess files only work in already web-accessible directories (in your
> case, DocumentRoot).  If we're putting applications in /usr/share/webapps,
> the webserver has no idea to look there. Which is what the Alias directive
> does in all the .conf files that are generated.  However, Alias is not
> allowed in .htaccess AFAIK.

See above for why I think .htaccess isn't the way to go anyway ;-)

> http://httpd.apache.org/docs-2.0/mod/mod_alias.html#alias
>
> Besides that, the server doesn't have to be restarted -- just HUPed.

Fair enough ;-)
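
For reference, "HUPing" the server amounts to sending SIGHUP to the parent process named in its pid file. The sketch below assumes a Unix system and a pid file whose location the caller supplies; no particular path is implied, and read_pid and signal_server are hypothetical helper names.

```python
import os
import signal


def read_pid(pidfile: str) -> int:
    """Read the process id stored in a pid file."""
    with open(pidfile) as fh:
        return int(fh.read().strip())


def signal_server(pidfile: str, sig: int = signal.SIGHUP) -> int:
    """Send `sig` (SIGHUP by default) to the process named in `pidfile`.

    Returns the pid that was signalled. Raises OSError if the process does
    not exist or the caller lacks permission to signal it.
    """
    pid = read_pid(pidfile)
    os.kill(pid, sig)
    return pid
```

Passing 0 as the signal only checks that the process exists, which is a cheap way to validate a pid file before sending the real signal.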

> > 6) I *like* the idea that check_php() is in this
> > eclass, because that check is  specific to mod_php under Apache.  I'm
> > gonna steal that and add it to my  webapp-apache eclass ;-)
>
> Steal away!  

Done ;-)

> Keep in mind it is not completely accurate if the per-package
> USE flags go into portage (http://bugs.gentoo.org/show_bug.cgi?id=13616).
> Things in PUSE will also have to be taken into account.  But this is a
> completely different matter and will be easy to integrate.

It's a nice start though.  Ideally, we need a programmatic interface to 
portage, an API we can use to handle these types of queries.  karltk is 
working on one ;-)

> > 7) Your variable names are not generic enough for my liking ;-)
> > AWEB_CFG, for  example, might be better off being WEBAPP_CFG.
>
> Hey!  I originally had them as WEBAPP_* but was told to change the eclass
> to apache-webapp to distinguish from other servers so I wanted the
> variables to reflect the name of the eclass.  I'm as flexible on this as
> playdoh. :-)

/me bites back his comment about the person who told you to do that ;-)

> > 8) Instead of trying to supply an all-encompassing
> > apache-webapp_src_install(),  relying as it does on defining global
> > variables, I'd have supplied a number  of individual functions to do each
> > bit.  Say, a webapp-install-appconfig,  webapp-install-serverconfig each
> > taking parameters (this is off the top of my  head here ;-)  This is a
> > personal preference thing.
>
> That's just more to call from the ebuild.  My goal was to have very small
> ebuilds -- just declare a few variables and inherit the eclass.

Hmm ... there's a tradeoff here.  Smaller ebuilds and less flexible (and 
re-usable) eclasses.  Or larger ebuilds, but more re-usable eclasses.  I 
guess I prefer the latter.

> > 9) If I'm not mistaken, your eclass does nothing to ensure that the
> > webapp can  find the configuration files you've moved into
> > /etc/webapps/$PN/.   Personally, I'm coming to the conclusion that
> > /etc/webapps/$PN/ isn't a good  idea, because again it doesn't support
> > the idea of running multiple copies of  the same app on the same machine.
>
> I'd say we should be shooting for an easy upgrade path first.  

Good point ;-)  But what's the point of installing and upgrading configuration 
files that the application is never going to look at? ;-) (Sorry, it's 22:30, 
and still ridiculously hot here)

> If config
> files aren't stored in /etc/webapps/${PN}, then we need to have a way of
> generating an env.d/${PN} which contains a CONFIG_PROTECT line, otherwise
> we're forcing sysadmins to reconfigure the application at every upgrade.

Yeah, but if each instance of the installed app has its own config files, then 
what's the relevance of /etc/webapps at all?

> And as I mentioned before, there's no way for portage to handle
> multiple-copy installs, so I'm not sure the best way to go about achieving
> this goal.

This is the problem that I think we should be solving - how to support the 
installation, configuration, and upgrading of multiple-copy installs.

> > 10) I'm coming to the conclusion that 'emerge -u <webapp>' shouldn't
> > overwrite  the older version, but should always install alongside, in a
> > different slot,  so that sites can easily run different versions of apps
> > as required.  Perhaps  this should be configurable somehow?  Your eclass
> > doesn't make this possible.
>
> Why should this be handled any differently than the rest of the apps handled by
> portage?  

Because most of the rest of the apps handled by portage don't run in a virtual 
domain environment.  Webapps really are a different beast.

> If a sysadmin doesn't want older versions removed, just add
> AUTOCLEAN=no to make.conf.  Granted, the current eclass doesn't use the
> full ${PF} in the target directory, but that is easily changed.

Here's an example.  Imagine you're running your own hosting firm, and you have 
a non-trivial number of customers using the same webapp (say, phpBB for 
argument's sake).  A new version of phpBB comes out.  Some customers will ask 
for the upgrade, and some explicitly will ask you not to upgrade.  So, in 
this situation, you need to have two copies of phpBB installed on the same 
box at the same time.

Now let's look at what happens when you run 'emerge -u phpBB', with the 
appropriate ACCEPT_KEYWORDS of course.  Portage goes and installs the new 
version of phpBB over the top of the old phpBB files.  The old version of 
phpBB gets overwritten, yes?  I don't see how AUTOCLEAN will prevent that 
from happening.

The whole point of SLOTing apps (as I understand it) is to allow you to have 
multiple versions installed alongside each other.  This is the mechanism that 
Portage offers us.  

> Ok.  Hopefully we can get this thing hashed out and out the door so things
> can start getting into shape soon.

Hell yes.

> > 1) Apache1/Apache2 conundrum
> >
> > My eclass uses the detection technique adopted for mod_php, and no-one
> > has  complained about that.  If this eclass is invalid, then so's the
> > ebuild for  mod_php.  And I don't think it is.
>
> Ok, this one is a keeper.

Kudos to Robin - it's his algorithm.

> > 2) Support for multiple DocumentRoot configurations, and also
> > 3) Binary packages installing on machines with different DocumentRoot
> > values
> >
> > Until the GLEP is firmed up and approved, we don't have an agreed
> > solution to  implement.
>
> Ah!  Ok, so before we have these issues ironed out, the current status quo
> will just be maintained.  Fair enough.

Agreed.

> > Please excuse me, but I don't want to put support for existing ebuilds on
> > hold  while we debate the GLEP.  I believe that we *have* to continue
> > support until  we're ready and able to switch.  Stopping maintenance
> > activities is *not* an  option.
>
> Since I'm the laziest person I know, I didn't want to do the work twice.
> I'll just hold off on introducing more ebuilds which will have to be
> converted later. :-)

That's up to you.  If you're maintaining webapp ebuilds currently in portage, 
though, I'd urge you not to stop maintaining them just because you're waiting 
for a design solution via this GLEP.

> > 4) Which user/group to use
> >
> > My class uses Robin's suggestion, and assumes that Apache is running with
> > the  default settings of apache.apache.
>
> Right, but as we already agreed, not all apps need to have their files
> owned by apache:apache.  This should be configurable in the eclass.
> Correct?

Do we need a new user/group to own most of the files?  And then we just make 
the files that need write-access owned by the webserver?
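
One way to picture that split: most files belong to a dedicated non-server account, and only the paths that need write access belong to the webserver user. The sketch below only computes such a plan and does not change ownership on disk; the "webapp" account is hypothetical, and "apache" is taken from the default mentioned above.

```python
def ownership_plan(paths, writable_dirs, code_owner="webapp", server_owner="apache"):
    """Map each path to the account that should own it.

    A path goes to `server_owner` if it is one of `writable_dirs` or lies
    beneath one; everything else goes to `code_owner`. Both account names
    are assumptions for this sketch.
    """
    def needs_write(path: str) -> bool:
        return any(
            path == d or path.startswith(d.rstrip("/") + "/")
            for d in writable_dirs
        )

    return {p: (server_owner if needs_write(p) else code_owner) for p in paths}


if __name__ == "__main__":
    plan = ownership_plan(
        ["phpbb/login.php", "phpbb/cache", "phpbb/cache/session.tmp"],
        ["phpbb/cache"],
    )
    for path, owner in plan.items():
        print(f"{owner:8} {path}")
```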

> > 5) DocumentRoot pointing to a read-only mount
> >
> > As far as I'm concerned, that's like trying to do 'make bzlilo' with
> > /boot not  mounted, or run an 'emerge' with /usr mounted read-only.  It's
> > the sysadmin's  job to make sure that any necessary filesystems are
> > mounted read/write before  an installation is attempted.  This is not a
> > problem unique to web  applications.
>
> The issue is not whether DocumentRoot is a read-only mount during the
> install, but during the day-to-day operations.  Running 'emerge' with /usr
> mounted read-only is not the same as having apache running a webapp that
> needs to write to /usr -- one is done seldom, the other is all the time.

In his email in that thread you pointed me at, Robin was explicitly talking 
about /var being a read-only NFS mount at install time.

> > 6) "it's weak"
> >
> > That's not an argument, it's an opinion ;-)  Anyway, I've taken 5-10
> > lines of  broken and incorrectly duplicated code from a number of
> > ebuilds, and moved  them all into one place where they can be re-used and
> > maintained for now.   Reduced defects is a strong argument, not a weak
> > one.
>
> I never made the "weak" argument 

Again, this is in response to the comments in the archived thread - not this 
one ;-)  You didn't make the comment - someone else did.

> -- I don't believe in it myself.  Things
> need to be justified a bit better than that, so I'm completely with you on
> this one.

:)

> Ok, agreed.  Troy, is there any chance for another draft with some of these
> things incorporated?  (Thanks a bezillion, btw, for putting up with me) :-)

Who's "putting up" with you?  I don't feel like I am!  I'm just grateful that 
we're both interested in finding a solution to this problem.  Now, if just a 
few more people would chip in and help these discussions ... ;-)

> And a successful one at that from what I saw in my emerge --sync this
> morning. :-)  Thanks!

By the time I got there Saturday evening, it was all but over.  Didn't see a single 
person in #gentoo-bugs who I could help :(

> If you say it is very easy to do, I'm on board.  

If it's not easy to do, then we'll scrap it and come up with something better.

> I can't personally speak
> for other webservers, so I'll leave the decision of whether/how to support
> others to those with the experience.  

Fair enough.

> So in any case, we have to pull the
> apache-specific things out into a separate framework.  Therefore things
> like DocumentRoot can't even be considered.  So a central location for
> webapps must be once again taken into account.  

Yep.  How does this sound as the design of the central layout?  Let's agree a 
design, so that it can be added to the GLEP.

* /usr/webapps/<app-name> as the main directory.  
* /usr/webapps/<app-name>/public_html/ for files served by the web server
* /usr/webapps/<app-name>/cgi-bin/ for CGI-BIN files
* /etc/webapps/<app-name>/ to hold the box-default config files
* <app-name> is ${PN} for non-slotted packages
* <app-name> is ${P} for slotted packages
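
A small sketch of the layout proposed in the list above, to show how the paths would fall out for slotted and non-slotted packages. WebappLayout and webapp_layout are hypothetical names, and the version number in the usage line is only an example; the one Portage fact relied on is that ${P} is ${PN}-${PV}.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebappLayout:
    root: str
    public_html: str
    cgi_bin: str
    config: str


def webapp_layout(pn: str, pv: str, slotted: bool) -> WebappLayout:
    """Compute the proposed directories for one package.

    Slotted packages use ${P} (that is, ${PN}-${PV}) as the app name so that
    versions can sit side by side; non-slotted packages use ${PN} alone.
    """
    app_name = f"{pn}-{pv}" if slotted else pn
    root = f"/usr/webapps/{app_name}"
    return WebappLayout(
        root=root,
        public_html=f"{root}/public_html",
        cgi_bin=f"{root}/cgi-bin",
        config=f"/etc/webapps/{app_name}",
    )


if __name__ == "__main__":
    print(webapp_layout("phpBB", "2.0.5", slotted=True))
    print(webapp_layout("phpBB", "2.0.5", slotted=False))
```

Under this scheme two slotted versions never share a directory, while a non-slotted upgrade lands in the same place as the version it replaces.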

> ok, I didn't realize that it wasn't a permanent solution.  All is ok.

Neat.

> >> How is mod_php related to the way applications are installed?
> >
> > Erm, how about the whole 'do I use Apache 1 or Apache 2' conundrum?  See
> > the  mod_php ebuild for details.
>
> Whichever is installed gets configured.  Either or both get touched,
> depending on what is detected.  It is up to the sysadmin to start one or
> the other automatically with rc-update.  I don't see a problem if both are
> configured (if detected), but only one is running.

Yeah, but as discussed earlier: 

a) you need a standard way of detecting which one is installed, and 
b) you need to honour the 'apache2' USE flag

Take care,
Stu
-- 
Stuart Herbert                                              stuart@gentoo.org
Gentoo Developer                                       http://www.gentoo.org/
Beta packages for download            http://dev.gentoo.org/~stuart/packages/

GnuPG key id# F9AFC57C available from http://pgp.mit.edu
Key fingerprint = 31FB 50D4 1F88 E227 F319  C549 0C2F 80BA F9AF C57C