* Re: [gentoo-dev] packages.gentoo.org lives!
From: Robin H. Johnson @ 2007-11-30 20:00 UTC
To: gentoo-dev
On Fri, Nov 30, 2007 at 10:11:31AM +0100, Jan Kundrát wrote:
> > - See also RFC1738: 'Within the <path> and <searchpart> components, "/",
> > ";", "?" are reserved.'
> My copy of RFC1738 says (end of section 2.2):
...
> I wasn't able to find your quote in that file.
My quote was from the first sentence of RFC1738, sec 3.3 (HTTP), para 4.
> What is source of your definition of "valid query argument separator"?
<searchpart> is also better defined in RFC2396, section 3.4, where it is
called the query component:
  Within a query component, the characters ";", "/", "?", ":", "@",
  "&", "=", "+", ",", and "$" are reserved.
They are reserved because they have special meanings.
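
To make the reserved set concrete, here is a minimal Python sketch of percent-encoding a value before it goes into a query component. The names `RESERVED_IN_QUERY` and `encode_query_value` and the sample value are illustrative assumptions, not code from the site; `urllib.parse.quote` is the standard-library call.

```python
from urllib.parse import quote

# Reserved characters within a query component, per RFC 2396, section 3.4.
RESERVED_IN_QUERY = ';/?:@&=+,$'

def encode_query_value(value):
    """Percent-encode a value so no reserved character keeps its special meaning."""
    # safe="" makes quote() escape "/" too, which it leaves alone by default.
    return quote(value, safe="")

print(encode_query_value("x86 stable;a&b"))  # x86%20stable%3Ba%26b
```

Once encoded, a ";" or "&" inside a value can no longer be mistaken for an argument separator.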
> > - Having a single valid URL for a given resource greatly improves cache
> > hit rates (and we do use caching heavily on the new site, 60% hit rate
> > at the moment, see further down as well).
> Redirecting clients to new URLs would give you perfect caching as well.
That's why I say I'm willing to do redirection at the cache level.
I do NOT want lots of users with old links to hit the actual web application
if it's just going to redirect all of them to a page that is already in the
cache.
> > - The old parsing and variable usage code was the source of multiple
> > bugs as well as the security issue that shuttered the site.
> Only because it passed the raw, unescaped values directly to shell,
> which is of course badly broken.
Have a look at the recent discussion about HTML5 issues
(http://www.crockford.com/html/), which also applies to web applications:
"HTML 5 is strict in the formulation of HTML entities. In the past, some
browsers have been too forgiving of malformed entities, exposing users to
security exploits. Browsers should not perform heroics to try to make bad
content displayable. Such heroics result in security vulnerabilities."
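
Applied to a web application, the same principle means rejecting malformed input outright instead of trying to repair it. The sketch below is only an illustration: the function name `parse_package_spec` and the regular expression are assumptions, and the pattern is not Gentoo's official naming rule.

```python
import re

# Assumed shape of a package spec: optional category, then package name.
# The pattern is illustrative, not Gentoo's official naming rules.
_SPEC = re.compile(r"(?:([a-z0-9][a-z0-9+_.-]*)/)?([A-Za-z0-9][A-Za-z0-9+_.-]*)")

def parse_package_spec(spec):
    """Return (category, name), or raise ValueError. Never tries to repair input."""
    match = _SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"malformed package spec: {spec!r}")
    return match.group(1), match.group(2)

print(parse_package_spec("app-editors/vim"))  # ('app-editors', 'vim')
```

A value such as "vim; rm -rf /" raises `ValueError` here, so it never reaches any later processing step.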
> > - I _want_ old sites to change to using the new form, which I do
> > advertise as being permanent resource URLs (as well as being much
> > easier to construct, take any "[CAT/]PN[-PF]" and slap it onto the
> > base URL, and you are done).
> Which isn't a reason for breaking old links, IMHO.
Visitors to the old /ebuilds/ or /packages/ links get a redirect to the
frontpage. While that isn't the content they were after, it's fine for helping
them find it.
> > That said, if somebody wants to point me to something decent so that
> > Squid can rewrite the URLs WITH the query parameters (the built-in squid
> > stuff seems to ignore them) and hit the cache, and that can add a big
> > warning at the top of the page, I'd be happy to use it for a transition
> > period, just like the RSS URLs (which are redirected until January 2008,
> > but only because they are automated, and not browsed by humans).
> Now that's something that sound reasonable. Why limit the period and
> don't provide it forever?
Time limited to force everybody to move over, and to not have to support
the redirections for the old version of the site forever, when they
weren't advertised as permanent URLs.
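
For the transition period, the rewrite itself could look roughly like the Python sketch below. Everything specific in it is an assumption made for illustration: the old URL shape (`/packages/?category=...;name=...`), the new base URL, and the names `NEW_BASE` and `rewrite_old_url`. A helper of the kind Squid calls through its `url_rewrite_program` directive could wrap this function; the helper protocol differs between Squid versions and is left out here.

```python
import re
from urllib.parse import unquote, urlsplit

# Assumption: new-style package pages live under this base URL.
NEW_BASE = "http://packages.gentoo.org/package/"
# Conservative whitelist for category and package names (illustrative).
_SAFE = re.compile(r"[A-Za-z0-9][A-Za-z0-9+_.-]*")

def rewrite_old_url(url):
    """Map an assumed old-style /packages/ URL to the new form, else return None."""
    parts = urlsplit(url)
    if parts.path.rstrip("/") != "/packages":
        return None
    params = {}
    # Accept both ";" and "&" as query argument separators.
    for pair in re.split(r"[;&]", parts.query):
        key, sep, value = pair.partition("=")
        if sep:
            params[unquote(key)] = unquote(value)
    category, name = params.get("category"), params.get("name")
    # Reject anything outside the whitelist instead of passing it along.
    if not name or not _SAFE.fullmatch(name):
        return None
    if category is None:
        return NEW_BASE + name
    if not _SAFE.fullmatch(category):
        return None
    return f"{NEW_BASE}{category}/{name}"

print(rewrite_old_url(
    "http://packages.gentoo.org/packages/?category=app-editors;name=vim"))
```

Because the query parameters are folded into a single canonical path, every old link for a given package maps to the same cacheable URL.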
I hacked up some quick statistics, and I see that only 6.7% (5001 out of
(69434+5001)) of the overall page loads were arriving at the old locations and
not receiving the content they were originally interested in.
Based on these stats, I'd say we are already doing well in getting users to
update their links for the new site, since it's been up for 2
weeks now.
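
The percentage can be checked with a few lines of Python. The variable names are illustrative; the two counts are the totals quoted in this message.

```python
old_url_hits = 5001       # errors or general redirects for old URLs
data_page_loads = 69434   # successful loads of data pages

share = old_url_hits / (data_page_loads + old_url_hits)
print(f"{share:.1%}")  # 6.7%
```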
Successful page loads (2xx, 304), by section, for November 29th:
   60 /verbump
  114 /newpackage
  167 /faq
  645 /robots.txt
  779 /categories
 1037 /arch
 2348 /category
 3329 /favicon.ico
 9084 /
 9292 /media
20491 /package
35354 /feed
-----------------------------
69434 Total of data pages (no robots, css, images, favicon)
13266 Total of robots, images, favicon.
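
The two totals can be reproduced from the per-section counts. In this Python sketch the variable names are illustrative, and treating /media as the css/images bucket is an inference from the totals rather than something stated in the message.

```python
successful = {
    "/verbump": 60, "/newpackage": 114, "/faq": 167, "/robots.txt": 645,
    "/categories": 779, "/arch": 1037, "/category": 2348,
    "/favicon.ico": 3329, "/": 9084, "/media": 9292,
    "/package": 20491, "/feed": 35354,
}
# Assumption: these three sections are the "robots, images, favicon" group.
STATIC = {"/robots.txt", "/favicon.ico", "/media"}

data_total = sum(n for section, n in successful.items() if section not in STATIC)
static_total = sum(n for section, n in successful.items() if section in STATIC)
print(data_total, static_total)  # 69434 13266
```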
Failed page loads (4xx, 5xx, 3xx excluding 304), by section and code, for
November 29th. The slew of 404 codes for PHP exploits is excluded, and the rest
are grouped by how they were handled:
- Specific redirect for usage of an old RSS path:
   25 /feed 301
   91 /archs 301
- Redirected because requested object not found (invalid package, etc):
   25 /arch 302
   30 /category 302
   44 /feed 406
  164 /feed 302
  632 /package 302
- Error or general redirect for an old URL:
   11 /similar 404
   22 /main 404
   24 ///x86%20stable 404
   44 /daily 404
  222 /search 404
  347 /images 404 (excluded from total)
 2096 /ebuilds 302
 2582 /packages 302
-----------------------------
 5001 Total for this last group (no images)
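
Summing each group separately shows that the 5001 figure covers only the last group, without /images. The Python sketch below uses illustrative variable names; the counts are copied from the lists above.

```python
old_rss_redirects = [25, 91]                       # /feed 301, /archs 301
not_found_redirects = [25, 30, 44, 164, 632]       # object not found
old_url_errors = [11, 22, 24, 44, 222, 2096, 2582] # old URLs, /images left out
images_404 = 347                                   # excluded from the total

print(sum(old_rss_redirects))    # 116
print(sum(not_found_redirects))  # 895
print(sum(old_url_errors))       # 5001
```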
--
Robin Hugh Johnson
Gentoo Linux Developer & Infra Guy
E-Mail : robbat2@gentoo.org
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
2007-11-14 2:58 [gentoo-dev] packages.gentoo.org lives! Robin H. Johnson
2007-11-29 15:20 ` Mike Frysinger
2007-11-29 18:33 ` Robin H. Johnson
2007-11-30 9:11 ` Jan Kundrát
2007-11-30 20:00 99% ` Robin H. Johnson