From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80]) by finch.gentoo.org (Postfix) with ESMTP id 28C441387FD for ; Fri, 6 Jun 2014 19:56:02 +0000 (UTC) Received: from pigeon.gentoo.org (localhost [127.0.0.1]) by pigeon.gentoo.org (Postfix) with SMTP id 25AACE0A4F; Fri, 6 Jun 2014 19:55:56 +0000 (UTC) Received: from mail-ie0-f194.google.com (mail-ie0-f194.google.com [209.85.223.194]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by pigeon.gentoo.org (Postfix) with ESMTPS id 1472BE095A for ; Fri, 6 Jun 2014 19:55:54 +0000 (UTC) Received: by mail-ie0-f194.google.com with SMTP id rp18so709665iec.5 for ; Fri, 06 Jun 2014 12:55:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :content-type:content-transfer-encoding; bh=jPOPJTLla95qzqJnU3jeMUNOqghrNvs1/9rWfkfAEiY=; b=q4CbNmjMTCPSRcMHA7RBAP7IkB8ZCDDQ5XN5IUiiU7jYkbXfUXBR46gGTI1KflTEuO BGu2gGhJYzXlfdITOIEHb5injGSNuvMC2TM48KlsEWtwv7OjfBG2XLScXJoEp+Eemffh ALqAp5tzGBjq2IbCAE/1JoQUjPpY3Ow4qQEJKe7SYLLym4jWozEUycjTumzR3PUP39Xo K+fws3sKS5E7vwmolLE5Nl+1fqV0HkqCEloDhUvtuxNtCdHhLxjSbj4uIEGDuCEdN3la U+WC2GGO2iwmk8gwoTK6XvvcnqrCHaoGxFI1UoEjupLxkwfg1WYyO5N5m3EnqGu9MH9N +K2A== X-Received: by 10.50.143.8 with SMTP id sa8mr34071332igb.29.1402084554302; Fri, 06 Jun 2014 12:55:54 -0700 (PDT) Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-Id: Gentoo Linux mail X-BeenThere: gentoo-user@lists.gentoo.org Reply-to: gentoo-user@lists.gentoo.org MIME-Version: 1.0 Received: by 10.64.136.226 with HTTP; Fri, 6 Jun 2014 12:55:34 -0700 (PDT) In-Reply-To: <20140606183928.GC4718@solfire> References: <20140606025619.GB3837@solfire> <20140606183928.GC4718@solfire> From: =?UTF-8?B?Q2FuZWsgUGVsw6FleiBWYWxkw6lz?= Date: Fri, 6 Jun 2014 14:55:34 -0500 Message-ID: Subject: Re: [gentoo-user] OT: Mapping random numbers (PRNG) To: gentoo-user@lists.gentoo.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Archives-Salt: 5953180b-7f42-40c3-b776-ca7b9a25d4db X-Archives-Hash: bbdaab2ede8d0465f3032919b8e56266 On Fri, Jun 6, 2014 at 1:39 PM, wrote: > Canek Pel=C3=A1ez Vald=C3=A9s [14-06-06 17:36]: >> On Thu, Jun 5, 2014 at 9:56 PM, wrote: >> > Hi, >> > >> > I am experimenting with the C code of the ISAAC pseudo random number g= enerator >> > (http://burtleburtle.net/bob/rand/isaacafa.html). >> > >> > Currently the implementation creates (on my embedded linux) 32 bit >> > hexadecimal output. >> >> So it's a 32 bit integer. >> >> > From this I want to create random numbers in the range of [a-Za-z0-9] >> > *without violating randomness* and (if possible) without throwing >> > away bits of the output. >> >> You mean *characters* int the range [A-Za-z0-9]? >> >> > How can I do this mathemtically (in concern of the quality of output) >> > correct? >> >> The easiest thing to do would be: >> >> ------------------------------------------------------------------------= ------- >> #include >> #include >> #include >> >> #define N (26+26+10) >> >> static char S[] =3D { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', >> 'K', 'L', 'M', >> 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', >> 'X', 'Y', 'Z', >> 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', >> 'k', 'l', 'm', >> 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', >> 'x', 'y', 'z', >> '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' }; >> >> int >> next_character() >> { >> // Use the correct call for ISAAC instead of rand() >> unsigned int idx =3D rand() % N; >> return S[idx]; >> } >> >> int >> main(int argc, char* argv[]) >> { >> // Use the correct call for initializing the ISAAC seed >> srand((unsigned int)time(NULL)); >> for (int i =3D 0; i < 20; i++) // --std=3Dc99 >> printf("%c\n", next_character()); >> return 0; >> } >> ------------------------------------------------------------------------= ------- >> >> If the ISAAC RNG has a good distribution, then the next_character() >> function will give a good distribution among the set [A-Za-z0-9]. >> >> Unless I missunderstood what you meant with "create random numbers in >> the range of [a-Za-z0-9]". >> >> Regards. >> -- >> Canek Pel=C3=A1ez Vald=C3=A9s >> Profesor de asignatura, Facultad de Ciencias >> Universidad Nacional Aut=C3=B3noma de M=C3=A9xico >> > > Hi, > > Thank you very much for the input! :) > > I have a question about the algorithm: > Suppose rand() has an equal distribution of numbers and furthermore > one has a count of 2^32 random numbers listed in numerical sort > order. > In this list each number would appear (nearly) with the same count: 1 > > To get an better imagination of that...suppose the rand() would only > return numbers in the range of 1...12 and the alphabet has only 8 > characters (as 2^32 is not devideable by 62) > > rand(): > 1 2 3 4 5 6 7 8 9 10 11 12 > > rand()%N : rand()%7 > 1 2 3 4 5 6 7 0 1 2 3 4 > > or in other words: An even distribution of numbers of rand() > would result in a unevenly distributed sequence of characters...or? > This would break the quality of ISAACs output. > > I am sure I did something wrong here...but where is the logic trap? In theory, it doesn't matter if the number of random characters you want is not multiple of the size of the set of characters. In practice, it doesn't matter if the number of random characters you want is big enough. Consider the following modification to my program (I changed the order of the array S so it's sorted and I can do binary search): ---------------------------------------------------------------------- #include #include #include #define N (10+26+26) static char S[] =3D { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z' }; static int index_of_aux(char c, int a, int b) { if (a > b) return -1; int m =3D (a + b) / 2; if (S[m] =3D=3D c) return m; if (S[m] > c) return index_of_aux(c, a, m-1); return index_of_aux(c, m+1, b); } static int index_of(char c) { return index_of_aux(c, 0, N-1); } int next_character() { // Use the correct call for isaac instead of rand() unsigned int idx =3D rand() % N; return S[idx]; } int main(int argc, char* argv[]) { srand((unsigned int)time(NULL)); int total =3D 1000000; // Size =3D 62 int count[] =3D { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; for (int i =3D 0; i < total; i++) { char c =3D next_character(); count[index_of(c)]++; } int min =3D total, max =3D 0; for (int i =3D 0; i < N; i++) { printf("Count %d: %d\n", i, count[i]); min =3D (count[i] < min) ? count[i] : min; max =3D (count[i] > max) ? count[i] : max; } printf("Min: %d\n", min); printf("Max: %d\n", max); printf("Diff: %d\n", (max-min)); return 0; } ---------------------------------------------------------------------- The output is the following: Count 0: 16060 Count 1: 16102 Count 2: 15968 Count 3: 16075 ... Count 59: 16220 Count 60: 16143 Count 61: 16177 Min: 15819 Max: 16366 Diff: 547 As you can see, all the characters appear around 16,000 times, so it's pretty well distributed. If the RNG has a good distribution, it will *always* look something like this. The difference will *never* be 0, since no RNG is perfectly distributed. So, TL;DR: it doesn't matter if total is not multiple of N. Regards. --=20 Canek Pel=C3=A1ez Vald=C3=A9s Profesor de asignatura, Facultad de Ciencias Universidad Nacional Aut=C3=B3noma de M=C3=A9xico