March 11, 2010

Estimating Probability of Collisions…

Filed under: Development,Fun,Python — Jeff @ March 12, 2010 3:08 pm

We’re working on redesigning our licensing code and part of that involved generating a unique installation ID for each installation of our software.  The goal is for this identifier to be absolutely unique for each client while still being small enough that it would be feasible for them to read it over the phone to us. 

The identifier is generated from hashing together a lot of system data and converting that into a N digit code, where N is large enough to be unique but small enough not to be cumbersome.

That got us thinking…

For various identifier lengths, how many identifiers could we generate before we had, say, a 1% chance of a collision?

This is an implementation of the birthday paradox (ie. what are the chances two students in a class of 30 share the same birthday). 

It turns out that the math to solve this problem produces some absolutely ginormous numbers.  Excel gave up rather quickly as did Python’s built-in numeric types but fortunately the add-on mpmath library can handle such ridiculous numbers with ease.

A lunch hour of poking at it yielded the following chart.

image

From this you can see that with a key size of only 10 digits and only a few thousand identifiers you are risking a collision while for  a 12 digit key one can safely handle 100,000 + identifiers.  Moving to 15 or 20 digit identifiers makes the chance of collision insignificant unless we are dealing with enormous numbers of identifiers (which we are not).

I’ve created an online calculator for the birthday paradox here.

from mpmath import *
def collisionProbability(nKeys, nEntities):
    t1 = mpf(1)
    for i in range(nKeys-nEntities+1, nKeys+1):
        t1 *= i
    return 1 - t1 * power(mpf(1)/nKeys,nEntities)

April 26, 2006

Boo!

Filed under: .Net,Python — Jeff @ April 26, 2006 9:02 am

No this isn’t a Halloween post or anything like that. This post is about a fancy Pythonic language for the .NET CLR called Boo.

Boo is a strongly typed programming language that has a very similar syntax to Python and it targets the .NET CLR. The syntax differs in quite a few places to make to Boo functionality mesh better with .NET and also there is a lot of syntactic “sugar” added that makes using Boo super fun. Boo comes with a compiler that can turn Boo code into standard executables/libraries. It also comes with an interpreter. Even more interesting is that the interpeter is available in a library that can be used in other software.

While I’m pretty sure we won’t be using Boo for our main software development tasks anytime soon it looks like a marvelous tool for providing user scriptability in our apps. The application I work on here requires vaste amounts of customization per client and having the flexibility of an embedded scripting language that can poke and prod our business objects would be swell. And with Boo it’s really easy.

For example: Say we have a business object that has a name property and we want to allow the user to manipulate that property we can create a Boo interpreter and pass the business object into the environment and let them provide a chunk of script to manipulate our object. It’s that easy. The following code shows how easy this is:

using System;
using System.Collections.Generic;
using System.Text;
using Boo.Lang.Interpreter;
using Boo.Lang.Compiler;

namespace ConsoleApplication1
{
    public class BusinessObject
    {
        private string m_name;

        public string Name
        {
            get { return m_name; }
            set { m_name = value; }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            string code = "bo1.Name = bo1.Name.ToUpper()";

            // Create a play business object
            BusinessObject bo1 = new BusinessObject();
            bo1.Name = "Jeff Clement";

            // Setup the Boo Interpreter
            InteractiveInterpreter interp = new InteractiveInterpreter();
            interp.SetValue("bo1", bo1);

            // try running our script
            CompilerContext context;
            context = interp.Eval(code);

            // If compiler raised any errors throw them as exceptions
            if (context.Errors.Count > 0)
                throw new Exception(context.Errors.ToString());

            Console.WriteLine("New value of Name property is " + bo1.Name);
        }
    }
}

The output from the above is the string “New Value of Name property is JEFF CLEMENT” because the Boo script ran against that object and upper-cased the name property.

Anyways that’s about all I have to say. It looks like a very promising tool and I hope we get a chance to roll it out here :)