Contact Us

If you are interested in our services leave your contact details below and our sales representatives will contact you.

The organization which you represent
Email address we will use to contact you
Longer contact form…
 
  • About

    mFabrik Blog is about mobile and web software development, open source and Linux. We tell exciting tales where business, technology, web and mobile convergence.

    Creative Commons License
    This work is licensed under a Creative Commons Attribution 3.0 Unported License.

Plone, KSS, Javascript, field validation and the cup of WTF



Knowledge does hurt. Today I had my sweet cup of WTF. We are developing a medical database based on Plone 3.1. It uses very advanced AJAX framework called KSS – basically you can avoid the pain of writing pure Javascript by crafting special CSS like stylesheets which bind server-side Python code to any Javascript event. This makes AJAX programming a joy. You can easily combine server-side logic with user interface events, like field validation.

Well… then there was an error. KSS validation was not working for the text fields on a certain pages…. or it did sometimes. We were not sure. This is so called Heisenbug. I armed myself with sleepy eyes, Firebug and a lot of energy drinks.

I saw a KSS error in the Firebug log window and failed HTTP POST in the server logs.

Invalid request.

The parameter, value, was omitted from the request.

Looks like the field value was not properly posted for the field validation.

The first thing was locate the error and get function traceback for the messy situation. Unfortunately Firefox Javascript engine or Firebug cannot show tracebacks properly… the grass is so much greener on the other side of the fence. So I had to manually search through the codebase by manually plotting console.log() calls here and there.

Finally I thought I pinpointed the cause of the failure. By shaking finger (excitement, tireness and all that extra caffeinen from energy drinks), I opened the Javascript file just to realize why Javascript is utterly utterly shitty and why no sane person wants to do low level Javascript development. If ECMA standard committee had been clever and had been able to enforce anything long time ago, the following piece could be replaced with one function call.

fo.getValueOfFormElement = function(element) {
    // Returns the value of the form element / or null
    // First: update the field in case an editor is lurking
    // in the background
    this.fieldUpdateRegistry.doUpdate(element);
    if (element.disabled) {
        return null;
    }
    // Collect the data
    if (element.selectedIndex != undefined) {
        // handle single selects first
        if(!element.multiple) {
                if (element.selectedIndex < 0) {
                    value="";
                } else {
                    var option = element.options[element.selectedIndex];
                    // on FF and safari, option.value has the value
                    // on IE, option.text needs to be used
                    value = option.value || option.text;
                }
        // Now process selects with the multiple option set
        } else {
            var value = [];
            for(i=0; i<element.options.length; i++) {
                var option = element.options[i];
                if(option.selected) {
                    // on FF and safari, option.value has the value
                    // on IE, option.text needs to be used
                    value.push(option.value || option.text);
                }
            }
        }
    // Safari 3.0.3 no longer has "item", instead it works
    // with direct array access []. Although other browsers
    // seem to support this as well, we provide checking
    // in both ways. (No idea if item is still needed.)
    } else if (typeof element.length != 'undefined' &&
        ((typeof element[0] != 'undefined' &&
        element[0].type == "radio") ||
        (typeof element.item(0) != 'undefined' &&
        element.item(0).type == "radio"))) {
        // element really contains a list of input nodes,
        // in this case.
        var radioList = element;
        value = null;
        for (var i=0; i < radioList.length; i++) {
            var radio = radioList[i] || radioList.item(i);
            if (radio.checked) {
                value = radio.value;
            }
        }
    } else if (element.type == "radio" || element.type == "checkbox") {
        if (element.checked) {
           value = element.value;
        } else {
            value = null;
        }
    } else if ((element.tagName.toLowerCase() == 'textarea')
               || (element.tagName.toLowerCase() == 'input' &&
                    element.type != 'submit' && element.type != 'reset')
              ) {
        value = element.value;
    } else {
        value = null;
    }
    return value;
};

It turned out that the element in this case was an empty list of radio buttons. When you are tab keying through a radio button group without any value selected, like in the case a content object is just created, KSS validation is triggered even though there is no value in any of the radio buttons. This makes KSS think the value is null and it does not properly handle the situation. This does not cause any user visible effects unless you have Javascript debugging on (Firebug + debugging mode in Plone’s Javascript registry).

But this was not the bug I was looking for. It was just masking the original bug, because I had an empty radio button group next to the text field whose validation was not correctly done. More server side debugging…

I inserted some funky debug prints to Archetypes.Field.validate_validators():

validate_validators()
Calling validators:(('isEmptyNoError', V_SUFFICIENT), ('validDecRange03', V_REQUIRED))

We can see that not triggered validator, validDecRange03, is still with us. Then I add more debug prints to see where things go wrong, this time to to Products.validation.chain.__call__.

Calling validators:(('isEmptyNoError', V_SUFFICIENT), ('validDecRange03', V_REQUIRED))
Name:isEmptyNoError value:234234234234234234234234234232342342342344534232342344234534554 result:True

Ok – we have a case here. isEmptyNoError validator is executed before our custom validator. Since this validator is flagged as “sufficient” other validators are not evaluated. I think this has not been the case before and our validator have worked properly… maybe there was API change in Plone 3.1 which broke the things?

After digging and digging and digging I found this 4 years old bug. Let’s open the famous isEmptyNoError source code in Products.validation.validators.EmptyValidator:

class EmptyValidator:
    __implements__ = IValidator

    def __init__(self, name, title='', description='', showError=True):
        self.name = name
        self.title = title or name
        self.description = description
        self.showError = showError

    def __call__(self, value, *args, **kwargs):
        isEmpty  = kwargs.get('isEmpty', False)
        instance = kwargs.get('instance', None)
        field    = kwargs.get('field', None)

        # XXX: This is a temporary fix. Need to be fixed right for AT 2.0
        #      content_edit / BaseObject.processForm() calls
        #      widget.process_form a second time!
        if instance and field:
            widget  = field.widget
            request = getattr(instance, 'REQUEST', None)
            if request and request.form:
                form   = request.form
                result = widget.process_form(instance, field, form,
                                             empty_marker=_marker,
                                             emptyReturnsMarker=True)
                if result is _marker or result is None:
                    isEmpty = True

        if isEmpty:
            return True
        elif value == '' or value is None:
            return True
        else:
            if getattr(self, 'showError', False):
                return ("Validation failed(%(name)s): '%(value)s' is not empty." %
                       { 'name' : self.name, 'value': value})
            else:
                return False

There is my WTF. Or XXX – thanks for the kisses. My guess is that because KSS validation is executed in special context, the magical REQUEST might not be there. The “is sufficient” validator fails because of the some sort of god forgotten magic and thus all custom validators fail in KSS when the field is not required.

The workaround: I add my own greetings to the code:

        # XXX: This is a temporary fix. Need to be fixed right for AT 2.0
        #      content_edit / BaseObject.processForm() calls
        #      widget.process_form a second time!
        if instance and field:
            widget  = field.widget
            request = getattr(instance, 'REQUEST', None)

            # XXX: Whatever this all does, it does not work for KSS validation
            # requests and we should ignore this
            if "fieldname" in request:
              return False

            if request and request.form:
                form   = request.form
                result = widget.process_form(instance, field, form,
                                             empty_marker=_marker,
                                             emptyReturnsMarker=True)
                if result is _marker or result is None:
                    isEmpty = True

If Zope 3 drives you smoking Plone 3 drives me drinking. No wonder newbies steer away from Plone – if you hadn’t been armed with 8 years of web development experience you would never have figured out what’s going on with such a simple thing as adding a custom validator. A comment added to the bug tracker.

Verifying Javascript files with JSLint from Eclipse IDE



With Web 2.0, Javascript has gained a great foothold in the web development. Javascript, which was originally indented to be few lines of interactivity here and there, is now used for full-blown UI frameworks like Ext JS and Dojo. Because its toy background, Javascript is not the easiest language to debug. Also, the lovely phenomenon known as browser wars has ensured that professional Javascript development is PITA due to browser incompatibilities – It seconds the Symbian C++ embedded programming if you know what I mean.

Javascript is executed run-time. Thus, any errors cannot be cached until the code is run. JSLint is a verifier tool which checks Javascript syntax offline. Of course, it cannot detect application logic errors, but it makes sure that you don’t have cross-browser compability problems in your Javascript syntax. Maybe the most famous of these problems is the extra comma before the list termination which is ok for Firefox, but kills the page on Internet Explorer.

Some argue that this does not really catch bugs and instead a comprehensive unit-test suite should be used. This might be a requirement for a platform level Javascript library code, but it is often an overkill for your little site with some flashy dialog windows.

Installing JSLint

JSLint itself is written in Javascript. So you need to offline Javascript interpreter to run it. On Linux there exists Rhino. On Ubuntu, Rhino package is broken. and you need to download the orignal Rhino JAR from here and copy it to /usr/share/java.

sudo apt-get install rhino # Note https://bugs.launchpad.net/ubuntu/+source/rhino/+bug/93885

Then get JSLint offline version from here.

Running JSLint straight from the Eclipse

We use JSEclipse plug-in in Eclipse for Javascript development. Thus, it is convenient to to execute JSLint directly from the Eclipse IDE.

Drag and drop jslint.js to your project/scripts folder in Eclipse.

Create a new External tool like in the example below.

screenshot-external-tools.png

Then you can run it by just clicking “run external tool” icon. Output comes to the console window.

screenshot-pydev-nurse-source-server-media-js-ui-mapjs-eclipse-sdk.png

Checking all Javascript files during unit test

verify_javascript.bash can be executed from shell and it checks all JS files in your project. This is a nice feature to integrate to your unit test cycle.

Example:

$ bash scripts/verify_javascript.bash source/server/media/jssource/server/media/js/jquery.js

Lint at line 11 character 30: Expected '{' and instead saw 'var'.

...

Debugging Django memory leak with TrackRefs and Guppy



I run Django in a standalone long-running application (video encoding server). It leaked memory severely. By using htop, one was seeing two gigabytes reserved for /usr/bin/python after a while. Before starting the debugging session, I had no faintest idea what could be the cause of the problem. Django is robust technology – this kind of things haven’t happened for me before. Since I was running Django in standalone mode, I suspected that some query cache does not get cleared. But random poking around the source code didn’t give any clues.

It was time to do some serious memory debugging for Python.

Python as is doesn’t leak memory, since it’s garbage collected virtual machine. All “leaks” are design problems in the application logic.I found a good primer here what’s going inside Python’s memory management.

First I tried this nice TrackRefs class from Zope. It relies on Python’s own in-interpreter functions to monitor objects.

class TrackRefs:
    """Object to track reference counts across test runs."""

    def __init__(self, limit=40):
        self.type2count = {}
        self.type2all = {}
        self.limit = limit

    def update(self):
        obs = sys.getobjects(0)
        type2count = {}
        type2all = {}
        for o in obs:
            all = sys.getrefcount(o)

            if type(o) is str and o == '<dummy key>':
                # avoid dictionary madness
                continue
            t = type(o)
            if t in type2count:
                type2count[t] += 1
                type2all[t] += all
            else:
                type2count[t] = 1
                type2all[t] = all

        ct = [(type2count[t] - self.type2count.get(t, 0),
               type2all[t] - self.type2all.get(t, 0),
               t)
              for t in type2count.iterkeys()]
        ct.sort()
        ct.reverse()
        printed = False

        logger.debug("----------------------")
        logger.debug("Memory profiling")
        i = 0
        for delta1, delta2, t in ct:
            if delta1 or delta2:
                if not printed:
                    logger.debug("%-55s %8s %8s" % ('', 'insts', 'refs'))
                    printed = True

                logger.debug("%-55s %8d %8d" % (t, delta1, delta2))

                i += 1
                if i >= self.limit:
                    break 

        self.type2count = type2count
        self.type2all = type2all

You need to have Python compiled in debug mode to have sys.getobjects() method. Luckily this beefed up Python binary is availalble from Ubuntu’s stock repository:

sudo apt-get install python-dbg python-mysqldb-dbg

Note that native Python extensions don’t work unless they are specifically compiled against the Python debug build (python-mysqldb-dbg)..

Then I add TrackRefs to my main loop:

    def run(self):

        self.running = True

        logger.info("Started worker " + self.get_worker_id_string())

        # Memory leak tracking
        tracker = TrackRefs()

        while self.running:

            self.mark_for_download()

            self.process_downloads()

            self.process_encodings()

            tracker.update() # Dump memory here
            time.sleep(settings.WORKER_POLL_DELAY)

And after running a while I start getting interesting results:

7956 [2008-03-07 02:59:28,767] INFO Jobs needing sources to download 0
7956 [2008-03-07 02:59:28,768] DEBUG Processable jobs: 0
7956 [2008-03-07 02:59:29,754] DEBUG ----------------------
7956 [2008-03-07 02:59:29,754] DEBUG Memory profiling
7956 [2008-03-07 02:59:29,754] DEBUG                                                            insts     refs
7956 [2008-03-07 02:59:29,754] DEBUG <type 'int'>                                                 150   137406
7956 [2008-03-07 02:59:29,755] DEBUG <type 'tuple'>                                               117   130211
7956 [2008-03-07 02:59:29,755] DEBUG <type 'dict'>                                                  5     8331
7956 [2008-03-07 02:59:29,755] DEBUG <type 'str'>                                                   3    27643
7956 [2008-03-07 02:59:29,755] DEBUG <type 'unicode'>                                               3     4606
7956 [2008-03-07 02:59:29,755] DEBUG <type 'list'>                                                  3     3492
7956 [2008-03-07 02:59:29,756] DEBUG <type 'frame'>                                                 1      962
7956 [2008-03-07 02:59:29,756] DEBUG <type 'cell'>                                                  0    12948
7956 [2008-03-07 02:59:29,756] DEBUG <type 'function'>                                              0     9479

Woah! Who reserved 130 000 ints and tuples? No wonder that soon python gulps 1 gigabytes of memory. Since this is the only number which grows during the main loop cycling and there is no references to classes or objects debugging becomes a bit more difficult. I need to try to cross-reference the difficult tuple objects.

This didn’t go well – with gc.get_referrers() recurive parsing I got some results (example below). But it became soom clearthat debug references from the system itself was impossible: the memory debugging code will always create nasty cyclic references to the system, since it needs to track the objects. I gave up. There had to be something better.

9154 [2008-03-07 04:05:23,571] DEBUG /var/lib/python-support/python2.5/MySQLdb/connections.pyc
9154 [2008-03-07 04:05:23,571] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,572] DEBUG defaulterrorhandler
9154 [2008-03-07 04:05:23,572] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,572] DEBUG string_literal
9154 [2008-03-07 04:05:23,572] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,573] DEBUG unicode_literal
9154 [2008-03-07 04:05:23,573] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,573] DEBUG string_decoder
9154 [2008-03-07 04:05:23,573] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,573] DEBUG __exit__
9154 [2008-03-07 04:05:23,574] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,574] DEBUG begin
9154 [2008-03-07 04:05:23,574] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,574] DEBUG __init__
9154 [2008-03-07 04:05:23,574] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,575] DEBUG show_warnings

There was: Guppy. Thank you Sverker Nilsson! You saved my day.

Since the API of Guppy is a little eccentric, here are some examples for you:

# init heapy
heapy = guppy.hpy()

# Print memory statistics
def update():
  print heapy.heap()

# Print relative memory consumption since last sycle
def update():
   print heapy.heap()
   heapy.setref()

# Print relative memory consumption w/heap traversing
def update()
    print heapy.heap().get_rp(40)
    heapy.setref()

With heapy.heap() ; heapy.setref() I got this output:

Partition of a set of 12 objects. Total size = 3544 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      3  25     2244  63      2244  63 unicode
     1      2  17      708  20      2952  83 types.FrameType
     2      3  25      432  12      3384  95 dict (no owner)
     3      3  25      120   3      3504  99 str
     4      1   8       40   1      3544 100 list

One adds get_rp() travelsal magic and everything becomes clear:

Reference Pattern by <[dict of] class>.
 0: _ --- [-] 14 (dict (no owner) | list | str | types.FrameType | types.Gene...
 1: a      [-] 3 dict (no owner): 0x8c11f34*2, 0x8c1bd54*2, 0x8c1f854*2
 2: aa ---- [-] 1 list: 0x833c504*18
 3: a3       [-] 1 dict of django.db.backends.mysql.base.DatabaseWrapper: 0x8...
 4: a4 ------ [-] 1 dict (no owner): 0x83a65d4*2
 5: a5         [R] 1 guppy.heapy.heapyc.RootStateType: 0xb787c7a8L
 6: a3b ----- [-] 1 django.db.backends.mysql.base.DatabaseWrapper: 0x8356a34
 7: a3ba       [S] 7 dict of module: ..db, ..models, ..query, ..transaction...
 8: b ---- [S] 1 types.FrameType: <<lambda> at 0x8b16ecc>
 9: c      [-] 2 list: 0x833c504*18, 0xb7dafe6cL*5
<Type e.g. '_.more' for more.>

What there could in DatabaseWrapper object which is growing and growing… query debugger. Django keeps track of all queries for debugging purposes (connection.queries). This list is reseted at the end of HTTP request. But in standalone mode, there are no requests. So you need to manually reset to queries list after each working cycle.

        while self.running:

            self.mark_for_download()

            self.process_downloads()

            self.process_encodings()

            tracker.update()

            time.sleep(settings.WORKER_POLL_DELAY)

            # Clear database connection ad reset query debugger
            # between cycles to make sure that
            # related resources get released
            reset_queries()
            connection.close()

            print str(connection.queries)

But even after this fix, I got increase in tuple and int usage when monitoring with TrackRefs. But when I run heapy.heap() alone, there is no increase. So the tuple and int consumption must have been caused by TrackRef, sys.getobjects, gc, etc. magic itself.