Wednesday, November 23, 2011

Sort a dictionary of dictionaries in Python

Sounds like a big challenge? Not at all!

Problem description:
Suppose you have a dictionary of dictionaries. Every key has a dictionary assigned to it, for example:
mydict = {'hello':dict(key1='val10', key2='val20', key3='val30'),
              'world':dict(key1='val11', key2='val21', key3='val31'),
              'howru':dict(key1='val12', key2='val22', key3='val32')}

and your goal is to get a list of mydict "inner" dictionaries, ordered by key2.

One possible solution:
from operator import itemgetter
# sorted() accepts any iterable, so this works on both Python 2 and Python 3
mydict_values = sorted(mydict.values(), key=itemgetter("key2"))

Explained:
mydict.values() returns the values of mydict, which are the "inner" dictionaries.
sorted() builds a new list from them, and key=itemgetter("key2") tells it to compare the dictionaries by the value stored under key2.
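
If you also need to know which outer key each inner dictionary came from, a small variation is to sort the items instead of the values. This is only a sketch reusing mydict from above; the names by_key2 and pairs_by_key2 are mine:

```python
from operator import itemgetter

mydict = {
    'hello': dict(key1='val10', key2='val20', key3='val30'),
    'world': dict(key1='val11', key2='val21', key3='val31'),
    'howru': dict(key1='val12', key2='val22', key3='val32'),
}

# Inner dictionaries only, ordered by their key2 value.
by_key2 = sorted(mydict.values(), key=itemgetter('key2'))

# (outer key, inner dict) pairs, ordered by the inner dict's key2 value.
pairs_by_key2 = sorted(mydict.items(), key=lambda kv: kv[1]['key2'])

print([d['key2'] for d in by_key2])
print([outer_key for outer_key, _ in pairs_by_key2])
```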

That's all for this time :-)

Tuesday, November 15, 2011

Temporary disposable email address using Gmail

This is very useful for testing registration processes, where you need to register with a new email address every time.
Suppose you have a gmail address:
name@gmail.com .
You can send emails to:
name+2@gmail.com, name+cnn@gmail.com .
The rule is that you can add any alphanumeric characters after the ‘+’ sign.
All of these emails are delivered to your name@gmail.com account.
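
For a registration test you can generate such addresses in a loop. A rough sketch, where plus_alias is a hypothetical helper of mine (not part of any Gmail API) and the tags are arbitrary:

```python
def plus_alias(address, tag):
    """Return address with '+tag' inserted before the '@' (hypothetical helper)."""
    if not tag.isalnum():
        raise ValueError("tag should be alphanumeric")
    local, _, domain = address.partition('@')
    return '{0}+{1}@{2}'.format(local, tag, domain)

# Three throwaway addresses that all land in the same inbox.
aliases = [plus_alias('name@gmail.com', 'test{0}'.format(i)) for i in range(1, 4)]
print(aliases)
```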

Is nice :-)

Thursday, June 23, 2011

Django forms inheritance

I'm using Django forms and needed to create a form for a password reset use case.
It turned out that Django already has a built-in form for this operation, but I needed some extra tweaks on it. In addition, I had 2 use cases which required 2 different validations on the form.
Instead of creating 2 new forms from scratch I decided to use Python's inheritance capabilities and produced the following:

# Imports this snippet relies on (assumed; names as in Django 1.x)
from django import forms
from django.contrib.auth.forms import PasswordResetForm
from django.contrib.auth.tokens import default_token_generator
from django.utils.translation import ugettext_lazy as _

class MyFirstPasswordResetForm(PasswordResetForm):
    ## Overrides Django's email field, since I needed a different error message
    email = forms.EmailField(label=_("E-mail"), max_length=75, required=True,
                             error_messages={'invalid': _(u'Please enter a valid email address')})

    ## My new field
    my_hidden_field = forms.CharField(widget=forms.HiddenInput, required=True)

    def clean_email(self):
        """
        Calls Django's clean_email function, but overrides Django's error message
        """
        try:
            return super(MyFirstPasswordResetForm, self).clean_email()
        except forms.ValidationError:
            raise forms.ValidationError(_("This address was not found in our DB - are you sure this is the email address you used to register?"))

    def save(self, domain_override=None, email_template_name='registration/password_reset_email.html',
             use_https=False, token_generator=default_token_generator):
        """
        Overrides Django's save function
        """
        ## Do some stuff
        pass

And this class inherits from it:
class MySecondPasswordResetForm(MyFirstPasswordResetForm):
    '''
    Inherits "email", "my_hidden_field" and "save" from MyFirstPasswordResetForm
    '''
    def clean_email(self):
        """
        Override just this function.
        """
        email = super(MySecondPasswordResetForm, self).clean_email()
        ## Additional validations here, specific to MySecondPasswordResetForm

        return email


This avoids code duplication and saves development and testing time!

Monday, April 18, 2011

Django models - order_by CharField case insensitive

Working with Django models, I needed to sort my data by a string (CharField) column:

class MyModelName(models.Model):
   is_mine = models.BooleanField(default=False)
   name = models.CharField(max_length=100)

So I used this python code:
MyModelName.objects.filter( is_mine=1 ).order_by('name')

However, the default sorting is case sensitive, which produced an odd ordering:

A
B
C
a
b
c

The solution was to normalize the data (change to lowercase) and then sort:
MyModelName.objects.filter( is_mine=1 ).extra( select={'lower_name': 'lower(name)'}).order_by('lower_name')

So now I get this result:
A
a
B
b
C
c

Which is exactly what I need !! :-)
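
The same "normalize, then sort" idea can be tried in plain Python, without a database. A small sketch (the list of names is just the sample data from above; note that Python's sort is stable, so equal keys keep their input order, which a database does not promise):

```python
names = ['A', 'B', 'C', 'a', 'b', 'c']

# Default string comparison goes by code point, so uppercase letters come first.
case_sensitive = sorted(names)

# Lowercase each value for comparison only; the original strings are returned.
case_insensitive = sorted(names, key=str.lower)

print(case_sensitive)
print(case_insensitive)
```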

More on Django API "extra" function here.

Wednesday, October 06, 2010

The ? : operator in Python

We're all familiar with the "? :" operator from Java and other languages, which can be useful in some cases, for example:
max = (x > y) ? x : y
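Python doesn't use the "? :" syntax, but since version 2.5 it has a built-in conditional expression:
max = x if x > y else y
In older code you will also see the behavior mocked with "and" and "or":
max = x > y and x or y
How does it work?
The expression is evaluated left to right, so "x > y" is evaluated first.
If "x > y" yields "True", the "and x" part is evaluated, and since "True and x" yields x, this is the result of the entire expression as long as x is truthy (no need to evaluate the right side of the "or" when the left side is truthy).
If "x > y" yields "False", the "and" short-circuits, and then the "or y" part is evaluated, resulting in y.
Beware: if x is falsy (0, an empty string, None...), the mock returns y even when "x > y" is True, so the conditional expression is the safer choice.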
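
One case worth checking before relying on the and/or form is a falsy x. A small sketch comparing it with Python's conditional expression ("a if cond else b", available since Python 2.5); the helper names are mine:

```python
def pick_and_or(cond, a, b):
    # The and/or idiom, with the condition made explicit.
    return cond and a or b

def pick_conditional(cond, a, b):
    # Python's built-in conditional expression.
    return a if cond else b

# Both forms agree when the chosen value is truthy.
x, y = 5, 3
print(pick_and_or(x > y, x, y), pick_conditional(x > y, x, y))

# They disagree when the chosen value is falsy: (True and 0) or -1 gives -1.
x, y = 0, -1
print(pick_and_or(x > y, x, y), pick_conditional(x > y, x, y))
```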

I like it :)

Tuesday, September 28, 2010

Split URL to sub domains using Python

Hi again,
For my new and exciting project (hopefully more on this later) I needed to get all the possible sub domains of a URL's hostname, so if this is the input URL:
http://a.b.c.d/test.html
I needed to get the following sub domains:
['a.b.c.d', 'b.c.d', 'c.d', 'd']
Here's the good news: this is pretty simple with Python ! :)
import urlparse
url='http://a.b.c.d/test.html'
hostname=urlparse.urlparse(url).hostname
sh=hostname.split('.')
subdomains=[".".join(sh[i:]) for i in range(len(sh))]
print subdomains
>> ['a.b.c.d', 'b.c.d', 'c.d', 'd']
Nice, isn't it?
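
The snippet above is Python 2 (urlparse module, print statement). On Python 3 the same idea might look like this; a sketch only, where subdomain_suffixes is a name I made up and returning an empty list for input without a hostname is my own choice:

```python
from urllib.parse import urlparse

def subdomain_suffixes(url):
    """Return the hostname of url and each of its parent domains."""
    hostname = urlparse(url).hostname
    if hostname is None:
        # No network location in the input, e.g. a bare string.
        return []
    labels = hostname.split('.')
    return ['.'.join(labels[i:]) for i in range(len(labels))]

print(subdomain_suffixes('http://a.b.c.d/test.html'))
```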

Tuesday, May 25, 2010

BeautifulSoup Tutorial Presentation

I gave an introductory tutorial presentation on the BeautifulSoup python module to the R&D department at work.
It is based on the BeautifulSoup documentation, which describes the module as:
“an HTML/XML parser for Python that can turn even invalid markup into a parse tree. It provides  simple, idiomatic ways of navigating, searching, and modifying the parse tree. It commonly saves  programmers hours or days of work.”
Read the tutorial here.
Feel free to ask BeautifulSoup questions.

Cheers.

Thursday, April 29, 2010

JSON eval problem "Error: invalid label"

The other day I created a web service that receives XHR (ajax) requests and returns a JSON response.
The calling code eval'ed the JSON in order to use it as a JavaScript object, like this:

var ret_json = service.result; // AJAX response JSON string

var ret_json_obj = eval(ret_json); // ERROR
In IE I got the error "; expected", and in FF I got "Error: invalid label".

The solution was to wrap ret_json in parentheses, like this:

var ret_json = service.result; // AJAX response JSON string

var ret_json_obj = eval("(" + ret_json + ")"); // WORKING

The text must be wrapped in parentheses to avoid tripping on an ambiguity in JavaScript's syntax: a leading "{" is otherwise parsed as the start of a block statement rather than an object literal.

Read here more about JSON.

Cheers.

Monday, March 01, 2010

Usability: Web Form Design Best Practices

Recently I came across a usability session about Web Form Design Best Practices by Luke Wroblewski, which took place last year at MIX09 in Las Vegas.
It's a bit lengthy (>70 mins) but worth it.
Here is my summary of the session:


Web Form Design
Best practices

According to slides by Luke Wroblewski

1. Path to Completion.
a.     Illuminate a clear path to completion.
b.     Use progress indicators to communicate scope, status and position.
c.     If the form requires substantial time or information look-up, consider using a start page.
d.     Use more general progress indicators for forms with a variable sequence.

2. Label Alignment
a.     For reduced completion time & familiar data input (name, address etc.): top aligned.
b.     When vertical screen space is a constraint: right aligned.
c.     For unfamiliar or advanced data entry: left aligned.

3. Help & Tips
a.     Minimize the amount of help & tips required to fill out a form.
b.     Help visible and adjacent to a data request is most useful.
c.     When people may be unsure about why or how to answer, consider an automatic inline system.
d.     For complex & reused forms, consider a user-activated system.
e.     Use inline help unless you have a lot of help content (text, graphics, charts).
f.     Use a consistent help section if you have a lot of help content.

4. Inline Validation
a.     Use inline validation for inputs that have potentially high error rates.
b.     Use suggested inputs to disambiguate.
c.     Communicate limits.

5. Primary & secondary actions
a.     Avoid secondary actions if possible.
b.     Otherwise, ensure a clear visual distinction between primary & secondary actions ("submit" and "cancel").
c.     Align primary actions with input fields for a clear path to completion.

6. Actions in progress
a.     Provide indication of tasks in progress.
b.     Disable the "submit" button after the user clicks it to avoid duplicate submissions.
c.     Consider opportunities to streamline legal requirements.

7. Error
a.     Clearly communicate an error has occurred: top placement, visual contrast.
b.     Provide actionable remedies to correct errors.
c.     Associate responsible fields with primary error message.
d.     "Double" the visual language where errors have occurred.

8. Unnecessary inputs
a.     Look for opportunities to remove unnecessary inputs.
b.     Do not complicate questions for the sake of removing inputs.

9. Form organization
a.     Take the time to evaluate every question you ask.
b.     Ensure your forms speak with one voice.
c.     Strive for succinctness.
d.     If a form naturally breaks down into a few short topics, use a single web page.
e.     When a form contains a large number of questions that are only related by a few topics, try multiple web pages.
f.     When a form contains a large number of questions related to a single topic, use one long web page.

10. Gradual engagement
a.     Try to avoid sign-up forms.
b.     Reflect your service's core essence through lightweight interactions.
c.     Make people successful instantly.
d.     If you auto-generate accounts, ensure there is a clear way to access them.
e.     Do not simply distribute the various input fields in a sign-up form across multiple pages.



That's it. Hope you find it helpful.

Monday, February 01, 2010

Convert PowerPoint to HTML with python

After I converted MS Word to HTML (and fed it to the application..) the next stage was to convert MS PowerPoint to HTML.
I thought it would be rather straightforward, given the success I had with the openoffice headless api converting Word to HTML. It wasn't.
openoffice does convert ppt to html (filter "impress_html_Export"). The output is a set of files in which each ppt slide is converted to an image (screenshot) and an HTML page. While the screenshots are good, the HTML is not satisfactory. Images embedded in the ppt don't appear in the converted HTML, and neither do tables. In addition, using the "2 column layout" produced HTML with only the left-column text, leaving the right-column text out. The same happened to any content added to a blank layout template (e.g. text boxes). Finally, numbered lists (ol) were converted to bullets (ul).
Needless to say, this solution was out of the question.

So here I was, looking for a way to convert ppt to html, using Java or (preferably) Python.
Looking for a Python module to do the job I found win32com, which may be good but is not relevant for me since our servers don't run Windows. Although win32com CAN run on debian, I preferred working with software that is not Windows dependent.

AND THEN... I found odfpy.
It's GPL software that defines itself as "Python API and tools to manipulate OpenDocument files".
Since an openoffice document is basically an archive file, this module reads and writes the archive structure, allowing for easy manipulation of all kinds of openoffice formats.
In addition, it has some built-in scripts for common tasks, e.g. odf2xhtml (which I'm using), odfoutline, csv2odf and more.
So I convert the ppt to odp using the openoffice headless api, and then convert the odp to HTML using odfpy.
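
To see the "archive file" point concretely, you can open an OpenDocument file with nothing but Python's standard zipfile module. This is only a sketch for peeking inside, not how odfpy itself works; the function names and the 'slides.odp' path are placeholders of mine:

```python
import zipfile

def list_odf_members(source):
    """Return the member names of an OpenDocument file (path or file-like object)."""
    with zipfile.ZipFile(source) as archive:
        return archive.namelist()

def read_odf_content_xml(source):
    """Return the raw bytes of content.xml, where the document body is stored."""
    with zipfile.ZipFile(source) as archive:
        return archive.read('content.xml')

# Example call, assuming an odp file exists at this placeholder path:
# print(list_odf_members('slides.odp'))
```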

And it works !

Sunday, January 03, 2010

Explore registry of dead pc

My PC is dead.
Well, I have a new one now :-)
As it was a Windows machine, some valuable data was stored in the registry, but since the pc refused to boot again (yes, even with an Ubuntu 9.0 live disk..) I thought I had lost it.
Searching the Internet I found this simple yet useful Windows Registry File Viewer. Simply download the zip file and run the .exe file.
Now I took the hard disk from my late pc, connected it to my new pc, and browsed to the registry file:
C:\Documents and Settings\username\ntuser.dat

A really important feature of this registry viewer is that it's also a registry exporter. Select the registry node you'd like to export, click "file" and then "export to REGEDIT4 format" and you're good.
In order to import this registry file, just double click it and its values are added to your current pc's registry.

Original FAQ page.

Wednesday, December 02, 2009

openoffice Word to HTML conversion failed, ErrorCodeIOException - ErrCode = 1287

I'm converting Word documents to html using openoffice 3.1 headless version with uno protocol (python).
Reading the Word document with loadComponentFromURL was successful, but saving it as html using storeToURL threw com.sun.star.task.ErrorCodeIOException with ErrCode = 1287.

The interesting thing was that the conversion worked great when I ran the code locally on the production server (debian 64), via the script's main block (if __name__ == "__main__"), but failed when I ran it from a remote machine, using http post.

I couldn't find this error code on the internet, so now that I've found the solution I'm posting it.

The issue was improper permissions for the soffice process to write the file in the target directory.

The soffice process ran under user A, and the python code ran under user B. The target directory was created by the python code, i.e. by user B, so soffice couldn't write to it.
When I ran the code locally, I logged in with user A, and that's why it worked. Had I logged in with user B I would have seen the same error.
The solution was to add users A and B to the same user group on the debian machine.
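
If you hit the same error, a quick way to check the target directory from python is to look at its owner, group and mode bits. A diagnostic sketch using only the standard library; describe_dir_access is a helper name I made up, and it only reports on the user running the script, not on the soffice user:

```python
import os
import stat

def describe_dir_access(path):
    """Summarize who owns a directory and whether its group may write to it."""
    info = os.stat(path)
    return {
        'owner_uid': info.st_uid,
        'group_gid': info.st_gid,
        'mode': oct(stat.S_IMODE(info.st_mode)),
        # True when the directory's group has the write permission bit set.
        'group_writable': bool(info.st_mode & stat.S_IWGRP),
        # Whether the user running this script can write to the directory.
        'writable_by_current_user': os.access(path, os.W_OK),
    }

# Example call on the current directory:
print(describe_dir_access('.'))
```

Sharing a group only helps if the directory is also group writable, so group_writable is the flag to look at here.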

QED.

Sunday, November 15, 2009

tidy crashed on Debian 64bit

I was working on a python project that invoked a tidy process. We were using the tidy python wrapper utidylib for that, with this code:
import StringIO
import tidy
tidy_obj = tidy.parseString(html, {})
tidy_outstream = StringIO.StringIO()
tidy_obj.write(tidy_outstream)
tidy_html = tidy_outstream.getvalue()
And it worked fine. Almost.
When I invoked it more frequently, i.e. every few seconds, it sometimes succeeded and sometimes crashed, FOR THE SAME HTML INPUT.
Searching the Internet I found that there is a known issue of tidy crashing on Debian 64 bit machines. Too bad.
Looking for an alternative tool to do the same trick I found another python wrapper for tidy that claims to solve this issue: pytidylib. And the syntax is simpler:
import tidylib
tidy_html, errs = tidylib.tidy_document(html, {})
And guess what? The tidy process no longer crashes!
And while you are reading this, you may find this list of tidy options useful.
Enjoy tidying :-)

Thursday, November 05, 2009

Free Vulnerabilities Detector Tool - Secunia Personal Software Inspector (PSI)

I found this free software that scans your pc and detects software vulnerabilities and code flaws, alerts for installed programs that expose you to security threats and lists the latest security updates and patches you need to install. It also gives you the link to the relevant upgrade.

I scanned my pc and it found several programs that needed an update, such as: an old version of firefox installed alongside the latest version, an ancient version of Adobe Reader, a stale JRE, a vulnerable Windows Media Player (6.x ...), a .NET framework that needed to be patched, and some other flaws.

After I found some trojan horses on my machine last week, this is a tool I recommend wholeheartedly.

Secunia Personal Software Inspector (PSI), download here:
http://secunia.com/vulnerability_scanning/personal/

Wednesday, September 23, 2009

Dojo 1.0 and Beyond

Here's a presentation I gave last year about the Dojo library.

The presentation will teach you:
1. What is Dojo.
2. Dojo design goals.
3. Dojo concerns (widgets, packaging, data access, DOM scripting and performance).
4. Difference between Dojo Core, Dojo Dijit and DojoX and what's inside each package.

Presentation URL:
What is Dojo 1.0 and Beyond

For more about Dojo, visit dojotoolkit .

Here I go again...

It's been a while since I updated this blog... sorry :-)

SO (crossing my fingers),
I'm back in town, and plan to update this blog regularly.
I'll start with posting some presentations I gave at work (sharedbook dot com).

Stay tuned !

Thursday, November 09, 2006

SharedBook "mashup"

We all know that nowadays "mashups" is the hottest word around the web.
You can mashup your video from YouTube in your home page, you can mashup Google maps (which is considered the "killer app" for mashup) and many more.

So, I was thinking, why not "mashup" my SharedBook's BabyBook in my blog ?

Well, why not indeed?
We created the SharedBook Mashup !

Now you can create your SharedBook, and mashup a flash widget of it in your blog or site. Isn't that great ? :-)

Photo with Greg Murray

Greg (on the left) introduced me to the jMaki library, which turned out to be very interesting.

I also attended Greg's "Ajax and Java" session which was also very interesting.

Thanks Greg !

The Ajax Experience Conference

I recently attended the Ajax Experience Conference, which took place on October 23-25, 2006, in Boston.

The convention was very educational and professionally organized.
Today I started a series of lectures at the office to pass the knowledge on to my colleagues.
I gave an introductory lecture, and I plan to lecture on the following:

  1. JavaScript Libraries (#1: Dojo Toolkit)
  2. UI (Usability != Aesthetics, Designing UI for ajax)
  3. Accessibility
  4. Troubleshooting and Testing (Glassbox, Selenium 0.8.0 - supports frames)
  5. Semantic Web (tagging, RSS)
  6. Unobtrusive Dev
  7. Case Studies (Yahoo, Netflix)
  8. Mashup (here's a common example)