Monday, December 6, 2010

python interview layout

Recently I came across an interview layout posted on comp.lang.python by
Tim Chase. I cannot resist the temptation to quote it:

Basic Python:
=============
- do they know a tuple/list/dict when they see it?
- when to use list vs. tuple vs. dict. vs. set
- can they use list comprehensions (and know when not to
abuse them? :)
- can they use tuple unpacking for assignment?
- string building...do they use "+=" or do they build a list
and use .join() to recombine them efficiently
- truth-value testing questions and observations (do they
write "if x == True" or do they just write "if x")
- basic file-processing (iterating over a file's lines)
- basic understanding of exception handling

Broader Basic Python:
=====================
- questions about the standard library ("do you know if
there's a standard library for doing X?", or "in which
library would you find [common functionality Y]?") Most
of these are related to the more common libraries such as
os/os.path/sys/re/itertools
- questions about iterators/generators
- questions about map/reduce/sum/etc family of functions
- questions about "special" methods (__foo__)

More Advanced Python:
=====================
- can they manipulate functions as first-class objects
(Python makes it easy, but do they know how)
- more detailed questions about the std. libraries (such as
datetime/email/csv/zipfile/networking/optparse/unittest)
- questions about testing (unittests/doctests)
- questions about docstrings vs. comments, and the "Why" of
them
- more detailed questions about regular expressions
- questions about mutability
- keyword/list parameters and unpacked kwd args
- questions about popular 3rd-party toolkits (BeautifulSoup,
pyparsing...mostly if they know about them and when to use
them, not so much about implementation details)
- questions about monkey-patching
- questions about PDB
- questions about properties vs. getters/setters
- questions about classmethods
- questions about scope/name-resolution
- use of lambda

Python History:
===============
- decorators added in which version?
- "batteries included" SQL-capable DB in which version?
- the difference between "class Foo" and "class Foo(object)"
- questions from "import this" about pythonic code

Python Resources:
=================
- what do they know about various Python web frameworks
(knowing a few names is usually good enough, though
knowledge about the frameworks is a nice plus) such as
Django, TurboGears, Zope, etc.
- what do they know about various Python GUI frameworks and
the pros/cons of them (tkinter, wx, pykde, etc)
- where do they go with Python related questions (c.l.p,
google, google-groups, etc)

Other Process-related things:
=============================
- do they use revision control
(RCS/CVS/Subversion/Mercurial/Git...anything but VSS) and
know how to use it well
- do they write automated tests for their code

Touchy-feely things:
====================
- tabs vs. spaces, and their reasoning
- reason for choosing Python
- choice of editor/IDE
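A few of the "Basic Python" items from the list can be sketched in a handful of lines. This is only my own illustration of what an interviewer might look for, not part of Tim Chase's post; all names in it are made up for the example.

```python
# tuple unpacking for assignment
name, version = ('python', 2.7)

# string building: collect the pieces in a list, then join once
parts = []
for word in ('a', 'b', 'c'):
    parts.append(word.upper())
joined = ', '.join(parts)


# truth-value testing: "if x" rather than "if x == True"
def is_ready(flag):
    if flag:
        return 'ready'
    return 'not ready'


# a list comprehension that is still easy to read
squares = [n * n for n in range(5)]
```

Note that a non-empty list such as [1] is true in "if flag" but is not equal to True, which is exactly why "if x == True" is the wrong test.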

Friday, October 22, 2010

Mercurial merging tip

This trick helps when doing a lot of merges from the command line:
hg merge `hgrev dev`
where hgrev is a bash function for finding the changeset hash:
hgrev() {
    local headname=$1
    # the first line of "hg heads" looks like "changeset:   42:0123456789ab"; keep the hash
    local rev=`hg heads $headname | head -n 1 | gawk '{split($2,a,":"); print a[2]}'`
    echo $rev
}

Monday, October 11, 2010

write readable code

Readability is king. In a language whose success is due largely to its improvements in readability, everything comes down to profound and aesthetic forms. Have you thought about it this way: the code as we read it is not the same code as we write it? Indeed, "encoding" and "decoding" constantly alternate during software development. But what I personally find most irritating is the expression "to write code", or simply "to code", which is carelessly applied to the activity of composing programs. It is not only the semantics that suffers, but the very creative soul of the intellectual act.

Coding is all about translating the language of a concept into the language of a program. In practice, however, what is so often produced is not code but stenography. A perfect programming solution is already a language in itself. An average developer is first lured by the promising advantages of a certain language, then steadily and dogmatically converted by its "exciting" charms, and finally starts desperately advocating something which he does not really master or understand. This painful evolution normally repeats itself, as our newly matured adept continues the bloody expansion by converting new casualties. But the trick with any formal language is that it is no more than a handy instrument for arranging a group of symbols grammatically. All the rest of the work is left to the author of the program. It is the author who declares variables and drives the semantic wheel of naming identifiers. It is he who governs the distribution of complexity across vast functionality. It is he who makes things come alive by telling the story behind the algorithm.

So the real question is: how can a good program be defined? Code that is self-explanatory is a masterpiece of communication between humans (leaving aside the computer as a mere necessity), and so must automatically be graded as a good program. Such code does not require any artificial documentation or redundant comments. Such code is pure logical semantics - above any formalism or hyper-consistency.

The introduction to this post was indeed dramatic overkill. An example is needed to bring the discussion onto firm ground.

Python is one of those languages that, although it set out to grasp the impossible - to propose a constitution for perfect formalization - and, IMHO, failed, has nonetheless done better than most of its competitors. It is always up to the Python developer to resolve, case by case, the design dogma "there is one and only one way to do something".

The following two expressions, which are equivalent for simple path components such as the strings a, b, c, are a good example based on os module usage:

import os
assert os.path.join('a', 'b', 'c') == os.sep.join(('a', 'b', 'c'))

Mastering the ability to decide which construction is preferable for code readability can turn out to be an important asset in producing consistent code.
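The two spellings only agree for plain components, which is one more reason to choose between them deliberately. The sketch below is my own illustration. It uses posixpath (the module behind os.path on POSIX systems) and a literal '/' so that the results are the same on any platform.

```python
import posixpath

# Plain components: both spellings agree.
plain = ('a', 'b', 'c')
same = posixpath.join(*plain) == '/'.join(plain)

# A component that is already absolute restarts the path in join(),
# while str.join() just glues the pieces together.
joined_abs = posixpath.join('a', '/b')
glued_abs = '/'.join(('a', '/b'))

# A trailing separator is not doubled by join().
joined_trailing = posixpath.join('a/', 'b')
glued_trailing = '/'.join(('a/', 'b'))
```

So os.path.join says "build a path", while os.sep.join says "concatenate strings with a separator"; the first usually reads closer to the intent.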

Another important example is:

a = ''
assert (len(a) <= 0) == (not a)

The implicit conversion of a zero-length string or list to boolean False can turn out to be priceless in producing shorter, yet more readable forms.
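As a small illustration of that shorter form, here is a sketch with a hypothetical helper, describe, that relies on the truth value of a container instead of comparing its length with zero.

```python
def describe(items):
    # An empty string, list, tuple, dict or set is false in a boolean context.
    if not items:
        return 'empty'
    return 'has %d item(s)' % len(items)
```

One caveat: "not x" is also true for None and for the number 0, so the short form is equivalent to the length check only when x is known to be a container. The string '0', for instance, is non-empty and therefore true.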

There are many more examples. Most of them will be familiar from studying PEP 8 (Programming Recommendations). Others are pieces of valuable practical knowledge, collected over time.

Saturday, July 25, 2009

youtube, download, python

Well, I was actually too lazy to install any Firefox plugins, and just wanted to have a wget-style cli script at hand to do the job:

  1. get the html from youtube by the initial url,
  2. parse the html and work out a proper url to obtain the video stream,
  3. download the flv stream into the desired file,
  4. and do some nice printing, maybe even a progress bar.

So this is pretty amazing - it took me less than 15 minutes to google, read, download and patch, and that was it. Indeed, I had truly underestimated the power of "pythonic" python: any average user, aka software developer, is capable of doing miracles with the tool.




#!/usr/bin/env python2.5
# encoding: utf-8

import urlgrabber, re
import urlgrabber.progress
import sys, string

def get_video_url(url):
    # extract the video id from the "?v=..." query parameter
    video_id = re.compile("\?v=([^&]*)").findall(url)[0]
    a = urlgrabber.urlread(url)
    # the page embeds the stream parameters in a watch_fullscreen link
    param = re.compile("watch_fullscreen\?([^\']*)\'\;").findall(a)
    if len(param) == 0:
        print 'video params are not found'
        sys.exit()
    params = param[0].split('&') # break url params
    for param in params:
        if param[0:2] == 't=':
            param = param[2:] # keep only the value of the "t" token
            break
    return "http://youtube.com/get_video.php?video_id=" + video_id \
        + "&t=" + param

#===============================================================================
# cli
#===============================================================================
if len(sys.argv) == 3:
    url,file_name = sys.argv[1:]
else:
    url = raw_input("Enter the URL: ")
    file_name = raw_input("Enter a filename: ")

url = get_video_url(url)

#===============================================================================
# output header
#===============================================================================
hr = '*'*80 +"\n"
msg = hr + ('* %s: %s' % (string.replace(('Save to file').zfill(14),'0',' '),
                          file_name)) + "\n"
msg += ('* %s: %s' % (string.replace(('Video Url').zfill(14),'0',' '),
                      url)) + "\n"
print msg

#===============================================================================
# download
#===============================================================================
prog = urlgrabber.progress.text_progress_meter()
urlgrabber.urlgrab(url, file_name, progress_obj=prog)


First of all, big thanks to the python community (which actually shares and provides all these interesting results for my googling activities). The urlgrabber module also seems to be a nice piece of work.

As for the code - it is self-explanatory, if one ignores the get_video_url primitive.
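The first step of get_video_url, pulling the video id out of the address, can also be done without a regular expression. The sketch below is my own alternative, not what the script does. It assumes Python 3 and its standard urllib.parse module, and the helper name extract_video_id is made up for the example.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the value of the 'v' query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get('v')
    return values[0] if values else None
```

Unlike findall(url)[0], this returns None for an address without a "v" parameter instead of raising IndexError.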

Friday, July 24, 2009

python decorators and protected methods

Decorators are a nice tool that adds flexibility to python applications by annotating, or "decorating", functions and methods.

Let us briefly review some basic concepts found in python. Python has an interesting mix of object-oriented and modular programming paradigms, among others. That is, there is a module mechanism, with functions and attributes, and a class mechanism (old-style "class A():" and new-style "class A(object):"), with methods and attributes (properties - not to be confused with the @property decorator). Classes are conceptually similar to modules, but not vice versa. Below we will talk about how to decorate methods.

As you know, python marks private names by convention with a leading underscore ("_"; a double underscore "__" additionally triggers name mangling). Everything else (attributes, methods) is public. This can be understood as the python way of seeing things - minimalistic programming.
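A minimal sketch of how the underscore conventions behave in practice. The class Account and its attributes are hypothetical names chosen for the example.

```python
class Account(object):
    def __init__(self):
        self.owner = 'alice'   # public
        self._balance = 10     # non-public by convention only
        self.__pin = 1234      # name-mangled to _Account__pin


acct = Account()
```

Nothing is actually enforced: acct._balance is readable from anywhere, and the double-underscore attribute is still reachable as acct._Account__pin. Only the plain name acct.__pin fails outside the class body.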

So, for creating static methods we have the @staticmethod decorator, and for class methods the @classmethod decorator.
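A minimal sketch of how the built-in @staticmethod and @classmethod decorators differ. The Temperature class is a hypothetical example, not something from a library.

```python
class Temperature(object):
    def __init__(self, celsius):
        self.celsius = celsius

    @staticmethod
    def to_fahrenheit(celsius):
        # receives neither the instance nor the class
        return celsius * 9.0 / 5.0 + 32.0

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # receives the class, so subclasses get instances of themselves
        return cls((fahrenheit - 32.0) * 5.0 / 9.0)


class LabTemperature(Temperature):
    pass
```

Temperature.to_fahrenheit(100) is just a function living in the class namespace, while LabTemperature.from_fahrenheit(32) builds a LabTemperature because cls is bound to the class the call was made on.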

We will build a similar decorator to imitate protected methods (as in Java, for example), in the following manner:

##
# Decorator @protected
#
def protected(annotated_func):
    '''Decorator @protected - function to be applied for "protected" method behavior emulation'''
    import inspect
    # important: the class does not actually exist at the moment the decorator
    # runs, i.e. annotated_func is a plain function, because the class is
    # still in the process of being declared
    def protected_func(*args, **kwds):
        self = args[0]
        invoking_self = None
        # work with stack frames to obtain the source of the invocation
        frames = inspect.stack()
        frame = frames[1][0]
        try:
            # None when the caller is not a method (has no local 'self')
            invoking_self = frame.f_locals.get('self')
            if not self == invoking_self \
                    or not isinstance(invoking_self, self.__class__): # access only from heir or itself
                raise Exception('Attempting to access an instancemethod \'%s\' in class \'%s\' '
                                'that is protected' % (annotated_func.__name__, self.__class__.__name__))
        finally:
            # drop the frame references to avoid reference cycles
            del frames
            del frame
            del invoking_self
            del self
        return annotated_func(*args, **kwds)
    protected_func.__name__ = annotated_func.__name__
    return protected_func
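To show how such a decorator is meant to be used, here is a compact, self-contained restatement together with a usage example. It is a simplified sketch rather than the exact code above: it uses functools.wraps and inspect.currentframe (my substitutions), and it compares the caller's self by identity. Base, Child and call_from_outside are hypothetical names.

```python
import functools
import inspect


def protected(func):
    """Compact restatement of the @protected idea (illustrative sketch)."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwds):
        # Look at the caller's frame: the call is allowed only when the
        # caller's local 'self' is the very same object.
        caller = inspect.currentframe().f_back
        try:
            if caller.f_locals.get('self') is not self:
                raise Exception('method %r of class %r is protected'
                                % (func.__name__, type(self).__name__))
        finally:
            del caller  # do not keep the frame alive through a reference cycle
        return func(self, *args, **kwds)
    return wrapper


class Base(object):
    @protected
    def helper(self):
        return 'secret'

    def public(self):
        return self.helper()   # allowed: called from a method of the same object


class Child(Base):
    def also_public(self):
        return self.helper()   # allowed: an heir calling it on itself


def call_from_outside(obj):
    # Returns the error message when access is refused, otherwise None.
    try:
        obj.helper()
    except Exception as exc:
        return str(exc)
    return None
```

Calls that go through public() or also_public() succeed, while call_from_outside(Base()) is refused because the calling frame has no matching self. Keep in mind that this is frame inspection, so it imitates protected access rather than enforcing it.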