Saturday, July 25, 2009

youtube, download, python

Well, I was too lazy to install a Firefox plugin, and just wanted a wget-style CLI script at hand to do the job:

  1. get the html from youtube by the initial url,
  2. parse the html and work out the proper url of the video stream,
  3. download the flv stream into the desired file,
  4. and do some nice printing, maybe even a progress bar.

So this is pretty amazing: it took me less than 15 minutes to google, read, download and patch, and that was it. Indeed, I had truly underestimated the power of "pythonic" Python: any average user, a.k.a. software developer, is capable of doing miracles with the tool.




#!/usr/bin/env python2.5
# encoding: utf-8

import urlgrabber, re
import urlgrabber.progress
import sys, string

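def get_video_url(url):
    # the video id is the value of the "v" query parameter of the watch url
    video_id = re.compile(r"\?v=([^&]*)").findall(url)[0]
    a = urlgrabber.urlread(url)
    # the page embeds the player parameters in a "watch_fullscreen?..." string
    param = re.compile(r"watch_fullscreen\?([^']*)';").findall(a)
    if len(param) == 0:
        print 'video params are not found'
        sys.exit(1)
    params = param[0].split('&') # break url params
    for param in params:
        if param[0:2] == 't=':
            token = param[2:]
            break
    else:
        # no "t=" parameter at all: do not fall through with a wrong value
        print 'video token is not found'
        sys.exit(1)
    return "http://youtube.com/get_video.php?video_id=" + video_id \
        + "&t=" + token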

#===============================================================================
# cli
#===============================================================================
if len(sys.argv) == 3:
    url, file_name = sys.argv[1:]
else:
    url = raw_input("Enter the URL: ")
    file_name = raw_input("Enter a filename: ")

url = get_video_url(url)

#===============================================================================
# output header
#===============================================================================
hr = '*'*80 + "\n"
# right-align the labels in a 14-character column
msg = hr + ('* %s: %s' % ('Save to file'.rjust(14), file_name)) + "\n"
msg += ('* %s: %s' % ('Video Url'.rjust(14), url)) + "\n"
print msg

#===============================================================================
# download
#===============================================================================
prog = urlgrabber.progress.text_progress_meter()
urlgrabber.urlgrab(url, file_name, progress_obj=prog)


First of all, big thanks to the Python community, which shares all these interesting results for googling. The urlgrabber module also seems to be a nice piece of work.

As for the code, it is self-explanatory, if one ignores the get_video_url helper.
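
For readers who do want to look inside get_video_url: it pulls two values out of text, the video id from the watch url and the "t" token from the player parameters found in the page. Below is a minimal sketch of just that extraction step, written for Python 3's urllib.parse rather than the Python 2.5 the script targets. The function names and the sample strings are made up for illustration, and no network access is involved.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(watch_url):
    """Return the value of the 'v' query parameter, or None if it is absent."""
    values = parse_qs(urlparse(watch_url).query).get("v")
    return values[0] if values else None


def extract_token(player_params):
    """Return the value of 't' from an '&'-separated parameter string."""
    values = parse_qs(player_params).get("t")
    return values[0] if values else None


# Made-up inputs, for illustration only
watch_url = "http://youtube.com/watch?v=abc123&feature=related"
player_params = "video_id=abc123&l=42&t=TOKEN&sk=xyz"

print(extract_video_id(watch_url), extract_token(player_params))
```

Using a query-string parser instead of slicing on "t=" avoids picking up the wrong parameter when the token is missing.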

Friday, July 24, 2009

python decorators and protected methods

Decorators are a nice tool for adding flexibility to Python applications by annotating, or "decorating", functions and methods.

Let us briefly review some basic concepts that we meet in Python. Python has an interesting mix of object-oriented and modular programming paradigms, among others. That is, there is the module mechanism, with functions and attributes, and the class mechanism (old-style "class A():" and new-style "class A(object):"), with methods and attributes (properties, not to be confused with the @property decorator). Classes are conceptually similar to modules, but not vice versa. From here on, we will talk about how to decorate methods.

As you know, Python has a naming convention for private names: they start with an underscore (a double underscore, "__", is the stricter form, since it also triggers name mangling). Everything else (attributes, methods) is public. This can be understood as the Python way of seeing things: minimalistic programming.
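
A small sketch of what the underscore convention does and does not do. The class and attribute names are made up for illustration.

```python
class Account(object):
    def __init__(self):
        self.owner = "alice"   # public
        self._balance = 10     # private by convention only
        self.__pin = 1234      # stored under the mangled name _Account__pin


acct = Account()
print(acct._balance)           # reachable: a convention, not enforcement
print(hasattr(acct, "__pin"))  # the unmangled name does not exist on the instance
print(acct._Account__pin)      # the mangled name is still reachable
```

Neither form actually blocks access. The single underscore is a hint to the reader, and the double underscore only renames the attribute.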

So, for creating static methods we have the @staticmethod decorator (and @classmethod for methods that receive the class itself as their first argument).
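
To illustrate the two built-in decorators @staticmethod and @classmethod, here is a short sketch. The Temperature and Kelvin classes are hypothetical.

```python
class Temperature(object):
    scale = "C"

    @staticmethod
    def to_fahrenheit(celsius):
        # no self or cls: a plain function living in the class namespace
        return celsius * 9.0 / 5.0 + 32

    @classmethod
    def describe(cls):
        # receives the class (not an instance) as the first argument
        return "%s uses scale %s" % (cls.__name__, cls.scale)


class Kelvin(Temperature):
    scale = "K"


print(Temperature.to_fahrenheit(100))
print(Kelvin.describe())
```

A class method sees the class it was called on, so Kelvin.describe() reports the subclass, while a static method knows nothing about either.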

We will build a similar decorator to imitate protected methods (for example, like in Java), in the following manner:

##
# Decorator @protected
#
def protected(annotated_func):
    '''Decorator @protected - function to be applied for "protected" property behavior emulation'''
    import inspect
    # important: the class does not actually exist at the moment the decorator runs,
    # i.e. annotated_func is a plain function, because the class is still in the
    # process of declaration (is not declared yet)
    def protected_func(*args, **kwds):
        self = args[0]
        invoking_self = None
        # work with stack frames to obtain the source of the invocation
        frames = inspect.stack()
        frame = frames[1][0]
        try:
            # the caller has no 'self' when it is not a method (e.g. module-level code)
            invoking_self = frame.f_locals.get('self')
            if not self == invoking_self \
                    and not isinstance(invoking_self, self.__class__): # access only from heir or itself
                raise Exception("Attempting to access an instancemethod '%s' in class '%s' "
                                "that is protected" % (annotated_func.func_name, self.__class__.__name__))
        finally:
            # drop the frame references to avoid reference cycles
            del frames
            del frame
            del invoking_self
            del self
        return annotated_func(*args, **kwds)
    protected_func.func_name = annotated_func.func_name
    return protected_func
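
To see how such a decorator behaves in use, here is a self-contained sketch of the same idea ported to Python 3, which is an assumption on my part, since the listing above relies on the Python 2 func_name attribute. It uses functools.wraps and inspect.currentframe instead, raises AttributeError rather than a bare Exception, and the Base, Child and call_from_outside names are made up for illustration. It is a sketch of the technique, not a drop-in replacement.

```python
import functools
import inspect


def protected(func):
    """Allow calls only from a method whose own 'self' is the same object
    or an instance of the target's class (or of a subclass)."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        caller = inspect.currentframe().f_back
        try:
            # None when the caller is not a method (it has no local 'self')
            caller_self = caller.f_locals.get("self")
        finally:
            del caller  # do not keep the caller's frame alive
        if caller_self is not self and not isinstance(caller_self, type(self)):
            raise AttributeError("%r is protected in class %r"
                                 % (func.__name__, type(self).__name__))
        return func(self, *args, **kwargs)
    return wrapper


class Base(object):
    @protected
    def _helper(self):
        return "helped"

    def public(self):
        return self._helper()      # allowed: called from the object itself


class Child(Base):
    def also_public(self):
        return self._helper()      # allowed: called from an heir


def call_from_outside(obj):
    # a plain function has no 'self' local, so the call is rejected
    return obj._helper()


print(Base().public())
print(Child().also_public())
try:
    call_from_outside(Base())
except AttributeError as exc:
    print("blocked:", exc)
```

The check relies on the caller naming its first argument "self", so it is a convention-based guard rather than real access control.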