双木成林:喋喋不休

I leave no trace of wings in the air, but I am glad I have had my flight.

A Brief Look at the C++ pImpl Idiom

without comments

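It has been a long time since I last blogged. I recently picked C++ back up and have been reading Effective Modern C++. When I reached the material on the pImpl idiom, a topic I had never fully understood, I decided to dig into it myself.

First, as I see it, the pImpl idiom serves two main purposes. One is to separate the implementation from the interface. The other is to cut recompilation time: when a header that the implementation depends on changes, the class's own header stays the same, so the code that depends on the class does not all have to be recompiled.

My first version looked like this: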

#include <memory>
#include <string>
class Person {
public:
    Person(std::string name);
    void set_name(std::string name);
    std::string name();
private:
    struct Impl;
    std::unique_ptr<Impl> pImpl_;
};

Compiling it produced an error:

error: invalid application of ‘sizeof’ to an incomplete type ‘Person::Impl’
static_assert(sizeof(_Tp) > 0, "default_delete can not delete incomplete type");
^~~~~~~~~~~

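The cause is that I did not declare a destructor for Person, so the compiler generates one implicitly. That implicit destructor destroys the pImpl_ unique_ptr, and because the compiler generates it inline, the sizeof(_Tp) check shown above is effectively instantiated in every translation unit that only sees the header.

At that point struct Impl; is merely a forward declaration, so Impl is an incomplete type and the compiler does not know its size. Hence the error.

The fix is to declare a destructor in Person so that the compiler no longer generates the implicit one. That raises another issue, though: I do not need any special handling in the destructor, so I would still like to use the compiler-generated body. What to do?

The answer is to define the destructor in the implementation file, where Impl is a complete type, using = default, which was introduced in C++11.

The new code: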

person.hpp
class Person {
public:
    Person(std::string name);
    virtual ~Person();
    void set_name(std::string name);
    std::string name();
private:
    struct Impl;
    std::unique_ptr<Impl> pImpl_;
};
person.cpp
Person::~Person() = default;

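Problem solved.

To check whether I really understand something, I like to form a hypothesis and then verify it against what actually happens. My hypothesis here: if I move Person::~Person() = default; into the header, so that the destructor is once again generated inline, the same compile error should come back.

The code becomes: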

class Person {
public:
    Person(std::string name);
    virtual ~Person() = default;
    void set_name(std::string name);
    std::string name();
private:
    struct Impl;
    std::unique_ptr<Impl> pImpl_;
};

As expected, the same compile error appeared, which confirms the explanation.

Written by linluxiang

March 13th, 2015 at 3:28 PM

Posted in Uncategorized

Fix ldconfig not found in debian lenny

with one comment

Thx to http://ubuntuforums.org/showthread.php?t=1266104

The point is

1. Get hold of ldconfig & ldconfig.real from the correct version:

Code: aptitude download libc6

unpack it:

Code: dpkg-deb -x libc6*.deb libc6-unpacked/

copy them out: Code: sudo cp libc6-unpacked/sbin/ldconfig* /sbin/

At this point I did:

Code: sudo apt-get -f install

sudo dpkg-reconfigure libc6

sudo dpkg-reconfigure libc6-i386

sudo apt-get install --reinstall libc6

sudo apt-get install --reinstall libc6-i386

Of which the reinstalls seemed to be the crucial ones.

Now all errors seem to be gone.

Written by linluxiang

October 19th, 2011 at 7:04 PM

Posted in Uncategorized

Stay Hungry, Stay Foolish

without comments

I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.

The first story is about connecting the dots.

I dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?

It started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. Except that when I popped out they decided at the last minute that they really wanted a girl. So my parents, who were on a waiting list, got a call in the middle of the night asking: “We have an unexpected baby boy; do you want him?” They said: “Of course.” My biological mother later found out that my mother had never graduated from college and that my father had never graduated from high school. She refused to sign the final adoption papers. She only relented a few months later when my parents promised that I would someday go to college.

And 17 years later I did go to college. But I naively chose a college that was almost as expensive as Stanford, and all of my working-class parents’ savings were being spent on my college tuition. After six months, I couldn’t see the value in it. I had no idea what I wanted to do with my life and no idea how college was going to help me figure it out. And here I was spending all of the money my parents had saved their entire life. So I decided to drop out and trust that it would all work out OK. It was pretty scary at the time, but looking back it was one of the best decisions I ever made. The minute I dropped out I could stop taking the required classes that didn’t interest me, and begin dropping in on the ones that looked interesting.

It wasn’t all romantic. I didn’t have a dorm room, so I slept on the floor in friends’ rooms, I returned coke bottles for the 5¢ deposits to buy food with, and I would walk the 7 miles across town every Sunday night to get one good meal a week at the Hare Krishna temple. I loved it. And much of what I stumbled into by following my curiosity and intuition turned out to be priceless later on. Let me give you one example:

Reed College at that time offered perhaps the best calligraphy instruction in the country. Throughout the campus every poster, every label on every drawer, was beautifully hand calligraphed. Because I had dropped out and didn’t have to take the normal classes, I decided to take a calligraphy class to learn how to do this. I learned about serif and san serif typefaces, about varying the amount of space between different letter combinations, about what makes great typography great. It was beautiful, historical, artistically subtle in a way that science can’t capture, and I found it fascinating.

None of this had even a hope of any practical application in my life. But ten years later, when we were designing the first Macintosh computer, it all came back to me. And we designed it all into the Mac. It was the first computer with beautiful typography. If I had never dropped in on that single course in college, the Mac would have never had multiple typefaces or proportionally spaced fonts. And since Windows just copied the Mac, it’s likely that no personal computer would have them. If I had never dropped out, I would have never dropped in on this calligraphy class, and personal computers might not have the wonderful typography that they do. Of course it was impossible to connect the dots looking forward when I was in college. But it was very, very clear looking backwards ten years later.

Again, you can’t connect the dots looking forward; you can only connect them looking backwards. So you have to trust that the dots will somehow connect in your future. You have to trust in something — your gut, destiny, life, karma, whatever. This approach has never let me down, and it has made all the difference in my life.

My second story is about love and loss.

I was lucky — I found what I loved to do early in life. Woz and I started Apple in my parents garage when I was 20. We worked hard, and in 10 years Apple had grown from just the two of us in a garage into a $2 billion company with over 4000 employees. We had just released our finest creation — the Macintosh — a year earlier, and I had just turned 30. And then I got fired. How can you get fired from a company you started? Well, as Apple grew we hired someone who I thought was very talented to run the company with me, and for the first year or so things went well. But then our visions of the future began to diverge and eventually we had a falling out. When we did, our Board of Directors sided with him. So at 30 I was out. And very publicly out. What had been the focus of my entire adult life was gone, and it was devastating.

I really didn’t know what to do for a few months. I felt that I had let the previous generation of entrepreneurs down – that I had dropped the baton as it was being passed to me. I met with David Packard and Bob Noyce and tried to apologize for screwing up so badly. I was a very public failure, and I even thought about running away from the valley. But something slowly began to dawn on me — I still loved what I did. The turn of events at Apple had not changed that one bit. I had been rejected, but I was still in love. And so I decided to start over.

I didn’t see it then, but it turned out that getting fired from Apple was the best thing that could have ever happened to me. The heaviness of being successful was replaced by the lightness of being a beginner again, less sure about everything. It freed me to enter one of the most creative periods of my life.

During the next five years, I started a company named NeXT, another company named Pixar, and fell in love with an amazing woman who would become my wife. Pixar went on to create the worlds first computer animated feature film, Toy Story, and is now the most successful animation studio in the world. In a remarkable turn of events, Apple bought NeXT, I returned to Apple, and the technology we developed at NeXT is at the heart of Apple’s current renaissance. And Laurene and I have a wonderful family together.

I’m pretty sure none of this would have happened if I hadn’t been fired from Apple. It was awful tasting medicine, but I guess the patient needed it. Sometimes life hits you in the head with a brick. Don’t lose faith. I’m convinced that the only thing that kept me going was that I loved what I did. You’ve got to find what you love. And that is as true for your work as it is for your lovers. Your work is going to fill a large part of your life, and the only way to be truly satisfied is to do what you believe is great work. And the only way to do great work is to love what you do. If you haven’t found it yet, keep looking. Don’t settle. As with all matters of the heart, you’ll know when you find it. And, like any great relationship, it just gets better and better as the years roll on. So keep looking until you find it. Don’t settle.

My third story is about death.

When I was 17, I read a quote that went something like: “If you live each day as if it was your last, someday you’ll most certainly be right.” It made an impression on me, and since then, for the past 33 years, I have looked in the mirror every morning and asked myself: “If today were the last day of my life, would I want to do what I am about to do today?” And whenever the answer has been “No” for too many days in a row, I know I need to change something.

Remembering that I’ll be dead soon is the most important tool I’ve ever encountered to help me make the big choices in life. Because almost everything — all external expectations, all pride, all fear of embarrassment or failure – these things just fall away in the face of death, leaving only what is truly important. Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose. You are already naked. There is no reason not to follow your heart.

About a year ago I was diagnosed with cancer. I had a scan at 7:30 in the morning, and it clearly showed a tumor on my pancreas. I didn’t even know what a pancreas was. The doctors told me this was almost certainly a type of cancer that is incurable, and that I should expect to live no longer than three to six months. My doctor advised me to go home and get my affairs in order, which is doctor’s code for prepare to die. It means to try to tell your kids everything you thought you’d have the next 10 years to tell them in just a few months. It means to make sure everything is buttoned up so that it will be as easy as possible for your family. It means to say your goodbyes.

I lived with that diagnosis all day. Later that evening I had a biopsy, where they stuck an endoscope down my throat, through my stomach and into my intestines, put a needle into my pancreas and got a few cells from the tumor. I was sedated, but my wife, who was there, told me that when they viewed the cells under a microscope the doctors started crying because it turned out to be a very rare form of pancreatic cancer that is curable with surgery. I had the surgery and I’m fine now.

This was the closest I’ve been to facing death, and I hope it’s the closest I get for a few more decades. Having lived through it, I can now say this to you with a bit more certainty than when death was a useful but purely intellectual concept:

No one wants to die. Even people who want to go to heaven don’t want to die to get there. And yet death is the destination we all share. No one has ever escaped it. And that is as it should be, because Death is very likely the single best invention of Life. It is Life’s change agent. It clears out the old to make way for the new. Right now the new is you, but someday not too long from now, you will gradually become the old and be cleared away. Sorry to be so dramatic, but it is quite true.

Your time is limited, so don’t waste it living someone else’s life. Don’t be trapped by dogma — which is living with the results of other people’s thinking. Don’t let the noise of others’ opinions drown out your own inner voice. And most important, have the courage to follow your heart and intuition. They somehow already know what you truly want to become. Everything else is secondary.

When I was young, there was an amazing publication called The Whole Earth Catalog, which was one of the bibles of my generation. It was created by a fellow named Stewart Brand not far from here in Menlo Park, and he brought it to life with his poetic touch. This was in the late 1960′s, before personal computers and desktop publishing, so it was all made with typewriters, scissors, and polaroid cameras. It was sort of like Google in paperback form, 35 years before Google came along: it was idealistic, and overflowing with neat tools and great notions.

Stewart and his team put out several issues of The Whole Earth Catalog, and then when it had run its course, they put out a final issue. It was the mid-1970s, and I was your age. On the back cover of their final issue was a photograph of an early morning country road, the kind you might find yourself hitchhiking on if you were so adventurous. Beneath it were the words: “Stay Hungry. Stay Foolish.” It was their farewell message as they signed off. Stay Hungry. Stay Foolish. And I have always wished that for myself. And now, as you graduate to begin anew, I wish that for you.

Stay Hungry. Stay Foolish.

Thank you all very much.

 

 

Written by linluxiang

October 7th, 2011 at 12:19 PM

Posted in Uncategorized

Don't Use Relative Paths in iOS Development

with one comment

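While writing an iOS app today I ran into a problem: [NSData dataWithContentsOfFile:@"foo"] never read the correct file contents, whereas [NSData dataWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"foo" ofType:@""]] worked.

After some Googling I found out why. When you use a relative path, the current directory it is resolved against is not the directory the app runs from but "/". Only a path built through [NSBundle mainBundle] is the file's real path.

Note to self: in future development, never use relative paths directly; always use a computed absolute path.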

Written by linluxiang

April 13th, 2011 at 5:51 PM

Posted in Uncategorized

Have Some Sympathy for the Kid with the Runs!

without comments

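I went on a trip to Wuyuan and came down with acute gastroenteritis: 13 trips to the toilet in a single day. Worse, most of them took place along the expressway. I doubt I will ever again get to sample so many expressway service-area toilets. The most luxurious ones flushed automatically; the crudest was just a trench. The strangest was one where pressing the valve in the next stall flushed mine. Awkward. Most, of course, were unremarkable, at worst with a broken door or no water.

Special thanks go to a certain classmate who, besides tucking in my blanket and supplying tissues, heroically stood in front of the coach to stop the driver from sneaking off without me. Such devotion deserves songs and tears. Consider this a commendation and an encouragement. I should add that, on top of all that, this classmate also finished an e-book, defeated the zombies invading the backyard, set a new record on the taiko drums, and travelled back to the Three Kingdoms era to serve a stint as a general. Truly multi-talented, a model child with all-round development. Where else do you find a kid like that?

Finally, a solemn reminder to everyone: when travelling, always bring enough toilet paper!!! Although, to be fair, I did bring enough.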

Written by linluxiang

April 6th, 2011 at 8:11 PM

Posted in Mood, Life

A First Look at WSGI

with one comment

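This post comes from my old JavaEye blog. Original address

Preface

This article does not walk through the WSGI specification in detail, nor does it provide a complete implementation of it; the descriptions are even mixed with my own opinions about WSGI. For the official definition of everything WSGI, see http://www.python.org/dev/peps/pep-3333/

What is WSGI?

Officially, WSGI is the Python Web Server Gateway Interface. As the name suggests, it is a gateway, and the job of a gateway is to translate between protocols.

In other words, WSGI is like a bridge, with the web server on one side and the user's application on the other. The bridge is quite limited, though, and sometimes needs other bridges to help get the job done.

Here are the definitions of some terms used in this article. A wsgi app, or application, is simply a WSGI application. A wsgi container, or container: this part is usually called a handler, but I find that too easy to confuse with the app, so I call it a container. A wsgi middleware, or middleware, is a special kind of program whose whole job is to get up to mischief between the container and the application.

A picture is worth a thousand words, so here is a diagram of the WSGI architecture as I understand it.

As you can see, the server, the container, and the application have a rather tangled relationship. The rest of this article untangles it.

WSGI applications

A WSGI application is just a callable object. As the simplest possible example, suppose we have the following application: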

def application(environ, start_response):
    status = '200 OK'
    output = 'World!'
    # Content-Length is 12: 'Hello ' (6 bytes) + 'World!' (6 bytes)
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(12))]
    write = start_response(status, response_headers)
    write('Hello ')
    return [output]

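This WSGI application is simple to the point of being crude, but it really is a fully functional WSGI application. It does leave a lot of open questions, though. What is environ? What is start_response? Why can the body be returned through both write and return?

Let's guess. Thinking of CGI, environ is probably a set of environment variables describing the HTTP request, such as the method. start_response probably accepts the HTTP response status and headers and returns a write function that sends the HTTP response body to the client. The return value is naturally the HTTP response body too. But then what is the difference between write and the return value? Could it be that the caller simply calls write on whatever the application returns? And why is the return value a list? That suggests the caller iterates over the application's result when producing output. Does that mean it implicitly supports iterators or generators?

Hold on: the application's result? Since an application is just a function, something must call it. We can guess that this something passes environ and start_response to the application and sends the application's return value to the client. So what is it? The WSGI container, of course.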
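To make the guesses above concrete, here is a minimal Python 3 sketch of the same application together with a tiny in-process caller. The helper name call_app is hypothetical and not part of WSGI; it only stands in for whatever invokes the application. Note that under Python 3, PEP 3333 requires body chunks to be bytes.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Python 3 version of the example app; body chunks must be bytes."""
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", "12")]
    write = start_response("200 OK", headers)
    write(b"Hello ")          # first path: the legacy write() callable
    return [b"World!"]        # second path: the returned iterable


def call_app(app, environ):
    """Hypothetical stand-in for a container: run the app in-process."""
    captured = {}
    written = []

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers
        return written.append  # plays the role of write()

    result = app(environ, start_response)
    try:
        returned = list(result)  # iterate over whatever the app returned
    finally:
        close = getattr(result, "close", None)
        if close is not None:
            close()
    return captured["status"], captured["headers"], b"".join(written + returned)


environ = {}
setup_testing_defaults(environ)  # fills in a plausible fake request
status, headers, body = call_app(application, environ)
print(status, body)
```

Both output paths end up in the same byte stream, which is why the caller has to merge what was passed to write with what the iterable yields.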

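The WSGI container

First, where the term comes from: "WSGI container" is a concept I made up, borrowed from the Java Servlet container. The two seem similar to me, so I reused the name.

The job of a WSGI container is to build an environment in which a WSGI application can run successfully. Running successfully means being passed the right arguments, having the return value handled correctly, and having the result delivered to the client.

So the workflow of a WSGI container is roughly this: using whatever communication mechanism the web server prescribes, obtain the request information from the web server, package it up, pass it to the WSGI application, and send back the response correctly.

In general a WSGI container has to piggyback on an existing web server technology, such as CGI, FastCGI, or an embedded mode.

Below I use CGI to write the simplest possible WSGI container. The official specification does not say exactly how a container must be implemented; it only lists the constraints that must be respected. See the specification in PEP 3333 for details.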

#!/usr/bin/python
#encoding:utf8  

import cgi
import cgitb
import sys
import os  

#Make the environ argument
environ = {}
environ['REQUEST_METHOD'] = os.environ['REQUEST_METHOD']
environ['SCRIPT_NAME'] = os.environ['SCRIPT_NAME']
environ['PATH_INFO'] = os.environ['PATH_INFO']
environ['QUERY_STRING'] = os.environ['QUERY_STRING']
# CONTENT_TYPE and CONTENT_LENGTH may be absent (e.g. on a GET request)
environ['CONTENT_TYPE'] = os.environ.get('CONTENT_TYPE', '')
environ['CONTENT_LENGTH'] = os.environ.get('CONTENT_LENGTH', '')
environ['SERVER_NAME'] = os.environ['SERVER_NAME']
environ['SERVER_PORT'] = os.environ['SERVER_PORT']
environ['SERVER_PROTOCOL'] = os.environ['SERVER_PROTOCOL']
environ['wsgi.version'] = (1, 0)
environ['wsgi.url_scheme'] = 'http'
environ['wsgi.input']        = sys.stdin
environ['wsgi.errors']       = sys.stderr
environ['wsgi.multithread']  = False
environ['wsgi.multiprocess'] = True
environ['wsgi.run_once']     = True  

#make the start_response argument
#Note: the WSGI spec says the response headers must not be sent until there is body content to send.
sent_header = False
res_status = None
res_headers = None  

def write(body):
    global sent_header
    if sent_header:
        sys.stdout.write(body)
    else:
        print 'Status: ' + res_status  # CGI expects the status as a "Status:" header
        for k, v in res_headers:
            print k + ': ' + v
        print
        sys.stdout.write(body)
        sent_header = True  

def start_response(status, response_headers):
    global res_status
    global res_headers
    res_status = status
    res_headers = response_headers
    return write  

#here is the application
def application(environ, start_response):
    status = '200 OK'
    output = 'World!'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(12))]
    write = start_response(status, response_headers)
    write('Hello ')
    return [output]

#here run the application
result = application(environ, start_response)
for value in result:
    write(value)

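See? Implementing a WSGI container is not hard after all.

However, the container's design reveals what I consider a major flaw in how WSGI applications are designed: why provide two ways to return data? There is only one write function, yet it can be called inside the application and also called by the container on the application's return value. If I were designing it, I would drop start_response altogether and use the interface application(environ): the application takes a single argument and returns status, response_headers, and a list of strings. The actual transmission would be hidden entirely; the user would only read data from environ and process it.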
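The simplified interface proposed above can be layered on top of standard WSGI with a small adapter. This is only a sketch of that idea in Python 3: the names to_wsgi, hello, and fake_start_response are hypothetical, and the return shape (status, headers, chunks) is an assumption taken from the paragraph above rather than from any specification.

```python
def to_wsgi(simple_app):
    """Wrap a hypothetical simple_app(environ) -> (status, headers, chunks)."""
    def wsgi_app(environ, start_response):
        status, headers, chunks = simple_app(environ)
        start_response(status, headers)  # the adapter hides start_response
        return chunks
    return wsgi_app


def hello(environ):
    """An app in the simplified style: it only reads environ and returns data."""
    body = b"Hello World!"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    return "200 OK", headers, [body]


seen = {}


def fake_start_response(status, headers, exc_info=None):
    """Records what a real container would send as the status line and headers."""
    seen["status"] = status
    seen["headers"] = headers


result = to_wsgi(hello)({"REQUEST_METHOD": "GET"}, fake_start_response)
print(seen["status"], b"".join(result))
```

The adapter gives up the write callable entirely, so an app written this way has exactly one way to produce its body.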

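Happily, a quick search suggests that the application design in the web3 proposal is similar to my idea. I hope web3 catches on soon.

Middleware

Middleware is a special kind of program that can get up to mischief between the container and the application. Anyone familiar with Python decorators will notice that middleware is essentially the same thing as a decorator.

Let's implement a simple routing middleware.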

class Router(object):
    def __init__(self):
        # maps a request path to the application that handles it
        self.path_info = {}
    def route(self, environ, start_response):
        application = self.path_info[environ['PATH_INFO']]
        return application(environ, start_response)
    def __call__(self, path):
        def wrapper(application):
            self.path_info[path] = application
            # return the app so the decorated name is not rebound to None
            return application
        return wrapper

That is a very simple routing middleware. Now change the application part of the WSGI container code above to the following:

router = Router()  

#here is the application
@router('/hello')
def hello(environ, start_response):
    status = '200 OK'
    output = 'Hello'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    write = start_response(status, response_headers)
    return [output]  

@router('/world')
def world(environ, start_response):
    status = '200 OK'
    output = 'World!'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    write = start_response(status, response_headers)
    return [output]
#here run the application
result = router.route(environ, start_response)
for value in result:
    write(value)

With this, the container automatically finds and runs the app that matches the requested path.
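Since middleware is essentially a decorator around an application, the same pattern works for things other than routing. Below is a minimal Python 3 sketch; the uppercase middleware and the hello app are made up for illustration and assume the wrapped app returns a list of bytes chunks.

```python
def uppercase(app):
    """Middleware: wrap an app and upper-case every body chunk it returns."""
    def wrapped(environ, start_response):
        return [chunk.upper() for chunk in app(environ, start_response)]
    return wrapped


@uppercase
def hello(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


# Call the wrapped app directly with a do-nothing start_response.
body = b"".join(hello({}, lambda status, headers, exc_info=None: None))
print(body)
```

The wrapped object is itself a valid WSGI application, so middleware can be stacked in any order.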

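Further thoughts

The more I write, the more this looks like a framework. Building a Python web framework really is simple.

Looking at it from another angle: if you treat an application as a unit of computation and use middleware to manage IO and computing resources, you could use WSGI to assemble a distributed system.

OK, that's all.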

Written by linluxiang

March 3rd, 2011 at 8:40 PM

Posted in Python, Technology

Python Closures, Revisited

with 2 comments

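This post comes from my old JavaEye blog. Original address

A couple of days ago I wrote an article about Python closures. Today I happened to run into a small problem that is related to them, so I am writing it down.

Consider the following code: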

funcs = []
for i in xrange(10):
    def bar(n):
        return n + i
    funcs.append(bar)  

print funcs[3](5)

We expect the result to be 3 + 5 = 8. What do we actually get? 14.

Where does 14 come from?

Let's look at the disassembly:

7           0 LOAD_FAST                0 (n)
            3 LOAD_GLOBAL              0 (i)
            6 BINARY_ADD
            7 RETURN_VALUE

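Note that i is a global.

The reason we get 14 is that the function object funcs[3] looks up the value of i when it is executed, and i lives in the global scope. By the time the function runs, i has already become 9, so the result is of course 14.

Judging from this result, the code above is equivalent to the following.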

funcs = []
for i in xrange(10):
    pass  

def bar(n):
    return n + i  

funcs.append(bar)  # repeat this line 10 times
print funcs[3](5)

Pretty boring. So what happens if we move i into a closure?

funcs = []
def foo(m):
    for i in xrange(m):
        def bar(n):
            return n + i
        funcs.append(bar)
foo(10)
print funcs[3](5)

Sadly, the result is still 14.

The disassembly is as follows:

9           0 LOAD_FAST                0 (n)
            3 LOAD_DEREF               0 (i)
            6 BINARY_ADD
            7 RETURN_VALUE

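Sigh. It is just a naive lookup through a reference.

In every bar, i is merely a reference held in the closure, and all of them point to the same variable. When that variable changes, every bar sees the modified value when it runs.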
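For comparison, here is a small Python 3 sketch of the same late-binding behaviour next to two common workarounds: binding the current value as a default argument, and using functools.partial. The function names are made up for illustration.

```python
from functools import partial


def make_late():
    """Every bar shares the same cell for i, so all of them see its final value."""
    funcs = []
    for i in range(10):
        def bar(n):
            return n + i
        funcs.append(bar)
    return funcs


def make_default():
    """A default argument is evaluated when bar is defined, freezing i per function."""
    funcs = []
    for i in range(10):
        def bar(n, i=i):
            return n + i
        funcs.append(bar)
    return funcs


def add(i, n):
    return n + i


def make_partial():
    """partial stores the current value of i inside each callable."""
    return [partial(add, i) for i in range(10)]


late = make_late()[3](5)
fixed = make_default()[3](5)
also_fixed = make_partial()[3](5)
print(late, fixed, also_fixed)
```

Both workarounds copy the value at definition time instead of keeping a reference to the shared variable.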

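I also wrote a bit of JavaScript to test this, and the result was the same: the change is visible everywhere. The code is as follows:

But an implementation in Haskell gives exactly the expected result.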

main = do
    let funcs = [(\n -> n + i) | i <- [0..9] ]
    let x0: x1 : x2: x3: xs = funcs
    return (x3 5)

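The result is 8.

So Python's support for FP still does not match that of a proper functional language like Haskell.

Personally, I think that if Python wanted to develop its FP side, it could consider the following: distinguish local variables from closure variables. When a function is defined, copy the closure variables into the function while keeping a reference to the original closure object. Name lookup would always search local variables first, then the local copies of closure variables, then globals. On modification, a local variable would be modified directly; for a closure variable, the object the closure really points to would be modified and the local copy overwritten as well.

This is only a tentative idea that I have not examined carefully. At first glance it seems to solve the problem.

Still, Python is after all an OO language, without the innate control over scope that Haskell or Lisp has. Oh well. Let's do Python the OO way, then.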

Written by linluxiang

March 3rd, 2011 at 7:38 PM

Posted in Python, Technology

A Study of Python Closures

without comments

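This post comes from my old JavaEye blog. Original address

I have wanted to write this article for a long time but never got the chance. Today I happened to discuss Python closures with a colleague, so I am writing it down before I forget. The code below was run on Python 2.5.

The problem is the classic one: the following code raises an error when foo() is called.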

def foo():
    a = 1
    def bar():
        a = a + 1
    bar()
    print a

The error is:

UnboundLocalError: local variable 'a' referenced before assignment

To analyze the cause, let's go straight to the dis module and disassemble bar. The result:

12           0 LOAD_FAST                0 (a)
             3 LOAD_CONST               1 (1)
             6 INPLACE_ADD
             7 STORE_FAST               0 (a)
            10 LOAD_CONST               0 (None)
            13 RETURN_VALUE

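As you can see, the exception arises because LOAD_FAST cannot find the local variable. STORE_FAST is what binds a local variable, so reading the variable before it has been stored naturally fails. But the statement is a = a + 1. Since the right-hand side of an assignment is evaluated first, shouldn't it read the value from the outer a, then create a new local name a and assign the value to it?

Let's set the cause aside for a moment and look at code that runs correctly.

Change a = a + 1 in the code above to b = a + 1. Disassembling gives the following.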

13           0 LOAD_DEREF               0 (a)
             3 LOAD_CONST               1 (1)
             6 BINARY_ADD
             7 STORE_FAST               0 (b)
            10 LOAD_CONST               0 (None)
            13 RETURN_VALUE            

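Just as originally imagined, a is now accessed with LOAD_DEREF, which reads the value from the enclosing scope; after 1 is added, the result is stored in the local variable b.

The difference between the correct program and the broken one is that in the broken one, a appears on the left-hand side of an assignment.

Could this seemingly trivial difference be the cause? The answer is YES! Here is a passage from Python's PEP 227.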

If a name is bound anywhere within a code block, all uses of the name within the block are treated as references to the current block.

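That sentence is quite a mouthful, so let me explain it more plainly by simulating what the Python compiler does. First the compiler sees a = a + 1 and recognizes an assignment. It examines the right-hand side and meets a name called a. "What is a?" the compiler asks itself. "Could it be a local variable?" So it dutifully consults its rules, which say: if a name appears in a parameter list, on the left-hand side of an assignment, in a function definition, a class definition, an import statement, a for statement, or an except clause, it is a local variable. OK. The compiler scans the block from top to bottom and sees a on the left-hand side of an assignment, so a is definitely a local variable. It emits LOAD_FAST 0: you are a local, so you get the fast path. Having finished the right-hand side, it turns to the left: the target of an assignment must be a local variable, so, simple, you live in slot 0; it emits STORE_FAST 0 to store the top of the stack there. Compilation finishes without a hitch. Now it is the virtual machine's turn. When it reaches this statement it is baffled: I am told to LOAD_FAST 0, but there is nothing in slot 0! Damn. All it can do is raise an error.

Why does the second snippet run correctly? Because the compiler finds no binding of the name a anywhere in the block, nor is a declared global, but it does find a bound in the enclosing function. So it emits a LOAD_DEREF instruction, telling the virtual machine that a is not here and should be looked up elsewhere.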
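The compiler's decision can be observed without reading bytecode. The following Python 3 sketch (function names are made up; Python 3 spells func_code as __code__) shows that in the broken version a is classified as a local of bar, while in the working version it is a free variable.

```python
def foo_broken():
    a = 1
    def bar():
        a = a + 1   # 'a' is assigned in this block, so it is local to bar
        return a
    return bar


def foo_ok():
    a = 1
    def bar():
        b = a + 1   # 'a' is only read here, so it is a free variable
        return b
    return bar


broken, ok = foo_broken(), foo_ok()
broken_locals = broken.__code__.co_varnames   # names compiled as locals
ok_free = ok.__code__.co_freevars             # names resolved via the closure

try:
    broken()
    error = None
except UnboundLocalError as exc:
    error = exc

print(broken_locals, ok_free, type(error).__name__, ok())
```

The classification happens at compile time for the whole block, which is why the position of the assignment within the block does not matter.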

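So where exactly is this "elsewhere"? And if Python did not have the must-be-a-local rule, would we then be able to modify the variable?

Let's keep digging.

First, what is the definition of LOAD_DEREF? The documentation of the dis module says: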

LOAD_DEREF(i)
Loads the cell contained in slot i of the cell and free variable storage. Pushes a reference to the object the cell contains on the stack.

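In short: push the contents of cell[i] onto the stack. But what is a cell? That reminds me that Python code objects have an attribute called co_cellvars. Could it be related?

The documentation gives this definition: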

co_cellvars is a tuple containing the names of local variables that are referenced by nested functions;

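Local variables that are referenced by nested functions? What an odd way to put it. Is it really that magical? Let's run the following code.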

def foo():
    a = 1
    def bar():
        b = a + 1
    print 'bar cellvars:', bar.func_code.co_cellvars  

foo()  

print 'foo cellvars:', foo.func_code.co_cellvars

The output is:

bar cellvars: ()
foo cellvars: ('a',)

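It really is so: a is referenced in bar, so it is added to cellvars. Note that only the name is put into cellvars; in other words, the object in the closure is still just a reference. When bar is called, it follows the reference to the actual value, and if the actual value is changed, the change shows up in every bar.

How does the variable get in there? Disassemble foo: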

2           0 LOAD_CONST               1 (1)
            3 STORE_DEREF              0 (a)  

3           6 LOAD_CLOSURE             0 (a)
            9 BUILD_TUPLE              1
           12 LOAD_CONST               2 (<code object bar at 0x48f458, file "test.py", line 3>)
           15 MAKE_CLOSURE             0
           18 STORE_FAST               0 (bar)
           21 LOAD_CONST               0 (None)
           24 RETURN_VALUE

Notice the unusual STORE_DEREF, LOAD_CLOSURE, and MAKE_CLOSURE instructions.

These three instructions do the following:

STORE_DEREF(i)
Stores TOS into the cell contained in slot i of the cell and free variable storage.

LOAD_CLOSURE(i)
Pushes a reference to the cell contained in slot i of the cell and free variable storage. The name of the variable is co_cellvars[i] if i is less than the length of co_cellvars. Otherwise it is co_freevars[i - len(co_cellvars)].

MAKE_CLOSURE(argc)
Creates a new function object, sets its func_closure slot, and pushes it on the stack. TOS is the code associated with the function, TOS1 the tuple containing cells for the closure’s free variables. The function also has argc default parameters, which are found below the cells.

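So when the compiler finds the nested function bar inside foo, it puts the local variable a, which bar references, into a cell, then packs all such cells into a tuple and assigns it to the func_closure of the bar function object.

To see this magic in action, run the following code: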

def foo():
    a = 1
    def bar():
        b = a + 1
    return bar  

b = foo()
print 'bar func_closure:', b.func_closure

If the program behaves as guessed, it will print a tuple of cells. The output is as follows.

bar func_closure: (<cell at 0x454690: int object at 0x803388>,)

Just as expected. So how does the documentation describe func_closure?

func_closure None or a tuple of cells that contain bindings for the function’s free variables. Read-only

So this has to do with Python's name lookup order: local first, then the closure, then global.
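The same experiment can be repeated on Python 3, where func_code and func_closure are spelled __code__ and __closure__. This sketch also shows that the cell is shared: rebinding a in foo after bar has been defined changes what bar sees.

```python
def foo():
    a = 1
    def bar():
        return a + 1
    first = bar()    # reads the cell while it holds 1
    a = 10           # rebinding 'a' stores into the same shared cell
    second = bar()   # the same bar now sees 10
    return bar, first, second


bar, first, second = foo()
cellvars = foo.__code__.co_cellvars            # locals of foo used by nested functions
freevars = bar.__code__.co_freevars            # names bar resolves through its closure
cell_value = bar.__closure__[0].cell_contents  # current content of the shared cell

print(first, second, cellvars, freevars, cell_value)
```

The cell sits between the two functions, which is why neither one holds a private copy of the value.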

For details, see this passage from PEP 227.

The implementation adds several new opcodes and two new kinds of
names in code objects. A variable can be either a cell variable
or a free variable for a particular code object. A cell variable
is referenced by containing scopes; as a result, the function
where it is defined must allocate separate storage for it on each
invocation. A free variable is referenced via a function’s
closure.

The choice of free closures was made based on three factors.

First, nested functions are presumed to be used infrequently,
deeply nested (several levels of nesting) still less frequently.
Second, lookup of names in a nested scope should be fast.
Third, the use of nested scopes, particularly where a function
that access an enclosing scope is returned, should not prevent
unreferenced objects from being reclaimed by the garbage
collector.

Seeing that func_closure is read-only, you must be rather disappointed. Let's see how other languages do it.

JavaScript, version 1.

function foo(){
    var num = 1;
    function bar(){
        var num = num + 1;
        alert(num);
    }
    bar()
}
foo();

This version alerts NaN, which shows that JavaScript has the same problem as Python.

What if num is not declared with var inside bar?

function foo(){
    var num = 1;
    function bar(){
        num = num + 1;
        alert(num);
    }
    bar()
}
foo();

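It correctly alerts 2.

If only Python had a mechanism like this.

Happily, Python 3 has finally changed things, with support all the way from the syntax down to the internals (which seem to amount to the same thing).

At the syntax level, the nonlocal keyword was added.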

def foo():
    a = 1
    def bar():
        nonlocal a
        a = a + 1
        print(a)
    return bar  

foo()()

It correctly prints 2!
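A slightly more practical use of nonlocal is a counter whose state lives only in the closure. This is a minimal sketch; make_counter is a made-up name.

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count   # rebind the enclosing variable instead of creating a local
        count += 1
        return count
    return increment


counter = make_counter()
values = [counter(), counter(), counter()]
other = make_counter()()   # each call to make_counter gets its own cell

print(values, other)
```

Without the nonlocal line, count += 1 would raise the same UnboundLocalError discussed earlier.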

At the C level, the following two lovely functions are available.

PyObject* PyFunction_GetClosure(PyObject *op)
Return value: Borrowed reference.
Return the closure associated with the function object op. This can be NULL or a tuple of cell objects.
int PyFunction_SetClosure(PyObject *op, PyObject *closure)
Set the closure associated with the function object op. closure must be Py_None or a tuple of cell objects.
Raises SystemError and returns -1 on failure.

At last we can manipulate closures. Hahaha.

In the end, it would be great if Python had some mechanism for anonymous code blocks. Heh. That's all.

Written by linluxiang

February 23rd, 2011 at 7:13 PM

Posted in Python, Technology

The locale Encoding Problem in Python 2.5 on Mac OS X

without comments

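This post comes from my JavaEye blog. Original address

I ran into a problem today while updating Mercurial.

Running hg failed with: LookupError: unknown encoding: x-mac-simp-chinese

I remembered hitting this before when using Django. Back then I assumed it was a Django problem; now I realize it is a general Python problem.

I looked at minirst.py in the hg source and found that it directly uses the encoding variable from Mercurial's encoding module.

In encoding.py I found: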

try:
    encoding = os.environ.get("HGENCODING")
    if not encoding:
        encoding = locale.getpreferredencoding() or 'ascii'
        encoding = _encodingfixers.get(encoding, lambda: encoding)()
except locale.Error:
    encoding = 'ascii'

So the locale module is the culprit.

I looked in locale.py and found the following code:

if sys.platform in ('win32', 'darwin', 'mac'):
    # On Win32, this will return the ANSI code page
    # On the Mac, it should return the system encoding;
    # it might return "ascii" instead
    def getpreferredencoding(do_setlocale = True):
        """Return the charset that the user is likely using."""
        import _locale
        return _locale._getdefaultlocale()[1]

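I tried running it, and it returned 'x-mac-simp-chinese' directly.

To see what the correct result should be, I ran python2.6 -c 'import locale; print(locale.getpreferredencoding());', which returned 'UTF-8'.

And UTF-8 is exactly what I have set for LC_ALL and LANG.

So the _locale module is to blame. But _locale, judging by the name, is written in C. To save effort, I simply changed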

if sys.platform in ('win32', 'darwin', 'mac'):

to

if sys.platform in ('win32',):

Then I searched locale.py for the remaining uses of _locale and changed them all as well.

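Running hg again, everything works.

I also checked whether this problem is in Python's bug list, and sure enough it is: http://bugs.python.org/issue1276. A quick read, however, shows that Python 2.5.x was mercilessly left out. So I can only hack it myself. :)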
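An application can also defend itself against this kind of failure without patching the standard library, by checking that the reported encoding is one Python actually knows. The sketch below is Python 3 and only illustrates the idea; safe_preferred_encoding is a hypothetical helper, not something Mercurial or the locale module provides.

```python
import codecs
import locale


def safe_preferred_encoding(candidate=None, fallback="utf-8"):
    """Return a codec name Python can really use (hypothetical helper)."""
    name = candidate or locale.getpreferredencoding(False) or fallback
    try:
        # codecs.lookup raises LookupError for names Python does not know
        return codecs.lookup(name).name
    except LookupError:
        return fallback


bogus = safe_preferred_encoding("x-no-such-encoding")  # made-up, unknown name
known = safe_preferred_encoding("UTF-8")

print(bogus, known)
```

The Mercurial code quoted earlier also reads the HGENCODING environment variable before it asks locale, so setting that variable is another way around the problem.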

Written by linluxiang

February 23rd, 2011 at 6:58 PM

Posted in Python, Technology

Analyzing the Order in Which Python Reclaims Objects in globals

without comments

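This post comes from my old JavaEye blog. Original address

A heads-up: this article assumes some familiarity with the Python source code. For much of the background, see the book 《python源码剖析》 (Python Source Code Analysis). Now to the point.

Someone asked a question in a chat group today, about a program like the following.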

class person:
    sum = 0
    def __init__(self,name):
        self.name=name
        person.sum += 1  

    def __del__(self):
        person.sum -= 1
        print "%s is leaving" % self.name  

a = person('a')
a2 = person('a2')

The expected output of this program is "a is leaving" and "a2 is leaving". Surprisingly, the actual output is:

a is leaving
Exception exceptions.AttributeError: "'NoneType' object has no attribute 'sum'" in
<bound method person.__del__ of <__main__.person instance at 0x4a18f0>> ignored

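Why does merely using a different variable name make such a big difference?

On the surface, the cause is that the name person now refers to None. But is it really None? Printing person inside __del__ gives: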

__main__.person
None
Exception exceptions.AttributeError: "'NoneType' object has no attribute 'sum'"
 in <bound method person.__del__ of <__main__.person instance at 0x4a18c8>> ignored

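So it really has become None.

A first guess at the cause: when the program finishes and the Python virtual machine cleans up, it clears the name "person" before "a2", so the destructor of a2 can no longer find "person".

On second thought that is not right either. If the name "person" could not be found, the message would be "name 'person' is not defined". So the name "person" is still there; but is the class object it referred to still around? Change the program to the following: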

class person:
    sum = 0
    def __init__(self,name):
        self.name=name
        person.sum += 1  

    def __del__(self):
        #person.sum -= 1
        self.__class__.sum -= 1 #1
        #print "%s is leaving" % self.name  

a = person('a')
a2 = person('a2')

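The line marked #1 is the change: it goes through the instance's own __class__. The program now runs without any problem.

This shows that during cleanup the Python virtual machine merely sets the name "person" to None. That conclusion raises two questions. First, why is it set to None? Second, why is "person" reclaimed before "a2" but after "a"?

Let's take the second question first. My first thought was alphabetical order, but that is quickly ruled out: both "a" and "a2" sort before "person". Could it follow the key order of the globals() dict, then? Calling globals().keys() gives: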

['a', '__builtins__', '__file__', 'person', 'a2', '__name__', '__doc__'];

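It does seem to be so.

But why? To settle this, we have to look for the answer in the Python source.

As we all know, running Python code has a frame object that represents the runtime environment, similar to a stack frame in C and a bit like a function's environment in Lisp. So the answer presumably lies in frameobject.c.

In frameobject.c I found a function: static void frame_dealloc(PyFrameObject *f). The key to the problem seems to be right here.

Here is an excerpt from frame_dealloc: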

Py_XDECREF(f->f_back);
Py_DECREF(f->f_builtins);
Py_DECREF(f->f_globals);//1
Py_CLEAR(f->f_locals);
Py_CLEAR(f->f_trace);
Py_CLEAR(f->f_exc_type);
Py_CLEAR(f->f_exc_value);
Py_CLEAR(f->f_exc_traceback);

So it decrements reference counts. The Python source explains the Py_DECREF macro like this:

The macros Py_INCREF(op) and Py_DECREF(op) are used to increment or decrement reference counts. Py_DECREF calls the object’s deallocator function when the refcount falls to 0;

That means we need to find the deallocator of f_globals. What is f_globals? A PyDictObject, of course. The evidence is everywhere; for instance, the function PyFrameObject * PyFrame_New(PyThreadState *tstate, PyCodeObject *code, PyObject *globals,PyObject *locals) contains this code:

#ifdef Py_DEBUG
    if (code == NULL || globals == NULL || !PyDict_Check(globals) ||
        (locals != NULL && !PyMapping_Check(locals))) {
        PyErr_BadInternalCall();
        return NULL;
    }
#endif

PyDict_Check: it checks whether the object is a dict. Fine, let's skip ahead and go straight to the code in dictobject.c.

static void
dict_dealloc(register dictobject *mp)
{
    register dictentry *ep;
    Py_ssize_t fill = mp->ma_fill;
    PyObject_GC_UnTrack(mp);
    Py_TRASHCAN_SAFE_BEGIN(mp)
    for (ep = mp->ma_table; fill > 0; ep++) {
        if (ep->me_key) {
            --fill;
            Py_DECREF(ep->me_key);
            Py_XDECREF(ep->me_value); /* only decrements the reference count */
        }
    }
/* remainder omitted */

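Hahaha. The entries really are cleared one by one in key order.

But wait, why are we back at Py_DECREF again?

My first thought was that the final explanation must lie with the GC.

This also seems to hint at the answer to the first question: why None?

As the code above shows, when a dict object is deallocated it merely decrements the reference count of each value, so when the object itself is actually reclaimed depends on whether other references remain. That alone, however, does not explain the None. The more direct explanation is in the module teardown code: at interpreter shutdown, CPython's _PyModule_Clear (Objects/moduleobject.c) walks the module's dict in key order and rebinds each global name to None, which is why person is None, rather than undefined, by the time a2's __del__ runs.

The Python documentation says as much: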

Warning: Due to the precarious circumstances under which __del__() methods are invoked, exceptions that occur during their execution are ignored, and a warning is printed to sys.stderr instead. Also, when __del__() is invoked in response to a module being deleted (e.g., when execution of the program is done), other globals referenced by the __del__() method may already have been deleted. For this reason, __del__() methods should do the absolute minimum needed to maintain external invariants. Starting with version 1.5, Python guarantees that globals whose name begins with a single underscore are deleted from their module before other globals are deleted; if no other references to such globals exist, this may help in assuring that imported modules are still available at the time when the __del__() method is called.

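Python cannot guarantee that every name a __del__ method refers to still exists when it is called, so avoid overriding a class's __del__ method if you can.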
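If a __del__ method really is needed, it is safer for it to reach the class through the instance than through a module-level name. The following Python 3 sketch illustrates this; the class and attribute names are made up, and the immediate effect of del relies on CPython's reference counting.

```python
class Person:
    count = 0

    def __init__(self, name):
        self.name = name
        type(self).count += 1

    def __del__(self):
        # Go through the instance's own class, not the global name 'Person',
        # which may already be gone or rebound during interpreter shutdown.
        type(self).count -= 1


a = Person("a")
a2 = Person("a2")
alive = Person.count       # both instances exist here

del a                      # on CPython the refcount hits zero and __del__ runs now
after_del = Person.count

print(alive, after_del)
```

The instance keeps its class alive through its own reference, so type(self) stays valid for as long as the instance does.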

That's all.

Written by linluxiang

February 23rd, 2011 at 6:51 PM

Posted in Python, Technology